# 3. Equilibrium volume of FCC Si, take 2

Here we will continue where the previous tutorial left off. In particular, we will now create a workchain that executes the VASP workchain (which basically runs VASP) for each volume and stores the results. In the end we will even use some Python functions to locate the minimum and store it. We hope that after this tutorial you will start to appreciate the convenience of AiiDA-VASP. As before, we assume you have completed the steps of the previous tutorial.

In order to make it as easy as possible, we have created a cookiecutter recipe for creating AiiDA-VASP workchains. It is located in `cookiecutters/workchain` at the root of the repository. If you have not cloned the repository, you can download only the cookiecutter folder using:

```
%/$ svn checkout https://github.com/aiida-vasp/aiida-vasp/branches/develop/cookiecutters
```

What we want to do is to create a workchain that calculates an equation of state, in this case versus volume. For simplicity we will call the workchain Eos. It is always recommended to dissect the operations we will perform into parts before writing the workchain. That way, it is possible to reuse larger parts of the workchain at a later point. Also, it is easier to understand what is going on just from inspecting the definitions of the inputs, steps and outputs of the workchain. Please have a look at Workchains and the references therein before continuing.

1. Make sure you have cookiecutter installed:

```
%/$ pip install cookiecutter
```

2. Now we simply execute the cookiecutter using the workchains folder as the template:

```
%/$ cookiecutter cookiecutters/workchains
workchain_name [testchain]: eos
workchain_docstring [This workchain is a testchain]: This workchain will accept a dictionary of structures and extract the total energies for each structure. The data is saved and the energy minimum is calculated and stored.
next_workchain_to_call [vasp.vasp]:
```

3. A new folder `eos` was created, which contains the `eos.py` workchain file. Inspect it. Maybe the most important part of a workchain is the `define` function, which, as its name implies, defines the workchain. For this particular case it looks like:

```python
def define(cls, spec):
    super(EosWorkChain, cls).define(spec)
    spec.expose_inputs(cls._next_workchain, exclude=['structure'])
    spec.input_namespace('structures', valid_type=DataFactory('structure'), dynamic=True,
                         help='a dictionary of structures to use')
    spec.exit_code(0, 'NO_ERROR', message='the sun is shining')
    spec.exit_code(420, 'ERROR_NO_CALLED_WORKCHAIN', message='no called workchain detected')
    spec.exit_code(500, 'ERROR_UNKNOWN', message='unknown error detected in the eos workchain')
    spec.outline(
        cls.initialize,
        while_(cls.run_next_workchains)(
            cls.init_next_workchain,
            cls.run_next_workchain,
            cls.verify_next_workchain,
            cls.extract_quantities
        ),
        cls.finalize
    )  # yapf: disable
    spec.output('quantities', valid_type=DataFactory('array'), help='a container for various quantities')
    spec.expose_outputs(cls._next_workchain)
```

We have defined some inputs:

• All inputs defined in `_next_workchain`, which in this case is the VASP workchain. We can include those very easily by using `expose_inputs`. You can of course expose inputs from other workchains, as long as AiiDA can find them.

• A dictionary `structures` containing as many structures as you want to run. Notice that the data type of each entry is `StructureData`.

• Some generic exit codes.

Then we have defined some outputs:

• We attach all outputs from `_next_workchain`, i.e. the VASP workchain, to this workchain (basically, a link is created to avoid storing things twice). Similar to the inputs, `expose_outputs` can be leveraged.

Finally, there is the outline, which tells the workchain which functions to run. In this case we have:

• A function that initializes what we need, `initialize`.

• A while loop that iterates over all the structures.

• An `init_next_workchain` function that initializes what we need for the next workchain at the current iteration (here we make sure the specific structure is set).

• A `run_next_workchain`, which basically executes the VASP workchain.

• A `verify_next_workchain`, which verifies that there is a valid VASP workchain and inherits any exit codes present from the VASP workchain.

• An `extract_quantities`, which extracts the quantities we need from each completed VASP workchain.

• A `finalize`, which stores the results.

4. The workchain is not completely ready. We need to extract the total energies and specify the general workchain further. Please make the following changes to the generated workchain:

```diff
--- /home/docs/checkouts/readthedocs.org/user_builds/aiida-vasp/checkouts/latest/tutorials/eos_base.py
+++ /home/docs/checkouts/readthedocs.org/user_builds/aiida-vasp/checkouts/latest/tutorials/eos.py
@@ -10,6 +10,8 @@
 # pylint: disable=attribute-defined-outside-init
 import random
 import numpy as np
+from scipy.optimize import minimize
+from scipy.interpolate import interp1d
 from aiida.common.extendeddicts import AttributeDict
 from aiida.engine import calcfunction, WorkChain, while_, append_
@@ -47,13 +49,13 @@
                 cls.init_next_workchain,
                 cls.run_next_workchain,
                 cls.verify_next_workchain,
-                cls.extract_quantities
+                cls.extract_volume_and_energy
             ),
             cls.finalize
         )  # yapf: disable
-        spec.output('quantities', valid_type=DataFactory('array'), help='a container for various quantities')
-        spec.expose_outputs(cls._next_workchain)
+        spec.output('eos', valid_type=DataFactory('array'), help='a list containing the cell volumes and total energies')
+        spec.output('eos_minimum', valid_type=DataFactory('dict'), help='a dictionary containing the cell volume at the energy minimum')

     def initialize(self):
         """Initialize the eos workchain."""
@@ -73,13 +75,13 @@
         self.ctx.is_finished = False

-        # Define an interation index
-        self.ctx.interation = 0
+        # Define an iteration index
+        self.ctx.iteration = 0

         # Define the context inputs
         self.ctx.inputs = AttributeDict()

-        # Define container to store quantities that is extracted in each step
-        self.ctx.quantities_container = []
+        # Define the total energies list
+        self.ctx.total_energies = []

     def _init_inputs(self):
         """Initialize inputs."""
@@ -101,7 +103,7 @@
     def init_next_workchain(self):
         """Initialize the next workchain."""

-        # Elavate iteration index
+        # Elevate iteration index
         self.ctx.iteration += 1

         # Check that the context inputs exists
@@ -156,42 +158,81 @@
         self.report('The called {}<{}> returned a non-zero exit status. '
                     'The exit status {} is inherited'.format(workchain.__class__.__name__, workchain.pk, self.ctx.exit_code))

-        # Stop further executions of workchains if there are no more structure
+        # Stop further execution of workchains if there are no more structure
         # entries in the structures dictionary
         if not self.ctx.structures:
             self.ctx.is_finished = True

         return self.ctx.exit_code

-    def extract_quantities(self):
-        """Extract the quantities you want and store them in the container."""
-
-        # What happens in this routine needs modifications by you.
-        # Typically you get the output of the called workchain by doing
-        # workchain = self.ctx.workchains[-1]
-        # some_output_quantity = workchain.outputs.some_output_quantity
-
-        # An example which stores nonsense.
-        self.ctx.quantities_container.append([self.ctx.iteration, 'Some quantity for iteration: {}'.format(self.ctx.iteration)])
+    def extract_volume_and_energy(self):
+        """Extract the cell volume and total energy for this structure."""
+
+        workchain = self.ctx.workchains[-1]
+
+        # Fetch the total energy
+        misc = workchain.outputs.misc.get_dict()
+        total_energy = misc['total_energies']['energy_extrapolated']
+
+        # Fetch the volume
+        volume = self.ctx.inputs.structure.get_cell_volume()
+
+        # Store both in a list
+        self.ctx.total_energies.append([volume, total_energy])

     def finalize(self):
         """
         Finalize the workchain.

-        Take the quantity container and set it as an output of this workchain.
+        Take the total energies container and set it as an output of this workchain.
         """
         # Due to data provenance we cannot return AiiDA data containers that have
         # not been passed through a calcfunction, workfunction or a workchain. Create this now.
-        quantities_container = store_quantities(DataFactory('list')(list=self.ctx.quantities_container))
+        total_energies = store_total_energies(DataFactory('list')(list=self.ctx.total_energies))
+
+        # Let us also try to find a better minimum, just as an example using the power of Python
+        energy = locate_minimum(total_energies)

         # And then store the output on the workchain
-        self.out('quantities', quantities_container)
+        self.out('eos', total_energies)
+        self.out('eos_minimum', energy)

 @calcfunction
-def store_quantities(quantities_container):
-    """Stores the quantities to keep data provenance."""
-    quantities_container_list = quantities_container.get_list()
-    quantities_container_array = DataFactory('array')()
-    quantities_container_array.set_array('quantities', np.array(quantities_container_list))
-    return quantities_container_array
+def store_total_energies(total_energies):
+    """Stores the total energies in ArrayData to keep data provenance."""
+    total_energies_list = total_energies.get_list()
+    # Let us also sort by volume, as we picked the entries in the structures
+    # dictionary at random above
+    total_energies_array = np.array(total_energies_list)
+    total_energies_array_sorted = total_energies_array[total_energies_array[:, 0].argsort()]
+    array_data = DataFactory('array')()
+    array_data.set_array('eos', total_energies_array_sorted)
+
+    return array_data
+
+
+@calcfunction
+def locate_minimum(total_energies):
+    """Locate the volume with the lowest energy using interpolation."""
+    total_energies_array = total_energies.get_array('eos')
+    volumes = total_energies_array[:, 0]
+    energies = total_energies_array[:, 1]
+
+    # Establish some initial guess (take the lowest energy point from the original dataset)
+    min_energy_guess_index = np.argmin(energies)
+
+    # Create the function that can be used to extract interpolated values.
+    # Cubic interpolation is used here, which is not necessarily physically correct;
+    # it only serves as an example. Please have a look at the theory behind more
+    # realistic models that can be used to fit such data.
+    new_energies = interp1d(volumes, energies, kind='cubic')
+
+    # Use minimize from scipy to locate the minimum of the interpolated function
+    minimized_point = minimize(new_energies, volumes[min_energy_guess_index], tol=1e-3)
+
+    # Create a dictionary to house the results and return
+    dict_data = DataFactory('dict')(dict={'volume': minimized_point.x[0], 'energy': minimized_point.fun})
+
+    return dict_data
```

and save it as `eos.py`. Or you can also download it:

```
%/$ wget https://github.com/aiida-vasp/aiida-vasp/raw/develop/tutorials/eos.py
```
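The sorting performed in `store_total_energies` can be tried outside AiiDA. The following is a minimal sketch, using plain lists and NumPy instead of the AiiDA data types, of how rows of `[volume, energy]` pairs are ordered by volume with `argsort`:

```python
import numpy as np

# Unsorted [volume, total energy] pairs, as they would arrive when the
# structures dictionary is traversed in arbitrary order
total_energies = [[16.0, -4.8481406],
                  [10.71875, -4.42341934],
                  [14.82975, -4.87588342]]

# Sort the rows by the first column (the volume), as store_total_energies
# does before placing the data in an ArrayData node
array = np.array(total_energies)
array_sorted = array[array[:, 0].argsort()]

print(array_sorted[:, 0])  # volumes now in ascending order
```

Note that `argsort` on the first column returns the row permutation, which is then used to reorder the whole array so each energy stays paired with its volume.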


The majority of the changes simply make the workchain more specific, with two exceptions:

• The necessity of decorating the function that generates the output array containing the volumes and total energies as a calcfunction. This is to preserve data provenance; otherwise we would not have known how the data was collected from each of the underlying workchains.

• The inclusion of a calcfunction that interpolates the calculated data to find an even better estimate of the volume at the energy minimum. The example uses a cubic fit, which is certainly not very physical and should not be used in production; it only shows how simple it is to leverage the power of Python, NumPy and SciPy. Again, this was decorated as a calcfunction in order to make sure AiiDA can preserve the data provenance.
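Stripped of the AiiDA decorators and data types, the interpolation step can be reproduced with SciPy alone. The sketch below uses the nine volume-energy points computed in this tutorial; the only deviation from `locate_minimum` is a small wrapper that returns a plain float, since newer SciPy versions expect the objective passed to `minimize` to be scalar-valued:

```python
import numpy as np
from scipy.interpolate import interp1d
from scipy.optimize import minimize

# The nine [volume (A^3), total energy (eV)] points from this tutorial
eos = np.array([[10.71875, -4.42341934], [11.664, -4.66006377],
                [12.66325, -4.79595549], [13.718, -4.86303425],
                [14.82975, -4.87588342], [16.0, -4.8481406],
                [17.23025, -4.78451894], [18.522, -4.69228806],
                [19.87675, -4.58122037]])
volumes, energies = eos[:, 0], eos[:, 1]

# Cubic interpolation, illustrative only, not a physical EOS model
new_energies = interp1d(volumes, energies, kind='cubic')

def objective(x):
    # Wrapper so minimize sees a scalar objective
    return float(new_energies(x[0]))

# Start the minimization from the lowest calculated point
guess = volumes[np.argmin(energies)]
result = minimize(objective, guess, tol=1e-3)

print(result.x[0], result.fun)
```

The located minimum lies between the calculated points at 13.718 and 14.82975, close to the `eos_minimum` output shown later in this tutorial.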

5. Next, download the launch script that is tailored to launch the workchain we have now developed:

```
%/$ wget https://github.com/aiida-vasp/aiida-vasp/raw/develop/tutorials/run_fcc_si_workchain.py
```

6. Change the `options` and `code_string` as you did previously.

Warning: Make sure you have activated your AiiDA virtual environment and that the AiiDA daemon is running before continuing.

7. Now we need to make sure the daemon can pick up the workchain. We do this by making sure the daemon sees the directory where `eos.py` and `run_fcc_si_workchain.py` are located. The simplest approach is to add the following to your virtual environment activate script (assuming you do not use Conda):

```
$ echo "export PYTHONPATH=$PYTHONPATH:<yourdirectory>" >> ~/env/aiida-vasp/bin/activate
```

where `<yourdirectory>` is the directory containing the `eos.py` and `run_fcc_si_workchain.py` files. The location of the activate script is assumed from the previous steps in the tutorial. If you use Conda, please do:

```
% echo "export PYTHONPATH=$PYTHONPATH:<yourdirectory>" >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
```

8. Submit the workchain by running the launch script:

```
%/$ python run_fcc_si_workchain.py
```

9. After a while we check the status:

```
%/$ verdi process list -a
    PK  Created    Process label         Process State      Process status
------  ---------  --------------------  -----------------  ----------------
103573  14m ago    EosWorkChain          ⏹ Finished [0]
103574  14m ago    VaspWorkChain         ⏹ Finished [0]
103575  14m ago    VaspCalculation       ⏹ Finished [0]
103579  13m ago    VaspWorkChain         ⏹ Finished [0]
103580  13m ago    VaspCalculation       ⏹ Finished [0]
103584  12m ago    VaspWorkChain         ⏹ Finished [0]
103585  12m ago    VaspCalculation       ⏹ Finished [0]
103589  11m ago    VaspWorkChain         ⏹ Finished [0]
103590  11m ago    VaspCalculation       ⏹ Finished [0]
103594  10m ago    VaspWorkChain         ⏹ Finished [0]
103595  10m ago    VaspCalculation       ⏹ Finished [0]
103599  9m ago     VaspWorkChain         ⏹ Finished [0]
103600  9m ago     VaspCalculation       ⏹ Finished [0]
103604  8m ago     VaspWorkChain         ⏹ Finished [0]
103605  8m ago     VaspCalculation       ⏹ Finished [0]
103609  7m ago     VaspWorkChain         ⏹ Finished [0]
103610  7m ago     VaspCalculation       ⏹ Finished [0]
103614  5m ago     VaspWorkChain         ⏹ Finished [0]
103615  5m ago     VaspCalculation       ⏹ Finished [0]
103620  4m ago     store_total_energies  ⏹ Finished [0]
103622  4m ago     locate_minimum        ⏹ Finished [0]
```

As you can see, nine VASP calculations were performed, one for each supplied volume. There is also a separate entry for the storage of the total energies, which also performs a sort. In this way, all generated results are trackable and data provenance is preserved.

10. Let us have a look at the output of `EosWorkChain`:

```
%/$ verdi process show 103573
Property       Value
-------------  ------------------------------------
type           WorkChainNode
pk             103573
uuid           1055b7ac-e02e-4510-9ed8-62177a9c2bd1
label
description
ctime          2019-10-08 12:50:46.739221+00:00
mtime          2019-10-08 13:00:47.530855+00:00
process state  Finished
exit status    0
computer       [6] mycluster

Inputs              PK      Type
------------------  ------  -------------
structures
    silicon_at_4_3  103563  StructureData
    silicon_at_4_2  103562  StructureData
    silicon_at_4_1  103561  StructureData
    silicon_at_4_0  103560  StructureData
    silicon_at_3_9  103559  StructureData
    silicon_at_3_8  103558  StructureData
    silicon_at_3_7  103557  StructureData
    silicon_at_3_6  103556  StructureData
    silicon_at_3_5  103555  StructureData
clean_workdir       103572  Bool
code                101271  Code
kpoints             103564  KpointsData
max_iterations      103571  Int
options             103568  Dict
parameters          103565  Dict
potential_family    103566  Str
potential_mapping   103567  Dict
settings            103569  Dict
verbose             103570  Bool

Outputs          PK  Type
-----------  ------  ---------
eos          103621  ArrayData
eos_minimum  103623  Dict

Called        PK  Type
--------  ------  ----------------
CALL      103622  CalcFunctionNode
CALL      103620  CalcFunctionNode
CALL      103614  WorkChainNode
CALL      103609  WorkChainNode
CALL      103604  WorkChainNode
CALL      103599  WorkChainNode
CALL      103594  WorkChainNode
CALL      103589  WorkChainNode
CALL      103584  WorkChainNode
CALL      103579  WorkChainNode
CALL      103574  WorkChainNode

Log messages
---------------------------------------------
There are 9 log messages for this calculation
Run 'verdi process report 103573' to see them
```

11. Inspect the total energies versus volume:

```
%/$ verdi data array show 103621
{
    "eos": [
        [10.71875, -4.42341934],
        [11.664, -4.66006377],
        [12.66325, -4.79595549],
        [13.718, -4.86303425],
        [14.82975, -4.87588342],
        [16.0, -4.8481406],
        [17.23025, -4.78451894],
        [18.522, -4.69228806],
        [19.87675, -4.58122037]
    ]
}
```

12. And the located minimum:

```
%/$ verdi data dict show 103623
{
    "energy": -4.8769539175,
    "volume": 14.55935661641
}
```
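As a quick sanity check (not part of the tutorial scripts), the equilibrium volume can be converted back to an FCC lattice parameter. The input structures span lattice parameters 3.5 to 4.3 Å, and a primitive FCC cell has volume a³/4, consistent with the table above, where a = 4.0 gives V = 16.0:

```python
# Convert the primitive FCC cell volume at the energy minimum back to a
# conventional lattice parameter: V_primitive = a**3 / 4 for FCC
volume = 14.55935661641  # from the eos_minimum output above
a = (4.0 * volume) ** (1.0 / 3.0)
print(round(a, 3))  # ~3.876, between the 3.8 and 3.9 input structures
```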


That concludes this tutorial. We hope that at this point you have realized that AiiDA-VASP is rather useful and that you would like to continue to learn more, maybe even start to write your own workflows or workchains.