3. Equilibrium volume of FCC Si, take 2
Here we will continue where the previous tutorial left off. In particular, we will now create a workchain that executes the VASP workchain (basically, this runs VASP) for each volume and stores the result. At the end we will even use a few Python functions to locate the minimum and store it. We hope that after this tutorial you will start to appreciate the convenience of AiiDA-VASP. As before, we assume you have completed the steps of the previous tutorial.
To make this as easy as possible, we have created a cookie cutter recipe for creating AiiDA-VASP workchains. It is located in cookiecutters/workchain at the root of the repository. If you have not cloned the repository, you can download only the cookiecutter folder using:

$ svn checkout https://github.com/aiida-vasp/aiida-vasp/branches/develop/cookiecutters
What we want to do is create a workchain that calculates an equation of state, in this case as a function of volume. For simplicity we will call the workchain Eos. It is always recommended to dissect the operations to be performed into parts before writing the workchain. This way it is possible to reuse larger parts of the workchain at a later point. It is also easier to understand what is going on just from inspecting the definitions of the inputs, steps and outputs of the workchain. Please have a look at Workchains and the references therein before continuing.
Make sure you have cookiecutter installed:

$ pip install cookiecutter
Now we simply execute the cookiecutter using the workchains folder as the template:

$ cookiecutter cookiecutters/workchains
workchain_name [testchain]: eos
workchain_docstring [This workchain is a testchain]: This workchain will accept a dictionary of structures and extract the total energies for each structure. The data is saved and the energy minimum is calculated and stored.
next_workchain_to_call [vasp.vasp]:
A new folder eos was created, which contains the eos.py workchain file. Inspect it. Maybe the most important part of a workchain is the define function, which, as its name implies, defines the workchain. For this particular case it looks like:

    def define(cls, spec):
        super(EosWorkChain, cls).define(spec)
        spec.expose_inputs(cls._next_workchain, exclude=['structure'])
        spec.input_namespace('structures', valid_type=DataFactory('structure'), dynamic=True, help='a dictionary of structures to use')
        spec.exit_code(0, 'NO_ERROR', message='the sun is shining')
        spec.exit_code(420, 'ERROR_NO_CALLED_WORKCHAIN', message='no called workchain detected')
        spec.exit_code(500, 'ERROR_UNKNOWN', message='unknown error detected in the eos workchain')
        spec.outline(
            cls.initialize,
            while_(cls.run_next_workchains)(
                cls.init_next_workchain,
                cls.run_next_workchain,
                cls.verify_next_workchain,
                cls.extract_quantities
            ),
            cls.finalize
        )  # yapf: disable
        spec.output('quantities', valid_type=DataFactory('array'), help='a container for various quantities')
        spec.expose_outputs(cls._next_workchain)
We have defined some inputs:

- All inputs defined in the _next_workchain, which in this case is the VASP workchain. We can include those very easily by using expose_inputs. You can of course expose inputs from other workchains as long as AiiDA can find them.
- A dictionary structures containing as many structures as you want to run. Notice that the data type of each entry is StructureData.
- Some generic exit codes.
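Since the structures namespace is dynamic, it simply accepts a dictionary whose keys are valid link labels. A minimal sketch of how such a dictionary could be assembled in a launch script (plain Python only; the tuples below are placeholders standing in for StructureData nodes, and the `silicon_at_<lattice constant>` naming mirrors the input labels shown later in this tutorial):

```python
# Build a dictionary of structure entries keyed by lattice constant.
# In a real launch script each value would be a StructureData node;
# here plain tuples act as placeholders.
lattice_constants = [3.5, 3.6, 3.7, 3.8, 3.9, 4.0, 4.1, 4.2, 4.3]

structures = {}
for a in lattice_constants:
    # Link labels cannot contain periods, so 3.5 becomes 3_5
    label = 'silicon_at_{}'.format(str(a).replace('.', '_'))
    structures[label] = ('fcc-si', a)  # placeholder for a StructureData node

print(sorted(structures))
```

The dictionary is then passed as the structures input when launching the workchain.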
Then we have defined some outputs:

- We attach all outputs of _next_workchain, i.e. the VASP workchain, to this workchain (basically a link is created to avoid storing things twice). Similar to the inputs, expose_outputs can be leveraged.
Finally, there is the outline, which tells the workchain which functions to run. In this case we have:

- A function that initializes what we need, initialize.
- A while loop that iterates over all the structures. For each iteration:
  - An init_next_workchain function that initializes what we need for the next workchain at the current iteration (here we make sure the specific structure is set).
  - A run_next_workchain, which basically executes the VASP workchain.
  - A verify_next_workchain, which verifies that there is a valid VASP workchain and inherits any exit codes present from the VASP workchain.
  - An extract_quantities, which extracts the quantities of interest at the current iteration.
- A finalize, which stores the results.
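Stripped of all AiiDA specifics, the outline is just an initialize/while/finalize control flow. As a toy illustration (not AiiDA code; method names mirror the outline, and the verify step is omitted for brevity), the pattern can be sketched like this:

```python
class EosSketch:
    """Toy stand-in for the workchain outline: initialize, loop, finalize."""

    def __init__(self, structures):
        self.structures = dict(structures)
        self.results = []
        self.is_finished = False

    def run(self):
        self.initialize()
        while self.run_next_workchains():
            self.init_next_workchain()
            self.run_next_workchain()
            self.extract_quantities()
        return self.finalize()

    def initialize(self):
        self.iteration = 0

    def run_next_workchains(self):
        # Loop until all structures have been consumed
        return not self.is_finished

    def init_next_workchain(self):
        self.iteration += 1
        # Pick the next structure, mimicking how the workchain sets it
        self.key, self.current = self.structures.popitem()
        if not self.structures:
            self.is_finished = True

    def run_next_workchain(self):
        # Stand-in for launching the VASP workchain; just "compute" something
        self.energy = -float(self.current)

    def extract_quantities(self):
        self.results.append((self.key, self.energy))

    def finalize(self):
        return self.results


results = EosSketch({'a': 1.0, 'b': 2.0}).run()
print(results)
```

In the real workchain each step additionally manipulates the context (self.ctx) and submits processes to the daemon, but the control flow is the same.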
The workchain is not completely ready. We need to extract the total energies and specify the general workchain further. Please make the following changes to the generated workchain:
--- /home/docs/checkouts/readthedocs.org/user_builds/aiida-vasp/checkouts/latest/tutorials/eos_base.py
+++ /home/docs/checkouts/readthedocs.org/user_builds/aiida-vasp/checkouts/latest/tutorials/eos.py
@@ -10,6 +10,8 @@
 # pylint: disable=attribute-defined-outside-init
 import random
 import numpy as np
+from scipy.optimize import minimize
+from scipy.interpolate import interp1d
 from aiida.common.extendeddicts import AttributeDict
 from aiida.engine import calcfunction, WorkChain, while_, append_
@@ -47,13 +49,13 @@
                 cls.init_next_workchain,
                 cls.run_next_workchain,
                 cls.verify_next_workchain,
-                cls.extract_quantities
+                cls.extract_volume_and_energy
             ),
             cls.finalize
         )  # yapf: disable
-        spec.output('quantities', valid_type=DataFactory('array'), help='a container for various quantities')
-        spec.expose_outputs(cls._next_workchain)
+        spec.output('eos', valid_type=DataFactory('array'), help='a list containing the cell volumes and total energies')
+        spec.output('eos_minimum', valid_type=DataFactory('dict'), help='a dictionary containing the cell volume at energy minimum')

     def initialize(self):
         """Initialize the eos workchain."""
@@ -73,13 +75,13 @@
         self.ctx.is_finished = False

         # Define an iteration index
-        self.ctx.interation = 0
+        self.ctx.iteration = 0

         # Define the context inputs
         self.ctx.inputs = AttributeDict()

-        # Define container to store quantities that is extracted in each step
-        self.ctx.quantities_container = []
+        # Define the total energies list
+        self.ctx.total_energies = []

     def _init_inputs(self):
         """Initialize inputs."""
@@ -101,7 +103,7 @@
     def init_next_workchain(self):
         """Initialize the next workchain."""

-        # Elavate iteraetion index
+        # Elevate iteration index
         self.ctx.iteration += 1

         # Check that the context inputs exists
@@ -156,42 +158,81 @@
         self.report('The called {}<{}> returned a non-zero exit status. '
                     'The exit status {} is inherited'.format(workchain.__class__.__name__, workchain.pk, self.ctx.exit_code))

-        # Stop further executions of workchains if there are no more structure
+        # Stop further execution of workchains if there are no more structure
         # entries in the structures dictionary
         if not self.ctx.structures:
             self.ctx.is_finished = True

         return self.ctx.exit_code

-    def extract_quantities(self):
-        """Extract the quantities you want and store them in the container."""
-
-        # What happens in this routine needs modifications by you.
-        # Typically you get the output of the called workchain by doing
-        # workchain = self.ctx.workchains[-1]
-        # some_output_quantity = workchain.outputs.some_output_quantity
-
-        # An example which stores nonsense.
-        self.ctx.quantities_container.append([self.ctx.iteration, 'Some quantity for iteration: {}'.format(self.ctx.iteration)])
+    def extract_volume_and_energy(self):
+        """Extract the cell volume and total energy for this structure."""
+
+        workchain = self.ctx.workchains[-1]
+
+        # Fetch the total energy
+        misc = workchain.outputs.misc.get_dict()
+        total_energy = misc['total_energies']['energy_extrapolated']
+
+        # Fetch the volume
+        volume = self.ctx.inputs.structure.get_cell_volume()
+
+        # Store both in a list
+        self.ctx.total_energies.append([volume, total_energy])

     def finalize(self):
         """
         Finalize the workchain.

-        Take the quantity container and set it as an output of this workchain.
+        Take the total energies container and set it as an output of this workchain.
         """
         # Due to data provenance we cannot return AiiDA data containers that have
         # not been passed through a calcfunction, workfunction or a workchain. Create this now.
-        quantities_container = store_quantities(DataFactory('list')(list=self.ctx.quantities_container))
+        total_energies = store_total_energies(DataFactory('list')(list=self.ctx.total_energies))
+
+        # Let us also try to find a better minimum, just as an example using the power of Python
+        energy = locate_minimum(total_energies)

         # And then store the output on the workchain
-        self.out('quantities', quantities_container)
+        self.out('eos', total_energies)
+        self.out('eos_minimum', energy)


 @calcfunction
-def store_quantities(quantities_container):
-    """Stores the quantities to keep data provenance."""
-    quantities_container_list = quantities_container.get_list()
-    quantities_container_array = DataFactory('array')()
-    quantities_container_array.set_array('quantities', np.array(quantities_container_list))
-    return quantities_container_array
+def store_total_energies(total_energies):
+    """Stores the total energies in ArrayData to keep data provenance."""
+    total_energies_list = total_energies.get_list()
+    # Let us also sort by volume, as we picked the entries in the structures at random above
+    total_energies_array = np.array(total_energies_list)
+    total_energies_array_sorted = total_energies_array[total_energies_array[:, 0].argsort()]
+    array_data = DataFactory('array')()
+    array_data.set_array('eos', total_energies_array_sorted)
+
+    return array_data
+
+
+@calcfunction
+def locate_minimum(total_energies):
+    """Locate the volume with the lowest energy using interpolation."""
+    total_energies_array = total_energies.get_array('eos')
+    volumes = total_energies_array[:, 0]
+    energies = total_energies_array[:, 1]
+
+    # Establish some initial guess (take the lowest energy point from the original dataset)
+    min_energy_guess_index = np.argmin(energies)
+
+    # Create the function that can be used to extract interpolated values.
+    # We use cubic interpolation here, which is not necessarily physically correct and
+    # only serves as an example. Please have a look at the theory behind more realistic
+    # models that can be used to fit such data.
+    new_energies = interp1d(volumes, energies, kind='cubic')
+
+    # Use minimize from scipy to locate the minimum of the interpolated function
+    minimized_point = minimize(new_energies, volumes[min_energy_guess_index], tol=1e-3)
+
+    # Create a dictionary to house the results and return
+    dict_data = DataFactory('dict')(dict={'volume': minimized_point.x[0], 'energy': minimized_point.fun})
+
+    return dict_data
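The locate_minimum calcfunction above relies on SciPy, but the underlying idea is simple: interpolate the discrete points and minimize the interpolant. As a dependency-free illustration of the same idea (our own simplification, not the tutorial's actual fit), one can put a parabola through the three points bracketing the lowest energy and take its vertex. The data points below are taken from the equation of state produced later in this tutorial; like the cubic fit, a quadratic is only an example, not a physically motivated equation of state:

```python
def parabola_vertex(p1, p2, p3):
    """Return the (x, y) vertex of the parabola through three (x, y) points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Newton's divided differences for the interpolating quadratic
    d1 = (y2 - y1) / (x2 - x1)
    d2 = (y3 - y2) / (x3 - x2)
    a = (d2 - d1) / (x3 - x1)
    # y'(x) = d1 + a * (2x - x1 - x2) = 0 at the vertex
    x_min = 0.5 * (x1 + x2) - d1 / (2.0 * a)
    y_min = y1 + d1 * (x_min - x1) + a * (x_min - x1) * (x_min - x2)
    return x_min, y_min


# Three volume/energy points around the minimum of the dataset
points = [(13.718, -4.86303425), (14.82975, -4.87588342), (16.0, -4.8481406)]
volume, energy = parabola_vertex(*points)
print(volume, energy)
```

The vertex lands close to, but not exactly at, the value the cubic interpolation gives, which illustrates why the choice of model matters when fitting equation of state data.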
and save it as eos.py. Or you can download it:

$ wget https://github.com/aiida-vasp/aiida-vasp/raw/develop/tutorials/eos.py
The majority of changes were related to being more specific, except for two things:

- The necessity of decorating the function that generates the output array containing the volumes and total energies as a calcfunction. This preserves data provenance; otherwise we would not know how the data was collected from each of the underlying workchains.
- The inclusion of a calcfunction that interpolates the calculated data to find an even better estimate of the volume at the energy minimum. The example uses a cubic fit, which is certainly not very physical and should not be used in production. It only shows how simple it is to leverage the power of Python, NumPy and SciPy. Again, this was decorated as a calcfunction in order to make sure AiiDA can preserve the data provenance.
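Conceptually, a calcfunction wraps an ordinary Python function so that every call, its inputs and its output are recorded in the provenance graph. A toy illustration of that idea (this is emphatically not AiiDA's implementation; the decorator name, the record format and the sort_by_volume helper are all made up for this sketch):

```python
import functools

PROVENANCE = []  # toy provenance "graph": (function name, inputs, output) records


def toy_calcfunction(func):
    """Toy stand-in for a provenance-recording decorator.

    This only illustrates the concept; AiiDA's real @calcfunction creates
    CalcFunctionNodes and links them into the provenance graph in the database.
    """
    @functools.wraps(func)
    def wrapper(*args):
        result = func(*args)
        PROVENANCE.append((func.__name__, args, result))
        return result
    return wrapper


@toy_calcfunction
def sort_by_volume(pairs):
    """Sort (volume, energy) pairs by volume, like store_total_energies does."""
    return sorted(pairs, key=lambda pair: pair[0])


data = sort_by_volume([(16.0, -4.85), (13.7, -4.86), (14.8, -4.88)])
print(data)
print(PROVENANCE[0][0])
```

Because the call is recorded, anyone inspecting the "graph" afterwards can see which function produced the sorted data and from which inputs, which is exactly the guarantee AiiDA gives for decorated calcfunctions.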
Next, download the launch script that is tailored to launch the workchain we have now developed:
$ wget https://github.com/aiida-vasp/aiida-vasp/raw/develop/tutorials/run_fcc_si_workchain.py
Change the options and code_string as you did previously.

Now we need to make sure the daemon can pick up the workchain. We do this by making sure the daemon sees the directory where eos.py and run_fcc_si_workchain.py are located. The simplest approach is to add the following to your virtual environment's activate script (assuming you do not use Conda):

$ echo "export PYTHONPATH=$PYTHONPATH:<yourdirectory>" >> ~/env/aiida-vasp/bin/activate
Here <yourdirectory> is the directory containing the eos.py and run_fcc_si_workchain.py files. The location of the activate script is assumed from the previous steps in the tutorial. If you use Conda, please do instead:

$ echo "export PYTHONPATH=$PYTHONPATH:<yourdirectory>" >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
Submit the workchain by running the launch script:

$ python run_fcc_si_workchain.py
After a while we check the status:
$ verdi process list -a
    PK  Created    Process label         Process State     Process status
------  ---------  --------------------  ----------------  ---------------
103573  14m ago    EosWorkChain          ⏹ Finished [0]
103574  14m ago    VaspWorkChain         ⏹ Finished [0]
103575  14m ago    VaspCalculation       ⏹ Finished [0]
103579  13m ago    VaspWorkChain         ⏹ Finished [0]
103580  13m ago    VaspCalculation       ⏹ Finished [0]
103584  12m ago    VaspWorkChain         ⏹ Finished [0]
103585  12m ago    VaspCalculation       ⏹ Finished [0]
103589  11m ago    VaspWorkChain         ⏹ Finished [0]
103590  11m ago    VaspCalculation       ⏹ Finished [0]
103594  10m ago    VaspWorkChain         ⏹ Finished [0]
103595  10m ago    VaspCalculation       ⏹ Finished [0]
103599  9m ago     VaspWorkChain         ⏹ Finished [0]
103600  9m ago     VaspCalculation       ⏹ Finished [0]
103604  8m ago     VaspWorkChain         ⏹ Finished [0]
103605  8m ago     VaspCalculation       ⏹ Finished [0]
103609  7m ago     VaspWorkChain         ⏹ Finished [0]
103610  7m ago     VaspCalculation       ⏹ Finished [0]
103614  5m ago     VaspWorkChain         ⏹ Finished [0]
103615  5m ago     VaspCalculation       ⏹ Finished [0]
103620  4m ago     store_total_energies  ⏹ Finished [0]
103622  4m ago     locate_minimum        ⏹ Finished [0]
As you can see, nine VASP calculations were performed, one for each supplied volume. There is also a separate entry for the storage of the total energies, which also performs a sort, and one for locating the minimum. In this way, all generated results are trackable and data provenance is preserved.
Let us have a look at the output of EosWorkChain:

$ verdi process show 103573
Property       Value
-------------  ------------------------------------
type           WorkChainNode
pk             103573
uuid           1055b7ac-e02e-4510-9ed8-62177a9c2bd1
label
description
ctime          2019-10-08 12:50:46.739221+00:00
mtime          2019-10-08 13:00:47.530855+00:00
process state  Finished
exit status    0
computer       [6] mycluster

Inputs                    PK  Type
--------------------  ------  -------------
structures
    silicon_at_4_3    103563  StructureData
    silicon_at_4_2    103562  StructureData
    silicon_at_4_1    103561  StructureData
    silicon_at_4_0    103560  StructureData
    silicon_at_3_9    103559  StructureData
    silicon_at_3_8    103558  StructureData
    silicon_at_3_7    103557  StructureData
    silicon_at_3_6    103556  StructureData
    silicon_at_3_5    103555  StructureData
clean_workdir         103572  Bool
code                  101271  Code
kpoints               103564  KpointsData
max_iterations        103571  Int
options               103568  Dict
parameters            103565  Dict
potential_family      103566  Str
potential_mapping     103567  Dict
settings              103569  Dict
verbose               103570  Bool

Outputs          PK  Type
-----------  ------  ---------
eos          103621  ArrayData
eos_minimum  103623  Dict

Called      PK  Type
------  ------  ----------------
CALL    103622  CalcFunctionNode
CALL    103620  CalcFunctionNode
CALL    103614  WorkChainNode
CALL    103609  WorkChainNode
CALL    103604  WorkChainNode
CALL    103599  WorkChainNode
CALL    103594  WorkChainNode
CALL    103589  WorkChainNode
CALL    103584  WorkChainNode
CALL    103579  WorkChainNode
CALL    103574  WorkChainNode

Log messages
---------------------------------------------
There are 9 log messages for this calculation
Run 'verdi process report 103573' to see them
Inspect the total energies versus volume:
$ verdi data array show 103621
{
    "eos": [
        [10.71875, -4.42341934],
        [11.664, -4.66006377],
        [12.66325, -4.79595549],
        [13.718, -4.86303425],
        [14.82975, -4.87588342],
        [16.0, -4.8481406],
        [17.23025, -4.78451894],
        [18.522, -4.69228806],
        [19.87675, -4.58122037]
    ]
}
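As a quick sanity check, one can pick the lowest raw data point directly from this array; this is also what locate_minimum uses as its initial guess before refining it with interpolation. A plain-Python sketch using the values shown above:

```python
# The volume/energy pairs from the eos output above
eos = [
    [10.71875, -4.42341934],
    [11.664, -4.66006377],
    [12.66325, -4.79595549],
    [13.718, -4.86303425],
    [14.82975, -4.87588342],
    [16.0, -4.8481406],
    [17.23025, -4.78451894],
    [18.522, -4.69228806],
    [19.87675, -4.58122037],
]

# Pick the entry with the lowest total energy
volume, energy = min(eos, key=lambda pair: pair[1])
print(volume, energy)
```

Note that the refined, interpolated minimum stored in eos_minimum sits at a slightly different volume than this raw grid point.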
And the located minimum:
$ verdi data dict show 103623
{
    "energy": -4.8769539175,
    "volume": 14.55935661641
}
That concludes this tutorial. We hope that at this point you have realized that AiiDA-VASP is quite useful and that you would like to continue to learn more, maybe even start to write your own workflows or workchains.