Writing workchains
Before we start
This document briefly describes how to write workchains that can be used as standalone workflows or composed into more extensive ones. It assumes the reader is familiar with the AiiDA documentation on how to construct a workchain, as well as with how VASP operates.
Modularity
Workchains are modular and can be nested indefinitely. This means it makes sense to build a hierarchy of reusable Workchains, where the more complex Workflows reuse a set of fundamental ones. This example will show how to build a basic workchain, but the process of building more complex ones is similar.
Reusable skeleton code
From the AiiDA documentation you should be familiar with this code. It runs the VASP workchain with most input parameters inherited from the definition in the VASP workchain, but overrides the structure input. The point is to demonstrate building inputs on the fly in a workchain, so that a user can call the workchain constructed here with minimal input parameters. Say that we always want to do calculations on the same structure (this could of course be a list of structures, or the whole set of structures in, say, the Materials Project database). Since the VASP workchain requires a structure as an input, the example below redefines that input so that the example workchain itself can specify the structure. Consequently, when the user calls the example workchain, they do not have to supply a structure; a default is chosen if they do not:
import numpy

from aiida.common.extendeddicts import AttributeDict
from aiida.engine import WorkChain, append_
from aiida.plugins import DataFactory, WorkflowFactory


class ExampleVaspWorkchain(WorkChain):
    """Run the VASP workchain, generating a default structure if none is supplied."""

    _next_workchain = WorkflowFactory('vasp.vasp')

    @classmethod
    def define(cls, spec):
        super(ExampleVaspWorkchain, cls).define(spec)
        # Inherit every input of the VASP workchain except the structure.
        spec.expose_inputs(cls._next_workchain, exclude=['structure'])
        # The structure is optional here: a default is generated if it is missing.
        spec.input('structure', valid_type=(DataFactory('structure'), DataFactory('cif')), required=False)
        spec.outline(
            cls.generate_missing_inputs,
            cls.generate_structure,
            cls.run_next_workchain,
            cls.verify_next_workchain,
            cls.results,
        )
        spec.expose_outputs(cls._next_workchain)
        spec.exit_code(420, 'ERROR_NO_CALLED_WORKCHAIN', message='no called workchain detected')

    def generate_missing_inputs(self):
        """Here we create the inputs required to run the 'vasp.vasp' workchain and store them in the context."""
        self.ctx.inputs = AttributeDict()
        self.ctx.inputs.update(self.exposed_inputs(self._next_workchain))

    def generate_structure(self):
        """Here we generate the structure if it is missing from the input."""
        try:
            self.ctx.inputs.structure = self.inputs.structure
        except AttributeError:
            # Generate an example silicon structure
            structure_class = DataFactory('structure')
            alat = 5.4
            structure = structure_class(cell=numpy.array([[.5, 0, .5], [.5, .5, 0], [0, .5, .5]]) * alat)
            structure.append_atom(position=numpy.array([0.0, 0.0, 0.0]) * alat, symbols='Si')
            structure.append_atom(position=numpy.array([.25, .25, .25]) * alat, symbols='Si')
            self.ctx.inputs.structure = structure

    def run_next_workchain(self):
        """Submit the next_workchain and append it to the list of called workchains in the context."""
        running = self.submit(self._next_workchain, **self.ctx.inputs)
        return self.to_context(workchains=append_(running))

    def verify_next_workchain(self):
        """Make sure the next_workchain was actually called before its results are attached."""
        try:
            self.ctx.workchains[-1]
        except (AttributeError, IndexError):
            self.report('Could not find the next_workchain.')
            return self.exit_codes.ERROR_NO_CALLED_WORKCHAIN

    def results(self):
        """Attach all outputs from next_workchain to this workchain."""
        workchain = self.ctx.workchains[-1]
        self.out_many(self.exposed_outputs(workchain, self._next_workchain))
This example uses the VASP workchain to run a single VASP calculation with its defaults. Please also consult the example files in the examples folder, which call the bundled workchains.
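The fallback logic of generate_structure can be tried outside AiiDA. The following is a minimal sketch in plain Python and is not part of AiiDA-VASP: default_silicon, resolve_structure and cell_volume are hypothetical helper names, and the structure is represented by a plain dictionary rather than a StructureData node.

```python
ALAT = 5.4  # lattice constant in angstrom, as in the example workchain


def default_silicon(alat=ALAT):
    """Return the primitive fcc cell and the two-atom silicon basis as plain lists."""
    half = 0.5 * alat
    quarter = 0.25 * alat
    cell = [[half, 0.0, half],
            [half, half, 0.0],
            [0.0, half, half]]
    sites = [('Si', [0.0, 0.0, 0.0]),
             ('Si', [quarter, quarter, quarter])]
    return {'cell': cell, 'sites': sites}


def resolve_structure(inputs):
    """Use the structure supplied by the caller, otherwise fall back to the default."""
    structure = inputs.get('structure')
    if structure is None:
        structure = default_silicon()
    return structure


def cell_volume(cell):
    """Cell volume as the absolute value of the scalar triple product a . (b x c)."""
    a, b, c = cell
    cross = [b[1] * c[2] - b[2] * c[1],
             b[2] * c[0] - b[0] * c[2],
             b[0] * c[1] - b[1] * c[0]]
    return abs(sum(x * y for x, y in zip(a, cross)))


# No structure supplied: the silicon default is used.
structure = resolve_structure({})
print(len(structure['sites']), cell_volume(structure['cell']))
```

The primitive fcc cell has a quarter of the volume of the conventional cube, which gives a quick sanity check on the lattice vectors used in the workchain.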
Determine the inputs and outputs
One of the first questions in designing a workchain should be which inputs are required and which outputs should be generated. A workchain might provide defaults for everything and work as a passthrough, it might set up all inputs or outputs, or only parts of them. When writing a new workchain, one should thus first define these and write the corresponding spec.input and spec.output calls.
You may encounter a conflict in spec.input; that is, several parameters may correspond to the same variable in a workchain. In that case, consider following the overriding policy of AiiDA-VASP, described in the Parameters section.
Determine the required steps
It is helpful to sketch out a flow diagram before writing a workchain. How to translate such a flow diagram into a spec.outline should be clear from the AiiDA documentation. One should take care to factor out components and avoid writing very large workchains to realize a workflow. Segmenting the problem into its steps, inputs and outputs makes it more likely that the workchain can be reused in other workflows.
Detailed usage of VaspWorkChain
A note about compatibility: WorkChains provide a handy pattern for interactively building input sets, both under AiiDA < 1.0.0 and from AiiDA 1.0.0a1 onwards. The two are very similar, but different enough that a Python dictionary or aiida.common.extendeddicts.AttributeDict is recommended instead in scripts that need to be compatible with both.
Required inputs
The VaspWorkChain requires a number of inputs. These comprise the minimum set of information needed to run a VASP calculation from AiiDA.
code: an AiiDA InstalledCode, describes the VASP executable and holds a reference to the Computer instance on which it lives.
structure: an AiiDA StructureData or CifData, describes the structure on which VASP is to be run.
kpoints: an AiiDA KpointsData instance, describing the kpoints mesh or path.
potential_family: an AiiDA Str, the name given to a set of uploaded POTCAR files.
potential_mapping: an AiiDA Dict, containing an entry for at least every kind name in the structure input with the full name of the POTCAR from the potential_family. Example: {'In1': 'In_d', 'In2': 'In_h'}.
incar: an AiiDA Dict instance, containing key/value pairs that get written to INCAR as KEY = VALUE. Keys can be lower case and builtin Python types should be used for values.
options: an AiiDA Dict instance, containing at least the keys resources and queue_name. More information about calculation options is available in the AiiDA documentation.
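Before submitting, it can help to check that an input set covers the requirements listed above. The sketch below does this for a plain Python dictionary. It is an illustration only and not part of AiiDA-VASP: check_inputs is a hypothetical helper, and the code label, potential family name and placeholder strings are made up where real AiiDA nodes would be passed.

```python
REQUIRED_INPUTS = ('code', 'structure', 'kpoints', 'potential_family',
                   'potential_mapping', 'incar', 'options')
REQUIRED_OPTIONS = ('resources', 'queue_name')


def check_inputs(inputs, kind_names):
    """Return a list of problems; an empty list means the set looks complete."""
    problems = ['missing input: ' + key
                for key in REQUIRED_INPUTS if key not in inputs]
    # Every kind name in the structure needs an entry in potential_mapping.
    mapping = inputs.get('potential_mapping', {})
    problems += ['no POTCAR mapped for kind: ' + kind
                 for kind in kind_names if kind not in mapping]
    # The options must contain at least resources and queue_name.
    options = inputs.get('options', {})
    problems += ['missing option: ' + key
                 for key in REQUIRED_OPTIONS if key not in options]
    return problems


inputs = {
    'code': 'vasp@mycluster',            # hypothetical code label
    'structure': 'structure-placeholder',
    'kpoints': 'kpoints-placeholder',
    'potential_family': 'my_potcars',    # hypothetical family name
    'potential_mapping': {'In1': 'In_d'},
    'incar': {'encut': 240, 'ismear': 0},
    'options': {'resources': {'num_machines': 1}},
}

problems = check_inputs(inputs, kind_names=['In1', 'In2'])
for problem in problems:
    print(problem)
```

Here the kind In2 has no POTCAR assigned and queue_name is missing from the options, so both are reported.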
Optional inputs
Optional inputs are not required and can be used to change aspects of the VASP run:
wavecar: an instance of WavefunData (factory string: vasp.wavefun). Used to pass wavefunctions from a previous run to a follow-up calculation.
chgcar: an instance of ChargedensityData (factory string: vasp.chargedensity). Used to pass charge densities calculated in a previous run.
settings: Dict, contains additional settings for AiiDA-side aspects of the VASP calculation, like additional files to retrieve, optional quantities to be parsed, etc.
Outputs
The outputs, if no additional ones are requested using the settings input, are:
parameters: Dict, scalar and low-dimensional vector quantities, like energies, forces, etc., parsed from OUTCAR and vasprun.xml
structure: StructureData, what VASP outputs in CONTCAR
retrieved: FolderData containing the retrieved files
remote_folder: RemoteData containing information about the remote work folder in which VASP was run