1. Introduction¶
Here we will introduce the plugin. We will now follow the FCC Si example in the VASP tutorials tutorial. We will obtain the same data, but using different strategies so that you as a user will be familiar with what is going on with the plugin. If you are a familiar VASP user what we will go through should be well known. The FCC Si tutorial is all about calculating the total energies at different volumes to find the volume that gives you the lowest energy.
In this tutorial we will complete the original FCC Si tutorial without using AiiDA-VASP and then continue to execute the first static (fixed volume) calculation with AiiDA-VASP in order to get to know how the submission, monitoring and inspection process works. In the next tutorial will start to perform calculations at different volumes.
Before starting, we would like you to get a bit familiar with the concepts of AiiDA, so please have a look at AiiDA concepts before continuing.
First complete the FCC Si tutorial without using AiiDA or the AiiDA-VASP plugin.
Enable your AiiDA virtual environment where AiiDA-VASP is installed.
Make sure your AiiDA daemon runs:
$ verdi daemon start
We will now following the lines of the FCC Si tutorial, but using AiiDA-VASP. In the process we will be touching different strategies to you get a feel for how you can structurize a simple workflow. Let us fetch the AiiDA-VASP run file for this example:
$ wget https://github.com/aiida-vasp/aiida-vasp/raw/master/tutorials/run_fcc_si_one_volume.py
Inspect the file, which has the following content:
""" Call script to calculate the total energies for one volume of standard silicon. This particular call script set up a standard calculation that execute a calculation for the fcc silicon structure. """ # pylint: disable=too-many-arguments import numpy as np from aiida import load_profile from aiida.common.extendeddicts import AttributeDict from aiida.engine import submit from aiida.orm import Bool, Code, Str from aiida.plugins import DataFactory, WorkflowFactory load_profile() def get_structure(): """ Set up Si primitive cell fcc Si: 3.9 0.5000000000000000 0.5000000000000000 0.0000000000000000 0.0000000000000000 0.5000000000000000 0.5000000000000000 0.5000000000000000 0.0000000000000000 0.5000000000000000 Si 1 Cartesian 0.0000000000000000 0.0000000000000000 0.0000000000000000 """ structure_data = DataFactory('structure') alat = 3.9 lattice = np.array([[.5, .5, 0], [0, .5, .5], [.5, 0, .5]]) * alat structure = structure_data(cell=lattice) for pos_direct in [[0.0, 0.0, 0.0]]: pos_cartesian = np.dot(pos_direct, lattice) structure.append_atom(position=pos_cartesian, symbols='Si') return structure def main(code_string, incar, kmesh, structure, potential_family, potential_mapping, options): """Main method to setup the calculation.""" # First, we need to fetch the AiiDA datatypes which will # house the inputs to our calculation dict_data = DataFactory('dict') kpoints_data = DataFactory('array.kpoints') # Then, we set the workchain you would like to call workchain = WorkflowFactory('vasp.vasp') # And finally, we declare the options, settings and input containers settings = AttributeDict() inputs = AttributeDict() # Set inputs for the following WorkChain execution # Code inputs.code = Code.get_from_string(code_string) # Structure inputs.structure = structure # k-points grid density kpoints = kpoints_data() kpoints.set_kpoints_mesh(kmesh) inputs.kpoints = kpoints # Parameters inputs.parameters = dict_data(dict=incar) # Potential family and their mapping between element and potential type to use inputs.potential_family = Str(potential_family) inputs.potential_mapping = dict_data(dict=potential_mapping) # Options inputs.options = dict_data(dict=options) # Settings inputs.settings = dict_data(dict=settings) # Workchain related inputs, in this case, give more explicit output to report inputs.verbose = Bool(True) # Submit the workchain with the set inputs submit(workchain, **inputs) if __name__ == '__main__': # Code_string is chosen among the list given by 'verdi code list' CODE_STRING = 'vasp@mycluster' # INCAR equivalent # Set input parameters INCAR = {'incar': {'istart': 0, 'icharg': 2, 'encut': 240, 'ismear': 0, 'sigma': 0.1}} # KPOINTS equivalent # Set kpoint mesh KMESH = [11, 11, 11] # POTCAR equivalent # Potential_family is chosen among the list given by # 'verdi data vasp-potcar listfamilies' POTENTIAL_FAMILY = 'PBE.54' # The potential mapping selects which potential to use, here we use the standard # for silicon, this could for instance be {'Si': 'Si_GW'} to use the GW ready # potential instead POTENTIAL_MAPPING = {'Si': 'Si'} # Jobfile equivalent # In options, we typically set scheduler options. # See https://aiida.readthedocs.io/projects/aiida-core/en/latest/scheduler/index.html # AttributeDict is just a special dictionary with the extra benefit that # you can set and get the key contents with mydict.mykey, instead of mydict['mykey'] OPTIONS = AttributeDict() OPTIONS.account = '' OPTIONS.qos = '' OPTIONS.resources = {'num_machines': 1, 'num_mpiprocs_per_machine': 1} OPTIONS.queue_name = '' OPTIONS.max_wallclock_seconds = 3600 OPTIONS.max_memory_kb = 2000000 # POSCAR equivalent # Set the silicon structure STRUCTURE = get_structure() main(CODE_STRING, INCAR, KMESH, STRUCTURE, POTENTIAL_FAMILY, POTENTIAL_MAPPING, OPTIONS)
Change the
CODE_STRING
based on the code and computer you have stored. This can be inspected withverdi code list
.Change the following to comply with requirements of your cluster or your project:
options.account = '' options.qos = '' options.resources = {'num_machines': 1, 'num_mpiprocs_per_machine': 8} options.queue_name = ''
For example, if you use a SGE scheduler, you need to modify resources as follows:
resources = {'num_machines': 1, 'tot_num_mpiprocs': 8, 'parallel_env': 'mpi*'} # for SGE
Please consult the documentation on AiiDA job resources and adjust accordingly.
Save and execute the resulting run script by issuing:
$ python run_fcc_si_one_volume.py
Check its progress with:
$ verdi process list PK Created Process label Process State Process status ---- --------- --------------- ---------------- ----------------------------------- 883 14s ago VaspWorkChain ⏵ Waiting Waiting for child processes: 885 885 13s ago VaspCalculation ⏵ Waiting Waiting for transport task: upload Total results: 2 Report: last time an entry changed state: 18s ago (at 14:29:24 on 2022-12-19) Report: Using 0% of the available daemon worker slots.
By running
verdi process list
we get a list of all active processes. Depending on when you run this command, yourPK
and specific information shown might be different, but the key observation is that we launched a VASP workchain, which is the main entrypoint for launching a simple VASP calculation. This launches a VASP calculation which is the process in AiiDA which launches the actual VASP calculation. This is presently in theWaiting for transport task: upload
process status, meaning, that it is currently uploading results to the computer.Note
Notice that the
verdi process list
only lists the active processes. Hopefully, after a while the launched processes will complete without errors and it will not be visible withverdi process list
any more as finished processes are not considered active. In order to also list these processes we can useverdi process list -a
.After a while, we execute
verdi process list -a
and get:$ verdi process list -a PK Created Process label Process State Process status ---- --------- --------------- ---------------- ----------------------------------- 883 8m ago VaspWorkChain ⏹ Finished [0] 885 8m ago VaspCalculation ⏹ Finished [0] Total results: 2 Report: last time an entry changed state: 6m ago (at 17:07:04 on 2022-12-19) Report: Using 0% of the available daemon worker slots.
The processes composing the workflow are now finished. And there is a zero inside the brackets. This shows the exit code, and usual practice is that a zero is a sign of a successfully process execution. Please consult the documentation of the AiiDA exit codes for more details. AiiDA defines a few internal exit codes and the AiiDA-VASP plugin adds to those.
From the finished state we can conclude that your VASP calculation, or workflow is done.
Let us have a look at what happened during the execution of the workflow. We typically inspect the topmost, i.e. the workchain with the lowest PK as a starting point, here
883
. Let us look at logs, or report:$ verdi process report 883 2022-12-19 17:04:54 [85 | REPORT]: [883|VaspWorkChain|run_process]: launching VaspCalculation<885> iteration #1 2022-12-19 17:07:04 [86 | REPORT]: [883|VaspWorkChain|results]: work chain completed after 1 iterations 2022-12-19 17:07:06 [87 | REPORT]: [883|VaspWorkChain|on_terminated]: cleaned remote folders of calculations: 885
Nothing particularly interesting and what you would expect.
Note
Notice that the logs states that the remote folders was cleaned. The default setting of the plugin is to, after the VASP workchain is finished with a zero exit code to clean the remote folder. The remote folder is the folder on the computer running the calculations. Typically this is the remote configured cluster for VASP calculations. Consult the documentation of VASP workchain how to modify this behavior if you want to change the default setting.
Let us have a look at what is stored on the VASP workchain with
PK
of883
. The topmost workchain typically contain the relevant output of the workflow calculation:$ verdi process show 883 Property Value ----------- ------------------------------------ type VaspWorkChain state Finished [0] pk 883 uuid 0c769ee8-07dc-410b-b1eb-7975ca7e7029 label description ctime 2022-12-19 17:04:53.027011+01:00 mtime 2022-12-19 17:07:04.374171+01:00 Inputs PK Type ----------------- ---- ------------- clean_workdir 882 Bool code 818 InstalledCode kpoints 874 KpointsData max_iterations 881 Int options 878 Dict parameters 875 Dict potential_family 876 Str potential_mapping 877 Dict settings 879 Dict structure 873 StructureData verbose 880 Bool Outputs PK Type ------------- ---- ---------- misc 888 Dict remote_folder 886 RemoteData retrieved 887 FolderData Called PK Type ------------ ---- --------------- iteration_01 885 VaspCalculation Log messages --------------------------------------------- There are 3 log messages for this calculation Run 'verdi process report 883' to see them
Here you can see the inputs and outputs of your workflow, which is attached as outputs on a workchain. You can also observe the inputs and what other processes have been called, or called this process.
Note
Most things in AiiDA that are stored are considered a node and we will continue to use this terminology and it does not matter if this is an input, output, or a process node, like
VaspWorkChain
. If you see aPK
it is for sure a node. Please, at this point, reconsider if you need to fresh up on the concepts of AiiDA as explained in AiiDA concepts.Note
Most nodes can after being stored, which typically is the case when it is passed to or from a process, like a workchain or the special calculation process
VaspCalculation
not be modified. This is a natural consequence of honoring the data provenance concept. At first this can seem a bit frustrating, i.e. if you define a computer, which is also considered a node, do some calculations with this computer and find out you have to change it, you cannot. You have to create a new computer with the modified settings. After a while this will come as second nature, but takes a bit of getting used to. In the end, if you want data provenance, there is really no other good alternative to this.Note
Notice that there are three outputs. The
remote_folder
gives the path of the remote folder (which is cleaned by default if the workflow is considered successful), theretrieved
, which is the folder containing the retrieved and kept VASP files andmisc
. For the VASP workchain, this is the default. If you want to modify what is attached in the output, please consult the documentation on Parsing.Let us inspect the
misc
output:$ verdi data core.dict show 888 { "maximum_force": 0.0, "maximum_stress": 20.22402923, "notifications": [], "run_stats": { "average_memory_used": null, "elapsed_time": 2.547, "maximum_memory_used": 47700.0, "mem_usage_base": 30000.0, "mem_usage_fftplans": 296.0, "mem_usage_grid": 451.0, "mem_usage_nonl-proj": 493.0, "mem_usage_one-center": 3.0, "mem_usage_wavefun": 779.0, "system_time": 0.22, "total_cpu_time_used": 0.827, "user_time": 0.607 }, "run_status": { "consistent_nelm_breach": false, "contains_nelm_breach": false, "electronic_converged": true, "finished": true, "ionic_converged": null, "last_iteration_index": [ 1, 8 ], "nbands": 6, "nelm": 60, "nsw": 0 }, "total_energies": { "energy_extrapolated": -4.87588357, "energy_extrapolated_electronic": -4.87588357 }, "version": "6.3.2" }
As you can see, this contains the
maximum_force
,maximum_stress
andtotal_energies
in standard VASP units. The containermisc
is used to house quantities that are not system size dependent (with size, we also mean grid sizes etc., like the k-point grid, or number of atoms). Inmisc
you also have access to useful run time statistics (mainly what is printed inOUTCAR
) and run status data.