2. Iterations

We continue from the previous tutorial, where we performed the initial setup and a test run at a single, static volume for the FCC Si example in the VASP tutorials. Here we will calculate the total energies at different volumes. Please make sure you have completed the previous tutorial before continuing.

  1. Let us first fetch the AiiDA-VASP run script covering multiple volumes:

    $ wget https://github.com/aiida-vasp/aiida-vasp/raw/master/tutorials/run_fcc_si_multiple_volumes.py
    
  2. Inspect the file, which has the following changes compared to the file used in the previous tutorial:

    --- /home/docs/checkouts/readthedocs.org/user_builds/aiida-vasp/checkouts/latest/tutorials/run_fcc_si_one_volume.py
    +++ /home/docs/checkouts/readthedocs.org/user_builds/aiida-vasp/checkouts/latest/tutorials/run_fcc_si_multiple_volumes.py
    @@ -1,8 +1,8 @@
     """
    -Call script to calculate the total energies for one volume of standard silicon.
    +Call script to calculate the total energies for different volumes of the silicon structure.
     
    -This particular call script set up a standard calculation that execute a calculation for
    -the fcc silicon structure.
    +This particular call script set up a standard calculation for each structure and submits
    +each of them.
     """
     # pylint: disable=too-many-arguments
     import numpy as np
    @@ -16,12 +16,12 @@
     load_profile()
     
     
    -def get_structure():
    +def get_structure(alat):
         """
         Set up Si primitive cell
     
         fcc Si:
    -       3.9
    +       alat
            0.5000000000000000    0.5000000000000000    0.0000000000000000
            0.0000000000000000    0.5000000000000000    0.5000000000000000
            0.5000000000000000    0.0000000000000000    0.5000000000000000
    @@ -33,7 +33,6 @@
         """
     
         structure_data = DataFactory('structure')
    -    alat = 3.9
         lattice = np.array([[.5, .5, 0], [0, .5, .5], [.5, 0, .5]]) * alat
         structure = structure_data(cell=lattice)
         for pos_direct in [[0.0, 0.0, 0.0]]:
    @@ -110,13 +109,15 @@
         OPTIONS = AttributeDict()
         OPTIONS.account = ''
         OPTIONS.qos = ''
    -    OPTIONS.resources = {'num_machines': 1, 'num_mpiprocs_per_machine': 1}
    +    OPTIONS.resources = {'num_machines': 1, 'num_mpiprocs_per_machine': 8}
         OPTIONS.queue_name = ''
         OPTIONS.max_wallclock_seconds = 3600
         OPTIONS.max_memory_kb = 2000000
     
    -    # POSCAR equivalent
    -    # Set the silicon structure
    -    STRUCTURE = get_structure()
    +    # Iterate over each lattice constant and pass it explicitly
    +    for lattice_constant in [3.5, 3.6, 3.7, 3.8, 3.9, 4.0, 4.1, 4.2, 4.3]:
    +        # POSCAR equivalent
    +        # Set the silicon structure
    +        STRUCTURE = get_structure(lattice_constant)
     
    -    main(CODE_STRING, INCAR, KMESH, STRUCTURE, POTENTIAL_FAMILY, POTENTIAL_MAPPING, OPTIONS)
    +        main(CODE_STRING, INCAR, KMESH, STRUCTURE, POTENTIAL_FAMILY, POTENTIAL_MAPPING, OPTIONS)
    

    As you can see, the changes are minimal and should be self-explanatory. In essence we perform the same workchain as before, except that we now iterate, launching one workchain per lattice constant, each supplied with a different volume of the unit cell. The assembled loop is sketched below for reference.
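
    The following sketch shows roughly how the tail of run_fcc_si_multiple_volumes.py reads after applying the diff above; consult the fetched file for the authoritative version:

    # CODE_STRING, INCAR, KMESH, POTENTIAL_FAMILY, POTENTIAL_MAPPING and OPTIONS
    # are defined earlier in the script, exactly as in the previous tutorial.
    # Iterate over each lattice constant and pass it explicitly
    for lattice_constant in [3.5, 3.6, 3.7, 3.8, 3.9, 4.0, 4.1, 4.2, 4.3]:
        # POSCAR equivalent: build the silicon structure for this lattice constant
        STRUCTURE = get_structure(lattice_constant)
        # main() submits the workchain, so the nine runs execute in parallel
        main(CODE_STRING, INCAR, KMESH, STRUCTURE, POTENTIAL_FAMILY, POTENTIAL_MAPPING, OPTIONS)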

  3. Change the options and code_string as you did in the previous tutorial to suit your setup.

    Warning

    Make sure you have activated your AiiDA virtual environment and that the AiiDA daemon is running before continuing.

  4. Save and execute the resulting run script by issuing:

    $ python run_fcc_si_multiple_volumes.py
    
  5. Let us check the progress:

    $ verdi process list -a
      PK  Created    Process label    Process State     Process status
    ----  ---------  ---------------  ----------------  -----------------------------------
    1043  37s ago    VaspWorkChain    ⏵ Waiting         Waiting for child processes: 1067
    1054  37s ago    VaspWorkChain    ⏵ Waiting         Waiting for child processes: 1102
    1065  36s ago    VaspWorkChain    ⏵ Waiting         Waiting for child processes: 1132
    1067  36s ago    VaspCalculation  ⏵ Waiting         Waiting for transport task: upload
    1078  36s ago    VaspWorkChain    ⏵ Waiting         Waiting for child processes: 1139
    1089  36s ago    VaspWorkChain    ⏵ Waiting         Waiting for child processes: 1141
    1100  35s ago    VaspWorkChain    ⏵ Waiting         Waiting for child processes: 1143
    1102  35s ago    VaspCalculation  ⏵ Waiting         Waiting for transport task: upload
    1113  35s ago    VaspWorkChain    ⏵ Waiting         Waiting for child processes: 1145
    1124  35s ago    VaspWorkChain    ⏵ Waiting         Waiting for child processes: 1147
    1137  34s ago    VaspWorkChain    ⏵ Waiting         Waiting for child processes: 1149
    1132  34s ago    VaspCalculation  ⏵ Waiting         Waiting for transport task: upload
    1139  33s ago    VaspCalculation  ⏵ Waiting         Waiting for transport task: upload
    1141  32s ago    VaspCalculation  ⏵ Waiting         Waiting for transport task: upload
    1143  31s ago    VaspCalculation  ⏵ Waiting         Waiting for transport task: upload
    1145  30s ago    VaspCalculation  ⏵ Waiting         Waiting for transport task: upload
    1147  29s ago    VaspCalculation  ⏵ Waiting         Waiting for transport task: upload
    1149  27s ago    VaspCalculation  ⏵ Waiting         Waiting for transport task: upload
    
    Total results: 18
    
    Report: last time an entry changed state: 26s ago (at 09:24:10 on 2022-12-20)
    Report: Using 9% of the available daemon worker slots.
    

    All calculations are submitted to the cluster in parallel, which is convenient and certainly an improvement over the regular tutorial, where you perform the calculations sequentially. Sequential execution is not a problem for a few calculations, but hardly ideal if you want to execute a few thousand.

  6. After a while, we execute verdi process list -a again. Once the runs have finished, we get:

    $ verdi process list -a
      PK  Created    Process label    Process State     Process status
    ----  ---------  ---------------  ----------------  -----------------------------------
    1043  4m ago     VaspWorkChain    ⏹ Finished [0]
    1054  4m ago     VaspWorkChain    ⏹ Finished [0]
    1065  4m ago     VaspWorkChain    ⏹ Finished [0]
    1067  4m ago     VaspCalculation  ⏹ Finished [0]
    1078  4m ago     VaspWorkChain    ⏹ Finished [0]
    1089  4m ago     VaspWorkChain    ⏹ Finished [0]
    1100  4m ago     VaspWorkChain    ⏹ Finished [0]
    1102  4m ago     VaspCalculation  ⏹ Finished [0]
    1113  4m ago     VaspWorkChain    ⏹ Finished [0]
    1124  4m ago     VaspWorkChain    ⏹ Finished [0]
    1137  4m ago     VaspWorkChain    ⏹ Finished [0]
    1132  4m ago     VaspCalculation  ⏹ Finished [0]
    1139  4m ago     VaspCalculation  ⏹ Finished [0]
    1141  3m ago     VaspCalculation  ⏹ Finished [0]
    1143  3m ago     VaspCalculation  ⏹ Finished [0]
    1145  3m ago     VaspCalculation  ⏹ Finished [0]
    1147  3m ago     VaspCalculation  ⏹ Finished [0]
    1149  3m ago     VaspCalculation  ⏹ Finished [0]
    
    Total results: 18
    
    Report: last time an entry changed state: 30s ago (at 09:27:33 on 2022-12-20)
    Report: Using 0% of the available daemon worker slots.
    

    All processes are in a finished state and we can extract the total energies for each step. However, extracting the total energies from misc for each step manually is rather inefficient. One could query the database (AiiDA has shortcuts for this; a query sketch is given at the end of this step), but one would still need to look up the values in some way. Surely the computer should be able to extract what we need, right? Maybe it is easier to access the energies directly once all the calculations are complete. Let us make another modification to the run script above, namely:

    --- /home/docs/checkouts/readthedocs.org/user_builds/aiida-vasp/checkouts/latest/tutorials/run_fcc_si_multiple_volumes.py
    +++ /home/docs/checkouts/readthedocs.org/user_builds/aiida-vasp/checkouts/latest/tutorials/run_fcc_si_multiple_volumes_eos.py
    @@ -2,14 +2,15 @@
     Call script to calculate the total energies for different volumes of the silicon structure.
     
     This particular call script set up a standard calculation for each structure and submits
    -each of them.
    +each of them. We, in addition collect the total energies and eject them to a file called
    +eos.
     """
     # pylint: disable=too-many-arguments
     import numpy as np
     
     from aiida import load_profile
     from aiida.common.extendeddicts import AttributeDict
    -from aiida.engine import submit
    +from aiida.engine import run
     from aiida.orm import Bool, Code, Str
     from aiida.plugins import DataFactory, WorkflowFactory
     
    @@ -77,12 +78,13 @@
         # Workchain related inputs, in this case, give more explicit output to report
         inputs.verbose = Bool(True)
         # Submit the workchain with the set inputs
    -    submit(workchain, **inputs)
    +    results = run(workchain, **inputs)
    +    return results
     
     
     if __name__ == '__main__':
         # Code_string is chosen among the list given by 'verdi code list'
    -    CODE_STRING = 'vasp@mycluster'
    +    CODE_STRING = 'VASP/6.3.2-gompi-2021b-std-wannier90-libxc-hdf5-beef-d7238be44ec2ed23315a16cc1549a1e3@betzy'
     
         # INCAR equivalent
         # Set input parameters
    @@ -95,7 +97,7 @@
         # POTCAR equivalent
         # Potential_family is chosen among the list given by
         # 'verdi data vasp-potcar listfamilies'
    -    POTENTIAL_FAMILY = 'PBE.54'
    +    POTENTIAL_FAMILY = 'PBE'
         # The potential mapping selects which potential to use, here we use the standard
         # for silicon, this could for instance be {'Si': 'Si_GW'} to use the GW ready
         # potential instead
    @@ -107,17 +109,27 @@
         # AttributeDict is just a special dictionary with the extra benefit that
         # you can set and get the key contents with mydict.mykey, instead of mydict['mykey']
         OPTIONS = AttributeDict()
    -    OPTIONS.account = ''
    -    OPTIONS.qos = ''
    +    OPTIONS.account = 'nn9997k'
    +    OPTIONS.qos = 'devel'
         OPTIONS.resources = {'num_machines': 1, 'num_mpiprocs_per_machine': 8}
         OPTIONS.queue_name = ''
         OPTIONS.max_wallclock_seconds = 3600
         OPTIONS.max_memory_kb = 2000000
     
    +    EOS = []
         # Iterate over each lattice constant and pass it explicitly
         for lattice_constant in [3.5, 3.6, 3.7, 3.8, 3.9, 4.0, 4.1, 4.2, 4.3]:
             # POSCAR equivalent
             # Set the silicon structure
             STRUCTURE = get_structure(lattice_constant)
     
    -        main(CODE_STRING, INCAR, KMESH, STRUCTURE, POTENTIAL_FAMILY, POTENTIAL_MAPPING, OPTIONS)
    +        results = main(CODE_STRING, INCAR, KMESH, STRUCTURE, POTENTIAL_FAMILY, POTENTIAL_MAPPING, OPTIONS)
    +
    +        # The output are stored as AiiDA datatypes, which is Dict in this case. To obtain a regular
    +        # dictionary, we use get_dict
    +        misc = results['misc'].get_dict()
    +        EOS.append([lattice_constant, misc['total_energies']['energy_extrapolated']])
    +    # Write volume and total energies to file
    +    with open('eos', 'w', encoding='utf8') as file_object:
    +        for item in EOS:
    +            file_object.write(f'{item[0]} {item[1]}\n')
    

    Notice that, apart from keeping the iteration over the call to main (which eventually runs the workchain), we have replaced submit with run. The reason for this is that we have to make sure each VASP run completes before we can extract its total energy. This also has the consequence that the VASP calculations are now executed sequentially instead of in parallel. We will later suggest how to fix this, but for the time being, let us go with this approach. Save the new file as run_fcc_si_multiple_volumes_eos.py or fetch it with:

    $ wget https://github.com/aiida-vasp/aiida-vasp/raw/master/tutorials/run_fcc_si_multiple_volumes_eos.py
    

    This file should enable automatic extraction of total energies versus the different volumes.
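
    As promised, here is a rough sketch of the query route mentioned above. It assumes the only VaspWorkChain runs in your profile are the nine volume calculations (otherwise you would add further filters, for instance on creation time), and that each exposes its miscellaneous output under the link label misc:

    from aiida import load_profile
    from aiida.orm import Dict, QueryBuilder, WorkChainNode

    load_profile()

    query = QueryBuilder()
    # Locate the workchain nodes by their process label
    query.append(WorkChainNode, filters={'attributes.process_label': 'VaspWorkChain'}, tag='workchain')
    # ...and the Dict node attached to each workchain via the misc output link
    query.append(Dict, with_incoming='workchain', edge_filters={'label': 'misc'})
    for (misc,) in query.all():
        print(misc.get_dict()['total_energies']['energy_extrapolated'])

    Even with such a query, you would still need to match each energy to its lattice constant, which is why collecting the results directly in the run script is more convenient here.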

  7. Execute run_fcc_si_multiple_volumes_eos.py:

    $ python run_fcc_si_multiple_volumes_eos.py
    

    Give the script time to complete. In contrast to before, you will now see the output directly in the terminal (similar to what we previously obtained with verdi process report for a given process). This is because we now use run instead of submit. More on this later. A file eos should be generated in the folder from which you executed the Python command.

  8. Inspect the eos file that was generated:

    $ more eos
    3.5 -4.42341939
    3.6 -4.66006381
    3.7 -4.79595554
    3.8 -4.86303429
    3.9 -4.87588353
    4.0 -4.8481407
    4.1 -4.78451926
    4.2 -4.69228837
    4.3 -4.58122058
    
  9. Plot the data with your favorite plotting tool, for instance Gnuplot:

    $ gnuplot
    gnuplot> plot "eos" with lp
    

    which should give you something similar to what is shown in the FCC Si tutorial.
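
    If you prefer Python, a minimal matplotlib sketch (assuming matplotlib is installed in your environment) that reads the eos file and produces the same plot could look like this:

    import matplotlib.pyplot as plt

    # Read lattice constants and total energies from the eos file written above
    lattice_constants, energies = [], []
    with open('eos', 'r', encoding='utf8') as file_object:
        for line in file_object:
            alat, energy = line.split()
            lattice_constants.append(float(alat))
            energies.append(float(energy))

    plt.plot(lattice_constants, energies, marker='o')
    plt.xlabel('Lattice constant (Å)')
    plt.ylabel('Total energy (eV)')
    plt.savefig('eos.png')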

We have now completed the FCC Si tutorial in AiiDA-VASP, though maybe not in the best way. As you have already realized, we had to change submit to run in order to make sure we got the output of each executed workchain. Had we used submit, we would only have had a link to output that might or might not have been populated at the point of inspection (unless we waited until we knew the execution of the workchain was complete). As a consequence, all the volume steps now run sequentially, and it thus takes far longer for all the volume calculations to finish (assuming your cluster started each volume calculation at the time of submission).

In the next tutorial we will instead develop a workchain that performs the different volume calculations (more generally, calculations over different structures) and call that from the run script. In doing so, you will see that we can keep the benefits of automatic data extraction and parallel execution. In addition, we get the benefit of storing the results in the database.