GeorgLovric/HydroML

HydroML – Materials Database HT-DFT Program

To fetch data from OQMD and NOMAD, specify your material query in the matching sheet of 'input_query.xlsx' (there are separate sheets for OQMD and NOMAD). This spreadsheet acts as our placeholder GUI. To return ALL materials that match your query, enter 'max' in the 'page_size' field. You can also specify which groups of elements the material should include or exclude (such as rare_earth, alkali_metals, lanthanides, or custom_list, as defined in 'elements_functions.py') by entering them in the elements_any/elements_exclude fields.
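As an illustration of how group names in elements_any/elements_exclude could be expanded into element symbols, here is a minimal sketch. The dictionary `ELEMENT_GROUPS` and the function `expand_elements` are hypothetical names, and the group memberships (apart from custom_list being just "Se") are assumptions; the actual definitions are the ones in 'elements_functions.py'.

```python
# Hypothetical group definitions; the real ones live in elements_functions.py.
ELEMENT_GROUPS = {
    "alkali_metals": ["Li", "Na", "K", "Rb", "Cs", "Fr"],
    "lanthanides": ["La", "Ce", "Pr", "Nd", "Pm", "Sm", "Eu", "Gd",
                    "Tb", "Dy", "Ho", "Er", "Tm", "Yb", "Lu"],
    "custom_list": ["Se"],
}
# Assumption: rare earths are taken as Sc and Y plus the lanthanides.
ELEMENT_GROUPS["rare_earth"] = ["Sc", "Y"] + ELEMENT_GROUPS["lanthanides"]


def expand_elements(cell_value):
    """Turn a comma-separated cell such as 'Co, rare_earth' into element symbols."""
    symbols = []
    for token in str(cell_value).split(","):
        token = token.strip()
        if not token:
            continue
        # A known group name expands to its members; anything else is kept as a symbol.
        for symbol in ELEMENT_GROUPS.get(token, [token]):
            if symbol not in symbols:
                symbols.append(symbol)
    return symbols


print(expand_elements("Co, Fe, Ni"))
print(expand_elements("rare_earth, custom_list"))
```

Keeping the group table in one dictionary means a new group only needs one new entry, and duplicate symbols from overlapping groups are dropped.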

Once you have entered your criteria, run 'mainfile.py'. In the terminal, type (1) to run a NOMAD database query, (2) to run an OQMD database query, or (3) to query both. (NOTE: This prompt is currently disabled in the mainfile. Option 1 is now chosen automatically, as NOMAD contains more information than can be fetched from OQMD.) The query returns two JSON files: db_data/oqmd_test_output for OQMD and OUTCAR_files/nmd_test_output for NOMAD. Note that these are overwritten by the new query each time the code is run. However, all the data is also automatically converted to an Excel file under OUTPUT/output_n (where 'n' is a number that changes so that earlier files are not overwritten). This way you get high-throughput DFT data that can be used to train models for materials discovery!
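The numbered OUTPUT/output_n naming can be sketched as below. This is only one way to pick a free number, not necessarily how the repository does it; the function name `next_output_path` and the '.xlsx' suffix are assumptions.

```python
import re
from pathlib import Path


def next_output_path(directory="OUTPUT", stem="output", suffix=".xlsx"):
    """Return directory/stem_n+suffix, where n is one above the highest n in use."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    pattern = re.compile(rf"^{re.escape(stem)}_(\d+){re.escape(suffix)}$")
    # Collect the numbers of all files that already follow the naming scheme.
    used = [int(m.group(1)) for p in out_dir.iterdir() if (m := pattern.match(p.name))]
    return out_dir / f"{stem}_{max(used, default=0) + 1}{suffix}"
```

Taking the highest existing number plus one, rather than the first gap, keeps the files in chronological order even if older outputs have been deleted.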

I have now added functionality so that when the 'OUTCAR_files' field is set to 'y', the code automatically unzips the raw zip file, extracts the OUTCAR file AND the VASP input files (which are put inside their own subdirectory), and deletes the raw zip file and any other irrelevant files to save space (if you want to keep more files, such as the WAVECAR, this can easily be adapted in query_functions.py). I have also made the code automatically extract the DFT data from the OUTCAR files and save it to 'OUTPUT/output_DFT.xlsx'. However, I had some trouble extracting all the interesting information, so for now it only fetches the free energy (TOTEN). Eventually I would like it to fetch the total charge, total magnetization, Fermi energy, force on cell, etc. as well.
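For reference, reading TOTEN out of an OUTCAR can be done with a regular expression along the following lines. This is a sketch that assumes the usual "free  energy   TOTEN  =  ... eV" line written by VASP; the names `TOTEN_RE` and `last_toten` are hypothetical and the sample text is made up for illustration, not taken from a real calculation.

```python
import re

# Matches lines such as "  free  energy   TOTEN  =       -10.90000000 eV".
TOTEN_RE = re.compile(r"free\s+energy\s+TOTEN\s*=\s*([-+]?\d+\.\d+(?:[eE][-+]?\d+)?)")


def last_toten(outcar_text):
    """Return the last TOTEN value (in eV) found in the text, or None if absent."""
    matches = TOTEN_RE.findall(outcar_text)
    return float(matches[-1]) if matches else None


# Made-up sample with two ionic steps; the last value is the one of interest.
sample = """
  free  energy   TOTEN  =       -10.50000000 eV
  free  energy   TOTEN  =       -10.90000000 eV
"""
print(last_toten(sample))
```

The last match is returned because an OUTCAR from a relaxation contains one TOTEN line per step, and the final one belongs to the final structure.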

Also, if you specify 'y' in the 'VASP_workflow' cell of the input_query file, the code automatically generates all VASP input files using the names from the column 'results.material.chemical_formula_reduced' in 'OUTPUT/output.xlsx'. In other words, the code generates VASP input files for all the materials that were returned by your query.
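Generating inputs from a reduced formula requires knowing which species it contains and in what ratio. The sketch below shows one simple way to split a formula string into element counts; `parse_formula` is a hypothetical helper, not a function from 'vasp_inputs.py', and it assumes plain formulas with integer counts and no parentheses.

```python
import re


def parse_formula(formula):
    """Split a reduced formula such as 'Fe2Zr' into {'Fe': 2, 'Zr': 1}."""
    counts = {}
    # An element symbol is one capital letter, optionally followed by one lowercase letter.
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


print(parse_formula("Fe2Zr"))
```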

In the current version of 'input_query.xlsx', the query looks for all materials that match the following criteria: The material must include any of the elements "Co", "Fe", or "Ni". It cannot contain any "rare_earth" elements or "custom_list" elements (defined in elements_functions.py as just "Se"). It must have a Strukturbericht designation of "C14" (i.e. Laves phases with the MgZn2 structure). The data must be DFT data obtained with VASP. The amount of data returned should be 'max', i.e. all materials that match the query. No OUTCAR files should be fetched, and no VASP input files should be generated.
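Expressed as a request body, the criteria above might look roughly like the sketch below. The function `build_nomad_query` is hypothetical, and the field names, the ':any' suffix, and the 'not' key are assumptions about the NOMAD query format that should be checked against the NOMAD API documentation. The translation of 'max' into a page size is not shown, so a plain number is used here.

```python
def build_nomad_query(elements_any, elements_exclude, strukturbericht, program, page_size):
    """Assemble a query dictionary; all key names are assumptions to be verified."""
    query = {
        "results.material.elements:any": list(elements_any),
        "results.material.symmetry.strukturbericht_designation": strukturbericht,
        "results.method.simulation.program_name": program,
    }
    if elements_exclude:
        # Exclude entries that contain any of the unwanted elements.
        query["not"] = {"results.material.elements:any": list(elements_exclude)}
    return {"query": query, "pagination": {"page_size": page_size}}


# Shortened exclusion list for illustration; the full rare_earth group would go here.
payload = build_nomad_query(["Co", "Fe", "Ni"], ["Se"], "C14", "VASP", 10)
print(payload)
```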

Functionality to add later: I am currently setting up a workflow between the code and the SAGA HPC to automatically perform DFT on the materials that we want to send to it. So far, all the materials I have queried NOMAD for seem to have an OUTCAR file that can be fetched, but some may not be accurate enough, in which case we would like to run them again. For now, the script 'vasp_inputs.py' automatically generates VASP input files from a list of chemical formulas that it extracts from Excel (the column has to be called 'results.material.chemical_formula_reduced'; if you want to use a different Excel file, you have to change the column name manually in the code).

Also planned: a decision tree/random forest model to predict the Curie temperature of binary and ternary Laves phases.

--Georg Lovric, IFE Hydrogen Department (17/09/2024)
