parslfold

Fold proteins in parallel using Parsl.

Supported folding methods:

Installation

On Workstations (rbdgx, lambda, etc.)

To install the package, run the following commands:

conda create -n parslfold-env python=3.10
conda activate parslfold-env
git clone git@github.com:ramanathanlab/parslfold.git
cd parslfold
pip install -U pip setuptools wheel
pip install -e .

On Polaris

To install the package on Polaris@ALCF, load the conda module first, then create the environment and install the package:

module use /soft/modulefiles; module load conda
conda create -n parslfold-env python=3.10
conda activate parslfold-env
git clone git@github.com:ramanathanlab/parslfold.git
cd parslfold
pip install -U pip setuptools wheel
pip install -e .

On Aurora

Run the installation from a login node, not a compute node.

git clone git@github.com:ramanathanlab/parslfold.git
cd parslfold
module load frameworks
pip install -U pip setuptools wheel
pip install -e .
# The command above installs packages into your .local directory.
# Add its bin directory to your PATH, for example in your .bashrc:
export PATH=/home/<user-name>/.local/aurora/frameworks/2024.2.1_u1/bin:$PATH

Usage

To fold a set of proteins, run the following command (see the example YAML config, examples/workstation_config.yaml, for details):

nohup python -m parslfold.main --config examples/workstation_config.yaml &> nohup.log &

The output folder structure will look like this:

examples/output/
├── config.yaml
├── parsl
│   └── 000
│       ├── htex
│       │   ├── block-0
│       │   │   └── 082881fe477f
│       │   │       ├── manager.log
│       │   │       ├── worker_0.log
│       │   │       └── worker_1.log
│       │   └── interchange.log
│       ├── parsl.log
│       └── submit_scripts
│           ├── parsl.htex.block-0.1737608324.8257468.sh
│           ├── parsl.htex.block-0.1737608324.8257468.sh.ec
│           ├── parsl.htex.block-0.1737608324.8257468.sh.err
│           └── parsl.htex.block-0.1737608324.8257468.sh.out
└── structures
    ├── uniprotkb_accession_A0LFF8_OR_accession_2024_12_19_seq_0
    │   ├── input.fasta
    │   ├── pred.model_idx_0.pdb
    │   └── scores.model_idx_0.npz
    └── uniprotkb_accession_A0LFF8_OR_accession_2024_12_19_seq_1
        ├── input.fasta
        ├── pred.model_idx_4.pdb
        └── scores.model_idx_4.npz
  • config.yaml: The configuration file used to run the folding.
  • parsl/: The Parsl logs and submit scripts (containing stdout and stderr).
  • structures/: The folded protein structures.
    • input.fasta: The input sequence used for folding.
    • pred.model_idx_X.pdb: The highest-confidence folded protein structure, where X is the index of the selected model.
    • scores.model_idx_X.npz: The scores for that folded protein structure.
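
The layout above can be walked with a few lines of standard-library Python. The following is a minimal sketch, not part of the package: the function name collect_predictions and the record fields are hypothetical, and the file-name pattern is assumed from the example tree shown above.

```python
import re
from pathlib import Path

# Assumed naming pattern, taken from the example output tree.
_PRED_RE = re.compile(r"^pred\.model_idx_(\d+)\.pdb$")


def collect_predictions(output_dir):
    """Return one record per predicted structure under <output_dir>/structures.

    Hypothetical helper for illustration; parslfold does not ship it.
    """
    records = []
    structures = Path(output_dir) / "structures"
    # Each subdirectory corresponds to one input sequence.
    for seq_dir in sorted(p for p in structures.iterdir() if p.is_dir()):
        for pdb in sorted(seq_dir.glob("pred.model_idx_*.pdb")):
            match = _PRED_RE.match(pdb.name)
            if match is None:
                continue
            model_idx = int(match.group(1))
            scores = seq_dir / f"scores.model_idx_{model_idx}.npz"
            records.append(
                {
                    "name": seq_dir.name,
                    "model_idx": model_idx,
                    "pdb": pdb,
                    # None when the matching scores file is missing.
                    "scores": scores if scores.exists() else None,
                    "fasta": seq_dir / "input.fasta",
                }
            )
    return records
```

Calling collect_predictions("examples/output") would then yield one record per sequence directory, which can be used to gather the kept structures and their score files after a run.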

Notes

  • We only keep the highest-confidence folded protein structure and its scores.
  • The subdirectories within structures/ are named after the input FASTA file name and the index of the sequence in that file (e.g., <fasta-name>_seq_X).
  • See examples/chai1_example.py for a quick example of how to fold a protein using the Chai-1 method in our package. This is the same core functionality as the main script, but without the parallelism provided by Parsl.
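
The <fasta-name>_seq_X naming convention makes it possible to map an output directory back to its input. Here is a minimal sketch, assuming the directory name always ends in _seq_ followed by a non-negative integer; split_structure_dir_name is a hypothetical helper, not a function from the package.

```python
def split_structure_dir_name(dir_name):
    """Split '<fasta-name>_seq_X' into (fasta_name, sequence_index).

    Hypothetical helper; splits on the last '_seq_' so that FASTA names
    which themselves contain '_seq_' are still handled.
    """
    fasta_name, sep, index = dir_name.rpartition("_seq_")
    if not sep or not index.isdigit():
        raise ValueError(f"unexpected structure directory name: {dir_name!r}")
    return fasta_name, int(index)
```

For the example output above, the second directory name splits into the FASTA stem uniprotkb_accession_A0LFF8_OR_accession_2024_12_19 and the sequence index 1.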

Contributing

For development, it is recommended to use a virtual environment. The following commands will create a virtual environment, install the package in editable mode, and install the pre-commit hooks.

python -m venv parslfold-venv
source parslfold-venv/bin/activate
pip install -U pip setuptools wheel
pip install -e '.[dev,docs]'
pre-commit install

To test the code, run the following commands:

pre-commit run --all-files
tox -e py310
