A tool for managing numerical simulations across multiple compute hosts.
But nothing stops you to use it to use it for any directory on any of your computers you want to find by id or name.
In that case, mentally substitute simulation for stuff in this file.
With only the name or id of a simulations, get any or all of the following three without manual work:
- find the simulations on any host :
smurf search {id or name} - instantly get a shell at the directory of your simulation on any host :
scd {id or name} - mount the data onto your local machine (python3):
m = smurf.mount.Mount("{id or name}")
You
- have
python3andbash=/=zsh? - use ssh keys and ssh-agent (or similar)?
- have
rsync,python3 setuptoolsinstalled? - are willing to put a
metadir in the directories you want to find?
Then, yes!
You put a directory called meta inside the dir you care about (smurf init).
This meta dir contains meta data including a name.txt file and a unique id (meta/uuid/{the uuid}).
Then standard unix tools and caching are used to find and store the location of the dir you care about (you only need to use smurf search ...).
Clone this repo, navigate to it and run ./scripts/deploy.sh
If you get an import error for setuptools.find_namespace_packages, try upgrading setuptools (python3 -m pip install -U setuptools).
Follow these steps:
smurf initcreate themetadir and the id. The name will be set to the directories name.smurf cache --notifytells smurf about the simulations
smurf initall simulation directories.- add the root directory containing all the simulations with
smurf config add rootdir /path/to/simulations - search through the rootdir and generate a cache with
smurf cache -g
- check the entries in the cache and scrub it with
smurf cache -s - maybe regenerate the cache with
smurf cache -g
- install smurf onto the remote host by running
./scripts/deploy.sh remotehostinside the repo dir. - ssh to remote host
- configure smurf on the remote host just as on your local machine
Smurf uses ssh to automatically connect to your remote hosts and saves you to manually search for simulations on them and navigate to them.
It also uses sshfs to mount data automatically.
To do its job, smurf assumes that you login to remote hosts using ssh keys and that you set up a ssh-agent, such that you don’t need to type your password every time you connect to a remote host.
If you are unsure, try ssh remotehost and see if its works (you can configure your remote hosts in ~/.ssh.config).
If it fails, search online on how to set up ssh keys and a ssh-agent.
Smurf provides a python3 API and a command line interface.
There is a bash and zsh plugin which features tab completion and the scd (get a shell at any simulation directory on any host) command.
Run
smurf configto show the current config values.
Smurf allows you to specify root directories in which you can place the directories you want to track. The whole directory tree under each root dir is searched when generating the cache.
To add/remove the rootdir /scratch/simulations run
smurf config {add,remove} rootdir /scratch/simulationsTo add/remove a remote host on which to search on, run
smurf config {add,remove} host {remotehost}Smurf uses ssh in the background, so you can use any address (user@host or just host) which you can use with ssh {remotehost}.
Please make sure that you have set up a key agent (e.g. ssh-agent) so that you can login automatically.
Otherwise you have to type your password times.
To install smurf on your computer or server, follow these steps:
- Make sure you have installed all the requirements listed below.
- Clone this repository.
- Navigate to the repository in your terminal and run
./scripts/deploy.shThis installs the python packages (using python3 setup.py install --user) and creates a wrapper to call it from the command line. Try it by running the command
smurfIf you get an error saying that the command can’t be found, make sure that ~/.local/bin is in your PATH variable (echo $PATH | grep ~/.local/bin should produce some output).
You can add it by running
export PATH="$PATH:$HOME/.local/bin"and adding the same line in your .bashrc/.zshrc file.
To install the bash/zsh integration to use the tab completion, run
smurf enable_shell_pluginand follow the instructions to activate it and add it to your .bashrc/.zshrc file.
If you have rsync installed on your machine, you can install smurf on a remote host to which you can ssh with ssh {remotehost} via
./scripts/deploy.sh {remotehost}This saves you from cloning the repository on all your hosts and makes it easy to setup a whole smurf network.
python3- python3’s
setuptoolspackage
For each simulation I run, I store the complete source code, the config files and the binary and the output data in one single directory. I refer to this as the simulation directory (simdir). Alongside all the required code and data, I store meta data and scripts to run/build/queue the simulation in a directory called job inside the simdir. Usually, the simdir’s name indicates some of the simulations parameters.
Each simulation also gets its own uuid, such that it can be located. This uuid is stored inside {simdir}/job/uuid/{the uuid} as a file with the uuid as its filename. That way, its extremely easy to locate the simdir of a simulation given its uuid by using unix find or locate. For this smurf find can be used.
Additionally to the concept of simulation directories, I use the concept of project directories. These are indicated by a .project file which contains the name of the project. This file also serves for a way to find the project root directory walking up the directory tree until this file is found. (smurf project root).
- store items in a database (e.g. mysql) instead of json file
Define a structure for the meta information and add an identifier to the meta dir to make it detectable. Define which information is where and add versioning.
- smurf info add set/add command for fields
- create .local/bin if not present and make sure its in PATH