Setup
In order to do large-scale runs, we use a supercomputer with 56 processors and split up the runs accordingly. The management software for the cluster is called Condor. I will give you the basics about Condor here, but if you want to know more, you can look at the Condor manual.
This tutorial assumes you have already installed and configured Condor on your machine. Our cluster is named 'jazz-mgmt', so wherever you see 'jazz-mgmt' in this tutorial, simply substitute the name of your own cluster.
I suggest you first set up a directory structure on your cluster's management node that parallels the one on your home Linux box. You need rosetta_scripts, rosetta_database, and the executables (under bin in your Rosetta parent directory) on your cluster, plus a copy of .rosettarc in your home directory there. To copy files, use the Linux command scp, which has the following format:
scp [-r] sourcefile destination_machine:destinationfilepath
Note that destination_machine and destinationfilepath are separated by a colon (:). Use -r (recursive) if you are copying a directory.
So to copy your rosetta_database directory from your machine to jazz-mgmt, do the following (from your home directory):
scp -r rosetta_database jazz-mgmt:
In this case, sourcefile is 'rosetta_database', destination_machine is jazz-mgmt, and destinationfilepath is your home directory on jazz-mgmt (the default file path, hence it is not given). Do this also for the other Rosetta directories mentioned above. Also copy over your .bashrc and .rosettarc, or again make the link on jazz-mgmt to the rosetta_scripts/docking/rosettarc file.
Condor config file
The condor config file contains all of the variables needed to run Rosetta on Condor, including both Rosetta variables and Condor variables. I have provided an example, test.config, in condor_scripts. This is a highly adaptable script that can be customized for all kinds of Rosetta runs on Condor. A Condor launch script in $rosetta_scripts translates the config file into a Condor script and a wrapper script that runs Rosetta, and the launch script then submits the Condor jobs directly. Read over test.config for some helpful notes.
The config file contains the same groups of Rosetta variables found in farun.bash (the Rosetta run script from the basics tutorial), plus Condor variables. Here are the major groups of flags:
1) prefix: the pdb path prefix for decoys (inside the pdb-name directory). You can set this to anything you want, but if you do more than one Condor run in a directory, use different prefixes for the different runs.
2) nstruct: the number of structures that Rosetta outputs. For large-scale runs, this is usually in the thousands. As I discuss different types of large-scale runs in the refinement and blind prediction tutorials, I will suggest values for nstruct. Several examples are given in test.config. nstruct cannot be bigger than 9999; otherwise Rosetta will overwrite files after decoy 9999 is produced. The config file provides a way to run more than 10,000 structures; to see how, see "More than 10,000 structures" at the bottom.
3) search flags: These control how the docking search is done, and they are the most important variables for large-scale runs. As I discuss different types of large-scale runs in the refinement and blind prediction tutorials, I will suggest different combinations of search flags. Several examples are given in test.config.
4) side chain flags: Use these if you want to change the way RosettaDock repacks side chains. If you are using a bound structure for partner 1, add -norepack1 here, and similarly for partner 2. This is described in the config file.
5) smart score filter: I have this turned off in test.config. Leave it turned off unless you are doing a 100K run (described in the blind runs tutorial).
6) antibodies: Insert '-fab1' or '-fab2' if you are running an antibody.
7) Njobs: The number of jobs to queue on Condor. I like 100 for most runs and 10 for test runs. Generally, you want to keep the cluster full to maximize your efficiency.
8) compiler: Make sure to change this if you are using a new executable.
A final note: do not specify more than one value for each variable in the config file (only the last value assigned will be used). I have provided several examples of each variable; you can turn one of these examples on by commenting out the default value and uncommenting the example value. You can also insert your own value for a variable; just make sure you comment out the default value.
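To make the groups above concrete, here is a hypothetical sketch of what such a config file might contain. The variable names for groups 3-6 are my own placeholders, and the actual syntax and flag names in test.config may differ; treat this as an illustration, not a copy of the real file.

```shell
# Hypothetical condor config sketch -- see test.config for the real syntax.

prefix=pert1              # 1) decoy prefix; use a different one per run in a directory
nstruct=1000              # 2) decoys to produce (keep <= 9999; see bottom for more)

# 3) search flags -- placeholder name; see test.config for real examples
# searchflags="..."

# 4) side chain flags -- placeholder name; partner 1 is bound, so skip repacking it
# sidechainflags="-norepack1"

# 5) smart score filter: leave off unless doing a 100K run
# 6) antibodies: add -fab1 or -fab2 if running an antibody

Njobs=100                 # 7) condor jobs to queue
compiler=gcc              # 8) match your executable (assumed value)
```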
An example - perturbation run
Copy the samplerun directory from examples over to your cluster. If you have not already prepacked test.pdb in samplerun, do so now; the large-scale .bash scripts assume a prepacked pdb already exists.
The following should be done on jazz-mgmt (or your cluster), in samplerun. Copy test.config from $rosetta_scripts/condor_scripts and rename it pert.config:
cp $rosetta_scripts/condor_scripts/test.config pert.config
Open pert.config and change nstruct to 1000 (deactivate the default option and activate the "perturbation run" option). Also change Njobs to 100. You could leave it at 10, but that would take forever to run.
That's it. Now launch the run on Condor using crun.bash (located in $rosetta_scripts):
crun.bash test pert
Of course, if you are using a pdb other than the test.pdb provided, change 'test' to the name of your pdb. crun.bash requires a copy of the condor config file in your local directory to run.
Run crun.bash without arguments to get the usage message. The arguments are <pdb> and <config>, where <config> is the name of the config file without the '.config' extension. The extension must be '.config' for the script to work properly.
crun.bash will create two files: an executable, test.pert.bash, and a Condor script, test.pert.con. It will then launch the Condor script directly. You should get a Condor message that 100 jobs have been submitted.
If you want to see your jobs in the condor queue, type:
condor_q
You will get a list of jobs including the cluster number (first column), the owner of the jobs, and whether they are running or idle. If you want to stop the run, use condor_rm:
condor_rm <your user name>
or
condor_rm <job cluster number>
You can get the cluster number by looking at the first column in condor_q. The second method of removal is helpful if you only want to destroy one set of jobs. The first command will remove all jobs owned by you.
Other cases
In the refinement tutorial and the blind runs tutorial, I discuss how to run other types of large-scale runs in a similar way.
More than 10,000 structures
Earlier I mentioned that it is not possible to have nstruct > 9999. That is because Rosetta will not put decoy numbers greater than 9999 in the decoy filenames, so if you run more than 10,000 structures, structure 10,001 will come out as '0001' and will overwrite structure 0001.
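The four-digit wraparound can be sketched as follows (an illustration only, not Rosetta's actual filename code; 'prefix' stands in for whatever prefix you set):

```shell
# Sketch of the four-digit decoy-number wraparound described above.
# (Illustration only -- not Rosetta's actual filename code.)
for n in 1 9999 10001; do
  printf "decoy %5d -> prefix%04d.pdb\n" "$n" $((n % 10000))
done
# decoy     1 -> prefix0001.pdb
# decoy  9999 -> prefix9999.pdb
# decoy 10001 -> prefix0001.pdb   (collides with decoy 1)
```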
There is a way to get around this and make more than 10K structures. You may want to do this in a calibration run when trying to find a low-scoring outlier in a blind prediction.
Several options are provided in the condor config file to do this. First, under 'PREFIX', comment out the default option and activate the prefix assignment under "More than 10K structures." 'prefix=$1' causes the Condor job number to be passed in as the prefix rather than setting the prefix explicitly. Rosetta then converts this to a directory name like 'aa', 'ab', 'ac', and so on. Because of this, nstruct now means the number of decoys per job instead of the total number of decoys. As a result, the number of structures you produce is Njobs * nstruct. If you want 20,000 structures, leave Njobs at 100 and set nstruct to 200.
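The arithmetic can be checked in the shell. The two-letter mapping below is an assumed illustration of the 'aa', 'ab', ... naming (the exact mapping Rosetta uses is not shown in this tutorial):

```shell
# Total decoys with the job-number prefix trick: Njobs * nstruct.
Njobs=100
nstruct=200
echo $((Njobs * nstruct))    # prints 20000

# Assumed illustration of the two-letter job prefixes ('aa', 'ab', ...):
# here job 0 -> aa, job 1 -> ab, job 26 -> ba (mapping is an assumption).
letters=(a b c d e f g h i j k l m n o p q r s t u v w x y z)
job=27
echo "${letters[job / 26]}${letters[job % 26]}"   # prints bb
```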
To extract the structures into one directory afterward and to combine them into one scorefile, type (in the samplerun directory):
pp_extract_set.sh <pdb> <topN>
where <pdb> is your pdb name and <topN> says "extract the top N structures." This will also create a directory, scorefiles, with a combined scorefile from all 100 directories. pp_extract_set.sh is a Rosetta script; 'pp' stands for 'post-processing'.
All this is quite annoying, so it is usually best to start with fewer than 10K structures.