Mosflm/SCALA Bug News


We have had a number of reports of a problem when output MTZ files from MOSFLM are fed into SCALA. The problem arises from changes made in both programs for the 4.2 release.


SCALA now uses information about datasets in the input MTZ file. If multiple datasets are present (from multiple runs of MOSFLM), the default is to scale them together but split them into separate output files: this is intended for MAD datasets. In MOSFLM, if the user doesn't specify project and dataset names for runs, then the program gives each run a unique project and dataset name based on the date and time.


The problem occurs when the output files from these runs are sorted together and put into SCALA. By default SCALA keeps each dataset separate on output, so it writes one MTZ file per dataset (two files for two MOSFLM runs), with names based on the dataset names from the input file. This is confusing if you just wanted to scale all the data together.


The ounce of prevention is to set appropriate project and dataset names explicitly when running MOSFLM, using the PNAME and DNAME commands.
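

As a minimal sketch, the relevant lines in a MOSFLM command script might look like the following. The names "myproject" and "native1" are invented placeholders, not defaults; giving the same pair of names to every run that belongs to one dataset should let SCALA treat those runs as a single dataset:


PNAME myproject
DNAME native1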


The pound of cure (workaround) using the existing SCALA GUI is to define a run (the default "run 1 all" is OK for this purpose) and assign it to a single output project/dataset (i.e. delete all but one dataset and give it a sensible name). Alternatively, REBATCH can be used to reset the project and dataset names before running SCALA (but be aware that these options are not currently available through the REBATCH GUI and must be scripted manually). If you are running SCALA with a script, add a command line:


name project <project_name> dataset <dataset_name>


to force all data into a single dataset.


An updated version of the SCALA interface has a new option to combine all input datasets into a single output dataset, which should solve the problem. In addition, future versions of MOSFLM will use "Unspecified" as the default project and dataset names. (Both these changes will be in the patch release 4.2.1; the updated SCALA interface is available via the CCP4 problems page.) The best practice, however, is still to name your own projects and datasets at the outset.


Peter Briggs, Phil Evans.