Module batch

Process a large number of input files

Functions

getMaxJobsFromFile(controlFilename)

getMaxJobs(maxJobs, controlFilename=None)

prepareCommand(template, input=None, jobTag=None, output=None)

submitJob(command, input, connection, jobTag=None, output=None, regex=None, dummy=False, rerun=None, hideFinished=False)

waitForJobs(maxJobs, submitCount, connection, controlFilename=None, sleepTime=15)

getOutputDir(currentDir, currentItem, input, output=None)

batch(command, input=None, connection=None, jobTag=None, output=None, regex=None, regexDir=None, dummy=False, rerun=None, hideFinished=False, controlFilename=None, sleepTime=None, debug=False, limit=None, loop=False)
Process a large number of input files
Variables
  __package__ = 'TEES'
Function Details

batch(command, input=None, connection=None, jobTag=None, output=None, regex=None, regexDir=None, dummy=False, rerun=None, hideFinished=False, controlFilename=None, sleepTime=None, debug=False, limit=None, loop=False)

Process a large number of input files

Parameters:
  • input - An input file or directory. A directory is processed recursively.
  • connection - A parameter set defining a local connection for submitting the jobs
  • jobTag - The name of the job file, typically used when input is not defined. Can be used in the command template.
  • output - An optional output directory. The input directory tree will be replicated here.
  • regex - A regular expression for selecting input files
  • regexDir - A regular expression for input directories, allowing early out for entire subtrees
  • dummy - In dummy mode, jobs are only printed on screen, not submitted. Useful for testing.
  • rerun - A job is normally submitted only if it does not already exist. To resubmit existing jobs, set this to the status codes of the jobs that should be rerun, usually FAILED or FINISHED.
  • hideFinished - Do not print a notification when skipping an existing job
  • controlFilename - A file with only one number inside it. This is the job limit, and can be changed while batch.py is running.
  • sleepTime - The time to wait between checks when waiting for jobs to finish. Default is 15 seconds.
  • debug - If set, job submission scripts are printed on screen.
  • limit - Maximum number of jobs. Overrides controlFilename.
  • loop - If set, loop over the input directory repeatedly. Otherwise process it once.
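
The file selection and job limit behaviour described by the regex, regexDir, output and controlFilename parameters can be illustrated with a small standalone sketch. This is not the TEES implementation: the names read_job_limit and select_inputs are hypothetical, and the sketch assumes that the regular expressions are matched against bare file and directory names from the start of the name, which the documentation above does not specify.

```python
import os
import re


def read_job_limit(control_filename, default=None):
    """Return the single integer stored in the control file.

    Hypothetical helper: falls back to `default` if the file is
    missing or does not contain a number.
    """
    try:
        with open(control_filename) as f:
            return int(f.read().strip())
    except (IOError, OSError, ValueError):
        return default


def select_inputs(input_dir, regex=None, regex_dir=None, output=None):
    """Yield (input_path, output_dir) pairs for files whose name matches regex.

    Subdirectories whose name does not match regex_dir are pruned, so
    their whole subtree is skipped (the "early out"). If output is given,
    output_dir mirrors the file's position in the input tree; otherwise
    it is None. No directories are created here.
    """
    file_pattern = re.compile(regex) if regex else None
    dir_pattern = re.compile(regex_dir) if regex_dir else None
    for current_dir, dirnames, filenames in os.walk(input_dir):
        if dir_pattern is not None:
            # Editing dirnames in place stops os.walk from descending
            # into the removed directories.
            dirnames[:] = [d for d in dirnames if dir_pattern.match(d)]
        dirnames.sort()
        relative = os.path.relpath(current_dir, input_dir)
        for name in sorted(filenames):
            if file_pattern is not None and not file_pattern.match(name):
                continue
            output_dir = None
            if output is not None:
                output_dir = os.path.normpath(os.path.join(output, relative))
            yield os.path.join(current_dir, name), output_dir
```

Because the control file is re-read on each call, a caller that invokes read_job_limit before every submission would pick up a changed limit while it is running, which is the behaviour the controlFilename parameter describes.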