NAME
  rq v2.3.2

SYNOPSIS
  rq (queue | export RQ_Q=q) mode [mode_args]* [options]*

URIS
  http://raa.ruby-lang.org/project/rq/
  http://www.linuxjournal.com/article/7922

DESCRIPTION
  ruby queue (rq) is a zero-admin zero-configuration tool used to create instant
  linux clusters. rq requires only a central nfs filesystem in order to manage a
  simple sqlite database as a distributed priority work queue. this simple
  design allows researchers to install and configure, in only a few minutes and
  without root privileges, a robust linux cluster capable of distributing
  processes to many nodes - bringing dozens of powerful cpus to their knees with
  a single blow. clearly this software should be kept out of the hands of free
  radicals, seti enthusiasts, and one mr. j safran.

  the central concept of rq is that n nodes work in isolation to pull jobs from
  a centrally mounted nfs priority work queue in a synchronized fashion. the
  nodes have absolutely no knowledge of each other and all communication is done
  via the queue, meaning that, so long as the queue is available via nfs and a
  single node is running jobs from it, the system will continue to process jobs.
  there is no centralized process whatsoever - all nodes work to take jobs from
  the queue and run them as fast as possible. this creates a system which load
  balances automatically and is robust in the face of node failures.

  although the rq system is simple in its design it features powerful
  functionality such as priority management, predicate and sql query, compact
  streaming command-line processing, programmable api, hot-backup, and
  input/capture of the stdin/stdout/stderr io streams of remote jobs. to date rq
  has had no reported runtime failures and is in operation at dozens of research
  centers around the world.

INVOCATION
  the first argument to any rq command is always the name of the queue while the
  second is the mode of operation.
  the queue name may be omitted if, and only if, the environment variable RQ_Q
  has been set to contain the absolute path of the target queue. for instance,
  the command

    ~ > rq queue list

  is equivalent to

    ~ > export RQ_Q=queue
    ~ > rq list

  this facility can be used to create aliases for several queues. for example, a
  .bashrc containing

    alias MYQ="RQ_Q=/path/to/myq rq"
    alias MYQ2="RQ_Q=/path/to/myq2 rq"

  would allow syntax like

    MYQ2 submit < joblist

MODES
  rq operates in modes create, submit, resubmit, list, status, delete, update,
  query, execute, configure, snapshot, lock, backup, rotate, feed, recover,
  ioview, help, and a few others. the meaning of 'mode_args' will naturally
  change depending on the mode of operation.

  the following mode abbreviations exist. note that not all modes have
  abbreviations

    c  => create
    s  => submit
    r  => resubmit
    l  => list
    ls => list
    t  => status
    d  => delete
    rm => delete
    u  => update
    q  => query
    e  => execute
    C  => configure
    S  => snapshot
    L  => lock
    b  => backup
    R  => rotate
    f  => feed
    io => ioview
    h  => help

  create, c :

    creates a queue. the queue must be located on an nfs mounted file system
    visible from all nodes intended to run jobs from it. nfs locking must be
    functional on this file system.

    examples :

      0) to create a queue

        ~ > rq /path/to/nfs/mounted/q create

        or, using the abbreviation

        ~ > rq /path/to/nfs/mounted/q c

  submit, s :

    submit jobs to a queue to be processed by some feeding node. any 'mode_args'
    are taken as the command to run. note that 'mode_args' are subject to shell
    expansion - if you don't understand what this means do not use this feature
    and pass jobs on stdin.

    when running in submit mode a file may be specified as a list of commands to
    run using the '--infile, -i' option. this file is taken to be a newline
    separated list of commands to submit; blank lines and comments (#) are
    allowed. if submitting a large number of jobs the input file method is MUCH
    more efficient.
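
    as a rough sketch of the input file format just described, the snippet below
    writes a small joblist and previews which of its lines are commands. the
    variable names and the job names job1.sh and job2.sh are placeholders made up
    for this example, and the grep filter only mimics the stated rule (blank
    lines and '#' comments are skipped) under the assumption that a comment is a
    line whose first non-blank character is '#'. it is not rq's own parser.

```shell
# write a hypothetical joblist to a temporary file: one command per line.
# job1.sh and job2.sh are placeholder job names.
joblist=$(mktemp)
cat > "$joblist" <<'EOF'
# nightly processing (comment lines are allowed)
job1.sh

job2.sh
EOF

# preview the lines that count as commands: drop blank lines and lines whose
# first non-blank character is '#' (an assumption about comment syntax).
commands=$(grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$joblist")
echo "$commands"

# the file would then be handed to rq with the documented option, e.g.
#   rq q submit --infile="$joblist"
```

    previewing the file this way is only a convenience for checking a large
    joblist before submitting it.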
    if no commands are specified on the command line rq automatically reads them
    from stdin. yaml formatted files are also allowed as input
    (http://www.yaml.org/) - note that the output of nearly all rq commands is
    valid yaml and may, therefore, be piped as input into the submit command. the
    leading '---' of a yaml file may not be omitted.

    when submitting, the '--priority, -p' option can be used to determine the
    priority of jobs. priorities may be any integer, including negative ones;
    zero is the default. note that submission of a high priority job will NOT
    supplant a currently running low priority job, but higher priority jobs WILL
    always migrate above lower priority jobs in the queue in order that they be
    run as soon as possible. constant submission of high priority jobs may create
    a starvation situation whereby low priority jobs are never allowed to run.
    avoiding this situation is the responsibility of the user. the only guarantee
    rq makes regarding job execution is that jobs are executed in an
    'oldest-highest-priority' order and that running jobs are never supplanted.

    jobs submitted with the '--stage' option will not be eligible to be run by
    any node and will remain in a 'holding' state until updated (see update mode)
    into the 'pending' state. this option allows jobs to be entered, or 'staged',
    in the queue and then made candidates for running at a later date.

    rq allows the stdin of commands to be provided and also captures the stdout
    and stderr of any job run (of course standard shell redirects may be used as
    well) and all three will be stored in a directory relative to the queue
    itself. the stdin/stdout/stderr files are stored by job id and their location
    (though relative to the queue) is shown in the output of 'list' (see docs for
    list).

    examples :

      0) submit the job ls to run on some feeding host

        ~ > rq q s ls

      1) submit the job ls to run on some feeding host, at priority 9

        ~ > rq -p9 q s ls

      2) submit a list of jobs from file.

        ~ > cat joblist
        job1.sh
        job2.sh
        job2.sh
        ~ > rq q submit --infile=joblist

      3) submit a joblist on stdin. note the '-' used to specify reading jobs
         from stdin

        ~ > cat joblist | rq q submit -

        or

        ~ > rq q submit - < joblist

      4) submit cat as a job, providing the stdin for the cat job from the file
         cat.in

        ~ > rq q submit cat --stdin=cat.in

      5) submit cat as a job, providing the stdin for the cat job on stdin

        ~ > cat cat.in | rq q submit cat --stdin=-

        or

        ~ > rq q submit cat --stdin=- < cat.in

      6) submit the 42 jobs listed in cmdfile at priority 9, tagged 'important'

        ~ > wc -l cmdfile
        42
        ~ > rq -p9 -timportant q s < cmdfile

      7) re-submit all the 'important' jobs (see 'query' section below)

        ~ > rq q query tag=important | rq q s -

      8) re-submit all jobs which are already finished (see 'list' section below)

        ~ > rq q l f | rq q s

      9) stage the job wont_run_yet to the queue in a 'holding' state. no feeder
         will run this job until its state is upgraded to 'pending'

        ~ > rq q s --stage wont_run_yet

  ioview, io :

    as shown in the description for submit, a job may be provided stdin during
    job submission. the stdout and stderr of the job are also captured as the job
    is run. all three streams are captured in files located relative to the
    queue. so, if one has submitted a job, and its jid was shown to be 42, by
    using something like

      ~ > rq /path/to/q submit myjob --stdin=myjob.in
      ---
      -
        jid : 42
        priority : 0
        ...
        stdin : stdin/42
        stdout : stdout/42
        stderr : stderr/42
        ...
        command : myjob

    the stdin file will exist as soon as the job is submitted and the others will
    exist once the job has begun running. note that these paths are shown
    relative to the queue. in this case the actual paths would be

      /path/to/q/stdin/42
      /path/to/q/stdout/42
      /path/to/q/stderr/42

    but, since our queue is nfs mounted, the /path/to/q may or may not be the
    same on every host. thus the path is a relative one. this can make it
    annoying to view these files, but rq assists here with the ioview command.
    the ioview command spawns an external editor to view all three files.
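
    to illustrate why the listed paths are relative, the sketch below rebuilds
    the absolute paths by hand on one host. the variables q and jid are
    placeholders chosen for this example: q stands for wherever the queue happens
    to be mounted locally and jid for a job id taken from the listing. this is
    only a manual approximation of the lookup, not a description of how ioview
    itself is implemented.

```shell
# hypothetical values: the queue's mount point on this host and a job id
# as reported in the queue listing
q=/path/to/q
jid=42

# the listing shows stdin/42, stdout/42 and stderr/42 relative to the queue,
# so prefix each one with the local queue path
paths=$(for stream in stdin stdout stderr; do
  printf '%s/%s/%s\n' "$q" "$stream" "$jid"
done)
echo "$paths"
```

    on a host where the queue is mounted elsewhere only q would change, which is
    the reason the queue stores the paths in relative form.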
    its use is quite simple

    examples :

      0) view the stdin/stdout/stderr of job id 42

        ~ > rq ioview 42

    by default this will open up all three files in vim. the editor command can
    be specified using the '--editor' option or the ENV var RQ_EDITOR. the
    default value is 'vim -R -o' which allows all three files to be opened in a
    single window.

  resubmit, r :

    resubmit jobs back to a queue to be processed by a feeding node. resubmit is
    essentially equivalent to submitting a job that is already in the queue as a
    new job and then deleting the original job, except that using resubmit is
    atomic and, therefore, safer and more efficient. resubmission respects any
    previous stdin provided for job input. read docs for delete and submit for
    more info.

    examples :

      0) resubmit job 42 to the queue

        ~ > rq q resubmit 42

      1) resubmit all failed jobs

        ~ > rq q query exit_status!=0 | rq q resubmit -

      2) resubmit job 4242 with different stdin

        ~ > rq q resubmit 4242 --stdin=new_stdin.in

  list, l, ls :

    list mode lists jobs of a certain state or job id. state may be one of
    pending, holding, running, finished, dead, or all. any 'mode_args' that are
    numbers are taken to be job ids to list.

    states may be abbreviated to uniqueness, therefore the following shortcuts
    apply :

      p => pending
      h => holding
      r => running
      f => finished
      d => dead
      a => all

    examples :

      0) show everything in q

        ~ > rq q list all

        or

        ~ > rq q l all

        or

        ~ > export RQ_Q=q
        ~ > rq l

      1) show q's pending jobs

        ~ > rq q list pending

      2) show q's running jobs

        ~ > rq q list running

      3) show q's finished jobs

        ~ > rq q list finished

      4) show job id 42

        ~ > rq q l 42

      5) show q's holding jobs

        ~ > rq q list holding

  status, t :

    status mode shows the global state of the queue and statistics on the
    cluster's performance. there are no 'mode_args'. the meaning of each state is
    as follows:

      pending => no feeder has yet taken this job
      holding => a hold has been placed on this job, thus no fee