Introduction
The GNU Parallel tool is installed on the frontend and compute nodes. A very complete tutorial is available in the official GNU Parallel documentation.
Here we simply present a basic usage of GNU Parallel. The goal is to deal efficiently with a large number of similar, independent, sequential (or weakly parallel) jobs.
Test case
Suppose we want to process 100 files named
file_1.txt
...
file_100.txt
using a script, process_file, which takes a filename as its input argument:
#!/bin/bash
# Copy the content of the input file to a new file, then sleep for 3 to 7 seconds.
infile="$1"
outfile="out${infile}"
sleep_secs=$(( RANDOM % 5 + 3 ))
echo "Write file $infile to $outfile and sleep for ${sleep_secs} seconds."
cat "$infile" > "$outfile"
sleep "$sleep_secs"
For the purpose of this tutorial, let's generate some files.
for i in {1..100}; do printf "%3d times %3d is %5d\n" "${i}" "${i}" "$(( i * i ))" > "file_${i}.txt"; done
The script just reads the content of each file, writes it to a new one and sleeps for a few seconds.
Don't forget to make the process_file script executable (chmod +x process_file).
Parallel processing
Processing these 100 files in parallel can easily be done as follows:
parallel ./process_file file_{}.txt ::: {1..100}
In this command, {} is replaced by 1, 2, ..., 100 and the resulting 100 tasks are executed in parallel.
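Before launching a large batch, the substitution can be previewed with the --dry-run option, which prints the generated commands instead of executing them. A minimal sketch:

```shell
# Print the commands GNU Parallel would run, without executing them.
# Each input value is substituted for {} in turn.
parallel --dry-run ./process_file file_{}.txt ::: 1 2 3
```

This is a convenient sanity check that the replacement string expands to the filenames you expect.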
Using the -a option, the input arguments can be read from a file:
ls file_*.txt > filelist
parallel -a filelist ./process_file
or via stdin:
ls file_*.txt | parallel ./process_file
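If filenames may contain spaces, a more robust variant (assuming the files sit in the current directory) is to pass them null-delimited with find -print0 and parallel's -0 option:

```shell
# Null-delimited filenames (-print0 / -0) are safe even if names contain spaces.
find . -maxdepth 1 -name 'file_*.txt' -print0 | parallel -0 ./process_file
```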
Multiple input arguments
When multiple input argument lists are provided, GNU Parallel generates all possible combinations:
parallel echo {} ::: A B C ::: 1 2
Output:
A 1
A 2
B 1
B 2
C 1
C 2
Here, {} is replaced by all combinations of input arguments. To access them individually, use {1} and {2}:
parallel echo "{1} and {2}" ::: A B C ::: 1 2
The --link option pairs the input arguments one-to-one as follows:
parallel --link echo "{1} {2}" ::: A B C ::: D E F
Output:
A D
B E
C F
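The difference in the number of generated jobs can be checked with --dry-run: without --link the job count is the product of the list sizes, whereas --link runs one job per position:

```shell
# Cartesian product: 3 x 2 = 6 generated commands.
parallel --dry-run echo {1} {2} ::: A B C ::: 1 2 | wc -l
# Linked lists: one command per position, 3 in total.
parallel --link --dry-run echo {1} {2} ::: A B C ::: D E F | wc -l
```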
Execution control
With the -j option (or --jobs), one can specify the number of concurrent jobs.
By default, parallel runs one job per available CPU core.
In the parallel command, the symbol {%} is replaced by the job slot number (from 1 up to the number of concurrent jobs).
For instance,
parallel -j 3 'echo hello {} from {%} && sleep 2s' ::: {1..6}
gives
hello 1 from 1
hello 2 from 2
hello 3 from 3
hello 4 from 1
hello 5 from 2
hello 6 from 3
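The number of slots can also be tied to the machine: -j accepts a percentage of the available cores, and the core count can be passed explicitly. A sketch, assuming a system that provides the nproc command:

```shell
# Use half of the available cores for the job slots.
parallel -j 50% 'echo hello {} from slot {%}' ::: {1..6}
# Or one slot per core, explicitly, on systems providing nproc:
parallel -j "$(nproc)" 'echo hello {} from slot {%}' ::: {1..6}
```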
One may want to delay the start of successive jobs, for instance if they perform a lot of I/O. Example:
parallel --delay 3.5 'echo Starting {} on $(date)' ::: 1 2 3
Output:
Starting 1 on Mon Jul 24 14:26:52 CEST 2023
Starting 2 on Mon Jul 24 14:26:56 CEST 2023
Starting 3 on Mon Jul 24 14:26:59 CEST 2023
GNU Parallel in Slurm jobs
Using GNU Parallel in multi-node mode is a bit more complicated - check out the official documentation if you are interested.
To submit a GNU Parallel job to Slurm, one should use a multi-core submission script.
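A minimal sketch of such a submission script, assuming a single-node, multi-core allocation (the job name, core count and time limit are placeholders to adapt to your site):

```shell
#!/bin/bash
#SBATCH --job-name=parallel-demo   # placeholder job name
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16         # allocated cores = number of parallel slots
#SBATCH --time=01:00:00            # placeholder time limit

# Run one job per allocated core on this node.
parallel -j "$SLURM_CPUS_PER_TASK" ./process_file ::: file_*.txt
```

Matching -j to $SLURM_CPUS_PER_TASK keeps the number of concurrent jobs consistent with the resources Slurm actually granted.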
Monitoring
The --joblog option generates simple statistics on finished jobs and provides a way to restart interrupted jobs. For instance,
parallel --joblog mylog sleep ::: 1 2 3 4 5 6
creates a file mylog starting with
Seq Host Starttime JobRuntime Send Receive Exitval Signal Command
1 : 1690206751.025 1.098 0 0 0 0 sleep 1
2 : 1690206751.026 2.002 0 0 0 0 sleep 2
Suppose that the execution was interrupted after a little more than 2 seconds, so that the file mylog stops at job 2.
The --resume option (with the same input parameters!) restarts the execution, launching only the remaining jobs:
parallel --resume --joblog mylog sleep ::: 1 2 3 4 5 6
In the following output, note the start times of jobs 3, 4, 5 and 6 compared to jobs 1 and 2:
Seq Host Starttime JobRuntime Send Receive Exitval Signal Command
1 : 1690206751.025 1.098 0 0 0 0 sleep 1
2 : 1690206751.026 2.002 0 0 0 0 sleep 2
3 : 1690206787.147 3.000 0 0 0 0 sleep 3
4 : 1690206787.148 4.002 0 0 0 0 sleep 4
5 : 1690206787.149 5.002 0 0 0 0 sleep 5
6 : 1690206787.150 6.001 0 0 0 0 sleep 6
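When some jobs fail rather than being interrupted, two related options may help: --resume-failed re-runs the jobs whose exit value in the joblog was non-zero, and --retries retries a failing job a given number of times. A small sketch:

```shell
# Re-run only the jobs recorded as failed (non-zero Exitval) in mylog.
parallel --resume-failed --joblog mylog ./process_file ::: file_*.txt
# Retry each failing job up to 3 times before giving up.
parallel --retries 3 --joblog mylog2 ./process_file ::: file_*.txt
```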