This shows you the differences between two versions of the page.
| Both sides previous revision Previous revision Next revision | Previous revision | ||
|
wiki:user_guide [2022/05/06 14:17] phegde |
wiki:user_guide [2022/05/28 18:18] (current) cnr-guest [Job preparation ans submission] |
||
|---|---|---|---|
| Line 101: | Line 101: | ||
| Complete documentation is avalailable at '' | Complete documentation is avalailable at '' | ||
| - | SLUR is an open source software sytstem for cluster management; it is highly scalable and integrates fault-tolerance and job scheduling mechanisms. | + | SLURM is an open source software sytstem for cluster management; it is highly scalable and integrates fault-tolerance and job scheduling mechanisms. |
| ==== SLURM basic concepts ==== | ==== SLURM basic concepts ==== | ||
| Line 325: | Line 325: | ||
| < | < | ||
| - | #SBATCH --nodes= | + | #SBATCH --nodes=[nnodes] |
| - | #SBATCH --ntasks-per-node= #number of cores per node | + | #SBATCH --ntasks-per-node=[ntasks per node] #number of cores per node |
| - | #SBATCH --gres=gpu: | + | #SBATCH --gres=gpu:[ngpu] |
| - | NPROC= | + | |
| + | === Example of parallel jobs submission === | ||
| Suppose a given python code has to be executed for different values of a variable " | Suppose a given python code has to be executed for different values of a variable " | ||
| < | < | ||
| Line 336: | Line 336: | ||
| < | < | ||
| - | #SBATCH --nodes= | + | #SBATCH --nodes=[nnodes] |
| - | #SBATCH --ntasks-per-node= | + | #SBATCH --ntasks-per-node=[ntasks per node] # |
| - | #SBATCH --gres=gpu: | + | #SBATCH --gres=gpu:[ngpu] |
| - | NPROC= | + | NPROC=[nprocesses] |
| tmpstring=tmp | tmpstring=tmp | ||
| Line 358: | Line 358: | ||
| * Parallelization can be implemented within the python code itself. For example, the evaluation of a function for different variable values can be done in parallel. Python offers many packages to parallelize the given process. The basic one among them is [[https:// | * Parallelization can be implemented within the python code itself. For example, the evaluation of a function for different variable values can be done in parallel. Python offers many packages to parallelize the given process. The basic one among them is [[https:// | ||
| - | * The keras module | + | * The keras and Pytorch modules |