Complete documentation is available at ''https://slurm.schedmd.com/''.
  
SLURM is an open-source software system for cluster management; it is highly scalable and integrates fault-tolerance and job-scheduling mechanisms.
  
==== SLURM basic concepts ====
  
<code>#!/bin/bash
#SBATCH --nodes=[nnodes]                     #number of nodes
#SBATCH --ntasks-per-node=[ntasks per node]  #number of cores per node
#SBATCH --gres=gpu:[ngpu]                    #number of GPUs per node</code>
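With the placeholders filled in, such a header might look like the following sketch; the values are purely illustrative, and the partitions and GPU counts actually available depend on the cluster:

```shell
#!/bin/bash
#SBATCH --nodes=1                  # number of nodes (illustrative value)
#SBATCH --ntasks-per-node=8        # number of cores per node (illustrative value)
#SBATCH --gres=gpu:2               # number of GPUs per node (illustrative value)
```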
  
=== Example of parallel job submission ===
Suppose a given Python code has to be executed for different values of a variable ''rep''. The runs can be executed in parallel during job submission by creating temporary files, one for each value rep=a1, a2, .... The Python code example.py can contain the line:
<code> rep=REP </code>
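One way to realize this substitution is with ''sed''; the following is a minimal sketch in which the file names and the values 1, 2, 3 are illustrative:

```shell
# Write an example.py containing the placeholder line rep=REP (illustrative content)
printf 'rep=REP\nprint(rep)\n' > example.py
# For each desired value, write a temporary copy with the placeholder REP replaced
for val in 1 2 3; do
    sed "s/REP/$val/" example.py > tmp_$val.py
done
head -n 1 tmp_2.py   # first line of tmp_2.py is now rep=2
```

Each ''tmp_*.py'' can then be launched as its own process inside the job script.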
  
<code>#!/bin/bash
#SBATCH --nodes=[nnodes]                     #number of nodes
#SBATCH --ntasks-per-node=[ntasks per node]  #number of cores per node
#SBATCH --gres=gpu:[ngpu]                    #number of GPUs per node
NPROC=[nprocesses]                           #number of processing units to be accessed
  
 tmpstring=tmp               #temporary files generated  tmpstring=tmp               #temporary files generated 
   * Parallelization can be implemented within the Python code itself. For example, the evaluation of a function for different variable values can be done in parallel. Python offers many packages for parallelizing a given process; the most basic is [[https://docs.python.org/3/library/multiprocessing.html|multiprocessing]].
  
   * The Keras API in TensorFlow and the PyTorch framework, both mainly used for machine learning, detect the available GPUs automatically.
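As a minimal sketch of the multiprocessing approach, the function ''f'' and the input values below are illustrative; ''nproc'' would typically be matched to the cores requested from SLURM:

```python
from multiprocessing import Pool

def f(rep):
    # placeholder computation for one value of the variable rep
    return rep * rep

def run_parallel(values, nproc=4):
    # evaluate f over all values using a pool of nproc worker processes
    with Pool(processes=nproc) as pool:
        return pool.map(f, values)

if __name__ == "__main__":
    print(run_parallel([1, 2, 3, 4]))  # -> [1, 4, 9, 16]
```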
  
    
wiki/user_guide.1651846640.txt.gz · Last modified: 2022/05/06 14:17 by phegde