JobManagement » History » Version 7
Paweł Widera, 11/28/2008 08:43 PM
Maui grid status commands added
1 | 7 | Paweł Widera | |
---|---|---|---|
2 | 7 | Paweł Widera | h1. Job Management |
3 | 7 | Paweł Widera | |
4 | 1 | Anonymous | The queuing system (resource manager) is the heart of the distributed computing on a cluster. It consists of three parts: the server, the scheduler, and the machine-oriented mini-server (MOM) executing the jobs. |
5 | 1 | Anonymous | |
6 | 1 | Anonymous | We are assuming the following configuration: |
7 | 1 | Anonymous | |
8 | 3 | Anonymous | ||PBS TORQUE|| version 2.1.8 ||server, basic scheduler, mom ||[source:Externals/Cluster/torque-2.1.8.tgz download from repository] |
9 | 3 | Anonymous | ||MAUI || version 3.2.6.p18 ||scheduler ||[source:Externals/Cluster/maui-3.2.6p18.tgz download from repository] |
10 | 1 | Anonymous | |
11 | 1 | Anonymous | |
12 | 1 | Anonymous | |
13 | 3 | Anonymous | Please check the distributors' websites for newer versions: |
14 | 3 | Anonymous | |
15 | 1 | Anonymous | ||PBS TORQUE ||http://www.clusterresources.com/pages/products/torque-resource-manager.php |
16 | 1 | Anonymous | ||MAUI ||http://www.clusterresources.com/pages/products/maui-cluster-scheduler.php |
17 | 1 | Anonymous | |
18 | 1 | Anonymous | |
19 | 7 | Paweł Widera | The install directories for _TORQUE_ and _MAUI_ will be: |
20 | 1 | Anonymous | |
21 | 7 | Paweł Widera | ||PBS TORQUE ||_/var/spool/torque_ |
22 | 7 | Paweł Widera | ||MAUI ||_/var/spool/maui_ |
23 | 1 | Anonymous | |
24 | 1 | Anonymous | |
25 | 1 | Anonymous | |
26 | 1 | Anonymous | |
27 | 7 | Paweł Widera | h2. TORQUE |
28 | 7 | Paweł Widera | |
29 | 7 | Paweł Widera | |
30 | 7 | Paweł Widera | |
31 | 7 | Paweł Widera | h3. Register new services |
32 | 7 | Paweł Widera | |
33 | 7 | Paweł Widera | Edit _/etc/services_ and add at the end: |
34 | 7 | Paweł Widera | <pre> |
35 | 1 | Anonymous | # PBS/Torque services |
36 | 1 | Anonymous | pbs 15001/tcp # pbs_server |
37 | 1 | Anonymous | pbs 15001/udp # pbs_server |
38 | 1 | Anonymous | pbs_mom 15002/tcp # pbs_mom <-> pbs_server |
39 | 1 | Anonymous | pbs_mom 15002/udp # pbs_mom <-> pbs_server |
40 | 1 | Anonymous | pbs_resmom 15003/tcp # pbs_mom resource management |
41 | 1 | Anonymous | pbs_resmom 15003/udp # pbs_mom resource management |
42 | 1 | Anonymous | pbs_sched 15004/tcp # pbs scheduler (pbs_sched) |
43 | 1 | Anonymous | pbs_sched 15004/udp # pbs scheduler (pbs_sched) |
44 | 7 | Paweł Widera | </pre> |
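A quick way to confirm that all eight entries were registered is sketched below. The function name _check_pbs_services_ is hypothetical and not part of TORQUE; it only greps the services file for the name/port pairs listed above.

```shell
#!/bin/sh
# Check that every PBS/TORQUE service entry is present in a services file.
# check_pbs_services is a hypothetical helper, not a TORQUE command.
# Usage: check_pbs_services [FILE]   (FILE defaults to /etc/services)
check_pbs_services() {
    services_file="${1:-/etc/services}"
    missing=0
    for entry in pbs:15001 pbs_mom:15002 pbs_resmom:15003 pbs_sched:15004; do
        name="${entry%%:*}"
        port="${entry##*:}"
        for proto in tcp udp; do
            # Match "<name><whitespace><port>/<proto>" at the start of a line.
            if ! grep -Eq "^${name}[[:space:]]+${port}/${proto}([[:space:]]|\$)" "$services_file"; then
                echo "missing: ${name} ${port}/${proto}"
                missing=1
            fi
        done
    done
    return "$missing"
}
```

Calling @check_pbs_services@ without arguments inspects _/etc/services_; a non-zero exit status means at least one entry is missing.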
45 | 1 | Anonymous | |
46 | 1 | Anonymous | |
47 | 7 | Paweł Widera | |
48 | 7 | Paweł Widera | h3. Setup and Configuration on the Master Node |
49 | 7 | Paweł Widera | |
50 | 1 | Anonymous | Extract and build the TORQUE distribution on the master node. Configure the server, monitor, and clients to use secure file transfer (scp). |
51 | 7 | Paweł Widera | <pre> |
52 | 1 | Anonymous | export TORQUECFG=/var/spool/torque |
53 | 1 | Anonymous | tar -xzvf TORQUE.tar.gz |
54 | 1 | Anonymous | cd TORQUE |
55 | 7 | Paweł Widera | </pre> |
56 | 1 | Anonymous | |
57 | 1 | Anonymous | For a 64-bit machine, set the following compiler options before configuring: |
58 | 7 | Paweł Widera | <pre> |
59 | 1 | Anonymous | export FFLAGS="-m64 -march=[Add Architecture] -O3 -fPIC" |
60 | 1 | Anonymous | export CFLAGS="-m64 -march=[Add Architecture] -O3 -fPIC" |
61 | 1 | Anonymous | export CXXFLAGS="-m64 -march=[Add Architecture] -O3 -fPIC" |
62 | 1 | Anonymous | export LDFLAGS="-L/usr/local/lib -L/usr/local/lib64" |
63 | 7 | Paweł Widera | </pre> |
64 | 7 | Paweł Widera | *Attention*: For Intel Xeon processors use _-march=nocona_, for AMD Opteron processors use _-march=opteron_. |
65 | 1 | Anonymous | |
66 | 1 | Anonymous | Configure, build, and install: |
67 | 7 | Paweł Widera | <pre> |
68 | 1 | Anonymous | ./configure --prefix=/usr/local --with-spooldir=$TORQUECFG |
69 | 1 | Anonymous | make |
70 | 1 | Anonymous | make install |
71 | 7 | Paweł Widera | </pre> |
72 | 7 | Paweł Widera | If not configured otherwise, binaries are installed in _/usr/local/bin_ and _/usr/local/sbin_. |
73 | 1 | Anonymous | |
74 | 7 | Paweł Widera | Initialise/configure the queuing system's server daemon (_pbs_server_): |
75 | 7 | Paweł Widera | <pre> |
76 | 1 | Anonymous | pbs_server -t create |
77 | 7 | Paweł Widera | </pre> |
78 | 1 | Anonymous | |
79 | 1 | Anonymous | Set the PBS operator and manager (must be a valid user name). |
80 | 7 | Paweł Widera | <pre> |
81 | 1 | Anonymous | qmgr |
82 | 1 | Anonymous | > set server_name = master01.procksi.local |
83 | 1 | Anonymous | > set server scheduling = true |
84 | 1 | Anonymous | > set server operators = "root@master01.procksi.local,procksi@master01.procksi.local" |
85 | 1 | Anonymous | > set server managers = "root@master01.procksi.local,procksi@master01.procksi.local" |
86 | 7 | Paweł Widera | </pre> |
87 | 1 | Anonymous | |
88 | 7 | Paweł Widera | Allow only _procksi_ and _root_ to submit jobs into the queue: |
89 | 7 | Paweł Widera | <pre> |
90 | 1 | Anonymous | > set server acl_users = "root,procksi" |
91 | 2 | Anonymous | > set server acl_user_enable = true |
92 | 7 | Paweł Widera | </pre> |
93 | 1 | Anonymous | |
94 | 1 | Anonymous | Set the sender address for email sent by PBS: |
95 | 7 | Paweł Widera | <pre> |
96 | 1 | Anonymous | > set server mail_from = pbs@procksi.net |
97 | 7 | Paweł Widera | </pre> |
98 | 1 | Anonymous | |
99 | 1 | Anonymous | Allow submissions from slave hosts (only): |
100 | 7 | Paweł Widera | *ATTENTION: NEEDS TO BE CHECKED. DOES NOT WORK PROPERLY YET!* |
101 | 7 | Paweł Widera | <pre> |
102 | 7 | Paweł Widera | <pre> |
103 | 1 | Anonymous | > set server allow_node_submit = true |
104 | 1 | Anonymous | > set server submit_hosts = master01.procksi.local |
105 | 1 | Anonymous | slave01.procksi.local |
106 | 1 | Anonymous | slave02.procksi.local |
107 | 1 | Anonymous | slave03.procksi.local |
108 | 1 | Anonymous | slave04.procksi.local |
109 | 7 | Paweł Widera | </pre> |
110 | 1 | Anonymous | |
111 | 1 | Anonymous | |
112 | 1 | Anonymous | Restrict nodes that can access the PBS server: |
113 | 7 | Paweł Widera | <pre> |
114 | 1 | Anonymous | > set server acl_hosts = master01.procksi.local |
115 | 2 | Anonymous | slave01.procksi.local |
116 | 1 | Anonymous | slave02.procksi.local |
117 | 1 | Anonymous | slave03.procksi.local |
118 | 1 | Anonymous | slave04.procksi.local |
119 | 1 | Anonymous | > set server acl_host_enable = true |
120 | 7 | Paweł Widera | </pre> |
121 | 1 | Anonymous | |
122 | 7 | Paweł Widera | Also set the following in _torque.cfg_ so that the internal interface is used: |
124 | 7 | Paweł Widera | <pre> |
125 | 1 | Anonymous | SERVERHOST master01.procksi.local |
126 | 1 | Anonymous | ALLOWCOMPUTEHOSTSUBMIT true |
127 | 7 | Paweł Widera | </pre> |
128 | 7 | Paweł Widera | </pre> |
129 | 1 | Anonymous | |
130 | 1 | Anonymous | Configure default node to be used (see below): |
131 | 7 | Paweł Widera | <pre> |
132 | 1 | Anonymous | > set server default_node = slave |
133 | 7 | Paweł Widera | </pre> |
134 | 1 | Anonymous | |
135 | 1 | Anonymous | |
136 | 7 | Paweł Widera | Set the default queue to _batch_: |
137 | 7 | Paweł Widera | <pre> |
138 | 1 | Anonymous | > set server default_queue=batch |
139 | 7 | Paweł Widera | </pre> |
140 | 1 | Anonymous | |
141 | 7 | Paweł Widera | Configure the main queue _batch_: |
142 | 7 | Paweł Widera | <pre> |
143 | 1 | Anonymous | > create queue batch queue_type=execution |
144 | 1 | Anonymous | > set queue batch started=true |
145 | 1 | Anonymous | > set queue batch enabled=true |
146 | 1 | Anonymous | > set queue batch resources_default.nodes=1 |
147 | 7 | Paweł Widera | </pre> |
148 | 1 | Anonymous | |
149 | 7 | Paweł Widera | Configure the queue _test_ accordingly. |
150 | 1 | Anonymous | |
151 | 7 | Paweł Widera | Specify all compute nodes by creating or editing _$TORQUECFG/server_priv/nodes_. The list may include the machine on which pbs_server runs. If a compute node has more than one processor, add np=X after its name, where X is the number of processors. Add node properties (such as _slave_ or _xeon_ below) so that a subset of nodes can be requested at submission time. |
152 | 7 | Paweł Widera | <pre> |
153 | 1 | Anonymous | master01.procksi.local np=2 procksi master xeon |
154 | 1 | Anonymous | slave01.procksi.local np=2 procksi slave xeon |
155 | 1 | Anonymous | slave02.procksi.local np=2 procksi slave xeon |
156 | 1 | Anonymous | slave03.procksi.local np=4 procksi slave opteron |
157 | 1 | Anonymous | slave04.procksi.local np=4 procksi slave opteron |
158 | 7 | Paweł Widera | </pre> |
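As a sanity check on the nodes file, the total number of processor slots offered to the queue can be summed from the np= values. The sketch below assumes the one-node-per-line layout shown above; _count_procs_ is a hypothetical helper name, and a node without np= is counted as one processor.

```shell
#!/bin/sh
# Sum the np= values in a TORQUE nodes file.
# count_procs is a hypothetical helper; nodes without np= count as 1.
# Usage: count_procs FILE
count_procs() {
    awk '
        /^[ \t]*(#|$)/ { next }                  # skip comments and blank lines
        {
            np = 1
            for (i = 2; i <= NF; i++)
                if ($i ~ /^np=[0-9]+$/) np = substr($i, 4) + 0
            total += np
        }
        END { print total + 0 }
    ' "$1"
}
```

For example, @count_procs $TORQUECFG/server_priv/nodes@ prints the number of slots the server will advertise.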
159 | 1 | Anonymous | |
160 | 7 | Paweł Widera | Although the master node (_master01_) has two processors as well, we only allow one processor to be used for the queueing system as the other processor will be used for handling all frontend communication and I/O. (Make sure that hyperthreading technology is disabled on the head node and all compute nodes!) |
161 | 1 | Anonymous | |
162 | 1 | Anonymous | Request that a job be run on specific nodes at submission time: |
163 | 7 | Paweł Widera | * Run on any compute node: |
164 | 7 | Paweł Widera | <pre> |
165 | 1 | Anonymous | qsub -q batch -l nodes=1:procksi |
166 | 7 | Paweł Widera | </pre> |
167 | 7 | Paweł Widera | * Run on any slave node: |
168 | 7 | Paweł Widera | <pre> |
169 | 1 | Anonymous | qsub -q batch -l nodes=1:slave |
170 | 7 | Paweł Widera | </pre> |
171 | 7 | Paweł Widera | * Run on master node: |
172 | 7 | Paweł Widera | <pre> |
173 | 1 | Anonymous | qsub -q batch -l nodes=1:master |
174 | 7 | Paweł Widera | </pre> |
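The same requests can be placed inside the job script as @#PBS@ directives instead of on the command line. The sketch below writes a minimal test job; the function name _write_job_script_ and the job name _procksi_test_ are made up for illustration.

```shell
#!/bin/sh
# Write a minimal test job script to the given path.
# write_job_script and the job name procksi_test are hypothetical.
# Usage: write_job_script FILE
write_job_script() {
    cat > "$1" <<'EOF'
#!/bin/sh
#PBS -N procksi_test
#PBS -q batch
#PBS -l nodes=1:slave
# PBS_JOBID is set by the queuing system when the job starts.
echo "job $PBS_JOBID running on $(hostname)"
EOF
}
```

After @write_job_script test_job.sh@, the job would be submitted with @qsub test_job.sh@ and should land on any node carrying the _slave_ property.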
175 | 1 | Anonymous | |
176 | 1 | Anonymous | |
177 | 1 | Anonymous | |
178 | 1 | Anonymous | |
179 | 7 | Paweł Widera | |
180 | 7 | Paweł Widera | h3. Setup and Configuration on the Slave Nodes |
181 | 7 | Paweł Widera | |
182 | 1 | Anonymous | Extract and build the TORQUE distribution on each slave node. Configure the monitor and clients to use secure file transfer (scp). |
183 | 7 | Paweł Widera | <pre> |
184 | 1 | Anonymous | export TORQUECFG=/var/spool/torque |
185 | 1 | Anonymous | tar -xzvf TORQUE.tar.gz |
186 | 1 | Anonymous | cd TORQUE |
187 | 7 | Paweł Widera | </pre> |
188 | 1 | Anonymous | |
189 | 1 | Anonymous | For a 64-bit machine, set the following compiler options before configuring: |
190 | 7 | Paweł Widera | <pre> |
191 | 1 | Anonymous | export FFLAGS="-m64 -march=[Add Architecture] -O3 -fPIC" |
192 | 1 | Anonymous | export CFLAGS="-m64 -march=[Add Architecture] -O3 -fPIC" |
193 | 1 | Anonymous | export CXXFLAGS="-m64 -march=[Add Architecture] -O3 -fPIC" |
194 | 1 | Anonymous | export LDFLAGS="-L/usr/local/lib -L/usr/local/lib64" |
195 | 7 | Paweł Widera | </pre> |
196 | 7 | Paweł Widera | *Attention*: For Intel Xeon processors use _-march=nocona_, for AMD Opteron processors use _-march=opteron_. |
197 | 1 | Anonymous | |
198 | 1 | Anonymous | Configure, build, and install: |
199 | 7 | Paweł Widera | <pre> |
200 | 1 | Anonymous | ./configure --prefix=/usr/local --with-spooldir=$TORQUECFG --disable-server --enable-mom --enable-clients --with-default-server=master01.procksi.local |
201 | 1 | Anonymous | make |
202 | 1 | Anonymous | make install |
203 | 7 | Paweł Widera | </pre> |
204 | 1 | Anonymous | |
205 | 7 | Paweł Widera | Configure the compute nodes by creating or editing _$TORQUECFG/mom_priv/config_. The _$pbsserver_ line specifies the PBS server, _$loglevel_ sets the logging verbosity, _$restricted_ lists hosts that are trusted to access mom services as non-root, and _$usecp_ allows copying data via NFS instead of scp. |
206 | 7 | Paweł Widera | <pre> |
207 | 1 | Anonymous | $pbsserver master01.procksi.local |
208 | 1 | Anonymous | $loglevel 255 |
209 | 1 | Anonymous | $restricted master01.procksi.local |
210 | 1 | Anonymous | $usecp master01.procksi.local:/home/procksi /home/procksi |
211 | 7 | Paweł Widera | </pre> |
212 | 1 | Anonymous | |
213 | 1 | Anonymous | Start the queueing system (manually) in the correct order: |
214 | 7 | Paweł Widera | * Start the mom: |
215 | 7 | Paweł Widera | <pre> |
216 | 1 | Anonymous | /usr/local/sbin/pbs_mom |
217 | 7 | Paweł Widera | </pre> |
218 | 7 | Paweł Widera | * Kill the server: |
219 | 7 | Paweł Widera | <pre> |
220 | 1 | Anonymous | /usr/local/sbin/qterm -t quick |
221 | 7 | Paweł Widera | </pre> |
222 | 7 | Paweł Widera | * Start the server: |
223 | 7 | Paweł Widera | <pre> |
224 | 1 | Anonymous | /usr/local/sbin/pbs_server |
225 | 7 | Paweł Widera | </pre> |
226 | 7 | Paweł Widera | * Start the scheduler: |
227 | 7 | Paweł Widera | <pre> |
228 | 1 | Anonymous | /usr/local/sbin/pbs_sched |
229 | 7 | Paweł Widera | </pre> |
230 | 1 | Anonymous | |
231 | 7 | Paweł Widera | If you want to use MAUI as the final scheduler, remember to kill _pbs_sched_ after testing the TORQUE installation. |
232 | 1 | Anonymous | |
233 | 1 | Anonymous | |
234 | 1 | Anonymous | Check that all nodes are properly configured and reporting correctly: |
235 | 7 | Paweł Widera | <pre> |
236 | 1 | Anonymous | qstat -q |
237 | 1 | Anonymous | pbsnodes -a |
238 | 7 | Paweł Widera | </pre> |
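On a larger cluster it can help to filter the node report down to the nodes that need attention. The sketch below assumes the usual @pbsnodes -a@ layout, in which each node name starts an unindented line and is followed by indented @state = ...@ attributes; _report_bad_nodes_ is a hypothetical helper name.

```shell
#!/bin/sh
# Read `pbsnodes -a` output on stdin and print "<node> <state>" for every
# node whose state contains "down" or "offline".
# report_bad_nodes is a hypothetical helper; the input layout is assumed.
report_bad_nodes() {
    awk '
        /^[^ \t]/ { node = $1 }                  # unindented line: node name
        $1 == "state" && $2 == "=" {
            if ($3 ~ /down|offline/) print node, $3
        }
    '
}
```

Typical use is @pbsnodes -a | report_bad_nodes@; empty output means that no node reported itself as down or offline.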
239 | 1 | Anonymous | |
240 | 1 | Anonymous | |
241 | 7 | Paweł Widera | |
242 | 7 | Paweł Widera | h3. Prologue and Epilogue Scripts |
243 | 7 | Paweł Widera | |
244 | 1 | Anonymous | Get [repos:Externals/procksi_pbs.tgz] from the repository and untar it: |
245 | 7 | Paweł Widera | <pre> |
246 | 1 | Anonymous | tar -xvzf procksi_pbs.tgz |
247 | 7 | Paweł Widera | </pre> |
248 | 1 | Anonymous | |
249 | 7 | Paweł Widera | The _prologue_ script is executed just before the submitted job starts. Here, it generates a unique temp directory for each job in _/scratch_. |
250 | 1 | Anonymous | It must be installed on each NODE (master, slave): |
251 | 7 | Paweł Widera | <pre> |
252 | 1 | Anonymous | cp ./pbs/NODE/var/spool/torque/mom/priv/prologue $TORQUECFG/mom_priv |
253 | 1 | Anonymous | chmod 500 $TORQUECFG/mom_priv/prologue |
254 | 7 | Paweł Widera | </pre> |
255 | 1 | Anonymous | |
256 | 7 | Paweł Widera | The _epilogue_ script is executed right after the submitted job has ended. Here, it deletes the job's temp directory from _/scratch_. It must be installed on each NODE (master, slave): |
257 | 7 | Paweł Widera | <pre> |
258 | 1 | Anonymous | cp ./pbs/NODE/var/spool/torque/mom/priv/epilogue $TORQUECFG/mom_priv |
259 | 1 | Anonymous | chmod 500 $TORQUECFG/mom_priv/epilogue |
260 | 7 | Paweł Widera | </pre> |
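The actual scripts ship in _procksi_pbs.tgz_ and are not reproduced here. Purely as an illustration of what a prologue/epilogue pair of this kind does, the sketch below creates and removes a per-job directory. The function names and the _SCRATCH_ROOT_ variable are hypothetical, and it is assumed that TORQUE passes the job id as the first argument and the job owner as the second.

```shell
#!/bin/sh
# Hypothetical prologue step: create a private scratch directory for a job.
# $1 = job id, $2 = job owner (argument order assumed, not taken from the
# ProCKSI scripts). SCRATCH_ROOT is an illustrative override of /scratch.
make_job_scratch() {
    jobid="$1"
    owner="$2"
    scratch_root="${SCRATCH_ROOT:-/scratch}"
    dir="${scratch_root}/${jobid}"
    mkdir -p "$dir" || return 1
    chmod 700 "$dir"
    # Changing ownership needs root, which the prologue has on a real node.
    if [ "$(id -u)" -eq 0 ]; then
        chown "$owner" "$dir"
    fi
    echo "$dir"
}

# Hypothetical epilogue step: remove the job's scratch directory again.
remove_job_scratch() {
    jobid="$1"
    scratch_root="${SCRATCH_ROOT:-/scratch}"
    # Refuse an empty job id so that the scratch root itself is never removed.
    [ -n "$jobid" ] || return 1
    rm -rf "${scratch_root:?}/${jobid}"
}
```

A prologue built on this would call @make_job_scratch "$1" "$2"@, and the epilogue would call @remove_job_scratch "$1"@.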
261 | 1 | Anonymous | |
262 | 1 | Anonymous | |
263 | 1 | Anonymous | |
264 | 7 | Paweł Widera | h2. MAUI |
265 | 7 | Paweł Widera | |
266 | 7 | Paweł Widera | |
267 | 7 | Paweł Widera | |
268 | 7 | Paweł Widera | h3. Register new services |
269 | 7 | Paweł Widera | |
270 | 7 | Paweł Widera | Edit _/etc/services_ and add at the end: |
271 | 7 | Paweł Widera | <pre> |
272 | 1 | Anonymous | # PBS/MAUI services |
273 | 1 | Anonymous | pbs_maui 42559/tcp # pbs scheduler (maui) |
274 | 1 | Anonymous | pbs_maui 42559/udp # pbs scheduler (maui) |
275 | 7 | Paweł Widera | </pre> |
276 | 1 | Anonymous | |
277 | 1 | Anonymous | |
278 | 7 | Paweł Widera | |
279 | 7 | Paweł Widera | h3. Setup and Configuration on the Head Node |
280 | 7 | Paweł Widera | |
281 | 1 | Anonymous | Extract and build the MAUI distribution. |
282 | 7 | Paweł Widera | <pre> |
283 | 1 | Anonymous | export MAUIDIR=/var/spool/maui |
284 | 1 | Anonymous | tar -xzvf MAUI.tar.gz |
285 | 1 | Anonymous | cd MAUI |
286 | 7 | Paweł Widera | </pre> |
287 | 1 | Anonymous | |
288 | 1 | Anonymous | For a 64-bit machine, set the following compiler options before configuring: |
289 | 7 | Paweł Widera | <pre> |
290 | 1 | Anonymous | export FFLAGS="-m64 -march=[Add Architecture] -O3 -fPIC" |
291 | 1 | Anonymous | export CFLAGS="-m64 -march=[Add Architecture] -O3 -fPIC" |
292 | 1 | Anonymous | export CXXFLAGS="-m64 -march=[Add Architecture] -O3 -fPIC" |
293 | 1 | Anonymous | export LDFLAGS="-L/usr/local/lib -L/usr/local/lib64" |
294 | 7 | Paweł Widera | </pre> |
295 | 7 | Paweł Widera | *Attention*: For Intel Xeon processors use _-march=nocona_, for AMD Opteron processors use _-march=opteron_. |
296 | 1 | Anonymous | |
297 | 1 | Anonymous | Configure, build, and install: |
298 | 7 | Paweł Widera | <pre> |
299 | 1 | Anonymous | ./configure --with-pbs=$TORQUECFG --with-spooldir=$MAUIDIR |
300 | 5 | Paweł Widera | make |
301 | 5 | Paweł Widera | make install |
302 | 7 | Paweł Widera | </pre> |
303 | 5 | Paweł Widera | |
304 | 7 | Paweł Widera | Fine-tune MAUI in _$MAUIDIR/maui.cfg_: |
305 | 7 | Paweł Widera | <pre> |
306 | 1 | Anonymous | SERVERHOST master01.procksi.local |
307 | 5 | Paweł Widera | |
308 | 1 | Anonymous | # primary admin must be first in list |
309 | 1 | Anonymous | ADMIN1 procksi root |
311 | 1 | Anonymous | |
312 | 1 | Anonymous | # Resource Manager Definition |
313 | 1 | Anonymous | # EPORT may alternatively be 15017 (still to be tried) |
314 | 1 | Anonymous | RMCFG[MASTER01.PROCKSI.LOCAL] TYPE=PBS PORT=15001 EPORT=15004 |
318 | 1 | Anonymous | |
319 | 3 | Anonymous | SERVERPORT 42559 |
320 | 1 | Anonymous | SERVERMODE NORMAL |
321 | 3 | Anonymous | |
322 | 1 | Anonymous | # Node Allocation: |
323 | 1 | Anonymous | # JOBCOUNT number of jobs currently running on node |
324 | 3 | Anonymous | # LOAD current 1 minute load average |
325 | 3 | Anonymous | # AMEM real memory currently available to batch jobs |
326 | 3 | Anonymous | # APROCS processors currently available to batch jobs |
327 | 3 | Anonymous | # PREF node meets job specific resource preferences |
328 | 3 | Anonymous | |
329 | 3 | Anonymous | NODEALLOCATIONPOLICY PRIORITY |
330 | 3 | Anonymous | NODECFG[DEFAULT] PRIORITYF='-JOBCOUNT - 2*LOAD + 0.5*AMEM + 0.25*APROCS + PREF' |
331 | 7 | Paweł Widera | </pre> |
332 | 3 | Anonymous | |
333 | 3 | Anonymous | |
334 | 1 | Anonymous | Start the MAUI scheduler manually. Make sure that pbs_sched is not running any longer. |
335 | 3 | Anonymous | |
336 | 7 | Paweł Widera | * Start the scheduler: |
337 | 7 | Paweł Widera | <pre> |
338 | 1 | Anonymous | /usr/local/sbin/maui |
339 | 7 | Paweł Widera | </pre> |
340 | 1 | Anonymous | |
341 | 1 | Anonymous | |
342 | 1 | Anonymous | Get [repos:Externals/Cluster/procksi_pbs.tgz] from the repository and untar it: |
343 | 7 | Paweł Widera | <pre> |
344 | 1 | Anonymous | tar -xvzf procksi_pbs.tgz |
345 | 7 | Paweł Widera | </pre> |
346 | 3 | Anonymous | |
347 | 3 | Anonymous | Make the entire queuing system (Torque + Maui) start at bootup: |
348 | 7 | Paweł Widera | <pre> |
349 | 1 | Anonymous | cp ./pbs/master/etc/init.d/pbs_* /etc/init.d/ |
350 | 6 | Paweł Widera | /sbin/chkconfig --add pbs_mom |
351 | 6 | Paweł Widera | /sbin/chkconfig --add pbs_maui |
352 | 6 | Paweł Widera | /sbin/chkconfig --add pbs_server |
353 | 6 | Paweł Widera | /sbin/chkconfig pbs_mom on |
354 | 6 | Paweł Widera | /sbin/chkconfig pbs_maui on |
355 | 6 | Paweł Widera | /sbin/chkconfig pbs_server on |
356 | 7 | Paweł Widera | </pre> |
357 | 6 | Paweł Widera | |
358 | 7 | Paweł Widera | If you want to use the simple scheduler that comes with PBS Torque, then substitute _pbs_maui_ with _pbs_sched_. |
359 | 6 | Paweł Widera | |
360 | 6 | Paweł Widera | |
361 | 7 | Paweł Widera | |
362 | 7 | Paweł Widera | h3. Setup and Configuration on the Slave Nodes |
363 | 7 | Paweł Widera | |
364 | 6 | Paweł Widera | Get [repos:Externals/Cluster/procksi_pbs.tgz] from the repository and untar it: |
365 | 7 | Paweł Widera | <pre> |
366 | 6 | Paweł Widera | tar -xvzf procksi_pbs.tgz |
367 | 7 | Paweł Widera | </pre> |
368 | 6 | Paweł Widera | |
369 | 1 | Anonymous | Make the entire queuing system start at bootup: |
370 | 7 | Paweł Widera | <pre> |
371 | 1 | Anonymous | cp ./pbs/slave/etc/init.d/pbs_mom /etc/init.d/ |
372 | 1 | Anonymous | /sbin/chkconfig --add pbs_mom |
373 | 1 | Anonymous | /sbin/chkconfig pbs_mom on |
374 | 7 | Paweł Widera | </pre> |
375 | 1 | Anonymous | |
376 | 1 | Anonymous | |
377 | 7 | Paweł Widera | h3. Monitoring Grid Status |
378 | 7 | Paweł Widera | |
379 | 7 | Paweł Widera | |
380 | 7 | Paweł Widera | * display queue information (active/idle jobs) |
381 | 7 | Paweł Widera | <pre> |
382 | 1 | Anonymous | showq |
383 | 7 | Paweł Widera | </pre> |
384 | 7 | Paweł Widera | * current and historical scheduling statistics |
385 | 7 | Paweł Widera | <pre> |
386 | 1 | Anonymous | showstats -v |
387 | 7 | Paweł Widera | </pre> |
388 | 7 | Paweł Widera | * display job state and resources information |
389 | 7 | Paweł Widera | <pre> |
390 | 1 | Anonymous | checkjob <JOB_ID> |
391 | 7 | Paweł Widera | </pre> |
392 | 7 | Paweł Widera | * display node state and resources information |
393 | 7 | Paweł Widera | <pre> |
394 | 1 | Anonymous | checknode <NODE_NAME> |
395 | 7 | Paweł Widera | </pre> |