hpc-cmb
All about hpc-cmb, the cluster maintained by HPC at USC for the CMB department.
1 howto
1.1 useful tips 2006-01
"hpcc":http://www.usc.edu/hpcc/
"pbs doc":http://www.usc.edu/hpcc/systems/use-l-4.php
"check cmb queue":http://hpc-cmb.usc.edu/cgi-bin/status-h.pl (2008-05-30)
cmb wiki (linked from this Plone site; search for it) -- 2008-05-30: dead since Drake left.
rpm -qa -- show all installed packages
yum list -- 2008-05-02 list all packages available to the system (installed + uninstalled)
checkjob JOB_NUMBER -- detailed view of a job; this can tell why a job doesn't start
clusterusage -- a status report of all queues and users, a Perl script written by Drake
pbsnodes -a -- detailed view of the load on all nodes
qbalance -h -- which accounts you may access and their balances (in hours)
qstat QUEUENAME -u USERNAME (-n) -- list a user's jobs in a queue; the optional -n also shows the nodes
qstat -Q -f -- check the info of all queues
qstat -n JOB_NUMBER -- In addition to the basic information, nodes allocated to a job are listed.
qsub -A lc_yh -- submit job via lc_yh account
showstart JOB_NUMBER -- check the schedule of a job
pbstop -- 02/09/09 PBS version of the well-known top command. Use / to search for a node by its name or a job by its id and get a detailed report.
1.3 tips
- The whole hpc cluster is a hybrid of 32-bit and 64-bit nodes (hpc-cmb is 64-bit).
- A 32-bit program can run on a 64-bit platform, but not the reverse. (01-06-06: this seems not to be true. The annot/bin/graph/ modules compiled on hpc-master (32-bit) can't be run on hpc-cmb (64-bit).)
- A 32-bit program has a 2G limit on output file size; a 64-bit program doesn't (its limit is probably very large).
- For a parallel program, -l arch=x86_64 requests that all nodes be 64-bit.
- 2006-01-05 csv and the modules in annot/bin/graph/ are compiled as 32-bit on hpc-master in order to use the large number of nodes in the main queue.
- 2008-03-15 Use -l mem=4G to request memory. -l is the resource list option. It can be followed by one request or by a list of requests separated by commas, e.g. -l arch=x86_64,mem=4G. 4G, 4gb and 4g all mean the same.
- 2008-05-15 For parallel jobs, -l mem=100G specifies the amount of memory the whole multi-node job needs, not the memory needed on a single node.
- 2009-10-05 To request a node by its name: -l nodes=hpc0704:ppn=1
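The comma-separated form of the -l option described above can be sketched in a few lines of Python. The helper name resource_option is hypothetical and not part of PBS; it only joins request strings and does not check that the scheduler accepts them.

```python
def resource_option(requests):
    """Join resource requests into a single '-l' option (hypothetical helper).

    Each request is a plain string such as 'arch=x86_64' or 'mem=4G';
    multiple requests are separated by commas, as in the notes above.
    """
    cleaned = [r.strip() for r in requests if r.strip()]
    if not cleaned:
        raise ValueError("at least one resource request is required")
    return "-l " + ",".join(cleaned)


# Examples taken from the tips above.
print(resource_option(["arch=x86_64", "mem=4G"]))
print(resource_option(["nodes=hpc0704:ppn=1"]))
```

The resulting string can be pasted after qsub on the command line; nothing here talks to the scheduler itself.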
1.4 testing mpi program (2006-08-09)
qsub an interactive job:
qsub -I -A lc_yh -l arch=x86_64 -l nodes=2:ppn=2 -l walltime=30:00
execute the parallel program:
mpiexec ...
capacity of the quick queue (2006-08-09):
nodes=4:ppn=4 walltime=30:00
2009-06-05 Notes:
A memory specification must always be an integer, like 3g or 30g; fractional values are not accepted.
The memory requested has to be a bit lower than the officially declared maximum memory of the nodes.
Table. Memory specification must be lower than the official memory label

memory specification    official memory labelling
0-3g                    4G, 12G, 32G node
4-11g                   12G, 32G node
12-30g                  32G node
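As a rough illustration of the memory table, the lookup below returns the node classes that can satisfy a given memory request. The thresholds are copied from the table; the function and constant names are hypothetical, and the limits may have changed on the real cluster since 2009.

```python
# (largest accepted request in GB, official node label in GB), from the table.
NODE_CLASSES = [(3, 4), (11, 12), (30, 32)]


def eligible_node_labels(mem_gb):
    """Return the official labels (in GB) of nodes that can take the request."""
    # Only whole numbers of GB are accepted (3g, 30g, no fractions).
    if isinstance(mem_gb, bool) or not isinstance(mem_gb, int):
        raise ValueError("memory request must be a whole number of GB")
    if mem_gb < 0:
        raise ValueError("memory request must not be negative")
    return [label for max_request, label in NODE_CLASSES if mem_gb <= max_request]


def mem_option(mem_gb):
    """Format the request as a -l option, refusing requests no node can take."""
    if not eligible_node_labels(mem_gb):
        raise ValueError("no node class in the table accepts %dg" % mem_gb)
    return "-l mem=%dg" % mem_gb


print(eligible_node_labels(4))
print(mem_option(30))
```

A request of 31g or more is rejected here because the table lists no node class that accepts it.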
2 misc
2.1 make for annot (python2.2 and customized boost library) 2006-01
swig -- directory of include files for python2.2
boost -- include and library files for boost-1.33.1. System is boost-1.31.0.
make CxxFlags="-O3 -fPIC -L /home/rcf-14/yuhuang/lib/boost/lib/ -I /home/rcf-14/yuhuang/lib/boost/include/" SwigInclude=/usr/include/python2.2/ SharedLibFlags="-shared -fPIC -L /home/rcf-14/yuhuang/lib/boost/lib/" Libs=""
12-16-05 -- Garrick installed boost 1.33.1, which simplifies things a lot: 'make SwigInclude=/usr/include/python2.2/ Libs=""'
module_cc/, tightClust/ and graph/clustering.cc failed to compile but they are not necessary.
2.2 ssh from mainnode to other nodes 2006-01
Normally, the compute nodes are closed to ssh.
When a user has a job running on a node, the user is allowed to ssh into that node, but only that node.
For a parallel program, you can log into any of its nodes from hpc-cmb. If you log into the master node (the 1st node of the job), you are allowed to log from there into the slave nodes. But if you log into a slave node, you can't log from there into the other slave nodes or the master node.
After you log into a node, you can log back into hpc-cmb, for example to submit another job before the current one ends.
2.3 problem with 'fasta_block_iterator' of transfacdb.py on hpc-cmb (2006-08-31)
It seems 'for line in self.inf' doesn't work with python2.2 on hpc-cmb, though it works fine on my desktop (python2.3). After changing it to 'line = self.inf_r.readline() while(line):', it worked.
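The workaround pattern, an explicit readline() loop in place of iterating over the file object, looks roughly like this. It is a minimal sketch written for a current Python 3, with a hypothetical function name and an in-memory file standing in for the FASTA input; it is not the actual transfacdb.py code.

```python
import io


def read_lines(inf):
    """Collect all lines with an explicit readline() loop."""
    lines = []
    line = inf.readline()
    while line:  # readline() returns an empty string at end of file
        lines.append(line)
        line = inf.readline()
    return lines


# In-memory stand-in for an input file (made-up content).
sample = io.StringIO(">seq1\nACGT\n>seq2\nTTGA\n")
print(read_lines(sample))
```

The loop relies only on readline() returning an empty string at end of file, so it does not depend on the file object supporting iteration.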
3 packages
3.1 install python-newt 2006-01-06
grab the debian source code -- apt-get source python-newt
copy source code to hpc-cmb -- scp newt-0.51.6/newt-0.51.6.tar.gz
- configure
- make
- copy snack.py and python2.2/_snackmodule.so to a directory on PYTHONPATH
3.2 install matplotlib on hpc-cmb(2006-09-06)
- download the source code from http://matplotlib.sourceforge.net/
- unpack the tarball and cd into it
- /usr/usc/python/default/bin/python setup.py build
- /usr/usc/python/default/bin/python setup.py install --home=~/lib64/matplotlib/
- mv ../matplotlib/lib/python/* ~/lib64/python/ (move the files to the right place), then clean out the directory '../matplotlib'
3.3 install psycopg on hpc-cmb(2006-10-05)
- download http://initd.org/pub/software/psycopg/PSYCOPG-1-0/psycopg-1.0.15.1.tar.gz
- unpack the tarball and cd into the directory
- ./configure --with-python=/usr/usc/python/default/bin/python2.4 --with-postgres-includes=/usr/include/pgsql/server/
- make
- install the module: cp psycopgmodule.so ../python/
3.4 install SQLAlchemy on hpc-cmb(2008-05-09)
- wget http://pypi.python.org/packages/source/S/SQLAlchemy/SQLAlchemy-0.4.5.tar.gz
- tar -zxvf SQLAlchemy-0.4.5.tar.gz
- cd SQLAlchemy-0.4.5
- python setup.py build (it will also fetch the easy-install package from the Cheese Shop)
- export PYTHONPATH=$PYTHONPATH:~/lib64/python_install_home_dir/ (you might need to add easy-install.pth in ~/lib64/python_install_home_dir/lib/python; SQLAlchemy requires the easy-install package, which python2.3 on hpc-cmb doesn't have.)
- python setup.py install --home=~/lib64/python_install_home_dir/
- cp -r ~/lib64/python_install_home_dir/lib/python/SQLAlchemy-0.4.5-py2.3.egg/sqlalchemy/ ~/lib64/python/
3.5 install numpy on hpc-cmb(2008-05-09)
- download
- tar -zxvf numpy-1.0.4.tar.gz
- cd numpy-1.0.4
- python setup.py install --home=~/lib64/python_install_home_dir/
- cp -r ../python_install_home_dir/lib64/python/numpy/ ../python/
3.6 install elixir on hpc-cmb (2008-07-17)
- cd ~/lib64
- wget http://pypi.python.org/packages/source/E/Elixir/Elixir-0.5.2.tar.gz#md5=ea37e917896dce419f777d17afc43fce
- tar -zxvf Elixir-0.5.2.tar.gz
- cd Elixir-0.5.2
- cp Elixir-0.5.2/elixir/ -r python/ (python setup.py build failed with: "from setuptools import setup, find_packages ImportError: No module named setuptools".)
- cd ~/lib/python
- ln -s ../../lib64/python/elixir/ .
Elixir is written entirely in Python, so copying the package directory is good enough.
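Why copying works for a package with no compiled parts can be shown with a small sketch: a package directory becomes importable as soon as its parent directory is on the module search path. The package name demo_pure_pkg and both helper functions are made up for the demonstration, which uses a temporary directory instead of ~/lib64/python and is written for a current Python 3.

```python
import importlib
import os
import sys
import tempfile


def make_pure_python_package(parent_dir, name):
    """Create a minimal package directory holding only an __init__.py."""
    pkg_dir = os.path.join(parent_dir, name)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as fh:
        fh.write("GREETING = 'imported from a copied directory'\n")
    return pkg_dir


def import_from(parent_dir, name):
    """Put parent_dir on sys.path (like extending PYTHONPATH) and import name."""
    if parent_dir not in sys.path:
        sys.path.insert(0, parent_dir)
    importlib.invalidate_caches()
    return importlib.import_module(name)


with tempfile.TemporaryDirectory() as tmp:
    make_pure_python_package(tmp, "demo_pure_pkg")
    module = import_from(tmp, "demo_pure_pkg")
    print(module.GREETING)
```

No build or install step is involved; a package with compiled extension modules would not work this way.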