Saddleback Install Records & Specifics



Updated June 5, 2018

Specifications
Cluster is installed in ERC room 214 (ERC computer room).


How to Run Jobs
Saddleback jobs must be run in batch mode using the Torque job manager, which uses only the compute nodes (sb1, sb2, ...). Small interactive debug runs, including jobs run under TotalView, may use 12 cores or fewer on the master, saddleback. If you have a TotalView job that needs more cores, contact Kelley.

Home Directories :
/Users/username

Other Directories :
/usr/local/*
/temp
/data
/Models

Disk Storage :
User disk (boot drive) : 2TB shared by the OS and others
/disk2 : 3TB
/disk3 : 3TB
/pond : 82.5 TB raid
/pool : 164 TB raid

Program Locations :
PGI Fortran : /usr/local/pgi/linux86-64/14.7/bin/pgf95, pgf90, pgf77...
TotalView : /usr/local/toolworks/totalview/bin
netCDF : /usr/local/...

Shell Environment & Paths : default shell is csh
setenv PGI /usr/local/pgi
setenv PGRSH ssh
setenv LD_LIBRARY_PATH /usr/local/pgi/linux86-64/14.7/libso
Include these in your path statement:
/usr/local/bin
/usr/local/pgi/linux86-64/14.7/bin
/usr/local/pgi/linux86-64/2014/mpi/mvapich/bin
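
The settings above are written for csh, the default shell. The batch script in the next section runs under /bin/sh, so if you need the same environment inside a job script (or you use bash interactively), a rough sh equivalent might look like the sketch below. It only restates the values listed above; whether your job actually needs them set explicitly is an assumption to check.

```shell
#!/bin/sh
# sh/bash equivalents of the csh setenv lines listed above.
export PGI=/usr/local/pgi
export PGRSH=ssh
export LD_LIBRARY_PATH=/usr/local/pgi/linux86-64/14.7/libso

# Prepend the three directories to whatever PATH is already set.
PATH=/usr/local/bin:/usr/local/pgi/linux86-64/14.7/bin:/usr/local/pgi/linux86-64/2014/mpi/mvapich/bin:$PATH
export PATH
```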

Running Your Program :
Copy the commands below into a file (ex: test.pbs)
#!/bin/sh
#
#PBS -N Test1
#PBS -q batch
#PBS -l nodes=4:ppn=24
#PBS -d /Users/username/working_directory
#PBS -M username@kiwi.atmos.colostate.edu
#PBS -l walltime=00:20:00
#PBS -m abe

mpirun -np 96 mpi_code
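In order, these lines mean:
#!/bin/sh : run the script with the sh shell
-N : the job name (Test1)
-q : the queue to submit to (batch)
-l nodes=4:ppn=24 : request 4 nodes with 24 processors per node (96 cores in total)
-d : the working directory the job starts in
-M : the email address for job notifications
-l walltime=00:20:00 : limit the job to 20 minutes of wall-clock time
-m abe : send email when the job aborts (a), begins (b) and ends (e)
mpirun -np 96 mpi_code : start mpi_code on 96 processes, matching 4 nodes x 24 processors

Finally, you run your program with the following command: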
> qsub test.pbs

This is the qsub command documentation.
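
The -np 96 in the script above is simply nodes x ppn (4 x 24), so it has to be edited by hand whenever the node request changes. As a sketch of one way to avoid that, the lines below could replace the mpirun line in the script. They rely on the PBS_NODEFILE variable that Torque sets inside a running job, which points to a file with one line per allocated core. The helper name count_procs is hypothetical, and mpi_code is the same placeholder program name used above.

```shell
#!/bin/sh
# Sketch: derive the MPI process count from the Torque node file
# instead of hard-coding it.

# count_procs FILE : print the number of lines in FILE
count_procs() {
    wc -l < "$1" | tr -d ' '
}

if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    # Inside a Torque job: one line per allocated core.
    NP=$(count_procs "$PBS_NODEFILE")
    echo "Launching on $NP cores"
    mpirun -np "$NP" mpi_code
else
    # Outside a job the variable is not set, so do nothing.
    echo "PBS_NODEFILE is not set; submit this script with qsub" >&2
fi
```

With nodes=4:ppn=24 the node file should have 96 lines, so this would launch the same 96 processes as the hard-coded version.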

Debugging with TotalView : Default shell is csh. TotalView can only be run on 1 node: the master. If you need more than 6 cores please talk to Kelley or Mostafa for some non-batch reservation time.
TotalView variables:
setenv TVROOT /usr/local/toolworks/totalview
setenv TOTALVIEW /usr/local/toolworks/totalview/bin/totalview
setenv TVDSVRLAUNCHCMD ssh
setenv LM_LICENSE_FILE /usr/local/toolworks/license.dat:/usr/local/pgi/license.dat

Launch TotalView with:
mpirun -tv -np 1 program (where 1 is the number of processors in this example)
Note that with the PGI compiler, compiling with both -g and -O0 lets you set breakpoints on almost any line. With -g alone, some lines are not available for breakpoints.

If you want to specify nodes, use this command:
mpirun -tv -np 32 -hostfile myhostfile program
where myhostfile contains the names of the nodes you want to use.
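
As an illustration of building such a file, the sketch below writes myhostfile with each node name repeated once per process. It assumes the MPI launcher accepts a plain list of hostnames, one per line; check the documentation for the installed MPI before relying on that format. The helper name make_hostfile, the choice of sb1 and sb2, and the 16-per-node split are all hypothetical, picked only to match the -np 32 example above.

```shell
#!/bin/sh
# make_hostfile OUT COUNT NODE... : write each NODE name COUNT times, one per line
make_hostfile() {
    out=$1
    count=$2
    shift 2
    : > "$out"                      # create or empty the output file
    for node in "$@"; do
        i=0
        while [ "$i" -lt "$count" ]; do
            echo "$node" >> "$out"
            i=$((i + 1))
        done
    done
}

# 16 entries each for sb1 and sb2 gives 32 lines, matching -np 32.
make_hostfile myhostfile 16 sb1 sb2
```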

Generating Your Authorized Keys for the Nodes :
If you get an error indicating you do not have permission to run on the nodes (sb1, sb2, ...), you probably have not generated your keys yet. Do this:
> ssh-keygen (use default filename, do not enter a passphrase when asked)
> cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
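
Note that the cp command above replaces any authorized_keys file you already have. If you have other keys in that file that you want to keep, a variant that appends instead could look like the sketch below. The function name install_key is hypothetical; the permission settings reflect what sshd normally expects, which is an assumption about how this cluster is configured.

```shell
#!/bin/sh
# install_key SSH_DIR : append SSH_DIR/id_rsa.pub to SSH_DIR/authorized_keys
# unless that exact key line is already present.
install_key() {
    dir=$1
    pub="$dir/id_rsa.pub"
    auth="$dir/authorized_keys"
    if [ ! -r "$pub" ]; then
        echo "No public key at $pub; run ssh-keygen first" >&2
        return 1
    fi
    touch "$auth"
    # -x: whole-line match, -F: fixed string (no regex interpretation)
    grep -qxF -e "$(cat "$pub")" "$auth" || cat "$pub" >> "$auth"
    # sshd usually ignores key files that other users can write to.
    chmod 700 "$dir"
    chmod 600 "$auth"
}

# Usage, once ssh-keygen has been run:
#   install_key "$HOME/.ssh"
```

Running it twice leaves a single copy of the key in authorized_keys.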

Templates for Some of Our Models :
Don has put together a PDF describing how to port some of our models to saddleback, namely GCRM, SAM, VVM, SPCAM and BUGS. See Don's notes.


Kelley's Install Notes
See the equipment list for serial numbers, etc. Winning bid was Nor-Tech out of Burnsville, MN. Contact Bob Dreis.
saddleback IP = 129.82.48.243
Warranty info : http://prd1warser.cps.intel.com and enter serial number:
master : azgd1150033
sb1/sb2 : BZMY93400408
sb3/sb4 : BZMY93500179
sb5/sb6 : BZMY93600292
sb7/sb8 : BZMY93600343

Operating system: CentOS v7 (upgraded by Mostafa May 2018)
Additional Software

RAID Install
Chassis: SuperMicro SuperChassis 847E26-R1400LPB RAID chassis configured for Infiniband.
Vendor is Quick-800 from La Mesa, CA. Contact AJ Jackson.

Disks:
2 300GB Intel SSD 320 Series SATA2 drives for cache and logging from Quick-800
1 160GB 7.5K rpm SATA drive for OS from Quick-800
36 3TB Seagate Constellation enterprise drives.