Kungl. Tekniska Högskolan Royal Institute of Technology
School of Biotechnology
Division of Theoretical Chemistry & Biology

Getting started on Lenngren

Getting an account

You need an ordinary user account at PDC. Go to the PDC accounts page and follow the instructions.

Kerberos

You need to have Kerberos installed on your computer to be able to log into Lenngren. Kerberos is available in the form of packages in many Linux distributions. All theochem workstations have Kerberos installed.

If you want to be able to run X-client applications (e.g. xterm, emacs windows, graphical debuggers) you should use KTH's own implementation of Kerberos, known as Heimdal. There is a selection of pre-compiled binaries available as Kerberos "travel kits" at http://www.pdc.kth.se/support/kerberos/travelkit.html

For Windows there is a travel kit as well, but a nicer solution is Cygwin, which gives you a complete Unix-style interface with X Windows and all necessary Kerberos components: http://support/kerberos/cygwin-install.html

Login process

Make sure the Kerberos executable directory is in your path (at theochem: /usr/kerberos/bin).

Obtain "tickets":

   kinit username@NADA.KTH.SE

Note that tickets have a finite lifetime; the default can be changed with the "-l" option.

Check tickets:

   klist

Log in:

   telnet [-ax] -F juliana.pdc.kth.se

(The options -ax should be used from theochem workstations, which have the Red Hat/Fedora Kerberos distribution.) The -F option makes sure that your tickets are forwarded and that they are reforwardable (necessary when you submit jobs).

Comment: Kerberos provides a mechanism for both authentication (who you are) and authorization (what you are allowed to do). Your home directory at PDC is part of a global file system, AFS, which is controlled by Kerberos as well. Therefore, once you have logged in at PDC, you must have valid tickets to have access to your own files! The -F option takes care of this.

Your account comes with special scripts that are executed when you log in. These are called .login in csh and .profile in sh/bash (.bash_profile works for bash too). To be on the safe side you can add the following line to your login script:

   /usr/heimdal/bin/kinit -l 30d

This makes sure that long batch jobs do not fail because the tickets expire. (See the note on the Kerberos ticket system below!) Users concerned with security may do a kdestroy before logging out. This does not interfere with your batch job; it is the tickets valid at submit time that are copied to the batch system.

To summarize: whether you use -F to forward tickets properly or regenerate tickets once you have logged in is your choice. Make a habit of checking your tickets before submitting batch jobs. Even experienced users make the mistake of submitting without tickets that last for the whole batch job.
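
The habit of checking tickets before submitting can be built into a small shell function. The following is a minimal sketch, not a PDC-provided tool: the name checked_esubmit is hypothetical, and it assumes the Heimdal klist, whose -t option tests for a valid ticket (MIT klist uses -s instead). It only checks that valid tickets exist at submit time, not how long they remain valid, so you should still compare the expiry time printed by klist with the length of your job.

```shell
# Hypothetical helper: submit only if valid Kerberos tickets are present.
# Assumes Heimdal klist (-t tests for a valid ticket); with MIT klist use -s.
checked_esubmit() {
    if ! klist -t >/dev/null 2>&1; then
        echo "No valid Kerberos tickets; run kinit before submitting." >&2
        return 1
    fi
    # All arguments are passed through to esubmit unchanged.
    esubmit "$@"
}
```

It is used exactly like esubmit, e.g. "checked_esubmit -n 2 -t 600 myprogram", where myprogram is a placeholder.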

Note on the Kerberos ticket system

Kerberos never sends your password to the server. Somewhat simplified, Kerberos works this way: during authentication upon login, the server sends an encrypted package to your computer. If you can open the package using your password (on your local computer, never transmitting the password anywhere), you gain access to the server as requested. This means that you can create a ticket on your local machine without sending your PDC password to anyone. Then you forward the ticket to PDC and use it there.
Now, if you instead decide to create a new ticket once you have logged on to PDC, your password is transmitted over the Internet! Thus you are advised not to create tickets this way!

File transfer

Use the ftp of your Kerberos distribution. The Heimdal version of ftp seems to work only in passive mode (-p option):

   ftp [-p] -f ftp.pdc.kth.se

Use "prot" at the command prompt to obtain encrypted communication. With MIT Kerberos this is obtained with

   /usr/kerberos/bin/ftp -x -f ftp.pdc.kth.se

There are some firewall issues with rcp/rsh. The only working rsh is the Heimdal version, and then only with the "-e" option, which disables output to stderr. Unfortunately rcp does not have such a flag. In this case one has to resort to writing a script that does the copy over rsh, containing e.g.

   dd if=localfile | rsh -e juliana.pdc.kth.se dd of=remotefile

With Red Hat Kerberos you can only use ftp. Remember to use the -f flag.

Modules

The "module" system is used for setting up paths and environment variables. I recommend at least the following in your login script

   module add heimdal easy

for Kerberos and batch job commands respectively. Some online help is available with

   module help

Disk quota

Your home directory is mounted on an AFS volume with the name "H.user". Your current quota and disk usage are shown with

   module add afsws
   fs lq ~

Note that there is a backup directory called ~/OldFiles (with volume name H.user.backup). This normally contains the backup of your home directory from the previous night: if you accidentally remove a file ~/yo, it is easy to repair the damage by recovering it from ~/OldFiles/yo. It is also possible to recover older versions of a particular file, but then you need assistance from PDC staff. If your disk space is too small for your normal activities, the quota can be raised (within reasonable limits). Contact vahtras@theochem.kth.se
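
Recovering a file from last night's backup is an ordinary copy out of ~/OldFiles. As a minimal sketch (the function name restore_from_oldfiles is hypothetical and not part of the PDC environment), the copy can be wrapped so that it first checks that a backup copy actually exists:

```shell
# Hypothetical helper: restore a file from last night's backup in ~/OldFiles.
# The argument is a path relative to the home directory, e.g. "yo".
restore_from_oldfiles() {
    backup="$HOME/OldFiles/$1"
    if [ ! -e "$backup" ]; then
        echo "No backup copy found at $backup" >&2
        return 1
    fi
    # -p keeps the modification time and permissions of the backup copy.
    cp -p "$backup" "$HOME/$1"
}
```

For the example above, "restore_from_oldfiles yo" copies ~/OldFiles/yo back to ~/yo. Note that it overwrites an existing ~/yo without asking.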

Submitting jobs

A normal submit is

   esubmit -n <nodes> -t <minutes>  [program program_arguments]

Comment: in order to submit jobs you have to belong to a cac (charge account category). Check with the command

   cac members <username>
On the chemistry portions of the cluster you will have a personal cac if the group leaders agree that you are allowed to run. On the SNIC part you may belong to additional cacs, which may be specified with the esubmit command line option -c.

You will receive an email from the batch system when the job starts and when it finishes.

If you leave out the program and program_arguments, the nodes will be reserved for the time you have specified. Then you have sole access to the nodes: you can log in and try things interactively. The email that announces that the job has started contains the list of nodes that are available to you.

To list jobs in the batch queue:

   spq

To delete a job in the queue:

   sprelease -j JID

where JID is the job id number that is printed by the "spq" command.

To find out the current queue limit settings there are a few commands that can help. The end of the output from

   spstatus -s

gives the division of the cluster into job classes - the number of nodes that are available for jobs of different length e.g.

----- Space Information -----

D: 61 of 61 available for   4h jobs.
D:  4 of 61  excluded for  15h jobs, [2006-05-10 13:00:00, 2006-05-10 18:00:00].
D:  4 of 61  excluded for  60h jobs, [2006-05-08 13:00:00, 2006-05-12 18:00:00].
D: 16 of 61  excluded for 240h jobs, [2006-05-07 02:00:00, 2006-05-14 02:00:00].
D: 48 of 61  excluded for 960h jobs, [2006-05-07 02:00:00, 2006-05-14 02:00:00].
This means that all nodes can run short jobs (< 4h), 57 nodes accept jobs in the 4-60h range, 45 nodes accept jobs in the 60-240h range and finally 13 nodes accept jobs in the 240-960h range. The consequence is that for long jobs there are fewer nodes available and it will take longer for the job to start at all.
Note that these limitations are fixed with respect to nodes: certain nodes are set aside for long jobs, so if they are busy you have to wait until they are released, no matter whether the jobs on them have just started or have only a few hours left. Also, when allocating nodes the system first picks the available ones that allow the longest reservation, even if that length is not needed. The rationale behind this is that, given some variation in the jobs in line, it gives shorter (possibly parallel) jobs (slightly) better turnaround time at the expense of longer jobs having to wait slightly longer. (A three-week job does not 'suffer' as much from waiting a few more days as a two-day job does from waiting three weeks.)
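
The arithmetic behind the numbers above (61 - 4 = 57, 61 - 16 = 45, 61 - 48 = 13) can be done with a short awk filter. This is only a sketch: the function name nodes_per_class is hypothetical, and it assumes that the "Space Information" lines look exactly like the sample above; other spstatus output may be laid out differently.

```shell
# Print, for each job class, how many nodes accept jobs of that length.
# Reads "Space Information" lines in the format shown above on stdin.
nodes_per_class() {
    awk '
        # "D: 61 of 61 available for 4h jobs."  -> all listed nodes accept it
        $1 == "D:" && $5 == "available" { print $7, $2 }
        # "D: 4 of 61 excluded for 15h jobs, ..." -> total minus excluded
        $1 == "D:" && $5 == "excluded"  { print $7, $4 - $2 }
    '
}
```

With the sample output above, "spstatus -s | nodes_per_class" would print the pairs 4h 61, 15h 57, 60h 57, 240h 45 and 960h 13, one per line.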

Another command that shows limits is

d10n03$ spq -L
        INTERVAL  NICKNAME                           NJOB WALLTIME NODETIME
  - ]960h,8760h]  no_no_no -        -                   -        -        -
  -    ]4h,960h]  n_other  -        -                   8        -        -
  -    ]4h,960h]  nodetime -        -                   -        -   480h01
  -   ]0m01s,4h]  n_4h     -        -                   4        -        -
which means that requests over 960h are not considered, and requests in the 4-960h bracket are tested against the number of queued jobs as well as the total queued nodetime (obtained by summing, for each job a user has in the queue, the number of requested nodes times the requested time). Finally, short jobs are tested against the number of queued jobs.

Note that these tests are applied to a submitted job against previously queued jobs (not already running jobs). For normal load situations this distinction is not that important, but in the rare case that the machine is empty one should be able to use the whole machine.

Example: if a person has 8 long jobs in the queue, a ninth submit will be put in "held" state and not queued for execution (until the first of the other queued jobs starts running). This will happen sooner if the total number of node-hours for the queued jobs exceeds 480h. Similarly, if a user is submitting many short jobs, only 4 will be queued for execution and the remaining ones will be held.
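
The queued nodetime is simply the sum of nodes times requested time over your queued jobs. A minimal sketch of that sum follows; the function name and the input format (one job per line, nodes and minutes as given to esubmit -n and -t) are assumptions made for illustration and are not produced by any PDC command.

```shell
# Sum nodes x requested time over a list of queued jobs.
# Input (hypothetical format): one job per line, "<nodes> <minutes>".
# Output: total node-hours, truncated to a whole number.
queued_nodetime_hours() {
    awk '{ total += $1 * $2 } END { printf "%d\n", total / 60 }'
}

# Two queued jobs: 4 nodes for 3000 minutes and 8 nodes for 2400 minutes.
hours=$(printf '4 3000\n8 2400\n' | queued_nodetime_hours)
if [ "$hours" -gt 480 ]; then
    echo "$hours node-hours queued: above the 480h nodetime limit"
else
    echo "$hours node-hours queued: within the 480h nodetime limit"
fi
```

Here the two jobs add up to 200 + 320 = 520 node-hours, so a further long job would be held even though only two jobs are queued.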

Chemistry software status

Jaguar:
The license problems have been solved. Interactive use: if your input file is test.in do
   module add jaguar
   jaguar run test
For batch jobs use
   esubmit -n <nodes>  -t <minutes> $SCHRODINGER/jaguar.easy test
To enable parallel calculations you need to have "module add jaguar" in your shell startup script (.bashrc or .cshrc). This is because mpich handles communication with the nodes using "rsh", and this makes sure that the slave nodes have correct information about the executables. /Olav
Gaussian:
Version g03 D.01 is available: if your input file is gjob.in do
   module add gaussian/latest
   esubmit -n <nodes>  -t <minutes> $g03root/pdc/g03.easy gjob
This is a new version with both shared-memory parallelism (intra-node: link command %nprocshared) and distributed-memory parallelism (across nodes: link command %nproclinda).

Dalton:
A parallel and a serial version are available. Use the script
   esubmit -t <minutes> -n <nodes> /pdc/vol/dalton/2.0/bin/dalton.easy -N <procs> <dalton_args>
The "-N" flag for the dalton script specifies how many MPI processes will be started. Typically this will be twice the number of nodes, but you can also try four processes per node; logically the nodes appear to have four processors (due to hyperthreading), but the number of physical processors is two.
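
As an illustration of the typical choice of two MPI processes per node, the sketch below builds the submit line from the number of nodes. The function name dalton_submit_cmd is hypothetical; it only prints the command so that you can inspect it before running it yourself.

```shell
# Hypothetical helper: print a Dalton submit line with two MPI processes
# per node (the typical choice; the physical processor count is two).
# Usage: dalton_submit_cmd <nodes> <minutes> <dalton_args...>
dalton_submit_cmd() {
    nodes=$1
    minutes=$2
    shift 2
    procs=$((nodes * 2))
    echo esubmit -t "$minutes" -n "$nodes" \
        /pdc/vol/dalton/2.0/bin/dalton.easy -N "$procs" "$@"
}
```

For example, "dalton_submit_cmd 4 120 myjob" (myjob being a placeholder for your own dalton arguments) prints a command line with -n 4 and -N 8.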
Molden:
Simple graphical analyzer:
   module add molden
   molden

Comments to vahtras@theochem.kth.se


Olav Vahtras
