Here you will find some useful notes about working on the DKRZ (Deutsches Klimarechenzentrum) computers and running ECHAM(5). It is assumed that you are comfortable working with terminal emulators and know the basic shell commands. These notes are meant to ease you into using ECHAM on Blizzard and should be regarded as supplemental material to the existing official documentation for the climate model and the DKRZ facilities.
1. What you need
A username, account name and password will be given to you by the DKRZ. You also need a version of ECHAM, which should include the actual FORTRAN code as well as the controlling shell scripts, include files, etc. Furthermore, you need the netCDF (.nc) input files, which exist for different resolutions (T63, T85, etc.).
(If ECHAM is not run on the DKRZ computers, you need to make sure you have the netCDF libraries, Open MPI, GNU make and other utilities installed. For this guide, it is assumed that you either have all relevant utilities on your own computer or plan to run the model on the DKRZ computers, where these should already be installed.)
2. Login
You log in to the DKRZ computers using the command:
ssh [username]@blizzard[1/2].dkrz.de
You can use either node 1 or node 2, but you should stick to the same one. The username starts with a 'b', followed by a six-digit number. Note that these nodes may be used for compiling and submitting jobs, but not for interactive job processing or data storage!
3. SCP files into home directory
Secure-copy the model into your home directory on Blizzard (your location after login):
/pf/b/[username]/
Only copy the input files of the desired resolution. They are quite large and you should be economical with the storage space in your home directory.
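As a rough sketch of this step, the snippet below builds the scp destination from the login pattern and the home directory path given above. The username b123456, the node number and the local directory names echam5 and input/T63 are placeholders chosen for illustration, not values taken from DKRZ.

```shell
# Build the scp destination "<user>@blizzard<node>.dkrz.de:/pf/b/<user>/".
# $1 = DKRZ username, $2 = login node (1 or 2)
remote_target() {
    printf '%s@blizzard%s.dkrz.de:/pf/b/%s/\n' "$1" "$2" "$1"
}

# Example usage (run on your local machine; all names are placeholders):
#   du -sh echam5 input/T63        # check the size before copying
#   scp -r echam5 input/T63 "$(remote_target b123456 1)"
```

Checking the size with du first makes it easier to stay economical with the space in your home directory.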
4. Set the right environment
Useful aliases
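To avoid losing important shell scripts or files through accidental or careless removal, you might want to customise your remove command like this (csh/tcsh syntax):
alias rm 'mv \!* /tmp'
With this alias,
rm [FILENAME]
moves the file to /tmp instead of deleting it, whereas quoting the command name bypasses the alias, so
'rm' [FILENAME]
will permanently delete the file.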
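Aliases should be defined in the .cshrc or .bashrc file (depending on which shell you are using) in your home directory. Note that the alias above is written in csh syntax; bash aliases cannot take arguments in this way.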
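For bash or ksh users, a shell function can play the role of the csh alias. The following is a minimal sketch, not an official DKRZ recommendation; the function name saferm and the variable TRASH_DIR are hypothetical names introduced here.

```shell
# Holding directory for "removed" files; defaults to /tmp as in the alias above.
# TRASH_DIR is a hypothetical variable name.
TRASH_DIR="${TRASH_DIR:-/tmp}"

# saferm FILE...: move the given files to $TRASH_DIR instead of deleting them.
saferm() {
    if [ "$#" -eq 0 ]; then
        echo "usage: saferm FILE..." >&2
        return 1
    fi
    mkdir -p "$TRASH_DIR" && mv -- "$@" "$TRASH_DIR"/
}
```

Using a separate name such as saferm leaves the real rm untouched, so scripts that rely on rm keep working as before.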
5. Commands you should know
It is assumed that the user is familiar with the basic commands, so only a handful of useful options and commands that you should know when working on Blizzard are listed here. The most important is:
llcancel [job ID]
This command will cancel a job. When you notice that a model is not running correctly (anymore), you can stop it using this command. This will prevent it from unnecessarily using up CPU hours and wasting storage space with wrong output. Once you have deleted the wrong output, you can restart the climate model. Since the model restarts itself regularly (once you have submitted the job), reading its previous output as new input, you will not have to start again from your first model year. All you need to do is find the model year before the first errors occurred, set this year as the first model year and run the model. It should read its own output (restart files) from that year and use it as input for the next cycle (until it restarts itself again). To avoid browsing through wrong output, it is recommended that you check on your model daily.
llsubmit [jobscript]
This command submits a job in the form of a job shell script.
llq
This checks the status of jobs. With the -u [username] option, the list is restricted to the jobs submitted by that user. If you want to check the status of a specific job, simply add the job ID after the llq command.
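Since a daily check is recommended, a small wrapper around llq can save some typing. This is only a sketch; the function name my_jobs is a hypothetical name introduced here, and it assumes that llq is available on the login node.

```shell
# my_jobs [USER]: list only the jobs of USER.
# Without an argument, the current username (from id -un) is used.
my_jobs() {
    llq -u "${1:-$(id -un)}"
}

# Example usage:
#   my_jobs              # your own jobs
#   my_jobs b123456      # jobs of another user (placeholder username)
```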
ps
will show you running processes.
ps ux
will show the processes of the current user.
ps aux
will show the processes of all users.
ls -lt
A very useful option combination for the ls command (which lists files) is -lt: it produces a long listing (-l) sorted by time of modification (-t).
ls -la
The -a option will show all files, including hidden ones (whose names start with a full stop), such as the .cshrc file.
6. Compilation
This section describes how to configure and compile ECHAM. If you have a compiled version of the model, you can skip this step.
6.1 Configuration
Configuration files for different operating systems and compilers can be found in the config directory. To find out about your system, run config.guess like this:
sh config.guess
This will allow you to figure out which of the config files you need to edit. Files starting with mh- are machine-specific files. Here, you need to make sure that the paths are set correctly. For example, the NETCDFROOT path variable needs to be given the correct value:
NETCDFROOT = /location/of/your/netcdf/root/folder
(As described for the llcancel command in section 5, the model continuously restarts itself using previous output. Since that is not available when you first start the model, UCAR's netCDF files are used as initial 'restart files'. For netCDF handling, the netCDF utilities need to be installed and sourced.)
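Before running the configuration, it can help to check that the directory you assign to NETCDFROOT looks plausible. The sketch below assumes the common layout in which a netCDF installation root contains the subdirectories include and lib; this layout and the function name check_netcdf_root are assumptions, not something prescribed by ECHAM.

```shell
# check_netcdf_root DIR: return 0 if DIR contains "include" and "lib"
# subdirectories (an assumed, typical netCDF installation layout).
check_netcdf_root() {
    for sub in include lib; do
        if [ ! -d "$1/$sub" ]; then
            echo "missing: $1/$sub" >&2
            return 1
        fi
    done
    echo "ok: $1"
}

# Example usage (the path is a placeholder):
#   check_netcdf_root /location/of/your/netcdf/root/folder
```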
Then, run the file 'configuration' (which should be found in your ECHAM root folder):
sh configuration
If it cannot find a required file or command (alias), make the necessary changes in your configuration files in the 'config' folder and/or in .cshrc, and make sure the required utilities are installed.
6.2 Compilation
After successful configuration, compile ECHAM with GNU make: simply type 'make' (on some systems GNU make is installed under the name 'gmake').
NOTE: Several versions of the required libraries are available on Blizzard, including the latest ones. It is generally recommended to start with the latest version. However, if you run into problems, you might want to fall back to an older version.
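The two steps of this section can be chained so that compilation only starts when configuration has succeeded, with the output of each step kept in a log file. This is a sketch under the assumption that the file 'configuration' and the Makefile sit in the ECHAM root folder as described above; the function name build_echam and the log file names are hypothetical.

```shell
# build_echam DIR: configure and compile ECHAM in DIR, stopping at the
# first failing step. Output goes to configure.log and make.log in DIR.
build_echam() {
    (
        cd "$1" || exit 1
        sh configuration > configure.log 2>&1 || {
            echo "configuration failed, see configure.log" >&2
            exit 1
        }
        make > make.log 2>&1 || {
            echo "make failed, see make.log" >&2
            exit 1
        }
        echo "build finished"
    )
}

# Example usage (the path is a placeholder):
#   build_echam "$HOME/echam5"
```

The subshell (the parentheses) keeps the cd local, so your working directory is unchanged after the call.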
7. Running ECHAM
7.1 First run
To run ECHAM, you send a run-script to the scheduler as a "job". In the run-script you define all paths to the input and output directories and set all options for your simulation. Your script needs to write the values of the option variables into the "ECHAM namelists". These are passed on to the model once it starts. You also need to write the scheduler options in your run-script, i.e. which resources you wish to use for your simulation, what happens when errors occur, etc. The format of these options depends on the scheduler used (in the case of Blizzard, it is LoadLeveler). You are free to write the run-script itself as a csh, bash or ksh script.
Before you submit the script, make sure of the following:
- You have the input files
- You defined the paths correctly to your input files, model executable, etc.
- There is plenty of space in your output directory: expect 0.1-1.5 TB of output, depending on your output interval, number of model years and levels
- You know how to send a job-cancelling command to the scheduler (see section 5, "Commands you should know")
For details on model and job options, see the documents in the ECHAM/doc folder and the template korn shell run-script below.
For more detailed information on scheduler options, visit:
http://www.dkrz.de/Nutzerportal-en/doku/blizzard/LL
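To make the structure of a run-script more concrete, here is a heavily reduced sketch. It is not the official template: the LoadLeveler keywords shown are generic ones, all values are placeholders, and the namelist group and variable names (runctl, out_datapath, out_expname, lresume) are assumptions modelled on typical ECHAM5 run-scripts that you should check against the documents in ECHAM/doc. The function name write_namelist is hypothetical. Add the interpreter line of your chosen shell at the top.

```shell
# Scheduler options: LoadLeveler reads the "# @" lines when you call llsubmit.
# @ job_name         = echam_test
# @ job_type         = parallel
# @ wall_clock_limit = 08:00:00
# @ output           = $(job_name).o$(jobid)
# @ error            = $(job_name).e$(jobid)
# @ queue

# Simulation options (placeholder values)
EXP="experiment_01"                  # experiment name
DPATH="/work/account_no/${EXP}/"     # output directory
RERUN=".false."                      # .true. when restarting

# write_namelist FILE: write the option values into a namelist file.
# The group and variable names are assumptions, see the lead-in.
write_namelist() {
    cat > "$1" << EOF
&runctl
  out_datapath = "${DPATH}"
  out_expname  = "${EXP}"
  lresume      = ${RERUN}
/
EOF
}

# In a real run-script you would now call, for example,
#   write_namelist namelist.echam
# and then start the model executable with the MPI launcher of your site.
```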
7.2 Restart
When your job has reached the wallclock limit (the real time, not model time, after which your model stops running; see the scheduler options in the template script) or has terminated for reasons unrelated to a wrong setup, it needs to be restarted. You do this by simply changing the values of three option variables in your run-script:
RERUN = .true.
REYR = [year from which restart is desired]
REMON = [month from which restart is desired; month in restart file +1]
To determine the values for REYR and REMON, you need to find out the last time step for which calculations were made and written to file. You can find this out in several ways:
- View the output log (saved in your run-script directory by default)
- View the rerun file in your output directory
- View the last saved output file in your output directory
Whereas the output log is a simple text file, the other files are in netCDF format (unless specified otherwise). You can view these as text with the command ncdump. To view the information page by page, type:
ncdump [filename] | less
Once you have changed the values of the three variables, simply submit the modified script as you did for your first run.
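The "+1" rule for REMON can be sketched as a small helper. It takes the year and month found in the last restart file and prints the values for REYR and REMON. The roll-over from December to January of the following year is an assumption that follows from the rule but is not spelled out above; the function name next_restart is hypothetical.

```shell
# next_restart YEAR MONTH: print "REYR REMON" for the restart, where
# YEAR and MONTH are those of the last restart file.
next_restart() {
    year=$1
    month=$((${2#0} + 1))    # strip one leading zero so that "08" is read as 8
    if [ "$month" -gt 12 ]; then
        month=1                  # December rolls over to January ...
        year=$(($year + 1))      # ... of the following year
    fi
    echo "$year $month"
}

# Example usage:
#   next_restart 1978 12
```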
8. Transferring model output to tape archives
Your simulation generated output and it looks reasonable? Wonderful! If you want to clear the space in your output directory for another simulation and store the old output in the DKRZ HPSS filesystem, follow the instructions below.
8.1 Organising your data
One large file is preferable over many small files. You might want to organise your individual output files into a tar archive for that reason by typing:
tar -cvf [filename.tar] [directory you wish to tar]
The options c, v and f mean:
c - create new archive
v - be verbose (displays which file is currently added to the archive) - useful to monitor the progress
f - specify filename
If you also want to compress your archive with gzip, add the z option to the command:
tar -cvzf [filename.tar.gz] [directory you wish to tar and gzip]
The preferred size for transferred files is 10-50 GB, and 500 GB is the maximum. If you have several TB of output, you might want to split it into several files, e.g. experiment01_1950-1969.tar, experiment01_1970-1999.tar.
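The size limits quoted above can be checked before a transfer with a few lines of shell. This is a sketch only; the function names archive_size_class and file_size_gb are hypothetical, and the size comes from du, which reports disk usage and is therefore approximate.

```shell
# archive_size_class GB: classify a size in whole GB against the limits
# quoted above (10-50 GB preferred, 500 GB maximum).
archive_size_class() {
    if [ "$1" -gt 500 ]; then
        echo "too large"
    elif [ "$1" -ge 10 ] && [ "$1" -le 50 ]; then
        echo "preferred"
    else
        echo "allowed"
    fi
}

# file_size_gb FILE: size of FILE in whole GB (rounded down),
# computed from du's 1 KB blocks.
file_size_gb() {
    du -k "$1" | awk '{ print int($1 / 1048576) }'
}

# Example usage (the file name is a placeholder):
#   tar -tvf experiment01_1950-1969.tar | less    # list the archive contents
#   archive_size_class "$(file_size_gb experiment01_1950-1969.tar)"
```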
8.2 Data transfer
First, go to the location of your tar or tar.gz file on Blizzard. Then, type pftp to gain access to the tape archives.
It will ask you for your username and password. Once you have entered these, you should be logged in and find yourself in the directory /hpss/arch. You can check your location using the pwd command. Now, go to your HPSS project directory /hpss/arch/[your account/project number]. Feel free to create subdirectories for your experiments. To transfer a file from your current Blizzard folder to your current HPSS folder, simply type:
put filename.tar
To transfer a file from the HPSS archives to your Blizzard folder, use the command get instead of put.
To log out and return to Blizzard, type quit.
Below is an example of how to transfer filename.tar.gz from the Blizzard directory /work/account_no/experiment_01 to /hpss/arch/account_no:
username@blizzard1$ cd /work/account_no/experiment_01 # go to location of your file
username@blizzard1$ ls # check if file exists
filename.tar.gz
username@blizzard1$ pftp
[...]
Name (tape:username): # press enter if displayed username is correct
Password: # enter your user password
ftp> cd account_no # go to project directory
ftp> put filename.tar.gz # transfer command
[...]
ftp> ls # check if file is on archives
[...]
filename.tar.gz
ftp> quit # quit and return to Blizzard
username@blizzard1$
For more information, visit:
http://www.dkrz.de/Nutzerportal-en/doku/hpss