Gaea, Analysis and GFDL notes
This document is a collection of all the poorly documented or
undocumented things that I have come across while hacking around with
the HSt42 model in idealized.xml (basic_hs.xml and zonalacc.xml are
the resulting files that I have written). I assume that most of these
principles are fairly general to the GFDL/gaea system.
1 Storage and Data
- Long term fast scratch (ltfs) items are deleted after roughly 90 days if not modified. This is automatic. Therefore keep my modified code in my user directory; I use symbolic links on ltfs to my code to avoid having to deal with paths.
- Fast scratch (fs) is cleared more frequently, so it shouldn't be used for anything.
- Archive (see below) is backed up and permanent.
1.1 Analysis and Archive
/archive/Thomas.Flannaghan is where data is delivered after runs
finish (automatic). This directory is on the analysis machine (also
mounted as read-only on public). Data analysis should be done on the
analysis machine. Log in via public using a different RSA code.
Data is delivered to rather cryptic folders. I've written a little
tool called archtool.py
to automate the organising of runs. This tool
displays a list of all runs on archive
along with when they were
created. At present, the only thing that can be done is automatically
extracting the data (and run namelists) to a new sub-directory ("run
code" in the run index). The tool automatically recognises and handles
restart runs.
1.2 Copying data
- gcp allows movement between gaea, analysis and GFDL, but not outside it.
- Normal scp and ssh are available on GFDL workstations (i.e. public).
- archive is mounted on public at /archive/$USER/ (read only), so we can do

public> scp /archive/$USER/.... atmos02:/...

to move data to, say, atmos02.
2 XML, Compiling and Running
This section is valid for running the HSt42 spectral dynamical core
(defined in idealized.xml). Other models may be more complicated, but
I suppose the principles are the same.
2.1 Commands
2.1.1 fremake
Performs a CVS operation (see below) used to make a new experiment. This command does not actually do the make; instead it returns a command that will run the make script. That returned command should be run after any source modifications are made.
fremake -x <xmlfile> <experiment>
2.1.2 frerun
frerun does two things: first, it constructs the input namelists and
diag-table files; then it submits the experiment as a job.
frerun --overwrite -r basic --submit-staged -x <xmlfile> <experiment>
The --overwrite flag refers to overwriting the results output on
analysis. It does not overwrite the run environment on fast scratch.
Alternatively, --unique can be used instead; this saves results to
unique locations on analysis. Use --unique to run multiple copies of
the same experiment simultaneously. Note, however, that --unique uses
the same binary as the previous run, so don't use it when the code has
been updated.
The -r basic option specifies the regression run. Omit it and frerun
will do a production run. Currently I only know about regression runs,
which are good for any job that takes less than 16 hours to complete
(typically 5000 days at T42 with 60 levels).
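For example, to launch a second copy of an already-running experiment without clobbering its results (reusing the experiment and XML names from the introduction):

frerun --unique -r basic --submit-staged -x idealized.xml HSt42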
2.1.3 Other useful commands
- showq -u $USER lists the current jobs belonging to $USER.
- mjobctl -c gaea.xxxxxxx cancels a job.
- gcp copies stuff between gaea and analysis. It cannot do directories, so tar first.
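For example, to move a directory of output (the gfdl: destination prefix here is my assumption – check gcp's own help for the exact remote-path syntax):

gaea> tar -cf myrun.tar myrun/
gaea> gcp myrun.tar gfdl:/archive/$USER/myrun.tar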
2.1.4 My custom commands
- archtool.py is a tool for managing archive. It detects restart runs, extracts the tar files, and saves the restart information.
- add_restart.py is a tool for setting up regression runs with restart information. See Restart runs for documentation.
2.2 Notes
All settings for compiling experiments and for running jobs can be
specified in the XML file. frerun remakes all namelist files, so if
only runtime options in the XML are modified, we do not need to
remake.
The "group letter" is found in the XML inside the project
tag. This
is "gfdlw". Any references to c1 and c2 should be removed from this
string. The job will silently fail if this is wrong or not set :-(. An
XML example is
... <platform name="ncrc2.default"> <directory stem="$(FRE_STEM)"/> <project>gfdl_w</project> ...
2.3 Making a new experiment
An experiment is defined as a unique codebase. We do not need new experiments if we are just changing namelists. Experiments are defined in the XML.
- Copy an existing experiment in the XML.
- Change the name and path of the experiment (see example below).
- Run fremake, but don't run the command given at the end of fremake.
- Check that CVS operations took place during fremake. If not, the name or path was not changed correctly (see step 2) and an existing experiment was found at the given path.
The experiment XML should end up like this:
<experiment name="{experiment name}">
  <description>
    Blah blah blah
  </description>
  <component name="fms_spectral_solo" paths="$(rootDir)/{experiment name}/src"
             includeDir="$(srcDir)/shared/include">
    <source versionControl="cvs">
...
where {experiment name} is the name of the experiment (must be unique
in the XML file).
2.4 Modifying the model code
2.4.1 Basic Procedure
If modifying an existing experiment, skip to step 2 below. Running
fremake again is not a good idea (although it will detect and not
overwrite existing code).
- Make a new experiment (see above).
- Change the Fortran code on ltfs. Save changes in the home directory (ltfs is volatile).
- Run the command returned by step 1 (no need to use msub – run it directly). It's a make script and it will compile the code on ltfs.
- Fiddle with namelists, output tables in the XML, etc.
- Submit the model job using frerun with the --overwrite flag set. For some reason, using the --unique flag uses the old executable.
2.4.2 Comments
When first run, fremake downloads the model code to ltfs. Other invocations of fremake do not do this. It then gives a script to compile the code. Don't run this until you've made changes to the code stored on ltfs under the correct name.
ltfs stuff is sometimes deleted. Always keep a copy of changed code (or maybe the whole tree) in your home directory.
The make script just compiles the binaries. It does not read the namelist parameters set in the xml file.
The run script does not change the ltfs code either. It uses fs.
2.5 Editing and making directly
There is an interactive mode which is good for debugging compiler and runtime errors. To get an interactive session, use
msub -I -l walltime=01:00:00,size=16
This gives a terminal that will work for 1 hour with 16 cores (the
number of cores must be the same as npes in the XML – see Horizontal
resolution). From this interactive terminal, we can make the model
using make (with the env script) or using the build script, and we can
run the model executable using
aprun -n 16 fms_HSt42_.....x
Running the model produces a lot of netcdf files which need combining
and sending to archive. I'm not sure how this is done yet. Again, the
-n option refers to the number of cores and must be the same as npes
in the XML.
2.6 Restart runs
All runs (regression and production) can be restarted.
2.6.1 Regression runs
The process for regression runs has now been automated by my script
xml/add_restart.py. It is used in the following manner:
frerun -u <frerun-options> | ./add_restart.py <restart-dir>
where <frerun-options> are standard frerun options but cannot include
--submit-staged, and <restart-dir> is a directory containing the
relevant restart files. The script has been tested and works
well. Suitable restart directories are stored in ~/restarts on gaea.
The procedure that the script implements is as follows:
- Find the original regression run on /fs/work/.... It should contain a RESTART directory. Note down its path using pwd.
- Run frerun with the --unique flag and without the --submit-staged flag. It will exit with an msub command to run. Do not run this msub command yet!
- Open the file that is the argument to msub (the run script). It is located in ~/siena_201207/.../scripts/run, and will contain something like this:

cd $workDir
set dataFilesNotOK = ( )
if ( $dataFilesNotOK > 0 ) then

A search for "dataFilesNotOK" will find this region.
- Edit this region of the runscript:

cd $workDir
set dataFilesNotOK = ( )
# add this line. it copies the restart files to the INPUT of the new
# run folder. <path_to_run> should be an absolute path.
cp <path_to_run>/RESTART/* INPUT/.
###
if ( $dataFilesNotOK > 0 ) then

where <path_to_run> is the full path from step 1.
- Issue the msub command given at the end of step 2.
2.6.2 Details
The files needed for a correct restart are atmos_model.res,
atmosphere.res.nc and spectral_dynamics.res.nc. These just contain the
model state and the number of days elapsed. Namelists are not loaded
from the restarted run, so we must use restarts appropriate to the
namelist being used.
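As a convenience, here is a small Python sketch (my own helper, not part of the FRE tools) that checks a candidate restart directory contains those three files before it is passed to add_restart.py:

import os

# The three files needed for a correct restart (see above).
NEEDED = ['atmos_model.res', 'atmosphere.res.nc', 'spectral_dynamics.res.nc']

def check_restart_dir(path):
    """Return True if `path` contains all required restart files."""
    missing = [f for f in NEEDED
               if not os.path.exists(os.path.join(path, f))]
    if missing:
        print('missing restart files: ' + ', '.join(missing))
    return not missing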
3 Model Configuration
3.1 Horizontal resolution
The HSt42 model defined in idealized.xml gives a typical T42
resolution run, but we can modify the XML to give different
resolutions. The key spectral_dynamics_nml namelist parameters are
namelist parameter | T21 | T42 | T85 |
---|---|---|---|
num_fourier | 21 | 42 | 85 |
num_spectral | 22 | 43 | 86 |
lat_max | 32 | 64 | 128 |
lon_max | 64 | 128 | 256 |
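For reference, here is a small calculator that reproduces the table. The rule it encodes – lat_max is the smallest power of two at least 3/2 times the truncation, with lon_max twice that – is my own inference from the three columns, not something taken from the GFDL documentation:

def grid_params(T):
    # smallest power of two >= 3*T/2 (inferred dealiasing rule)
    lat_max = 1
    while lat_max < 1.5 * T:
        lat_max *= 2
    return dict(num_fourier=T, num_spectral=T + 1,
                lat_max=lat_max, lon_max=2 * lat_max)

for T in (21, 42, 85):
    print(T, grid_params(T))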
In addition, we must change the regression or production run specification to indicate the number of processors. For regression runs, the specification for T42 is something like
<regression name="...">
  <run days="..." npes="16" runTimePerJob="..."/>
</regression>
The npes field is something to do with the number of processors
required for the run (this is also the number used when triggering an
interactive session). For T85, increase npes to "32". When the run is
actually submitted, it typically uses twice the number of cores
specified here.
3.2 Vertical coordinate
The spectral_dynamics_nml namelist controls the vertical
coordinate. The vert_coord_option parameter allows various different
ways to define vertical coordinates. The simplest of these are
"even_sigma" and "uneven_sigma".

"even_sigma" simply divides the vertical coordinate evenly in sigma,
from 0 to 1. num_levels gives the number of levels.

"uneven_sigma" allows more complicated sigma coordinates. Again,
num_levels gives the number of levels. The configuration namelist
settings are as follows:
option | meaning | Default |
---|---|---|
surf_res | Confusing – see exact definitions | 0.1 |
exponent | Confusing – see exact definitions | 2.5 |
scale_heights | Top level expressed in scale heights | 4 |
zero_top | Forces top sigma level to be 0 | .true. |
Reasonable (i.e. approx 1 km) resolution at the TTL/lower stratosphere
can be obtained with 60 levels and scale_heights = 10 (the remaining
parameters set to the defaults). Polvani-Kushner use a set of custom
specified levels, but these lie closest to scale_heights = 11.0,
surf_res = 0.5, exponent = 3. These values are from Martin Jucker (he
fitted these parameters to the PK levels).
The exact definitions (from atmos_spectral/init/vert_coordinate.F90):

p = a * p_ref + b * p_surf

where p_surf is the instantaneous surface pressure and p_ref is some
constant. uneven_sigma defines a and b as follows:
! a is set to zero
a = 0

! b is defined by:
do k = 1, num_levels
  zeta = 1 - float(k - 1) / float(num_levels)
  z = surf_res * zeta + (1 - surf_res) * zeta**exponent
  b(k) = exp(-z * scale_heights)
enddo

! sort out the model surface level and top level
b(num_levels+1) = 1.0
if (zero_top) b(1) = 0.0
Therefore, when surf_res = 1 we have roughly evenly spaced levels in
height (exactly even in log-pressure coordinates). When surf_res = 0
we have exponentially spaced levels. This means that surf_res doesn't
really have anything to do with the surface – it is more like saying
"how close to linear are we?". Its value affects the resolution
everywhere.
Here is a simple calculator which also gives the approximate height (log-pressure height).
import math

# namelist parameters:
num_levels = 10
scale_heights = 11
surf_res = 0.5
exponent = 3

for k in range(1, num_levels + 1):
    zeta = 1 - (k - 1) / float(num_levels)
    z = surf_res * zeta + (1 - surf_res) * zeta ** exponent
    b = math.exp(-z * scale_heights)
    z_km = 7 * z * scale_heights  # approx log-pressure height (7 km scale height)
    print('b(%d) =\t%0.3f (approx %0.1f km)' % (k, b, z_km))
3.3 Sponge
There is no sponge by default in the HSt42 model defined in
idealized.xml. There is only an optional, very simple sponge
available, which acts on only the top level (defined in
spectral_damping.F90). It is defined by the following parameters in
spectral_dynamics_nml:
option | meaning | default |
---|---|---|
eddy_sponge_coeff | Eddy damping in top layer | 0 |
zmu_sponge_coeff | Zonal mean u damping | 0 |
zmv_sponge_coeff | Zonal mean v damping | 0 |
All coefficients are in units of m^2/s, and are applied as damping terms like \(coeff \cdot \nabla^2\).
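To get a feel for what a coefficient in m^2/s means as a damping rate, here is a rough back-of-envelope calculation. It uses the standard spherical-harmonic Laplacian eigenvalue -n(n+1)/a^2; the coefficient value below is hypothetical, not a recommended setting:

# damping timescale at total wavenumber n for a coeff * Laplacian sponge
a = 6.371e6    # Earth radius (m)
coeff = 1.0e5  # hypothetical eddy_sponge_coeff (m^2/s)
for n in (1, 10, 42):
    rate = coeff * n * (n + 1) / a ** 2   # damping rate (1/s)
    print('n = %2d: timescale %8.1f days' % (n, 1.0 / rate / 86400.0))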
The Polvani-Kushner hs_forcing.F90 file (given to me by Martin) has a
more advanced sponge. I have added it to my hs_forcing_zonalacc.F90
file. The following options control the sponge, and are given in the
hs_forcing_nml namelist:
option | meaning | default |
---|---|---|
sponge_flag | Is sponge used? | .false. |
sponge_tau_days | The sponge timescale (days) | 1 |
sponge_pbottom | The lowest level to apply sponge (Pascals) | 100 |
The sponge is a Rayleigh friction with coefficient given by
sponge_coeff * (1 - p / sponge_pbottom) ** 2, where sponge_coeff is
the inverse timescale derived from sponge_tau_days.
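A minimal sketch of that profile, assuming the damping is zero below sponge_pbottom (rates in 1/s):

def sponge_rate(p, sponge_tau_days=1.0, sponge_pbottom=100.0):
    # Rayleigh friction coefficient at pressure p (Pa)
    sponge_coeff = 1.0 / (sponge_tau_days * 86400.0)  # inverse timescale (1/s)
    if p >= sponge_pbottom:
        return 0.0
    return sponge_coeff * (1.0 - p / sponge_pbottom) ** 2

print('rate at 10 Pa: %.2e 1/s' % sponge_rate(10.0))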
3.4 Held-Suarez Forcing
option | meaning | default |
---|---|---|
t_zero | Surface temperature at equator | 315 |
t_strat | Stratospheric temperature | 200 |
delh | Equator to pole temperature difference | 60 |
delv | Delta Theta per scale height | 10 |
eps | Hemispheric asymmetry | 0 |
sigma_b | Boundary layer sigma | 0.7 |
ka | Newtonian damping in free atmos | -40 |
ks | Newtonian damping in boundary layer | -4 |
kf | Rayleigh damping in boundary layer | -1 |
do_conserve_energy | Conserves energy dissipated via Rayleigh damping (and other momentum forcings) by heating | .true. |
The damping rates (ka, ks, kf) are given in units of days if negative. Not sure about the units if positive, but probably days^-1. Temperatures are all in Kelvin.
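For reference, these parameters enter the standard Held-Suarez (1994) equilibrium temperature. The sketch below is the textbook formula rather than a transcription of hs_forcing.F90, and the way eps enters (linearly in sin(lat)) is my assumption:

import math

def t_eq(lat, p, p0=1.0e5, t_zero=315.0, t_strat=200.0,
         delh=60.0, delv=10.0, eps=0.0, kappa=2.0 / 7.0):
    # Held-Suarez radiative-equilibrium temperature (K); lat in radians, p in Pa
    s, c = math.sin(lat), math.cos(lat)
    t = ((t_zero - delh * s ** 2 - eps * s
          - delv * math.log(p / p0) * c ** 2) * (p / p0) ** kappa)
    return max(t_strat, t)

print('T_eq at the equator, 1000 hPa: %.1f K' % t_eq(0.0, 1.0e5))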
do_conserve_energy does not really affect the solution to the
local_forcing-type problems. Both settings have been tested, with
minimal (<5%) differences.
3.5 Local Momentum Forcing
This is my custom code documentation. I have added this behaviour to
hs_forcing_zonalacc.F90, which replaces hs_forcing.F90 (using a
symbolic link). The namelist parameters go into the hs_forcing_nml
namelist. All options are prefixed with local_forcing and are given
here:
local_forcing* | meaning | units | default |
---|---|---|---|
_option | Defines the forcing mode | | '' |
_xwidth | Longitudinal half-width | degrees | 10 |
_ywidth | Latitudinal half-width | degrees | 10 |
_ycenter | Central latitude | degrees | 0 |
_zwidth | Half-width in height | scale height | 0.5 |
_zcenter | Central height | scale height | 2.2 |
_amp | Amplitude of forcing | m/s/day | 1 |
_start | Start time of forcing | days | -1 |
_end | End time of forcing | days | -1 |
local_forcing_start and local_forcing_end allow control of when the
forcing is turned on and off. A value of -1 (the default) removes
that time constraint. For example, switching on from 100 days would
give local_forcing_start = 100 and local_forcing_end = -1.
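Putting this together, a forcing switched on at day 100 might be configured like this (illustrative values only, using one of the options listed below; not a tested configuration):

&hs_forcing_nml
    local_forcing_option = 'Cos2Dipole',
    local_forcing_amp    = 1.0,   ! m/s/day
    local_forcing_start  = 100,   ! days
    local_forcing_end    = -1,    ! never switched off
/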
local_forcing_option is a string and can have several values:
local_forcing_option | meaning |
---|---|
'' | No forcing. The local_forcing code is not run. |
'Cos2Dipole' | A two-signed forcing in the vertical. For positive amplitude, negative forcing is on the top and positive forcing is on the bottom. The forcing has the form cos^2(x) cos^2(y) sin(z). |
'CosDipole' | Vertical structure as above, but with cos(x) cos(y) horizontal structure. |
'Cos2Single' | Single-signed forcing (cos(z) vertical). |
'CosSingle' | – ditto – |
'CosDipoleZM' | Zonal mean CosDipole – see Zonal mean caveats. |
'Cos2DipoleZM' | Zonal mean Cos2Dipole – see Zonal mean caveats. |
3.5.1 Zonal mean caveats
local_forcing_amp in the *ZM cases is the peak zonal mean forcing, but
in the other cases it is the peak local amplitude. Here's a calculator
that works out the local amplitude equivalent to the zonal mean case
(i.e. with the same zonal mean), and also the 10S-10N equivalent
amplitude:
from numpy import *

Lx = 30   # degrees
Ly = 10   # degrees
A = 0.3   # zonal mean amplitude

x = linspace(-180, 180, 5000)
y = linspace(-10, 10, 100)
f = cos(0.5 * pi * x / Lx) * cos(0.5 * pi * y[:, None] / Ly) \
    * ((abs(x) < Lx) & (abs(y[:, None]) < Ly))
f *= A / f.mean(axis=-1).max()

print('zonal mean amplitude, A     ', f.mean(axis=-1).max())
print('10N-10S zonal mean amplitude', f.mean())
print('local amplitude             ', f.max())
print('10N-10S local amplitude     ', f.mean(axis=0).max())
print('Theoretical local amplitude ', A * pi * 360 / (4. * Lx))
3.6 Local Heating
The base version of hs_forcing.F90 includes Isidoro's local
heating. In my version, I've removed this and replaced it with a local
heating routine that mirrors the local forcing routine. All namelist
variables do the same thing here as they do for local forcing and have
the same defaults, but are named local_heating_xxxx in place of
local_forcing_xxxx.
The only difference is that local_heating_amp is in units of K/day.
See above for all local_forcing settings.