TABCMP_V

Latest: TABCMP_V v1.21 (05AUG15)

04NOV14

Introduction

Part of the RFM handling of Look-Up Tables.

tabcmp_v.f is a FORTRAN77 program that compresses RFM-generated look-up tables of pre-tabulated absorption coefficients (.tab files, created using the TAB option) in the spectral domain, essentially by removing spectral grid points where the absorption coefficient can be accurately reconstructed by interpolation between the adjacent points.

This typically provides a factor of 10 compression, and the resulting files can be fed back into the RFM (using the LUT flag) in exactly the same way as the uncompressed files, since the .tab format allows for irregularly spaced spectral grids.

Further compression in the other tabulation axes can then be obtained by running the tabcmp_x program.

General Comments

  1. This program can only be run on .tab files with a regular spectral grid - it can't be re-run using its output as input.
  2. The program can be run on either ASCII or binary versions of the .tab file, and output can be in either format (see note on Binary files).
  3. The program asks for a convolution width, which should approximately match the spectral resolution of the measurements you are simulating. If in doubt, go for a smaller width, or no convolution at all (type <CR> in response to the prompt), although this will result in less compression.
  4. The program also asks for an Accuracy Criterion. This approximates the maximum interpolation error in transmittance, or as a fraction of radiance, resulting from the compression. If in doubt, go for a smaller number (but again, resulting in less compression).
  5. You are advised to compare RFM runs for a few selected cases using both uncompressed and compressed .tab files (just a matter of changing the filename(s) in the *LUT section of the driver file) - just to verify that the additional error from the compression is acceptable.

Compression Procedure

A .tab file contains a tabulation of absorption coefficient k(ν,p,T,q) [m2/kmol] of a single species, where ν,p,T,q are axes of wavenumber, pressure, temperature and volume mixing ratio (VMR) scale factor (the last two are optional), as well as embedded profiles of temperature and VMR interpolated to the pressure axis.

By default, the RFM generates .tab files on a regularly spaced spectral axis. As with the line-by-line calculations, a fine grid (typically 0.001 cm-1 for the mid-infrared for nadir viewing, 0.0005 cm-1 for limb-viewing) is required to capture the sharpest details present in the atmospheric spectrum, even if subsequently convolved to a much lower instrument resolution. However, between the spectral lines, large regions may be adequately represented by an interpolation of k (In practice, the RFM performs a linear interpolation of ln(k)). Also, if it is known that the spectra will be subsequently convolved to a lower resolution, spectrally-confined interpolation errors of different sign may cancel out in the convolution.
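As a minimal illustration of the interpolation mentioned above, the following Python sketch interpolates linearly in ln(k) between two retained grid points. The function name and the numerical values are hypothetical, chosen only for illustration; this is not the RFM's own code.

```python
import math

def interp_lnk(nu, nu_lo, k_lo, nu_hi, k_hi):
    """Interpolate k at wavenumber nu, linearly in ln(k), between two grid points."""
    w = (nu - nu_lo) / (nu_hi - nu_lo)          # fractional position in the gap
    return math.exp((1.0 - w) * math.log(k_lo) + w * math.log(k_hi))

# Hypothetical values: k falls by a factor of 100 across a 0.002 cm-1 gap.
k_mid = interp_lnk(1400.001, 1400.000, 1.0e4, 1400.002, 1.0e2)
```

For these values the interpolated k at the mid-point is close to 1000, the geometric mean of the end values, whereas a linear interpolation of k itself would give 5050.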

Using the p,T,q tabulation axis intervals to define 'Cells' (see Accuracy Criteria), the user then sets the accuracy Δτ within which the transmittance of every cell has to match the original transmittance as spectral points k are removed and replaced by an interpolated value k':

| exp(-k'_l u) - exp(-k_l u) | ≤ Δτ

This limit has to be satisfied at every spectral point ν_l for every cell (equivalent to the (p_i, T_j, q_k) axis points). However, the user can instead specify that it only has to be met after convolution with a triangular kernel Ψ(ν) of FWHM Δν (i.e., a triangle with base 2Δν):

| Σ_l Ψ(ν_l) [ exp(-k'_l u) - exp(-k_l u) ] | ≤ Δτ

where the summation includes all fine grid points within ± Δν and is evaluated at a spacing of Δν along the spectral axis. It is suggested that Δν is chosen to match the resolution of the modelled instrument.
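The two forms of the criterion can be compared with a short Python sketch. It assumes that u is the absorber amount of the cell (the text does not define it) and that the triangular kernel is normalised so its weights sum to one; the function names and the sample errors are hypothetical. For brevity the convolved error is evaluated at a single kernel centre rather than at every Δν step.

```python
import math

def trans_error(k_orig, k_interp, u):
    """Per-point transmittance error exp(-k'u) - exp(-ku) for one cell."""
    return [math.exp(-kp * u) - math.exp(-k * u)
            for k, kp in zip(k_orig, k_interp)]

def convolved_error(nu, err, nu0, width):
    """Triangle-weighted mean of err, centred on nu0 (FWHM = width, base = 2*width)."""
    weights = [max(0.0, 1.0 - abs(v - nu0) / width) for v in nu]
    total = sum(weights)                         # normalisation (an assumption)
    return sum(w * e for w, e in zip(weights, err)) / total

# Hypothetical per-point errors of opposite sign either side of the centre.
nu = [0.0, 0.1, 0.2, 0.3, 0.4]
err = [0.0, 0.002, 0.0, -0.002, 0.0]
worst_point = max(abs(e) for e in err)                  # unconvolved criterion
worst_conv = abs(convolved_error(nu, err, 0.2, 0.25))   # convolved criterion
```

Here the point-by-point error (0.002) would fail a limit of Δτ = 0.001, but the errors of opposite sign cancel under the kernel, so the convolved criterion is met.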

The procedure is to evaluate the maximum error over all cells if each spectral grid point were replaced by an interpolated value, and the point with the smallest maximum error is removed. The procedure is repeated until no more points can be removed without exceeding the specified interpolation error.
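The removal loop described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the algorithm in tabcmp_v.f: it omits the convolution, interpolates linearly in ln(k), takes u as a per-cell absorber amount, and recomputes every candidate error on each pass, which is simple but slow. All names are hypothetical.

```python
import math

def interp_error(nu, lnk, u, left, mid, right):
    """Max over cells of |exp(-k'u) - exp(-ku)| at point mid when it is
    interpolated in ln(k) between the retained points left and right."""
    w = (nu[mid] - nu[left]) / (nu[right] - nu[left])
    worst = 0.0
    for cell_lnk, cell_u in zip(lnk, u):
        lnk_new = (1.0 - w) * cell_lnk[left] + w * cell_lnk[right]
        err = abs(math.exp(-math.exp(lnk_new) * cell_u)
                  - math.exp(-math.exp(cell_lnk[mid]) * cell_u))
        worst = max(worst, err)
    return worst

def compress(nu, lnk, u, tol):
    """Greedy removal. lnk[c][l] is ln(k) for cell c at grid point l and
    u[c] is that cell's absorber amount. Returns indices of retained points."""
    kept = list(range(len(nu)))
    while len(kept) > 2:
        best_err, best_pos = None, None
        for pos in range(1, len(kept) - 1):      # end points are never removed
            left, right = kept[pos - 1], kept[pos + 1]
            # Removing kept[pos] means every original point inside the gap
            # is now interpolated between left and right.
            err = max(interp_error(nu, lnk, u, left, m, right)
                      for m in range(left + 1, right))
            if best_err is None or err < best_err:
                best_err, best_pos = err, pos
        if best_err > tol:                       # no point can go without
            break                                # exceeding the limit
        del kept[best_pos]
    return kept

# Hypothetical single-cell example: flat ln(k) with one sharp line at index 4.
nu = [float(i) for i in range(9)]
lnk = [[0.0, 0.0, 0.0, 0.0, 5.0, 0.0, 0.0, 0.0, 0.0]]
kept = compress(nu, lnk, [0.01], 0.001)
```

In this example the flat regions collapse to their end points, while the line centre and its two immediate neighbours are retained.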

Installing tabcmp_v

First download the source code: [tabcmp_v.f]

Then compile with any FORTRAN77 compiler, e.g.

f77 tabcmp_v.f -o tabcmp_v
(although, if using binary .tab files, use the same compiler as for the RFM which created the original files - see Introduction).

Running tabcmp_v

To run the program, simply type tabcmp_v and respond to the prompts.

A typical run might be (user responses follow each prompt on the same line)

tabcmp_v
R-TABCMP_V: Running TABCMP_V v1.0
Input file: tab_h2o.asc
Convolved resolution [cm-1]: 0.25
Max absorption error (eg 0.001): 0.001
Output file: tab_h2o_v.asc
I-TABCMP_V: Compression will be performed in 10 spectral subsets
I-TABCMP_V: Processing sub-set# 1 of 10 Spectral range: 1400.0000 1401.0000
I-TABCMP_V: Evaluating initial errors ...
I-TABCMP_V: Starting to remove points. Initial No.= 1001
1000 1400.2450 2.2090E-09
900 1400.1880 1.4674E-07
...
100 1400.7560 1.1976E-04
41 Err.lim.= 1.0000E-03
I-TABCMP_V: Processing sub-set# 2 of 10 Spectral range: 1401.0000 1402.0000
I-TABCMP_V: Evaluating initial errors ...
I-TABCMP_V: Starting to remove points. Initial No.= 1001
1000 1401.3780 3.1711E-10
900 1401.4250 1.3250E-08
...
100 1409.2380 3.5321E-04
37 Err.lim.= 1.0000E-03
I-TABCMP_V: Summary: Orig Npts= 10001 New Npts= 351 Red.Factor= 3.5%
I-TABCMP_V: Writing new .tab file ...
R-TABCMP_V: Successful completion

The program takes from a couple of seconds to a couple of minutes per wavenumber (i.e. per 1 cm-1 of spectrum), depending on the width of the convolution (wider = slower).

Notes

  1. The first line tells you which version (1.0) of the program is being executed.
  2. The second line asks you for the uncompressed .tab file, i.e. the file originally generated by the RFM with the TAB option.
  3. The third line asks for the convolution width, i.e. the FWHM of the triangular convolution function to be applied to the transmittance spectra. Typing <CR> will result in no convolution.
  4. The fourth line asks for the maximum error in transmittance for each cell. A value of 0.001 is suggested, which approximately equates to a 0.1% maximum difference in spectra (transmittance, absorption or radiance) from the applied spectral compression.
  5. The following lines indicate progress. The first such message tells you that the spectral range of the input file (1400-1410 cm-1 in this case) will be processed as 10 subsets. Then, for each subset, at every multiple of 100 points it lists the number of points remaining, the wavenumber of the last point removed and the maximum interpolation error so far, terminating with the final number of points in the subset (41) when the error limit is reached.
  6. The processing finishes with a summary of the number of points retained, in this case just 351 of the original 10001, and the program then runs through the input file again, copying only the retained spectral points to the output file.
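The summary figures in the run above can be checked with a few lines of Python. Reading Red.Factor as the retained fraction of points is an interpretation that fits the printed numbers, not something the program documentation states.

```python
n_orig, n_new = 10001, 351                 # Orig Npts and New Npts from the summary
red_factor = 100.0 * n_new / n_orig        # retained points as a percentage
compression = n_orig / n_new               # equivalent compression factor
print(f"Red.Factor = {red_factor:.1f}%  (compression about {compression:.0f}x)")
```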

Error Messages

When running the program, after responding to the prompt for the input file, the program may halt with a message printed to the terminal

F-RWAXIS: No.p*T*q-axis pts > MAXX

This is because the total number of Np × NT × Nq axis points (=14600 in this case) exceeds the local array size 10000. To fix this, edit the source code and set PARAMETER MAXX to be ≥ Np × NT × Nq. Similarly for other array size error messages.

Subset Processing

An uncompressed LUT is usually too large to be accommodated in memory so the program splits the file into smaller spectral ranges, each processed separately (although the last point in one subset is always retained as the first point in the next subset).

An additional complication is that the .tab file header contains information on the total number of spectral points in the file. Since this isn't known until after the last subset is processed, the compressed file cannot be written as each subset is processed. So during processing a scratch file is written containing a flag for each original spectral grid point, showing whether it is to be retained or excluded. At the end of processing, both the input file and the scratch file are rewound to the start, and the program then steps through both, using the flags in the scratch file to determine which input records are copied to the new output file.
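The two-pass scheme can be sketched in Python as below. The record layout, the one-line header holding only the point count, and all names are hypothetical simplifications; the real .tab header contains more than this, and the sketch assumes the flags supplied for each subset do not repeat the shared end point.

```python
import tempfile

def two_pass_copy(records, subset_flags, out_lines):
    """Pass 1: write one keep/drop flag per input record to a scratch file
    as each subset finishes. Pass 2: rewind, write the header with the
    now-known point count, then copy only the flagged records."""
    with tempfile.TemporaryFile(mode="w+") as scratch:
        n_kept = 0
        for flags in subset_flags:             # one list of booleans per subset
            for keep in flags:
                scratch.write("1\n" if keep else "0\n")
                n_kept += 1 if keep else 0
        scratch.seek(0)                        # rewind the scratch file
        out_lines.append(str(n_kept))          # header needs the final count
        for record, flag in zip(records, scratch):
            if flag.strip() == "1":
                out_lines.append(record)
    return n_kept

# Hypothetical input: five spectral records processed as two subsets.
out = []
kept_count = two_pass_copy(["a", "b", "c", "d", "e"],
                           [[True, False, True], [False, True]], out)
```

The header line is the reason for the second pass: it can only be written once every subset has reported its flags.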

The maximum size of each subset is determined by the parameter MAXV, currently set to 1001. For an input file of 0.001 cm-1 spacing, as in this case, this corresponds to a complete 1 cm-1 interval, including end-points, being processed at a time. A 'feature' of the algorithm is that the end points of each subset are always retained, so this guarantees that the compressed file will have a grid point at each 1 cm-1 interval starting with the lower limit of the file. There is little to be gained in making the subsets larger, but assuming the user will have their own favourite fine grid spacing Δν (here Δν denotes the grid spacing, not the convolution width), it is suggested that MAXV be set to 1 + (1/Δν) to provide 1 cm-1 subset sizes.
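The suggested MAXV setting and the resulting subset layout can be worked through in Python. This is a sketch with hypothetical function names; rounding 1/Δν to the nearest integer is an added assumption to guard against floating-point error.

```python
def maxv_for_spacing(dnu):
    """Subset size giving 1 cm-1 subsets, end points included: 1 + 1/dnu."""
    return 1 + round(1.0 / dnu)

def subset_bounds(npts, maxv):
    """(first, last) index pairs; each subset starts on the last point
    of the previous subset, so end points are shared."""
    bounds, first = [], 0
    while first < npts - 1:
        last = min(first + maxv - 1, npts - 1)
        bounds.append((first, last))
        first = last
    return bounds

maxv = maxv_for_spacing(0.001)          # 0.001 cm-1 grid
bounds = subset_bounds(10001, maxv)     # 1400-1410 cm-1 example file
```

With a 0.001 cm-1 grid this gives MAXV = 1001, and the 10001-point example file splits into 10 subsets, matching the run shown earlier; a 0.0005 cm-1 grid would need MAXV = 2001.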

Version History

v1.21 (05AUG15)
Change '$' in WRITE statements to avoid gfortran warnings
v1.2 (24FEB15)
Improve auto-detection of input file-type
v1.1 (29DEC14)
Allow for extra format identifier record in .tab file header and for tabulated ln(k) rather than tabulated k.
v1.00 (04NOV14)
Original code