merra


The package provides readers and converters for the Land Surface Diagnostics of the Modern-Era Retrospective analysis for Research and Applications, version 2 (MERRA-2). MERRA-2 is a NASA atmospheric reanalysis that integrates satellite data assimilation and is aimed at historical climate analyses. The Land Surface Diagnostics cover the period from 1980 to the present at a spatial resolution of 0.5° x 0.625° and an hourly temporal resolution.

The structure of the package is as follows:

  • grid.py : implements the asymmetrical GMAO 0.5 x 0.625 grid
  • interface.py : classes for reading a single image, image stacks and time series
  • reshuffle.py : provides a command line utility for reshuffling a stack of hourly native images into time series format with an arbitrary temporal sampling between 1 hour and daily
  • download.py : command line utility for downloading MERRA-2 data from the NASA GES DISC datapool
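To make the grid dimensions concrete, the following is a minimal sketch of the coordinate axes of a regular 0.5° x 0.625° global grid. It is not the code in grid.py; the function name merra2_grid_axes is hypothetical, and the axis origins (latitude from -90, longitude from -180) are assumptions.

```python
# Sketch of a regular 0.5 x 0.625 degree global grid.
# Assumption: latitudes run from -90 to 90 (both poles included) and
# longitudes from -180 up to, but not including, +180.
DLAT, DLON = 0.5, 0.625


def merra2_grid_axes():
    """Return (lats, lons) as lists of grid point coordinates in degrees."""
    n_lat = int(180 / DLAT) + 1  # both poles are grid points
    n_lon = int(360 / DLON)      # -180 and +180 are the same meridian
    lats = [-90.0 + i * DLAT for i in range(n_lat)]
    lons = [-180.0 + j * DLON for j in range(n_lon)]
    return lats, lons


lats, lons = merra2_grid_axes()
print(len(lats), len(lons))  # number of rows and columns of a global image
```

The two lengths correspond to the (361, 576) image shape mentioned in the reading example further down.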

Installation

For developers, it is recommended to first clone the repository and then use the provided environment.yml file to install all needed conda and pip dependencies:

git clone https://github.com/TUW-GEO/merra.git --recursive
cd merra
conda create -n merra python=3.7 # or any supported python version
source activate merra
conda env update -f environment.yml
python setup.py develop

Supported Products

Contribute

We are happy if you want to contribute. Please raise an issue explaining what is missing or if you find a bug. We will also gladly accept pull requests against our master branch for new features or bug fixes.

Guidelines

If you want to contribute please follow these steps:

  • Fork the merra repository to your account.
  • Make a new feature branch from the merra master branch.
  • Add your feature.
  • Include tests for your contribution in one of the test directories. We use py.test, so a simple function called test_my_feature is enough.
  • Submit a pull request to our master branch.

Reading MERRA-2 images

MERRA-2 netCDF files can be read in two ways:

1) Reading by file name

import os
from datetime import datetime
from merra.interface import MerraImage

# parameters to read
param_list = ['SFMC', 'RZMC', 'PRECTOTLAND', 'TSOIL1']

# timestamp (needed because there are 24 hourly simulations within
# each file (3D-stack), so you need to choose one to return a 2D image)
timestamp = datetime(2018, 10, 1, 0, 30)

# the class is initialized with the exact filename.
img = MerraImage(os.path.join(os.path.dirname(__file__),
                             'merra-test-data',
                             'M2T1NXLND.5.12.4',
                             '2018',
                             '10',
                             'MERRA2_400.tavg1_2d_lnd_Nx.20181001.nc4'),
                          parameter=param_list)

# reading returns an image object which contains a data dictionary
# with one array per parameter. The returned data is a global image/array
# with shape (361, 576)
image = img.read(timestamp=timestamp)
data = image.data

2) Reading by date

All the MERRA-2 data in a directory structure can be accessed by date. The filename is automatically built from the given date.

import os
from datetime import datetime
from merra.interface import MerraImageStack

# parameters to read
param_list = ['SFMC', 'RZMC', 'PRECTOTLAND', 'TSOIL1']

# initializes an image stack class given the path to the data directory
# the class knows about the default folder structure down the line
img_stack = MerraImageStack(data_path=os.path.join(os.path.dirname(__file__),
                                             'merra-test-data',
                                             'M2T1NXLND.5.12.4'),
                          parameter=param_list)

# timestamp
timestamp = datetime(2018, 10, 1, 0, 30)

# read one image out of the stack at specific timestamp
image = img_stack.read(timestamp=timestamp)
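As an illustration of how a file path could be derived from a date, here is a minimal sketch based only on the folder layout and file name visible in the example above (year/month subfolders and a date-stamped file name). The helper merra2_lnd_path and its stream parameter are hypothetical; the package may resolve file names differently.

```python
import os
from datetime import datetime


def merra2_lnd_path(root, date, stream=400):
    """Build the path of a daily land diagnostics file for `date`.

    `stream` is the number after 'MERRA2_' in the file name. The default
    of 400 is taken from the example file; other periods may use other
    values (assumption).
    """
    fname = "MERRA2_{}.tavg1_2d_lnd_Nx.{:%Y%m%d}.nc4".format(stream, date)
    return os.path.join(root, "{:%Y}".format(date), "{:%m}".format(date), fname)


path = merra2_lnd_path("M2T1NXLND.5.12.4", datetime(2018, 10, 1, 0, 30))
print(path)
```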

For reading all images between two dates, the merra.interface.MerraImageStack.iter_images() iterator can be used.
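The signature of iter_images() is not shown here, so the sketch below only illustrates the idea with the read(timestamp=...) call from the example above: it generates the hourly timestamps between two dates, which could then be read one by one. The helper hourly_timestamps is hypothetical, and the assumption is that images are stamped at minute 30 of each hour, as in the example timestamp.

```python
from datetime import datetime, timedelta


def hourly_timestamps(start, end, step_hours=1):
    """Yield timestamps from `start` to `end` (inclusive) every `step_hours`."""
    current = start
    while current <= end:
        yield current
        current += timedelta(hours=step_hours)


stamps = list(hourly_timestamps(datetime(2018, 10, 1, 0, 30),
                                datetime(2018, 10, 1, 23, 30)))

# With an initialized image stack (see above), each image could be read as:
# for ts in stamps:
#     image = img_stack.read(timestamp=ts)
```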

Variables of interest for MERRA-2

A full list of variable names can be found in the MERRA-2 README provided by NASA.

Short name   Long name                              Parameter                  Resolution     Depth [m]    Units
SFMC         water_surface_layer                    Soil moisture              0.5° x 0.625°  0.00 - 0.05  [m-3 m-3]
RZMC         water_root_zone                        Soil moisture              0.5° x 0.625°  0.10 - 1.00  [m-3 m-3]
GWETTOP      surface_soil_wetness                   Soil moisture              0.5° x 0.625°  0.00 - 0.05  []
GWETROOT     root_zone_soil_wetness                 Soil moisture              0.5° x 0.625°  0.10 - 1.00  []
GWETPROF     ave_prof_soil_moisture                 Soil moisture              0.5° x 0.625°  1.34 - 8.53  []
PRECTOTLAND  Total_precipitation_land               Rain precipitation rate    0.5° x 0.625°  0            [kg m-2 s-1]
PRECSNOLAND  snowfall_land                          Snow precipitation rate    0.5° x 0.625°  0            [kg m-2 s-1]
SNOMAS       Total_snow_storage_land                Total snow storage land    0.5° x 0.625°  0            [kg m-2]
TSOIL1       soil_temperatures_layer_1              Soil temperatures layer 1  0.5° x 0.625°  0.0988       [K]
TSOIL2       soil_temperatures_layer_2              Soil temperatures layer 2  0.5° x 0.625°  0.1952       [K]
TSOIL3       soil_temperatures_layer_3              Soil temperatures layer 3  0.5° x 0.625°  0.3859       [K]
TSOIL4       soil_temperatures_layer_4              Soil temperatures layer 4  0.5° x 0.625°  0.7626       [K]
TSOIL5       soil_temperatures_layer_5              Soil temperatures layer 5  0.5° x 0.625°  1.5071       [K]
TSOIL6       soil_temperatures_layer_6              Soil temperatures layer 6  0.5° x 0.625°  10.000       [K]
TSURF        surface_temperature_of_land_incl_snow  Surface temperature        0.5° x 0.625°  0            [K]
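The units in the table are often converted before further analysis. The following sketch shows two common conversions; the helper names are hypothetical and are not part of the merra package. It relies on the fact that 1 kg of water per square metre corresponds to a layer of 1 mm.

```python
SECONDS_PER_HOUR = 3600.0
KELVIN_OFFSET = 273.15


def precip_rate_to_mm_per_hour(rate):
    """Convert a precipitation rate from kg m-2 s-1 to mm per hour."""
    return rate * SECONDS_PER_HOUR


def kelvin_to_celsius(temperature):
    """Convert a temperature from Kelvin to degrees Celsius."""
    return temperature - KELVIN_OFFSET


# e.g. for PRECTOTLAND and TSOIL1 values taken from image.data
rain_mm_h = precip_rate_to_mm_per_hour(0.001)
soil_temp_c = kelvin_to_celsius(283.15)
```

Both functions use plain arithmetic only, so they should also work element-wise on array-like data that supports these operators.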

Conversion to time series format

For many applications it is favorable to convert the image-based format into a format that is optimized for fast time series retrieval. This is often needed for validation studies, for example. One option is to stack the images into a single netCDF file with suitable chunk sizes, and there are many other methods. We have chosen to do it in the following way:

  • Store only the reduced Gaussian grid points since that saves space.
  • Further reduce the amount of stored data by saving only land points, if selected.
  • Store the time series in netCDF4 in the Climate and Forecast convention Orthogonal multidimensional array representation.
  • Store the time series in 5x5 degree cells. This means there will be 2566 cell files (without reduction to land points) and a file called grid.nc which contains the information about which grid point is stored in which file. This allows us to read a whole 5x5 degree area into memory and iterate over the time series quickly.
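To illustrate the 5x5 degree cell partitioning, the sketch below maps a longitude/latitude pair to a cell number. The numbering scheme (counting columns from -180° longitude, and within each column counting rows upward from -90° latitude) is an assumption modelled on the convention of the pygeogrids package; the function lonlat_to_cell is hypothetical and is not the code used by merra.reshuffle.

```python
import math


def lonlat_to_cell(lon, lat, cellsize=5.0):
    """Return the number of the cellsize x cellsize degree cell of a point.

    Assumed numbering: cell 0 has its lower left corner at (-180, -90),
    numbers increase northward first, then eastward.
    """
    n_lat_cells = int(180 / cellsize)
    n_lon_cells = int(360 / cellsize)
    # clip so that lon = 180 and lat = 90 fall into the last column/row
    col = min(max(math.floor((lon + 180.0) / cellsize), 0), n_lon_cells - 1)
    row = min(max(math.floor((lat + 90.0) / cellsize), 0), n_lat_cells - 1)
    return int(col * n_lat_cells + row)


cell = lonlat_to_cell(16.375, 48.125)
print(cell)
```

Under this assumption, all grid points inside the same 5x5 degree box share one cell number and would therefore end up in the same cell file.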

This conversion can be performed using the merra_repurpose command line program. An example would be:

merra_repurpose /merra2_data /timeseries/data -s 2000-01-01 -e 2018-11-30 --parameters SFMC RZMC --temporal_sampling 6

This would take the MERRA-2 data stored in /merra2_data from January 1st 2000 to November 30th 2018 and store the parameters surface soil moisture (SFMC) and root zone soil moisture (RZMC), sampled 6-hourly, as time series in the folder /timeseries/data.

Conversion to time series is performed by the repurpose package in the background. For custom settings or other options see the repurpose documentation and the code in merra.reshuffle.

Note: If a RuntimeError: NetCDF: Bad chunk sizes. appears during reshuffling, consider downgrading the netcdf4 library via:

conda install -c conda-forge netcdf4=1.2.2

if you are on Python 2.* and

conda install -c conda-forge netcdf4=1.2.8

if you are using Python 3.*.

Reading converted time series data

To read the time series data produced by the merra_repurpose command, the class MerraTs can be used:

from merra.interface import MerraTs

# specify path to data folder
path = '../timeseries/data'

# specify location lon and lat
lon, lat = (16.375, 48.125)

# initialize the time series class
merra_reader = MerraTs(ts_path=path,
                       ioclass_kws={'read_bulk': True},
                       parameters=['SFMC'])

# read SFMC time series at the location
ts = merra_reader.read(lon, lat)
