Python - Making a map#

Aim: To create a map of data from reanalysis heat fluxes.

Data: You will need to download files from ICDC. We will be using NCEP Reanalysis heat fluxes at the ocean surface, and you will need one snapshot of sensible heat flux, latex heat flux, net shortwave and net longwave radiation flux.

Directions: Create an *.ipynb and 8 figures. This will form your answer to questions on the exercise sheet.


Create a notebook & load the data#

  1. Create an *.ipynb containing the commands for this assignment, or copy this file and rename it, e.g., computing-regoz-4-<Lastname>.ipynb

  2. Import necessary packages.

    For example, matplotlib and pandas and numpy and xarray. You may also need

    If you are missing any of these packages, please refer to Resources: Python.

    Note

    In this exercise, we will use the package cartopy for the first time. You will likely need to install this package on your computer (and in your environment, if you are using conda environments).

  3. Download some data.

    • Option 1: Simpler. Get the data files from the Moodle. Choose one of the folders (4 files each) called “Heat flux files”.

    • Option 2: Choose a year to work with: anything from 1948 to 2023. Then navigate to each of the 4 components of surface heat fluxes from the NCEP reanalysis product, on the ICDC page https://www.cen.uni-hamburg.de/en/icdc/data/atmosphere/reanalysis-atmosphere.html. Download one of each file for your chosen year. Note that these files are about 30 mb each.

    Note

    This only works from the CEN/MPI network at UHH; that is, you can download files from one of the computers in the computer lab. If you are working on the lab computers for this exercise, it will work fine. If you’re working on your personal computer, you will need ot transfer the files from the lab computer (where you download the data) to your personal computer. A cloud service (e.g. the ownCloud system at UHH) can be used. The files are large (>30mb each) which may mean they are too large to e-mail to yourself.

  4. Make a basic exploration. How big are the data? What are the coordinates? Use print() or other commands you’ve learned in previous exercises.

# Your code here
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import xarray as xr
from datetime import datetime

# Some extra colormaps
import cmocean

# Cartopy
import cartopy
import cartopy.feature as cfeature
import cartopy.crs as ccrs
import cartopy.mpl.ticker as cticker
from cartopy.mpl.ticker import LatitudeFormatter, LongitudeFormatter

Downloading data#

  • To make the code easier to change, we’ll recommend building the filename out of components. This gives you the possibility later of downloading a different year and only updating part of the code.

    myyear = 2001
    filepath = 'data/' # If you have a file folder containing data
    fpre = ['lhtfl', 'nswrs']  # Add to this all the names of the variables
    fpost = '.sfc.gauss.' + str(myyear) + '.nc'
    

    This will give us a filename once we contenate together the various pieces:

    fname = filepath + fpre[0] + fpost
    
  • Now you can do this one after another, where you call each of the 4 choices in fpre separately as fpre[0], fpre[1], and so on. This would look something like:

    fname = filepath + fpre[0] + fpost
    mydata1 = xr.open_dataset(fname)
    
    fname = filepath + fpre[1] + fpost
    mydata2 = xr.open_dataset(fname)
    
    fname = filepath + fpre[2] + fpost
    mydata3 = xr.open_dataset(fname)
    
    fname = filepath + fpre[3] + fpost
    mydata4 = xr.open_dataset(fname)
    

    Or you can use a “loop”. In python, one of the basic loop types is a “for-do loop” or a “for loop”.

    Here, we’ll use a loop that goes through a simple vector which has the length the same as the length of fpre. First check what is the length of fpre?

    print(len(fpre))
    

    It should be 4. If it’s not, then go back and edit where you define the list fpre. Verify that it is indeed of type list by doing

    print(type(fpre))
    
  • Now construct your for-do loop which builds the different variable names

    for i in range(len(fpre)):
        fname = filepath + fpre[i] + fpost
        print(fname)
    

    Check that it is correctly producing the name of a file.

  • Load the data. If you had a single filename correctly named in the string fname, then you could use xr.open_dataset(fname). For example

    mydata = xr.open_dataset(fname)
    
# Your code here

Loading all variables into a single dictionary#

Python has a variable type called a “dictionary” which is used to store “key - value” pairs. These are pairs like (x, y) is a pair, where the x (first member of the pair) is a “key” or the name you use to refer to something, and the y (second member of the pair) is the “value” or contents of the object named by the “key”.

In the simple website example above, these are pairs of strings (the key is a string and the value is a string), or numbers (the key is a string and the values are numbers), or arrays. In our case here, we can create a dictionary of xarray datasets (the “key” will be the name of the component of heat flux, one of “lhtfl”, “shtfl”, “nswrs” or “nlwrs”, and the “value” or contents which we can call using the key is an xarray dataset).

In your code above, replace the left side of the equation where you load the dataset (i.e., where you use the command xr.open_dataset) with

flux_components[fpre[i]] = xr.open_dataset(fname)

Once you have done this, you can check out the data within the dictionary using the following commands.

The first one, print(flux_components['lhtfl'].lhtfl.shape) will tell you how big the dataset of latent heat flux is (x, y and z directions).

The second looks more like an xarray dataset that you’re familiar with (print(flux_components['lhtfl']).

# Figure out how big the datasets are
#print(flux_components['lhtfl'].lhtfl.shape)
#print(flux_components['lhtfl'])

Merge into a single xarray dataset#

The dictionary of xarray datasets was kind of useful, but with xarray we don’t need to bother using a dictionary to store the data. Instead, we can use the command xr.merge to combine the similar datatypes (same coordinates, same dimensions, but different variables: latent, sensible heat flux, and shortwave and longwave radiation). Take a look at this using print(). What are the dimensions? What are the coordinates? What are the variables? Compare this to your print(flux_components['lhtfl']) which contained only one of these datasets.

# Merge the datasets using xr.merged
#all_flux = xr.merge([
#    flux_components['lhtfl'],
#    flux_components['shtfl'],
#    flux_components['nswrs'],
#    flux_components['nlwrs']])
#print(all_flux)

Fig 1. Plot with matplotlib#

Now we’d like to take a look at the data for a single snapshot (a single time). The example code below will choose the very first frame (where the time index is 0), and plot the latent heat flux. Update the code in order to plot four fields (sensible, latent, shortwave and longwave).

See also

Information and examples using matplotlib.pyplot.contourf(): https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.contourf.html.

Scroll to the bottom of this page, and see a few examples of how to use contourf() and what the results can look like.

# Plot the fields
# choose the index of the snapshot to show
itime = 0
map1 = all_flux.lhtfl[itime,:,:]

fig, axs = plt.subplots(2,2)
axs[0,0].contourf(data1.lon, data1.lat, map1, cmap='RdYlBu')
axs[0,0].set_title('Latent heat flux')
axs[0,0].set_ylabel('Latitude')

# Cumbersome date time to string
d = data1.time[itime].dt.strftime('%Y.%m.%d').values
fig.suptitle('NCEP Reanalysis \n' + d)

fig.savefig('fig1-heatflux-Lastname.png')
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[5], line 4
      1 # Plot the fields
      2 # choose the index of the snapshot to show
      3 itime = 0
----> 4 map1 = all_flux.lhtfl[itime,:,:]
      6 fig, axs = plt.subplots(2,2)
      7 axs[0,0].contourf(data1.lon, data1.lat, map1, cmap='RdYlBu')

NameError: name 'all_flux' is not defined

Making seasonal / annual average#

Now we’re going to use some of the fancier features of the xarray data construction. We’d like to make an average over all time (annual average) for this dataset. Since we’ve stored the data all in a single xarray dataset, we can calculate the mean with one line of code.

What happens if you don’t include the keep_attrs=True option? Try deleting it and see what changes.

See also

# Make an average over a full year
#ann_flux = all_flux.mean(dim='time', keep_attrs=True)
#print(ann_flux)

Fig 2. Update for annual averages#

Single-day snapshots of heat fluxes can be hard to read. Let’s repeat the plot above with the annual averages instead.

# Plot the fields
# choose the index of the snapshot to show

fig, axs = plt.subplots(2,2)
axs[0,0].contourf(data1.lon, data1.lat, ann_flux.lhtfl[:,:], cmap='RdYlBu')
axs[0,0].set_title('Latent heat flux')
axs[0,0].set_ylabel('Latitude')

# Cumbersome date time to string - Since we've done the annual average, we only need one year.
d = data1.time[itime].dt.strftime('%Y').values
fig.suptitle('NCEP Reanalysis \n' + d)

fig.savefig('fig2-heatflux-Lastname.png')

Fig 3. Using cartopy#

What if you want to add some coastlines to your maps, and maybe switch up the map projection? Here we’ll use the cartopy package.

See also

How to use cartopy with matplotlib: https://scitools.org.uk/cartopy/docs/latest/matplotlib/intro.html

The code below only partially works. The y-axis labels are broken, and it currently plots the daily snapshot rather than the annual mean. Try to update the plot to fix the y axis labels and to move the top row of figures closer to the bottom row.

  • Explore how to fix the labels on Cartopy maps: https://scitools.org.uk/cartopy/docs/latest/gallery/gridlines_and_labels/tick_labels.html

  • Optional: Try switching your map projection. A map “projection” refers to how we render the near-spherical Earth on a flat piece of paper. There are pros and cons of various map projections. Things to ask yourself when choosing a projection:

    • Are land-masses distorted (e.g. compared to spherical earth)? E.g., is Greenland massive, or is your \(x\) axis scaled strangely relative to your \(y\) axis?

    • Are lines of latitude/longitude straight? (They don’t need to be.)

    • Will my map plotting choices distract from the data?

    Aim to minimise distortions in the region where you want your viewer to focus their attention.

    Available map projections with cartopy: https://scitools.org.uk/cartopy/docs/v0.15/crs/projections.html

# Set some parameters for the map
nrows=2
ncols=2
itime = 0
myprojection = ccrs.AlbersEqualArea()
myprojection = ccrs.Mercator()
myprojection = ccrs.PlateCarree()

# Initialise the map with the projection above
fig, axs = plt.subplots(nrows=nrows,ncols=ncols,
                        subplot_kw={'projection': myprojection},
                        figsize=(11,8.5))

# axs is a 2 dimensional array of `GeoAxes`.  
# We will flatten it into a 1-D array.
# This helps when plotting using a for-loop.
axs=axs.flatten()

# Loop through fluxes
for i in range(len(fpre)):
    # Select the flux to load
    data1 = flux_components[fpre[i]]
    map1 = data1[fpre[i]][itime,:,:]
    axs[i].contourf(data1.lon, data1.lat, map1, cmap='RdYlBu', transform=cartopy.crs.PlateCarree())
    axs[i].coastlines()               # plot some data on them
    axs[i].set_title(fpre[i])                        # label it
    axs[i].add_feature(cfeature.COASTLINE)

    # Longitude labels
    axs[i].set_xticks(np.arange(-180,181,90), crs=cartopy.crs.PlateCarree())
    lon_formatter = cticker.LongitudeFormatter()
    axs[i].xaxis.set_major_formatter(lon_formatter)

    # Latitude labels
    axs[i].set_yticks(np.arange(-90,91,30), crs=ccrs.Mercator())

fig.savefig('fig3-cartopy-lastname.png')