Python - Making a map#
Aim: To create a map of data from reanalysis heat fluxes.
Data: You will need to download files from ICDC. We will be using NCEP Reanalysis heat fluxes at the ocean surface, and you will need one snapshot of sensible heat flux, latex heat flux, net shortwave and net longwave radiation flux.
Directions: Create an *.ipynb
and 8 figures. This will form your answer to questions on the exercise sheet.
Create a notebook & load the data#
Create an
*.ipynb
containing the commands for this assignment, or copy this file and rename it, e.g.,computing-regoz-4-<Lastname>.ipynb
Import necessary packages.
For example,
matplotlib
andpandas
andnumpy
andxarray
. You may also needIf you are missing any of these packages, please refer to Resources: Python.
Note
In this exercise, we will use the package cartopy for the first time. You will likely need to install this package on your computer (and in your environment, if you are using conda environments).
Download some data.
Option 1: Simpler. Get the data files from the Moodle. Choose one of the folders (4 files each) called “Heat flux files”.
Option 2: Choose a year to work with: anything from 1948 to 2023. Then navigate to each of the 4 components of surface heat fluxes from the NCEP reanalysis product, on the ICDC page https://www.cen.uni-hamburg.de/en/icdc/data/atmosphere/reanalysis-atmosphere.html. Download one of each file for your chosen year. Note that these files are about 30 mb each.
Note
This only works from the CEN/MPI network at UHH; that is, you can download files from one of the computers in the computer lab. If you are working on the lab computers for this exercise, it will work fine. If you’re working on your personal computer, you will need ot transfer the files from the lab computer (where you download the data) to your personal computer. A cloud service (e.g. the ownCloud system at UHH) can be used. The files are large (>30mb each) which may mean they are too large to e-mail to yourself.
Make a basic exploration. How big are the data? What are the coordinates? Use
print()
or other commands you’ve learned in previous exercises.
# Your code here
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import xarray as xr
from datetime import datetime
# Some extra colormaps
import cmocean
# Cartopy
import cartopy
import cartopy.feature as cfeature
import cartopy.crs as ccrs
import cartopy.mpl.ticker as cticker
from cartopy.mpl.ticker import LatitudeFormatter, LongitudeFormatter
Downloading data#
To make the code easier to change, we’ll recommend building the filename out of components. This gives you the possibility later of downloading a different year and only updating part of the code.
myyear = 2001 filepath = 'data/' # If you have a file folder containing data fpre = ['lhtfl', 'nswrs'] # Add to this all the names of the variables fpost = '.sfc.gauss.' + str(myyear) + '.nc'
This will give us a filename once we contenate together the various pieces:
fname = filepath + fpre[0] + fpost
Now you can do this one after another, where you call each of the 4 choices in
fpre
separately asfpre[0]
,fpre[1]
, and so on. This would look something like:fname = filepath + fpre[0] + fpost mydata1 = xr.open_dataset(fname) fname = filepath + fpre[1] + fpost mydata2 = xr.open_dataset(fname) fname = filepath + fpre[2] + fpost mydata3 = xr.open_dataset(fname) fname = filepath + fpre[3] + fpost mydata4 = xr.open_dataset(fname)
Or you can use a “loop”. In python, one of the basic loop types is a “for-do loop” or a “for loop”.
See also
About for-do loops: https://www.w3schools.com/python/python_for_loops.asp
Here, we’ll use a loop that goes through a simple vector which has the length the same as the length of
fpre
. First check what is the length offpre
?print(len(fpre))
It should be 4. If it’s not, then go back and edit where you define the list
fpre
. Verify that it is indeed of typelist
by doingprint(type(fpre))
Now construct your for-do loop which builds the different variable names
for i in range(len(fpre)): fname = filepath + fpre[i] + fpost print(fname)
Check that it is correctly producing the name of a file.
Load the data. If you had a single filename correctly named in the string
fname
, then you could usexr.open_dataset(fname)
. For examplemydata = xr.open_dataset(fname)
# Your code here
Loading all variables into a single dictionary
#
Python has a variable type called a “dictionary” which is used to store “key - value” pairs. These are pairs like (x, y) is a pair, where the x (first member of the pair) is a “key” or the name you use to refer to something, and the y (second member of the pair) is the “value” or contents of the object named by the “key”.
See also
Python dictionary: https://www.w3schools.com/python/python_dictionaries.asp
In the simple website example above, these are pairs of strings (the key is a string and the value is a string), or numbers (the key is a string and the values are numbers), or arrays. In our case here, we can create a dictionary of xarray datasets (the “key” will be the name of the component of heat flux, one of “lhtfl”, “shtfl”, “nswrs” or “nlwrs”, and the “value” or contents which we can call using the key is an xarray dataset).
In your code above, replace the left side of the equation where you load the dataset (i.e., where you use the command xr.open_dataset
) with
flux_components[fpre[i]] = xr.open_dataset(fname)
Once you have done this, you can check out the data within the dictionary using the following commands.
The first one, print(flux_components['lhtfl'].lhtfl.shape)
will tell you how big the dataset of latent heat flux is (x, y and z directions).
The second looks more like an xarray dataset that you’re familiar with (print(flux_components['lhtfl']
).
# Figure out how big the datasets are
#print(flux_components['lhtfl'].lhtfl.shape)
#print(flux_components['lhtfl'])
Merge into a single xarray dataset#
The dictionary of xarray datasets was kind of useful, but with xarray we don’t need to bother using a dictionary to store the data. Instead, we can use the command xr.merge
to combine the similar datatypes (same coordinates, same dimensions, but different variables: latent, sensible heat flux, and shortwave and longwave radiation). Take a look at this using print()
. What are the dimensions? What are the coordinates? What are the variables? Compare this to your print(flux_components['lhtfl'])
which contained only one of these datasets.
# Merge the datasets using xr.merged
#all_flux = xr.merge([
# flux_components['lhtfl'],
# flux_components['shtfl'],
# flux_components['nswrs'],
# flux_components['nlwrs']])
#print(all_flux)
Fig 1. Plot with matplotlib
#
Now we’d like to take a look at the data for a single snapshot (a single time). The example code below will choose the very first frame (where the time index is 0), and plot the latent heat flux. Update the code in order to plot four fields (sensible, latent, shortwave and longwave).
See also
Information and examples using matplotlib.pyplot.contourf()
: https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.contourf.html.
Scroll to the bottom of this page, and see a few examples of how to use contourf()
and what the results can look like.
# Plot the fields
# choose the index of the snapshot to show
itime = 0
map1 = all_flux.lhtfl[itime,:,:]
fig, axs = plt.subplots(2,2)
axs[0,0].contourf(data1.lon, data1.lat, map1, cmap='RdYlBu')
axs[0,0].set_title('Latent heat flux')
axs[0,0].set_ylabel('Latitude')
# Cumbersome date time to string
d = data1.time[itime].dt.strftime('%Y.%m.%d').values
fig.suptitle('NCEP Reanalysis \n' + d)
fig.savefig('fig1-heatflux-Lastname.png')
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[5], line 4
1 # Plot the fields
2 # choose the index of the snapshot to show
3 itime = 0
----> 4 map1 = all_flux.lhtfl[itime,:,:]
6 fig, axs = plt.subplots(2,2)
7 axs[0,0].contourf(data1.lon, data1.lat, map1, cmap='RdYlBu')
NameError: name 'all_flux' is not defined
Making seasonal / annual average#
Now we’re going to use some of the fancier features of the xarray data construction. We’d like to make an average over all time (annual average) for this dataset. Since we’ve stored the data all in a single xarray
dataset, we can calculate the mean with one line of code.
What happens if you don’t include the keep_attrs=True
option? Try deleting it and see what changes.
See also
Averaging
xarray
datasets all at once. https://docs.xarray.dev/en/stable/generated/xarray.Dataset.mean.html
# Make an average over a full year
#ann_flux = all_flux.mean(dim='time', keep_attrs=True)
#print(ann_flux)
Fig 2. Update for annual averages#
Single-day snapshots of heat fluxes can be hard to read. Let’s repeat the plot above with the annual averages instead.
# Plot the fields
# choose the index of the snapshot to show
fig, axs = plt.subplots(2,2)
axs[0,0].contourf(data1.lon, data1.lat, ann_flux.lhtfl[:,:], cmap='RdYlBu')
axs[0,0].set_title('Latent heat flux')
axs[0,0].set_ylabel('Latitude')
# Cumbersome date time to string - Since we've done the annual average, we only need one year.
d = data1.time[itime].dt.strftime('%Y').values
fig.suptitle('NCEP Reanalysis \n' + d)
fig.savefig('fig2-heatflux-Lastname.png')
Fig 3. Using cartopy
#
What if you want to add some coastlines to your maps, and maybe switch up the map projection? Here we’ll use the cartopy
package.
See also
How to use cartopy
with matplotlib
: https://scitools.org.uk/cartopy/docs/latest/matplotlib/intro.html
The code below only partially works. The y-axis labels are broken, and it currently plots the daily snapshot rather than the annual mean. Try to update the plot to fix the y axis labels and to move the top row of figures closer to the bottom row.
Explore how to fix the labels on Cartopy maps: https://scitools.org.uk/cartopy/docs/latest/gallery/gridlines_and_labels/tick_labels.html
Optional: Try switching your map projection. A map “projection” refers to how we render the near-spherical Earth on a flat piece of paper. There are pros and cons of various map projections. Things to ask yourself when choosing a projection:
Are land-masses distorted (e.g. compared to spherical earth)? E.g., is Greenland massive, or is your \(x\) axis scaled strangely relative to your \(y\) axis?
Are lines of latitude/longitude straight? (They don’t need to be.)
Will my map plotting choices distract from the data?
Aim to minimise distortions in the region where you want your viewer to focus their attention.
Available map projections with
cartopy
: https://scitools.org.uk/cartopy/docs/v0.15/crs/projections.html
# Set some parameters for the map
nrows=2
ncols=2
itime = 0
myprojection = ccrs.AlbersEqualArea()
myprojection = ccrs.Mercator()
myprojection = ccrs.PlateCarree()
# Initialise the map with the projection above
fig, axs = plt.subplots(nrows=nrows,ncols=ncols,
subplot_kw={'projection': myprojection},
figsize=(11,8.5))
# axs is a 2 dimensional array of `GeoAxes`.
# We will flatten it into a 1-D array.
# This helps when plotting using a for-loop.
axs=axs.flatten()
# Loop through fluxes
for i in range(len(fpre)):
# Select the flux to load
data1 = flux_components[fpre[i]]
map1 = data1[fpre[i]][itime,:,:]
axs[i].contourf(data1.lon, data1.lat, map1, cmap='RdYlBu', transform=cartopy.crs.PlateCarree())
axs[i].coastlines() # plot some data on them
axs[i].set_title(fpre[i]) # label it
axs[i].add_feature(cfeature.COASTLINE)
# Longitude labels
axs[i].set_xticks(np.arange(-180,181,90), crs=cartopy.crs.PlateCarree())
lon_formatter = cticker.LongitudeFormatter()
axs[i].xaxis.set_major_formatter(lon_formatter)
# Latitude labels
axs[i].set_yticks(np.arange(-90,91,30), crs=ccrs.Mercator())
fig.savefig('fig3-cartopy-lastname.png')