LOTUS Regression
0.8
  • Technical Note on the Baseline Regression Implementation
  • Trends with the GOZCARDS Dataset
  • Seasonal trends with the GOZCARDS Dataset
  • Trends with the SAGE II / OSIRIS / OMPS-LP Dataset
  • Predictors
  • Extra Predictors
  • Running the Regression for a Single Bin
  • Changing the Linear Components
  • API Reference
LOTUS Regression
  • Seasonal trends with the GOZCARDS Dataset
  • View page source

Seasonal trends with the GOZCARDS Dataset¶

Here we calculate seasonal trends using the GOZCARDS dataset by regressing to the VMR monthly zonal means using seasonal terms in our predictors.

[2]:
import xarray as xr
import numpy as np
from LOTUS_regression.regression import regress_all_bins
from LOTUS_regression.predictors.seasonal import add_seasonal_components
from LOTUS_regression.predictors import load_data

The GOZCARDS data is in multiple NetCDF4 files. Load them all in and concatenate on the time dimension. Also only take values in the latitude range -60 to 60.

[3]:
GOZCARDS_FILES = r'/home/runner/work/lotus-regression/lotus-regression/test_data/GOZCARDS/*.nc4'

data = xr.decode_cf(xr.open_mfdataset(GOZCARDS_FILES, combine='nested', concat_dim='time', group='Merged', engine='netcdf4'))

data = data.loc[dict(lat=slice(-60, 60))]

print(data)
<xarray.Dataset> Size: 47MB
Dimensions:               (time: 384, data_source: 6, overlap: 6, lev: 25,
                           lat: 12)
Coordinates:
  * lat                   (lat) float32 48B -55.0 -45.0 -35.0 ... 35.0 45.0 55.0
  * lev                   (lev) float32 100B 1e+03 681.3 464.2 ... 0.1468 0.1
  * time                  (time) datetime64[ns] 3kB 1979-01-15 ... 2012-12-15
  * data_source           (data_source) int32 24B 1 2 3 4 5 6
  * overlap               (overlap) int32 24B 1 2 3 4 5 6
Data variables: (12/13)
    data_source_name      (time, data_source) |S20 46kB dask.array<chunksize=(12, 6), meta=np.ndarray>
    overlap_start_time    (time, overlap) datetime64[ns] 18kB dask.array<chunksize=(12, 6), meta=np.ndarray>
    overlap_end_time      (time, overlap) datetime64[ns] 18kB dask.array<chunksize=(12, 6), meta=np.ndarray>
    overlap_used_source   (time, overlap, data_source) int8 14kB dask.array<chunksize=(12, 6, 6), meta=np.ndarray>
    overlap_source_total  (time, overlap, data_source, lev, lat) float64 33MB dask.array<chunksize=(12, 6, 6, 25, 12), meta=np.ndarray>
    average               (time, lev, lat) float32 461kB dask.array<chunksize=(12, 25, 12), meta=np.ndarray>
    ...                    ...
    std_dev               (time, lev, lat) float32 461kB dask.array<chunksize=(12, 25, 12), meta=np.ndarray>
    std_error             (time, lev, lat) float32 461kB dask.array<chunksize=(12, 25, 12), meta=np.ndarray>
    minimum               (time, lev, lat) float32 461kB dask.array<chunksize=(12, 25, 12), meta=np.ndarray>
    maximum               (time, lev, lat) float32 461kB dask.array<chunksize=(12, 25, 12), meta=np.ndarray>
    offset                (time, data_source, lev, lat) float32 3MB dask.array<chunksize=(12, 6, 25, 12), meta=np.ndarray>
    offset_std_error      (time, data_source, lev, lat) float32 3MB dask.array<chunksize=(12, 6, 25, 12), meta=np.ndarray>
Attributes:
    OriginalFileVersion:           1.01
    OriginalFileProcessingCenter:  Jet Propulsion Laboratory / California Ins...
    InputOriginalFile:             All available original source data for the...
    DataQuality:                   All data screening according to the specif...
    DataSource:                    SAGE-I v5.9_rev, SAGE-II v6.2, HALOE v19, ...
    DataSubset:                    Full Dataset

Load some standard predictors and add a constant

[4]:
predictors = load_data('pred_baseline_pwlt.csv')

print(predictors.columns)
Index(['enso', 'solar', 'qboA', 'qboB', 'aod', 'linear_pre', 'linear_post',
       'constant'],
      dtype='object')

Currently our predictors have no seasonal dependence. Add in some seasonal terms with different numbers of Fourier components.

[5]:
predictors = add_seasonal_components(predictors, {'constant': 4, 'linear_pre': 4, 'linear_post': 4, 'qboA': 2, 'qboB': 2})

print(predictors[:5])
                enso     solar      qboA      qboB       aod  linear_pre  \
time
1979-01-01  0.534698  1.578955 -1.067789  0.952231 -0.411760   -1.800000
1979-02-01  0.351147  1.633527 -0.831698  0.892204 -0.412330   -1.791667
1979-03-01  0.004440  1.318208 -1.168007  0.628544 -0.413462   -1.783333
1979-04-01  0.269569  1.129935 -1.229261  0.561389 -0.416360   -1.775000
1979-05-01  0.330752  1.002783 -1.226434  0.052783 -0.420368   -1.766667

            linear_post  constant  qboA_sin0  qboA_cos0  ...  \
time                                                     ...
1979-01-01          0.0       1.0  -0.000000  -1.067789  ...
1979-02-01          0.0       1.0  -0.422799  -0.716214  ...
1979-03-01          0.0       1.0  -0.992164  -0.616320  ...
1979-04-01          0.0       1.0  -1.228947  -0.027752  ...
1979-05-01          0.0       1.0  -1.080099   0.580970  ...

            linear_post_sin3  linear_post_cos3  constant_sin0  constant_cos0  \
time
1979-01-01               0.0               0.0       0.000000       1.000000
1979-02-01               0.0              -0.0       0.508356       0.861147
1979-03-01              -0.0              -0.0       0.849450       0.527668
1979-04-01              -0.0               0.0       0.999745       0.022576
1979-05-01               0.0              -0.0       0.880683      -0.473706

            constant_sin1  constant_cos1  constant_sin2  constant_cos2  \
time
1979-01-01       0.000000       1.000000       0.000000       1.000000
1979-02-01       0.875539       0.483147       0.999579      -0.029025
1979-03-01       0.896456      -0.443132       0.096613      -0.995322
1979-04-01       0.045141      -0.998981      -0.997707      -0.067683
1979-05-01      -0.834370      -0.551205      -0.090190       0.995925

            constant_sin3  constant_cos3
time
1979-01-01       0.000000       1.000000
1979-02-01       0.846029      -0.533137
1979-03-01      -0.794497      -0.607268
1979-04-01      -0.090190       0.995925
1979-05-01       0.919817      -0.392347

[5 rows x 40 columns]

Perform the regression and convert the coefficients to percent anomaly. We pass include_monthly_fits = True so that the seasonal fits are used to calculate monthly trends. The results at the end will include ‘linear_post_monthly’ and ‘linear_post_monthly_std’ that are the monthly trend terms and errors respectively

[6]:
data = data.sel(lev=slice(100, 0.2)).sel(lat=slice(-65, 65))   # remove bins without data
results = regress_all_bins(predictors, data['average'], tolerance=0.1, include_monthly_fits=True)

# Convert to ~ percent
results /= data['average'].mean(dim='time')
results *= 100

print(results)
<xarray.Dataset> Size: 327kB
Dimensions:                  (lev: 17, lat: 12, month: 12)
Coordinates:
  * lev                      (lev) float32 68B 100.0 68.13 ... 0.3162 0.2154
  * lat                      (lat) float32 48B -55.0 -45.0 -35.0 ... 45.0 55.0
  * month                    (month) int64 96B 1 2 3 4 5 6 7 8 9 10 11 12
Data variables: (12/90)
    enso                     (lev, lat) float64 2kB dask.array<chunksize=(17, 12), meta=np.ndarray>
    enso_std                 (lev, lat) float64 2kB dask.array<chunksize=(17, 12), meta=np.ndarray>
    solar                    (lev, lat) float64 2kB dask.array<chunksize=(17, 12), meta=np.ndarray>
    solar_std                (lev, lat) float64 2kB dask.array<chunksize=(17, 12), meta=np.ndarray>
    qboA                     (lev, lat) float64 2kB dask.array<chunksize=(17, 12), meta=np.ndarray>
    qboA_std                 (lev, lat) float64 2kB dask.array<chunksize=(17, 12), meta=np.ndarray>
    ...                       ...
    constant_cos2            (lev, lat) float64 2kB dask.array<chunksize=(17, 12), meta=np.ndarray>
    constant_cos2_std        (lev, lat) float64 2kB dask.array<chunksize=(17, 12), meta=np.ndarray>
    constant_sin3            (lev, lat) float64 2kB dask.array<chunksize=(17, 12), meta=np.ndarray>
    constant_sin3_std        (lev, lat) float64 2kB dask.array<chunksize=(17, 12), meta=np.ndarray>
    constant_cos3            (lev, lat) float64 2kB dask.array<chunksize=(17, 12), meta=np.ndarray>
    constant_cos3_std        (lev, lat) float64 2kB dask.array<chunksize=(17, 12), meta=np.ndarray>
[7]:
import LOTUS_regression.plotting.trends as trends
import matplotlib.pyplot as plt

# Plot the seasonal post trends at the level closest to 2 hPa (2.15 hPa)
trends.plot_with_confidence(results.sel(lev=2, method='nearest'), 'linear_post_monthly', x='lat', y='month')
plt.xlabel('Latitude')
plt.ylabel('Month')
[7]:
Text(0, 0.5, 'Month')
../_images/examples_GOZCARDS_Trends_Seasonal_11_1.png
Previous Next

© Copyright 2024, USask ARG and the LOTUS Group.

Built with Sphinx using a theme provided by Read the Docs.