climate.climate_data
Provides classes for generating and analyzing complex climate networks.
-
class pyunicorn.climate.climate_data.ClimateData(observable, grid, time_cycle, anomalies=False, observable_name='', observable_long_name=None, window=None, silence_level=0)
Bases: pyunicorn.core.data.Data
Encapsulates spatio-temporal climate data.
Provides methods to manipulate this data, e.g. to calculate daily or monthly mean values and anomaly values.
Instance variable data_source (str): the name of the data source (model, reanalysis, station).
-
classmethod Load(file_name, observable_name, time_cycle, time_name='time', latitude_name='lat', longitude_name='lon', data_source=None, file_type='NetCDF', window=None, vertical_level=None, silence_level=0)
Initialize an instance of ClimateData.
Supported values of file_type are:
- "NetCDF" for regular (rectangular) grids
- "iNetCDF" for irregular (e.g. geodesic) grids or station data.
The spatio-temporal window is described by the following dictionary:
window = {"time_min": 0., "time_max": 0., "lat_min": 0., "lat_max": 0., "lon_min": 0., "lon_max": 0.}
Parameters:
- file_name (str) – The name of the data file.
- observable_name (str) – The short name of the observable within the data file (particularly relevant for NetCDF).
- time_cycle (int) – The annual cycle length of the data (units of samples).
- time_name (str) – The name of the time variable within the data file.
- latitude_name (str) – The name of the latitude variable within the data file.
- longitude_name (str) – The name of the longitude variable within the data file.
- data_source (str) – The name of the data source (model, reanalysis, station).
- file_type (str) – The format of the data file.
- window (dict) – Spatio-temporal window to select a view on the data.
- vertical_level (int) – The vertical level to be extracted from the data file. Ignored for horizontal data sets. If None, the first level in the data file is chosen.
- silence_level (int) – The inverse level of verbosity of the object.
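As a rough illustration of how these arguments fit together, the sketch below assembles the keyword arguments for a Load call. The file name `air.mon.mean.nc`, the observable name `air`, and all window values are hypothetical placeholders. The call itself is left commented out because it needs pyunicorn and a real data file.

```python
# Keyword arguments for ClimateData.Load, using the documented parameter names.
# All values are hypothetical placeholders chosen for illustration.
window = {
    "time_min": 0.0, "time_max": 0.0,   # placeholder temporal bounds
    "lat_min": -30.0, "lat_max": 30.0,  # placeholder latitude band
    "lon_min": 0.0, "lon_max": 360.0,   # placeholder longitude range
}

load_kwargs = {
    "file_name": "air.mon.mean.nc",  # hypothetical file
    "observable_name": "air",        # hypothetical variable name in the file
    "time_cycle": 12,                # monthly samples: 12 per annual cycle
    "time_name": "time",
    "latitude_name": "lat",
    "longitude_name": "lon",
    "data_source": "reanalysis",
    "file_type": "NetCDF",           # regular (rectangular) grid
    "window": window,
    "vertical_level": None,          # None: first level in the file
    "silence_level": 0,
}

# With pyunicorn installed and the file present, the call would be:
# from pyunicorn.climate.climate_data import ClimateData
# data = ClimateData.Load(**load_kwargs)
print(sorted(load_kwargs))
```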
-
static SmallTestData()
Return test data set of 6 time series with 10 sampling points each.
Example:
>>> r(Data.SmallTestData().observable())
array([[ 0.    ,  1.    ,  0.    , -1.    , -0.    ,  1.    ],
       [ 0.309 ,  0.9511, -0.309 , -0.9511,  0.309 ,  0.9511],
       [ 0.5878,  0.809 , -0.5878, -0.809 ,  0.5878,  0.809 ],
       [ 0.809 ,  0.5878, -0.809 , -0.5878,  0.809 ,  0.5878],
       [ 0.9511,  0.309 , -0.9511, -0.309 ,  0.9511,  0.309 ],
       [ 1.    ,  0.    , -1.    , -0.    ,  1.    ,  0.    ],
       [ 0.9511, -0.309 , -0.9511,  0.309 ,  0.9511, -0.309 ],
       [ 0.809 , -0.5878, -0.809 ,  0.5878,  0.809 , -0.5878],
       [ 0.5878, -0.809 , -0.5878,  0.809 ,  0.5878, -0.809 ],
       [ 0.309 , -0.9511, -0.309 ,  0.9511,  0.309 , -0.9511]])
Return type: ClimateData instance
Returns: a ClimateData instance for testing purposes.
-
__init__(observable, grid, time_cycle, anomalies=False, observable_name='', observable_long_name=None, window=None, silence_level=0)
Initialize an instance of ClimateData.
The spatio-temporal window is described by the following dictionary:
window = {"time_min": 0., "time_max": 0., "lat_min": 0., "lat_max": 0., "lon_min": 0., "lon_max": 0.}
Parameters:
- observable (2D array [time, index]) – The array of time series to be represented by the Data instance.
- grid (Grid instance) – The Grid representing the spatial coordinates associated with the time series and their temporal sampling.
- time_cycle (int) – The annual cycle length of the data (units of samples).
- anomalies (bool) – Indicates whether the data are climatological anomaly values.
- observable_name (str) – A short name for the observable.
- observable_long_name (str) – A long name for the observable.
- window (dict) – Spatio-temporal window to select a view on the data.
- silence_level (int) – The inverse level of verbosity of the object.
-
_calculate_anomaly()
Calculate anomaly time series from observable.
To obtain climatological anomaly time series, the climatological means are subtracted from each sample in the original time series. This procedure is also known as phase averaging.
Note
Only the currently selected spatio-temporal window is considered.
Return type: 2D NumPy array [time, node index]
Returns: the anomalized time series.
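The subtraction described above can be sketched in plain Python. This is only an illustration, not pyunicorn's implementation: the function name `anomaly_sketch`, the list-of-lists layout, and the restriction to complete cycles are assumptions. The input values are regenerated from a formula inferred from the printed SmallTestData output, with a cycle length of 5 inferred from the documented phase_indices() result.

```python
import math

# Stand-in for the documented test data: 10 samples x 6 nodes.
# The generating formula is inferred from the printed values (assumption).
data = [[math.sin(t * math.pi / 10 + n * math.pi / 2) for n in range(6)]
        for t in range(10)]
time_cycle = 5  # inferred, not read from the library


def anomaly_sketch(series, cycle):
    """Subtract the per-phase (climatological) mean from every sample."""
    n_years = len(series) // cycle  # assumption: only complete cycles count
    n_nodes = len(series[0])
    # Climatological mean for each phase of the cycle and each node.
    clim = [[sum(series[p + y * cycle][n] for y in range(n_years)) / n_years
             for n in range(n_nodes)]
            for p in range(cycle)]
    # Each sample minus the mean of its own phase.
    return [[series[t][n] - clim[t % cycle][n] for n in range(n_nodes)]
            for t in range(n_years * cycle)]


anomalies = anomaly_sketch(data, time_cycle)
print([round(row[0], 4) for row in anomalies])
# [-0.5, -0.321, -0.1106, 0.1106, 0.321, 0.5, 0.321, 0.1106, -0.1106, -0.321]
```

By construction, the anomalies of each node sum to zero within every phase, which is a quick sanity check for any implementation of this step.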
-
_calculate_phase_mean()
Calculate mean values of observable for each phase of the annual cycle.
This is also commonly referred to as climatological mean, e.g., the mean temperature for all Januaries in the data set for monthly time resolution (time_cycle=12).
Note
Only the currently selected spatio-temporal window is considered.
Return type: 2D NumPy array [cycle index, node index]
Returns: the mean values of observable for each phase of the annual cycle.
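Written out as a formula, the phase mean and the anomaly derived from it might look as follows. The symbols are introduced here for illustration only: x_{t,i} is the observable at time index t and node i, T is time_cycle, and Y is the number of complete cycles in the selected window. That only complete cycles contribute is an assumption.

```latex
\bar{x}_{p,i} = \frac{1}{Y} \sum_{y=0}^{Y-1} x_{p + yT,\, i},
\qquad p = 0, \dots, T-1

a_{t,i} = x_{t,i} - \bar{x}_{t \bmod T,\, i}
```

For the first node of the small test data, with T = 5 and Y = 2, the first phase mean is (0 + 1)/2 = 0.5, which matches the first entry of the documented phase_mean() output.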
-
anomaly()
Return anomaly time series from observable.
For further comments, see _calculate_anomaly().
Note
Only the currently selected spatio-temporal window is considered.
Example:
>>> r(ClimateData.SmallTestData().anomaly()[:,0])
array([-0.5   , -0.321 , -0.1106,  0.1106,  0.321 ,  0.5   ,  0.321 ,
        0.1106, -0.1106, -0.321 ])
Return type: 2D NumPy array [time, node index]
Returns: the anomalized time series.
-
anomaly_selected_months(selected_months)
Return anomaly time series from observable for selected months.
For further comments, see _calculate_anomaly().
Note
Only the currently selected spatio-temporal window is considered.
Parameters: selected_months ([number]) – The selected months.
Return type: 2D array [time, node index]
Returns: the anomalized time series for selected months.
-
clear_cache()
Clean up cache.
This is reversible, since all cached information can be recalculated from the basic data.
-
indices_selected_months(selected_months)
Return sorted time indices associated with certain months.
Currently, only cycle lengths of 12 (monthly data) and 360 (standardized daily data) are supported.
Note
Only the currently selected spatio-temporal window is considered.
Parameters: selected_months ([number]) – The selected months.
Return type: 1D array (int)
Returns: the sorted time indices corresponding to chosen months.
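One way to picture the two supported cycle lengths is the sketch below. It is not the library's code: the function name is hypothetical, and it assumes that months are numbered from 0 and that, for a cycle length of 360, each month covers 30 consecutive samples.

```python
def indices_selected_months_sketch(n_time, cycle, selected_months):
    """Sorted time indices that fall into the given months (0-based, assumed)."""
    if cycle == 12:
        # Monthly data: one phase per month.
        phases = list(selected_months)
    elif cycle == 360:
        # Standardized daily data: assume 30 samples per month.
        phases = [30 * m + d for m in selected_months for d in range(30)]
    else:
        raise ValueError("only cycle lengths 12 and 360 are handled")
    n_years = n_time // cycle  # assumption: complete cycles only
    return sorted(p + y * cycle for p in phases for y in range(n_years))


# Three years of monthly data; select months 0, 1 and 11.
print(indices_selected_months_sketch(36, 12, [0, 1, 11]))
# [0, 1, 11, 12, 13, 23, 24, 25, 35]
```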
-
indices_selected_phases(selected_phases)
Return sorted time indices associated with certain phase indices.
Note
Only the currently selected spatio-temporal window is considered.
Example:
>>> ClimateData.SmallTestData().indices_selected_phases([0,1,4])
array([0, 1, 4, 5, 6, 9])
Parameters: selected_phases ([int]) – The selected phase indices.
Return type: 1D array (int)
Returns: the sorted time indices corresponding to chosen phase indices.
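The example above can be reproduced with a few lines of plain Python. The function below is a hypothetical stand-in that takes the number of samples and the cycle length explicitly. The cycle length of 5 for the test data is inferred from the documented output, and only complete cycles are assumed to contribute.

```python
def indices_selected_phases_sketch(n_time, cycle, selected_phases):
    """Sorted time indices whose phase within the cycle is in selected_phases."""
    n_years = n_time // cycle  # assumption: complete cycles only
    return sorted(p + y * cycle
                  for p in selected_phases
                  for y in range(n_years))


# 10 samples with an assumed cycle length of 5, phases 0, 1 and 4.
print(indices_selected_phases_sketch(10, 5, [0, 1, 4]))
# [0, 1, 4, 5, 6, 9]
```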
-
phase_indices()
Return time indices associated with all phases in the annual cycle.
In other words, provides all time indices falling into a particular day, month etc. of the year.
Only measurements from years for which complete data exists are included.
Note
Only the currently selected spatio-temporal window is considered.
Example:
>>> ClimateData.SmallTestData().phase_indices()
array([[0, 5],
       [1, 6],
       [2, 7],
       [3, 8],
       [4, 9]])
Return type: 2D NumPy array (int) [phase index, year]
Returns: the time indices associated with all phases of the annual cycle.
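The layout of this index array, one row per phase and one column per complete year, can be sketched as follows. The function name and its explicit arguments are hypothetical, and nested lists stand in for the NumPy array.

```python
def phase_indices_sketch(n_time, cycle):
    """Time indices grouped by phase; incomplete trailing years are dropped."""
    n_years = n_time // cycle
    return [[phase + year * cycle for year in range(n_years)]
            for phase in range(cycle)]


# 10 samples with a cycle length of 5, as in the example above.
print(phase_indices_sketch(10, 5))
# [[0, 5], [1, 6], [2, 7], [3, 8], [4, 9]]

# 12 samples: the two leftover samples do not form a complete year.
print(phase_indices_sketch(12, 5))
# [[0, 5], [1, 6], [2, 7], [3, 8], [4, 9]]
```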
-
phase_mean()
Return mean values of observable for each phase of the annual cycle.
For further comments, see _calculate_phase_mean().
Note
Only the currently selected spatio-temporal window is considered.
Example:
>>> r(ClimateData.SmallTestData().phase_mean())
array([[ 0.5   ,  0.5   , -0.5   , -0.5   ,  0.5   ,  0.5   ],
       [ 0.63  ,  0.321 , -0.63  , -0.321 ,  0.63  ,  0.321 ],
       [ 0.6984,  0.1106, -0.6984, -0.1106,  0.6984,  0.1106],
       [ 0.6984, -0.1106, -0.6984,  0.1106,  0.6984, -0.1106],
       [ 0.63  , -0.321 , -0.63  ,  0.321 ,  0.63  , -0.321 ]])
Return type: 2D NumPy array [cycle index, node index]
Returns: the mean values of observable for each phase of the annual cycle.
-
set_global_window()
Set the view on the whole data set.
Selects the full data set and creates a data array as well as a corresponding Grid object to access this window from outside.
Example (set a smaller window and subsequently restore the global window):
>>> data = ClimateData.SmallTestData()
>>> data.set_window(window={"time_min": 0., "time_max": 4.,
...                         "lat_min": 10., "lat_max": 20.,
...                         "lon_min": 5., "lon_max": 10.})
>>> data.grid.grid()["lat"]
array([ 10.,  15.], dtype=float32)
>>> data.set_global_window()
>>> data.grid.grid()["lat"]
array([  0.,   5.,  10.,  15.,  20.,  25.], dtype=float32)
-
set_window(window)
Set spatio-temporal window.
Calls the set_window method of the parent class Data and additionally sets flags so that measures derived from the data (mean, anomaly) are recalculated for the new window.
The spatio-temporal window is described by the following dictionary:
window = {"time_min": 0., "time_max": 0., "lat_min": 0., "lat_max": 0., "lon_min": 0., "lon_max": 0.}
If the temporal boundaries are equal, the data’s full time range is selected. If any of the two corresponding spatial boundaries are equal, the data’s full spatial extension is included.
For more information see pyunicorn.Data.set_window().
Example:
>>> data = ClimateData.SmallTestData()
>>> data.set_window(window={"time_min": 0., "time_max": 0.,
...                         "lat_min": 10., "lat_max": 20.,
...                         "lon_min": 5., "lon_max": 10.})
>>> r(data.anomaly())
array([[ 0.5   , -0.5   ],
       [ 0.321 , -0.63  ],
       [ 0.1106, -0.6984],
       [-0.1106, -0.6984],
       [-0.321 , -0.63  ],
       [-0.5   ,  0.5   ],
       [-0.321 ,  0.63  ],
       [-0.1106,  0.6984],
       [ 0.1106,  0.6984],
       [ 0.321 ,  0.63  ]])
Parameters: window (dictionary) – The spatio-temporal window to select a view on the data.
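The rule for equal spatial boundaries can be illustrated with a small node filter. This is a sketch under several assumptions, not the behaviour of Data.set_window itself: the function name is hypothetical, the bounds are treated as inclusive, "equal boundaries" is read as lat_min == lat_max or lon_min == lon_max, and the longitude values are invented. Only the latitudes are taken from the set_global_window example.

```python
def select_nodes_sketch(lat, lon, window):
    """Indices of nodes inside the spatial part of the window."""
    lat_min, lat_max = window["lat_min"], window["lat_max"]
    lon_min, lon_max = window["lon_min"], window["lon_max"]
    # Equal bounds in either dimension: keep the full spatial extent.
    if lat_min == lat_max or lon_min == lon_max:
        return list(range(len(lat)))
    # Otherwise keep nodes inside both ranges (inclusive bounds assumed).
    return [i for i in range(len(lat))
            if lat_min <= lat[i] <= lat_max and lon_min <= lon[i] <= lon_max]


lat = [0.0, 5.0, 10.0, 15.0, 20.0, 25.0]   # latitudes printed in the docs
lon = [2.5, 5.0, 7.5, 10.0, 12.5, 15.0]    # hypothetical longitudes

window = {"time_min": 0.0, "time_max": 0.0,
          "lat_min": 10.0, "lat_max": 20.0,
          "lon_min": 5.0, "lon_max": 10.0}
nodes = select_nodes_sketch(lat, lon, window)
print([lat[i] for i in nodes])  # [10.0, 15.0]

# Equal longitude bounds fall back to all nodes.
full = select_nodes_sketch(lat, lon, dict(window, lon_min=0.0, lon_max=0.0))
print(len(full))  # 6
```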
-
shuffled_anomaly()
Return the randomly shuffled anomaly time series.
Each anomaly time series is shuffled individually.
Note
Only the currently selected spatio-temporal window is considered.
Example (anomaly with and without temporal shuffling should have the same standard deviation along the time axis):
>>> r(ClimateData.SmallTestData().anomaly().std(axis=0))
array([ 0.31  ,  0.6355,  0.31  ,  0.6355,  0.31  ,  0.6355])
>>> r(ClimateData.SmallTestData().shuffled_anomaly().std(axis=0))
array([ 0.31  ,  0.6355,  0.31  ,  0.6355,  0.31  ,  0.6355])
Return type: 2D NumPy array [time, node index]
Returns: the anomalized and shuffled time series.
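Shuffling each series individually means that every node gets its own random permutation of time indices, so the set of values per node is unchanged and only their order differs. A minimal standard-library sketch of that idea follows. The function name, the `seed` argument, and the toy input values are assumptions for illustration.

```python
import math
import random
import statistics


def shuffle_each_series(series, seed=None):
    """Permute every column (node) of a [time][node] table independently."""
    rng = random.Random(seed)
    n_nodes = len(series[0])
    columns = []
    for n in range(n_nodes):
        column = [row[n] for row in series]
        rng.shuffle(column)  # separate permutation for each node
        columns.append(column)
    # Transpose back to the [time][node] layout.
    return [list(row) for row in zip(*columns)]


# Toy stand-in for an anomaly array: 10 time steps, 2 nodes.
anomalies = [[math.sin(t), math.cos(2 * t)] for t in range(10)]
shuffled = shuffle_each_series(anomalies, seed=42)

for n in range(2):
    before = statistics.pstdev([row[n] for row in anomalies])
    after = statistics.pstdev([row[n] for row in shuffled])
    # The population standard deviation does not depend on the order.
    print(math.isclose(before, after))
```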
-
time_cycle = None
(int) The annual cycle length of the data (units of samples).