CTD Tools API
CTD Tools Readers Module
This module provides various reader classes for importing CTD sensor data from different file formats into xarray Datasets. It includes a registry of available readers, allowing for lazy loading of specific reader classes based on the file format.
Available Readers:
NetCdfReader: Read NetCDF files
CsvReader: Read CSV files
RbrAsciiReader: Read RBR ASCII files
NortekAsciiReader: Read Nortek ASCII files
RbrRskLegacyReader: Read legacy RSK files
RbrRskReader: Read RSK files
RbrRskAutoReader: Auto-detect RSK format
SbeCnvReader: Read SeaBird CNV files
SeasunTobReader: Read Sea & Sun TOB files
Example Usage:
from ctd_tools.readers import SbeCnvReader, NetCdfReader
# Read a CNV file
reader = SbeCnvReader("data.cnv")
data = reader.get_data()
# Read a NetCDF file
nc_reader = NetCdfReader("data.nc")
nc_data = nc_reader.get_data()
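The lazy loading that the module description mentions could work roughly as follows. This is a minimal sketch, not the ctd_tools implementation: the `_REGISTRY` mapping and the `load_class` helper are hypothetical names, and standard-library modules stand in for the reader modules so that the snippet runs on its own.

```python
import importlib

# Hypothetical registry: public class name -> module that defines it.
# Standard-library modules stand in for reader modules here.
_REGISTRY = {
    "Fraction": "fractions",
    "OrderedDict": "collections",
}
_loaded = {}  # cache of classes that have already been imported


def load_class(name):
    """Import the module for `name` on first use and return the class."""
    if name not in _REGISTRY:
        raise KeyError(f"no reader registered under {name!r}")
    if name not in _loaded:
        module = importlib.import_module(_REGISTRY[name])
        _loaded[name] = getattr(module, name)
    return _loaded[name]


# The import happens on this call, not when the registry is defined.
fraction_cls = load_class("Fraction")
print(fraction_cls(1, 2))
```

The point of such a design is that a format with heavy dependencies costs nothing until a file of that format is actually opened.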
- class ctd_tools.readers.AbstractReader(input_file: str, mapping: dict | None = None, input_header_file: str | None = None, perform_default_postprocessing: bool = True, rename_variables: bool = True, assign_metadata: bool = True, sort_variables: bool = True)[source]
Abstract base class for reading sensor data.
Must be subclassed to implement specific file format readers.
- input_file
The path to the input file containing sensor data.
- Type:
str
- data
The processed sensor data as an xarray Dataset, or None if not yet processed.
- Type:
xr.Dataset | None
- mapping
A dictionary mapping names used in the input file to standard names.
- Type:
dict, optional
- perform_default_postprocessing
Whether to perform default post-processing on the data.
- Type:
bool
- rename_variables
Whether to rename xarray variables to standard names.
- Type:
bool
- assign_metadata
Whether to assign metadata to xarray variables.
- Type:
bool
- sort_variables
Whether to sort xarray variables by name.
- Type:
bool
- __init__(input_file: str, mapping: dict | None = None, perform_default_postprocessing: bool = True, rename_variables: bool = True, assign_metadata: bool = True, sort_variables: bool = True)
Initializes the reader with the input file and optional mapping.
- _perform_default_postprocessing(ds: xr.Dataset) xr.Dataset [source]
Performs default post-processing on the xarray Dataset.
- assign_metadata = True
- abstract static file_extension() str | None [source]
Get the file extension for this reader.
This static method must be implemented by all subclasses.
Returns:
- str
The file extension (e.g., '.cnv', '.tob', '.rsk').
Raises:
- NotImplementedError:
If the subclass does not implement this method.
- abstract static format_key() str [source]
Get the format key for this reader.
This static method must be implemented by all subclasses.
Returns:
- str
The format key (e.g., 'sbe-cnv', 'nortek-ascii', 'rbr-rsk').
Raises:
- NotImplementedError:
If the subclass does not implement this method.
- abstract static format_name() str [source]
Get the format name for this reader.
This static method must be implemented by all subclasses.
Returns:
- str
The format name (e.g., 'SeaBird CNV', 'Nortek ASCII', 'RBR RSK').
Raises:
- NotImplementedError:
If the subclass does not implement this method.
- perform_default_postprocessing = True
- rename_variables = True
- sort_variables = True
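To illustrate how the three abstract static methods fit together, here is a hedged sketch of a subclass and an extension-based lookup. The `AbstractReader` below is a simplified stand-in written for this example (the real class takes more arguments), and `DemoTxtReader`, the '.demo' format, and `reader_for` are invented names.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class AbstractReader(ABC):
    """Simplified stand-in for the documented base class (not the real one)."""

    def __init__(self, input_file):
        self.input_file = input_file

    @staticmethod
    @abstractmethod
    def format_key():
        raise NotImplementedError

    @staticmethod
    @abstractmethod
    def format_name():
        raise NotImplementedError

    @staticmethod
    @abstractmethod
    def file_extension():
        raise NotImplementedError


class DemoTxtReader(AbstractReader):
    """Hypothetical reader for an invented '.demo' format."""

    @staticmethod
    def format_key():
        return "demo-txt"

    @staticmethod
    def format_name():
        return "Demo Text"

    @staticmethod
    def file_extension():
        return ".demo"


def reader_for(path, readers):
    """Return the first reader class whose extension matches `path`."""
    suffix = Path(path).suffix.lower()
    for cls in readers:
        if cls.file_extension() == suffix:
            return cls
    raise ValueError(f"no reader for extension {suffix!r}")


reader_cls = reader_for("cast_001.demo", [DemoTxtReader])
print(reader_cls.format_key(), reader_cls.format_name())
```

Because the methods are static, the lookup can inspect reader classes without opening any file or creating an instance.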
CTD Tools Writers Module
This module provides various writer classes for exporting CTD sensor data from xarray Datasets to different file formats.
Available Writers:
NetCdfWriter: Export to NetCDF format
CsvWriter: Export to CSV format
ExcelWriter: Export to Excel format
Example Usage:
from ctd_tools.writers import NetCdfWriter, CsvWriter, ExcelWriter
# Write to NetCDF
writer = NetCdfWriter(data)
writer.write("output.nc")
# Write to CSV
csv_writer = CsvWriter(data)
csv_writer.write("output.csv")
# Write to Excel
excel_writer = ExcelWriter(data)
excel_writer.write("output.xlsx")
- class ctd_tools.writers.AbstractWriter(data: Dataset)[source]
Abstract base class for writing sensor data from xarray Datasets.
This class provides a common interface for all writer implementations. All concrete writer classes should inherit from this class and implement the write method.
Attributes:
- data : xr.Dataset
The xarray Dataset containing the sensor data to be written.
Methods:
- __init__(data: xr.Dataset):
Initializes the writer with the provided xarray Dataset.
- file_extension: str
The default file extension for this writer (to be implemented by subclasses).
- data: xr.Dataset
The xarray Dataset containing the sensor data.
- data.setter(value: xr.Dataset):
Sets the xarray Dataset with validation.
- write(file_name: str, **kwargs):
Writes the xarray Dataset to a file (to be implemented by subclasses).
Raises:
- NotImplementedError:
If the subclass does not implement the write method or the file_extension property.
- TypeError:
If the provided data is not an xarray Dataset.
- property data: Dataset
Get the xarray Dataset.
Returns:
- xr.Dataset
The xarray Dataset containing the sensor data.
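The validated `data` setter described above follows a common property pattern. The sketch below shows that pattern under stated assumptions: the `Dataset` class is an empty stand-in for `xr.Dataset` so the code runs without xarray, and `Writer` is a hypothetical class, not the library's `AbstractWriter`.

```python
class Dataset:
    """Stand-in for xr.Dataset so the sketch runs without xarray."""


class Writer:
    """Hypothetical minimal writer holding a validated dataset."""

    def __init__(self, data):
        self.data = data  # assignment goes through the setter below

    @property
    def data(self):
        return self._data

    @data.setter
    def data(self, value):
        if not isinstance(value, Dataset):
            raise TypeError(
                f"data must be a Dataset, got {type(value).__name__}"
            )
        self._data = value


writer = Writer(Dataset())
try:
    writer.data = [1, 2, 3]  # wrong type, rejected by the setter
except TypeError as err:
    print(err)
```

Routing the constructor through the setter means the type check applies both at construction time and on later reassignment.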
- class ctd_tools.writers.CsvWriter(data: Dataset)[source]
Writes sensor data from an xarray Dataset to a CSV file.
This class saves sensor data in CSV format, a common format for tabular data. The data must be provided as an xarray Dataset.
- Example usage:
writer = CsvWriter(data)
writer.write("output_file.csv")
Attributes:
- data : xr.Dataset
The xarray Dataset containing the sensor data to be written to a CSV file.
Methods:
- __init__(data: xr.Dataset):
Initializes the CsvWriter with the provided xarray Dataset.
- write(file_name: str, coordinate = params.TIME):
Writes the xarray Dataset to a CSV file with the specified file name and coordinate. The coordinate parameter specifies which coordinate to use for selecting the data.
- file_extension: str
The default file extension for this writer, which is ‘.csv’.
- static file_extension() str [source]
Get the default file extension for this writer.
Returns:
- str
The file extension for CSV files, which is ‘.csv’.
- write(file_name: str, coordinate='time', **kwargs)[source]
Writes the xarray Dataset to a CSV file with the specified file name and coordinate.
Parameters:
- file_name (str):
The name of the output CSV file where the data will be saved.
- coordinate (str):
The coordinate to use for selecting the data. Default is params.TIME. This should be a valid coordinate present in the xarray Dataset.
- **kwargs:
Additional keyword arguments (unused in this implementation).
- class ctd_tools.writers.ExcelWriter(data: Dataset)[source]
Writes sensor data from an xarray Dataset to an Excel file.
This class saves sensor data in Excel format, which is commonly used for tabular data. The data must be provided as an xarray Dataset.
- Example usage:
writer = ExcelWriter(data)
writer.write("output_file.xlsx")
Attributes:
- data : xr.Dataset
The xarray Dataset containing the sensor data to be written to an Excel file.
Methods:
- __init__(data: xr.Dataset):
Initializes the ExcelWriter with the provided xarray Dataset.
- write(file_name: str, coordinate = params.TIME):
Writes the xarray Dataset to an Excel file with the specified file name and coordinate. The coordinate parameter specifies which coordinate to use for selecting the data.
- file_extension: str
The default file extension for this writer, which is ‘.xlsx’.
- static file_extension() str [source]
Get the default file extension for this writer.
Returns:
- str
The file extension for Excel files, which is ‘.xlsx’.
- write(file_name: str, coordinate='time', **kwargs)[source]
Writes the xarray Dataset to an Excel file with the specified file name and coordinate.
Parameters:
- file_name (str):
The name of the output Excel file where the data will be saved.
- coordinate (str):
The coordinate to use for selecting the data. Default is params.TIME. This should be a valid coordinate present in the xarray Dataset.
- **kwargs:
Additional keyword arguments (unused in this implementation).
Raises:
- ValueError:
If the provided coordinate is not found in the dataset.
- class ctd_tools.writers.NetCdfWriter(data: Dataset)[source]
Writes sensor data from an xarray Dataset to a netCDF file.
This class saves sensor data in netCDF format, which is commonly used for storing large datasets, especially in oceanography and environmental science. The data must be provided as an xarray Dataset.
- Example usage:
writer = NetCdfWriter(data)
writer.write("output_file.nc")
Attributes:
- data : xr.Dataset
The xarray Dataset containing the sensor data to be written to a netCDF file.
Methods:
- __init__(data: xr.Dataset):
Initializes the NetCdfWriter with the provided xarray Dataset.
- write(file_name: str):
Writes the xarray Dataset to a netCDF file with the specified file name.
- file_extension: str
The default file extension for this writer, which is ‘.nc’.
CTD Tools Plotters Module
This module provides various plotter classes for visualizing CTD sensor data from xarray Datasets using matplotlib.
Available Plotters:
TsDiagramPlotter: Create T-S (Temperature-Salinity) diagrams with density isolines
ProfilePlotter: Create vertical CTD profiles for temperature and salinity
TimeSeriesPlotter: Create time series plots for any parameter
Example Usage:
from ctd_tools.plotters import TsDiagramPlotter, ProfilePlotter, TimeSeriesPlotter
# Create a T-S diagram
ts_plotter = TsDiagramPlotter(data)
ts_plotter.plot(title="Station 001 T-S Diagram", output_file="ts_diagram.png")
# Create a vertical profile
profile_plotter = ProfilePlotter(data)
profile_plotter.plot(title="CTD Profile", output_file="profile.png")
# Create a time series plot
time_plotter = TimeSeriesPlotter(data)
time_plotter.plot("temperature", title="Temperature Time Series", output_file="temp_series.png")
- class ctd_tools.plotters.AbstractPlotter(data: Dataset | None = None)[source]
Abstract base class for plotting sensor data from xarray Datasets.
This class provides a common interface for all plotter implementations. All concrete plotter classes should inherit from this class and implement the plot method.
Attributes:
- data : xr.Dataset
The xarray Dataset containing the sensor data to be plotted.
Methods:
- __init__(data: xr.Dataset):
Initializes the plotter with the provided xarray Dataset.
- data: xr.Dataset
The xarray Dataset containing the sensor data.
- data.setter(value: xr.Dataset):
Sets the xarray Dataset with validation.
- plot(**kwargs):
Creates the plot (to be implemented by subclasses).
- _get_dataset_without_nan() -> xr.Dataset:
Returns dataset with NaN values removed from time dimension.
- _validate_required_variables(required_vars: list):
Validates that required variables exist in the dataset.
Raises:
- NotImplementedError:
If the subclass does not implement the plot method.
- TypeError:
If the provided data is not an xarray Dataset.
- ValueError:
If required variables are missing from the dataset.
- property data: Dataset | None
Get the xarray Dataset containing the sensor data.
Returns:
- xr.Dataset | None
The xarray Dataset containing the sensor data.
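The required-variable check listed under Methods can be pictured as below. This is a sketch under assumptions, not the library's `_validate_required_variables`: the function is a free-standing, hypothetical version, and a plain dict stands in for the xarray Dataset (both support the `name in ds` membership test).

```python
def validate_required_variables(dataset, required_vars):
    """Raise ValueError naming every required variable that is absent."""
    missing = [name for name in required_vars if name not in dataset]
    if missing:
        raise ValueError(f"Missing required variables: {', '.join(missing)}")


# A plain dict stands in for an xarray Dataset here.
profile = {"temperature": [10.2, 9.8], "depth": [1.0, 2.0]}

validate_required_variables(profile, ["temperature", "depth"])  # passes silently

try:
    validate_required_variables(profile, ["temperature", "salinity", "depth"])
except ValueError as err:
    print(err)
```

Collecting all missing names before raising lets the caller fix every problem in one pass rather than one error at a time.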
- class ctd_tools.plotters.ProfilePlotter(data: Dataset | None = None)[source]
Creates vertical CTD profiles showing temperature and salinity vs depth.
This class specializes in creating vertical profile plots with depth on the y-axis and temperature/salinity on separate x-axes.
Attributes:
- data : xr.Dataset
The xarray Dataset containing the sensor data to be plotted.
Methods:
- plot(output_file=None, title='Salinity and Temperature Profiles', show_grid=True, dot_size=3, show_lines_between_dots=True):
Creates and displays/saves the vertical profile plot.
- plot(output_file: str | None = None, title: str = 'Salinity and Temperature Profiles', show_grid: bool = True, dot_size: int = 3, show_lines_between_dots: bool = True, *args, **kwargs)[source]
Creates a vertical CTD profile plot.
Parameters:
- output_file : str, optional
Path to save the plot. If None, the plot is displayed.
- title : str, default 'Salinity and Temperature Profiles'
Title for the plot.
- show_grid : bool, default True
Whether to show grid lines on the plot.
- dot_size : int, default 3
Size of the scatter plot markers.
- show_lines_between_dots : bool, default True
Whether to connect data points with lines.
- **kwargs : dict
Additional keyword arguments (for compatibility).
Raises:
- ValueError:
If required variables (temperature, salinity, depth) are missing.
- class ctd_tools.plotters.TimeSeriesPlotter(data: Dataset | None = None)[source]
Creates time series plots for any parameter in the CTD dataset.
This class specializes in creating time series plots showing how a specific parameter varies over time.
Attributes:
- data : xr.Dataset
The xarray Dataset containing the sensor data to be plotted.
Methods:
- plot(parameter_name, output_file=None, ylim_min=None, ylim_max=10, xlim_min=None, xlim_max=None):
Creates and displays/saves the time series plot.
- plot(*args, **kwargs)[source]
Creates a time series plot for a given parameter.
Parameters:
Raises:
- ValueError:
If the parameter_name is not found in the dataset or time data is missing.
- plot_parameter(parameter_name: str, output_file: str | None = None, ylim_min: float | None = None, ylim_max: float | None = None)[source]
Convenience method with explicit parameters for better IDE support.
Parameters:
- parameter_name : str
Name of the parameter to plot (must exist in the dataset).
- output_file : str, optional
Path to save the plot. If None, the plot is displayed.
- ylim_min : float, optional
Minimum value for the y-axis. If None, auto-scaled.
- ylim_max : float, optional
Maximum value for the y-axis. If None, auto-scaled.
- class ctd_tools.plotters.TimeSeriesPlotterMulti(data: Dataset | None = None)[source]
Creates time series plots for multiple parameters in the CTD dataset.
This class specializes in creating time series plots showing how multiple parameters vary over time. It supports:
- Multiple parameters on the same y-axis
- Multiple parameters on dual y-axes (left/right)
- Automatic unit-based grouping
- Custom styling for each parameter
- Data normalization for comparison
- Single parameter plotting (for consistency)
Attributes:
- data : xr.Dataset
The xarray Dataset containing the sensor data to be plotted.
Methods:
- plot(parameter_names, output_file=None, dual_axis=False, left_params=None, right_params=None, normalize=False, **kwargs):
Creates and displays/saves the time series plot for multiple parameters.
- plot_single_parameter(parameter_name, …):
Convenience method for single parameter plotting.
- plot_multiple_parameters(parameter_names, …):
Convenience method for multi-parameter plotting with explicit parameters.
- plot(*args, **kwargs)[source]
Creates a time series plot for multiple parameters.
Parameters:
- *args : tuple
First argument can be parameter_names (str or List[str]).
- **kwargs : dict
Keyword arguments:
- parameter_names : str or List[str] - Parameter name(s) to plot
- output_file : str, optional - Path to save the plot
- dual_axis : bool, default False - Use dual y-axes for different units
- left_params : List[str], optional - Parameters for left y-axis
- right_params : List[str], optional - Parameters for right y-axis
- normalize : bool, default False - Normalize all parameters to 0-1 range
- colors : List[str], optional - Custom colors for each parameter
- line_styles : List[str], optional - Custom line styles
- ylim_left : Tuple[float, float], optional - (min, max) for left y-axis
- ylim_right : Tuple[float, float], optional - (min, max) for right y-axis
Raises:
- ValueError:
If parameters are not found in the dataset or time data is missing.
- plot_multiple_parameters(parameter_names: List[str], output_file: str | None = None, dual_axis: bool = False, left_params: List[str] | None = None, right_params: List[str] | None = None, normalize: bool = False, colors: List[str] | None = None, line_styles: List[str] | None = None, ylim_left: Tuple[float, float] | None = None, ylim_right: Tuple[float, float] | None = None)[source]
Convenience method for multi-parameter plotting with explicit parameters.
Parameters:
- parameter_names : List[str]
List of parameter names to plot (must exist in the dataset).
- output_file : str, optional
Path to save the plot. If None, the plot is displayed.
- dual_axis : bool, default False
Use dual y-axes for different units or manual assignment.
- left_params : List[str], optional
Parameters to plot on the left y-axis (if dual_axis=True).
- right_params : List[str], optional
Parameters to plot on the right y-axis (if dual_axis=True).
- normalize : bool, default False
Normalize all parameters to 0-1 range for comparison.
- colors : List[str], optional
Custom colors for each parameter line.
- line_styles : List[str], optional
Custom line styles for each parameter ('-', '--', '-.', ':').
- ylim_left : Tuple[float, float], optional
Y-axis limits for left axis as (min, max).
- ylim_right : Tuple[float, float], optional
Y-axis limits for right axis as (min, max).
- plot_normalized_comparison(parameter_names: List[str], output_file: str | None = None, colors: List[str] | None = None, **kwargs)[source]
Convenience method for normalized parameter comparison.
All parameters are normalized to 0-1 range for easy comparison of trends regardless of their original units or scales.
Parameters:
- parameter_names : List[str]
List of parameter names to plot (must exist in the dataset).
- output_file : str, optional
Path to save the plot. If None, the plot is displayed.
- colors : List[str], optional
Custom colors for each parameter line.
- **kwargs : dict
Additional styling options.
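Normalizing to the 0-1 range is usually done with min-max scaling, (x - min) / (max - min). The sketch below shows that arithmetic on plain lists; it assumes min-max scaling is what the plotter applies, which the documentation does not state explicitly, and `normalize` is a hypothetical helper. NaN handling is left out for brevity.

```python
def normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series has no range; return zeros to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


temperature = [4.0, 6.0, 8.0]    # degrees Celsius
salinity = [34.0, 34.5, 35.0]    # practical salinity

# After scaling, both series share one axis even though their units differ.
print(normalize(temperature))
print(normalize(salinity))
```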
- plot_single_parameter(parameter_name: str, output_file: str | None = None, ylim_min: float | None = None, ylim_max: float | None = None, color: str | None = None, line_style: str = '-')[source]
Convenience method for single parameter plotting.
Parameters:
- parameter_name : str
Name of the parameter to plot (must exist in the dataset).
- output_file : str, optional
Path to save the plot. If None, the plot is displayed.
- ylim_min : float, optional
Minimum value for the y-axis. If None, auto-scaled.
- ylim_max : float, optional
Maximum value for the y-axis. If None, auto-scaled.
- color : str, optional
Color for the line. If None, uses default color cycle.
- line_style : str, default '-'
Line style for the plot ('-', '--', '-.', ':').
- plot_with_auto_dual_axis(parameter_names: List[str], output_file: str | None = None, normalize: bool = False, **kwargs)[source]
Convenience method that automatically uses dual axis based on parameter units.
Parameters:
- parameter_names : List[str]
List of parameter names to plot (must exist in the dataset).
- output_file : str, optional
Path to save the plot. If None, the plot is displayed.
- normalize : bool, default False
Normalize all parameters to 0-1 range for comparison.
- **kwargs : dict
Additional styling options (colors, line_styles, ylim_left, ylim_right).
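One plausible way to choose axes from units is to put every parameter that shares the first unit on the left axis and the rest on the right. The sketch below illustrates only that idea; the actual grouping rule used by `plot_with_auto_dual_axis` is not documented here, and `split_by_units` and the unit strings are hypothetical.

```python
def split_by_units(units_by_param):
    """Assign parameters to left/right axes based on their unit strings.

    Parameters sharing the first unit encountered go left; all others go right.
    """
    left, right = [], []
    first_unit = None
    for name, unit in units_by_param.items():
        if first_unit is None:
            first_unit = unit
        (left if unit == first_unit else right).append(name)
    return left, right


units = {
    "temperature": "degC",
    "salinity": "PSU",
    "potential_temperature": "degC",
}
left_params, right_params = split_by_units(units)
print(left_params, right_params)
```

The resulting lists have the same shape as the `left_params` and `right_params` arguments of `plot_multiple_parameters`.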
- class ctd_tools.plotters.TsDiagramPlotter(data: Dataset | None = None)[source]
Creates T-S (Temperature-Salinity) diagrams from CTD sensor data.
This class specializes in creating T-S diagrams, which are scatter plots of temperature vs salinity data points, often colored by depth and with optional density isolines.
Attributes:
- data : xr.Dataset
The xarray Dataset containing the sensor data to be plotted.
Methods:
- plot(output_file=None, title='T-S Diagram', dot_size=70, use_colormap=True, show_density_isolines=True, colormap='jet', show_lines_between_dots=True, show_grid=True):
Creates and displays/saves the T-S diagram.
- _plot_density_isolines():
Adds density isolines to the T-S diagram.
- plot(output_file: str | None = None, title: str = 'T-S Diagram', dot_size: int = 70, use_colormap: bool = True, show_density_isolines: bool = True, colormap: str = 'jet', show_lines_between_dots: bool = True, show_grid: bool = True, *args, **kwargs)[source]
Creates a T-S diagram plot.
Parameters:
- output_file : str, optional
Path to save the plot. If None, the plot is displayed.
- title : str, default 'T-S Diagram'
Title for the plot.
- dot_size : int, default 70
Size of the scatter plot markers.
- use_colormap : bool, default True
Whether to color points by depth using a colormap.
- show_density_isolines : bool, default True
Whether to show density isolines on the plot.
- colormap : str, default 'jet'
Matplotlib colormap name to use for depth coloring.
- show_lines_between_dots : bool, default True
Whether to connect data points with lines.
- show_grid : bool, default True
Whether to show grid lines on the plot.
Raises:
- ValueError:
If required variables (temperature, salinity, depth) are missing.
CTD Tools Processing Module
This module provides various processing classes for analyzing and manipulating sensor data stored in xarray Datasets.
Available Processors:
StatisticsProcessor: Calculate statistical metrics on sensor data
SubsetProcessor: Subset sensor data by time, sample indices, or parameter values
ResampleProcessor: Resample sensor data to different time intervals
Example Usage:
from ctd_tools.processors import StatisticsProcessor, SubsetProcessor, ResampleProcessor
# Calculate statistics
stats_processor = StatisticsProcessor(dataset, "temperature")
mean_temp = stats_processor.mean()
max_temp = stats_processor.max()
# Subset data
subset_processor = SubsetProcessor(dataset)
subset = subset_processor.set_time_min("2023-01-01").set_time_max("2023-01-31").get_subset()
# Resample data
resample_processor = ResampleProcessor(dataset)
daily_data = resample_processor.resample("1D").mean()
- class ctd_tools.processors.AbstractProcessor(data: Dataset)[source]
Abstract base class for processing sensor data from xarray Datasets.
This class provides a common interface for all processor implementations. All concrete processor classes should inherit from this class and implement their specific processing methods.
Attributes:
- data : xr.Dataset
The xarray Dataset containing the sensor data to be processed.
Methods:
- __init__(data: xr.Dataset):
Initializes the processor with the provided xarray Dataset.
- abstract process() Any [source]
Process the dataset.
This method should be implemented by concrete processor classes to define their specific processing logic.
Returns:
- Any:
The result of the processing operation.
- class ctd_tools.processors.ResampleProcessor(data: Dataset)[source]
Resample sensor data to different time intervals.
This class provides methods to resample sensor data along the time dimension to different frequencies (e.g., hourly, daily, monthly).
Attributes:
- data : xr.Dataset
The xarray Dataset containing the sensor data.
Example Usage:
resample_processor = ResampleProcessor(dataset)
daily_data = resample_processor.resample("1D").mean()
hourly_data = resample_processor.resample("1H").median()
- process() Dataset [source]
Process the dataset (returns the original dataset).
This method is required by the AbstractProcessor interface. For resampling, use the resample() method instead.
Returns:
- xr.Dataset:
The original dataset.
- resample(time_interval: str, dim: str | None = None) Any [source]
Resample the dataset to a specified time interval.
Parameters:
- time_interval : str
The time interval for resampling (e.g., "1H", "1D", "1M"). Uses pandas frequency strings.
- dim : str, optional
The dimension to resample along. If None, uses the TIME parameter.
Returns:
- xr.core.resample.DatasetResample:
A resample object that can be used to apply aggregation functions.
Example:
# Resample to daily averages
daily_mean = resample_processor.resample("1D").mean()
# Resample to hourly maximum values
hourly_max = resample_processor.resample("1H").max()
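Conceptually, resampling followed by an aggregation groups samples into time bins and reduces each bin to one value. The standard-library sketch below shows what a daily mean computes; it is an illustration of the idea only, `daily_mean` is a hypothetical helper, and unlike xarray it does not emit entries for days without samples.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_mean(samples):
    """Group (timestamp, value) pairs by calendar day and average each group."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        buckets[timestamp.date()].append(value)
    return {day: mean(values) for day, values in sorted(buckets.items())}


samples = [
    (datetime(2023, 1, 1, 6), 10.0),
    (datetime(2023, 1, 1, 18), 12.0),
    (datetime(2023, 1, 2, 6), 9.0),
]
for day, value in daily_mean(samples).items():
    print(day, value)
```

Swapping `mean` for `max`, `min`, or `statistics.median` gives the counterparts of the other aggregations on the resample object.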
- resample_count(time_interval: str, dim: str | None = None) Dataset [source]
Resample and count valid values.
Parameters:
- time_interval : str
The time interval for resampling.
- dim : str, optional
The dimension to resample along. If None, uses the TIME parameter.
Returns:
- xr.Dataset:
The resampled dataset with count values.
- resample_max(time_interval: str, dim: str | None = None) Dataset [source]
Resample and compute maximum values.
Parameters:
- time_interval : str
The time interval for resampling.
- dim : str, optional
The dimension to resample along. If None, uses the TIME parameter.
Returns:
- xr.Dataset:
The resampled dataset with maximum values.
- resample_mean(time_interval: str, dim: str | None = None) Dataset [source]
Resample and compute mean values.
Parameters:
- time_interval : str
The time interval for resampling.
- dim : str, optional
The dimension to resample along. If None, uses the TIME parameter.
Returns:
- xr.Dataset:
The resampled dataset with mean values.
- resample_median(time_interval: str, dim: str | None = None) Dataset [source]
Resample and compute median values.
Parameters:
- time_interval : str
The time interval for resampling.
- dim : str, optional
The dimension to resample along. If None, uses the TIME parameter.
Returns:
- xr.Dataset:
The resampled dataset with median values.
- resample_min(time_interval: str, dim: str | None = None) Dataset [source]
Resample and compute minimum values.
Parameters:
- time_interval : str
The time interval for resampling.
- dim : str, optional
The dimension to resample along. If None, uses the TIME parameter.
Returns:
- xr.Dataset:
The resampled dataset with minimum values.
- resample_std(time_interval: str, dim: str | None = None) Dataset [source]
Resample and compute standard deviation.
Parameters:
- time_interval : str
The time interval for resampling.
- dim : str, optional
The dimension to resample along. If None, uses the TIME parameter.
Returns:
- xr.Dataset:
The resampled dataset with standard deviation values.
- resample_sum(time_interval: str, dim: str | None = None) Dataset [source]
Resample and compute sum values.
Parameters:
- time_interval : str
The time interval for resampling.
- dim : str, optional
The dimension to resample along. If None, uses the TIME parameter.
Returns:
- xr.Dataset:
The resampled dataset with sum values.
- class ctd_tools.processors.StatisticsProcessor(data: Dataset, parameter: str)[source]
Calculate statistical metrics on sensor data.
This class provides methods to calculate various statistical measures like mean, median, standard deviation, etc. on specific parameters within a sensor dataset.
Attributes:
- data : xr.Dataset
The xarray Dataset containing the sensor data.
- parameter : str
The name of the parameter to calculate statistics for.
Example Usage:
stats_processor = StatisticsProcessor(dataset, "temperature")
mean_temp = stats_processor.mean()
max_temp = stats_processor.max()
stats = stats_processor.get_all_statistics()
- count_valid(dim: str | None = None) Any [source]
Count valid (non-NaN) values.
Parameters:
- dim : str, optional
The dimension along which to count valid values. If None, uses the TIME parameter.
Returns:
- int or xr.DataArray:
The count of valid values.
- get_all_statistics(dim: str | None = None) dict [source]
Calculate all available statistics.
Parameters:
- dim : str, optional
The dimension along which to calculate statistics. If None, uses the TIME parameter.
Returns:
- dict:
A dictionary containing all calculated statistics.
- max(dim: str | None = None) Any [source]
Calculate the maximum value.
Parameters:
- dim : str, optional
The dimension along which to calculate the maximum. If None, uses the TIME parameter.
Returns:
- float or xr.DataArray:
The maximum value(s).
- mean(dim: str | None = None) Any [source]
Calculate the arithmetic mean.
Parameters:
- dim : str, optional
The dimension along which to calculate the mean. If None, uses the TIME parameter.
Returns:
- float or xr.DataArray:
The mean value(s).
- median(dim: str | None = None) Any [source]
Calculate the median value.
Parameters:
- dim : str, optional
The dimension along which to calculate the median. If None, uses the TIME parameter.
Returns:
- float or xr.DataArray:
The median value(s).
- min(dim: str | None = None) Any [source]
Calculate the minimum value.
Parameters:
- dim : str, optional
The dimension along which to calculate the minimum. If None, uses the TIME parameter.
Returns:
- float or xr.DataArray:
The minimum value(s).
- process() dict [source]
Process the dataset to calculate all statistics.
Returns:
- dict:
A dictionary containing all calculated statistics.
- quantile(q: float | list, dim: str | None = None) Any [source]
Calculate quantiles.
Parameters:
- q : float or list
Quantile(s) to compute (0 <= q <= 1).
- dim : str, optional
The dimension along which to calculate the quantiles. If None, uses the TIME parameter.
Returns:
- float or xr.DataArray:
The quantile value(s).
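For readers unfamiliar with how a quantile between two samples is obtained, the sketch below implements linear interpolation between the nearest ranks, which is the default method in NumPy and xarray. Whether `StatisticsProcessor.quantile` delegates to that default is an assumption, and the free-standing `quantile` function here is hypothetical. NaN values are dropped in this sketch.

```python
import math


def quantile(values, q):
    """Quantile with linear interpolation between the two nearest ranks."""
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    data = sorted(v for v in values if not math.isnan(v))
    if not data:
        raise ValueError("no valid values")
    position = q * (len(data) - 1)   # fractional index into the sorted data
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return data[lower] + (data[upper] - data[lower]) * fraction


temperatures = [1.0, 2.0, 3.0, 4.0]
print(quantile(temperatures, 0.5))   # the median lies halfway between 2.0 and 3.0
```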
- class ctd_tools.processors.SubsetProcessor(data: Dataset)[source]
Subset sensor data based on sample number, time, and parameter values.
This class allows for flexible slicing of sensor data stored in an xarray Dataset. It can filter data based on sample indices, time ranges, and specific parameter values.
Attributes:
- data : xr.Dataset
The xarray Dataset containing the sensor data to be subsetted.
- min_sample : int, optional
The minimum sample index to include in the subset.
- max_sample : int, optional
The maximum sample index to include in the subset.
- min_datetime : pd.Timestamp, optional
The minimum time to include in the subset.
- max_datetime : pd.Timestamp, optional
The maximum time to include in the subset.
- parameter_name : str, optional
The name of the parameter to filter by.
- parameter_value_min : float, optional
The minimum value of the parameter to include in the subset.
- parameter_value_max : float, optional
The maximum value of the parameter to include in the subset.
Example Usage:
subset_processor = SubsetProcessor(dataset)
subset_processor.set_sample_min(10).set_sample_max(50)
subset_processor.set_time_min("2023-01-01").set_time_max("2023-01-31")
subset = subset_processor.get_subset()
- get_subset() Dataset [source]
Return the subset of the dataset based on the specified criteria.
This method applies all the slicing parameters to filter the dataset. It slices the dataset by sample number, time, and parameter values as specified.
Returns:
- xr.Dataset:
The subset of the dataset that matches the specified criteria.
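The chaining behaviour comes from every setter returning the instance itself. The sketch below demonstrates that pattern on a plain list; `SubsetSketch` is a hypothetical stand-in rather than the real `SubsetProcessor`, it covers only sample-index and value filters, and treating the maximum sample index as inclusive is an assumption.

```python
class SubsetSketch:
    """Hypothetical stand-in showing chained setters over a list of samples."""

    def __init__(self, samples):
        self.samples = samples
        self.reset()

    def reset(self):
        self.min_sample = None
        self.max_sample = None
        self.value_max = None
        return self

    def set_sample_min(self, value):
        if not isinstance(value, int):
            raise TypeError("sample index must be an integer")
        self.min_sample = value
        return self  # returning self is what enables chaining

    def set_sample_max(self, value):
        if not isinstance(value, int):
            raise TypeError("sample index must be an integer")
        self.max_sample = value
        return self

    def set_value_max(self, value):
        if not isinstance(value, (int, float)):
            raise TypeError("value must be a number")
        self.value_max = value
        return self

    def get_subset(self):
        start = 0 if self.min_sample is None else self.min_sample
        # Assumption: the maximum sample index is inclusive.
        stop = len(self.samples) if self.max_sample is None else self.max_sample + 1
        subset = self.samples[start:stop]
        if self.value_max is not None:
            subset = [v for v in subset if v <= self.value_max]
        return subset


readings = [5, 7, 9, 11, 13]
print(SubsetSketch(readings).set_sample_min(1).set_sample_max(3).get_subset())
```

Criteria are only stored by the setters and applied together in `get_subset()`, so the order of the chained calls does not matter.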
- process() Dataset [source]
Process the dataset to create a subset.
This method applies all the filtering criteria to create the final subset.
Returns:
- xr.Dataset:
The subset of the dataset based on the specified criteria.
- reset() SubsetProcessor [source]
Reset all filtering criteria to None.
Returns:
- SubsetProcessor:
The current instance for method chaining.
- set_parameter_name(value: str) SubsetProcessor [source]
Set the name of the parameter to filter by.
Parameters:
- value : str
The name of the parameter to filter by.
Returns:
- SubsetProcessor:
The current instance for method chaining.
Raises:
- TypeError:
If the provided value is not a string.
- ValueError:
If the provided parameter name is not found in the dataset.
- set_parameter_value_max(value: int | float) SubsetProcessor [source]
Set the maximum value of the parameter to include in the subset.
Parameters:
- value : int or float
The maximum value of the parameter to include in the subset.
Returns:
- SubsetProcessor:
The current instance for method chaining.
Raises:
- TypeError:
If the provided value is not a number.
- set_parameter_value_min(value: int | float) SubsetProcessor [source]
Set the minimum value of the parameter to include in the subset.
Parameters:
- value : int or float
The minimum value of the parameter to include in the subset.
Returns:
- SubsetProcessor:
The current instance for method chaining.
Raises:
- TypeError:
If the provided value is not a number.
- set_sample_max(value: int) SubsetProcessor [source]
Set the maximum sample index for slicing the dataset.
Parameters:
- value : int
The maximum sample index to include in the subset.
Returns:
- SubsetProcessor:
The current instance for method chaining.
Raises:
- TypeError:
If the provided value is not an integer.
- set_sample_min(value: int) SubsetProcessor [source]
Set the minimum sample index for slicing the dataset.
Parameters:
- value : int
The minimum sample index to include in the subset.
Returns:
- SubsetProcessor:
The current instance for method chaining.
Raises:
- TypeError:
If the provided value is not an integer.
- set_time_max(value: str | Timestamp) SubsetProcessor [source]
Set the maximum time for slicing the dataset.
Parameters:
- value : str or pd.Timestamp
The maximum time to include in the subset.
Returns:
- SubsetProcessor:
The current instance for method chaining.
Raises:
- TypeError:
If the provided value is not a string or a pandas Timestamp.
- set_time_min(value: str | Timestamp) SubsetProcessor [source]
Set the minimum time for slicing the dataset.
Parameters:
- value : str or pd.Timestamp
The minimum time to include in the subset.
Returns:
- SubsetProcessor:
The current instance for method chaining.
Raises:
- TypeError:
If the provided value is not a string or a pandas Timestamp.