Flexible YAML and OG 1.0 Output#
PyGlider 1.0 makes the deployment YAML more flexible so that variable names,
derived-variable computations, and the netCDF dimension name are no longer
hardcoded. This lets you produce OceanGliders 1.0 (OG 1.0)
trajectory files — with uppercase OG vocabulary names and the N_MEASUREMENTS
dimension — using exactly the same processing pipeline as the legacy IOOS GDAC
format.
Top-level YAML keys#
Two new top-level keys control the output format.
output_conventions#
Declares the naming convention used in the file. This value is recorded in the global attributes and signals to downstream tools which vocabulary the variable names follow.
output_conventions: OG-1.0 # or IOOS_GDAC (default)
output_dimension#
Sets the name of the observation dimension in the output netCDF file.
Internally pyglider always works with a dimension called time; this key
causes it to be renamed on write and restored on read.
output_dimension: N_MEASUREMENTS # default: time
If omitted, the dimension name is time (IOOS GDAC style).
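As a minimal sketch of how these two keys and their documented defaults could be read, assuming the deployment YAML has already been parsed into a plain dict. The helper name `output_settings` is hypothetical and not part of pyglider:

```python
def output_settings(deployment: dict) -> tuple[str, str]:
    """Return (conventions, dimension), applying the documented defaults."""
    conventions = deployment.get("output_conventions", "IOOS_GDAC")
    dimension = deployment.get("output_dimension", "time")
    return conventions, dimension


# A legacy YAML that sets neither key keeps the IOOS GDAC behaviour.
legacy = output_settings({"metadata": {}})

# An OG 1.0 YAML sets both keys explicitly.
og = output_settings(
    {"output_conventions": "OG-1.0", "output_dimension": "N_MEASUREMENTS"}
)
```

Keeping the defaults in one place means an existing IOOS GDAC YAML needs no changes to keep producing the same output.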
Variable flexibility via processing_role#
In previous versions, the processing pipeline looked for variables by their
literal name — ds['pressure'], ds['conductivity'], etc. — which forced
the YAML to use those exact keys.
Now each variable entry in netcdf_variables can carry a processing_role
that tells the pipeline what role the variable plays, regardless of what it is
called in the output file.
netcdf_variables:
  PRES:                          # OG 1.0 output name
    source: sci_water_pressure
    processing_role: pressure    # pipeline looks this up by role
    long_name: Pressure (measured variable)
    units: dbar
    ...
  LATITUDE:
    source: m_lat
    processing_role: latitude
    ...
Required roles#
The following roles are used directly by the processing pipeline regardless of
any processing_method entries. If your variable names differ from the IOOS GDAC
defaults (e.g. OG 1.0 names), you must declare these roles explicitly:
| Role | Default IOOS GDAC name | Typical OG 1.0 name |
|---|---|---|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
If a required role cannot be resolved to a variable that exists in the dataset,
pyglider raises a ValueError with a message pointing to the YAML fix needed.
Legacy fallback roles#
Variables like temperature, conductivity, and oxygen are only looked up by role
in the legacy processing path (when no processing_method entries cover
thermodynamic variables). If you supply processing_method blocks for salinity
and density — as OG 1.0 YAMLs should — you do not need processing_role on
these variables; the method inputs name them explicitly.
If processing_role is absent and no processing_method covers a variable,
pyglider falls back to looking for a variable whose name matches the role string,
so existing IOOS GDAC YAMLs continue to work without modification.
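The lookup order described above (explicit role first, then a variable named after the role, then an error) can be sketched as follows. This is an illustration under assumptions, not pyglider's implementation: `resolve_role` is a hypothetical helper, `netcdf_variables` is the parsed YAML section as a dict, and `available` stands for the variable names present in the dataset.

```python
def resolve_role(netcdf_variables: dict, role: str, available: set) -> str:
    """Return the name of the variable that plays `role` (hypothetical helper)."""
    # 1. Honour an explicit processing_role declaration.
    for name, attrs in netcdf_variables.items():
        if attrs.get("processing_role") == role and name in available:
            return name
    # 2. Legacy fallback: a variable literally named after the role.
    if role in netcdf_variables and role in available:
        return role
    # 3. Nothing matched: point the user at the YAML fix.
    raise ValueError(
        f"No variable found for role '{role}'; add "
        f"'processing_role: {role}' to the matching netcdf_variables entry."
    )


og_vars = {"PRES": {"source": "sci_water_pressure", "processing_role": "pressure"}}
legacy_vars = {"pressure": {"source": "sci_water_pressure"}}

og_name = resolve_role(og_vars, "pressure", {"PRES"})
legacy_name = resolve_role(legacy_vars, "pressure", {"pressure"})
```

Here `og_name` is the OG 1.0 variable that carries the role, while `legacy_name` is found purely by name matching.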
Derived variables via processing_method#
Previously, salinity, density, depth, and profile numbering were computed by
hardcoded calls inside the processing functions, always consuming variables
named conductivity, temperature, pressure, etc.
Now you can specify how each derived variable is computed and which named inputs to use:
PSAL:
  processing_method:
    practical_salinity:
      conductivity: CNDC
      temperature: TEMP
      pressure: PRES
  long_name: Sea water practical salinity
  units: "1"
  ...
DEPTH:
  processing_method:
    depth_from_pressure:
      pressure: PRES
      latitude: LATITUDE
  processing_role: depth
  ...
PROFILE_NUMBER:
  processing_method:
    find_profiles:
      pressure: PRES
  processing_role: profile_index
  ...
The processing_method key contains a single-entry mapping from a method name
to its named inputs. The inputs are references to other variable names in
netcdf_variables.
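A short sketch of how a single-entry processing_method mapping could be split into its method name and inputs, assuming the YAML entry has been parsed into a dict. The function name `parse_processing_method` is hypothetical:

```python
def parse_processing_method(entry: dict) -> tuple[str, dict]:
    """Split a processing_method block into (method_name, inputs)."""
    block = entry["processing_method"]
    if not isinstance(block, dict) or len(block) != 1:
        raise ValueError(
            "processing_method must map exactly one method name to its inputs"
        )
    method, inputs = next(iter(block.items()))
    return method, dict(inputs or {})


psal = {
    "processing_method": {
        "practical_salinity": {
            "conductivity": "CNDC",
            "temperature": "TEMP",
            "pressure": "PRES",
        }
    },
    "units": "1",
}
method, inputs = parse_processing_method(psal)
```

Rejecting mappings with more than one method early gives a clearer error than failing later inside the computation.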
Built-in method names#
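| Method | Computes | Required inputs |
|---|---|---|
| practical_salinity | SP via TEOS-10 | conductivity, temperature, pressure |
| potential_temperature | θ via TEOS-10 | salinity, temperature, pressure |
| potential_density_sigma0 | σ₀ via TEOS-10 | salinity, temperature, pressure, latitude, longitude |
| density | in-situ density | salinity, temperature, pressure, latitude, longitude |
| depth_from_pressure | depth (m) | pressure, latitude |
| find_profiles | profile index and direction | pressure |
| distance_over_ground | cumulative distance | latitude, longitude |
| gps_fixes_from_nav | sparse GPS fix variable (SeaExplorer only) | role, lat_source, lon_source |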
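Because method inputs may name other derived variables (in the example YAMLs in this document, THETA takes PSAL as its salinity input), the methods have to run in dependency order. One way to obtain such an order is sketched below with the standard-library `graphlib` module. This is an assumption about a workable approach, not a description of how pyglider schedules its computations, and `computation_order` is a hypothetical name.

```python
from graphlib import TopologicalSorter


def computation_order(netcdf_variables: dict) -> list[str]:
    """Order variables so that every derived variable follows its inputs."""
    graph = {}
    for name, attrs in netcdf_variables.items():
        deps = set()
        for inputs in attrs.get("processing_method", {}).values():
            # Keep only inputs that refer to other variables in the YAML;
            # plain options such as `role: latitude` are ignored.
            deps.update(v for v in inputs.values() if v in netcdf_variables)
        graph[name] = deps
    return list(TopologicalSorter(graph).static_order())


variables = {
    "THETA": {"processing_method": {"potential_temperature": {
        "salinity": "PSAL", "temperature": "TEMP", "pressure": "PRES"}}},
    "PSAL": {"processing_method": {"practical_salinity": {
        "conductivity": "CNDC", "temperature": "TEMP", "pressure": "PRES"}}},
    "CNDC": {"source": "sci_water_cond"},
    "TEMP": {"source": "sci_water_temp"},
    "PRES": {"source": "sci_water_pressure"},
}
order = computation_order(variables)
```

`TopologicalSorter` raises `graphlib.CycleError` if two derived variables depend on each other, which would surface a misconfigured YAML before any data is processed.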
GPS fix variables (SeaExplorer OG 1.0)#
OG 1.0 requires three sparse variables that record the glider’s actual GPS surface
fixes: LATITUDE_GPS, LONGITUDE_GPS, and TIME_GPS. These are non-NaN only at
the measurement timestamps that are closest to a real GPS fix; all other values are
NaN. They are derived from the SeaExplorer navigation (gli) files, not from the
sensor payload.
Use processing_method: gps_fixes_from_nav to declare them in the YAML. Three
separate entries are needed, one per output variable, each with a role input that
selects which array to write (latitude, longitude, or time). The lat_source
and lon_source inputs name the NMEA-format columns in the gli file (almost always
Lat and Lon).
LATITUDE_GPS:
  processing_method:
    gps_fixes_from_nav:
      role: latitude
      lat_source: Lat
      lon_source: Lon
  long_name: Latitude of each GPS surface fix
  standard_name: latitude
  units: degrees_north
  observation_type: measured
  vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/LAT/
LONGITUDE_GPS:
  processing_method:
    gps_fixes_from_nav:
      role: longitude
      lat_source: Lat
      lon_source: Lon
  long_name: Longitude of each GPS surface fix
  standard_name: longitude
  units: degrees_east
  observation_type: measured
  vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/LON/
TIME_GPS:
  processing_method:
    gps_fixes_from_nav:
      role: time
      lat_source: Lat
      lon_source: Lon
  long_name: Time of each GPS surface fix
  calendar: gregorian
  units: seconds since 1970-01-01T00:00:00Z
  observation_type: measured
Pyglider reads the merged gli parquet file, filters to rows where DeadReckoning
is 0 (or NavState is not 116 when DeadReckoning is absent), converts NMEA
coordinates to decimal degrees, and maps each fix to the nearest timestamp on the
sensor time grid. The three variables are produced in one pass; the role input
determines which result is assigned to each YAML entry. All other YAML attributes
(units, long_name, vocabulary, etc.) are written to the variable as usual.
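The steps described above (filter out dead-reckoned rows, convert NMEA to decimal degrees, snap each fix to the nearest sensor timestamp) can be illustrated with a dependency-free sketch. It uses plain lists and dicts rather than the parquet file, handles only the DeadReckoning case, and the function names and the `time` key of each row are assumptions made for the example.

```python
from bisect import bisect_left


def nmea_to_degrees(value: float) -> float:
    """Convert NMEA ddmm.mmmm (or dddmm.mmmm) to signed decimal degrees."""
    sign = -1.0 if value < 0 else 1.0
    value = abs(value)
    degrees = int(value // 100)
    minutes = value - 100 * degrees
    return sign * (degrees + minutes / 60.0)


def nearest_index(grid: list, t: float) -> int:
    """Index of the entry in the sorted list `grid` that is closest to `t`."""
    i = bisect_left(grid, t)
    if i == 0:
        return 0
    if i == len(grid):
        return len(grid) - 1
    return i if grid[i] - t < t - grid[i - 1] else i - 1


def sparse_latitude(nav_rows: list, sensor_time: list) -> list:
    """NaN everywhere except at the sensor samples nearest to a real GPS fix."""
    out = [float("nan")] * len(sensor_time)
    for row in nav_rows:
        if row["DeadReckoning"] != 0:  # skip dead-reckoned positions
            continue
        out[nearest_index(sensor_time, row["time"])] = nmea_to_degrees(row["Lat"])
    return out


nav = [
    {"time": 10.2, "DeadReckoning": 0, "Lat": 4852.4237},  # real fix
    {"time": 31.0, "DeadReckoning": 1, "Lat": 4852.5000},  # dead reckoning
]
lat_gps = sparse_latitude(nav, [0.0, 10.0, 20.0, 30.0])
```

Only the sample at 10.0 s receives a value; the dead-reckoned row is dropped, so every other entry stays NaN.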
Custom methods#
If the method name contains a . it is treated as a dotted Python import
path. The function must have the signature:
def my_method(ds: xr.Dataset, inputs: dict, output_name: str) -> xr.DataArray:
    ...
where inputs is {role_name: variable_name_in_ds, ...} as declared in the
YAML.
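Resolving a dotted import path to a callable can be done with the standard-library `importlib` module. The sketch below shows one possible approach; `load_method` is a hypothetical name, and pyglider's own loader may differ.

```python
import importlib


def load_method(dotted_path: str):
    """Import `package.module.function` and return the function object."""
    module_name, _, func_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"'{dotted_path}' is not a dotted import path")
    module = importlib.import_module(module_name)
    return getattr(module, func_name)


# Demonstrated with a stdlib function. A real custom method would instead
# accept (ds, inputs, output_name) and return an xr.DataArray.
sqrt = load_method("math.sqrt")
```

In a deployment YAML the same mechanism would be triggered by a method name such as `mypackage.mymodule.my_method` (a made-up path used only for illustration).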
Dimension handling#
The time dimension is always used internally. On write, if
output_dimension is set to something other than time, the dimension is
renamed just before the file is saved. On read (e.g. when loading a
timeseries file to make grids or profiles), pyglider detects the non-standard
dimension by finding the coordinate with standard_name: time and renames it
back to time. This is transparent to any intermediate processing steps.
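The detection rule on read can be sketched without xarray by using plain dicts as a stand-in for a dataset: each variable has a tuple of dimension names and an attribute dict. The structure and the function names are assumptions made for the example.

```python
def find_observation_dim(variables: dict) -> str:
    """Return the dimension of the variable whose standard_name is 'time'."""
    for var in variables.values():
        if var["attrs"].get("standard_name") == "time" and len(var["dims"]) == 1:
            return var["dims"][0]
    raise ValueError("no one-dimensional variable with standard_name 'time'")


def normalise_dims(variables: dict, internal: str = "time") -> dict:
    """Return a copy with the observation dimension renamed to `internal`."""
    obs_dim = find_observation_dim(variables)
    return {
        name: {
            **var,
            "dims": tuple(internal if d == obs_dim else d for d in var["dims"]),
        }
        for name, var in variables.items()
    }


on_disk = {
    "TIME": {"dims": ("N_MEASUREMENTS",), "attrs": {"standard_name": "time"}},
    "PRES": {
        "dims": ("N_MEASUREMENTS",),
        "attrs": {"standard_name": "sea_water_pressure"},
    },
}
in_memory = normalise_dims(on_disk)
```

With a real `xarray.Dataset`, the renaming step would correspond to a call such as `Dataset.rename_dims`; the detection logic stays the same.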
OG 1.0 examples#
Slocum#
# OG 1.0 compliant deployment YAML for dfo-rosie713-20190615
# Based on tests/example-slocum/deploymentRealtime.yml
#
# Key differences from IOOS GDAC format:
# - output_dimension: N_MEASUREMENTS (instead of time as dimension)
# - Variable names follow OG 1.0 vocabulary (uppercase, e.g. CNDC, TEMP, PRES)
# - processing_role: maps output variable name to its role in the processing pipeline
# - processing_method: specifies how derived variables are computed and from which inputs
# - QC variables are auto-named as {varname}_QC
output_dimension: N_MEASUREMENTS
output_conventions: OG-1.0
metadata:
  acknowledgement: Funding from Fisheries and Oceans Canada, Canadian Foundation
    for Innovation, BC Knowledge Development Fund
  comment: "Calvert Island test deployment June 2019."
  contributor_name: James Pegg, Jody Klymak, Tetjana Ross, Cailin Burmaster
  contributor_email: jpegg@uvic.ca, jklymak@uvic.ca, tross@uvic.ca, cburmaster@uvic.ca
  contributor_role: Technical Coordinator, PI, PI, Operator
  contributor_role_vocabulary: "http://vocab.nerc.ac.uk/collection/W08/current/CONT0005/,http://vocab.nerc.ac.uk/collection/W08/current/CONT0004/,http://vocab.nerc.ac.uk/collection/W08/current/CONT0004/,http://vocab.nerc.ac.uk/collection/W08/current/CONT0003/"
  creator_email: jklymak@uvic.ca
  creator_name: Jody Klymak
  creator_url: http://cproof.uvic.ca
  data_mode: 'R'
  deployment_id: '1'
  deployment_name: 'dfo-rosie713-20190615'
  deployment_start: '2019-06-15'
  deployment_end: '2019-06-30'
  glider_name: dfo-rosie
  glider_serial: '713'
  glider_model: Slocum G3 Deep
  glider_instrument_name: slocum
  glider_wmo: "999999"
  institution: C-PROOF
  contributing_institutions: C-PROOF
  contributing_institutions_role: Operator
  contributing_institutions_role_vocabulary: "http://vocab.nerc.ac.uk/collection/W08/current/CONT0003/"
  keywords: "AUVS, Autonomous Underwater Vehicles, Oceans, Ocean Pressure,
    Water Pressure, Oceans, Ocean Temperature, Water Temperature,
    Oceans, Salinity/Density, Conductivity, Oceans,
    Salinity/Density, Density, Oceans, Salinity/Density, Salinity"
  keywords_vocabulary: GCMD Science Keywords
  license: "This data may be redistributed and used without restriction or
    warranty"
  metadata_link: "https://cproof.uvic.ca"
  metadata_conventions: CF-1.10, ACDD-1.3, OG-1.0
  Conventions: CF-1.10, ACDD-1.3, OG-1.0
  naming_authority: "ca.uvic.cproof"
  platform: "sub-surface gliders"
  platform_serial: "713"
  platform_vocabulary: "http://vocab.nerc.ac.uk/collection/L06/current/27/"
  platform_type: "Slocum Glider"
  processing_level: "Data provided as is with no expressed or implied
    assurance of quality assurance or quality control."
  project: SaanichInletTest19
  project_url: http://cproof.uvic.ca
  publisher_email: jklymak@uvic.ca
  publisher_name: Jody Klymak
  publisher_url: https://cproof.uvic.ca
  references: cproof toolbox URL
  rtqc_method: "No QC applied"
  sea_name: Coastal Waters of Southeast Alaska and British Columbia
  source: Observational data from a profiling glider.
  standard_name_vocabulary: CF Standard Name Table v83
  summary: Manufacturer test in Saanich Inlet.
  transmission_system: IRIDIUM
  wmo_id: "999999"
  deployment_vessel: "Vellela Vellela"
  deployment_station: "Imperial Eagle"
  deployment_latitude: "48.873728"
  deployment_longitude: "-125.212218"
glider_devices:
  pressure:
    make: Micron
    model: Pressure
    serial: '104702'
  ctd:
    sensor_name: SENSOR_CTD
    long_name: CTD Metadata
    make_model: Seabird SlocumCTD
    maker: Seabird Scientific
    model: SlocumCTD
    type: CTD
    type_vocabulary: "https://vocab.nerc.ac.uk/collection/L05/current"
    # pyglider-only fields (used for processing, not written to netCDF):
    make: Seabird
    serial: '9507'
    factory_calibrated: " "
    calibration_date: " "
    calibration_report: " "
    comment: 'Constants for ctd_9507 are found using dfo-rosie713-20230810.'
    Thermal_lag_constants_[alpha,tau]: [0.2, 2]
    dTdC: 0
  optics:
    sensor_name: SENSOR_FLUOROMETER
    long_name: Fluorometer Metadata
    make_model: Wetlabs FLBBCDSLC
    maker: Wetlabs
    model: FLBBCDSLC
    type: fluorometer_chla
    type_vocabulary: "http://vocab.nerc.ac.uk/collection/R25/current/"
    # pyglider-only:
    make: Wetlabs
    serial: '5059'
  oxygen:
    sensor_name: SENSOR_DOXY
    long_name: Oxygen Sensor Metadata
    make_model: AADI Optode4831
    maker: AADI
    model: Optode4831
    type: OPTODE_DOXY
    type_vocabulary: "http://vocab.nerc.ac.uk/collection/R25/current/"
    # pyglider-only:
    make: AADI
    serial: '665'
netcdf_variables:
  # -------------------------------------------------------------------------
  # Coordinates: processing_role tells the pipeline what each variable is
  # -------------------------------------------------------------------------
  TIME:
    source: sci_m_present_time
    processing_role: time
    long_name: Time elapsed since 1970-01-01T00:00:00Z
    standard_name: time
    calendar: gregorian
    units: seconds since 1970-01-01T00:00:00Z
    axis: T
    observation_type: measured
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/TIME/
  LATITUDE:
    source: m_lat
    processing_role: latitude
    long_name: Latitude north (WGS84)
    standard_name: latitude
    units: degrees_north
    axis: Y
    comment: "Estimated between surface fixes"
    observation_type: measured
    reference: WGS84
    valid_max: 90.0
    valid_min: -90.0
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/LAT/
  LONGITUDE:
    source: m_lon
    processing_role: longitude
    long_name: Longitude east (WGS84)
    standard_name: longitude
    units: degrees_east
    axis: X
    comment: "Estimated between surface fixes"
    observation_type: measured
    reference: WGS84
    valid_max: 180.0
    valid_min: -180.0
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/LON/
  # -------------------------------------------------------------------------
  # Measured CTD variables
  # -------------------------------------------------------------------------
  CNDC:
    source: sci_water_cond
    processing_role: conductivity
    long_name: Electrical conductivity of the water body by CTD
    standard_name: sea_water_electrical_conductivity
    units: S m-1
    sensor: SENSOR_CTD
    valid_min: 0.0
    valid_max: 10.0
    observation_type: measured
    accuracy: 0.0003
    precision: 0.0001
    resolution: 0.00002
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/CNDC/
  TEMP:
    source: sci_water_temp
    processing_role: temperature
    long_name: Temperature of the water body by CTD
    standard_name: sea_water_temperature
    units: degree_C
    sensor: SENSOR_CTD
    valid_min: -5.0
    valid_max: 50.0
    observation_type: measured
    accuracy: 0.002
    precision: 0.001
    resolution: 0.0002
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/TEMP/
  PRES:
    source: sci_water_pressure
    processing_role: pressure
    long_name: Pressure (measured variable)
    standard_name: sea_water_pressure
    units: dbar
    conversion: bar2dbar
    sensor: SENSOR_CTD
    valid_min: 0.0
    valid_max: 2000.0
    positive: down
    reference_datum: sea-surface
    observation_type: measured
    accuracy: 1.0
    precision: 2.0
    resolution: 0.02
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/PRES/
  # -------------------------------------------------------------------------
  # Derived coordinate variables: processing_method specifies inputs
  # -------------------------------------------------------------------------
  DEPTH:
    processing_method:
      depth_from_pressure:
        pressure: PRES
        latitude: LATITUDE
    processing_role: depth
    long_name: Depth below surface of the water body
    standard_name: depth
    units: m
    axis: Z
    positive: down
    reference_datum: sea-surface
    observation_type: calculated
    valid_min: 0.0
    valid_max: 2000.0
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/DEPTH/
  PROFILE_NUMBER:
    processing_method:
      find_profiles:
        pressure: PRES
    processing_role: profile_index
    long_name: Profile number
    comment: "PROFILE_NUMBER increments by one each time the glider starts
      an ascending or descending profile."
    valid_min: 1
    valid_max: 2147483647
  PROFILE_DIRECTION:
    processing_method:
      find_profiles:
        pressure: PRES
    processing_role: profile_direction
    long_name: Vertical direction of profile
    comment: "1 = descending, -1 = ascending, 0 = not in a profile"
  DISTANCE_OVER_GROUND:
    processing_method:
      distance_over_ground:
        latitude: LATITUDE
        longitude: LONGITUDE
    long_name: Distance over ground flown since mission start
    units: km
  # -------------------------------------------------------------------------
  # Derived thermodynamic variables (TEOS-10)
  # Each quantity has its own method (practical_salinity,
  # potential_temperature, potential_density_sigma0, density). Every entry
  # names its inputs explicitly, which allows for multiple CTDs.
  # -------------------------------------------------------------------------
  PSAL:
    processing_method:
      practical_salinity:
        conductivity: CNDC
        temperature: TEMP
        pressure: PRES
    long_name: Sea water practical salinity
    standard_name: sea_water_practical_salinity
    units: "1"
    comment: "raw, uncorrected salinity"
    valid_min: 0.0
    valid_max: 40.0
    observation_type: calculated
    sensor: SENSOR_CTD
    accuracy: 0.01
    precision: 0.01
    resolution: 0.001
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/PSAL/
  THETA:
    processing_method:
      potential_temperature:
        salinity: PSAL
        temperature: TEMP
        pressure: PRES
    long_name: Water potential temperature
    standard_name: sea_water_potential_temperature
    units: degree_C
    observation_type: calculated
    accuracy: 0.002
    precision: 0.001
    resolution: 0.0001
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/THETA/
  SIGTHETA:
    processing_method:
      potential_density_sigma0:
        salinity: PSAL
        temperature: TEMP
        pressure: PRES
        latitude: LATITUDE
        longitude: LONGITUDE
    long_name: Water potential density referenced to surface
    standard_name: sea_water_potential_density
    units: kg m-3
    observation_type: calculated
    accuracy: 0.01
    precision: 0.01
    resolution: 0.001
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/SIGTHETA/
  DENSITY:
    processing_method:
      density:
        salinity: PSAL
        temperature: TEMP
        pressure: PRES
        latitude: LATITUDE
        longitude: LONGITUDE
    long_name: Density
    standard_name: sea_water_density
    units: kg m-3
    observation_type: calculated
    valid_min: 990.0
    valid_max: 1040.0
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/DENSITY/
  # -------------------------------------------------------------------------
  # Other measured variables
  # -------------------------------------------------------------------------
  HEADING:
    source: m_heading
    long_name: Glider heading angle
    standard_name: platform_orientation
    units: degrees
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/HEADING/
  PITCH:
    source: m_pitch
    long_name: Glider pitch angle
    standard_name: platform_pitch_angle
    units: degrees
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/PITCH/
  ROLL:
    source: m_roll
    long_name: Glider roll angle
    standard_name: platform_roll_angle
    units: degrees
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/ROLL/
  WAYPOINT_LATITUDE:
    source: c_wpt_lat
    long_name: Waypoint latitude
    standard_name: latitude
    units: degrees_north
  WAYPOINT_LONGITUDE:
    source: c_wpt_lon
    long_name: Waypoint longitude
    standard_name: longitude
    units: degrees_east
  CHLA:
    source: sci_flbbcd_chlor_units
    long_name: Chlorophyll-a concentration
    standard_name: mass_concentration_of_chlorophyll_a_in_sea_water
    units: mg m-3
    sensor: SENSOR_FLUOROMETER
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/CHLA/
  CDOM:
    source: sci_flbbcd_cdom_units
    long_name: CDOM
    units: ppb
    sensor: SENSOR_FLUOROMETER
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/CDOM/
  BBP700:
    source: sci_flbbcd_bb_units
    long_name: 700 nm wavelength backscatter
    units: "1"
    sensor: SENSOR_FLUOROMETER
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/BBP700/
  DOXY:
    source: sci_oxy4_oxygen
    long_name: Dissolved oxygen
    standard_name: moles_of_oxygen_per_unit_mass_in_sea_water
    units: umol kg-1
    sensor: SENSOR_DOXY
    valid_min: 0.0
    valid_max: 500.0
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/DOXY/
  # -------------------------------------------------------------------------
  # QC variables: auto-named {varname}_QC, no source needed
  # average_method: QC_protocol tells the gridder to use max-flag binning
  # -------------------------------------------------------------------------
  CNDC_QC:
    average_method: QC_protocol
  TEMP_QC:
    average_method: QC_protocol
  PRES_QC:
    average_method: QC_protocol
  PSAL_QC:
    average_method: QC_protocol
  DENSITY_QC:
    average_method: QC_protocol
  DOXY_QC:
    average_method: QC_protocol
  CHLA_QC:
    average_method: QC_protocol
scalar_variables:
  PLATFORM_MODEL:
    from_metadata: glider_model
    long_name: Glider model name
  WMO_IDENTIFIER:
    from_metadata: glider_wmo
    long_name: WMO identifier
  PLATFORM_SERIAL_NUMBER:
    from_metadata: glider_serial
    long_name: Glider serial number
  DEPLOYMENT_TIME:
    from_metadata: deployment_start
    long_name: Deployment start time
    units: seconds since 1970-01-01T00:00:00Z
  DEPLOYMENT_LATITUDE:
    from_metadata: deployment_latitude
    long_name: Deployment latitude
    units: degrees_north
  DEPLOYMENT_LONGITUDE:
    from_metadata: deployment_longitude
    long_name: Deployment longitude
    units: degrees_east
# NOTE: profile_variables is used by extract_timeseries_profiles (IOOS GDAC
# profile files). OG 1.0 does not define separate per-profile files; the
# trajectory file itself is the deliverable. This section is retained for
# users who also want IOOS GDAC profile output, but variable names will need
# updating if OG 1.0 naming is required there too.
profile_variables:
  profile_id:
    comment: Sequential profile number within the trajectory.
    long_name: 'Profile ID'
    valid_max: 2147483646
    valid_min: 1
  profile_time:
    comment: Timestamp corresponding to the mid-point of the profile
    long_name: Profile Center Time
    observation_type: calculated
    standard_name: time
  profile_time_start:
    comment: Timestamp corresponding to the start of the profile
    long_name: Profile Start Time
    observation_type: calculated
    standard_name: time
  profile_time_end:
    comment: Timestamp corresponding to the end of the profile
    long_name: Profile End Time
    observation_type: calculated
    standard_name: time
  profile_lat:
    comment: Interpolated latitude at the mid-point of the profile
    long_name: Profile Center Latitude
    observation_type: calculated
    standard_name: latitude
    units: degrees_north
    valid_max: 90.0
    valid_min: -90.0
  profile_lon:
    comment: Interpolated longitude at the mid-point of the profile
    long_name: Profile Center Longitude
    observation_type: calculated
    standard_name: longitude
    units: degrees_east
    valid_max: 180.0
    valid_min: -180.0
  u:
    comment: Depth-averaged eastward current estimate
    long_name: Depth-Averaged Eastward Sea Water Velocity
    observation_type: calculated
    standard_name: eastward_sea_water_velocity
    units: m s-1
    valid_max: 10.0
    valid_min: -10.0
  v:
    comment: Depth-averaged northward current estimate
    long_name: Depth-Averaged Northward Sea Water Velocity
    observation_type: calculated
    standard_name: northward_sea_water_velocity
    units: m s-1
    valid_max: 10.0
    valid_min: -10.0
  lon_uv:
    comment: Not computed
    long_name: Longitude
    observation_type: calculated
    standard_name: longitude
    units: degrees_east
    valid_max: 180.0
    valid_min: -180.0
  lat_uv:
    comment: Not computed
    long_name: Latitude
    observation_type: calculated
    standard_name: latitude
    units: degrees_north
    valid_max: 90.0
    valid_min: -90.0
  time_uv:
    comment: Not computed
    long_name: Time
    standard_name: time
    calendar: gregorian
    units: seconds since 1970-01-01T00:00:00Z
    observation_type: calculated
  instrument_ctd:
    comment: pumped CTD
    calibration_date: "2017-12-24"
    factory_calibrated: "yes"
    long_name: Seabird Glider Payload CTD
    make_model: Seabird GPCTD
    serial_number: "9507"
    type: platform
# -*- coding: utf-8 -*-
"""
Process dfo-rosie713 realtime data to OG 1.0 format.

This script mirrors tests/example-slocum/process_deploymentRealTime.py but
targets OG 1.0 output. It uses the same raw binary data; only the YAML and
output directories differ.

NOTE: This script documents the *intended* API once pyglider has been updated
to support processing_role, processing_method, output_dimension, and the
_load_dataset/_save_dataset helpers. Some calls will not work correctly until
those changes are made to slocum.py, utils.py, and ncprocess.py.
"""

import logging
import os

import pyglider.ncprocess as ncprocess
import pyglider.slocum as slocum
import pyglider.utils as pgutils

logging.basicConfig(level='INFO')

# Raw data lives in the existing example-slocum directory.
# Outputs go into subdirectories here.
binarydir = '../example-slocum/realtime_raw/'
cacdir = '../example-slocum/cac/'
deploymentyaml = './deploymentRealtime_og10.yml'
l1tsdir = './L0-timeseries-og10/'
# profiledir = './L0-profiles-og10/'
griddir = './L0-gridfiles-og10/'

# ------------------------------------------------------------------------
# Step 1: binary → OG 1.0 timeseries netCDF
#
# binary_to_timeseries will need to:
# - read processing_role to identify pressure/temperature/conductivity/
#   latitude/longitude variables by role rather than by hardcoded name
# - read processing_method entries to compute derived variables
#   (DEPTH, PSAL, SIGTHETA, DENSITY, THETA, PROFILE_NUMBER,
#   PROFILE_DIRECTION, DISTANCE_OVER_GROUND) using the named inputs
# - call _save_dataset instead of ds.to_netcdf so that the time dimension
#   is renamed to N_MEASUREMENTS before writing
# ------------------------------------------------------------------------
outname = slocum.binary_to_timeseries(
    binarydir, cacdir, l1tsdir, deploymentyaml,
    search='*.[st]bd',  # matches both *.sbd and *.tbd realtime files
    profile_filt_time=20,
    profile_min_time=20,
)

# ------------------------------------------------------------------------
# Step 2: timeseries → per-profile netCDF files (IOOS GDAC style)
#
# extract_timeseries_profiles will need to:
# - call _load_dataset instead of xr.open_dataset so that the
#   N_MEASUREMENTS dimension is normalised back to time for processing
# - use processing_role to find profile_index (PROFILE_NUMBER),
#   latitude (LATITUDE), longitude (LONGITUDE), etc.
# - call _save_dataset when writing each profile file
#
# NOTE: OG 1.0 does not define per-profile files; the trajectory file is
# the primary deliverable. This step produces IOOS GDAC profile files as
# a secondary output for users who need them.
# ------------------------------------------------------------------------
# ncprocess.extract_timeseries_profiles(outname, profiledir, deploymentyaml)

# ------------------------------------------------------------------------
# Step 3: timeseries → gridded netCDF
#
# make_gridfiles will need to:
# - call _load_dataset to normalise the dimension on load
# - use processing_role to find depth (DEPTH), latitude (LATITUDE),
#   longitude (LONGITUDE) for gridding axes
# - call _save_dataset when writing the gridded file
# ------------------------------------------------------------------------
outname2 = ncprocess.make_gridfiles(outname, griddir, deploymentyaml)
SeaExplorer#
# OG 1.0 compliant deployment YAML for dfo-eva035-20190718
# Based on tests/example-data/example-seaexplorer/deploymentRealtime.yml
#
# Key differences from IOOS GDAC format:
# - output_dimension: N_MEASUREMENTS (instead of time as dimension)
# - Variable names follow OG 1.0 vocabulary (uppercase, e.g. CNDC, TEMP, PRES)
# - processing_role: maps output variable name to its role in the pipeline
# - processing_method: specifies how derived variables are computed and from which inputs
# - QC variables are auto-named as {varname}_QC
output_dimension: N_MEASUREMENTS
output_conventions: OG-1.0
metadata:
  acknowledgement: Funding from Fisheries and Oceans Canada, Canadian Foundation
    for Innovation, BC Knowledge Development Fund
  comment: "Explorer Seamount cruise on Tully"
  contributor_name: James Pegg, Jody Klymak, Tetjana Ross
  contributor_email: jpegg@uvic.ca, jklymak@uvic.ca, tross@uvic.ca
  contributor_role: Technical Coordinator, PI, PI
  contributor_role_vocabulary: "http://vocab.nerc.ac.uk/collection/W08/current/CONT0005/,http://vocab.nerc.ac.uk/collection/W08/current/CONT0004/,http://vocab.nerc.ac.uk/collection/W08/current/CONT0004/"
  creator_email: jklymak@uvic.ca
  creator_name: Jody Klymak
  creator_url: http://cproof.uvic.ca
  data_mode: 'R'
  deployment_id: '1'
  deployment_name: 'dfo-eva035-20190718'
  deployment_start: '2019-07-18'
  deployment_end: '2019-12-30'
  glider_name: dfo-eva035
  glider_serial: '035'
  glider_model: SeaExplorer
  glider_instrument_name: seaexplorer
  glider_wmo: '999999'
  institution: C-PROOF
  contributing_institutions: C-PROOF
  contributing_institutions_role: Operator
  contributing_institutions_role_vocabulary: "http://vocab.nerc.ac.uk/collection/W08/current/CONT0003/"
  keywords: "AUVS, Autonomous Underwater Vehicles, Oceans, Ocean Pressure,
    Water Pressure, Oceans, Ocean Temperature, Water Temperature,
    Oceans, Salinity/Density, Conductivity, Oceans,
    Salinity/Density, Density, Oceans, Salinity/Density, Salinity"
  keywords_vocabulary: GCMD Science Keywords
  license: "This data may be redistributed and used without restriction or
    warranty"
  metadata_link: "https://cproof.uvic.ca"
  metadata_conventions: CF-1.10, ACDD-1.3, OG-1.0
  Conventions: CF-1.10, ACDD-1.3, OG-1.0
  naming_authority: "ca.uvic.cproof"
  platform: "sub-surface gliders"
  platform_vocabulary: "http://vocab.nerc.ac.uk/collection/L06/current/27/"
  platform_type: "SeaExplorer Glider"
  processing_level: "Data provided as is with no expressed or implied
    assurance of quality assurance or quality control."
  project: ExplorerSeamount19
  project_url: http://cproof.uvic.ca
  publisher_email: jklymak@uvic.ca
  publisher_name: Jody Klymak
  publisher_url: http://cproof.uvic.ca
  references: cproof toolbox URL
  rtqc_method: "No QC applied"
  sea_name: BC Coastal Waters
  source: Observational data from a profiling glider.
  standard_name_vocabulary: CF Standard Name Table v83
  summary: Short deployment off Tully near Explorer Seamount.
  transmission_system: IRIDIUM
  wmo_id: "999999"
  deployment_latitude: "48.91"
  deployment_longitude: "-130.61"
glider_devices:
  pressure:
    make: Micron
    model: Pressure
    serial: '104702'
  ctd:
    sensor_name: SENSOR_CTD
    long_name: CTD Metadata
    make_model: Seabird GPCTD
    maker: Seabird Scientific
    model: GPCTD
    type: CTD
    type_vocabulary: "https://vocab.nerc.ac.uk/collection/L05/current"
    # pyglider-only fields (used for processing, not written to netCDF):
    make: Seabird
    serial: '0278'
    factory_calibrated: "Yes"
    calibration_date: "02/11/2018"
    calibration_report: " "
    Thermal_lag_constants_[alpha,tau]: [0.34, 4.6]
    dTdC: 0
    comment: 'Constants were found using dfo-bb046-20220707.'
  optics:
    sensor_name: SENSOR_FLUOROMETER
    long_name: Fluorometer Metadata
    make_model: Wetlabs FLBBCDSLC
    maker: Wetlabs
    model: FLBBCDSLC
    type: fluorometer_chla
    type_vocabulary: "http://vocab.nerc.ac.uk/collection/R25/current/"
    # pyglider-only:
    make: Wetlabs
    serial: '4741'
  oxygen:
    make: AROD_FT
    model: Optode4831
    serial: '0022'
netcdf_variables:
  timebase:
    source: GPCTD_TEMPERATURE
  # -------------------------------------------------------------------------
  # Coordinates: processing_role tells the pipeline what each variable is
  # -------------------------------------------------------------------------
  TIME:
    source: time
    processing_role: time
    long_name: Time elapsed since 1970-01-01T00:00:00Z
    standard_name: time
    calendar: gregorian
    units: seconds since 1970-01-01T00:00:00Z
    axis: T
    observation_type: measured
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/TIME/
  LATITUDE:
    source: NAV_LATITUDE
    processing_role: latitude
    long_name: Latitude north (WGS84)
    standard_name: latitude
    units: degrees_north
    axis: Y
    conversion: nmea2deg
    comment: "Estimated between surface fixes"
    observation_type: measured
    reference: WGS84
    valid_max: 90.0
    valid_min: -90.0
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/LAT/
  LONGITUDE:
    source: NAV_LONGITUDE
    processing_role: longitude
    long_name: Longitude east (WGS84)
    standard_name: longitude
    units: degrees_east
    axis: X
    conversion: nmea2deg
    comment: "Estimated between surface fixes"
    observation_type: measured
    reference: WGS84
    valid_max: 180.0
    valid_min: -180.0
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/LON/
  # -------------------------------------------------------------------------
  # Measured CTD variables
  # -------------------------------------------------------------------------
  CNDC:
    source: GPCTD_CONDUCTIVITY
    processing_role: conductivity
    long_name: Electrical conductivity of the water body by CTD
    standard_name: sea_water_electrical_conductivity
    units: S m-1
    sensor: SENSOR_CTD
    valid_min: 0.0
    valid_max: 10.0
    observation_type: measured
    accuracy: 0.0003
    precision: 0.0001
    resolution: 0.00002
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/CNDC/
  TEMP:
    source: GPCTD_TEMPERATURE
    processing_role: temperature
    long_name: Temperature of the water body by CTD
    standard_name: sea_water_temperature
    units: degree_C
    sensor: SENSOR_CTD
    valid_min: -5.0
    valid_max: 50.0
    observation_type: measured
    accuracy: 0.002
    precision: 0.001
    resolution: 0.0002
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/TEMP/
  PRES:
    source: GPCTD_PRESSURE
    processing_role: pressure
    long_name: Pressure (measured variable)
    standard_name: sea_water_pressure
    units: dbar
    sensor: SENSOR_CTD
    valid_min: 0.0
    valid_max: 2000.0
    positive: down
    reference_datum: sea-surface
    observation_type: measured
    accuracy: 1.0
    precision: 2.0
    resolution: 0.02
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/PRES/
  # -------------------------------------------------------------------------
  # Derived coordinate variables: processing_method specifies inputs
  # -------------------------------------------------------------------------
  DEPTH:
    processing_method:
      depth_from_pressure:
        pressure: PRES
        latitude: LATITUDE
    processing_role: depth
    long_name: Depth below surface of the water body
    standard_name: depth
    units: m
    axis: Z
    positive: down
    reference_datum: sea-surface
    observation_type: calculated
    valid_min: 0.0
    valid_max: 2000.0
    vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/DEPTH/
  PROFILE_NUMBER:
    processing_method:
      find_profiles:
        pressure: PRES
    processing_role: profile_index
    long_name: Profile number
    comment: "PROFILE_NUMBER increments by one each time the glider starts
      an ascending or descending profile."
    valid_min: 1
    valid_max: 2147483647
  PROFILE_DIRECTION:
    processing_method:
      find_profiles:
        pressure: PRES
    processing_role: profile_direction
    long_name: Vertical direction of profile
    comment: "1 = descending, -1 = ascending, 0 = not in a profile"
  DISTANCE_OVER_GROUND:
    processing_method:
      distance_over_ground:
        latitude: LATITUDE
        longitude: LONGITUDE
    long_name: Distance over ground flown since mission start
    units: km
# -------------------------------------------------------------------------
# Sparse GPS fix variables (non-NaN only at actual surface fixes)
# -------------------------------------------------------------------------
LATITUDE_GPS:
processing_method:
gps_fixes_from_nav:
role: latitude
lat_source: Lat
lon_source: Lon
long_name: Latitude of each GPS surface fix
standard_name: latitude
units: degrees_north
observation_type: measured
vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/LAT/
LONGITUDE_GPS:
processing_method:
gps_fixes_from_nav:
role: longitude
lat_source: Lat
lon_source: Lon
long_name: Longitude of each GPS surface fix
standard_name: longitude
units: degrees_east
observation_type: measured
vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/LON/
TIME_GPS:
processing_method:
gps_fixes_from_nav:
role: time
lat_source: Lat
lon_source: Lon
long_name: Time of each GPS surface fix
calendar: gregorian
units: seconds since 1970-01-01T00:00:00Z
observation_type: measured
# -------------------------------------------------------------------------
# Derived thermodynamic variables (TEOS-10)
# -------------------------------------------------------------------------
PSAL:
processing_method:
practical_salinity:
conductivity: CNDC
temperature: TEMP
pressure: PRES
long_name: Sea water practical salinity
standard_name: sea_water_practical_salinity
units: "1"
comment: "raw, uncorrected salinity"
valid_min: 0.0
valid_max: 40.0
observation_type: calculated
sensor: SENSOR_CTD
accuracy: 0.01
precision: 0.01
resolution: 0.001
vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/PSAL/
THETA:
processing_method:
potential_temperature:
salinity: PSAL
temperature: TEMP
pressure: PRES
long_name: Water potential temperature
standard_name: sea_water_potential_temperature
units: degree_C
observation_type: calculated
accuracy: 0.002
precision: 0.001
resolution: 0.0001
vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/THETA/
SIGTHETA:
processing_method:
potential_density_sigma0:
salinity: PSAL
temperature: TEMP
pressure: PRES
latitude: LATITUDE
longitude: LONGITUDE
long_name: Water potential density referenced to surface
standard_name: sea_water_potential_density
units: kg m-3
observation_type: calculated
accuracy: 0.01
precision: 0.01
resolution: 0.001
vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/SIGTHETA/
DENSITY:
processing_method:
density:
salinity: PSAL
temperature: TEMP
pressure: PRES
latitude: LATITUDE
longitude: LONGITUDE
long_name: Density
standard_name: sea_water_density
units: kg m-3
observation_type: calculated
valid_min: 990.0
valid_max: 1040.0
vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/DENSITY/
# -------------------------------------------------------------------------
# Other measured variables
# -------------------------------------------------------------------------
HEADING:
source: Heading
long_name: Glider heading angle
standard_name: platform_orientation
units: degrees
vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/HEADING/
PITCH:
source: Pitch
long_name: Glider pitch angle
standard_name: platform_pitch_angle
units: degrees
vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/PITCH/
ROLL:
source: Roll
long_name: Glider roll angle
standard_name: platform_roll_angle
units: degrees
vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/ROLL/
CHLA:
source: FLBBCD_CHL_SCALED
long_name: Chlorophyll-a concentration
standard_name: mass_concentration_of_chlorophyll_a_in_sea_water
units: mg m-3
sensor: SENSOR_FLUOROMETER
vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/CHLA/
CDOM:
source: FLBBCD_CDOM_SCALED
long_name: Colored dissolved organic matter (CDOM)
units: ppb
sensor: SENSOR_FLUOROMETER
vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/CDOM/
BBP700:
source: FLBBCD_BB_700_SCALED
long_name: 700 nm wavelength backscatter
units: "1"
sensor: SENSOR_FLUOROMETER
vocabulary: http://vocab.nerc.ac.uk/collection/OG1/current/BBP700/
# -------------------------------------------------------------------------
# QC variables
# -------------------------------------------------------------------------
CNDC_QC:
average_method: QC_protocol
TEMP_QC:
average_method: QC_protocol
PRES_QC:
average_method: QC_protocol
PSAL_QC:
average_method: QC_protocol
DENSITY_QC:
average_method: QC_protocol
CHLA_QC:
average_method: QC_protocol
scalar_variables:
PLATFORM_MODEL:
from_metadata: glider_model
long_name: Glider model name
WMO_IDENTIFIER:
from_metadata: glider_wmo
long_name: WMO identifier
PLATFORM_SERIAL_NUMBER:
from_metadata: glider_serial
long_name: Glider serial number
DEPLOYMENT_TIME:
from_metadata: deployment_start
long_name: Deployment start time
units: seconds since 1970-01-01T00:00:00Z
DEPLOYMENT_LATITUDE:
from_metadata: deployment_latitude
long_name: Deployment latitude
units: degrees_north
DEPLOYMENT_LONGITUDE:
from_metadata: deployment_longitude
long_name: Deployment longitude
units: degrees_east
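Because each processing_method entry above refers to other variables by their output names (PRES, TEMP, LATITUDE, and so on), a typo in one of those names is easy to miss. The following is a minimal sketch of a cross-reference check, not part of pyglider. It assumes the YAML has been loaded into a plain dictionary (for example with yaml.safe_load) and that each processing_method nests as method name, then argument name, then target. The helper check_method_inputs and the set of non-variable arguments are hypothetical names introduced only for illustration.

```python
def check_method_inputs(netcdf_variables):
    """Return (variable, method, argument, target) tuples whose target is undefined."""
    # Assumption: these arguments hold plain strings (a role or a raw source
    # column), not the name of another entry in netcdf_variables.
    non_variable_args = {"role", "lat_source", "lon_source"}
    problems = []
    for name, attrs in netcdf_variables.items():
        methods = attrs.get("processing_method", {})
        for method, inputs in methods.items():
            for arg, target in inputs.items():
                if arg in non_variable_args:
                    continue
                if target not in netcdf_variables:
                    problems.append((name, method, arg, target))
    return problems


# Hand-written excerpt of the YAML above, in the shape a YAML loader would return.
# CNDC and TEMP are deliberately left out to show what a failure looks like.
netcdf_variables = {
    "LATITUDE": {"processing_role": "latitude"},
    "PRES": {"source": "GPCTD_PRESSURE", "processing_role": "pressure"},
    "DEPTH": {
        "processing_method": {
            "depth_from_pressure": {"pressure": "PRES", "latitude": "LATITUDE"}
        },
        "processing_role": "depth",
    },
    "PSAL": {
        "processing_method": {
            "practical_salinity": {
                "conductivity": "CNDC",
                "temperature": "TEMP",
                "pressure": "PRES",
            }
        }
    },
}

problems = check_method_inputs(netcdf_variables)
for name, method, arg, target in problems:
    print(f"{name}: {method}.{arg} -> {target} is not defined")
```

Run against the full example, where CNDC and TEMP are defined, the same check would be expected to return an empty list.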
# -*- coding: utf-8 -*-
"""
Process dfo-eva035 realtime data to OG 1.0 format.
This script mirrors process_deploymentRealTime.py but targets OG 1.0 output.
It reuses the raw parquet files produced by the standard pipeline; only the
YAML and output directories differ.
"""
import logging
import pyglider.seaexplorer as seaexplorer
import pyglider.ncprocess as ncprocess
logging.basicConfig(level='INFO')
rawncdir = './realtime_rawnc/'
deploymentyaml = './deploymentRealtime_og10.yml'
l0tsdir = './L0-timeseries-og10/'
griddir = './L0-gridfiles-og10/'
# Step 1: timeseries. raw_to_rawnc and merge_parquet were already run by the
# standard pipeline, so we reuse the parquet files in realtime_rawnc/.
outname = seaexplorer.raw_to_timeseries(
    rawncdir, l0tsdir, deploymentyaml, kind='sub'
)
# Step 2: gridded netCDF
outname2 = ncprocess.make_gridfiles(outname, griddir, deploymentyaml)
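Since only the YAML and the output directories differ between the legacy and OG 1.0 runs, both can be driven from one small table. The sketch below is one possible arrangement, not the project's own script. The legacy YAML name and the un-suffixed output directories are hypothetical; the OG 1.0 names and the two pyglider calls are copied from the script above.

```python
# Hypothetical names for the legacy run; the OG 1.0 names match the script above.
RUNS = {
    "ioos_gdac": {"yaml": "./deploymentRealtime.yml", "suffix": ""},
    "og10": {"yaml": "./deploymentRealtime_og10.yml", "suffix": "-og10"},
}


def run_paths(fmt):
    """Build the YAML path and output directories for one output format."""
    run = RUNS[fmt]
    return {
        "deploymentyaml": run["yaml"],
        "l0tsdir": f"./L0-timeseries{run['suffix']}/",
        "griddir": f"./L0-gridfiles{run['suffix']}/",
    }


def process(fmt, rawncdir="./realtime_rawnc/"):
    """Run the timeseries and gridding steps for one output format."""
    # Imported here so run_paths can be used without pyglider installed.
    import pyglider.seaexplorer as seaexplorer
    import pyglider.ncprocess as ncprocess

    paths = run_paths(fmt)
    outname = seaexplorer.raw_to_timeseries(
        rawncdir, paths["l0tsdir"], paths["deploymentyaml"], kind="sub"
    )
    return ncprocess.make_gridfiles(
        outname, paths["griddir"], paths["deploymentyaml"]
    )


print(run_paths("og10"))
```

Calling process("og10") would then reproduce the two steps of the script above, and process("ioos_gdac") would run the same steps against the legacy YAML.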