Getting Started: SeaExplorer#
Gather data#
SeaExplorers record and transmit two main types of files: glider files (*.gli.*), which contain glider navigation information, and payload files (*.pld1.*), which contain the science data. These can be subset files (*.sub.*), which Alseamar decimates for transmission, or full-resolution files (*.raw.*), which are offloaded from the glider after the mission. The raw or subset files need to be made available in a single directory for PyGlider to process.
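As a rough illustration of this naming scheme, the sketch below sorts file names into glider or payload files and into subset or full-resolution files, using only the glob patterns given in the text. The helper name classify_seaexplorer_file and the sample file names are made up for this example; neither is part of PyGlider.

```python
from fnmatch import fnmatchcase


def classify_seaexplorer_file(filename):
    """Return (kind, resolution) for a SeaExplorer file name, or None.

    Hypothetical helper for illustration; not part of PyGlider.
    """
    if fnmatchcase(filename, '*.gli.*'):
        kind = 'glider'       # glider navigation information
    elif fnmatchcase(filename, '*.pld1.*'):
        kind = 'payload'      # science data
    else:
        return None
    if fnmatchcase(filename, '*.sub.*'):
        resolution = 'sub'    # decimated for transmission
    elif fnmatchcase(filename, '*.raw.*'):
        resolution = 'raw'    # full resolution, offloaded after the mission
    else:
        return None
    return kind, resolution


# Sample names are assumptions chosen only to match the patterns above.
for name in ['sea035.12.gli.sub.36', 'sea035.12.pld1.raw.36', 'notes.txt']:
    print(name, classify_seaexplorer_file(name))
```

A check like this can be handy before processing, to confirm that a directory does not mix subset and full-resolution files by accident.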
You can download and expand example data using .get_example_data:
import pyglider.example_data as pexamp
pexamp.get_example_data('./')
which will add a local directory example-data to your current directory.
Make a deployment configuration file#
The processing routines all take a deployment.yaml file as an argument, and information from this is used to fill in metadata and to map sensor names to NetCDF variable names. See Example deployment.yaml, below.
There are four top-level sections in the deployment.yaml:
metadata: The only field that is necessary here is glider_name. The rest of the fields will be added to the netCDF files as top-level attributes.
glider_devices: This is a list of the glider devices, and any information about them such as make, model, and serial number. This is optional, and again is added to the netCDF top-level attributes.
netcdf_variables: These are necessary, and map from sensor name (e.g. source: GPCTD_CONDUCTIVITY) to a data variable name (e.g. conductivity). The fields other than source: are optional for the processing to run, and are placed in the attributes of the netCDF variable. However, note that many of these attributes are necessary for CF compliance.
profile_variables: This is a mapping for variables that are per-profile, rather than timeseries. They include variables like a mean position and time for the profile, and mean derived ocean velocities.
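The requirements described above can be checked before processing starts. The following is a minimal sketch, assuming the deployment file has already been parsed into a plain dictionary (for example with yaml.safe_load from PyYAML). The function check_deployment is a hypothetical helper, not a PyGlider routine, and it tests only the rules stated in the text: metadata needs glider_name, and every netcdf_variables entry needs a source.

```python
def check_deployment(deployment):
    """Return a list of problems found in a parsed deployment dictionary.

    Hypothetical helper for illustration; not part of PyGlider.
    """
    problems = []
    # metadata: glider_name is the only field described as necessary
    metadata = deployment.get('metadata') or {}
    if 'glider_name' not in metadata:
        problems.append('metadata.glider_name is missing')
    # netcdf_variables: necessary, and each entry needs a source
    variables = deployment.get('netcdf_variables') or {}
    if not variables:
        problems.append('netcdf_variables is missing or empty')
    for name, attrs in variables.items():
        if not isinstance(attrs, dict) or 'source' not in attrs:
            problems.append(f'netcdf_variables.{name} has no source')
    return problems


deployment = {
    'metadata': {'glider_name': 'dfo-eva035'},
    'netcdf_variables': {
        'conductivity': {'source': 'GPCTD_CONDUCTIVITY', 'units': 'S m-1'},
        # source deliberately left out to trigger a message
        'temperature': {'long_name': 'water temperature'},
    },
}
print(check_deployment(deployment))
```

glider_devices is not checked here because the text describes it as optional.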
Process#
The example script is relatively straightforward if there is no intermediate processing. See Example processing script, below.
Data comes from an input directory, and is translated to raw glider-dependent parquet files that are put in a new directory. These files are useful in their own right. Apache Parquet is a column-oriented format for storing tabular data. Parquet files take up less space than netCDF or csv and are much faster to read and write. These files can be opened with polars.read_parquet or pandas.read_parquet. These files are then merged into a single monolithic parquet file, and this is translated to a CF-compliant timeseries netCDF file. Finally, individual profiles are saved, and a 2-D time-depth grid with 1-m vertical bins is saved.
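To make the final gridding step more concrete, here is a small pure-Python sketch of averaging timeseries samples into 1-m depth bins for each profile. It is only an illustration of the idea; it is not how PyGlider implements its gridding, and the function name grid_profiles and the sample values are invented for the example.

```python
import math
from collections import defaultdict


def grid_profiles(samples, bin_size=1.0):
    """Average (profile, depth, value) samples into depth bins per profile.

    Returns {(profile, bin_top_depth): mean value}. Illustrative only; this
    is not PyGlider's gridding code.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for profile, depth, value in samples:
        if math.isnan(value):
            continue  # skip missing data
        bin_top = math.floor(depth / bin_size) * bin_size
        sums[(profile, bin_top)] += value
        counts[(profile, bin_top)] += 1
    return {key: sums[key] / counts[key] for key in sums}


samples = [
    (1, 0.2, 10.0), (1, 0.8, 12.0),          # both in the 0-1 m bin of profile 1
    (1, 1.5, 9.0),                           # 1-2 m bin of profile 1
    (2, 0.4, 11.0), (2, 1.1, float('nan')),  # the NaN sample is ignored
]
grid = grid_profiles(samples)
print(grid)
```

Each key of the result corresponds to one cell of the profile-by-depth grid; cells with no valid samples are simply absent.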
It is likely that between these steps the user will want to add screening steps or adjust the calibrations. PyGlider does not provide those steps.
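One possible user-supplied screening step is sketched below, under the assumption that the valid_min and valid_max attributes from the deployment file are a reasonable first filter. The helper screen_valid_range is hypothetical and is not provided by PyGlider; it works on a plain list so the idea stays independent of any particular data library.

```python
import math


def screen_valid_range(values, attrs):
    """Replace values outside [valid_min, valid_max] with NaN.

    `attrs` is one variable's entry from netcdf_variables. Hypothetical
    helper; PyGlider does not provide this step.
    """
    lower = attrs.get('valid_min', -math.inf)
    upper = attrs.get('valid_max', math.inf)
    # NaN inputs fail the comparison and therefore stay NaN
    return [v if lower <= v <= upper else math.nan for v in values]


temperature_attrs = {'source': 'GPCTD_TEMPERATURE',
                     'valid_min': -5, 'valid_max': 50}
screened = screen_valid_range([8.1, 99.0, -7.5, 12.3], temperature_attrs)
print(screened)
```

In a real workflow the same logic would be applied to the timeseries variables between the processing steps, before profiles and grids are written.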
Example deployment.yaml#
metadata:
  # https://github.com/ioos/ioosngdac/wiki/NGDAC-NetCDF-File-Format-Version-2
  acknowledgement: Funding from Fisheries and Oceans Canada, Canadian Foundation
    for Innovation, BC Knowledge Development Fund
  comment: "Explorer Seamount cruise on Tully"
  contributor_name: James Pegg, Jody Klymak, Tetjana Ross
  contributor_role: Lead Technician, Principal Investigator, Co-PI
  creator_email: jklymak@uvic.ca
  creator_name: Jody Klymak
  creator_url: http://cproof.uvic.ca
  # date_created etc: added automatically
  # numbers must be quoted so YAML keeps them as strings
  deployment_id: '1'
  deployment_name: 'dfo-eva035-20190718'
  deployment_start: '2019-07-18'
  deployment_end: '2019-12-30'
  format_version: IOOS_Glider_NetCDF_v2.0.nc
  # id filled automatically...
  glider_name: dfo-eva035
  glider_serial: '035'
  glider_model: SeaExplorer
  glider_instrument_name: seaexplorer
  glider_wmo: '999999'
  institution: C-PROOF
  keywords: "AUVS, Autonomous Underwater Vehicles, Oceans, Ocean Pressure,
    Water Pressure, Oceans, Ocean Temperature, Water Temperature,
    Oceans, Salinity/Density, Conductivity, Oceans,
    Salinity/Density, Density, Oceans, Salinity/Density, Salinity"
  keywords_vocabulary: GCMD Science Keywords
  license: "This data may be redistributed and used without restriction or
    warranty"
  metadata_link: "https://cproof.uvic.ca"
  Metadata_Conventions: CF-1.6, Unidata Dataset Discovery v1.0
  naming_authority: "ca.uvic.cproof"
  platform_type: "SeaExplorer Glider"
  processing_level: "Data provided as is with no expressed or implied
    assurance of quality assurance or quality control."
  project: ExplorerSeamount19
  project_url: http://cproof.uvic.ca
  publisher_email: jklymak@uvic.ca
  publisher_name: Jody Klymak
  publisher_url: http://cproof.uvic.ca
  references: cproof toolbox URL
  # https://www.nodc.noaa.gov/General/NODC-Archive/seanamelist.txt
  sea_name: BC Coastal Waters
  source: Observational data from a profiling glider.
  standard_name_vocabulary: CF Standard Name Table v49
  summary: Short deployment off Tully near Explorer Seamount.
  transmission_system: IRIDIUM
  wmo_id: "999999"
glider_devices:
  pressure:
    make: Micron
    model: Pressure
    serial: '104702'
  ctd:
    make: Seabird
    model: GPCTD
    serial: '0278'
    long_name: Seabird SlocumCTD
    make_model: Seabird SlocumCTD
    factory_calibrated: "Yes"
    calibration_date: "02/11/2018"
    calibration_report: " "
    comment: " "
  optics:
    make: Wetlabs
    model: FLBBCDSLC
    serial: '4741'
  oxygen:
    make: AROD_FT
    model: Optode4831
    serial: '0022'
# map between glider variables and netcdf variables. This shouldn't
# change too much.
netcdf_variables:
  timebase:
    source: GPCTD_TEMPERATURE
  # Time and Place:
  time:
    source: time
    long_name: Time
    standard_name: time
    axis: T
    observation_type: "measured"
    coordinates: time depth latitude longitude
  latitude:
    source: NAV_LATITUDE
    long_name: latitude
    standard_name: latitude
    units: degrees_north
    axis: Y
    coordinates: time depth latitude longitude
    conversion: nmea2deg
    comment: "Estimated between surface fixes"
    observation_type: measured
    platform: platform
    reference: WGS84
    valid_max: 90.0
    valid_min: -90.0
    coordinate_reference_frame: urn:ogc:crs:EPSG::4326
  longitude:
    source: NAV_LONGITUDE
    long_name: longitude
    standard_name: longitude
    units: degrees_east
    axis: X
    coordinates: time depth latitude longitude
    conversion: nmea2deg
    comment: "Estimated between surface fixes"
    observation_type: measured
    platform: platform
    reference: WGS84
    valid_max: 180.0
    valid_min: -180.0
    coordinate_reference_frame: urn:ogc:crs:EPSG::4326
  heading:
    source: Heading
    long_name: glider heading angle
    standard_name: platform_orientation
    units: degrees
    coordinates: time depth latitude longitude
  pitch:
    source: Pitch
    long_name: glider pitch angle
    standard_name: platform_pitch_angle
    units: degrees
    coordinates: time depth latitude longitude
  roll:
    source: Roll
    long_name: glider roll angle
    standard_name: platform_roll_angle
    units: degrees
    coordinates: time depth latitude longitude
  # data parameters
  conductivity:
    source: GPCTD_CONDUCTIVITY
    long_name: water conductivity
    standard_name: sea_water_electrical_conductivity
    units: S m-1
    coordinates: time depth latitude longitude
    instrument: instrument_ctd
    valid_min: 0
    valid_max: 10
    observation_type: "measured"
    accuracy: 0.0003
    precision: 0.0001
    resolution: 0.00002
  temperature:
    source: GPCTD_TEMPERATURE
    long_name: water temperature
    standard_name: sea_water_temperature
    units: Celsius
    coordinates: time depth latitude longitude
    instrument: instrument_ctd
    valid_min: -5
    valid_max: 50
    observation_type: "measured"
    accuracy: 0.002
    precision: 0.001
    resolution: 0.0002
  pressure:
    source: GPCTD_PRESSURE
    long_name: water pressure
    standard_name: sea_water_pressure
    units: dbar
    coordinates: time depth latitude longitude
    valid_min: 0
    valid_max: 2000
    positive: "down"
    reference_datum: "sea-surface"
    instrument: "instrument_ctd"
    observation_type: "measured"
    accuracy: 1
    precision: 2
    resolution: 0.02
    comment: "ctd pressure sensor"
  # optics:
  chlorophyll:
    source: FLBBCD_CHL_SCALED
    long_name: chlorophyll
    standard_name: concentration_of_chlorophyll_in_sea_water
    units: mg m-3
    coordinates: time depth latitude longitude
  cdom:
    source: FLBBCD_CDOM_SCALED
    long_name: CDOM
    units: ppb
    coordinates: time depth latitude longitude
  backscatter_700:
    source: FLBBCD_BB_700_SCALED
    long_name: 700 nm wavelength backscatter
    units: "1"
    coordinates: time depth latitude longitude
  # Oxygen
  oxygen_concentration:
    source: AROD_FT_DO
    long_name: oxygen concentration
    standard_name: mole_concentration_of_dissolved_molecular_oxygen_in_sea_water
    units: umol l-1
    coordinates: time depth latitude longitude
    coarsen: 8
  temperature_oxygen:
    source: AROD_FT_TEMP
    long_name: oxygen sensor temperature
    standard_name: temperature_of_sensor_for_oxygen_in_sea_water
    units: Celsius
    coordinates: time depth latitude longitude
    coarsen: 8
  # derived water speed:
  # water_velocity_eastward:
  #   source: m_final_water_vx
  #   long_name: mean eastward water velocity in segment
  #   standard_name: barotropic_eastward_sea_water_velocity
  #   units: m s-1
  #   coordinates: time depth latitude longitude
  #
  # water_velocity_northward:
  #   source: m_final_water_vy
  #   long_name: mean northward water velocity in segment
  #   standard_name: barotropic_northward_sea_water_velocity
  #   units: m s-1
  #   coordinates: time depth latitude longitude
profile_variables:
  # variables for extract_L1timeseries_profiles processing step...
  profile_id:
    comment: Sequential profile number within the trajectory. This value is unique in each file that is part of a single trajectory/deployment.
    long_name: 'Profile ID'
    valid_max: 2147483647
    valid_min: 1
  profile_time:
    comment: Timestamp corresponding to the mid-point of the profile
    long_name: Profile Center Time
    observation_type: calculated
    platform: platform
    standard_name: time
  profile_time_start:
    comment: Timestamp corresponding to the start of the profile
    long_name: Profile Start Time
    observation_type: calculated
    platform: platform
    standard_name: time
  profile_time_end:
    comment: Timestamp corresponding to the end of the profile
    long_name: Profile End Time
    observation_type: calculated
    platform: platform
    standard_name: time
  profile_lat:
    comment: Value is interpolated to provide an estimate of the latitude at the mid-point of the profile
    long_name: Profile Center Latitude
    observation_type: calculated
    platform: platform
    standard_name: latitude
    units: degrees_north
    valid_max: 90.0
    valid_min: -90.0
  profile_lon:
    comment: Value is interpolated to provide an estimate of the longitude at the mid-point of the profile
    long_name: Profile Center Longitude
    observation_type: calculated
    platform: platform
    standard_name: longitude
    units: degrees_east
    valid_max: 180.0
    valid_min: -180.0
  u:
    comment: The depth-averaged current is an estimate of the net current measured while the glider is underwater. The value is calculated over the entire underwater segment, which may consist of 1 or more dives.
    long_name: Depth-Averaged Eastward Sea Water Velocity
    observation_type: calculated
    platform: platform
    standard_name: eastward_sea_water_velocity
    units: m s-1
    valid_max: 10.0
    valid_min: -10.0
  v:
    comment: The depth-averaged current is an estimate of the net current measured while the glider is underwater. The value is calculated over the entire underwater segment, which may consist of 1 or more dives.
    long_name: Depth-Averaged Northward Sea Water Velocity
    observation_type: calculated
    platform: platform
    standard_name: northward_sea_water_velocity
    units: m s-1
    valid_max: 10.0
    valid_min: -10.0
  lon_uv:
    comment: Not computed
    long_name: Longitude
    observation_type: calculated
    platform: platform
    standard_name: longitude
    units: degrees_east
    valid_max: 180.0
    valid_min: -180.0
  lat_uv:
    comment: Not computed
    long_name: Latitude
    observation_type: calculated
    platform: platform
    standard_name: latitude
    units: degrees_north
    valid_max: 90.0
    valid_min: -90.0
  time_uv:
    comment: Not computed
    long_name: Time
    standard_name: time
    calendar: gregorian
    units: seconds since 1970-01-01T00:00:00Z
    observation_type: calculated
  instrument_ctd:
    comment: pumped CTD
    calibration_date: "2017-12-24"
    calibration_report: 20171224_Seabird_SlocumCTD_SN9446_calibrations.pdf
    factory_calibrated: "yes"
    long_name: Seabird Glider Payload CTD
    make_model: Seabird GPCTD
    platform: platform
    serial_number: "9446"
    type: platform
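The latitude and longitude entries in the example above carry conversion: nmea2deg. As a hedged sketch of what a conversion of that kind involves, the function below turns an NMEA-style ddmm.mmmm value (whole degrees times 100, plus decimal minutes) into decimal degrees. The name nmea_to_degrees is chosen for this illustration, and PyGlider's own conversion may differ in detail.

```python
import math


def nmea_to_degrees(nmea):
    """Convert an NMEA-style ddmm.mmmm value to decimal degrees.

    Illustrative sketch; PyGlider's own conversion may differ in detail.
    """
    degrees = math.trunc(nmea / 100)        # whole degrees, sign preserved
    minutes = math.fmod(abs(nmea), 100)     # minutes part, always positive
    # apply the sign of the input to the fractional part as well
    return degrees + math.copysign(minutes / 60, nmea)


# 48 degrees 30 minutes north, 125 degrees 30 minutes west (made-up values)
print(nmea_to_degrees(4830.0), nmea_to_degrees(-12530.0))
```

Handling the sign separately matters for western and southern positions, where the minutes must push the value further from zero rather than toward it.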
Example processing script#
import logging
import os
import pyglider.seaexplorer as seaexplorer
import pyglider.ncprocess as ncprocess
import pyglider.utils as pgutils

logging.basicConfig(level='INFO')

sourcedir = '~alseamar/Documents/SEA035/000012/000012/C-Csv/*'
rawdir = './realtime_raw/'
rawncdir = './realtime_rawnc/'
deploymentyaml = './deploymentRealtime.yml'
l0tsdir = './L0-timeseries/'
profiledir = './L0-profiles/'
griddir = './L0-gridfiles/'

# get the data and clean up derived files (set to True to run this step)
if False:
    os.system('rsync -av ' + sourcedir + ' ' + rawdir)
    # clean last processing...
    os.system('rm ' + rawncdir + '* ' + l0tsdir + '* ' + profiledir + '* ' +
              griddir + '* ')

if True:
    # translate the raw SeaExplorer files into individual parquet files
    seaexplorer.raw_to_rawnc(rawdir, rawncdir, deploymentyaml)
    # merge the individual parquet files into a single parquet file
    seaexplorer.merge_parquet(rawncdir, rawncdir, deploymentyaml, kind='sub')
    # make the level-1 timeseries netCDF file from the raw files...
    outname = seaexplorer.raw_to_timeseries(rawncdir, l0tsdir, deploymentyaml, kind='sub')
    # save individual profiles and the time-depth grid
    ncprocess.extract_timeseries_profiles(outname, profiledir, deploymentyaml)
    outname2 = ncprocess.make_gridfiles(outname, griddir, deploymentyaml)
    # quick-look plot of the gridded data
    pgutils.example_gridplot(outname2, './gridplot.png', ylim=[700, 0],
                             toplot=['potential_temperature', 'salinity', 'oxygen_concentration',
                                     'chlorophyll', 'cdom'])
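The clean-up step in the script shells out to rm, which assumes a Unix-like system and that the output directories already exist. A portable alternative using only the standard library might look like the sketch below. The helper reset_output_dirs is hypothetical and not part of PyGlider, and the demonstration runs in a temporary directory so that no real output is touched.

```python
import tempfile
from pathlib import Path


def reset_output_dirs(directories):
    """Create each directory if needed and delete the files directly inside it.

    Hypothetical helper; subdirectories are left alone.
    """
    for directory in map(Path, directories):
        directory.mkdir(parents=True, exist_ok=True)
        for entry in directory.iterdir():
            if entry.is_file():
                entry.unlink()


# Demonstration in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / 'L0-timeseries'
    reset_output_dirs([out])                 # the directory is created
    (out / 'old.nc').write_text('stale')
    reset_output_dirs([out])                 # the stale file is removed
    remaining = list(out.iterdir())
    print(remaining)
```

In the processing script this would be called with rawncdir, l0tsdir, profiledir, and griddir in place of the rm command. Unlike a shell glob, it also removes hidden files.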