Automation for oceanographers

Posted on Fri 07 January 2022 in FOSS

As a physical oceanographer, I occasionally spend chunks of time at sea. Typically this will be aboard a scientific research vessel with very limited shoreside connectivity. In order to keep oceanographic data flowing, I have developed several hacky scripts to perform routine analysis and transfer data in an efficient manner.

Read emails in Python

This script periodically checks a mailbox for emails and takes an action if they match certain criteria. I developed it to read automated emails of glider locations and add them to a database.

import json
import imaplib
import email
import email.utils
from datetime import datetime
import time
import os
from pathlib import Path


def read_email_from_gmail():
    # check what time email was last checked
    timefile = Path("lastcheck.txt")
    if timefile.exists():
        with open("lastcheck.txt", "r") as variable_file:
            for line in variable_file.readlines():
                last_check = datetime.fromisoformat(line.strip())
    else:
        last_check = datetime(1970,1,1)
    # Write the time of this run
    with open('lastcheck.txt', 'w') as f:
        f.write(str(datetime.now()))
    # Check gmail account for emails
    mail = imaplib.IMAP4_SSL('imap.gmail.com')
    mail.login("youremail@gmail.com", "password")
    mail.select('inbox')

    result, data = mail.search(None, 'ALL')
    mail_ids = data[0]

    id_list = mail_ids.split()
    first_email_id = int(id_list[0])
    latest_email_id = int(id_list[-1])
    # Cut to last 10 emails
    if len(id_list) > 10:
        first_email_id = int(id_list[-10])

    # Check which emails have arrived since the last run of this script
    unread_emails = []
    for i in range(first_email_id,latest_email_id+1):
        result, data = mail.fetch(str(i), '(RFC822)')

        for response_part in data:
            if isinstance(response_part, tuple):
                msg = email.message_from_bytes(response_part[1])
                date_tuple = email.utils.parsedate_tz(msg['Date'])
                if date_tuple:
                    local_date = datetime.fromtimestamp(
                        email.utils.mktime_tz(date_tuple))
                    if local_date > last_check:
                        unread_emails.append(i)
    # Exit if no new emails
    if not unread_emails:
        with open('mqtt-log.txt', 'a') as f:
            f.write(str(datetime.now()) + ' no new mail' + '\n')
        return

    # Check new emails
    for i in unread_emails:
        result, data = mail.fetch(str(i), '(RFC822)')
        for response_part in data:
            if isinstance(response_part, tuple):
                msg = email.message_from_bytes(response_part[1])
                email_subject = msg['subject']
                email_from = msg['from']
                #print ('From : ' + email_from + '\n')
                #print ('Subject : ' + email_subject + '\n')
                # If the sender is from the UEA domain and the subject is GPS, pass to glider_loc
                sender_address = email.utils.parseaddr(email_from)[1]
                if sender_address.endswith('uea.ac.uk') and email_subject == 'GPS':
                    # The function glider_loc takes the glider location and relays it over MQTT
                    glider_loc(data, email_from)
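
The glider_loc function itself lives in a separate script, so it isn't shown here. As a minimal sketch of the idea, and assuming a hypothetical body format, topic name and broker (the real parsing depends on how the glider emails are laid out), it pulls the position out of the message and publishes it with paho-mqtt:

import email
import json
import re

import paho.mqtt.publish as publish


def glider_loc(data, email_from):
    # Hypothetical sketch: assumes a plain-text body with lines like "lat: -64.85" and "lon: -120.30"
    body = email.message_from_bytes(data[0][1]).get_payload(decode=True).decode()
    lat = float(re.search(r"lat:\s*(-?\d+\.?\d*)", body).group(1))
    lon = float(re.search(r"lon:\s*(-?\d+\.?\d*)", body).group(1))
    payload = json.dumps({"from": email_from, "lat": lat, "lon": lon})
    # Topic name and broker address are placeholders
    publish.single("gliders/locations", payload, hostname="localhost")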

Note: to avoid getting locked out by Gmail, I recommend enabling 2FA and creating an app password for this script to use.
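
To keep the app password out of the script itself, one option (just a sketch, not what the script above does) is to read the credentials from environment variables:

import imaplib
import os

# Assumed variable names; set them in the shell before running, e.g.
#   export GMAIL_USER="youremail@gmail.com"
#   export GMAIL_APP_PASSWORD="abcd efgh ijkl mnop"
mail = imaplib.IMAP4_SSL('imap.gmail.com')
mail.login(os.environ["GMAIL_USER"], os.environ["GMAIL_APP_PASSWORD"])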

Automated emails for data transfer

The simplest method I have found for sending automated emails is to install nullmailer on a Linux box and then run a short shell script.

#!/bin/bash
mv /home/pilot/data-to-nbp/most-recent /home/pilot/data-to-nbp/dives-`date +"%Y-%m-%dT%H:%M"`
mv /home/pilot/data-to-nbp/dives-to-nbp.zip /home/pilot/data-to-nbp/dives-to-nbp.zip-`date +"%Y-%m-%dT%H:%M"`
mkdir /home/pilot/data-to-nbp/most-recent
find /home/sg**/p*.mat -mtime -0.25 -exec cp {} /home/pilot/data-to-nbp/most-recent  \;
zip -rj /home/pilot/data-to-nbp/dives-to-nbp.zip /home/pilot/data-to-nbp/most-recent
echo "data last 6 hours" | mail -s "data4u" email@provider -A  /home/pilot/data-to-nbp/dives-to-nbp.zip
printf '\n%s' "$(date "+%Y-%m-%dT%H:%M:%S")" >> /home/pilot/data-to-nbp/transfer.log
printf ", transferred data" >> /home/pilot/data-to-nbp/transfer.log

This script performs several useful tasks. Here's a line by line breakdown:

  • Archives the folder /home/pilot/data-to-nbp/most-recent with a timestamp
  • Archives the previously sent zip file
  • Creates a fresh most-recent directory
  • Finds files matching a certain pattern created in the last 6 hours and copies them to that directory
  • Zips the files
  • Emails the zip file to a recipient
  • Logs that the transfer was successful
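
How the script is scheduled isn't shown above; assuming the usual cron approach, an entry along these lines (the script path is a placeholder) would run it every six hours to match the six-hour window used by find:

# m h dom mon dow  command
0 */6 * * * /home/pilot/scripts/send-dives.sh >> /home/pilot/data-to-nbp/cron.log 2>&1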

Read locations from Argos tags

This Python script accesses the Argos web portal through a dedicated web API. This enables automated access to Argos tag locations.

import datetime
import json
import os
import zeep
import xmltodict

wsdl = "http://ws-argos.cls.fr/argosDws/services/DixService?wsdl"
tag_number = 123456  # placeholder: the ID of the Argos tag to query

client = zeep.Client(wsdl=wsdl)
resp_xml = client.service.getXml(username="argos username", password="argos password", nbPassByPtt=100,
                                 nbDaysFromNow=20,
                                 displayLocation="true", displayRawData="true",
                                 mostRecentPassages="true", platformId=str(tag_number))

resp_dict = xmltodict.parse(resp_xml)
bar = resp_dict['data']
# Only some records have valid locations
if 'program' not in bar.keys():
    raise SystemExit  # nothing to do if the response contains no passes
baz = bar['program']
b = baz['platform']
b0 = b['satellitePass']


for b1 in b0:
    if 'location' not in b1.keys():
        continue
    argo_dict = b1['location']
    location_date = datetime.datetime.strptime(argo_dict['locationDate'], '%Y-%m-%dT%H:%M:%S.%fZ')
    location_tag_number = int(tag_number)
    location_longitude = float(argo_dict['longitude'])
    location_latitude = float(argo_dict['latitude'])
    location_quality = argo_dict['locationClass']
    location_altitude = float(argo_dict['altitude'])
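
The loop above only parses the fields; in my setup they go on to a database, but as a minimal sketch of persisting each fix (the filename, column names and helper function here are hypothetical), each iteration could append a row to a CSV:

import csv
from pathlib import Path


def append_fix(tag, date, lon, lat, quality, altitude, outfile="argos_locations.csv"):
    # Append one Argos fix to a CSV, writing a header row the first time
    new_file = not Path(outfile).exists()
    with open(outfile, "a", newline="") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(["tag", "datetime", "longitude", "latitude", "quality", "altitude"])
        writer.writerow([tag, date.isoformat(), lon, lat, quality, altitude])

Calling append_fix(location_tag_number, location_date, location_longitude, location_latitude, location_quality, location_altitude) at the end of the loop body keeps a running log of fixes.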

GDAL for creating webtiles

A short bash script that takes any input geotiff and creates webtiles for use with Leaflet maps. This is how I generated the ice maps for the nbp2202map. Credit to Li Ling for figuring out how to warp the geotiffs to a usable projection.

PATH=$PATH:/home/callum/anaconda3/envs/geospatial/bin
infile=input_filename.tif
gdalwarp -t_srs EPSG:4326 -te -140 -76 -90 -66 $infile liproj.tif
gdal_translate -of vrt -expand rgba liproj.tif li.vrt
gdal2tiles.py li.vrt AMSR  --zoom 1-9

Line by line:

  • Add the Anaconda environment that has cartopy installed to the path. In my experience this is the easiest way to reliably install GDAL
  • Specify input file
  • Warp the input file to EPSG:4326 (lazy, unprojected data)
  • Colour the input file to RGBA
  • Create webtiles at set zoom levels
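
For a quick local check of the tiles before wiring them into the Leaflet site, folium (Leaflet's Python wrapper) can load them. This is only a sketch and assumes the tiles were generated in the standard XYZ layout (gdal2tiles' --xyz flag); the default TMS layout needs Leaflet's tms option instead:

import folium

# Centre roughly on the extent used in the gdalwarp call above
m = folium.Map(location=[-71, -115], zoom_start=3)
folium.TileLayer(
    tiles="AMSR/{z}/{x}/{y}.png",  # relative path to the gdal2tiles output
    attr="AMSR sea ice",           # placeholder attribution
    name="sea ice",
    overlay=True,
    max_zoom=9,
).add_to(m)
folium.LayerControl().add_to(m)
m.save("preview.html")  # open this next to the AMSR folder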

Other handy scripts

  • ADCP GNSS mash: a Python script that combines two timestamped datasets from an autonomous platform to add location information to ADCP data. Includes parsing NMEA, manipulating files and using datetime (a minimal sketch of the timestamp-matching idea follows this list)
  • webscraping: a nice little example of scraping data from GitHub
  • geotiff-generator: a Python program to generate geotiffs from EMODnet, GEBCO or user-supplied bathymetry. Includes taking user input from the command line, stitching together EMODnet netCDFs and working with tri-band rasters
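
For the ADCP GNSS mash, the core timestamp-matching idea is simple enough to sketch here (file names and column names are assumptions; the real script also parses NMEA): interpolate the GNSS positions onto the ADCP sample times.

import numpy as np
import pandas as pd

# Assumed inputs: both files have a "time" column; the GNSS file also has "lat"/"lon"
gnss = pd.read_csv("gnss.csv", parse_dates=["time"])
adcp = pd.read_csv("adcp.csv", parse_dates=["time"])

# Convert datetimes to seconds since the epoch so np.interp can work on them
gnss_t = gnss["time"].astype("int64") / 1e9
adcp_t = adcp["time"].astype("int64") / 1e9

# Linearly interpolate positions onto the ADCP sample times
adcp["lat"] = np.interp(adcp_t, gnss_t, gnss["lat"])
adcp["lon"] = np.interp(adcp_t, gnss_t, gnss["lon"])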

Tools used

These scripts use Python and/or bash. The Python stuff probably works on Windows, but all were developed on Linux. For more tools, check out my toolbox.