### Visualizing GPS tracking data in Python/Jupyter

(Get this Jupyter/ipynb notebook from my Github account)

Did you know you can embed interactive Google Maps and other similar mapping tools directly into your Python notebook? Check it out, as we play with and analyze some GPS tracking data!

Before getting started we can use a command-line shell utility from the GPXPY package to check the summary stats of the data file:

!gpxinfo route1141462231.gpx

File: route1141462231.gpx
Length 2D: 5.941km
Length 3D: 5.941km
Moving time: n/a
Stopped time: n/a
Max speed: 0.00m/s = 0.00km/h
Total uphill: 0.00m
Total downhill: 0.00m
Started: None
Ended: None
Points: 500
Avg distance between points: 11.88m

Track #0, Segment #0
Length 2D: 5.941km
Length 3D: 5.941km
Moving time: n/a
Stopped time: n/a
Max speed: 0.00m/s = 0.00km/h
Total uphill: 0.00m
Total downhill: 0.00m
Started: None
Ended: None
Points: 500
Avg distance between points: 11.88m



GPX files can contain multiple tracks, each with multiple segments, each with multiple tracking points. For this file the above shows only one track, with a single segment, containing 500 data points. The data files can contain time, altitude, and speed information too, and the above summary suggests those values are missing; we'll double-check that below.
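The track/segment/point nesting is visible directly in the XML. Here is a minimal sketch that uses only the standard library on a made-up two-point GPX snippet; the snippet, the `count_points` helper, and the `has_elevation` helper are illustrative assumptions, not part of gpxpy:

```python
import xml.etree.ElementTree as ET

# Hypothetical GPX 1.1 snippet: one track, one segment, two points,
# with no <ele> or <time> children (as in the file discussed here).
GPX_TEXT = """<gpx version="1.1" creator="example" xmlns="http://www.topografix.com/GPX/1/1">
  <trk>
    <name>Example walk</name>
    <trkseg>
      <trkpt lat="47.612796" lon="-122.338898"/>
      <trkpt lat="47.612702" lon="-122.338973"/>
    </trkseg>
  </trk>
</gpx>"""

NS = {"gpx": "http://www.topografix.com/GPX/1/1"}


def count_points(gpx_text):
    """Return one list per track holding the point count of each segment."""
    root = ET.fromstring(gpx_text)
    return [
        [len(seg.findall("gpx:trkpt", NS)) for seg in trk.findall("gpx:trkseg", NS)]
        for trk in root.findall("gpx:trk", NS)
    ]


def has_elevation(gpx_text):
    """True if at least one track point carries an <ele> child."""
    root = ET.fromstring(gpx_text)
    return any(pt.find("gpx:ele", NS) is not None for pt in root.iter("{%s}trkpt" % NS["gpx"]))


print(count_points(GPX_TEXT))
print(has_elevation(GPX_TEXT))
```

For a real file, gpxpy does this bookkeeping for you; the sketch is only meant to show where the counts in the summary come from.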

Now for the Python:

# some of this cell's code came from this link - thank you!
# https://ocefpaf.github.io/python4oceanographers/blog/2014/08/18/gpx

import gpxpy
import pandas as pd

with open('./route1141462231.gpx') as gpx_file:
    gpx = gpxpy.parse(gpx_file)

# Files can have more than one track, which can have more than one segment, which have more than one point...
print('Num tracks: ' + str(len(gpx.tracks)))
track = gpx.tracks[0]
print('Num segments: ' + str(len(track.segments)))
segment = track.segments[0]
print('Num points: ' + str(len(segment.points)))

# Load the data into a Pandas dataframe (by way of a list)
data = []
for point_idx, point in enumerate(segment.points):
    data.append([point.longitude, point.latitude, point.elevation,
                 point.time, segment.get_speed(point_idx)])
columns = ['Longitude', 'Latitude', 'Altitude', 'Time', 'Speed']
df = pd.DataFrame(data, columns=columns)
print(df.head())  # first few rows, as shown in the output below
print('\nNum non-None Longitude records: ' + str(len(df[~pd.isnull(df.Longitude)])))
print('Num non-None Latitude records: ' + str(len(df[~pd.isnull(df.Latitude)])))
print('Num non-None Altitude records: ' + str(len(df[~pd.isnull(df.Altitude)])))
print('Num non-None Time records: ' + str(len(df[~pd.isnull(df.Time)])))
print('Num non-None Speed records: ' + str(len(df[~pd.isnull(df.Speed)])))
print('\nTitle string contained in track.name: ' + track.name)

Num tracks: 1
Num segments: 1
Num points: 500

    Longitude   Latitude Altitude  Time Speed
0 -122.338898  47.612796     None  None  None
1 -122.338973  47.612702     None  None  None
2 -122.338854  47.612698     None  None  None
3 -122.338701  47.612898     None  None  None
4 -122.338664  47.613013     None  None  None

Num non-None Longitude records: 500
Num non-None Latitude records: 500
Num non-None Altitude records: 0
Num non-None Time records: 0
Num non-None Speed records: 0

Title string contained in track.name:
Walked 3.52 mi on 07/01/2016



So we've confirmed the lack of Altitude, Time, and Speed data; the file holds only Lon/Lat points. At least the title string for the track lists total distance and date: the former we can verify with the data; the latter we cannot.
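To compare against the title string programmatically, the distance and date can be pulled out with a regular expression. This is a small sketch; the `parse_title` helper is hypothetical, and it assumes the date is written month/day/year, which the title alone does not confirm:

```python
import re
from datetime import datetime


def parse_title(title):
    """Extract (miles, date) from a title like 'Walked 3.52 mi on 07/01/2016'."""
    match = re.search(r"([\d.]+)\s*mi\b.*?(\d{2}/\d{2}/\d{4})", title)
    if match is None:
        return None
    miles = float(match.group(1))
    # Assumption: the date is month/day/year.
    day = datetime.strptime(match.group(2), "%m/%d/%Y").date()
    return miles, day


miles, day = parse_title("Walked 3.52 mi on 07/01/2016")
print(miles, day.isoformat())
```

With the real data you would pass `track.name` instead of the literal string.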

Meanwhile here are some options to get an interactive map plot of the track:

%matplotlib inline


#### MPLleaflet:

import mplleaflet   # (https://github.com/jwass/mplleaflet)
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.plot(df['Longitude'], df['Latitude'], color='red', marker='o', markersize=3, linewidth=2, alpha=0.4)
#mplleaflet.display(fig=fig)  # shows map inline in Jupyter but takes up full width
mplleaflet.show(fig=fig, path='mpl.html')  # saves to html file for display below (and opens it in a browser)
#mplleaflet.display(fig=fig, tiles='esri_aerial')  # shows aerial/satellite photo
# (I don't actually find the aerial view very helpful as it's oblique and obscures what's on the track.)


#### Folium:

import folium   # (https://pypi.python.org/pypi/folium)
mymap = folium.Map( location=[ df.Latitude.mean(), df.Longitude.mean() ], zoom_start=14)
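for coord in df[['Latitude','Longitude']].values:
    # the loop body is missing from this listing; one small red circle marker per point is assumed
    folium.CircleMarker(location=[coord[0], coord[1]], radius=2, color='red').add_to(mymap)
#mymap   # shows map inline in Jupyter but takes up full width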
mymap.save('fol.html')  # saves to html file for display below


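#### gmplot:

import gmplot   # (https://github.com/vgm64/gmplot)
# the map construction and draw call are missing from this listing; the center and zoom used here are assumed
gmap = gmplot.GoogleMapPlotter(df.Latitude.mean(), df.Longitude.mean(), 14)
gmap.scatter(df['Latitude'], df['Longitude'], 'red', size=7, marker=False)
gmap.draw('gmap.html')  # writes the html file referenced below
# apparently cannot be shown inline in Jupyter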


For better display I saved the above to html files which I'll show in subframes below:

%%HTML
<iframe width="45%" height="350" src="fol.html"></iframe>
<iframe width="45%" height="350" src="mpl.html"></iframe>
<!-- <iframe width="45%" height="350" src="gmap.html"></iframe>  hm, this one dies without a Google API key -->


The Google Maps plot below is just a screenshot showing how the map displayed in my Safari browser outside of Jupyter. There I used the same <iframe width="45%" height="350" src="gmap.html"></iframe> as above, in a separate HTML file (flanked by <HTML><BODY> and </BODY></HTML>), and it worked with no problem. It appears some check on Google's side prevents straightforward anonymous Google Maps calls from within Jupyter, oh well.

%%HTML
<img width="40%" src="gmap.png">

# calculate distances on surface of ellipsoid
from vincenty import vincenty

df['lastLat']=df['Latitude'].shift(1)
df['lastLong']=df['Longitude'].shift(1)
df['dist(meters)'] = df.apply(lambda x: vincenty((x['Latitude'], x['Longitude']), (x['lastLat'], x['lastLong'])), axis = 1) * 1000.

print('Total distance as summed between points in track:')
print('   ' + str(sum(df['dist(meters)'][1:])*0.000621371) + ' mi')
# The [1:] above skips the first row, where "shift" leaves lastLat/lastLong as NaN.
print('Comparing to total distance contained in track.name: ' + track.name)

Total distance as summed between points in track:
3.69802674803 mi
Comparing to total distance contained in track.name:
Walked 3.52 mi on 07/01/2016
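As a rough cross-check on the point-to-point sum, the same idea can be written without pandas or the vincenty package. This sketch swaps in the haversine formula on a sphere (not the ellipsoidal Vincenty method used above), so small differences from the ellipsoidal result are expected; the helper names and the mean Earth radius constant are my own choices:

```python
import math

EARTH_RADIUS_M = 6371008.8   # mean Earth radius in meters (spherical approximation)
METERS_PER_MILE = 1609.344


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def path_length_m(points):
    """Sum the distances between consecutive (lat, lon) points."""
    return sum(haversine_m(*p, *q) for p, q in zip(points, points[1:]))


# First three points of this track, taken from the dataframe preview.
points = [
    (47.612796, -122.338898),
    (47.612702, -122.338973),
    (47.612698, -122.338854),
]
total_m = path_length_m(points)
print(total_m, total_m / METERS_PER_MILE)
```

Pairing each point with its successor via `zip(points, points[1:])` plays the same role as the `shift(1)` columns in the dataframe version, without producing a NaN first row.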



That 0.18 mile difference above is about 287 meters. There may have been a different instrument or method used by whoever entered that 3.52 into the title string. We also did not take the elevation changes into account in the distance calculation (the elevation changes were available in that separate download link on the MapMyRun page and I didn't incorporate them into the dataframe here).
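The unit conversion and the size of an elevation effect can be checked with a few lines. This is only a sketch: the `length_3d` helper and the 1 m climb are hypothetical illustration values, not measurements from this track.

```python
import math

METERS_PER_MILE = 1609.344   # international mile

summed_mi = 3.69802674803    # total from the point-to-point sum
title_mi = 3.52              # distance quoted in track.name

gap_m = (summed_mi - title_mi) * METERS_PER_MILE
print(round(gap_m, 1))


def length_3d(horizontal_m, climb_m):
    """Length of a straight leg given its horizontal run and elevation change."""
    return math.hypot(horizontal_m, climb_m)


# Hypothetical leg: the 11.88 m average point spacing with an assumed 1 m climb.
extra_m = length_3d(11.88, 1.0) - 11.88
print(extra_m)
```

A 3D length is never shorter than its horizontal projection, so folding elevation into the sum here could only increase it, and by a small amount for gentle slopes.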

However, there's also a very interesting phenomenon going on in the GPS data as seen in those maps above. That walk in downtown Seattle is right among the tallest skyscrapers in the city, and GPS signals are known to reflect off such buildings (multipath), which distorts the computed positions in downtown areas. There are papers about this; it's not a trivial matter. It's not noise that you can simply filter out but a spatially- and temporally-varying bias, because it depends not only on where you are standing with your GPS unit downtown but also on where the GPS satellites are in their trajectories. Here are a few interesting examples from the scientific/engineering literature about this issue: