Google Location History
Geotagging photos with Google Location History
Geotagging photos can be very useful and while most smartphones take care of this automatically, many modern DSLR cameras do not. Using the approach below, we can tag photos from our library using our google location history assuming that we use the google map service and had our smartphone with us while shooting with the DSLR.
To download google location history in json format, visit google takeout.
Additional considerations:
- Images taken with DSLR set to the correct local time
- Android Phone present while photos were taken
- iPhone with google maps with permission set to “always allow” access to device location
We can start by importing some things that we know we’re going to need. And then we can load the json data with pandas using json_normalize
. By examining the json data, we can see that locations
is probably the root note so we can start with that.
import pandas as pd
import numpy as np
import json
with open ('History.json') as f:
d = json.load(f)
data = pd.json_normalize(d['locations'])
data.head(3)
timestampMs | latitudeE7 | longitudeE7 | accuracy | altitude | activity | velocity | heading | |
---|---|---|---|---|---|---|---|---|
0 | 1496705193030 | 611923950 | -1498647164 | 26 | 70.0 | NaN | NaN | NaN |
1 | 1496705101727 | 611923950 | -1498647164 | 26 | 70.0 | NaN | NaN | NaN |
2 | 1496705041669 | 611923950 | -1498647164 | 26 | 70.0 | [{'timestampMs': '1496705043690', 'activity': ... | NaN | NaN |
Cleaning up and converting number formats
Next we can do some cleanup of data and convert some of the number formats. One thing that is sort of magic about this is that datetime.fromtimestamp seems to automatically recognize the timestamp after it has been converted from milliseconds to seconds. One other thing to mention is that the altitude needs to be an ‘int’ later on but we can start with ‘double’ for now.
Note: There is an ‘activity’ column with varying amounts of nested dictionaries and lists that seem to contain some useful information about the motion of the phone at the time (resting, walking, driving). We’ll skip this data for now since we don’t need it for this purpose. Out of curiousity, I was able to flatten it out into strings in csv format, however there were sometimes misaligned columns and it became increasingly difficult to deal with these.
#output = data[['timestampMs', 'latitudeE7', 'longitudeE7']]
from datetime import datetime
def timefmt(x):
#return datetime.fromtimestamp(int(x)).strftime("%Y-%m-%dT%H:%m")
return datetime.fromtimestamp(int(x))
output = pd.DataFrame()
output['timestampMs'] = data['timestampMs'].astype('float') / 1000
output['timestamp'] = output['timestampMs'].apply(timefmt)
output['latitudeE7'] = data['latitudeE7'].astype('float') / 10000000
output['longitudeE7'] = data['longitudeE7'].astype('float') / 10000000
output['altitude'] = data['altitude'].astype('double')
output.head()
timestampMs | timestamp | latitudeE7 | longitudeE7 | altitude | |
---|---|---|---|---|---|
0 | 1.496705e+09 | 2017-06-05 15:26:33 | 61.12345 | -149.45678 | 70.0 |
1 | 1.496705e+09 | 2017-06-05 15:25:01 | 61.12345 | -149.45678 | 70.0 |
2 | 1.496705e+09 | 2017-06-05 15:24:01 | 61.12345 | -149.45678 | 70.0 |
3 | 1.496705e+09 | 2017-06-05 15:22:36 | 61.12345 | -149.45678 | 70.0 |
4 | 1.496705e+09 | 2017-06-05 15:21:36 | 61.12345 | -149.45678 | 70.0 |
#just checking what we're dealing with
output.dtypes
timestampMs float64
timestamp datetime64[ns]
latitudeE7 float64
longitudeE7 float64
altitude float64
dtype: object
Get the timestamp from the image
We can use the exif package to read the timestamp from an image. We can test it out on a single image here.
#Using exif to read timestamp from image. I tried using exif also to write back the GPS coordinates, however it seemed that GPSPhoto is a much easier option.
from exif import Image
with open("./img/DSC_1850.jpg", 'rb') as image_file:
my_image = Image(image_file)
#list all the information available in the image
print(dir(my_image))
print(' ')
print(my_image.datetime_original)
['_exif_ifd_pointer', '_gps_ifd_pointer', '_segments', 'artist', 'cfa_pattern', 'color_space', 'components_configuration', 'compression', 'contrast', 'copyright', 'custom_rendered', 'datetime', 'datetime_digitized', 'datetime_original', 'digital_zoom_ratio', 'exif_version', 'exposure_bias_value', 'exposure_mode', 'exposure_program', 'exposure_time', 'f_number', 'file_source', 'flash', 'flashpix_version', 'focal_length', 'focal_length_in_35mm_film', 'gain_control', 'get', 'get_file', 'gps_version_id', 'has_exif', 'jpeg_interchange_format', 'jpeg_interchange_format_length', 'light_source', 'make', 'maker_note', 'max_aperture_value', 'metering_mode', 'model', 'orientation', 'photographic_sensitivity', 'pixel_x_dimension', 'pixel_y_dimension', 'reference_black_white', 'resolution_unit', 'saturation', 'scene_capture_type', 'scene_type', 'sensing_method', 'sensitivity_type', 'sharpness', 'software', 'subject_distance_range', 'subsec_time', 'subsec_time_digitized', 'subsec_time_original', 'user_comment', 'white_balance', 'x_resolution', 'y_and_c_positioning', 'y_resolution']
2015:05:27 21:50:22
Find the matching timestamp in the google data
Next, we want to match the timestamp of an image with a timestamp in the google history to find out our coordinates at the time a particular photo was taken. The challenge here is that a timestamp from a photo may not match exactly with our data so we can use pandas .get_loc function to find the nearest match. But first we’ll have to clean up the data a bit by sorting it and removing some duplicate entries as .get_loc requires.
Note that ‘n’ is the row number or index in our sorted data where the match was found.
from datetime import datetime
dt = datetime.strptime(my_image.datetime_original, '%Y:%m:%d %H:%M:%S')
print(dt)
output = output.sort_values(by=['timestamp'], axis=0)
output = output.drop_duplicates(subset=['timestamp'], keep='first')
idx = pd.Index(output['timestamp'])
n = idx.get_loc(dt, method='nearest')
output.iloc[n]
2015-05-27 21:50:22
timestampMs 1.43279e+09
timestamp 2015-05-27 21:49:55
latitudeE7 60.1058
longitudeE7 -149.434
altitude NaN
Name: 397104, dtype: object
Write GPS Exif data back to an image.
We can use a package called GPSPhoto to easily write lat long in decimal format.
Note: GPSPhoto has several dependencies and was a little tricky to satisfy them all. One of the dependencies is ‘PIL’ (Python Image Library) but the newer fork ‘pillow’ will work.
#Now we'll use GPSPhoto to write some GPS coordinates to the image
from GPSPhoto import gpsphoto
photo = gpsphoto.GPSPhoto('./img/DSC_1368.JPG')
info = gpsphoto.GPSInfo((61.123, -148.456), alt=10, timeStamp=dt)
photo.modGPSData(info, './img/DSC_1368.JPG')
Put it all together
#Get list of image files in image directory
from exif import Image
from datetime import datetime
import os
from GPSPhoto import gpsphoto
root = ".\\img"
file_list = []
for path, subdirs, files in os.walk(root):
for name in files:
#print(name)
file_list.append(os.path.join(path, name))
for file in file_list:
with open(file, 'rb') as image_file:
my_image = Image(image_file)
dt = datetime.strptime(my_image.datetime_original, '%Y:%m:%d %H:%M:%S')
#get_loc requires values to be sorted and without duplicates
output = output.sort_values(by=['timestamp'], axis=0)
output = output.drop_duplicates(subset=['timestamp'], keep='first')
idx = pd.Index(output['timestamp'])
n = idx.get_loc(dt, method='nearest')
lat = output.iloc[n]['latitudeE7']
lon = output.iloc[n]['longitudeE7']
altd = output.iloc[n]['altitude']
#Filter out some bad values in the altitude after getting an error.
#This would have been better done in the source data but for now this works.
if altd != np.nan:
if altd == 'NaN':
altd = 0
else:
altd = int(np.int_(altd))
if not 0 < altd < 5000: #sometimes the altitude in google is weird
altd = 0
else:
altd = 0
photo = gpsphoto.GPSPhoto(file)
info = gpsphoto.GPSInfo((lat, lon), alt=altd, timeStamp=dt)
photo.modGPSData(info, file)
print('Modified image: ' + file + ' with lat=' + str(lat) + ' long=' + str(lon) + ' alt=' + str(altd))
Results
Modified image: .\img\2016-01-06_DSC_6237.JPG with lat=61.2064548 long=-149.9151696 alt=0
Modified image: .\img\2016-01-15_DSC_6501.JPG with lat=61.5779418 long=-149.1482996 alt=133
Modified image: .\img\DSC_1850.jpg with lat=60.1057598 long=-149.4343206 alt=0
I did not realize until I did this project but Adobe Lightroom has a map feature that displays photos from the library on a map, so this would be very useful for updating the library for that purpose.