It's All About Semantics, Location History That Is

Some time ago I worked on an RLEAPP parser for the Google Takeout Location History, but when Sarah Hayes asked about it recently, I found that it was a little broken. Adding a few lines of code fixed it right up, but she had another request: parse the Semantic Location History as well.

So I fired up my trusty sample databases from the Magnet CTF and took a look. Ross Donnelly published a great paper on looking at location data from Google Takeout, so I won't go too deep into the details. Basically, the Semantic Location History files can be found at the path:

Takeout\Location History\Semantic Location History\*

Inside the main folder are subfolders for each year that has recorded data (ex. 2021, 2022, etc.). Each year folder is further broken down by month, with one JSON file per month named in the format YYYY_MONTH.json (where the month is spelled out). As Ross states in his paper, the JSON files contain basically two types of items:

  • activitySegment
  • placeVisit
Each has a bunch of data ripe for pulling out via a parser. I used this page as a guide to what the data meant, but overall you can get a sense of most of it from the key names. I chose to grab the following fields from the two items:

activitySegment

  • starting latitude
  • starting longitude
  • ending latitude
  • ending longitude
  • starting timestamp
  • ending timestamp
  • activity type
  • confidence (low, medium, high)
  • activity type probabilities
  • file name (parsed from)

placeVisit

  • latitude
  • longitude
  • placeId
  • address
  • name
  • start timestamp
  • end timestamp
  • file name (parsed from)
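
To make the field lists above a bit more concrete, here is a minimal Python sketch of pulling those values out of the monthly files. This is not the RLEAPP code. The JSON key names (timelineObjects, startLocation, latitudeE7, duration, activities, and so on) are my assumptions about the Takeout layout, and the function and output column names are made up for illustration, so check them against your own export. Timestamps are returned as stored, and both timestamp key spellings are tried on the assumption that they differ between export versions.

```python
import json
from pathlib import Path


def e7_to_degrees(value):
    """Convert an E7 integer coordinate to decimal degrees; None passes through."""
    return None if value is None else value / 1e7


def first_present(mapping, *keys):
    """Return the value of the first key found in mapping, or None."""
    for key in keys:
        if key in mapping:
            return mapping[key]
    return None


def parse_timeline_object(obj, file_name):
    """Flatten one activitySegment or placeVisit into a dict; None otherwise."""
    if "activitySegment" in obj:
        seg = obj["activitySegment"]
        start = seg.get("startLocation", {})
        end = seg.get("endLocation", {})
        duration = seg.get("duration", {})
        return {
            "item": "activitySegment",
            "start_latitude": e7_to_degrees(start.get("latitudeE7")),
            "start_longitude": e7_to_degrees(start.get("longitudeE7")),
            "end_latitude": e7_to_degrees(end.get("latitudeE7")),
            "end_longitude": e7_to_degrees(end.get("longitudeE7")),
            # Assumed key variants; the value is kept exactly as stored.
            "start_timestamp": first_present(duration, "startTimestamp", "startTimestampMs"),
            "end_timestamp": first_present(duration, "endTimestamp", "endTimestampMs"),
            "activity_type": seg.get("activityType"),
            "confidence": seg.get("confidence"),
            "activity_probabilities": [
                (a.get("activityType"), a.get("probability"))
                for a in seg.get("activities", [])
            ],
            "file_name": file_name,
        }
    if "placeVisit" in obj:
        visit = obj["placeVisit"]
        location = visit.get("location", {})
        duration = visit.get("duration", {})
        return {
            "item": "placeVisit",
            "latitude": e7_to_degrees(location.get("latitudeE7")),
            "longitude": e7_to_degrees(location.get("longitudeE7")),
            "place_id": location.get("placeId"),
            "address": location.get("address"),
            "name": location.get("name"),
            "start_timestamp": first_present(duration, "startTimestamp", "startTimestampMs"),
            "end_timestamp": first_present(duration, "endTimestamp", "endTimestampMs"),
            "file_name": file_name,
        }
    return None


def parse_semantic_history(root):
    """Yield rows from every monthly JSON file in the year subfolders of root."""
    for json_path in sorted(Path(root).glob("*/*.json")):
        with open(json_path, encoding="utf-8") as handle:
            data = json.load(handle)
        for obj in data.get("timelineObjects", []):
            row = parse_timeline_object(obj, json_path.name)
            if row is not None:
                yield row
```

Pointing parse_semantic_history at the Semantic Location History folder gives one flat row per item, which is easy to hand to whatever report writer you prefer. Using .get() everywhere means a missing key shows up as an empty value instead of stopping the run.
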
One thing I found in testing is that Google changed some of the field names at some point between the 2020 data and the 2021-2022 data. There is more data to be looked at in these files, and it can get complicated quickly (especially with user manipulation of data points). These are just more reference points of potential location history to add to your investigation.

With that said, I've already accounted for that change, and RLEAPP now has a parser that handles both cases.
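
As a rough idea of what accounting for a change like this can look like, here is a small Python sketch that normalizes a timestamp to UTC regardless of how it was stored. It assumes (and this is my assumption, not something taken from the RLEAPP source) that older files store epoch milliseconds as a string of digits while newer files store a "Z"-suffixed ISO 8601 string. The function name parse_timestamp is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(value):
    """Return a UTC datetime from epoch milliseconds or a 'Z'-suffixed ISO string.

    Both input shapes are assumptions about the export format; anything else
    (for example an ISO string with a numeric offset) raises ValueError.
    """
    if value is None:
        return None
    text = str(value)
    if text.isdigit():
        # Assumed older style: epoch milliseconds, e.g. "1640995200000".
        millis = int(text)
        whole_seconds = datetime.fromtimestamp(millis // 1000, tz=timezone.utc)
        return whole_seconds + timedelta(milliseconds=millis % 1000)
    # Assumed newer style: ISO 8601 in UTC, e.g. "2022-01-01T12:00:00.500Z".
    if text.endswith("Z"):
        text = text[:-1]
    main, _, fraction = text.partition(".")
    parsed = datetime.strptime(main, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    if fraction:
        parsed += timedelta(seconds=float("0." + fraction))
    return parsed
```

Both styles come out as the same kind of timezone-aware value, so rows from 2020 and rows from 2022 can be sorted and compared together on one timeline.
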

Figure 1: Sample Activity Segments

Figure 2: Sample Place Visits