GameSparks Developer Portal Analytical ‘Big Numbers’ and the MongoDB Aggregation Framework

My first development task at GameSparks was to implement the functionality in the GameSparks Developer Portal that would provide a number of analysis data aggregates on the Analytics page.  We refer to these aggregate values as the ‘Big Numbers’.  This blog post gives an overview of how I used the MongoDB aggregation framework to calculate these statistics.


The aggregate values provide game development companies with vital metrics about how well their game is performing and indicate levels of engagement within the game’s current user base.  For example Daily Active Users (DAUs) and Monthly Active Users (MAUs) show the unique daily and unique monthly active users respectively.

The Raw Data

First off lets have a look at the raw game event data that the GameSparks platform holds in the Analysis collection for authorisation (“auth”) operations.  These documents indicate that a user has authorised with the GameSparks platform.

"_id" : "5239a180e4b02ae6fed17095",
"userId" : "5237e28be4b0cce02d388e33",
"requestId" : "5239a180e4b02ae6fed17090",
"sessionId" : "5239a17fe4b02ae6fed1708a",
"operation" : "auth",
"startTime" : ISODate("2013-09-18T12:50:08.608Z"),
"os" : "Android OS 4.1.1 / API-16 (JRO03H/eng.lsd.20121116.155032)",
"platform" : "Android",
"deviceId" : "d52716d4c8e51cb03fcae38952ea655d",
"country" : "UK",
"city" : "York"

Other types of operation data are stored in the Analysis collection so the first thing to do is filter the collection on the “operation” field where the value equals “auth”.

db.analysis.aggregate({$match: {operation: "auth"}})

That’s the first part of the aggregation pipeline complete next we’ll add to this pipeline in order to find the Monthly Active Users (MAU) statistic.

Finding the Monthly Active Users for the last 30 days

We now need to extend the matcher to find the last 30 days worth of data using the $gte operation.

{$match: {operation: "auth", startTime: {$gte: ISODate("2013-09-08T00:00:00.000Z")}}})

Then we add a group phase to the pipeline that groups all the userId fields resulting in a list of documents with a unique userId field.

{$match: {operation: "auth", startTime: {$gte: ISODate("2013-08-24T00:00:00.000Z")}}},
{$group: {_id: "$userId"}})

All that remains is to use the $sum operator to give us the total.

{$match: {operation: "auth", startTime: {$gte: ISODate("2013-08-24T00:00:00.000Z")}}},
{$group: {_id: "$userId"}},
{$group: {_id: "", total: {$sum: 1 }}})

The final output from this aggregation contains the Monthly Active Users aggregate value in the “total” field.

{"result": [{ "_id" : "", "total" : 6 }], "ok" : 1}

In my next post I’ll show you how we run these queries from Java using the MongoDB Java Driver.




Who uses GameSparks?