Event Data Modeling Guide

Data is a powerful way to uncover what is important in your application. In order to truly leverage your event data within Advanced Billing, you’ll need to put some thought into the types of things you want to record and how you’ll record them. This data modeling guide will help you get the most out of your data.

Events & Event Data

Our database is optimized to store event data. Events are actions that occur at a point in time. These actions can be performed by a user, an admin, a server, a program, etc. Events have properties. Properties are the bits of data that describe what is happening and allow you to do in-depth analysis. When we talk about “event data” we mean events and all the properties that you send along with them.

Imagine your product sends SMS messages on behalf of your users. Below is an example of a message event and its properties. A timestamp property is automatically included at the top, followed by a set of custom properties such as account, origin and destination country, message length, and more:

{
  "chargify": {
    "timestamp": "2020-03-05T19:10:39.205000"
  },
  "account": {
    "id": "0b64a14a-41fe-11ea-9f5c-3c15c2c04522",
    "org": "Acme, Inc.",
    "since": "2018-01-11T13:01:55",
    "age_in_months": 26
  },
  "origin_country": "US",
  "origin_number_type": "local",
  "origin_number": "+1205518XXXX",
  "destination_country": "AU",
  "destination_number": "+6142314XXXX",
  "destination_carrier": "vodafone",
  "message_length": 240,
  "message_segments": 2
}

This event is sent to Advanced Billing using an HTTP POST request to a URL of the following format:

https://events.chargify.com/SITE_SUBDOMAIN/events/STREAM_HANDLE
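As a rough sketch, the request described above could be assembled in Python as follows. The subdomain "acme" and stream handle "sms_messages" are hypothetical placeholders, and the Content-Type header is an assumption. Authentication is not covered in this guide, so it is left out here. The request is only built, not sent.

```python
import json
import urllib.request

EVENTS_HOST = "https://events.chargify.com"


def build_event_request(site_subdomain, stream_handle, event):
    """Build (but do not send) an HTTP POST request carrying one event as JSON."""
    url = f"{EVENTS_HOST}/{site_subdomain}/events/{stream_handle}"
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        # Assumption: the endpoint expects a JSON body.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical subdomain and stream handle, for illustration only.
request = build_event_request(
    "acme",
    "sms_messages",
    {"origin_country": "US", "destination_country": "AU", "message_length": 240},
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to urllib.request.urlopen would send it, once whatever credentials your site requires have been added.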

Streams

Streams are used to logically organize all the events happening in your application. Events belong in the same stream when they can be described by similar properties. For example, all logins share properties like first name, last name, app version, platform, and time since last login. It makes sense to store all of your logins in a stream called “Logins”.

Logins are just one example of a stream. Here are some more: API calls, SMS messages, purchases, social media shares, comments, saves, exits, upgrades, errors, level-ups, interactive gestures, modifications, views, signups.

Streams can have almost any name, but there are a few rules to follow:

The name must be 256 characters or fewer.
The API handle must be 64 characters or fewer and contain only ASCII characters.
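The two naming rules can be checked before a stream is created. The helper below is a minimal local sketch. The function name is hypothetical and it does not call any Advanced Billing API.

```python
def stream_naming_problems(name, api_handle):
    """Return a list of naming-rule violations (empty when both values are fine)."""
    problems = []
    if len(name) > 256:
        problems.append("name is longer than 256 characters")
    if len(api_handle) > 64:
        problems.append("API handle is longer than 64 characters")
    if not api_handle.isascii():
        problems.append("API handle contains non-ASCII characters")
    return problems


print(stream_naming_problems("Logins", "logins"))
print(stream_naming_problems("Logins", "lögins"))
```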

Best Practices for Streams

Some things to consider when creating your streams:

Events in a stream should have similar properties. For example, all logins share properties like first name, last name, app version, platform, and time since last login.
Streams for a given application share many “global properties”. For example, most events in your application probably share some properties like Account ID. It is a good planning exercise to identify the properties that you want to include in every stream so you can structure them the same way each time.
When possible, minimize the number of distinct Streams. Let’s say you’re an ecommerce platform analyzing purchases across many devices and you want to compare them. You’ve got purchases from multiple versions of your iPhone app and multiple versions of your iPad app. It’s logical to think of creating separate streams for each of them, but it’s not the best way. Instead, consider creating a single stream called Purchases. Each purchase in your stream shares many properties like item description, unit price, quantity, payment method, and customer. Additionally, you can include properties for DeviceType (iPhone, iPad, etc) and Version (2.4A, 2.4B, 1.3).

Since you’re now tracking the DeviceType and Version properties for every purchase, it’s very easy to do the following:

count the total number of purchases across all devices
count the total number of purchases where DeviceType equals “iPhone”
count the total number of purchases for iPhone app version 2.4A.
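To make the three counts above concrete, here is a small in-memory sketch. It stands in for whatever query or filter you would actually run and is not an Advanced Billing API. The sample purchases are invented for illustration.

```python
# Invented sample events from a single "Purchases" stream.
purchases = [
    {"item": "turtleneck", "DeviceType": "iPhone", "Version": "2.4A"},
    {"item": "scarf", "DeviceType": "iPhone", "Version": "2.4A"},
    {"item": "hat", "DeviceType": "iPhone", "Version": "2.4B"},
    {"item": "gloves", "DeviceType": "iPad", "Version": "1.3"},
]


def count_purchases(events, **filters):
    """Count events whose properties match every given filter."""
    return sum(
        1
        for event in events
        if all(event.get(name) == value for name, value in filters.items())
    )


print(count_purchases(purchases))                                       # all devices
print(count_purchases(purchases, DeviceType="iPhone"))                  # iPhone only
print(count_purchases(purchases, DeviceType="iPhone", Version="2.4A"))  # iPhone 2.4A
```

Because every purchase lives in one stream, each question is just a different filter over the same data, which would not be possible if each device and version had its own stream.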

Event Properties

Properties are pieces of information that describe an event and relevant information about things related to that event.

When we talk about events and their properties, we are starting to dig into the art of data science. There is no prescription for what events you should record and what properties will be important for your unique application. Rather, you need to think creatively about what information is important to you now, and what might be important in the future.

While we believe that it can’t hurt to have too much information, we have put some practical limits in place. There cannot be more than 1,000 properties per stream. Exceeding this limit is usually caused by dynamically naming properties, for example by creating a property whose name is the current time. Doing so creates a new property for every event you send, since each event is recorded at a different time!
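The sketch below illustrates why dynamic names use up the property limit. The property names "action" and "occurred_at" are hypothetical, and the helper only counts top-level names locally.

```python
from datetime import datetime, timedelta, timezone


def property_names(events):
    """Collect the distinct top-level property names seen across events."""
    names = set()
    for event in events:
        names.update(event.keys())
    return names


start = datetime(2020, 3, 5, 19, 10, 39, tzinfo=timezone.utc)
times = [start + timedelta(seconds=i) for i in range(5)]

# Anti-pattern: the property NAME is the current time, so every event
# introduces a brand-new property.
dynamic_events = [{t.isoformat(): "login"} for t in times]

# Preferred: fixed property names, with the changing data stored as values.
fixed_events = [{"action": "login", "occurred_at": t.isoformat()} for t in times]

print(len(property_names(dynamic_events)))  # 5
print(len(property_names(fixed_events)))    # 2
```

With dynamic names the property count grows with the number of events. With fixed names it stays constant however many events are sent.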

Here are some things to consider capturing as event properties:

Information about the event itself. If your event is a phone call, what number is being called? How many times did the phone ring? Did someone answer?
Information about the actor performing the event. For example, if you’re recording a user action, what do you know about the user at that point in time? If possible, record their age, gender, location, favorite coffee shop, or whatever else you know that might be useful for analyzing their behavior later.
Information about other actors involved. For example, if your event is a user sharing content with another user, you could record the properties of the recipient. What is their name? To what groups do they belong?
Information about the session. How long had your app been running when this event occurred? Is this the user’s first session?
Information about the environment. What platform? What hardware? What version of your application?
Other relevant information about the “state of the universe”. Think about anything else that might be handy to know later. If you’re making a farming game, record the items in a user’s garden and their coordinates. You might find some interesting usage patterns. Maybe people who spend over $30 all have statues in their garden; maybe you could add more fancy decorations to the game to entice them to spend more?

Though it might seem counter-intuitive and redundant to send the same information (e.g. user info, platform info) with every event, it will make it much easier for you to segment your data later.
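One way to keep shared properties structured identically across streams is to merge them into every event just before sending. This is a minimal sketch. The property names and values are illustrative assumptions, not required fields.

```python
# Hypothetical properties shared by every event in the application.
GLOBAL_PROPERTIES = {
    "account": {"id": "0b64a14a-41fe-11ea-9f5c-3c15c2c04522", "org": "Acme, Inc."},
    "platform": "ios",
    "app_version": "2.4A",
}


def with_global_properties(event, global_properties=GLOBAL_PROPERTIES):
    """Return a new event containing the shared properties plus the event's own.

    Event-specific values win if the same top-level name appears in both.
    """
    return {**global_properties, **event}


login = with_global_properties({"time_since_last_login_hours": 12})
purchase = with_global_properties({"unit_price": 469.50, "quantity": 1})
print(sorted(login))
print(sorted(purchase))
```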

Feel free to add or remove event properties from your code at any time. Advanced Billing will automatically keep track of whatever you send, and your new properties will be available for analysis immediately. Just be aware that changing properties in the middle of a customer’s billing cycle means that property is not available for the whole billing period. So, you may need to collect data for some time (e.g. over a month) before you start using that property within your billing.

Properties all have a name and a value. While they can have almost any name, there are a few rules to follow.

Property Name Rules

Names must be less than 256 characters long.
Names cannot contain periods (.).
Names cannot be null.

Property Value Rules

String values must be less than 10,000 characters long.
Numeric values must be between -2^63 (-9223372036854775808) and 2^63 - 1 (9223372036854775807) (inclusive).
Values in lists must themselves follow the above rules.
Values in dictionaries must themselves follow the above rules.
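The name and value rules above can be checked locally before an event is sent. The following is a sketch only. The function names and messages are hypothetical, and Advanced Billing's own validation may differ in detail.

```python
INT64_MIN = -2**63
INT64_MAX = 2**63 - 1


def value_problems(value, label):
    """Return a list of rule violations for one property value."""
    if isinstance(value, bool):  # bool is a subclass of int, so check it first
        return []
    if isinstance(value, str):
        return [] if len(value) < 10_000 else [f"{label}: string too long"]
    if isinstance(value, (int, float)):
        in_range = INT64_MIN <= value <= INT64_MAX
        return [] if in_range else [f"{label}: number out of range"]
    if isinstance(value, list):  # list items follow the same value rules
        return [p for i, item in enumerate(value)
                for p in value_problems(item, f"{label}[{i}]")]
    if isinstance(value, dict):  # nested properties follow name and value rules
        return [p for name, item in value.items()
                for p in property_problems(name, item, label)]
    return []


def property_problems(name, value, parent=""):
    """Return a list of rule violations for one property name and its value."""
    if name is None:
        return [f"{parent or '<root>'}: property name is null"]
    label = f"{parent}.{name}" if parent else name
    problems = []
    if len(name) >= 256:
        problems.append(f"{label}: name too long")
    if "." in name:
        problems.append(f"{label}: name contains a period")
    return problems + value_problems(value, label)


print(property_problems("account", {"id": "abc", "age_in_months": 26}))
print(property_problems("customer.name", "Francis"))
```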

Property Hierarchy

The nice thing about using JSON as a data format is that you can include LOTS of properties with your events, and you can organize them into a hierarchy.

You can see in the example below that this purchase event has properties that describe the purchase, properties that describe the customer, and properties that describe the store.

The ability to store the properties in this hierarchy makes it much simpler to name the properties. Notice how the customer name and the store name are simply labeled “name”. When you look for these properties in a filter or in your data extract, you’ll find them labeled customer.name and store.name.

{
    "item": "sophisticated orange turtleneck with deer on it",
    "cost": 469.50,
    "payment_method": "Bank Simple VISA",
    "customer": {
        "id": 233255,
        "name": "Francis Woodbury",
        "age": 28,
        "address": {
            "city": "San Francisco",
            "country": "USA"
        }
    },
    "store": {
        "name": "Yupster Things",
        "city": "San Francisco",
        "address": "467 West Portal Ave"
    }
}

This is a simple example — your hierarchy can have as many levels and properties as you want!
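The dotted names that appear in filters and data extracts (customer.name, store.name) correspond to paths through the hierarchy. The sketch below shows that mapping with a hypothetical local helper. It is an illustration of the naming, not how Advanced Billing stores events.

```python
def flatten(event, prefix=""):
    """Map a nested event to {dotted.path: value} pairs."""
    flat = {}
    for name, value in event.items():
        path = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            flat.update(flatten(value, path))  # descend into nested properties
        else:
            flat[path] = value
    return flat


purchase = {
    "item": "sophisticated orange turtleneck with deer on it",
    "customer": {"name": "Francis Woodbury", "address": {"city": "San Francisco"}},
    "store": {"name": "Yupster Things"},
}

for path, value in flatten(purchase).items():
    print(path, "=", value)
```

This mapping is also why property names cannot contain periods: a period inside a name would be indistinguishable from a level in the hierarchy.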

Property Data Types

Events support a variety of data types (integer, string, array, etc.). Advanced Billing automatically infers the data types of your event properties based on the data you send. Some properties, such as timestamp, require you to use a specific property name. Arrays may only contain the supported primitive types, not nested JSON key-value objects.

Inferred Data Types

Advanced Billing automatically infers your event property’s data type. The possible data types are:

string - string of characters
number - integer or decimal
boolean - either true or false
array - collection of data points of like data types

You can easily check and inspect your data’s property types by viewing your “Event Streams”.
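To anticipate which of the four types a value is likely to be given, you can approximate the categories locally. This sketch only mirrors the list above. It is an assumption about the outcome, not Advanced Billing's actual inference logic.

```python
def infer_type(value):
    """Approximate the inferred data type for a single property value."""
    if isinstance(value, bool):  # must come before the number check: bool is an int subclass
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "unknown"  # e.g. nested objects or null, which the list above does not cover


for sample in ["vodafone", 240, 469.50, True, ["US", "AU"]]:
    print(repr(sample), "->", infer_type(sample))
```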

Arrays

You can store arrays as values in Advanced Billing events. However, arrays of objects are not recommended, especially if you anticipate filtering or segmenting your data for analysis and billing.
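A quick local check can flag arrays that contain objects or nested arrays before an event is sent. The helper name is hypothetical and the check is a sketch of the recommendation above, not an official validator.

```python
PRIMITIVES = (str, int, float, bool)


def is_primitive_array(values):
    """True when values is a list whose items are all strings, numbers, or booleans."""
    return isinstance(values, list) and all(isinstance(v, PRIMITIVES) for v in values)


print(is_primitive_array(["sms", "mms"]))                   # plain strings
print(is_primitive_array([{"type": "sms"}, {"type": "mms"}]))  # array of objects
```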

Timestamp Data Type

Two time-related properties are included in your event automatically. The properties chargify.timestamp and chargify.created_at are set at the time your event is recorded. You have the ability to overwrite the chargify.timestamp property. This could be useful, for example, if you are backfilling historical data. Advanced Billing stores all date and time information in UTC!

Here’s an example “pageview” event showing the Advanced Billing timestamp properties:

{
    "chargify": {
        "created_at": "2012-12-14T20:24:01.123000+00:00",
        "timestamp": "2012-12-14T20:24:01.123000+00:00",
        "id": "asd9fadifjaqw9asdfasdf939"
    },
    "device": {
        "OS": "Mac",
        "name": "Chrome",
        "version": 23
    },
    "page": "Intro to Analytics Course Page"
}
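When backfilling historical data, the timestamp can be set explicitly under the chargify key, as in the examples in this guide. The sketch below assumes the historical time is available as a timezone-aware Python datetime. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def with_timestamp(event, occurred_at):
    """Return a copy of event whose chargify.timestamp is occurred_at in UTC."""
    if occurred_at.tzinfo is None:
        raise ValueError("occurred_at must be timezone-aware")
    utc_time = occurred_at.astimezone(timezone.utc)
    backfilled = dict(event)
    # Preserve any other values already under "chargify" and set the timestamp.
    backfilled["chargify"] = {**event.get("chargify", {}), "timestamp": utc_time.isoformat()}
    return backfilled


pacific = timezone(timedelta(hours=-8))
historical = with_timestamp(
    {"page": "Intro to Analytics Course Page"},
    datetime(2012, 12, 14, 12, 24, 1, tzinfo=pacific),
)
print(historical["chargify"]["timestamp"])  # 2012-12-14T20:24:01+00:00
```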

ISO-8601 Format

ISO-8601 is an international standard for representing time data. The format is as follows:

{YYYY}-{MM}-{DD}T{hh}:{mm}:{ss}.{SSS}{TZ}

YYYY: Four digit year. Example: “2012”
MM: Two digit month. Example: January would be “01”
DD: Two digit day. Example: The first of the month would be “01”
hh: Two digit hour. Example: The hours for 12:01am would be “00” and the hours for 11:15pm would be “23”
mm: Two digit minute.
ss: Two digit seconds.
SSS: Milliseconds to the third decimal place.
TZ: Time zone offset. Specify a positive or negative offset from UTC. To specify UTC, add “Z” to the end. Example: To specify Pacific time (UTC-8 hours), you should append “-08:00” to the end of your date string.

Note: Under ISO-8601, a date/time with no time zone specified is assumed to be in local time. Advanced Billing treats such values as UTC.

Example ISO-8601 date strings:

2012-01-01T00:01:00-08:00
1996-02-29T15:30:00+12:00
2000-05-30T12:12:12Z
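The example strings above can be parsed and normalized to UTC with Python's standard library. This sketch mirrors the rule that a value without a time zone is treated as UTC. The function name is hypothetical, and the trailing "Z" is rewritten by hand because older Python versions do not accept it in datetime.fromisoformat.

```python
from datetime import datetime, timezone


def parse_as_utc(text):
    """Parse an ISO-8601 date string and return a timezone-aware UTC datetime."""
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"  # spell out UTC for older Python versions
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # No offset given: treat the value as UTC, as described in the note above.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


for example in [
    "2012-01-01T00:01:00-08:00",
    "1996-02-29T15:30:00+12:00",
    "2000-05-30T12:12:12Z",
]:
    print(example, "->", parse_as_utc(example).isoformat())
```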