The machine learning process
In my last article, I introduced how machine learning plays an important role in the new analytics renaissance. I discussed how this technology will change gaming by allowing marketers to discover insights locked away in the casino’s databases. Machine learning examines large amounts of data, looks for patterns, and then produces a way to recognize those same patterns in new data. The insights generated can help marketers better predict outcomes. More to the point, machine learning helps you get smarter with the data that you already have. In this article, we’ll examine the machine learning process as a series of executable steps.
Machine learning always begins with data. This data can come from many different parts of the organization (player tracking system, F&B, POS, etc.). It can also take many different forms: player data (gender, age, address, etc.), gaming data (ADT, tier, frequency, LTV, etc.), and promotion data (redemption status, conversion rate, cost of sales, etc.). As a rule, the more relevant data you have to frame the question, the more precise the answer is likely to be.
Choosing the appropriate data to use in the analysis is vitally important. For example, suppose you want to determine with a high degree of confidence whether a promotion will lead to redemption. What data will you need to answer this question? Some things will be obvious, like offer/redeemed status, time of redemption, and offer amount. But other data may also be relevant, such as visit frequency, player worth, or date of last visit. Discovering the most relevant data for a machine learning project is a foundational part of the process, and it requires the domain expertise of a casino marketer.
Whatever data you choose, it’s often not in the correct form to use directly. Instead, machine learning projects typically require some form of pre-processing. In our promotion redemption example, the raw data might need to be “cleaned” in some way if there is missing or incomplete data. Or the data may need to be structured for better analysis, such as linking the data back to gaming activity. The goal of data pre-processing is to create what’s called “prepared data.” Creating prepared data from raw data frequently takes up the majority of the total time spent on a machine learning project.
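To make the idea of prepared data concrete, here is a minimal sketch in Python of the two pre-processing steps just described: cleaning out incomplete records and linking promotion data back to gaming activity. The field names (player_id, offer_amount, adt, visits_last_90_days) and the sample records are assumptions made up for illustration, and dropping incomplete rows is only one of several ways a data scientist might handle missing data.

```python
# Hypothetical raw promotion records; every field name here is an assumption.
raw_offers = [
    {"player_id": 101, "offer_amount": 25.0, "redeemed": "Y"},
    {"player_id": 102, "offer_amount": None, "redeemed": "N"},   # missing amount
    {"player_id": 103, "offer_amount": 50.0, "redeemed": None},  # missing outcome
    {"player_id": 104, "offer_amount": 10.0, "redeemed": "Y"},
]

# Hypothetical gaming activity, keyed by player ID.
gaming_activity = {
    101: {"adt": 180.0, "visits_last_90_days": 6},
    102: {"adt": 75.0, "visits_last_90_days": 2},
    104: {"adt": 320.0, "visits_last_90_days": 11},
}


def prepare(offers, activity):
    """Clean raw offer records and link each one to gaming activity."""
    prepared = []
    for offer in offers:
        # Cleaning: skip records with missing values.
        if offer["offer_amount"] is None or offer["redeemed"] is None:
            continue
        # Linking: skip players with no matching gaming activity.
        player = activity.get(offer["player_id"])
        if player is None:
            continue
        prepared.append({
            "player_id": offer["player_id"],
            "offer_amount": offer["offer_amount"],
            "adt": player["adt"],
            "visits_last_90_days": player["visits_last_90_days"],
            # Encode the outcome as 1 (redeemed) or 0 (not redeemed).
            "redeemed": 1 if offer["redeemed"] == "Y" else 0,
        })
    return prepared


prepared_data = prepare(raw_offers, gaming_activity)
for row in prepared_data:
    print(row)
```

In this sketch, only the records that are complete and can be matched to gaming activity survive, which is why real projects spend so much time deciding how to handle the rest.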
Once the data is in the right shape, the next step is to search for the best answer to the question being asked. To do this, a data scientist will use machine learning algorithms to work with the prepared data. These algorithms typically apply some statistical analysis to the data. They range from relatively common techniques, such as regression, to more complex approaches with exotic technical names like multiclass decision jungle. The data scientist chooses the machine learning algorithm that is most appropriate for the question being analyzed.
When a machine learning algorithm is run on prepared data, the result is referred to as a model. A model is a computer program: a set of steps for recognizing a pattern in data. In our example, the pattern is the combination of factors that leads to a promotion being redeemed. Given new data, the model provides a way to recognize when that pattern is present.
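As an illustration of how running an algorithm on prepared data yields a model, the sketch below fits a logistic regression (one common form of regression for yes/no outcomes) using only Python's standard library. The six rows of data, the two features, and the training settings are all assumptions chosen for the example. A data scientist would normally rely on an established library rather than hand-written training code.

```python
import math

# Hypothetical prepared data:
# (visits in last 90 days, offer amount in $100s) -> redeemed (1) or not (0).
prepared_data = [
    ((1, 0.10), 0),
    ((2, 0.25), 0),
    ((3, 0.10), 0),
    ((6, 0.25), 1),
    ((8, 0.50), 1),
    ((11, 0.25), 1),
]


def sigmoid(z):
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(rows, learning_rate=0.05, epochs=2000):
    """The algorithm: fit weights and a bias by stochastic gradient descent."""
    weights = [0.0] * len(rows[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in rows:
            predicted = sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)
            error = predicted - label
            # Nudge each parameter in the direction that reduces the error.
            weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


def predict_probability(model, features):
    """The model in use: estimate the redemption probability for new data."""
    weights, bias = model
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


# The model is simply the learned weights and bias.
model = train_logistic_regression(prepared_data)

frequent_visitor = predict_probability(model, (9, 0.25))
infrequent_visitor = predict_probability(model, (1, 0.25))
print(frequent_visitor, infrequent_visitor)
```

The distinction to notice is between train_logistic_regression, which is the algorithm, and the weights and bias it returns, which together are the model.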
It’s important to note that machine learning models typically don’t return yes/no answers. Instead, they return a probability between 0 and 1. So, if the model showed that the probability of promotion redemption is 0.8 when certain factors are present within a player’s past redemption history, then a marketer could use that information to select players who exhibit similar characteristics for a future promotion. Knowing what to do when presented with a probability value is a business decision. For example, a marketer could decide to exclude from a future promotion any player whose probability is below 0.5. Such a decision could help save on promotion costs while keeping the focus on those players with the highest probability of conversion.
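The cutoff decision described above can be sketched in a few lines of Python. The player IDs and probabilities are invented for illustration, and the 0.5 cutoff is the business rule from the example, not something the model supplies.

```python
# Hypothetical model output: redemption probability per player ID.
scored_players = {
    "P-1001": 0.82,
    "P-1002": 0.47,
    "P-1003": 0.65,
    "P-1004": 0.12,
}

# The cutoff is a business decision, not something the model provides.
MIN_PROBABILITY = 0.5


def select_for_promotion(scores, cutoff):
    """Return player IDs whose probability meets the cutoff, highest first."""
    selected = [player_id for player_id, p in scores.items() if p >= cutoff]
    return sorted(selected, key=lambda player_id: scores[player_id], reverse=True)


mailing_list = select_for_promotion(scored_players, MIN_PROBABILITY)
print(mailing_list)
```

Raising or lowering MIN_PROBABILITY trades promotion cost against reach, which is exactly the judgment the marketer brings to the process.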
Typically, the first candidate model created isn’t the best one. Instead, the data scientist will try many different combinations of machine learning algorithms and prepared data, searching for the one that produces the best model for the question being asked. Each iterative attempt is an experiment, and most machine learning projects will run many experiments.
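One simple way to compare experiments is to score each candidate model against data it has not seen and keep the one that gets the most predictions right. The sketch below assumes three made-up candidates, written as simple rules standing in for trained models, and a small invented set of held-out records. Accuracy at a 0.5 cutoff is just one possible yardstick.

```python
# Hypothetical held-out rows that the candidates were not built from.
validation_rows = [
    {"visits": 1, "adt": 300.0, "redeemed": 0},
    {"visits": 2, "adt": 60.0, "redeemed": 0},
    {"visits": 7, "adt": 90.0, "redeemed": 1},
    {"visits": 9, "adt": 250.0, "redeemed": 1},
    {"visits": 5, "adt": 40.0, "redeemed": 1},
    {"visits": 3, "adt": 220.0, "redeemed": 0},
]

# Stand-ins for trained models: each maps a row to a redemption probability.
candidates = {
    "visits_only": lambda row: min(row["visits"] / 8.0, 1.0),
    "adt_only": lambda row: min(row["adt"] / 300.0, 1.0),
    "always_redeems": lambda row: 1.0,
}


def accuracy(model, rows, cutoff=0.5):
    """Share of rows where the model's call at the cutoff matches the outcome."""
    correct = sum(
        1 for row in rows if (model(row) >= cutoff) == bool(row["redeemed"])
    )
    return correct / len(rows)


# Each candidate evaluated on the same held-out rows is one experiment.
results = {name: accuracy(model, validation_rows) for name, model in candidates.items()}
best_name = max(results, key=results.get)

for name, score in results.items():
    print(name, round(score, 2))
print("best:", best_name)
```

In this invented data set, visit frequency happens to predict redemption better than ADT does, the kind of finding that only shows up once several experiments are compared side by side.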
Once an effective model is discovered, the last step is deploying the model. Deploying essentially means using the model as part of a process or application that answers the question over and over again. In our promotion redemption example, an application could be created that automatically selects from the player database those players who have a redemption probability of at least 0.8 when certain factors are present.
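A deployed model can be as simple as saved parameters plus a small routine that scores whatever players it is handed. The sketch below assumes a logistic model whose weights were saved as JSON. The weights, field names, and player records are all hypothetical, and a production system would read from the actual player database rather than a list in memory.

```python
import json
import math

# Hypothetical parameters of an already-trained logistic model, saved as JSON.
MODEL_JSON = """
{
  "weights": {"visits_last_90_days": 0.6, "offer_amount": 0.01},
  "bias": -3.0
}
"""


def load_model(text):
    """Restore the saved model parameters."""
    return json.loads(text)


def score(model, player):
    """Estimate the redemption probability for one player record."""
    z = model["bias"] + sum(
        weight * player[name] for name, weight in model["weights"].items()
    )
    return 1.0 / (1.0 + math.exp(-z))


def select_players(model, players, min_probability=0.8):
    """Return the IDs of players at or above the probability cutoff."""
    return [p["player_id"] for p in players if score(model, p) >= min_probability]


# Stand-in for the player database (assumed fields).
player_database = [
    {"player_id": "P-1001", "visits_last_90_days": 10, "offer_amount": 25},
    {"player_id": "P-1002", "visits_last_90_days": 2, "offer_amount": 25},
    {"player_id": "P-1003", "visits_last_90_days": 7, "offer_amount": 50},
]

model = load_model(MODEL_JSON)
selected = select_players(model, player_database)
print(selected)
```

Because the model is loaded rather than retrained, the same selection can be rerun for every new promotion as the player database changes.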
Using machine learning to recognize patterns can help marketers discover insights with much greater speed and efficiency. There is no need to lock your marketers in a room for a week to get answers; partner with a data scientist for a day, and solve the problem in less time and with greater accuracy.