As a data scientist working in the business world, many times have I nodded my head sagely as someone on the other side of the table said smart-sounding sentences about how we needed to account for the working cost of capital as we amortized the operating margins perpendicular to the inventory turns. Or something like that. Those are definitely all terms that I’ve heard, but likely they don’t fit together quite like that.
I imagine that is how data scientists sound to you, business executive. But let me assure you, everything is OK. When you hear our side of the table jargoning away and you get the general drift but aren’t quite sure specifically what we mean, neither are we. For example, we data scientists frequently talk about using machine learning algorithms for some predictive purpose, say, optimizing an ad campaign. Sounds impressive? Well, it’s a lot like saying we’ll cure you of a disease using medicine. Machine learning isn’t a specific technique; it is a wide range of tools, many of which are very different from one another, but all of which are used to predict and classify.
Machine Learning (ML) in a nutshell: ML, a branch of artificial intelligence, enables computers to learn from data and improve over time. In business, it’s a tool for making data-driven decisions, such as optimizing ad campaigns by analyzing past performance and predicting future results to maximize campaign effectiveness.
An Algorithmic Journey
Today’s efforts at ad campaign optimization are SpaceX satellite launches compared to the horse-and-buggy targeting efforts of a generation or two back. In mass media, marketers would rely on focus groups and surveys to identify their target audience; then they could buy ads in markets and on shows that skewed ever so slightly toward that audience. A big chunk of mass media ad spend fell on uninterested ears.
The catalog retailers had it a little better, relying on records of who had ordered before. Back in the 1960s they commonly used RFM targeting: Recency, Frequency, and Monetary value. Imagine a simple world where your customer base is half composed of A-list customers who have a 50% chance of buying in a month and half B-list customers with only a 20% chance of buying. Targeting all your customers would result in 35% of them purchasing. A simplistic version of RFM, let’s say targeting only customers who purchased in each of the last three months, would yield a segment of customers comprising 94% A-listers (I’ll spare you the math).
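For anyone who does want the math, here is a minimal sketch in Python. It assumes that each month’s purchase is an independent event, which the example implies but does not state, and the function and variable names are purely illustrative.

```python
# Figures from the example in the text.
SHARE_A = 0.5   # half of the customer base is A-list
P_A = 0.5       # A-lister's chance of buying in a given month
P_B = 0.2       # B-lister's chance of buying in a given month


def a_list_share_of_segment(months=3):
    """Share of A-listers among customers who bought in each of the last `months` months.

    Assumption (not stated in the text): months are independent of one another.
    """
    a_buyers = SHARE_A * P_A ** months
    b_buyers = (1 - SHARE_A) * P_B ** months
    return a_buyers / (a_buyers + b_buyers)


# Targeting everyone: a blend of the two purchase rates.
overall_rate = SHARE_A * P_A + (1 - SHARE_A) * P_B

# Targeting only three-month repeat buyers.
segment_share = a_list_share_of_segment(months=3)

# Expected purchase rate within that targeted segment.
segment_rate = segment_share * P_A + (1 - segment_share) * P_B

print(f"Purchase rate, all customers:    {overall_rate:.0%}")
print(f"A-listers in the RFM segment:    {segment_share:.0%}")
print(f"Purchase rate, RFM segment only: {segment_rate:.0%}")
```

Under those assumptions the segment is about 94% A-listers, and its expected purchase rate rises from 35% to roughly 48%.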
Over time, the field of “database marketing” grew… not to market databases, but to use databases to market to everyone. Data resellers would scour public records, learning about consumers’ car purchases and home values. Magazine subscription information would be sold to compile profiles of consumers’ interests. Data analysts would pore over this data, manually building predictive models to target the best prospects and screen out those unlikely to buy, boosting ROI primarily by cutting the cost of marketing.
Along came loyalty cards, giving consumers a small discount on some items in exchange for the ability to track all their purchases. E-commerce sites soon followed, where not only purchases were easily tracked but also items that were viewed and never placed in the cart. Recommendation engines started working overtime to find similar or complementary items to increase average order size. “Retargeting” campaigns would market items that seemed to have been of interest but had not been purchased.
And at last we come to the modern day, where we talk about machine learning the way we talk about our cars: something we feel we understand fully, but how many among us can clearly explain the functioning of a camshaft or a differential?
Camshaft in a nutshell: A camshaft is a rotating cylinder used in internal combustion engines to open and close intake and exhaust valves, synchronizing the engine’s operations.
Differential in a nutshell: A differential is a gear mechanism in automobiles that allows the wheels on the same axle to rotate at different speeds, enabling smooth turning.
But we can still drive our cars just fine. And so too with Machine Learning. Exactly how the machine makes its determination of who sees a digital ad is beyond human understanding. Take a moment to think about the vast data held in the gestalt databases of Google, Facebook, Amazon, Apple, and others; what you feel now is probably the same awe that astronomer Carl Sagan sensed as he thought of the cosmos itself!
The volume of data is now far too massive for human analysts to model by hand. Data scientists design machine learning algorithms to look for patterns too complex and subtle for us to understand, but no less real than the demographics and RFM of yesterday. The algorithms toil endlessly, searching the massive haystacks for needles, and finding them even if the needles are made of hay! Down deep the algorithms are using the same fundamental concepts that you did when drawing that “line of best fit” on a scatterplot in that one statistics class from business school, but doing so with rocket boosters.
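For the curious, that line of best fit can be sketched in a few lines of Python. This is plain ordinary least squares on five made-up points; the numbers are purely illustrative and are not campaign data, and the function name is hypothetical. The algorithms running on an ad platform are far more elaborate, but the underlying move, choosing the parameters that make the errors as small as possible, is the one shown here.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative points only (hypothetical, not real campaign data).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.1f}x + {intercept:.1f}")
```

For these five points the fitted line has a slope of 0.6 and an intercept of 2.2.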
Optimizing the Ad
A prospective customer is scrolling on Instagram or reading a blog; an algorithm, in a fraction of a second, matches the user’s data-driven profile with the best-suited ad. The next time your marketing team launches a new campaign for your new bronze-coated reticulating widgets, watch the targeting in action. The digital marketing manager will load the campaign into, say, Facebook, and the frontline metrics, like click-through rate on digital ads, will start low. Hour by hour over the first couple of days, you can watch the reported metric creep up.
Perhaps you are paying $5 for every thousand impressions of your ad. When the click-through rate (CTR) starts at 0.2%, you are concerned. That’s two clicks for every five bucks, but your marketing analyst told you to expect only perhaps one in twenty clicks to yield a sale, and each transaction will yield only about $10 in gross profit. Twenty clicks to make $10, but two clicks cost $5… you do the math and figure you are paying $50 to make $10.
CTR (Click-Through Rate) in a nutshell: CTR is an important metric in digital advertising that is calculated by dividing the number of clicks an ad receives by the number of times it’s shown (impressions). A higher CTR, all other things being equal, indicates a more effective ad.
But the algorithm starts to understand that users who are men in their 20s and early 30s are the more likely responders. The CTR pushes up to 0.5%. Now 1,000 impressions yield 5 clicks, or $1 per click, so just $20 to get a sale. Still losing money, but much improved.
The algorithms peer into the demographics, activities, and interests of those who responded compared to those who didn’t. Subtle relationships emerge. City-dwellers into tech hobbies respond. Users in rural communities who own foreign cars. Californians who travel frequently. The CTR surges to 2%.
Five dollars buys a thousand impressions, which yield 20 clicks, which generate one sale, which is worth $10. The campaign has surged from a disaster to tremendous ROI. Gradually the CTR pushes up toward 2.3% as the algorithms further refine their profiling, and it settles there for the remainder of the campaign.
Well done. Congratulations! (I’m speaking to the algorithm, of course.)
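The campaign arithmetic in this section can be checked with a short Python sketch. The dollar figures and the one-sale-per-twenty-clicks ratio come straight from the example; the function names are illustrative, and the break-even calculation is an addition that simply follows from those same figures.

```python
COST_PER_THOUSAND = 5.00   # dollars per 1,000 impressions (from the example)
CLICKS_PER_SALE = 20       # one sale for every twenty clicks (from the example)
PROFIT_PER_SALE = 10.00    # gross profit per sale in dollars (from the example)


def cost_per_sale(ctr):
    """Ad spend needed to produce one sale at a given click-through rate."""
    clicks_per_thousand = ctr * 1000
    cost_per_click = COST_PER_THOUSAND / clicks_per_thousand
    return cost_per_click * CLICKS_PER_SALE


def break_even_ctr():
    """CTR at which the cost of a sale equals the gross profit from it."""
    return COST_PER_THOUSAND * CLICKS_PER_SALE / (1000 * PROFIT_PER_SALE)


# The four click-through rates mentioned in the walkthrough.
for ctr in (0.002, 0.005, 0.02, 0.023):
    cost = cost_per_sale(ctr)
    net = PROFIT_PER_SALE - cost
    print(f"CTR {ctr:.1%}: cost per sale {cost:6.2f}, net per sale {net:+7.2f}")

print(f"Break-even CTR: {break_even_ctr():.1%}")
```

At the four rates in the story, a sale costs $50, $20, $5, and roughly $4.35. With these numbers the campaign breaks even at a CTR of 1%, which is why 0.5% still loses money and 2% doubles it.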
Finding the Patterns
And with online attribution modeling, algorithms not only optimize the match of ad to prospect but can also shift budget from one channel or platform to another. A step higher up, marketing mix models analyze the patterns over the last two or three years, offering suggestions on strategic marketing and quarterly or annual budget allocation.
Whether the underlying model for any given effort is a random forest or uses gradient boosting or relies on cosine similarity – those are all real terms, by the way, speakers of working capital and internal rate of return – is less important than the outcome: The algorithms find, and implement, patterns. And they do it really well. This is everything you need to understand. So when talking with the marketing team, you can nod your head wisely. At first you faked it, but now you’ve maked it. Congratulations. (This time I’m speaking to you.)
For more columns from Michael Bagalman’s Data Science for Decision Makers series, click here.