Archive for the ‘Data Mining’ Category

Defining Success: Lift, Support, and Confidence

Tuesday, July 22nd, 2008

 

I want to take a minute and build off of Matt’s post from yesterday. While lift, confidence, and support may sound like terms that are more applicable to therapy sessions, they are actually the metrics that we use to convey the trust we have in our models. Many people we talk with are familiar with these terms on a certain level, but when pressed, their understanding boils down to the following: higher is usually good and lower is usually bad. I wanted to use this post to define these metrics a little more thoroughly and talk about how they’re calculated. Hopefully, readers will come away with a better understanding of what they mean and exactly how they’re used.

Support

In order to talk about more complex terms like lift and gain, we need to first start with the basics: support. Support, sometimes referred to as the cover, is the number of data points (customers, transactions, etc.) that meet a set of rules and/or assumptions. If I do a market basket analysis and find that customers who buy milk also buy cereal, the support would be the number of customers in the sample set where this holds true. Obviously, you can only judge the value of the support number when you know the size of the total sample population, which is why we have our next metric: confidence.

Confidence

Since a rule with a support of 900 looks good when the sample size is 1,000 and not so good when the sample size is 1,000,000, we need a way to easily figure out whether or not our support is significant. Confidence is a ratio that takes the support number and divides it by the number of instances where the rule may hold true (or to be more exact - where the antecedent of our rule holds true). For instance, in our milk/cereal example above, confidence would be the total number of customers who bought milk and cereal divided by the total number of customers that bought milk. While it’s true that the higher the confidence the more reliable the rule, it is important to note that you need the total sample size and the support value as well as the confidence to get an accurate picture of the rule’s significance with regard to the total population.
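To make the two definitions concrete, here is a minimal Python sketch of support and confidence for the milk/cereal rule. The eight baskets are made up for illustration, and the function names are my own, not taken from any particular data mining package.

```python
# Hypothetical market baskets; each set is one customer's purchases.
baskets = [
    {"milk", "cereal"},
    {"milk", "cereal", "bread"},
    {"milk", "bread"},
    {"cereal"},
    {"milk", "cereal"},
    {"bread"},
    {"milk"},
    {"eggs", "bread"},
]


def support(baskets, antecedent, consequent):
    """Number of baskets that contain both the antecedent and the consequent."""
    return sum(1 for b in baskets if antecedent in b and consequent in b)


def confidence(baskets, antecedent, consequent):
    """Support divided by the number of baskets that contain the antecedent."""
    antecedent_count = sum(1 for b in baskets if antecedent in b)
    if antecedent_count == 0:
        return 0.0  # the rule never applies, so there is nothing to measure
    return support(baskets, antecedent, consequent) / antecedent_count


rule_support = support(baskets, "milk", "cereal")
rule_confidence = confidence(baskets, "milk", "cereal")
print(rule_support, rule_confidence)
```

In this toy sample, five customers bought milk and three of those also bought cereal, so the support is 3 and the confidence is 3 divided by 5.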

Benchmark

I define benchmark here because it makes it easier to explain both gain and lift.  Benchmark is the total number of items (customers, transactions, etc.) that meet an outcome divided by the total number of items in the database. Let’s go back to the milk/cereal example. Since cereal is the outcome that we are trying to predict, the benchmark would be the total number of transactions where cereal was purchased over the total number of transactions in the database. In layman’s terms, if we were randomly picking 100 items out of the database, it is the percentage of those items where the outcome would hold true. Benchmark is valuable because it puts a lower bound on the value of a model. If a model can do better than the benchmark value, then it provides real value to the customer.
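As a quick sketch, the benchmark for cereal can be computed like this in Python. The baskets are hypothetical and the function name is just one I picked for the example.

```python
# Hypothetical market baskets; each set is one transaction.
baskets = [
    {"milk", "cereal"},
    {"milk", "cereal", "bread"},
    {"milk", "bread"},
    {"cereal"},
    {"milk", "cereal"},
    {"bread"},
    {"milk"},
    {"eggs", "bread"},
]


def benchmark(baskets, outcome):
    """Share of all transactions in which the outcome occurs."""
    return sum(1 for b in baskets if outcome in b) / len(baskets)


cereal_benchmark = benchmark(baskets, "cereal")
print(cereal_benchmark)
```

Cereal shows up in four of the eight made-up transactions, so a random pick would find cereal half the time. Any rule that predicts cereal has to beat that to be worth anything.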

Lift

The most common term that is used in statistics and especially analytics is lift. Lift is a way to measure how much better a model is than the benchmark. It is defined as the confidence divided by the benchmark, and any value that is greater than one suggests that there is some usefulness to the rule. Many applications show lift in a chart. In these instances, the total population is divided into deciles - ten even groups - into which members are placed based on their predicted probability of response. The highest responders are put into decile 1, etc. Lift is then calculated for each of these deciles and plotted on a line chart.
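Here is a rough Python sketch of both uses of lift. The first function is the plain ratio for a single rule, fed with a hypothetical confidence of 0.6 and benchmark of 0.5. The second computes lift per decile from a list of (predicted probability, responded) pairs. The scored records are invented, and the sketch assumes at least ten records and at least one responder.

```python
def rule_lift(confidence, benchmark):
    """Lift of a rule: confidence divided by the benchmark."""
    return confidence / benchmark


def decile_lift(scored):
    """Lift per decile; decile 1 holds the highest predicted probabilities.

    scored is a list of (predicted_probability, responded) pairs, where
    responded is 1 or 0. Assumes len(scored) >= 10 and some responders.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    overall_rate = sum(responded for _, responded in ranked) / n
    lifts = []
    for d in range(10):
        group = ranked[d * n // 10:(d + 1) * n // 10]
        rate = sum(responded for _, responded in group) / len(group)
        lifts.append(rate / overall_rate)
    return lifts


# Hypothetical model output: 20 customers, 6 of whom actually responded.
scored = [
    (0.95, 1), (0.90, 1), (0.85, 1), (0.80, 0), (0.75, 1),
    (0.70, 0), (0.65, 0), (0.60, 1), (0.55, 0), (0.50, 0),
    (0.45, 0), (0.40, 0), (0.35, 1), (0.30, 0), (0.25, 0),
    (0.20, 0), (0.15, 0), (0.10, 0), (0.05, 0), (0.00, 0),
]

single_rule_lift = rule_lift(0.6, 0.5)
lifts = decile_lift(scored)
for decile, lift in enumerate(lifts, start=1):
    print(decile, round(lift, 2))
```

A model that ranks well shows lift well above one in the first deciles and below one in the last ones, which is the downward-sloping line you usually see on a lift chart.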

Hopefully, this provided a little more insight into how we calculate the value of a model. Next time, I’ll run through a complete example to show how these are calculated in practice.

Analyzing Survey Data to Find Critical Factors

Wednesday, July 16th, 2008

Many companies use surveys to get a handle on how customers perceive them and to find areas in which they can improve. Oftentimes, though, these surveys produce a lot of data but not a lot of insight. After all, if you ask your customers to rate you in 30 different product or service areas, how do you know which are critical and which are just nice-to-haves?

The key to analyzing survey data to find which areas are important is to be sure the survey has at least one all-encompassing “outcome” question that identifies whether you are successful in meeting the customer’s needs. This is usually something like “How would you rate our product/service overall?” or “Would you recommend us to a friend?”. We then use the responses to the other questions to find which individual product or service areas most directly affect the outcome.

This is most easily done using common data mining techniques. Using logistic regression or regression trees, it becomes easy to find the two or three individual areas that drive the overall customer perception of the company. For example, we might find that of the 100 different attributes available, a restaurant’s overall rating is driven primarily by speed of service, staff friendliness, and location.
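As a simplified illustration of the regression tree idea, the Python sketch below looks only at the first split such a tree would consider. For each survey question it finds the threshold that best separates high overall ratings from low ones, then ranks the questions by how much error that split removes. The survey responses and field names are invented, and a real analysis would use a full tree or logistic regression from a statistics package rather than this one-level stand-in.

```python
def sse(values):
    """Sum of squared errors around the mean of values."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split_gain(xs, ys):
    """Largest drop in SSE of ys from splitting on a single threshold of xs."""
    total = sse(ys)
    best = 0.0
    for threshold in sorted(set(xs))[:-1]:  # skip the max so both sides are non-empty
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        best = max(best, total - sse(left) - sse(right))
    return best


def rank_drivers(responses, outcome):
    """Rank every non-outcome question by its best single-split gain."""
    ys = [r[outcome] for r in responses]
    gains = {}
    for attribute in responses[0]:
        if attribute == outcome:
            continue
        xs = [r[attribute] for r in responses]
        gains[attribute] = best_split_gain(xs, ys)
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical restaurant survey, all questions rated 1 to 5.
responses = [
    {"speed_of_service": 5, "friendliness": 4, "decor": 3, "overall": 5},
    {"speed_of_service": 4, "friendliness": 5, "decor": 2, "overall": 4},
    {"speed_of_service": 2, "friendliness": 3, "decor": 5, "overall": 2},
    {"speed_of_service": 1, "friendliness": 2, "decor": 3, "overall": 1},
    {"speed_of_service": 5, "friendliness": 3, "decor": 1, "overall": 5},
    {"speed_of_service": 2, "friendliness": 4, "decor": 4, "overall": 3},
    {"speed_of_service": 4, "friendliness": 2, "decor": 5, "overall": 4},
    {"speed_of_service": 1, "friendliness": 1, "decor": 2, "overall": 1},
]

ranked = rank_drivers(responses, "overall")
for attribute, gain in ranked:
    print(attribute, round(gain, 3))
```

In this made-up data, speed of service comes out far ahead of the other two questions, which is the kind of signal that tells you where to focus.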

Armed with information about the few attributes of your business which define your customers’ perception, you can better focus your resources to drastically improve the customer experience. Otherwise, it’s too easy to find yourself sweating over the unimportant details.

Some Examples of Personalized Marketing

Monday, July 14th, 2008

Someone asked me the other day, in response to my assertion that one-to-one marketing on a massive scale was the wave of the future, how a company could possibly send out so many personally-tailored emails. Being in the local Irish Pub, The Burren, I almost laughed Guinness out of my nostrils. But I couldn’t avoid the underlying message. One-to-one marketing never really has been embraced because no one really thinks that they do customer segmentation very well; they believe there are too many obstacles to customer segmentation for it to be entirely useful. Ultimately, this means that few believe they have homogenous enough segments to deliver the personalized goods.

What this also means is that one-to-one marketing is complex due to the fallacies of profiling. I once worked at a company that had such in-depth profiles for each segment that the profiles read like Faulknerian novels. At this company, I learned that our target female customer in the 35-40 range probably once wanted to visit France but was now stuck with two kids in middle America and made meatloaf once a month for a husband she rarely saw. She obviously consoled herself by buying our software.

What’s my point with all this? This kind of profiling is for low-transaction sales, nothing more. Direct marketing units with high transaction rates should never take the tack of email blasting a segment based on their demographics. Never mind writing fanciful biographies for said segment. Instead, direct marketing should ignore demographic profiles and concentrate on profiles that accomplish an immediate business goal (see below). Given the immediate needs that direct marketing normally serves, it needs to have a shortsighted, tactical approach, not the strategic approach that profiling represents. Below I look at the goal of getting rid of an overstock of shorts via the email channel. In doing so, I explore two important dimensions of personalization: what the segment is willing to buy cross-referenced by when that segment most likely opens email.

The Group that Will Likely Buy Shorts Next

The truth is, the customer is not out there to buy from your company. They’re out there to purchase the product they want next and you’re merely there as a direct marketer to insinuate yourself into the buying equation. So which segment of your customers is likely to buy shorts next? That’s the group you want to reach when your shorts have been sitting in inventory for way too long and the leaves are already falling from the trees. So is this a profiling problem? In other words, is it time to blast every demographic who might wear shorts? I suppose you could. But then you’re likely to turn some people off. If you ran your customers’ past transactions through a classification data mining task, what you’d come up with is a list of people who are likely to buy discounted shorts at that time of the year. In fact, you’d probably come up with a few segments that demonstrate such a propensity. And they would definitely cut across your demographic profiles. You’ll have some moms buying shorts for their sons and some dads buying shorts for next summer’s Hawaii trip.

When Is the Best Time to Reach My Shorts Group?

Almost everyone out there sends me email blasts on Tuesdays and Thursdays. Why? Well, the general belief is that it adheres to the customers’ work/open schedule. I have seen elsewhere that most emails are opened on Sundays. That’s a compelling argument. But I tend to believe that each of your potential shorts purchasers has a more personalized schedule as to when they open and read emails. And that leads to the answer to the question in the subtitle. There are many times to best reach your customers who will buy your clearance shorts. The best web article I’ve read on this is by Bill Nussey of silverPOP who argues for tuning your send times per customer based on their last-recorded response. Couldn’t agree more. In fact, I believe that timing is the hidden axis of personalization. I would actually alter Nussey’s belief just slightly. And that’s simply to say that I would average their responses - and give the most recent responses just a bit more weight - to triangulate on the time your shorts buyers are most likely to open your email. For ease of use, you can bucket this into days or half-days so you don’t have to schedule an email every hour. If you record response data to your email blasts (opens), then this really shouldn’t be a problem.
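One way to read the weighting idea above is as a recency-weighted vote over half-day buckets. The Python sketch below is my own interpretation, not Nussey’s method: each recorded open adds weight to its (weekday, AM/PM) bucket, the newest open counts fully, and each older one counts a little less. The decay factor of 0.8 and the open timestamps are assumptions made up for the example.

```python
from collections import defaultdict
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def half_day_bucket(opened_at):
    """Map a timestamp to a (weekday name, 'AM' or 'PM') bucket."""
    half = "AM" if opened_at.hour < 12 else "PM"
    return (DAYS[opened_at.weekday()], half)


def best_send_bucket(open_times, decay=0.8):
    """Bucket with the highest recency-weighted count of opens.

    The newest open gets weight 1, the next newest decay, then decay**2,
    and so on. The decay value is an assumption to tune against real data.
    """
    weights = defaultdict(float)
    newest_first = sorted(open_times, reverse=True)
    for age, opened_at in enumerate(newest_first):
        weights[half_day_bucket(opened_at)] += decay ** age
    return max(weights, key=weights.get)


# Hypothetical open history for one shorts buyer.
opens = [
    datetime(2008, 7, 1, 9, 15),    # Tuesday morning
    datetime(2008, 7, 6, 20, 5),    # Sunday evening
    datetime(2008, 7, 8, 8, 40),    # Tuesday morning
    datetime(2008, 7, 13, 21, 30),  # Sunday evening
    datetime(2008, 7, 20, 19, 45),  # Sunday evening
]

best_bucket = best_send_bucket(opens)
print(best_bucket)
```

Run per customer, this gives each shorts buyer their own send slot, and bucketing by half-day keeps the number of scheduled sends manageable.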

So what do you ultimately have? You have customers that are most likely to want discount shorts and you have the best time to contact each of them. Now that’s personalized marketing.

Data Mining is Great for Companies Trying to Gain Traction for New Products

Monday, July 7th, 2008

Cross-sell should work particularly well for a company that has one best-of-breed product and a bevy of products on the come-up with which it really wants to gain traction. The reason for this: more accurate prediction of which groups of customers will buy a second product and when. You may even be able to determine which product is the likely second choice of those customers, meaning that you know which of your immature products to pitch them. By leveraging data mining, your best-of-breed products can really slingshot your other products to stardom.

If you have a best-of-breed product, then it’s likely that the majority of your customers were introduced to your company via this product. They became your customers because of this product, are likely customers who seek best-of-breed products, and are unlikely to buy one of your offerings which is not best-of-breed. In fact, they might not even think of you when they think of a product category in which you have a lesser offering. When these best-of-breed customers go looking for a new product category (in which you do have an offering, albeit an inferior one), you won’t be top-of-list. This is where timing is key to you. Knowing when your customers will buy a second product is a proxy for when they have money to buy and a proxy for when they’ve turned their attention to another product category. Thanks to data mining techniques such as neural networks and decision trees, you know when they have money before your best-of-breed competitors, which means you have a head start - an early opportunity - to suggest a product category that may just be entering your customers’ consideration set.

Once you get this timing down, you can figure out the best way to tout the integration or synergies that make your product better. Heck, you might even offer a discount on your non-best-of-breed product. But it’s the timing that gets you there. And never forget that these are your customers and you understand them - their behavior - better than your best-of-breed competitors. Use it to your advantage by investing in some state-of-the-art database analytics techniques.

How to Improve Software Margins in the Age of Commoditization

Tuesday, July 1st, 2008

Tim Ferriss makes some excellent points in his post The Margin Manifesto: 11 Tenets for Reaching (or Doubling) Profitability in 3 Months, which got me thinking about how margins are changing in the software business and why enterprise software companies must start “firing” their high maintenance customers.

The software industry for some time has been forgiving of poor fiscal discipline. With 90% margins, it is possible to blow lots of cash on unprofitable sales and marketing campaigns and still make a mint. Furthermore, Wall Street has always rewarded new license revenue growth over cost control. In this kind of environment any new revenue is good revenue, regardless of its ultimate price.

Sadly, the days of inflated margins are nearing an end. The price of software is crashing, and SaaS, along with the consumerization of IT, is turning software into a commodity. Enterprise software companies doing $500k deals on six month sales cycles will have to reduce their cost structures quickly to survive this disruption to their model.

With these changes afoot, plenty of blog space has been devoted to exploring how software companies can cut sales and marketing costs through search engine optimization, pay-per-click advertising, and viral marketing. Comparatively little has been written about Tim’s #10 point, however: firing high maintenance customers.

Despite the huge improvement in margins that results from firing poor customers, there are three reasons that the idea rarely gains traction in an organization:

1) Cultural resistance due to short-sighted metrics

Most companies are reluctant to fire customers. After all, no sales or marketing organization can be convinced that it’s a good idea to forgo revenue in pursuit of improved profitability down the line. This is especially true when they are measured on how much they drive top line growth, as they almost always are.

2) Data integration challenges

Even if sales and marketing can be convinced of the value in firing poor customers, there are still huge technical barriers to integrating the data required for analysis. Challenges abound in getting CRM data to merge neatly with support and billing databases. Unless IT has a lot of spare capacity (an occurrence as common as a Bigfoot sighting), significant budget will have to be allocated for data integration.

3) Inability to make sense of the results

Finally, once all the data is integrated, a healthy dose of marketing analytics know-how is required to make sense of it all. Without highly trained business analysts on staff, it is very difficult to understand which customers are profitable and which aren’t. Furthermore, unless you want to keep spending money acquiring bad customers, statisticians and data miners will need to be called in to help build attribute profiles of unprofitable segments.
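To give a feel for what the integrated view looks like once the hard part is done, here is a toy Python sketch that joins CRM, billing, and support records on a customer ID and computes a rough margin per customer. Every record, the customer names, and the flat cost of 400 per support hour are hypothetical, and a real analysis would also fold in sales and acquisition costs.

```python
from collections import defaultdict

# Hypothetical extracts from three separate systems, keyed by customer ID.
crm = {"C001": "Customer A", "C002": "Customer B", "C003": "Customer C"}
billing = [("C001", 120000.0), ("C002", 45000.0),
           ("C003", 60000.0), ("C001", 30000.0)]          # (id, invoice amount)
support_tickets = [("C001", 12), ("C002", 95),
                   ("C003", 20), ("C002", 40)]            # (id, support hours)
COST_PER_SUPPORT_HOUR = 400.0  # assumed flat cost, for illustration only


def customer_profitability(crm, billing, support_tickets, hourly_cost):
    """Join the three sources and return one row per customer, worst margin first."""
    revenue = defaultdict(float)
    hours = defaultdict(float)
    for customer_id, amount in billing:
        revenue[customer_id] += amount
    for customer_id, ticket_hours in support_tickets:
        hours[customer_id] += ticket_hours
    rows = []
    for customer_id, name in crm.items():
        support_cost = hours[customer_id] * hourly_cost
        rows.append({
            "customer_id": customer_id,
            "name": name,
            "revenue": revenue[customer_id],
            "support_cost": support_cost,
            "margin": revenue[customer_id] - support_cost,
        })
    return sorted(rows, key=lambda row: row["margin"])


rows = customer_profitability(crm, billing, support_tickets, COST_PER_SUPPORT_HOUR)
unprofitable = [row["customer_id"] for row in rows if row["margin"] < 0]
print(unprofitable)
```

In this invented data, the customer with the smallest invoices also eats the most support hours and ends up with a negative margin, which is exactly the kind of account the top line number hides.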

While firing unprofitable customers is a powerful way to improve margins and profitability, these three barriers ensure that it rarely gets done. Unfortunately for most enterprise software companies, it will be too late by the time they realize how critical it is to shed themselves of poor customers. Those with the foresight and fortitude to make it happen sooner rather than later, however, can expect great rewards.

Click-through Data Adds to B2B Data Mining Possibilities

Thursday, June 26th, 2008

The knock on B2B data mining has always been that there isn’t B2C-like data available. Instead of multiple transactions that give us customer behavior patterns, we have company demographic information (industry, company size, revenue), some information about the person from the company who we’ll deal with (position/title), and where that person came from (lead source). It’s not behavioral data, which we know to be inherently better as a predictor than demographic data. But some data is better than none, right?

And we can certainly create transactional data that gives us some behavior pattern. If we throw in the contact schedule - the touches - from your company’s representatives, don’t you have a transactional pattern of both buying and non-buying customers? Coupled with the demographic data, you can drum up a model that predicts how many touches a lead might need to become a client and maybe a best guess at the path that should be pursued with a new lead.

More to the point of this post, this is the great thing about click-through data: it has a transactional quality. In fact, it just might be the transactional data for B2B companies. (Aside: This is also one of the reasons why companies like Omniture are becoming so notable: they provide some behavioral patterns, however small.) If we can combine click-through patterns from the person representing the prospect company with the company’s demographic information, then we might have a real interesting model that determines just how serious a lead is about buying from you and their company’s relative experience level with your product area.

Let me close out this post by refuting two of the main complaints about B2B data and its unsuitability for data mining-based models.

There’s Not Enough Data

Everybody loves data mining when it comes to consumer-focused companies. The vast amounts of transactional data are transfixing. The thinking goes something like this: “I’ve got hundreds of thousands of transactions here so whatever our predictive model spits out must be right.” Well, this may be true. And it mayn’t. But that doesn’t make a model built with less data any less compelling. It just means that one model has more data points. Don’t feel inadequate for the difference. Just make sure that you have data that’s important to the business problem you’re trying to solve. For example, if you want to know the next-best product for newly-minted customers, then you’d better have a solid set of second-time customers who bought a bunch of different products. Do you need thousands of these second-time customers? C’mon.

Missing and Bad Data

Isn’t this a reality everywhere? Even consumer-focused companies (with hundreds of thousands of transactions) have this issue. Oh, and I have a suggestion on what to do with that missing and bad data. Throw it out. Chances are, it will have absolutely no effect on the predictive models, unless of course all of the missing or bad data has a common characteristic that isn’t found in the rest of the data. For example, let’s say you’re building a model that predicts the next software product that a first-time customer might want from your company. Well, if everybody that bought a specific product as their first purchase is missing a zip code, then you can’t very well throw all of those records out. It would skew the model irreparably. But as long as the missing data is evenly distributed throughout the records, don’t be afraid to trash ‘em.
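Before throwing records out, it is worth a quick check that the missing values really are spread evenly. The Python sketch below compares the missing rate of a field within each group (here, first product purchased) against the overall missing rate. The records, the field names, and the tolerance of 0.25 are all assumptions for illustration, not a statistical test.

```python
from collections import defaultdict


def is_missing(value):
    """Treat None and empty strings as missing."""
    return value in (None, "")


def missing_rate_by_group(records, group_field, check_field):
    """Share of records in each group where check_field is missing."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for record in records:
        group = record[group_field]
        totals[group] += 1
        if is_missing(record.get(check_field)):
            missing[group] += 1
    return {group: missing[group] / totals[group] for group in totals}


def safe_to_drop(records, group_field, check_field, tolerance=0.25):
    """True if no group's missing rate strays more than tolerance from the overall rate."""
    overall = sum(1 for r in records if is_missing(r.get(check_field))) / len(records)
    rates = missing_rate_by_group(records, group_field, check_field)
    return all(abs(rate - overall) <= tolerance for rate in rates.values())


# Hypothetical data: every buyer of product C is missing a zip code.
skewed = (
    [{"first_product": "A", "zip_code": "02139"}] * 4
    + [{"first_product": "B", "zip_code": "10001"}] * 4
    + [{"first_product": "C", "zip_code": None}] * 2
)
# Hypothetical data: missing zip codes are spread evenly across products.
even = (
    [{"first_product": "A", "zip_code": "02139"}] * 3
    + [{"first_product": "A", "zip_code": None}]
    + [{"first_product": "B", "zip_code": "10001"}] * 3
    + [{"first_product": "B", "zip_code": None}]
)

skewed_ok = safe_to_drop(skewed, "first_product", "zip_code")
even_ok = safe_to_drop(even, "first_product", "zip_code")
print(skewed_ok, even_ok)
```

If the check fails for a group, dropping those rows would wipe out that group’s signal, so you would keep the records and handle the missing field some other way.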

New Segmentation Paradigm or How to Qualify a Lead?

Tuesday, June 24th, 2008

I just ran across this quote from Will Schnabel and it gave me pause. In Lead Score and Activity Alerts: One in the same?, Schnabel says, “In order to identify the few qualified prospects from the remainder of the inquiries, you first need to determine your ideal customer profile or in other words, what segment is your best target. From there, BANT questions (budget, authority, need, and time frame) help determine the qualification status of the leads.”

So I agree with the last part. BANT qualification is important, especially in the B2B and B2P (Business to Prosumer) markets. After all, no one will ever buy your product without budget or the authority to make decisions. It’s the first part that’s becoming increasingly interesting and more complicated for me. Is the customer profile so rigidly first in this timeline? Or is privileging this activity just another reason that the Marketing/Sales political split persists? Certainly, this is the right way to pursue new business strategically: decide the size and relative need of a target market, build or alter product to address needs in that market, create messaging that speaks to the market about its needs, and then attack the target market.

Well, what happens when that target segment trickles in while a different segment - one not entirely counted on - starts to beat down your doors? OK. If the doors are really being kicked in, your company is likely to notice. But let’s say that this secondary segment buys on par with the target. Sales is likely to notice it first, right? And this is usually what causes the rift between marketing and sales: The marketing department often continues to chase the strategically-desired and well-messaged segments instead of the empirically-best segments.

This is where a data mining approach refines the way we look at segments. There may be unforeseen criteria that ultimately make their way into the segmentation definition. Clickthrough data is the most interesting and most-discussed piece of data in this vein. Schnabel would have you believe that this data determines the relative need of the segment in the BANT qualification. I say, why can’t it be part of the segment itself? A shrewd marketing department might change the amount of information available on the website so that the “heavy researcher” psychographic segment is more fully qualified. Particularly if the heavy researcher is also a heavy buyer.

Many companies are starting to break down the barriers of B2B and B2C. The B2P audience is a great example of this. This means that traditional customer segmentation is becoming more difficult. Customers may be segmented along demographic, business history, behavioral, and psychographic variables all at once. Ultimately, a data mining-aided approach to customer segmentation lets you adapt to customers who find you by breaking with these traditional barriers to make segmentation a fluid discipline.