The knock on B2B data mining has always been that there isn’t B2C-like data available. Instead of multiple transactions that give us customer behavior patterns, we have company demographic information (industry, company size, revenue), some information about the person from the company who we’ll deal with (position/title), and where that person came from (lead source). It’s not behavioral data, which we know is an inherently better predictor than demographic data. But some data is better than none, right?
And we can certainly create transactional data that gives us some behavior pattern. If we throw in the contact schedule - the touches - from your company’s representatives, don’t you have a transactional pattern of both buying and non-buying customers? Coupled with the demographic data, you can drum up a model that predicts how many touches a lead might need to become a client, and maybe even a best guess at the path to pursue with a new lead.
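As a rough illustration of the touches-plus-demographics idea, here is a minimal sketch that estimates the average number of touches a converted lead needed, bucketed by one demographic attribute. All field names and sample records are hypothetical, not from any real CRM:

```python
# Hypothetical sketch: estimate how many touches a lead needs to close,
# grouped by industry. Records and field names are illustrative only.
from collections import defaultdict

leads = [
    # (industry, company_size, touches, converted)
    ("software", 50, 3, True),
    ("software", 200, 5, True),
    ("manufacturing", 1000, 8, True),
    ("manufacturing", 400, 7, True),
    ("software", 80, 4, False),  # still open; excluded from the average
]

def expected_touches(leads):
    """Average touches-to-close among converted leads, per industry."""
    totals = defaultdict(lambda: [0, 0])  # industry -> [touch sum, lead count]
    for industry, _size, touches, converted in leads:
        if converted:
            totals[industry][0] += touches
            totals[industry][1] += 1
    return {industry: s / n for industry, (s, n) in totals.items()}

print(expected_touches(leads))
# software: (3 + 5) / 2 = 4.0; manufacturing: (8 + 7) / 2 = 7.5
```

A real model would replace these simple group averages with a regression over all the demographic fields, but the transactional shape of the data - one row per lead with a touch count - is the same.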
More to the point of this post, this is the great thing about click-through data: it has a transactional quality. In fact, it just might be the transactional data for B2B companies. (Aside: This is also one of the reasons why companies like Omniture are becoming so notable: they provide some behavioral patterns, however small.) If we can combine click-through patterns from the person representing the prospect company with the company’s demographic information, then we might have a really interesting model that determines just how serious a lead is about buying from you and their company’s relative experience level with your product area.
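To make the combination concrete, here is a naive lead-scoring sketch that blends click-through behavior with a demographic field. The weights, field names, and numbers are all made-up assumptions for illustration, not a recommended scoring scheme:

```python
# Hypothetical sketch: a naive lead score combining click-through behavior
# (page views, content downloads) with a demographic signal (company size).
# Weights are arbitrary illustrative assumptions.

def lead_score(page_views, downloads, company_size, weights=(1.0, 5.0, 0.01)):
    """Higher score = lead is behaving more like a serious buyer."""
    w_views, w_downloads, w_size = weights
    return w_views * page_views + w_downloads * downloads + w_size * company_size

# A lead who viewed 12 pages and downloaded 2 whitepapers,
# from a 500-person company:
print(lead_score(page_views=12, downloads=2, company_size=500))
# 1.0*12 + 5.0*2 + 0.01*500 = 27.0
```

In practice you would fit those weights from historical won/lost leads rather than hand-picking them, but the point stands: the click-through columns are what give the row its behavioral, transactional character.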
Let me close out this post by refuting two of the main complaints about B2B data and its unsuitability for data mining-based models.
There’s Not Enough Data
Everybody loves data mining when it comes to consumer-focused companies. The vast amounts of transactional data are transfixing. The thinking goes something like this: “I’ve got hundreds of thousands of transactions here so whatever our predictive model spits out must be right.” Well, this may be true. And it may not. But that doesn’t make a model built with less data any less compelling. It just means that one model has more data points. Don’t feel inadequate about the difference. Just make sure that you have data that’s relevant to the business problem you’re trying to solve. For example, if you want to know the next-best product for newly-minted customers, then you’d better have a solid set of second-time customers who bought a bunch of different products. Do you need thousands of these second-time customers? C’mon.
Missing and Bad Data
Isn’t this a reality everywhere? Even consumer-focused companies (with hundreds of thousands of transactions) have this issue. Oh, and I have a suggestion on what to do with that missing and bad data. Throw it out. Chances are, it will have absolutely no effect on the predictive models, unless of course all of the missing or bad data has a common characteristic that isn’t found in the rest of the data. For example, let’s say you’re building a model that predicts the next software product that a first-time customer might want from your company. Well, if everybody that bought a specific product as their first purchase is missing a zip code, then you can’t very well throw all of those records out. It would skew the model irreparably. But as long as the missing data is evenly distributed throughout the records, don’t be afraid to trash ‘em.
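Before trashing those records, the skew scenario above is easy to check: look at how the missing values distribute across the field you care about. Here is a minimal sketch of that check; the field names and records are hypothetical:

```python
# Hypothetical sketch: before dropping records with a missing field
# (zip code here), check whether the missingness clusters around one
# value of another field (first product purchased). If it does,
# dropping those records would skew the model as described above.
from collections import Counter

records = [
    {"first_product": "crm", "zip": "30301"},
    {"first_product": "crm", "zip": None},
    {"first_product": "crm", "zip": None},
    {"first_product": "erp", "zip": "10001"},
    {"first_product": "erp", "zip": "60601"},
]

def missing_rate_by_group(records, group_key, field):
    """Fraction of records missing `field`, per value of `group_key`."""
    counts = Counter(r[group_key] for r in records)
    missing = Counter(r[group_key] for r in records if r[field] is None)
    return {g: missing[g] / counts[g] for g in counts}

print(missing_rate_by_group(records, "first_product", "zip"))
# crm: 2/3 of records missing zip, erp: 0 -- dropping the missing rows
# would quietly remove most of the crm buyers from the training set.
```

If the rates come back roughly equal across groups, the missingness is evenly distributed and throwing the records out is safe; if one group dominates, you need a different fix (imputation, or dropping the field instead of the rows).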