
TDWI – Day 2 – Predictive Modeling for the Non-Statistician
November 3, 2009

Posted by skunkworkscmj in Books, Data Mining/Analytics, Open Source, Uncategorized.

Course taught by Michael J Berry

Statistics have never scared me!  Before data warehousing and financial services, I was a mathematician and operations researcher.  I’ve built econometric models, dabbled in Monte Carlo simulation, and always searched for the deterministic model in the haystack.  My interest in Mike Berry’s class was understanding how a predictive modeling program could be added to the DW/BI program I manage.

To date, our DW/BI has been all about reporting – financial reporting, regulatory reporting, credit risk reporting, operational reporting, this happened last year, this happened last week, this happened yesterday-type reporting.  Blah blah blah.

As an organization, we have always felt that the value of data lies in understanding what IS GOING TO HAPPEN, not what happened in the past.  The magic is in understanding what customers will do, and then getting that information integrated into operational processes: marketing, face-to-face sales, the call center, etc.  We’ve all seen the slides and charts of the BI Maturity Model.

The full-day course was really good, covering many of the areas I was interested in: tools, processes, and effectiveness.  Mike Berry worked through some simple background statistics and introduced how models could be built and used in a marketing or CRM setting.  Using SAS Enterprise Miner (E-Miner) for the demos, though the course itself was tool agnostic, he worked through some basic models on sample data from the Census Bureau.  He compared regression models with decision trees and walked through examples of propensity-to-buy and churn models.
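
For readers who want to see what "regression versus decision tree" looks like in practice, here is a minimal sketch in plain Python. It is not the course material: the customer data is synthetic (not the Census sample), the two features (tenure and support calls) and the churn rule are made up for illustration, and the "tree" is only a single-split stump. Every function name below is my own, not part of E-Miner or any other tool.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1 / (1 + math.exp(-z))
    e = math.exp(z)
    return e / (1 + e)


def make_customers(n=200, seed=0):
    """Synthetic data (an assumption): (tenure_years, support_calls) -> churn 0/1."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        tenure = rng.uniform(0, 10)
        calls = rng.randint(0, 8)
        # Hypothetical rule: many calls raise churn odds, long tenure lowers them.
        p = sigmoid(0.8 * calls - 0.6 * tenure)
        rows.append((tenure, calls))
        labels.append(1 if rng.random() < p else 0)
    return rows, labels


def fit_logistic(rows, labels, lr=0.05, epochs=500):
    """Logistic regression fitted by batch gradient descent."""
    w, b, n = [0.0, 0.0], 0.0, len(rows)
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(b + w[0] * x[0] + w[1] * x[1]) - y
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w = [w[0] - lr * gw[0] / n, w[1] - lr * gw[1] / n]
        b -= lr * gb / n
    return w, b


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def fit_stump(rows, labels):
    """One-split decision tree: pick the feature/threshold with lowest weighted Gini."""
    best = None
    for f in range(2):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                # Each side predicts its majority class.
                best = (score, f, t, round(sum(left) / len(left)),
                        round(sum(right) / len(right)))
    return best[1:]


def predict_logistic(model, x):
    w, b = model
    return 1 if sigmoid(b + w[0] * x[0] + w[1] * x[1]) >= 0.5 else 0


def predict_stump(stump, x):
    f, t, left_label, right_label = stump
    return left_label if x[f] <= t else right_label


def accuracy(predict, model, rows, labels):
    hits = sum(predict(model, x) == y for x, y in zip(rows, labels))
    return hits / len(labels)


if __name__ == "__main__":
    rows, labels = make_customers()
    train_x, train_y = rows[:150], labels[:150]   # fit on the first 150 customers
    test_x, test_y = rows[150:], labels[150:]     # score the held-out 50
    logit = fit_logistic(train_x, train_y)
    stump = fit_stump(train_x, train_y)
    print("logistic accuracy:", accuracy(predict_logistic, logit, test_x, test_y))
    print("stump accuracy:   ", accuracy(predict_stump, stump, test_x, test_y))
```

The regression gives every customer a smooth churn score from a weighted sum of the inputs, while the stump produces a single readable rule of the form "feature above or below a threshold". A real tool would grow a deeper tree and validate both models much more carefully than this.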

What did I get out of the course?

  • Refresher on predictive modeling
  • Insight into tools (E-Miner, KNIME, others)
  • Insight into sources of external data

What didn’t I get out of the course?

How predictive models are actually implemented in a DW/BI environment.  I’m particularly interested in how models are coded and maintained in a production data processing cycle.  Not much insight there from the class, but that’s just another topic to research.

A Few Books

These are books either referenced in the course or that I have found useful on the topic:
