TDWI – Day 2 – Predictive Modeling for the Non-Statistician November 3, 2009
Posted by skunkworkscmj in Books, Data Mining/Analytics, Open Source, Uncategorized. Tags: Knime, Michael Barry
Course taught by Michael J Berry
Statistics have never scared me! Before data warehousing and financial services, I was a mathematician and operations researcher. I’ve built econometric models, dabbled in Monte Carlo simulation, and always searched for the deterministic model in the haystack. My interest in Mike Berry’s class was understanding how a predictive modeling program could be added to the DW/BI program I manage.
To date, our DW/BI has been all about reporting – financial reporting, regulatory reporting, credit risk reporting, operational reporting, this happened last year, this happened last week, this happened yesterday-type reporting. Blah blah blah.
As an organization we have always felt that the value of data was in understanding what IS GOING TO HAPPEN, not what happened in the past. The magic is in understanding what customers will do, and then getting that information integrated into the operational processes – marketing, face-to-face sales, the call center, etc. We’ve all seen the slides and charts of the BI Maturity Model.
The full-day course was really good, as it covered many of the areas I was interested in – tools, processes, effectiveness. Mike Berry worked through some simple background statistics and introduced how models could be built and used in a marketing or CRM setting. Using SAS E-Miner for the demonstrations (the course content itself was tool-agnostic), he worked through some basic models using sample data from the Census Bureau. He compared regression models with decision trees and walked through examples on propensity to buy and churn.
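To make the regression-versus-tree comparison a bit more concrete, here is a minimal sketch in plain Python. It is not what was shown in class (the demonstrations were done in E-Miner on Census data). The synthetic churn data, the two features, the churn rule, and every function name below are assumptions made purely for illustration, and the "tree" is only a one-split decision stump rather than a full decision tree.

```python
import math
import random


def make_customers(n, seed):
    """Generate synthetic (features, churned) rows; the churn rule is invented."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        tenure = rng.uniform(0.0, 1.0)  # scaled tenure (0 = new customer)
        calls = rng.uniform(0.0, 1.0)   # scaled number of support calls
        # Assumed ground truth: short tenure and many calls raise churn odds.
        logit = 0.5 - 4.0 * tenure + 3.0 * calls
        p_churn = 1.0 / (1.0 + math.exp(-logit))
        rows.append(((tenure, calls), 1 if rng.random() < p_churn else 0))
    return rows


def fit_logistic(rows, epochs=300, lr=0.5):
    """Fit logistic regression weights by batch gradient descent."""
    w, b, n = [0.0, 0.0], 0.0, len(rows)
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for x, y in rows:
            p = 1.0 / (1.0 + math.exp(-(w[0] * x[0] + w[1] * x[1] + b)))
            err = p - y
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w[0] -= lr * gw[0] / n
        w[1] -= lr * gw[1] / n
        b -= lr * gb / n
    return w, b


def predict_logistic(model, x):
    """Predict churn (1) when the modeled probability is at least 0.5."""
    w, b = model
    p = 1.0 / (1.0 + math.exp(-(w[0] * x[0] + w[1] * x[1] + b)))
    return 1 if p >= 0.5 else 0


def fit_stump(rows):
    """Fit a one-split decision tree by minimizing training errors."""
    best = None
    for f in range(2):
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= threshold]
            right = [y for x, y in rows if x[f] > threshold]
            # Each side of the split predicts its majority class.
            left_label = 1 if 2 * sum(left) >= len(left) else 0
            right_label = 1 if 2 * sum(right) >= len(right) else 0
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    return best[1:]


def predict_stump(model, x):
    """Route the row to one side of the split and return that side's label."""
    f, threshold, left_label, right_label = model
    return left_label if x[f] <= threshold else right_label


def accuracy(predict, model, rows):
    """Share of rows where the prediction matches the actual outcome."""
    return sum(predict(model, x) == y for x, y in rows) / len(rows)


train = make_customers(300, seed=1)
holdout = make_customers(200, seed=2)
logit_model = fit_logistic(train)
stump_model = fit_stump(train)
print("logistic regression:", accuracy(predict_logistic, logit_model, holdout))
print("decision stump:     ", accuracy(predict_stump, stump_model, holdout))
```

Both models are fit on one sample and scored on a separate holdout sample, which is the usual way to compare them fairly. The stump yields a rule a business user can read directly (one feature, one cutoff), while the regression yields a weight per feature.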
What did I get out of the course?
- Refresher on predictive modeling
- Insight into tools (E-Miner, Knime, others)
- Insight into sources of external data
What didn’t I get out of the course?
How predictive models are actually implemented in a DW/BI environment. I’m particularly interested in how models are coded and maintained in a production data processing cycle. Not much insight there from the class, but that’s just another topic to research.
A Few Books
These are books either referenced in the course or that I have found useful on the topic:




