What is not Data Mining? - A Myth Buster

This post was written by namwar on June 20, 2009
Posted Under: Analysis Services, Business Intelligence, SQL Server 2005

While watching a webcast by John Weston, I noticed a very important point: a clarification of what is not Data Mining. People often get confused about what Data Mining actually is and start referring to various terms and techniques used in ordinary data processing as Data Mining activities.

The following list describes different data processing techniques and why they cannot be referred to as Data Mining.

1. Ad Hoc Query:

An ad hoc query just examines the current data set and gives you a result based on it. This means you can check the maximum price of a product, but you cannot predict what the maximum price of that product will be in the near future. A Data Mining algorithm can do that.
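To make the contrast above concrete, here is a minimal Python sketch. The price list is made up for illustration, and the least-squares trend line is only a simple stand-in for a real Data Mining algorithm, not the method any particular product uses. The function names `ad_hoc_max`, `fit_line` and `predict_next` are hypothetical.

```python
# Hypothetical monthly maximum prices for one product (made-up numbers).
prices = [10.0, 12.0, 14.0, 16.0]  # months 0..3


def ad_hoc_max(values):
    """Ad hoc query: answers only from data that already exists."""
    return max(values)


def fit_line(ys):
    """Least-squares fit of y = slope * x + intercept for x = 0, 1, 2, ..."""
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_next(ys):
    """Assumed stand-in for a mining model: extrapolate the trend one step."""
    slope, intercept = fit_line(ys)
    return slope * len(ys) + intercept


current_max = ad_hoc_max(prices)      # looks backward at stored data
next_estimate = predict_next(prices)  # looks forward, beyond stored data
print(current_max, next_estimate)
```

The query can only return a value that is already in the data, while the fitted line produces an estimate for a month that has not happened yet. How good that estimate is depends entirely on whether a straight-line trend is a reasonable assumption for the data.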

2. Event Notification:

You can set alerts based on threshold values, and they will inform you as soon as the actual transactional data reaches a threshold. But again, you cannot predict when that threshold will be reached. A Data Mining algorithm can do that.
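The same idea can be sketched for alerts. In the hypothetical Python example below, the threshold and the daily figures are invented, and a straight-line extrapolation is assumed as a very rough substitute for a real Data Mining model. The names `alert_fired` and `estimate_crossing_step` are made up for this sketch.

```python
THRESHOLD = 100.0                      # hypothetical alert level
units_sold = [20.0, 40.0, 60.0, 80.0]  # hypothetical cumulative units, days 0..3


def alert_fired(values, threshold):
    """Event notification: true only once the latest value reaches the threshold."""
    return values[-1] >= threshold


def estimate_crossing_step(values, threshold):
    """Extrapolate a least-squares line to estimate when the threshold is reached.

    Returns the estimated step (day index), or None if the trend is not rising.
    """
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope <= 0:
        return None  # a flat or falling trend never reaches the threshold
    return (threshold - intercept) / slope


fired_now = alert_fired(units_sold, THRESHOLD)
expected_day = estimate_crossing_step(units_sold, THRESHOLD)
print(fired_now, expected_day)
```

The alert stays silent until the data actually reaches the threshold, whereas the extrapolation gives an estimate of the day on which that is likely to happen, under the assumption that the current trend continues.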

3. Multidimensional Analysis:

You can find the value of an item along different dimensions such as Time, Area, and Color, but you cannot predict what the value of the item will be when its Color is Blue, its Area is UK, and the Time is the first quarter of the year. A Data Mining algorithm can do that.

4. Statistics:

Item statistics can tell you the history of price changes, moving averages, maximum values, minimum values, etc., but they cannot tell you how the price will change if you start selling another product in the same season. A Data Mining algorithm can do that.

So, in simple words: Data Mining is not history. It is the future!

Reader Comments

I’m a bit of a noob, so please correct me if I’m wrong.
We use data mining to understand how our data is related to each other.

We sold more in January.
We sold more sweaters.
70% of sweaters were sold to women.
30% of women chose red sweaters.

It’s true that these relations can be extrapolated into the future, but I feel the process of prediction based on collected/processed data* should be called Forecasting or Forecasts.
And the process that these forecasts are built off of, i.e. a process to gather sets of data from which an intelligent forecast can be built, should be called data mining.
At least IMHO.

Written By Salman on March 23rd, 2010 @ 4:39 pm
