Introduction to Data Mining

What is Data Mining?

Databases today can reach into the terabytes, that is, more than 1,000,000,000,000 bytes of data. Hidden within this volume of data is information of vital importance. How can one retrieve the useful information from the huge quantity of data generated every day? The answer is data mining, which is used both to increase revenues and to reduce costs.

Data mining is the extraction of hidden predictive information from large databases. In other words, data mining is the process of mining knowledge from data.

The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. Data mining tools also allow enterprises to predict future trends.

Data mining is a powerful new technology with great potential to help companies focus on the most important information in their data.

Why do we need Data Mining?

With the enormous amount of data stored in files, databases, and other repositories, it is increasingly important to develop powerful means for the analysis, and perhaps the interpretation, of such data and for the extraction of interesting knowledge that could help in decision-making.

Data mining has been used to:

  • Identify unexpected shopping patterns in supermarkets.
  • Optimize website profitability by making appropriate offers to each visitor.
  • Predict customer response rates in marketing campaigns.
  • Define new customer groups for marketing purposes.
  • Predict customer defections: which customers are likely to switch to an alternative supplier in the near future.
  • Distinguish between profitable and unprofitable customers.
  • Improve yields in complex production processes by finding unexpected relationships between process parameters and defect rates.
  • Identify “wedge issues” and target political campaigns.
  • Identify suspicious (unusual) behavior, as part of a fraud detection process.

    Figure 1: Data Mining



In short, data mining can be applied anywhere in a business or organization where users are interested in identifying and exploiting predictable outcomes.

Applications of Data Mining:

  1. Fraud Detection
  2. E-Commerce
  3. Future Healthcare
  4. Market Basket Analysis
  5. Education
  6. Customer Relationship Management
  7. Sales Forecasting
  8. Financial Data Analysis
  9. Retail Industry
  10. Telecommunication Industry
  11. Bioinformatics
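
To make one of the applications listed above concrete, here is a minimal Python sketch of market basket analysis. It is only an illustration under stated assumptions, not a production method: the transactions are made-up toy data, the function name pair_rules and the min_support threshold are hypothetical, and only pairs of items are considered. For each pair it computes the support (the share of baskets that contain both items) and the confidence (among baskets that contain the first item, the share that also contain the second).

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each set is the contents of one shopping basket.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def pair_rules(baskets, min_support=0.4):
    """Return (antecedent, consequent, support, confidence) tuples for item pairs.

    support    = fraction of baskets that contain both items
    confidence = baskets with both items / baskets with the antecedent
    """
    n = len(baskets)
    # How many baskets contain each single item.
    item_counts = Counter(item for basket in baskets for item in basket)
    # How many baskets contain each unordered pair of items.
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # skip pairs that are bought together too rarely
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            rules.append((x, y, support, confidence))

    # Strongest rules first: highest confidence, then highest support.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))


if __name__ == "__main__":
    for x, y, support, confidence in pair_rules(transactions):
        print(f"{x} -> {y}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data every basket that contains beer also contains diapers, so the rule "beer -> diapers" comes out on top. Real market basket analysis usually works on much larger data and on item sets of more than two items, typically with an algorithm such as Apriori or FP-Growth.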

