Purpose
Data Mining is used to improve decision making by finding useful patterns and insights from data.
Business Analysis Body of Knowledge® (BABOK®)
Data mining is the process of analyzing data from different sources and summarizing it into relevant information that can be used to increase revenue and decrease costs. It allows users to view the data from many different angles, categorize it, and summarize the relationships identified. The ultimate goals of data mining are prediction and discovery. The process searches for consistent patterns and systematic relationships in the data, then validates the findings by applying the patterns to new subsets of data. It is particularly useful for revealing hidden patterns and providing insights during analysis. The success of data mining usually lies in securing the right type, volume, and quality of data needed to draw insights. It also involves the use of dashboards and reports that facilitate visual communication of the results.
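The idea of finding a pattern and then validating it on a new subset of data can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the customer records are synthetic, and the names `make_customers`, `accuracy`, and `best_threshold` are hypothetical helpers, not part of any standard or library.

```python
import random

def make_customers(n, seed):
    """Generate synthetic (monthly_visits, bought) records for illustration."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        visits = rng.randint(0, 20)
        # Assumed ground truth: frequent visitors are more likely to buy.
        bought = rng.random() < (0.8 if visits >= 10 else 0.2)
        rows.append((visits, bought))
    return rows

def accuracy(rows, threshold):
    """Share of records where the rule 'visits >= threshold' matches 'bought'."""
    hits = sum((visits >= threshold) == bought for visits, bought in rows)
    return hits / len(rows)

def best_threshold(rows):
    """Search for the visit threshold that best separates buyers from non-buyers."""
    return max(range(0, 21), key=lambda t: accuracy(rows, t))

rows = make_customers(1000, seed=42)
# Discover the pattern on one subset, then validate it on records not used so far.
discovery, holdout = rows[:700], rows[700:]
threshold = best_threshold(discovery)

print(f"pattern found: visits >= {threshold} suggests a purchase")
print(f"accuracy on discovery subset: {accuracy(discovery, threshold):.2f}")
print(f"accuracy on new subset:       {accuracy(holdout, threshold):.2f}")
```

If the accuracy on the new subset is close to the accuracy on the discovery subset, the pattern is more likely to be a real relationship than a quirk of the data it was found in.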
Data mining is an analytic process that examines large amounts of data from different perspectives and summarizes the data in such a way that useful patterns and relationships are discovered. The results of data mining are generally mathematical models or equations that describe the underlying patterns and relationships. These models can be deployed for human decision making through visual dashboards and reports. Data mining can be used in either supervised or unsupervised investigations. In a supervised investigation, users pose a question and expect an answer that can drive their decision making. An unsupervised investigation is a pure pattern discovery exercise in which patterns are allowed to emerge and are then considered for business decisions. Data mining is a general term that covers the following kinds of techniques:
- Descriptive - Techniques such as clustering make it easier to see the patterns in a set of data, such as similarities between customers.
- Diagnostic - Techniques such as decision trees can show why a pattern exists, such as the characteristics of an organization's most profitable customers.
- Predictive - Techniques such as regression or neural networks can show how likely something is to be true in the future, such as predicting the probability that a particular domain is fraudulent.
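As a small illustration of the descriptive category, the sketch below groups customers by annual spend using a naive one-dimensional k-means. The spend figures are made up, `kmeans_1d` is a hypothetical helper written for this example, and a real project would more likely rely on an established library.

```python
def kmeans_1d(values, k, iterations=20):
    """Naive 1-D k-means (k >= 2); returns the cluster centers in ascending order."""
    ordered = sorted(values)
    # Seed the centers with evenly spaced points from the sorted data.
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            # Assign each value to its nearest center.
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return sorted(centers)

# Hypothetical annual spend per customer.
annual_spend = [120, 135, 150, 900, 950, 1010, 5000, 5200]
centers = kmeans_1d(annual_spend, k=3)
print(centers)
```

The three centers correspond to low, medium, and high spenders, which is the kind of similarity between customers that a descriptive technique is meant to surface.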
Some of the important factors to be considered in data mining are:
- Requirements Elicitation - The goal and scope of data mining are established either in terms of decision requirements for an important identified business decision, or in terms of a functional area where relevant data will be mined for domain-specific pattern discovery.
- Data Preparation - The analytical data set is generally formed by merging records from multiple tables or sources into a single, wide data set. The data may be physically extracted into an actual file, or it may be a virtual file that is left in the database or data warehouse so it can be analyzed.
- Data Analysis - Once the data is available, it is analyzed. This step is often the longest and most complex in a data mining effort and is increasingly the focus of automation. Much of the power of a data mining effort comes from identifying the useful characteristics of the data.
- Modelling Technique - The analytical data set and the calculated characteristics are fed into data mining techniques, which are either supervised or unsupervised. Multiple techniques are often used to see which is most effective.
- Deployment - Once the model has been built, it must be deployed to be useful. It can be deployed in a variety of ways, either to support a human decision maker or to support automated decision-making systems. Many data mining techniques identify potential business rules that can be deployed using a business rules management system.
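The data preparation and data analysis steps above can be sketched as follows. This is a minimal example under assumptions: the `customers` and `orders` tables, their field names, and the `build_wide_dataset` helper are all hypothetical, chosen only to show records from two sources being merged into one wide row per customer with a few calculated characteristics.

```python
# Hypothetical source tables, both keyed by customer_id.
customers = [
    {"customer_id": 1, "region": "North"},
    {"customer_id": 2, "region": "South"},
    {"customer_id": 3, "region": "North"},
]
orders = [
    {"customer_id": 1, "amount": 250.0},
    {"customer_id": 1, "amount": 100.0},
    {"customer_id": 2, "amount": 40.0},
]

def build_wide_dataset(customers, orders):
    """Merge both sources into one row per customer with derived characteristics."""
    totals, counts = {}, {}
    for order in orders:
        cid = order["customer_id"]
        totals[cid] = totals.get(cid, 0.0) + order["amount"]
        counts[cid] = counts.get(cid, 0) + 1

    wide = []
    for customer in customers:
        cid = customer["customer_id"]
        n = counts.get(cid, 0)
        total = totals.get(cid, 0.0)
        wide.append({
            **customer,
            "order_count": n,
            "total_spend": total,
            # Derived characteristic: average order value (0.0 when there are no orders).
            "avg_order_value": total / n if n else 0.0,
        })
    return wide

wide = build_wide_dataset(customers, orders)
for row in wide:
    print(row)
```

Customers without orders are kept with zero values rather than dropped, so the wide data set still describes every customer when it is passed on to a modelling technique.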