Data mining is the process of sorting through large amounts of data and picking out relevant information. It is most commonly used by business intelligence organizations and financial analysts, but it is increasingly used in the sciences to extract information from the enormous data sets generated by modern experimental and observational methods. It has been described as "the nontrivial extraction of implicit, previously unknown, and potentially useful information from data" and "the science of extracting useful information from large data sets or databases".
Although the term "data mining" usually refers to the analysis of data, it is, like "artificial intelligence", an umbrella term with varied meanings across a wide range of contexts.
Data mining is considered a subfield of knowledge discovery, itself a field of computer science. It is also closely related to applied statistics and its subfields, descriptive statistics and inferential statistics.
Privacy and Data Mining
The mining of government or commercial data sets for national security or law enforcement purposes has raised privacy concerns.
There are many legitimate uses of data mining. For example, a database of prescription drugs taken by a group of people could be used to find combinations of drugs exhibiting harmful interactions. Since any particular combination may occur in only 1 out of 1000 people, a great deal of data would need to be examined to discover such an interaction. A project involving pharmacies could reduce the number of drug reactions and potentially save lives. Unfortunately, there is also a huge potential for abuse of such a database.
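The drug-interaction example above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a description of any real pharmacovigilance system: the record layout (a list of drugs plus a reaction flag), the function name `pair_reaction_rates`, the `min_patients` threshold, and the toy data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_reaction_rates(records, min_patients=2):
    """Map each drug pair to (patients, reactions, reaction rate).

    `records` is an iterable of (drugs, had_reaction) tuples; this layout
    is an assumption made for illustration. Pairs taken by fewer than
    `min_patients` people are dropped, since a rate computed from very
    few patients is not meaningful.
    """
    taken = Counter()    # patients who took both drugs of a pair
    reacted = Counter()  # of those, patients with a reported reaction
    for drugs, had_reaction in records:
        # Sort and deduplicate so ("A", "B") and ("B", "A") count as one pair.
        for pair in combinations(sorted(set(drugs)), 2):
            taken[pair] += 1
            if had_reaction:
                reacted[pair] += 1
    return {
        pair: (n, reacted[pair], reacted[pair] / n)
        for pair, n in taken.items()
        if n >= min_patients
    }


# Hypothetical toy records: (drugs taken, adverse reaction reported?)
records = [
    (["A", "B"], True),
    (["A", "B"], True),
    (["A", "C"], False),
    (["B", "C"], False),
    (["A", "B", "C"], True),
    (["A", "C"], False),
]

rates = pair_reaction_rates(records)
for pair, (patients, reactions, rate) in sorted(rates.items()):
    print(pair, patients, reactions, round(rate, 2))
```

The threshold also shows why so much data is needed: if a combination occurs in only 1 out of 1000 people, observing it in even 100 patients requires on the order of 100,000 records. A raw rate like this says nothing about causation, so any flagged pair would still need proper statistical and clinical review.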
Data mining yields information that would not otherwise be available, but that information must be properly interpreted to be useful. When the data collected involve individual people, many questions arise concerning privacy, legality, and ethics.