I vocalized this title recently when explaining what data mining does.
The challenge that many people face is information overload. Too much information. Too much data. Some distinguish between what data means and what information means. The challenge is the same: too much, and too much of too much.
I was talking with a new friend about the weather: not the actual weather, but weather predictions. I was telling him that in many weather monitoring applications, data are discarded, and only the fraction determined to be “meaningful” is retained for predictive applications. Weather people throw away data for the same reason non-profit and for-profit organizations throw away or shred files: people only keep what they believe will be of value to them in the future.
Data or information (make your choice) is expensive to keep and archive. Perhaps the per-byte storage cost is approaching zero, but what does not approach zero is the ongoing cost of keeping data organized (including the metadata). Someone still has to make that information available. Search engines can do a lot, but search engines do not categorize; people do. Computer algorithms can help automate some classification, but values start and end with people.
People determine what is news, and that decision starts the scientific process for determining patterns in data. I do not believe science creates itself or that science emerges on its own (as if Science were a personality with a decisive will). The anthropomorphized Science might be entertaining for science fiction, perhaps on another Star Trek adventure. Pragmatically, Science does not determine results; people, groups of people, and communities do.
With the weather, people have different goals, even with the same data. In weather prediction, some people hope it will rain, and others hope it will not. The value systems are different, and the same data mining models can help both groups answer the same question. Data mining will not adjudicate among groups of people; it simply provides insight into the vast amount of data and helps human interpreters focus on certain points of information.
I am among those who believe that humans typically process information through patterns. We might deride discrimination as being inherently wrong, but what most people are actually against is discrimination closed to new information. I believe science is a centrally important portal for new information, but not the only one. Some ideas are beyond science, including logic and self-knowledge. Data mining can be a powerful tool to surface new patterns from empirically based investigation. I promote data mining to be used within the scientific method, and logic helps apply values to the results from science.
Data mining separates news from noise.