Data Mining with Microsoft SQL Server 2008 Review Chapter 6
The book now moves into a series of chapters, six through twelve, that take an in-depth look at the individual algorithms. I will repeat a comment from earlier in this series: this book was authored by the technology gurus who developed this software. The text supplements and extends what is freely available through the MSDN Product Documentation (separately downloadable as SQL Server Books Online). The book has two important features:
- Detailed how-to tutorials and instructions on how to use the technology
- Behind-the-scenes technical tips which, though authoritative, cannot and should not be in the product documentation because Microsoft wants to promise functionality, not implementation. In other words, how a product is implemented may change, though the functions should remain consistent with the Microsoft documentation.
Now, let’s talk about the use of “Microsoft” in the chapter title, “Microsoft Naïve Bayes” (and in the titles of subsequent chapters), to describe the algorithms. The Naïve Bayes machine learning algorithm is well known in the literature. Microsoft has made tweaks, ranging from minor to major, to each algorithm, allowing the company to rightfully claim the implementation as its own. I do not have personal knowledge of whether these changes amount to a patent level of unique creation, but they are certainly enough to qualify for a copyright. Later, chapter 17 will talk about extending this technology and developing your own algorithms. Thus, it’s fair for Microsoft to put its name on its algorithms, and that name persists through the data mining wizards and interfaces. Some future third-party developers might choose to make their own implementations of these same algorithms and add their own names. If you choose to make one, I encourage you to share it, or at least a free version of it, on the open-source community site codeplex.com.