TechEd 2011 Atlanta GA Presentation

Today I presented a talk at TechEd North America 2011 titled “Enterprise Data Mining with SQL Server.” This post outlines that presentation and provides a link to download the demo code. The presentation is also available on Channel 9, and I provide that link in this blog post.

The premise of the presentation is that the word Enterprise separates the SQL Server Data Mining technology from many data mining applications. Many competing commercial or open-source solutions are built for the desktop. While this technology can be used on a single computer (see my description of possible configurations), the word Enterprise implies that data mining can help large organizations leverage information. People who implement this type of solution can become leaders of leaders of leaders, even if they do not directly manage another person. The data mining models themselves provide data leadership.

I have discussed the term data mining on this blog and on this webpage. The term encompasses not just the machine learning algorithms, but also the entire data preparation process that comes before posting a result for actionable decisions. Mathematically, much of the difference in results comes from accurate and refined data preparation (often called ETL) prior to data mining model creation.

The technology is SQL Server 2008 R2. Since 2005, the Standard and higher editions of SQL Server have included the data mining technology as part of the license. Data mining is part of Analysis Services, and again, SQL Server Data Mining is NOT an application but a service. Because it is a service, it needs a face, and that face could be SQL Server Management Studio (SSMS, included with SQL Server), Business Intelligence Development Studio (BIDS, also included with SQL Server) or, as in this presentation, PowerShell (a free add-on from Microsoft often included with the newer operating systems, but also downloadable for many older Windows versions). Another face not discussed in this presentation is Excel 2007 or 2010 (both of which access data mining through the free data mining add-in). People can extend the product by making their own interface, and I discuss that topic in the presentation.

Continue reading “TechEd 2011 Atlanta GA Presentation” »

Vote for MarkTab at PASS Summit

PASS is the Professional Association for SQL Server, the users group for SQL Server fans and professionals. Though I am currently presenting at the SQL Rally conference (the second-largest event in the PASS conference collection), plans for the Fall 2011 PASS Summit are already underway. The conference will be held in Seattle, Washington, and I have submitted three session abstracts.

I am posting today to encourage you to indicate your interest in these topics by voting online by May 20, 2011. You can indicate your interest by:

  • Signing up for free to be a “PASS Member” (required to vote in the “preference” area)
  • Clicking sessions which you think are interesting (does not require payment or obligation to attend)

Continue reading “Vote for MarkTab at PASS Summit” »

SQL People Interviews MarkTab

Andy Leonard is a technology professional and community promoter who started the SQLPeople.NET website. In his @AndyLeonard Twitter description, he characterizes himself as follows:

Husband, father, grandfather, SSIS [SQL Server Integration Services] guy, SQL Server Database dev, community mentor, author, blogger, tweeter, trainer, consultant, writer, chicken whisperer.

He invited a long list of people (who use SQL Server) to tell their personal stories (and I encourage you to explore the many other stories too). The questions he chose are not specific to any technology and touch more on the people side, putting a human face and emotion onto people who often face computer screens for much of a typical day. His interview with me was posted recently and already has 178 views (as of my reading it today). I provide a link at the end of this blog post.

I provided these answers in January 2011, and I have more comments on my responses. Also, unlike the posted interview, this blog post has some pictures. However, let me first list the questions that he asked in my round of interviews:

Continue reading “SQL People Interviews MarkTab” »

Microsoft Kinect Hacking

I recently went to a presentation at the local Atlanta .NET Users’ Group on the Microsoft Kinect. The device has been one of the fastest-selling game devices in history (it holds a Guinness World Record for selling 8 million units in the first 60 days). The technology behind the device is based on machine learning algorithms.

The word “hacking” is actually inaccurate since Microsoft has been encouraging people to develop drivers. Kinect has a USB connection, which allows the device to be used with a regular computer. Nevertheless, there is something edgy and dangerous about the word “hacking,” and even when people talk about Kinect “development” (a comparatively boring and commercial word), the “hacking” term sticks. I believe the gaming community likes the word “hacking”.

I see that the slides for the presentation I saw in March 2011 are not posted online. Thus, in this blog post I will have some comments on the device and some links to video. I will not be talking about the Kinect as a consumer game device, but I will have a link where you can explore that topic on your own.

Continue reading “Microsoft Kinect Hacking” »

PowerShell with SQL Server Data Mining

My paper on “PowerShell and SQL Server Data Mining” has just been published in the April 2011 edition of Solid Quality Journal.  I cofounded this journal, and many hands work to produce monthly issues and contribute to the articles. I have been impressed by the graphics layout team, and they make my work look great.

This PowerShell topic has been popular on this blog, since PowerShell is still a relatively new technology.  I believe PowerShell is an important part of complete enterprise-level management. I will provide my abstract, and then link to the article.

Continue reading “PowerShell with SQL Server Data Mining” »

Tableau Software “Data Mining” Visualizations

Tableau Software produces first-rate visualization software.  They are among the many open-source and commercial visualization products (and services) listed by KDNuggets: http://www.kdnuggets.com/software/visualization.html.

As blog readers remember from February 2011, I posted on DevExpress and their marketing of the phrase “data mining” for their visualization software.  As a result of that blog post, I had a productive interaction with one of their product experts, who conceded the point.  Further, as you may remember from that post, my intention was to challenge commercial vendors to put machine learning algorithms even in the “View” layer (of the MVC design pattern).

Today, I report on heavy marketing by Tableau Software to associate themselves with “data mining” while largely omitting machine learning (along the way, we also catch similar Microsoft marketing for PowerPivot).  Both Tableau Software and Microsoft produce visualizations with trend lines, which arguably are a calculated regression.  However, trend lines alone do not encompass the rich science behind machine learning algorithms, even those available in SQL Server Data Mining since 2005. The difference provides a competitive opportunity for the much-needed visualization vendors.
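
To make the trend line point concrete, here is a minimal sketch in Python of what such a line amounts to: an ordinary least-squares fit of a single straight line through the plotted points. The function name and the sample numbers are hypothetical, chosen only for illustration, and I am assuming a simple linear fit; I am not claiming this is how Tableau Software or PowerPivot computes its trend lines.

```python
from statistics import mean


def fit_trend_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Hypothetical helper for illustration only; not taken from any
    vendor's product.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x from its mean.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    # Sum of cross-deviations of x and y from their means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Made-up sample data: five periods and a value for each.
periods = [1, 2, 3, 4, 5]
values = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_trend_line(periods, values)
print(f"trend line: y = {slope:.2f} * x + {intercept:.2f}")
```

The entire “model” here is two numbers, a slope and an intercept. That is useful on a chart, but it is a long way from the machine learning algorithms discussed in this post.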

Visualization alone is not data mining.  If visualization were data mining, then Excel 2010 alone, with all its fancy built-in graphs, would be considered “data mining” (but read on since Excel 2010 does do nifty linear regression visualization, and Tableau Software has nice trend lines too).  Under a loose “data mining” assumption, all spreadsheets going back to my earlier favorites, Lotus 1-2-3 and VisiCalc, would be “data mining” software.  I liked Lotus 1-2-3 graphs, and seeing how they changed along with source data.  Stopping at Lotus 1-2-3 circa 1983 does NOT promote the incredible machine learning science developed since then.  C-level executives and venture capitalists looking to invest in the next big “data mining” systems should not be paying for just 1980s technology.  It’s 2011; invest your money more wisely.

In this blog post:

  • A demonstration of how Tableau Software is marketing their “data mining” visualizations
  • An example of how someone used Tableau Software to connect to SQL Server Data Mining
  • A challenge to visualization entrepreneurs to incorporate machine learning into their software
  • My own gasoline data example discussing how to see the known and unknown

I have a variety of people reading this blog post including:

  • Analysts who use data mining to produce models
  • C-level executives and venture capitalists wanting to know what to look for in visual analytics software
  • Visualization developers looking for that next competitive edge in the growing business intelligence industry

Someone might be in all these groups, but hopefully my comments will help you explain this “data mining” issue to other groups.

Continue reading “Tableau Software “Data Mining” Visualizations” »