Exploring the Data Driven Economy
KDnuggets ranks as one of the most important web portals for the data mining and analytics community. In its 2013 “14th annual software poll,” the free, open source RapidMiner/RapidAnalytics (“RapidMiner”) software from Rapid-I was the most used analytics software, cited by 39.2% of respondents. Rapid-I’s free software led other open source and widely distributed commercial tools, such as R and Excel. It had only slightly trailed open source R and commercial Excel in 2012, but took the lead in the 2013 poll. Rapid-I’s commercial versions also scored well, with 12% of respondents citing commercial RapidMiner as an analytics tool they used.
Before you go out trying to figure out how to establish an equity position in Rapid-I, let us share a fairly large grain of salt about the KDnuggets survey. First, it appears the polling technique was self-selecting and naturally centered on the KDnuggets readership, so, for example, the voices of the tens of thousands of SAS experts who happen not to pay attention to KDnuggets were not heard in the survey. That is, this was not a random sample of worldwide BI/analytics professionals. Second, it isn’t clear what the qualification criteria for respondents were; other than geographical location, it isn’t apparent whether any more demographic data were captured. Third, given the statistical focus of KDnuggets’ members, hardly any database or infrastructure software showed well on the list. Fourth, Rapid-I did some begging to get its aficionados out to vote in the KDnuggets poll.
Despite those cautions, the impact of RapidMiner cannot be overlooked. The respondent base of the KDnuggets poll was indeed worldwide: one-third USA, 45% EMEA, nearly 9% Latin America, and more than 13% Asia/ANZ. There were 1,880 voters. It is impressive for any tool to rank up there with R and Excel, particularly given the analytics-insider profile of KDnuggets readers, and even allowing for the pitch to vote and the other less than purely scientific characteristics of the poll.
Who then is Rapid-I? Based in Dortmund, Germany, it was established in 2006. Since 2012 it has had locations in the USA, specifically in Burlington, MA and Sunnyvale, CA. It cites about 30 partners and deployments in about 50 countries. About 50 developers worldwide participate in the ongoing development of open source RapidMiner, with the majority of these contributors working for Rapid-I. Ingo Mierswa is the founder of Rapid-I and CEO of Rapid-I Germany, and also leads the RapidMiner development team. RapidMiner is a Java-based tool, and the open source community uses Eclipse as its standard IDE. Anonymous access to the source is allowed, and there is clearly a focus on subjects like R integration, information extraction, text/web mining, and time series analysis; these make up the special interest groups of the RapidMiner community.
The team that won the Boston-based Hack/Reduce competition in late 2012 used a combination of RapidMiner and Radoop as its primary tools, with the latter offering a Hadoop-based solution for BI/analytics along the lines of Platfora. Giuseppe Taibi, CEO of Rapid-I North America, participated on the winning team.
Look, I have no way of validating the actual reach of RapidMiner. I am, however, a big fan of KDnuggets, and there is something quite real about the impact of RapidMiner. R, by itself, is not a drag-and-drop environment, and without Revolution Analytics and other integrations, R is not often used in enterprise BI/analytics. Excel, too, has its obvious limits. RapidMiner, similar to the Weka/Pentaho project, which also scored well in the KDnuggets poll, offers a true visual environment for data mining and analytics, and like Pentaho, Rapid-I offers a commercialized version of its popular open source tool. So, enterprise BI/analytics types, why not give it a try?