{"id":6600,"date":"2012-02-02T12:34:04","date_gmt":"2012-02-02T12:34:04","guid":{"rendered":"http:\/\/www.smartdatacollective.com\/index.php\/post\/analytic-applications-are-built-data-scientists\/"},"modified":"2012-02-02T12:34:04","modified_gmt":"2012-02-02T12:34:04","slug":"analytic-applications-are-built-data-scientists","status":"publish","type":"post","link":"https:\/\/www.smartdatacollective.com\/analytic-applications-are-built-data-scientists\/","title":{"rendered":"Analytic Applications are Built by Data Scientists"},"content":{"rendered":"<p>Ventana Research analyst David Menninger was on the judging panel for the Applications of R in Business contest. In a post on the Ventana research blog, he offers his <a href=\"http:\/\/davidmenninger.ventanaresearch.com\/2012\/02\/01\/revolution-analytics-hosts-contest-on-business-predicting-the-future\/\" target=\"_self\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">perspectives on the contest<\/a>, noting that<\/p>\n<p><!--more--><\/p>\n<blockquote>\n<p>R, as a statistical package, includes many algorithms for predictive analytics, including regression, clustering, classification, text mining and other techniques. 
The&nbsp;<a href=\"http:\/\/www.inside-r.org\/category\/tags\/contest\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">contest submissions<\/a>&nbsp;supported a variety of business cases, including, among others,&nbsp;<a href=\"http:\/\/www.inside-r.org\/howto\/time-series-analysis-and-order-prediction-r\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">predicting order amounts to optimize manufacturing processes<\/a>,&nbsp;&nbsp;<a href=\"http:\/\/www.inside-r.org\/howto\/direct-marketing-flight-forecasting-system\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">predicting marketing campaign effectiveness to optimize marketing spending<\/a>,&nbsp;<a href=\"http:\/\/www.inside-r.org\/howto\/towards-ideal-steel-plant-online-liquid-steel-temperature-prediction-using-r\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">predicting liquid steel temperatures to optimize steel plant processes<\/a>&nbsp;and&nbsp;<a href=\"http:\/\/www.inside-r.org\/howto\/mining-twitter-airline-consumer-sentiment\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">performing sentiment analysis of Twitter data<\/a>.<\/p>\n<\/blockquote>\n<p>(Incidentally, David also has a great riff on the <a href=\"http:\/\/davidmenninger.ventanaresearch.com\/2012\/02\/01\/technology-terminology-whats-in-a-name\/\" target=\"_self\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">terminology of &#8220;predictive analytics&#8221; and &#8220;big data&#8221;<\/a>&nbsp;out today.) 
He also notes that these applications are compelling precisely because of the close relationship between the contest entrants and the business problems they demonstrated how to solve:<\/p>\n<blockquote>\n<p>The entries also demonstrated a best practice: close alignment between the analyst and the underlying business objectives. Predictive analytics is not magic. It requires an understanding of business processes and an understanding of statistical techniques. The judging criteria reflected this requirement as well. One of the three categories we were asked to score was applicability of the submission to business. I think it\u2019s clear how the analyses in the winning entries could provide significant business value.<\/p>\n<\/blockquote>\n<p>As David notes, however, the counterpoint to this is that the analyst must combine <em>both<\/em> the business understanding and the statistical expertise. &#8220;How many people in your organization could perform those types of analyses?&#8221; he rightly asks. A combination of statistical tools along with domain expertise (plus the technical skills to implement the solution) is the hallmark of a good data scientist, which is exactly why many organizations are looking to build effective <a href=\"http:\/\/radar.oreilly.com\/2011\/09\/building-data-science-teams.html\" target=\"_self\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">data science teams<\/a>.<\/p>\n<p>By the way, while the concept of &#8220;data scientist&#8221; is relatively new, this idea of combining statistical analysis with domain expertise is not. 
Bill Cleveland (yes, <em>that<\/em> <a href=\"http:\/\/stat.bell-labs.com\/wsc\/index.html\" target=\"_self\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Bill Cleveland<\/a>) made similar suggestions in a prescient paper back in 2001: &#8220;<a href=\"http:\/\/cm.bell-labs.com\/cm\/ms\/departments\/sia\/doc\/datascience.pdf\" target=\"_self\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Data Science: An Action Plan for Expanding the Technical Areas of the Field of Statistics<\/a>&#8221; (International Statistical Review, 69).<\/p>\n<p>David Menninger:&nbsp;<a href=\"http:\/\/davidmenninger.ventanaresearch.com\/2012\/02\/01\/revolution-analytics-hosts-contest-on-business-predicting-the-future\/\" target=\"_self\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Revolution Analytics Hosts Contest on Business Predicting the Future<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Ventana Research analyst David Menninger was on the judging panel for the Applications of R in Business contest. 
In a post on the Ventana research blog, he offers his perspectives on the contest, noting that<\/p>\n","protected":false},"author":29,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_seopress_robots_primary_cat":"","_seopress_titles_title":"","_seopress_titles_desc":"","_seopress_robots_index":"","footnotes":""},"categories":[5,8,19],"tags":[252],"class_list":{"0":"post-6600","1":"post","2":"type-post","3":"status-publish","4":"format-standard","6":"category-data-quality","7":"category-predictive-analytics","8":"category-r-programming-language","9":"tag-big-data"},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/posts\/6600","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/users\/29"}],"replies":[{"embeddable":true,"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/comments?post=6600"}],"version-history":[{"count":0,"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/posts\/6600\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/media?parent=6600"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/categories?post=6600"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/tags?post=6600"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}