{"id":6912,"date":"2012-04-27T02:41:07","date_gmt":"2012-04-27T02:41:07","guid":{"rendered":"http:\/\/www.smartdatacollective.com\/index.php\/post\/evolution-what-data-science\/"},"modified":"2012-04-27T02:41:07","modified_gmt":"2012-04-27T02:41:07","slug":"evolution-what-data-science","status":"publish","type":"post","link":"https:\/\/www.smartdatacollective.com\/evolution-what-data-science\/","title":{"rendered":"The Evolution of &#8220;What is Data Science?&#8221;"},"content":{"rendered":"<p>I\u2019m in the process of researching the origin and evolution of data science as a discipline and a profession. Here are the milestones that I have picked up so far, tracking the evolution of the term \u201cdata science,\u201d attempts to define it, and some related developments. &nbsp;<em>I would greatly appreciate any pointers to additional key milestones (events, publications, etc.)<\/em>.<\/p>\n<p><!--more--><\/p>\n<p><strong>1974<\/strong>&nbsp;<a href=\"http:\/\/en.wikipedia.org\/wiki\/Peter_Naur\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Peter Naur<\/a>&nbsp;publishes&nbsp;<em>Concise Survey of Computer Methods<\/em>&nbsp;in Sweden and the United States. The book is a survey of contemporary data processing methods that are used in a wide range of applications. 
It is organized around the concept of data as defined in the&nbsp;<a href=\"http:\/\/en.wikipedia.org\/wiki\/International_Federation_for_Information_Processing\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\"><em>IFIP<\/em><\/a><em>&nbsp;Guide to Concepts and Terms in Data Processing<\/em>, which defines data as \u201ca representation of facts or ideas in a formalized manner capable of being communicated or manipulated by some process.\u201d The Preface to the book tells the reader that a course plan was presented at the IFIP Congress in 1968, titled \u201cDatalogy, the science of data and of data processes and its place in education,\u201d and that in the text of the book, \u201cthe term \u2018data science\u2019 has been used freely.\u201d Naur offers the following definition of data science: \u201cThe science of dealing with data, once they have been established, while the relation of the data to what they represent is delegated to other fields and sciences.\u201d<\/p>\n<p><strong>1977<\/strong>&nbsp;<a href=\"http:\/\/www.iasc-isi.org\/\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">The International Association for Statistical Computing<\/a>&nbsp;(IASC) was founded as a Section of the&nbsp;<a href=\"http:\/\/www.isi-web.org\/\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">ISI<\/a>. 
\u201cIt is the mission of the IASC to link traditional statistical methodology, modern computer technology, and the knowledge of domain experts in order to convert data into information and knowledge.\u201d<\/p>\n<p><strong>1989<\/strong>&nbsp;<a href=\"http:\/\/www.kdnuggets.com\/gps.html\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Gregory Piatetsky-Shapiro<\/a>&nbsp;organizes and chairs&nbsp;<a href=\"http:\/\/www.kdnuggets.com\/meetings\/kdd89\/index.html\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">the first Knowledge Discovery in Databases (KDD) workshop<\/a>. In 1995, it became the annual ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD).<\/p>\n<p><strong>1996<\/strong>&nbsp;Members of the&nbsp;<a href=\"http:\/\/www.classification-society.org\/ifcs\/index.html\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\"><em>International Federation of Classification Societies<\/em>&nbsp;<em>(IFCS)<\/em><\/a>&nbsp;meet in Tokyo for their biennial conference. For the first time, the term \u201cdata science\u201d is included in the title of the conference (\u201c<a href=\"http:\/\/whatsthebigdata.com\/2012\/04\/15\/data-science-is-so-1996\/\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Data science, classification, and related methods<\/a>\u201d). The IFCS was founded in 1985 by six country- and language-specific classification societies, one of which,&nbsp;<a href=\"http:\/\/www.classification-society.org\/clsoc\/clsoc.php\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\"><em>The Classification Society<\/em><\/a>, was founded in 1964. 
The aim of these classification societies has been to support the study of \u201cthe principle and practice of classification in a wide range of disciplines\u201d (CS), \u201cresearch in problems of classification, data analysis, and systems for ordering knowledge\u201d (IFCS), and the \u201cstudy of classification and clustering (including systematic methods of creating classifications from data) and related statistical and data analytic methods\u201d (CSNA bylaws). The classification societies have variously used the terms data analysis, data mining, and data science in their publications.<\/p>\n<p><strong>1997<\/strong>&nbsp;Launch of the journal&nbsp;<a href=\"http:\/\/www.springer.com\/computer\/database+management+%26+information+retrieval\/journal\/10618\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Data Mining and Knowledge Discovery<\/a>: \u201cAdvances in data gathering, storage, and distribution have created a need for computational tools and techniques to aid in data analysis. Data Mining and Knowledge Discovery in Databases (KDD) is a rapidly growing area of research and application that builds on techniques and theories from many fields, including statistics, databases, pattern recognition and learning, data visualization, uncertainty modelling, data warehousing and OLAP, optimization, and high performance computing. KDD is concerned with issues of scalability, the multi-step knowledge discovery process for extracting useful patterns and models from raw data stores (including data cleaning and noise modelling), and issues of making discovered patterns understandable.\u201d<\/p>\n<p><strong>2001<\/strong>&nbsp;<a href=\"http:\/\/www.stat.purdue.edu\/~wsc\/\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">William S. 
Cleveland<\/a>&nbsp; (then at Bell Labs, now at the Department of Statistics at Purdue University) publishes &#8220;<a href=\"http:\/\/cm.bell-labs.com\/cm\/ms\/departments\/sia\/doc\/datascience.pdf\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Data Science: An Action Plan for Expanding the Technical Areas of the Field of Statistics<\/a>.&#8221; It is a plan \u201cto enlarge the major areas of technical work of the field of statistics. Because the plan is ambitious and implies substantial change, the altered field will be called \u2018data science.\u2019&#8221; The plan \u201csets out six technical areas for a university department\u201d: Multidisciplinary Investigations, Models and Methods for Data, Computing with Data, Pedagogy, Tool Evaluation, and Theory. Cleveland puts the proposed new discipline in the context of computer science and the contemporary work on data mining: \u201c\u2026the benefit to the data analyst has been limited, because the knowledge among computer scientists about how to think of and approach the analysis of data is limited, just as the knowledge of computing environments by statisticians is limited. A merger of knowledge bases would produce a powerful force for innovation. This suggests that statisticians should look to computing for knowledge today just as data science looked to mathematics in the past. \u2026 departments of data science should contain faculty members who devote their careers to advances in computing with data and who form partnership with computer scientists.\u201d<\/p>\n<p><strong>April 2002<\/strong>&nbsp;The&nbsp;<em><a href=\"http:\/\/www.codata.org\/dsj\/index.html\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Data Science Journal<\/a><\/em>&nbsp;is launched, publishing papers on \u201cthe management of data and databases in Science and Technology. 
The scope of the Journal includes descriptions of data systems, their publication on the internet, applications and legal issues.\u201d The journal is published by the Committee on Data for Science and Technology (<a href=\"http:\/\/www.codata.org\/\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">CODATA<\/a>) of the International Council for Science (ICSU).<\/p>\n<p><strong>January 2003<\/strong>&nbsp;The&nbsp;<a href=\"http:\/\/www.jds-online.com\/\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\"><em>Journal of Data Science<\/em><\/a><em>&nbsp;<\/em>is launched: \u201cBy \u2018Data Science\u2019 we mean almost everything that has something to do with data: Collecting, analyzing, modeling&#8230;&#8230; yet the most important part is its applications &#8212; all sorts of applications. This journal is devoted to applications of statistical methods at large\u2026. The Journal of Data Science will provide a platform for all data workers to present their views and exchange ideas.\u201d<\/p>\n<p><strong>September 2005<\/strong>&nbsp;<a href=\"http:\/\/www.nsf.gov\/nsb\/\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">The National Science Board<\/a>&nbsp;publishes \u201c<a href=\"http:\/\/www.nsf.gov\/pubs\/2005\/nsb0540\/\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Long-lived Digital Data Collections: Enabling Research and Education in the 21<sup>st<\/sup>&nbsp;Century<\/a>.\u201d One of the recommendations of the report reads: \u201cThe NSF, working in partnership with collection managers and the community at large, should act to develop and mature the career path for data scientists and to ensure that the research enterprise includes a sufficient number of high-quality data scientists.\u201d The report defines data scientists as \u201cthe information and computer scientists, database and software 
engineers and programmers, disciplinary experts, curators and expert annotators, librarians, archivists, and others, who are crucial to the successful management of a digital data collection.\u201d<\/p>\n<p><strong>July 2008<\/strong>&nbsp;The&nbsp;<a href=\"http:\/\/www.jisc.ac.uk\/\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">JISC<\/a>&nbsp;publishes the final report of a study it commissioned to \u201cexamine and make recommendations on the role and career development of data scientists and the associated supply of specialist data curation skills to the research community.\u201d The study\u2019s final report, \u201c<a href=\"http:\/\/www.jisc.ac.uk\/whatwedo\/programmes\/digitalrepositories2007\/dataskillscareers.aspx\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">The Skills, Role &amp; Career Structure of Data Scientists &amp; Curators: &nbsp;Assessment of Current Practice &amp; Future Needs<\/a>,\u201d defines data scientists as \u201cpeople who work where the research is carried out \u2013 or, in the case of data centre personnel, in close collaboration with the creators of the data \u2013 and may be involved in creative enquiry and analysis, enabling others to work with digital data, and developments in data base technology.\u201d<\/p>\n<p><strong>January 2009<\/strong>&nbsp;<a href=\"http:\/\/www.nitrd.gov\/About\/Harnessing_Power_Web.pdf\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\"><em>Harnessing the Power of Digital Data for Science and Society<\/em><\/a>&nbsp;is published. 
This report of the Interagency Working Group on Digital Data to the Committee on Science of the National Science and Technology Council states that \u201cThe nation needs to identify and promote the emergence of new disciplines and specialists expert in addressing the complex and dynamic challenges of digital preservation, sustained access, reuse and repurposing of data. Many disciplines are seeing the emergence of a new type of data science and management expert, accomplished in the computer, information, and data sciences arenas and in another domain science. These individuals are key to the current and future success of the scientific enterprise. However, these individuals often receive little recognition for their contributions and have limited career paths. Critical challenges in achieving our strategic vision include providing an effective pipeline of data professionals to ensure that the needs and opportunities of the future can be met and providing these professionals with appropriate rewards and recognition.\u201d The report discusses the emergence of \u201cnew information disciplines\u201d and lists a few examples:<\/p>\n<ul>\n<li>Digital Curators: experts knowledgeable of and with responsibility for the content of digital collection(s);<\/li>\n<li>Digital Archivists: experts competent to appraise, acquire, authenticate, preserve, and provide access to records in digital form; and<\/li>\n<li>Data Scientists: information and computer scientists, database and software engineers and programmers, disciplinary experts, expert annotators, and others who are crucial to the successful management of a digital data collection.<\/li>\n<\/ul>\n<p><strong>May 2009<\/strong>&nbsp;<a 
href=\"http:\/\/www.linkedin.com\/profile\/view?id=413949&amp;authType=NAME_SEARCH&amp;authToken=nqu_&amp;locale=en_US&amp;srchid=020baa4d-af36-4fe8-a959-e84fbaca26da-0&amp;srchindex=1&amp;srchtotal=268&amp;goback=.fps_PBCK_*1_Mike_Driscoll_*1_*1_*1_*1_*2_*1_Y_*1_*1_*1_false_1_R_*1_*51_*1_*51_true_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2&amp;pvs=ps&amp;trk=pp_profile_name_link\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Mike Driscoll<\/a>&nbsp;writes in \u201c<a href=\"http:\/\/www.dataspora.com\/2009\/05\/sexy-data-geeks\/\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">The Three Sexy Skills of Data Geeks<\/a>\u201d: \u201c\u2026with the Age of Data upon us, those who can model, munge, and visually communicate data \u2014 call us statisticians or data geeks \u2014 are a hot commodity.\u201d [Driscoll will follow up with&nbsp;<a href=\"http:\/\/www.dataspora.com\/2010\/08\/the-seven-secrets-of-successful-data-scientists\/\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">The Seven Secrets of Successful Data Scientists<\/a>&nbsp;in August 2010]<\/p>\n<p><strong>June 2009<\/strong>&nbsp;<a href=\"http:\/\/flowingdata.com\/about-nathan\/\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Nathan Yau<\/a>&nbsp;writes in \u201c<a href=\"http:\/\/flowingdata.com\/2009\/06\/04\/rise-of-the-data-scientist\/\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Rise of the Data Scientist<\/a>\u201d: &nbsp;\u201cAs we&#8217;ve all read by now, Google&#8217;s chief economist Hal Varian&nbsp;<a href=\"http:\/\/flowingdata.com\/2009\/02\/25\/googles-chief-economist-hal-varian-on-statistics-and-data\/\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">commented<\/a>&nbsp;in January that the next sexy job in the next 10 years 
would be statisticians. Obviously, I whole-heartedly agree. Heck, I&#8217;d go a step further and say they&#8217;re sexy now &#8211; mentally and physically. However, if you went on to read the rest of Varian&#8217;s interview, you&#8217;d know that by statisticians, he actually meant it as a general title for someone who is able to extract information from large datasets and then present something of use to non-data experts\u2026 [Ben] Fry\u2026 argues for an entirely new field that combines the skills and talents from often disjoint areas of expertise\u2026 [computer science; mathematics, statistics, and data mining; graphic design; infovis and human-computer interaction]. And after two years of highlighting visualization on FlowingData, it seems collaborations between the fields are growing more common, but more importantly, computational information design edges closer to reality. We&#8217;re seeing&nbsp;<em>data scientists<\/em>&nbsp;&#8211; people who can do it all &#8211; emerge from the rest of the pack.\u201d<\/p>\n<p><strong>June 2009<\/strong>&nbsp;<a href=\"http:\/\/www.linkedin.com\/profile\/view?id=27776413&amp;authType=name&amp;authToken=JHob&amp;goback=.anb_2013423_*2_*1_*1_*1_*1_*1\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Troy Sadkowsky<\/a>&nbsp;creates the&nbsp;<a href=\"http:\/\/www.linkedin.com\/groups\/Data-Scientists-2013423?trk=myg_ugrp_ovr\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">data scientists group<\/a>&nbsp;on LinkedIn as a companion to his website, datascientists.com (which later became&nbsp;<a href=\"http:\/\/www.datascientists.net\/\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">datascientists.net<\/a>).<\/p>\n<p><strong>February 2010<\/strong>&nbsp;Kenneth Cukier writes in&nbsp;&#8220;<a href=\"http:\/\/www.economist.com\/node\/15557443\" target=\"_blank\" data-wpel-link=\"external\" 
rel=\"external noopener noreferrer ugc\">Data, data everywhere: A special report on managing&nbsp;information<\/a>&#8220;:&nbsp;&#8220;&#8230; a new kind of professional has emerged, the data scientist, who&nbsp;combines the skills of software programmer, statistician and&nbsp;storyteller\/artist to extract the nuggets of gold hidden under&nbsp;mountains of data.&#8221;<\/p>\n<p><strong>June 2010<\/strong>&nbsp;<a href=\"http:\/\/radar.oreilly.com\/mikel\/index.html\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Mike Loukides<\/a>&nbsp;writes in \u201c<a href=\"http:\/\/radar.oreilly.com\/2010\/06\/what-is-data-science.html\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">What is Data Science?<\/a>\u201d: &nbsp;\u201cData scientists combine entrepreneurship with patience, the willingness to build data products incrementally, the ability to explore, and the ability to iterate over a solution. They are inherently interdisciplinary. They can tackle all aspects of a problem, from initial data collection and data conditioning to drawing conclusions. 
They can think outside the box to come up with new ways to view the problem, or to work with very broadly defined problems: \u2018here&#8217;s a lot of data, what can you make from it?\u2019&#8221;<\/p>\n<p><strong>September 2010<\/strong>&nbsp;&nbsp;<a href=\"http:\/\/www.hilarymason.com\/\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Hilary Mason<\/a>&nbsp;and&nbsp;<a href=\"http:\/\/www.columbia.edu\/~chw2\/\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Chris Wiggins<\/a>&nbsp;write in \u201c<a href=\"http:\/\/www.dataists.com\/2010\/09\/a-taxonomy-of-data-science\/\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">A Taxonomy of Data Science<\/a>\u201d: &nbsp;\u201c\u2026we thought it would be useful to propose one possible taxonomy\u2026 of what a data scientist does, in roughly chronological order: Obtain, Scrub, Explore, Model, and iNterpret\u2026. Data science is clearly a blend of the hackers\u2019 arts\u2026 statistics and machine learning\u2026 and the expertise in mathematics and the domain of the data for the analysis to be interpretable\u2026 It requires creative decisions and open-mindedness in a scientific context.\u201d<\/p>\n<p><strong>September 2010<\/strong>&nbsp;<a href=\"http:\/\/www.drewconway.com\/Drew_Conway\/About.html\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Drew Conway<\/a>&nbsp;writes in \u201c<a href=\"http:\/\/www.drewconway.com\/zia\/?p=2378\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">The Data Science Venn Diagram<\/a>\u201d: &nbsp;\u201c\u2026one needs to learn a lot as they aspire to become a fully competent data scientist. Unfortunately, simply enumerating texts and tutorials does not untangle the knots. 
Therefore, in an effort to simplify the discussion, and add my own thoughts to what is already a crowded market of ideas, I present the Data Science Venn Diagram\u2026 hacking skills, math and stats knowledge, and substantive expertise.\u201d<\/p>\n<p><strong>May 2011<\/strong>&nbsp;&nbsp;<a href=\"http:\/\/www.linkedin.com\/in\/petewarden\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Pete Warden<\/a>&nbsp;writes in \u201c<a href=\"http:\/\/radar.oreilly.com\/2011\/05\/data-science-terminology.html\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Why the term \u2018data science\u2019 is flawed but useful<\/a>\u201d: \u201cThere is no widely accepted boundary for what&#8217;s inside and outside of data science&#8217;s scope. Is it just a faddish rebranding of statistics? I don&#8217;t think so, but I also don&#8217;t have a full definition. I believe that the recent abundance of data has sparked something new in the world, and when I look around I see people with shared characteristics who don&#8217;t fit into traditional categories. These people tend to work beyond the narrow specialties that dominate the corporate and institutional world, handling everything from finding the data, processing it at scale, visualizing it and writing it up as a story. 
They also seem to start by looking at what the data can tell them, and then picking interesting threads to follow, rather than the traditional scientist&#8217;s approach of choosing the problem first and then finding data to shed light on it.\u201d<\/p>\n<p><strong>May 2011<\/strong>&nbsp;<a href=\"http:\/\/www.linkedin.com\/profile\/view?id=1197604&amp;authType=NAME_SEARCH&amp;authToken=iQkZ&amp;locale=en_US&amp;srchid=3bcd74d5-4670-4b79-937a-14ee62dddf65-0&amp;srchindex=1&amp;srchtotal=96&amp;goback=.fps_PBCK_david+smith+revolution_*1_*1_*1_*1_*1_*1_*2_*1_Y_*1_*1_*1_false_1_R_*1_*51_*1_*51_true_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2&amp;pvs=ps&amp;trk=pp_profile_name_link\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">David Smith<\/a>&nbsp;writes in &#8220;<a href=\"http:\/\/blog.revolutionanalytics.com\/2011\/05\/data-science-whats-in-a-name.html\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">\u2019Data Science\u2019: What&#8217;s in a name?<\/a>\u201d: &nbsp;&nbsp;\u201cThe terms \u2018Data Science\u2019 and \u2018Data Scientist\u2019 have only been in common usage for a little over a year, but they&#8217;ve really taken off since then: many companies are now hiring for \u2018data scientists\u2019, and entire conferences are run under the name of \u2018data science\u2019. But despite the widespread adoption, some have resisted the change from the more traditional terms like \u2018statistician\u2019 or \u2018quant\u2019 or \u2018data analyst\u2019\u2026. 
I think \u2018Data Science\u2019 better describes what we actually do: a combination of computer hacking, data analysis, and problem solving.\u201d<\/p>\n<p><strong>September 2011<\/strong>&nbsp;<a href=\"http:\/\/www.linkedin.com\/profile\/view?id=36953496&amp;authType=NAME_SEARCH&amp;authToken=fVXu&amp;locale=en_US&amp;srchid=3b0338b4-50a2-433c-819e-97a97b9a0896-0&amp;srchindex=3&amp;srchtotal=59&amp;goback=.fps_PBCK_Harlan+Harris+_*1_*1_*1_*1_*1_*1_*2_*1_Y_*1_*1_*1_false_1_R_*1_*51_*1_*51_true_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2_*2&amp;pvs=ps&amp;trk=pp_profile_name_link\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Harlan Harris<\/a>&nbsp;writes in \u201c<a href=\"http:\/\/www.harlan.harris.name\/2011\/09\/data-science-moores-law-and-moneyball\/\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Data Science, Moore\u2019s Law, and Moneyball<\/a>\u201d : \u201c\u2019Data Science\u2019 is defined as what \u2018Data Scientists\u2019 do. What Data Scientists do has been very well covered, and it runs the gamut from data collection and munging, through application of statistics and machine learning and related techniques, to interpretation, communication, and visualization of the results. Who Data Scientists are may be the more fundamental question\u2026&nbsp; I tend to like the idea that Data Science is defined by its practitioners, that it\u2019s a career path rather than a category of activities. 
In my conversations with people, it seems that people who consider themselves Data Scientists typically have eclectic career paths, that might in some ways seem not to make much sense.\u201d<\/p>\n<p><strong>September 2011<\/strong>&nbsp;<a href=\"http:\/\/www.linkedin.com\/in\/dpatil\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">DJ Patil<\/a>&nbsp;writes in \u201c<a href=\"http:\/\/radar.oreilly.com\/2011\/09\/building-data-science-teams.html?utm_source=feedburner&amp;utm_medium=feed&amp;utm_campaign=Feed%3A+oreilly%2Fradar%2Fatom+%28O%27Reilly+Radar%29&amp;utm_content=My+Yahoo\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Building Data Science Teams<\/a>\u201d: \u201cStarting in 2008, Jeff Hammerbacher (@hackingdata) and I sat down to share our experiences building the data and analytics groups at Facebook and LinkedIn. In many ways, that meeting was the start of data science as a distinct professional specialization&#8230;.&nbsp; we realized that as our organizations grew, we both had to figure out what to call the people on our teams. \u2018Business analyst\u2019 seemed too limiting. \u2018Data analyst\u2019 was a contender, but we felt that title might limit what people could do. After all, many of the people on our teams had deep engineering expertise. \u2018Research scientist\u2019 was a reasonable job title used by companies like Sun, HP, Xerox, Yahoo, and IBM. However, we felt that most research scientists worked on projects that were futuristic and abstract, and the work was done in labs that were isolated from the product development teams. It might take years for lab research to affect key products, if it ever did. Instead, the focus of our teams was to work on data applications that would have an immediate and massive impact on the business. The term that seemed to fit best was data scientist: those who use both data and science to create something new. 
\u201c<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I\u2019m in the process of researching the origin and evolution of data science as a discipline and a profession. Here are the milestones that I have picked up so far, tracking the evolution of the term \u201cdata science,\u201d attempts to define it, and some related developments. &nbsp;I would greatly appreciate any pointers to additional key [&hellip;]<\/p>\n","protected":false},"author":210,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_seopress_robots_primary_cat":"","_seopress_titles_title":"","_seopress_titles_desc":"","_seopress_robots_index":"","footnotes":""},"categories":[1],"tags":[252,974,937],"class_list":{"0":"post-6912","1":"post","2":"type-post","3":"status-publish","4":"format-standard","6":"category-uncategorized","7":"tag-big-data","8":"tag-big-data-analytics","9":"tag-data-science"},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/posts\/6912","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/users\/210"}],"replies":[{"embeddable":true,"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/comments?post=6912"}],"version-history":[{"count":0,"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/posts\/6912\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/media?parent=6912"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/categories?post=6912"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/tags?post=6912"}],"curies":[{"nam
e":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}