{"id":2534,"date":"2010-05-17T18:38:09","date_gmt":"2010-05-17T18:38:09","guid":{"rendered":"http:\/\/www.smartdatacollective.com\/index.php\/post\/27323\/"},"modified":"2010-05-17T18:38:09","modified_gmt":"2010-05-17T18:38:09","slug":"27323","status":"publish","type":"post","link":"https:\/\/www.smartdatacollective.com\/27323\/","title":{"rendered":"Winning the first game in a baseball series: a harbinger, or not?"},"content":{"rendered":"<p>For those not familiar with major-league baseball in the US (and despite living here for more than 10 years, I still include myself in that category), the games are usually played in series: team A visits the home of team B, and the two teams play two or more games against each other on successive days. It&#8217;s common wisdom that if team A wins on the first day, they&#8217;re more likely to be the victors on the second day, too. (Folks &#8220;knowledgeable in baseball and the mathematics of forecasting&#8221; put the probability of a second-day win, given a first-day win, at 65% to 70%.)&nbsp;But is that assertion borne out by the data?<\/p>\n<p>Decision Science News has analyzed data from&nbsp;all major league baseball games played between 1970 and 2009, and used R to <a href=\"http:\/\/www.decisionsciencenews.com\/2010\/05\/05\/you-won-but-how-much-was-luck-and-how-much-was-skill\/\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">analyze the likelihood of winning the second game<\/a> in a consecutive pair of games, given the result of the first. 
(All of the R code and data are provided, if you want to try and replicate the analysis yourself.) You can see various analyses in DSN&#8217;s post, but the most revealing one for me was this chart on the frequency of successive wins with respect to the margin of victory (excess runs) in the first game:<\/p>\n<p>\n<a href=\"http:\/\/revolution-computing.typepad.com\/.a\/6a010534b1db25970b013480ef519e970c-pi\" style=\"display: inline;\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\"><img decoding=\"async\" alt=\"SecondWinFrequency\" class=\"asset asset-image at-xid-6a010534b1db25970b013480ef519e970c \" src=\"http:\/\/revolution-computing.typepad.com\/.a\/6a010534b1db25970b013480ef519e970c-800wi\" title=\"SecondWinFrequency\" border=\"0\"><\/a> <br \/>\nSo moderate and even significant wins by several runs in the first game don&#8217;t appear to confer much benefit in the second. As DSN points out:<\/p>\n<blockquote>\n<p>The equation of the robust regression line is: Probability(Win_Second_Game) = .498 + .004*First_Game_Margin which suggests that even if you win the first game by an obscene 20 points, your chance of winning the second game is only 57.8%<\/p>\n<\/blockquote>\n<p>In fact, any second-game advantage from the first win that may exist is far overshadowed by the home-team advantage: &#8220;When it comes to winning the second game, it\u2019s better to be the home team who just lost than the visitor who just won.&#8221; See the full article at Decision Science News for the details of the analysis.<\/p>\n<p>Decision Science News: <a href=\"http:\/\/www.decisionsciencenews.com\/2010\/05\/05\/you-won-but-how-much-was-luck-and-how-much-was-skill\/\" target=\"_blank\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">You won, but how much was luck and how much was skill?<\/a>&nbsp;<\/p>\n<p>\n<p><p><span style=\"color: rgb(85, 26, 139); text-decoration: underline;\"><span><a 
href=\"http:\/\/blog.revolutionanalytics.com\/2010\/05\/baseball-harbinger.html\" title=\"http:\/\/blog.revolutionanalytics.com\/2010\/05\/baseball-harbinger.html\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\">Link to original post<\/a><\/span><br \/>\n<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>For those not familiar with major-league baseball in the US (and despite living here for more than 10 years, I still include myself in that category), the games are usually played in series: team A visits the home of team B, and the two teams play two or more games against each other on successive [&hellip;]<\/p>\n","protected":false},"author":29,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_seopress_robots_primary_cat":"","_seopress_titles_title":"","_seopress_titles_desc":"","_seopress_robots_index":"","footnotes":""},"categories":[2,6,8],"tags":[282,216],"class_list":{"0":"post-2534","1":"post","2":"type-post","3":"status-publish","4":"format-standard","6":"category-business-intelligence","7":"category-data-visualization","8":"category-predictive-analytics","9":"tag-data-visualization","10":"tag-decision-management"},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/posts\/2534","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/users\/29"}],"replies":[{"embeddable":true,"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/comments?post=2534"}],"version-history":[{"count":0,"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/posts\/2534\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/me
dia?parent=2534"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/categories?post=2534"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/tags?post=2534"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}