{"id":6624,"date":"2012-03-13T10:32:00","date_gmt":"2012-03-13T10:32:00","guid":{"rendered":"http:\/\/www.smartdatacollective.com\/index.php\/post\/why-large-enterprises-and-edw-owners-suddenly-care-about-bigdata\/"},"modified":"2012-03-13T10:32:00","modified_gmt":"2012-03-13T10:32:00","slug":"why-large-enterprises-and-edw-owners-suddenly-care-about-bigdata","status":"publish","type":"post","link":"https:\/\/www.smartdatacollective.com\/why-large-enterprises-and-edw-owners-suddenly-care-about-bigdata\/","title":{"rendered":"Why Large Enterprises and EDW Owners Suddenly Care About Big Data"},"content":{"rendered":"<p>While most of big data is geared towards social media and stream analytics, traditional EDW can also best leverage the power of Big Data. The concept of Big Data is not new, banks have been doing it for a while using mainframe size computers. The reason it\u2019s being talked so much now is that for the first time, cheap and massive computing power and even cheaper memory has put mainframe size power in the hands of every organization, right at the time when organizations have been struggling to justify the ROI in processing such exponential data volume.<\/p>\n<p><!--more--><\/p>\n<p>While most of big data is geared towards social media and stream analytics, traditional EDW can also best leverage the power of Big Data. The concept of Big Data is not new, banks have been doing it for a while using mainframe size computers. 
The reason it\u2019s being talked so much now is that for the first time, cheap and massive computing power and even cheaper memory has put mainframe size power in the hands of every organization, right at the time when organizations have been struggling to justify the ROI in processing such exponential data volume.<\/p>\n<p>&nbsp;<img decoding=\"async\" id=\"img-1328252444388\" class=\"alignCenter\" style=\"display: block; margin-left: auto; margin-right: auto;\" src=\"http:\/\/www.saama.com\/Portals\/94553\/images\/memoryprices.jpg\" border=\"0\" alt=\"BI Cost, BI, Business Intelligence\" \/><\/p>\n<p style=\"widows: 2; text-transform: none; text-indent: 0px; margin: 14px 0px; outline-width: 0px; font: 13px\/22px 'Helvetica Neue', Helvetica, Arial, sans-serif; white-space: normal; orphans: 2; letter-spacing: normal; color: #000000; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; border-width: 0px; padding: 0px;\">Big Data is not a performance engine. i.e. it is not a traditional database that can run queries faster. It will also not replace traditional reporting strategies. What it can do is, it can batch process millions and billions of records both unstructured and structured much faster and cheaper.&nbsp;What has also become possible with BigData Analytics is the ability to merge all analysis into one platform. 
As a direct result, data analysis has become more accurate, well-rounded, reliable, and focused on a specific business capability\/advantage.<\/p>\n<p style=\"widows: 2; text-transform: none; text-indent: 0px; margin: 14px 0px; outline-width: 0px; font: 13px\/22px 'Helvetica Neue', Helvetica, Arial, sans-serif; white-space: normal; orphans: 2; letter-spacing: normal; color: #000000; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; border-width: 0px; padding: 0px;\">Before investing money in commodity hardware and calling in consultants to wave the big data magic wand, companies should do a lot of soul-searching, because once you set the wheels in motion, it is likely to take up a lot of your organization\u2019s focus. To decide where you are on the Big Data spectrum, it is important to look at the 4 V\u2019s \u2013 Volume, Velocity, Variety, and Variability \u2013 of your data, as shown in the infographic below.&nbsp;<\/p>\n<p style=\"widows: 2; text-transform: none; text-indent: 0px; margin: 14px 0px; outline-width: 0px; font: 13px\/22px 'Helvetica Neue', Helvetica, Arial, sans-serif; white-space: normal; orphans: 2; letter-spacing: normal; color: #000000; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; border-width: 0px; padding: 0px;\"><img decoding=\"async\" id=\"img-1328252229269\" class=\"alignCenter\" style=\"display: block; margin-left: auto; margin-right: auto;\" src=\"http:\/\/www.saama.com\/Portals\/94553\/images\/4vs.jpg\" border=\"0\" alt=\"Big Data, Bigdata, BI, Business Intelligence\" \/><\/p>\n<p><img decoding=\"async\" id=\"img-1328751317481\" class=\"alignCenter\" style=\"display: block; margin-left: auto; margin-right: auto;\" src=\"http:\/\/www.saama.com\/Portals\/94553\/images\/4vslegend1.jpg\" border=\"0\" alt=\"BigData, Big Data, Business Analytics, Business Intelligence\" \/><\/p>\n<p style=\"widows: 2; text-transform: none; text-indent: 0px; margin: 14px 0px; outline-width: 0px; 
font: 13px\/22px 'Helvetica Neue', Helvetica, Arial, sans-serif; white-space: normal; orphans: 2; letter-spacing: normal; color: #000000; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; border-width: 0px; padding: 0px;\">A key question to ask is whether you have enough data volume at the source to justify Big Data processing (average data set &gt; 300GB). If you don\u2019t, you should consider investing in building a traditional enterprise data warehouse and fine-tuning your reporting metrics. If you do, you should move on to the next question: how do you want to process this amount of data?<\/p>\n<p style=\"widows: 2; text-transform: none; text-indent: 0px; margin: 14px 0px; outline-width: 0px; font: 13px\/22px 'Helvetica Neue', Helvetica, Arial, sans-serif; white-space: normal; orphans: 2; letter-spacing: normal; color: #000000; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; border-width: 0px; padding: 0px;\">One of the key technologies being widely adopted by large enterprises for Big Data processing is Hadoop. While this technology&nbsp;provides the processing power, the algorithms to make sense of the data will still need to be developed in-house. 
The most frequent application for Hadoop is to support the \u201cTransform\u201d in traditional ETL (Extract, Transform, Load), where data in a myriad of unstructured, semi-structured, and structured formats is transformed and loaded into terabyte-scale analytical data marts where predictive modelers and other data scientists can work their magic.<\/p>\n<p style=\"widows: 2; text-transform: none; text-indent: 0px; margin: 14px 0px; outline-width: 0px; font: 13px\/22px 'Helvetica Neue', Helvetica, Arial, sans-serif; white-space: normal; orphans: 2; letter-spacing: normal; color: #000000; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; border-width: 0px; padding: 0px;\">Hadoop and traditional EDW technologies can coexist in the same ecosystem, as shown below. Each has its own strengths, and when combined they provide a potent mix for your analytical needs, as we have seen in a few large companies.<\/p>\n<p style=\"widows: 2; text-transform: none; text-indent: 0px; margin: 14px 0px; outline-width: 0px; font: 13px\/22px 'Helvetica Neue', Helvetica, Arial, sans-serif; white-space: normal; orphans: 2; letter-spacing: normal; color: #000000; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; border-width: 0px; padding: 0px;\"><a href=\"http:\/\/enterprisebi.files.wordpress.com\/2012\/02\/hadoop-edw1.png\" data-wpel-link=\"external\" rel=\"external noopener noreferrer ugc\"><img loading=\"lazy\" decoding=\"async\" id=\"img-1328753795909\" class=\"alignleft size-full wp-image-59\" title=\"Hadoop-EDW\" src=\"http:\/\/enterprisebi.files.wordpress.com\/2012\/02\/hadoop-edw1.png\" alt=\"\" width=\"590\" height=\"386\" \/><\/a><\/p>\n<p style=\"widows: 2; text-transform: none; text-indent: 0px; margin: 14px 0px; outline-width: 0px; font: 13px\/22px 'Helvetica Neue', Helvetica, Arial, sans-serif; white-space: normal; orphans: 2; letter-spacing: normal; color: #000000; word-spacing: 0px; -webkit-text-size-adjust: 
auto; -webkit-text-stroke-width: 0px; border-width: 0px; padding: 0px;\">Traditional EDWs built on relational, columnar, and other approaches for storing, manipulating, and managing data will continue to exist. All of your investments in pre-Hadoop EDWs, data marts, operational data stores, and the like are reasonably safe from obsolescence.<\/p>\n<p style=\"widows: 2; text-transform: none; text-indent: 0px; margin: 14px 0px; outline-width: 0px; font: 13px\/22px 'Helvetica Neue', Helvetica, Arial, sans-serif; white-space: normal; orphans: 2; letter-spacing: normal; color: #000000; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; border-width: 0px; padding: 0px;\">The reality here is that the EDW is evolving into a virtualized cloud ecosystem in which all of these database architectures can and will coexist in a pluggable \u201cBig Data\u201d storage layer alongside HDFS, HBase (Hadoop\u2019s columnar database), Cassandra (a sibling Apache project that supports peer-to-peer persistence for complex event processing and other real-time applications), Neo4j (a graph database), and other \u201cNoSQL\u201d platforms.<\/p>\n<p style=\"widows: 2; text-transform: none; text-indent: 0px; margin: 14px 0px; outline-width: 0px; font: 13px\/22px 'Helvetica Neue', Helvetica, Arial, sans-serif; white-space: normal; orphans: 2; letter-spacing: normal; color: #000000; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; border-width: 0px; padding: 0px;\">Deciding whether to begin a Big Data implementation really boils down to one basic question: do you have the use cases for it? We will post a few sample use cases being adopted by large enterprises in our next post. Stay tuned\u2026<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>While most of big data is geared towards social media and stream analytics, traditional EDW can also leverage the power of Big Data. 
The concept of Big Data is not new; banks have been doing it for a while using mainframe-scale computers. The reason it\u2019s being talked about so much now is that for [&hellip;]<\/p>\n","protected":false},"author":239,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_seopress_robots_primary_cat":"","_seopress_titles_title":"","_seopress_titles_desc":"","_seopress_robots_index":"","footnotes":""},"categories":[15,27,7,21,29],"tags":[349],"class_list":{"0":"post-6624","1":"post","2":"type-post","3":"status-publish","4":"format-standard","6":"category-analytics","7":"category-best-practices","8":"category-data-warehousing","9":"category-statistics","10":"category-text-analytics","11":"tag-roi"},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/posts\/6624","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/users\/239"}],"replies":[{"embeddable":true,"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/comments?post=6624"}],"version-history":[{"count":0,"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/posts\/6624\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/media?parent=6624"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/categories?post=6624"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.smartdatacollective.com\/wp-json\/wp\/v2\/tags?post=6624"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}