    Big Data or Bad Data? Survey Shows Enterprises Struggle to Manage Big Data Flows

    Investing News Network
    Jun. 22, 2016 06:50AM PST
    Technology Investing

    SAN FRANCISCO, CA–(Marketwired – Jun 22, 2016) – StreamSets, the company that delivers performance management for data flows, today announced results from a global survey of more than 300 data management professionals conducted by independent research firm Dimensional Research®. The study showed that enterprises of all sizes face challenges on a range of key data performance management issues from stopping bad data to keeping data flows operating effectively. In particular, nearly 90 percent of respondents report flowing bad data into their data stores while just 12 percent consider themselves good at the key aspects of data flow performance management.

    The survey reveals pervasive data pollution, which implies analytic results may be wrong, leading to false insights that drive poor business decisions. Even if companies can detect their bad data, the process of cleaning it after the fact wastes the time of data scientists and delays its use, which is deadly in a world increasingly reliant on real-time analysis.

    Despite Constant Cleansing, Bad Data Is Polluting Data Stores and Is Difficult to Detect
    Respondents cited ensuring data quality as the most common challenge they face when managing big data flows (selected by 68 percent). In addition to bad data flowing into stores, 74 percent of organizations reported currently having bad data in their stores, despite cleansing data throughout the data lifecycle. While 69 percent of organizations consider the ability to detect diverging data values in flow as “valuable” or “very valuable,” only 34 percent rated themselves as “good” or “excellent” at detecting those changes.

    Broad Challenges to Performance Managing Data Flows
    While detecting bad data is a critical aspect of data flow performance, the survey showed that enterprise struggles are much broader. In fact, only 12 percent of respondents rate themselves as “good” or “excellent” across five key performance management areas, namely detecting the following events: pipeline down, throughput degradation, error rate increases, data value divergence and personally identifiable information (PII) violations. 

    Respondents felt weakest at detecting performance degradation (44 percent rated themselves “good” or “excellent”), error rate increases (44 percent) and divergent data values (34 percent). Detecting a “pipeline down” event was the only area where a clear majority (66 percent) felt positively about their capabilities. In each key performance management area, there was a very large gap between respondents’ self-reported capabilities and how valuable they considered the competency.
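To make three of the detection events named above concrete (pipeline down, throughput degradation and error rate increases), the following is a minimal Python sketch. It is not StreamSets' method or API: the `FlowStats` record, the `detect_events` function and both thresholds are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class FlowStats:
    """Counters for one monitoring window of a data flow (hypothetical shape)."""
    records_in: int      # records read during the window
    records_error: int   # records that failed processing
    seconds: float       # window length in seconds, assumed > 0


def detect_events(baseline: FlowStats, current: FlowStats,
                  throughput_drop: float = 0.5,
                  error_rate_rise: float = 0.05) -> list[str]:
    """Compare the current window to a baseline window and name any events.

    Both thresholds are illustrative assumptions, not survey findings.
    """
    # No records at all in the window is treated as "pipeline down".
    if current.records_in == 0:
        return ["pipeline_down"]

    events = []

    # Throughput degradation: records per second fell by more than the
    # allowed fraction relative to the baseline.
    base_tp = baseline.records_in / baseline.seconds
    cur_tp = current.records_in / current.seconds
    if cur_tp < base_tp * (1 - throughput_drop):
        events.append("throughput_degradation")

    # Error rate increase: share of failed records rose by more than the
    # allowed absolute amount.
    base_err = baseline.records_error / max(baseline.records_in, 1)
    cur_err = current.records_error / current.records_in
    if cur_err - base_err > error_rate_rise:
        events.append("error_rate_increase")

    return events
```

For example, comparing a baseline of `FlowStats(1000, 10, 60.0)` with a current window of `FlowStats(400, 40, 60.0)` flags both throughput degradation and an error rate increase.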

    Fragile Hand Coding Plus Data Drift is a Dangerous Combination
    These quality and performance management issues may be driven by the reality of data drift (unexpected changes in data structure or semantics) combined with the continued use of outdated methods to design data flows, such as low-level coding or schema-driven ETL tools. Making frequent changes to pipelines using these inflexible approaches is not only highly inefficient but also prone to errors. In addition, these tools offer no visibility into data in motion, so operators are flying blind and cannot detect data quality or data flow issues.

    • Constant tweaking of pipelines due to data drift: Eighty-five percent said that unexpected changes to data structure or semantics create a substantial operational impact. Over half (53%) reported that they have to alter each data flow pipeline several times a month, with 23% making changes several times a week or more.
    • The prominence of hand coding and legacy ETL tools: Nearly two-thirds of respondents use ETL/data integration tools and 77 percent use hand coding to design their data pipelines.
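As an illustration of the structural side of data drift described above, the following Python sketch compares an incoming record against the fields a pipeline expects. It is a simplified example under stated assumptions, not the tooling discussed in the survey: the `structural_drift` function and the sample field names are hypothetical.

```python
def structural_drift(expected: dict[str, type], record: dict) -> dict[str, list[str]]:
    """Report how a record's structure differs from the expected fields.

    `expected` maps field names to Python types (an assumed, simplified
    stand-in for a real schema).
    """
    # Fields the pipeline expects but the record lacks.
    missing = sorted(k for k in expected if k not in record)
    # Fields the record carries that the pipeline does not know about.
    added = sorted(k for k in record if k not in expected)
    # Fields present in both, but with a value of an unexpected type.
    retyped = sorted(
        k for k, t in expected.items()
        if k in record and not isinstance(record[k], t)
    )
    return {"missing": missing, "added": added, "retyped": retyped}
```

Given expected fields `{"user_id": int, "amount": float}` and a record `{"user_id": "42", "amt": 9.5}`, the function reports `amount` as missing, `amt` as added and `user_id` as retyped. Semantic drift, where a field keeps its name and type but changes meaning, would require value-level checks that this sketch does not attempt.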

    The Need for a New Paradigm
    With the emergence of fast data and streaming analytics, the operational risk has shifted from data at rest to data in motion. However, enterprises overwhelmingly report that they struggle to manage their data flows. What is required is a new organizational discipline around performance management of data flows with the goal of ensuring that next-generation applications are fed quality data continuously.

    “In today’s world of real-time analytics, data flows are the lifeblood of an enterprise,” said Girish Pancha, CEO, StreamSets. “The industry has long been fixated on managing data at rest and this myopia creates a real risk for enterprises as they attempt to harness big and fast data. It is imperative that we shift our mindset towards building continuous data operations capabilities that are in tune with the time-sensitive, dynamic nature of today’s data.”

    For more information and complete survey results, please visit https://streamsets.com/big-data-global-survey/

    About StreamSets
    Founded in 2014, StreamSets provides data ingest technology for the next generation of big data applications. Its enterprise-grade infrastructure accelerates data analysis and decision-making by bringing unprecedented transparency and event processing to data in motion. The company was founded by Girish Pancha, a long-time executive and former chief product officer of Informatica, and Arvind Prabhakar, an early employee and engineering leader at Cloudera. StreamSets is headquartered in San Francisco, and backed by top-tier Silicon Valley venture capital firms and angel investors, including Accel Partners, Battery Ventures, Ignition Partners and New Enterprise Associates (NEA). For more information, visit streamsets.com.

    Media Contact:

    Brittney Danon
    BOCA Communications for StreamSets, Inc.
