IT Operations Blog

What Happens When Big Data Becomes Bad Data?

Patrick Campbell
3 minute read

Most competitive enterprises now use big data for business intelligence. For example, Facebook uses its big data sets in real time to decide which ads to place in your sidebar while you’re checking your social feeds or updating your status. Similarly, you probably receive emails from your service provider, such as AT&T or Verizon, about your data plan usage, prompting you to upgrade your service.

But what happens when these big data use cases go bad, even for seconds? If the timely placement of ads gets stalled or a promotional email is sent at the wrong time, you’ve missed a revenue opportunity or frustrated a customer. When you consider these missteps on a large scale, for thousands or even millions of transactions, the loss in revenue or added expense from customer complaint logs can be devastating.

Just recently, the bike share company in my community sent me an email asking me to update my subscription after I had already done so. I called them with my concern, and they said it was a system-wide error. This made me wonder how many other subscribers had called and tied up the support lines. This company, like many others that rely on big data, had to spend extra time and resources to correct the situation with a follow-up apology email.

Enterprises like this bike share business (new, with a fully integrated digital footprint covering the locations of all the bikes, who’s using them, at what times, and in which neighborhoods) take advantage of big data to make informed decisions from both structured and unstructured data. Structured data includes content from relational databases and XML schemas, where “getting” the data you need is fairly straightforward. Unstructured data can be content from web logs, comments, blog posts, email messages, or any text document, as well as audio, video, or image files.

Tapping into unstructured data usually requires more sophisticated algorithms to parse the content and find specific data. For example, a big data strategy could find out which users are booking trips to specific destinations and what their internet surfing habits are once they confirm a trip: a user books a trip to Seattle, checks the weather forecasts, and then shops for raincoats and umbrellas.
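To make the trip example concrete, here is a minimal Python sketch of that kind of analysis. It assumes the raw web logs have already been parsed into simple (user, timestamp, action, detail) events. The event data, field layout, and function name are all hypothetical, and a real deployment would run this kind of logic as a distributed job rather than in a single process.

```python
from collections import Counter, defaultdict

# Hypothetical, already-parsed clickstream events:
# (user, timestamp, action, detail). Real web logs would need parsing first.
EVENTS = [
    ("ana", 1, "book_trip", "Seattle"),
    ("ana", 2, "view", "weather forecast"),
    ("ana", 3, "shop", "raincoat"),
    ("ben", 1, "view", "sneakers"),
    ("ben", 2, "book_trip", "Seattle"),
    ("ben", 3, "shop", "umbrella"),
    ("cat", 1, "book_trip", "Phoenix"),
    ("cat", 2, "shop", "sunscreen"),
]


def activity_after_booking(events):
    """Count what users do after booking a trip, grouped by destination."""
    destination = {}                   # user -> most recently booked destination
    follow_ups = defaultdict(Counter)  # destination -> Counter of (action, detail)

    # Process each user's events in time order.
    for user, _ts, action, detail in sorted(events, key=lambda e: (e[0], e[1])):
        if action == "book_trip":
            destination[user] = detail
        elif user in destination:
            # Only activity that happens after a booking is counted.
            follow_ups[destination[user]][(action, detail)] += 1
    return follow_ups


if __name__ == "__main__":
    for place, actions in activity_after_booking(EVENTS).items():
        print(place, dict(actions))
```

In this toy data set, the Seattle bookings are followed by a weather check and rain gear purchases, while browsing that happened before the booking is ignored.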

One of the most common ways for enterprises to handle big data sets is to deploy Hadoop, an open-source framework that provides distributed storage and processing of data sets on computer clusters built from relatively simple and inexpensive hardware. The most commonly used Hadoop distributions include the following:

  • Apache Hadoop
  • IBM InfoSphere BigInsights
  • Cloudera
  • Hortonworks
  • MapR

Effectively monitoring Hadoop big data processes helps you proactively diagnose performance and availability issues so that your business-critical big data analytics don’t come to a halt at the wrong time. Behavioral learning capabilities can detect when Hadoop clusters are struggling to keep pace with the business, and innovative monitoring tools can provide real-time visibility into Hadoop environments to help pinpoint issues and optimize the infrastructure.
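As a rough illustration of the idea behind behavioral baselining, the Python sketch below learns what "normal" looks like from recent job run times and flags runs that are far slower. It is a simplified assumption of how such detection might work, not a description of any particular monitoring product. The function name, window size, threshold, and sample durations are all hypothetical.

```python
from statistics import mean, stdev


def find_slow_runs(durations, window=5, threshold=3.0):
    """Return indexes of runs that are far slower than the recent baseline.

    Each run is compared with the mean and standard deviation of the
    previous `window` runs; a run is flagged when it sits more than
    `threshold` standard deviations above that rolling mean.
    """
    slow = []
    for i in range(window, len(durations)):
        baseline = durations[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        # Skip perfectly flat baselines to avoid dividing by zero.
        if sigma > 0 and (durations[i] - mu) / sigma > threshold:
            slow.append(i)
    return slow


# Hypothetical durations (in minutes) of a nightly Hadoop job.
job_minutes = [30, 31, 29, 30, 32, 31, 30, 58, 31, 30]

if __name__ == "__main__":
    print(find_slow_runs(job_minutes))  # prints [7]
```

The 58-minute run stands out against a baseline of roughly 30 minutes, so it is the only one flagged. A production tool would typically use richer signals, such as queue depth, resource utilization, or time-of-day patterns, along with a baseline that is less sensitive to past outliers.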

Think about it. When IT staff can prioritize and troubleshoot performance issues impacting Hadoop processes and quickly fix them, they’ll have more time to get to the business of coming up with strategic big data analytics that help grow the business and increase revenue. It’s that simple—really.

These postings are my own and do not necessarily represent BMC's position, strategies, or opinion.

About the author

Patrick Campbell

Patrick T. Campbell has spent his 20+ year career equally between Application and Network Performance Management and K-12 Education. As a Technical Marketing Engineer, he began his career in IT at InfoVista as a Technical Trainer, followed by Raytheon Solipsys, OPNET Technologies (Riverbed Technology), and now BMC Software. In K-12 Education, he taught mathematics at Drew College Preparatory School for seven years and then worked at the University of Maryland Baltimore County (UMBC) as a Mathematics and Science Professional Development Program Co-Director for International Teacher-Scholars from Egypt for another two. Passionate about learning, he has presented at OPNETWORK and at NAIS Teacher Conferences. Patrick received a B.S. in Industrial and Management Systems Engineering from Penn State, and has a Master’s Degree in Human Resource and Behavioral Science from Johns Hopkins University.