Big Data Security Issues in the Enterprise

by Lillian Pierson


To protect your operation, it’s crucial that you understand big data security issues before moving forward on a big data implementation. While most professionals have at least a vague notion of the importance of good data security practices, IT and security professionals are the ones who know the true magnitude of the harm done by enterprise security incidents. Take a look at the following facts to get an idea for yourself:

  • One data breach incident, at a medium- to large-sized business, costs the business an average of $3.79 million.1
  • Globally, cybercrime costs companies and individuals an estimated $388 billion per year (estimate includes losses in the form of money and the time value of money).2
  • Identity theft incidents cost US citizens at least $21 billion per year.3

Considering these hefty costs, it makes sense that governmental agencies around the world have put so many compulsory regulations into place. Some of these international laws4 include:

More intimidating still, consider the fact that the magnitude of security risks is often proportional to the amount of data that’s vulnerable to attack. The advent of big data storage and processing capabilities has introduced implicit risks of large-scale security breaches, and those risks must be addressed before a business implements a big data project at scale. In this fourth post of BMC’s big data series, you’ll get an overview of the main big data security issues you should be aware of.

Big Data Security Issues

Almost every big data security issue that’s common to enterprise-wide implementations can be traced back to design omissions in the original Hadoop distribution. That’s not to say Hadoop’s original design was faulty or bad; it simply was not designed for use in an enterprise data environment. Enough time has passed, however, that effective adaptations and solutions have been developed to address these security concerns. The trick is to first understand what the potential weaknesses are, and then verify that you have taken proper precautions to address them.

  • User Authentication and Access – Organizations deploying Hadoop in a shared environment must be sure that user authentication and access rights are strictly controlled. Apache Sentry is one possible solution that’s available to help you limit and control user access rights across a big data system.
  • Regulatory Requirements – With so much data, organizations have to make a real and concerted effort to comply with regulatory requirements. In big data systems, it’s prudent to take the extra precaution of ensuring that records of user activities and system events are being generated and stored. You’ll need them to carry out any user and system audits.
  • User Impersonation – Several big data security issues center around the fact that user and service authentication protocols in native Hadoop are somewhat weak. This leaves Hadoop systems open to the risk of malicious data inputs and edits. Make sure you have Kerberos and LDAP protocols in place to safeguard against this weakness.
  • Protecting Data-At-Rest and Moving Data – Native Hadoop distributions offer data encryption capabilities for data-at-rest, but it’s a bit trickier to protect data-in-motion. Network encryption methods have been developed to protect moving data, but they’re not included with Hadoop’s native distribution, so you’ll need to set up that line of protection for yourself.
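
To make the first two items in the list above a little more concrete, here is a minimal sketch in Python of how an access check and an audit record can go hand in hand. It is an illustration only: the role names, the ROLE_PRIVILEGES mapping, the AuditLog class, and the check_access function are all hypothetical, and none of this reflects Apache Sentry’s actual policy format or API.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical role-to-privilege mapping (an assumption for illustration,
# not Apache Sentry's policy format).
ROLE_PRIVILEGES = {
    "analyst": {("sales_db", "read")},
    "etl_service": {("sales_db", "read"), ("sales_db", "write")},
}


@dataclass
class AuditLog:
    """Keeps one JSON record per access decision so audits can replay them."""
    records: list = field(default_factory=list)

    def record(self, user, role, resource, action, allowed):
        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "resource": resource,
            "action": action,
            "allowed": allowed,
        }
        self.records.append(json.dumps(entry))


def check_access(user, role, resource, action, audit):
    """Allow the action only if the role grants it, and log the decision either way."""
    allowed = (resource, action) in ROLE_PRIVILEGES.get(role, set())
    audit.record(user, role, resource, action, allowed)
    return allowed


if __name__ == "__main__":
    audit = AuditLog()
    print(check_access("alice", "analyst", "sales_db", "read", audit))
    print(check_access("alice", "analyst", "sales_db", "write", audit))
    for line in audit.records:
        print(line)
```

The point of the sketch is that denied requests are logged as well as granted ones, since both are needed for a user or system audit. In a real deployment, the access decision and the audit trail would come from the platform’s own security tooling rather than from application code like this.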

Want to Learn More About Big Data Technologies?

BMC recently co-authored an authoritative guide on big data workflows. It’s called Managing Big Data Workflows for Dummies. Download it here today.


These postings are my own and do not necessarily represent BMC's position, strategies, or opinion.

About the author

Lillian Pierson

Lillian Pierson, P.E. is a leading expert in the field of big data and data science. She equips working professionals and students with the data skills they need to stay competitive in today's data-driven economy. She is the author of three highly referenced technical books by Wiley & Sons Publishers: Data Science for Dummies (2015), Big Data / Hadoop for Dummies (Dell Special Edition, 2015), and Big Data Automation for Dummies (BMC Special Edition, 2016). Lillian has spent the last decade training and consulting for large technical organizations in the private sector, such as IBM, Dell, and Intel, as well as government organizations, from the U.S. Navy down to the local government level. As the Founder of Data-Mania LLC, Lillian offers online and face-to-face training courses, as well as workshops and other educational materials, in the areas of big data, data science, and data analytics.