To protect your operation, it’s crucial that you understand big data security issues before moving forward on a big data implementation. While most professionals have at least a vague notion of the importance of good data security practices, IT and security professionals are the ones who know the true magnitude of the harm done by enterprise security incidents. Take a look at the following facts to get an idea for yourself:
- One data breach incident, at a medium- to large-sized business, costs the business an average of $3.79 million.1
- Globally, cybercrime costs companies and individuals an estimated $388 billion per year (estimate includes losses in the form of money and the time value of money).2
- Identity theft incidents cost US citizens at least $21 billion per year.3
Considering these hefty costs, it makes sense that government agencies around the world have put so many compulsory regulations in place. Some of these international laws4 include:
- Canadian Laws: The Privacy Act – July 1983, Personal Information Protection and Electronic Documents Act (PIPEDA) of 2000 (Bill C-6)
- European Union Laws: European Union Data Protection Directive of 1998, EU Internet Privacy Law of 2002 (DIRECTIVE 2002/58/EC)
- German Laws: Federal Data Protection Act of 2001
- United Kingdom Laws: UK Data Protection Act 1998, Privacy and Electronic Communications (EC Directive) Regulations 2003
More intimidating still, the magnitude of a security risk is often proportional to the amount of data that’s vulnerable to attack. The advent of big data storage and processing capabilities has introduced the risk of correspondingly big security breaches, and that risk must be addressed before a business implements a big data project at scale. In this fourth post of BMC’s big data series, you’ll get an overview of the main big data security issues you should be aware of.
Big Data Security Issues
Almost every big data security issue that’s common to enterprise-wide implementations can be traced back to design omissions in the original Hadoop distribution. That’s not to say Hadoop’s original design was faulty or bad; it simply was not designed for use in an enterprise data environment. Enough time has passed, however, that effective adaptations and solutions have been developed to address these security concerns. The trick is to first understand what the potential weaknesses are, and then verify that you have taken proper precautions to protect against them.
- User Authentication and Access – Organizations deploying Hadoop in a shared environment must be sure that user authentication and access rights are strictly controlled. Apache Sentry is one possible solution that’s available to help you limit and control user access rights across a big data system.
- Regulatory Requirements – With so much data, organizations have to make a real and concerted effort to comply with regulatory requirements. In big data systems, it’s prudent to take the extra precaution of ensuring that records of user activities and system events are being generated and stored. You’ll need them to carry out any user and system audits.
- User Impersonation – Several big data security issues center around the fact that user and service authentication in native Hadoop is weak by default: in its default “simple” authentication mode, Hadoop trusts the user name that the client supplies. This leaves Hadoop systems open to the risk of malicious data inputs and edits made under someone else’s identity. Make sure you have Kerberos authentication, backed by a directory service such as LDAP, in place to safeguard against this weakness.
- Protecting Data-At-Rest and Moving Data – Native Hadoop distributions offer encryption capabilities for data-at-rest, but it’s a bit trickier to protect data-in-motion. Network encryption methods have been developed to protect moving data, but they’re not enabled by default in Hadoop, so you’ll need to set up that line of protection for yourself.
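To make the last two items a little more concrete: in current Apache Hadoop releases, Kerberos authentication and encryption of data-in-motion are switched on through a handful of configuration properties. The Python sketch below just renders those properties in the XML shape used by `core-site.xml` and `hdfs-site.xml`. The property names are standard Hadoop settings, but the helper `render_properties` and the chosen values are illustrative assumptions, and a real Kerberos rollout also needs a KDC, principals, and keytabs that aren’t shown here.

```python
from xml.sax.saxutils import escape

# Standard Apache Hadoop property names. The values are an illustrative
# assumption, not a complete secure-cluster configuration.
CORE_SITE = {
    "hadoop.security.authentication": "kerberos",  # Hadoop's default is "simple"
    "hadoop.security.authorization": "true",       # turn on service-level authorization
    "hadoop.rpc.protection": "privacy",            # authenticate, integrity-check, and encrypt RPC
}
HDFS_SITE = {
    "dfs.encrypt.data.transfer": "true",           # encrypt HDFS block data in motion
}


def render_properties(props):
    """Render a dict as a Hadoop-style <configuration> document (hypothetical helper)."""
    lines = ["<configuration>"]
    for name, value in props.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(value)}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_properties(CORE_SITE))
    print(render_properties(HDFS_SITE))
```

Treat the output as a starting point to compare against your distribution’s security documentation rather than something to paste into a production cluster.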
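The access-control and audit-trail items in the list above can also be illustrated with a small, self-contained Python sketch. It is not Apache Sentry’s API; it only mimics the general idea of granting privileges to roles, roles to groups, and then recording every access decision so that an audit can be carried out later. All user, group, role, and resource names, as well as the functions `is_allowed` and `check_and_audit`, are hypothetical.

```python
import json
import logging
from datetime import datetime, timezone

# Hypothetical sample data: users -> groups -> roles -> (action, resource) privileges.
USER_GROUPS = {"alice": {"analysts"}, "bob": {"etl"}}
GROUP_ROLES = {"analysts": {"reader"}, "etl": {"reader", "writer"}}
ROLE_PRIVILEGES = {
    "reader": {("read", "sales_db")},
    "writer": {("write", "sales_db")},
}

audit_logger = logging.getLogger("audit")  # hypothetical logger name


def is_allowed(user, action, resource):
    """Return True if any role reachable through the user's groups grants the privilege."""
    for group in USER_GROUPS.get(user, set()):
        for role in GROUP_ROLES.get(group, set()):
            if (action, resource) in ROLE_PRIVILEGES.get(role, set()):
                return True
    return False


def check_and_audit(user, action, resource):
    """Make the access decision, log it as one JSON line, and return the record."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "action": action,
        "resource": resource,
        "allowed": is_allowed(user, action, resource),
    }
    audit_logger.info(json.dumps(record, sort_keys=True))
    return record


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(message)s")
    check_and_audit("alice", "read", "sales_db")     # granted via the "reader" role
    check_and_audit("alice", "write", "sales_db")    # denied: analysts have no "writer" role
    check_and_audit("mallory", "read", "sales_db")   # denied: unknown user
```

Denied requests are logged as well as granted ones, since failed attempts are often exactly what an auditor wants to see.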
Want to Learn More About Big Data Technologies?
BMC recently co-authored an authoritative guide on big data workflows. It’s called Managing Big Data Workflows for Dummies. Download it here today.
- Cost of Data Breaches Rising Globally, Says ‘2015 Cost of a Data Breach Study: Global Analysis’, Security Intelligence, May 27th
- Norton Study Calculates Cost of Global Cybercrime: $114 Billion Annually, Symantec, September 7, 2011.
- International Privacy Laws, Information Shield