Machine Learning & Big Data Blog – BMC Blogs
BMC Software
Tue, 18 Sep 2018 12:46:05 +0000

Big Data vs Analytics vs Data Science: What’s The Difference?
Mon, 10 Sep 2018 00:00:42 +0000
People who do not work with the technology are often confused about the difference between big data and analytics. You often see the terms big data analytics, big data, analytics, and data science. What do these mean? In brief, big data is the infrastructure that supports analytics. Analytics is applied mathematics. Analytics […]

Introduction to Google Cloud TPUs (Tensor Processing Unit) for ML Acceleration
Fri, 24 Aug 2018 00:00:17 +0000
We have already written about how machine learning frameworks use NVIDIA GPUs (graphics processing units) to speed up machine learning. Now Google is taking that idea and applying it with its own ASIC hardware, called TPUs (Tensor Processing Units). What Google has really done is take technology invented by NVIDIA (GPUs) and pushed […]

How Keras Machine Language API Makes TensorFlow Easier
Fri, 03 Aug 2018 11:34:20 +0000
Keras is a Python framework designed to make working with TensorFlow (which is also used from Python) easier. It builds neural networks, which are commonly used for classification problems. The example problem below is binary classification. You can find the code here. The binary classification problem here is to determine whether a customer will buy something […]

What is Data Quality Management?
Mon, 30 Jul 2018 00:30:01 +0000
More business leaders are becoming aware of the tremendous impact big data has on the trajectory of the enterprise organization as it relates to: Predicting customer expectations; Assisting with effective product management; Being available on-demand to influence top-down decision making; Tailoring customer service innovations by investigating shopping habits of customers; and Providing organizations with competitor […]

4 Reasons to Automate the Ingestion of Data
Mon, 23 Jul 2018 00:00:14 +0000
Hardly a day goes by without talk of automation and big data in any company. These days, the market understands the need for data: it’s the de facto way to gain business intelligence. And data science and machine learning are go-to tools in predictive analytics, which means you need data, and a lot of it. […]

Tuning Machine Language Models for Accuracy
Fri, 20 Jul 2018 00:00:48 +0000
Continuing with our explanations of how to measure the accuracy of an ML model, here we discuss two metrics that you can use with classification models: accuracy and the area under the receiver operating characteristic curve (ROC AUC). These are some of the metrics suitable for classification problems, such as logistic regression and neural networks. There are others that […]

Bias and Variance in Machine Learning
Tue, 10 Jul 2018 00:00:53 +0000
The risk in following ML models is that they could be based on false assumptions and skewed by noise and outliers. That could lead to bad predictions. That is why ML cannot be a black box. The user must understand the data and algorithms if the models are to be trusted. So here we look […]

Five Reasons You Need a Step-by-Step Approach to Workflow Orchestration for Big Data
Mon, 09 Jul 2018 00:20:44 +0000
Is your organization struggling to keep up with the demands of Big Data and under pressure to prove quick results? If so, you’re not alone. According to analysts, up to 60% of Big Data projects are failing because they can’t scale at the enterprise level. Fortunately, taking a step-by-step approach to workflow orchestration can help […]

Mean Squared Error, R2, and Variance in Regression Analysis
Thu, 05 Jul 2018 00:48:43 +0000
Here we introduce some terms important to machine learning: variance, R2 score, and mean squared error. We illustrate these concepts using scikit-learn. It is important to understand these metrics to determine whether regression models are accurate or misleading. Following a flawed model is a bad idea. So it is important that you can quantify […]

Data Integrity vs Data Quality: What’s the Difference?
Tue, 03 Jul 2018 00:00:10 +0000
Big Data has been widely labeled as the new oil and the new black gold – parallels that describe the value of big data to our economy and business. However, the analogy only fits in limited situations. Big Data becomes a truly valuable commodity only when the data is of high quality determined based on […]