Walker Rowe

Walker Rowe is an American freelance tech writer and programmer living in Chile. He specializes in big data, analytics, and cloud architecture. Find him on LinkedIn or at Southern Pacific Review, where he publishes short stories, poems, and news.

Bias and Variance in Machine Learning

The risk in following ML models is that they could be based on false assumptions and skewed by noise and outliers, which could lead to bad predictions. That is why ML cannot be a black box: the user must understand the data and algorithms if the models are to be trusted. So here we look at some more measures of trustworthiness. As in the …

Mean Squared Error, R2, and Variance in Regression Analysis

Here we introduce some terms important to machine learning: variance, R2 score, and mean squared error. We illustrate these concepts using scikit-learn. It is important to understand these metrics to determine whether regression models are accurate or misleading. Following a flawed model is a bad idea. So it is important that you can …

Getting Started with scikit-learn

Here we explore another machine learning framework, scikit-learn, and show how to use matplotlib to draw graphs. The scikit-learn Python ML API predates Apache Spark and TensorFlow, which is to say it has been around longer than big data. It has long been used by those who see themselves as pure data scientists, as opposed to data …

Introduction to Spark’s Machine Learning Pipeline

Here we explain what a Spark machine learning pipeline is. We will do this by converting existing code that we wrote, which runs in stages, to pipeline format. This will run all the data transformation and model fit operations under the pipeline mechanism. The existing Apache Spark ML code is explained in two blog posts: part one and part …

What is Refactoring? Code Refactoring Explained

Code refactoring means taking a working program and changing it to make some improvements. It changes the code but not the outcome. These improvements can: make it easier for other programmers to read; make aesthetic improvements, such as implementing a clever idea; make the program run faster or use fewer resources; adhere to …

Introduction to Google Cloud Machine Learning Engine

The Google Cloud Machine Learning Engine is almost exactly the same as Amazon SageMaker. It is not a SaaS program that you can just upload data to and start using, like the Google Natural Language API. Instead, you have to program Google Cloud ML using any of the ML frameworks such as TensorFlow, scikit-learn, XGBoost, or Keras. Then Google spins up …

AWS Linear Learner: Using Amazon SageMaker for Logistic Regression

In the last blog post we showed you how to use Amazon SageMaker. Read that one before this one, because there we show screenshots and explain how to use the product's graphical interface, including its hosted Jupyter Notebooks feature. We also introduced the SageMaker API, which is a front end for Google TensorFlow and other …