Here we explain how to use Apache Hive with ElasticSearch. We will copy an Apache webserver log...
Author - Walker Rowe
Using ElasticSearch with Apache Spark
ElasticSearch is a JSON database popular with log processing systems. For example, organizations...
Using Spark with Hive
Here we explain how to use Apache Spark with Hive. That means instead of Hive storing data in...
How to write a Hive User Defined Function (UDF) in Java
Here we show how to write user-defined functions (UDFs) in Java and call them from Hive. You can...
What is Apache HCatalog? HCatalog Explained
Here we explain what HCatalog is and why it is useful to Hadoop programmers. Basically, HCatalog...
Apache Hive Beeline Client, Import CSV File into Hive
Beeline has replaced the Hive CLI in what was formerly called HiveServer1. Now Hive is called...
K-means Clustering with Apache Spark
Here we show a simple example of how to use k-means clustering. We will look at crime statistics...
Apache Spark: Working with Streams
In our last two posts, we explained how to read data streaming from Twitter into Apache...
Using Zeppelin with Big Data
Zeppelin is an interactive notebook. It lets you write code into a web page, execute it, and...
Spark Decision Tree Classifier
Here we explain how to use the Decision Tree Classifier with Apache Spark ML (machine learning). We...