Imagine creating a live chart that updates as data flows in. With this you could watch currency fluctuations, streaming IoT data, application performance, cybersecurity events, or other data in real time.
Below we show how to pass variables from Spark to AngularJS in Zeppelin.
HighCharts is free for non-commercial use. You could also look at googleVis, which is an interface to Google Charts for the R programming language. Google Charts, however, is designed for Google’s own cloud and Google Bigtable rather than open source Apache Spark.
One thing that makes graphing difficult in any language is understanding the different kinds of charts. There are dozens. You can learn something about that by studying these design principles on the HighCharts website.
So let’s get going. Here we are going to give the simplest possible example of passing data to a web page (i.e., AngularJS). Then we give the simplest possible example of Spark Streaming.
First, you need to install Zeppelin. The easiest way to do that is to use Docker:
docker pull dylanmei/zeppelin
docker run --rm -p 8080:8080 dylanmei/zeppelin
Zeppelin lets you pass variables to Angular using z-commands, like z.put, z.run, and z.angularBind. Here we make the simplest possible example. (The problem with most examples on the internet is they are too complicated. No one starts with a very simple example that you can easily understand. So we do.)
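Below we make an Array of 1 element. Then we create an RDD, collect it back to the driver, and loop over it 1 time to bind the value 1 to the HTML table element “1” shown below. (RDD transformations such as map are lazy, so a bare map with no action would never call z.angularBind.)
val data = Array(1)
val distData = sc.parallelize(data)
distData.collect().foreach(l => z.angularBind(l.toString, 1))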
So when you run that and the Spark Scala code it will output:
Below we give an example of Spark Streaming code. In part II of this blog post we will show how to make a graph using this data and HighCharts. For now, just get a feel for how to create streaming code. We also wrote another, more complex example of streaming Twitter Tweets here.
Know that when you run streaming code in Zeppelin it will freeze up your browser, since the stream never finishes. So you have to stop the Docker container like this (note that this command stops every container on your machine, not just Zeppelin):
docker stop $(docker ps -aq)
If you kill it any other way you will have errors because you are trying to run two SparkContexts at the same time.
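What this example does is connect to port 80 on your laptop, i.e., the web server port, and read whatever text arrives there. (socketTextStream opens a TCP connection to the given host and port and reads newline-delimited text; it does not listen for incoming connections itself.) If you were to dump that as text you could see the data. Here we just create a DStream and use the print() method to echo the first few records of each batch. It does not show everything that arrives.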
To prove that you have data coming into port 80 on your laptop, you could install nmap and then run:
while true; do nmap --packet-trace -A localhost -p 80; sleep 1; done
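Note that nmap only probes the port. For the stream to show actual records, something has to serve lines of text on the port Spark connects to. As a rough sketch only, the small program below does that. LineFeeder, the port 9999, and the "event N" lines are all hypothetical names and values chosen for illustration, not part of the original example. You would point socketTextStream at the same port, and run the feeder somewhere the Spark process can reach.

```scala
import java.io.PrintWriter
import java.net.ServerSocket

// Hypothetical helper (not part of the original post): serves `count` text
// lines to the first client that connects, one line every `delayMs` ms.
object LineFeeder {
  def serve(port: Int, count: Int, delayMs: Long): Unit = {
    val server = new ServerSocket(port)
    try {
      // Blocks until a client (for example Spark's socket receiver) connects.
      val client = server.accept()
      // Second argument enables auto-flush on println.
      val out = new PrintWriter(client.getOutputStream, true)
      try {
        for (i <- 1 to count) {
          out.println(s"event $i")
          Thread.sleep(delayMs)
        }
      } finally {
        out.close()
        client.close()
      }
    } finally {
      server.close()
    }
  }

  // Assumed defaults: port 9999, 60 lines, one per second.
  def main(args: Array[String]): Unit =
    serve(port = 9999, count = 60, delayMs = 1000L)
}
```

An unprivileged port such as 9999 is used here because binding to port 80 normally requires root.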
Here is the code.
val ssc = new StreamingContext(sc, Seconds(5))
val lines = ssc.socketTextStream("localhost", 80)
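The two lines above only define the stream. As a hedged sketch of what a complete Zeppelin paragraph might look like, the version below adds the import, an output operation, and the start call. It assumes the `sc` that Zeppelin provides, and it assumes print() as the output operation, judging from the "Time: ... ms" output. The 30-second timeout and the stop call are additions made here for illustration, so that the paragraph returns instead of blocking forever.

```scala
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Assumes a Zeppelin Spark paragraph, where `sc` (the SparkContext) already exists.
val ssc = new StreamingContext(sc, Seconds(5)) // 5-second batches

// Connect to a TCP server and read newline-delimited text from it.
val lines = ssc.socketTextStream("localhost", 80)

// Print the first few records of every batch under a "Time: ... ms" header.
lines.print()

// Nothing is processed until the context is started.
ssc.start()

// Assumption for this sketch: give control back after 30 seconds.
ssc.awaitTerminationOrTimeout(30000L)

// Stop streaming but keep Zeppelin's shared SparkContext alive.
ssc.stop(stopSparkContext = false, stopGracefully = false)
```

To keep the stream running until you stop the container, as described above, replace the last two calls with ssc.awaitTermination().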
It should output something like this in a continuous loop:
Time: 1499497850000 ms