Interacting with Data on HDP using Apache Zeppelin and Apache Spark

Posted on 2 December, 2015 by jack

In this section we are going to walk through the process of using Apache Zeppelin and Apache Spark to interactively analyze data on an Apache Hadoop cluster.

By the end of this tutorial, you will have learned:

How to interact with Apache Spark from Apache Zeppelin
How to read a text file from HDFS and create an RDD
How to interactively analyze a data set through a rich set of Spark API operations

Continue reading →

A short primer on Scala

Posted on 2 December, 2015 by jack

Object-Oriented Meets Functional

Scala is a relatively new language that runs on the JVM. The main difference between Scala and other “Object Oriented Languages” is that everything in Scala is an object. The primitive types that are defined in Java, such as int or boolean, are objects in Scala. Functions are treated as objects, too. As objects, they can be passed as arguments, which allows a functional programming approach to writing applications for Apache Spark.
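To make the idea of functions as objects concrete, here is a minimal sketch in plain Scala (no Spark required). The names FunctionsAsValues, addOne and applyTwice are hypothetical and chosen only for illustration; they are not part of the original tutorial.

```scala
object FunctionsAsValues {
  // A function literal stored in a value; its type is Int => Int.
  // (Hypothetical name, used only for this illustration.)
  val addOne: Int => Int = x => x + 1

  // A higher-order method: it receives another function as an argument
  // and applies it twice to the given value.
  def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

  def main(args: Array[String]): Unit = {
    // An Int value has methods, because it behaves as an object.
    val one = 1
    println(one.max(2))                  // prints 2

    // Passing a function as an argument.
    println(applyTwice(addOne, 5))       // prints 7

    // Collection methods such as map take functions in the same way.
    println(List(1, 2, 3).map(addOne))   // prints List(2, 3, 4)
  }
}
```

The same pattern of handing a function to a method such as map is what Spark's RDD operations rely on.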

If you have programmed in Java or C#, you should feel right at home with Scala with very little effort.

You can also run or compile Scala programs from the command line or from IDEs such as Eclipse.

Tested on HDP 2.3.2 with Spark 1.4.1

Continue reading →

Hands-on Tour of Apache Spark in 5 Minutes

Posted on 1 December, 2015 by jack

Hortonworks Sandbox

A Hands-On Example

Let’s open a shell to our Sandbox through SSH:

ssh -p 2222 root@127.0.0.1

or use PuTTY

Continue reading →

[VirtualBox] ssh into a guest using NAT

Posted on 1 December, 2015 by jack

Start the VM first, then go to Devices > Network > Network Settings…

Select NAT, then choose Port Forwarding

Continue reading →

Phaisarn Sutheebanjard

blog.phaisarn.com

Monthly Archives: December 2015