Difficulty: beginner
Estimated Time: 15 minutes

The MapR Data Platform integrates Apache Hadoop, Apache Spark, and Apache Drill with real-time database capabilities, global event streaming, and scalable enterprise storage to power a new generation of Big Data applications. MapR solves the challenges of complex data environments by managing data and its ecosystem across multiple clouds and containerized infrastructures.

In this scenario, you will become familiar with the MapR Data Platform by interacting with a single-node MapR cluster.

In this scenario you saw how MapR combines Hadoop, Spark, and Apache Drill with a distributed file system, distributed database, and distributed event streaming, all on a single cluster. This improves performance and lowers hardware costs for Big Data applications. The MapR Data Platform allows you to manage your data with any tooling on any infrastructure.

Would you like to learn more about MapR? Check out our blog, In Search of a Data Platform.

If you'd like to speak with MapR, contact us!

Spark with Zeppelin (boilerplate)

Step 1 - Open Zeppelin

This is a boilerplate tutorial for Spark and Zeppelin. The Spark interpreter in Zeppelin is preconfigured to run on YARN.

Wait about 5 minutes for Zeppelin to download and install.

  1. Confirm in the terminal that Zeppelin is running: ps -e -f | grep zeppelin
  2. Open the Zeppelin tab in the toolbar
  3. Open and run the Forest Fire Prediction notebook
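
Step 1 checks for the Zeppelin process only once, so it may need to be repeated while the install finishes. A minimal sketch of a polling loop built on the same ps and grep check is shown below. The function name wait_for_process and its timeout and interval defaults are hypothetical, chosen for illustration, and are not part of MapR or Zeppelin.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll the process table until a command line
# matching PATTERN appears, or give up after TIMEOUT seconds.
wait_for_process() {
  local pattern="$1"
  local timeout="${2:-300}"   # assumed default: 5 minutes, matching the wait above
  local interval="${3:-10}"   # assumed default: check every 10 seconds
  local waited=0

  while [ "$waited" -lt "$timeout" ]; do
    # "grep -v grep" drops the grep process itself, which would otherwise
    # always match its own pattern argument.
    if ps -e -f | grep -v grep | grep -q -- "$pattern"; then
      echo "Found a process matching '$pattern'"
      return 0
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done

  echo "No process matching '$pattern' after ${timeout}s" >&2
  return 1
}

# Example call (commented out so that sourcing this file does not block):
# wait_for_process zeppelin 300 10
```

Once the function reports a match, the Zeppelin tab should be ready to open. A running process does not guarantee that the web UI has finished starting, so a short additional wait may still be needed.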