How to Install Apache Zeppelin on Ubuntu 16.04

Apache Zeppelin is a web-based, open source notebook and collaboration tool for interactive data ingestion, discovery, analytics, and visualization. Zeppelin ships with interpreters for more than 20 languages and data-processing backends, including Apache Spark, SQL, R, and Elasticsearch. It lets you create data-driven documents and view the results of your analytics in the browser.

Prerequisites

  • A Vultr Ubuntu 16.04 server instance.
  • A sudo user.
  • A domain name pointed towards the server.

For this tutorial, we will use zeppelin.example.com as the domain name pointed towards the Vultr instance. Please make sure to replace all occurrences of the example domain name with the actual one.

Update your base system using the guide How to Update Ubuntu 16.04. Once your system has been updated, proceed to install Java.

Install Java

Apache Zeppelin is written in Java, so it requires a JDK to run. Add the Ubuntu PPA repository for Oracle Java 8.

sudo add-apt-repository --yes ppa:webupd8team/java
sudo apt update

Install Oracle Java.

sudo apt -y install oracle-java8-installer

Verify its version.

java -version

You will see the following output.

user@vultr:~$ java -version
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)
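
The version check can also be scripted, which is handy if you automate this setup. The following is a minimal sketch: is_java8 is a hypothetical helper name, and it assumes the first line of the java -version banner has the form shown above.

```shell
# Return 0 if the given "java -version" banner reports a 1.8.x runtime.
# is_java8 is a hypothetical helper, not part of Java or Zeppelin.
is_java8() {
    local banner="$1"
    # Only the first line carries the version, e.g.: java version "1.8.0_161"
    printf '%s\n' "$banner" | head -n 1 | grep -Eq 'version "1\.8\.'
}

# Usage (java -version prints to stderr, hence the 2>&1):
#   is_java8 "$(java -version 2>&1)" && echo "Java 8 detected"
```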

Make Oracle Java 8 the system default and set the JAVA_HOME environment variable by installing the following package.

sudo apt -y install oracle-java8-set-default

Verify that JAVA_HOME is set by running the following command.

echo $JAVA_HOME

You will see the following output.

user@vultr:~$ echo $JAVA_HOME
/usr/lib/jvm/java-8-oracle

If you see no output at all, log out of the current shell and log back in so that the updated environment is loaded.

Install Zeppelin

Apache Zeppelin ships all of its dependencies along with the binary files, so nothing other than Java needs to be installed. Download the Zeppelin binary to your system. You can always find the latest version of the application on the Zeppelin download page.

wget http://www-us.apache.org/dist/zeppelin/zeppelin-0.7.3/zeppelin-0.7.3-bin-all.tgz
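
Before extracting the archive, it is a good idea to verify that the download is intact. The sketch below assumes you have copied the SHA-512 digest for this exact file from the official Zeppelin download page. verify_sha512 is a hypothetical helper name, not part of any Apache tooling.

```shell
# Compare a file's SHA-512 digest with an expected hex string.
# verify_sha512 is a hypothetical helper; the expected digest must come
# from the official download page, it is not provided here.
verify_sha512() {
    local file="$1" expected actual
    # sha512sum prints lowercase hex, so normalise the expected value too.
    expected="$(printf '%s' "$2" | tr 'A-F' 'a-f')"
    actual="$(sha512sum "$file" 2>/dev/null | awk '{print $1}')"
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Usage:
#   verify_sha512 zeppelin-0.7.3-bin-all.tgz "<digest from the download page>" \
#       && echo "checksum OK"
```

If the function returns a non-zero status, download the archive again rather than extracting it.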

Extract the archive.

sudo tar xf zeppelin-*-bin-all.tgz -C /opt

The above command will extract the archive to /opt/zeppelin-0.7.3-bin-all. Rename the directory for the sake of convenience.

sudo mv /opt/zeppelin-*-bin-all /opt/zeppelin

Apache Zeppelin is now installed. You could start the application immediately, but it would not be reachable from outside the server, because it listens on localhost only. In the following steps, we will run Zeppelin as a Systemd service and put Nginx in front of it as a reverse proxy.

Configure Systemd

In this step, we will set up a Systemd unit file for the Zeppelin application. This ensures that the application process is started automatically at boot and restarted after a failure.

For security reasons, create an unprivileged user for running the Zeppelin process.

sudo useradd -d /opt/zeppelin -s /bin/false zeppelin

Give ownership of the files to the newly created zeppelin user.

sudo chown -R zeppelin:zeppelin /opt/zeppelin
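
To confirm that the ownership change reached every file, you can search for anything under the directory that is not owned by the zeppelin user. This is a minimal sketch in which all_owned_by is a hypothetical helper name and GNU find is assumed, as shipped with Ubuntu.

```shell
# Return 0 if every entry under a directory belongs to the given owner.
# all_owned_by is a hypothetical helper; the owner may be a name or a numeric UID.
all_owned_by() {
    local dir="$1" owner="$2" stray
    # -print -quit stops at the first entry whose owner differs.
    # A find error (for example an unknown user name) is reported as status 2.
    stray="$(find "$dir" ! -user "$owner" -print -quit)" || return 2
    [ -z "$stray" ]
}

# Usage:
#   all_owned_by /opt/zeppelin zeppelin && echo "ownership OK"
```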

Create a new Systemd service unit file.

sudo nano /etc/systemd/system/zeppelin.service

Populate the file with the following.

[Unit]
Description=Zeppelin service
After=syslog.target network.target

[Service]
Type=forking
ExecStart=/opt/zeppelin/bin/zeppelin-daemon.sh start
ExecStop=/opt/zeppelin/bin/zeppelin-daemon.sh stop
ExecReload=/opt/zeppelin/bin/zeppelin-daemon.sh reload
User=zeppelin
Group=zeppelin
Restart=always

[Install]
WantedBy=multi-user.target

Start the application.

sudo systemctl start zeppelin

Enable Zeppelin service to automatically start at boot time.

sudo systemctl enable zeppelin

To check that the service is running, run the following command.

sudo systemctl status zeppelin

Configure Reverse Proxy

By default, the Zeppelin server listens on localhost, port 8080. We will use Nginx as a reverse proxy so that the application can be accessed via the standard HTTP and HTTPS ports. We will also configure Nginx to use an SSL certificate issued by Let's Encrypt, a free certificate authority.

Install Nginx.

sudo apt -y install nginx

Start Nginx and enable it to automatically start at boot time.

sudo systemctl start nginx
sudo systemctl enable nginx

Add the Certbot repository.

sudo add-apt-repository --yes ppa:certbot/certbot
sudo apt update

Install Certbot, which is the client application for Let's Encrypt CA.

sudo apt -y install certbot

Note: To obtain certificates from Let's Encrypt, the domain for which the certificates are requested must point to the server. If it does not, make the necessary changes to the DNS records of the domain and wait for the DNS to propagate before making the certificate request again. Let's Encrypt validates that you control the domain before issuing the certificates.

Generate the SSL certificates.

sudo certbot certonly --webroot -w /var/www/html -d zeppelin.example.com

The generated certificates are stored in /etc/letsencrypt/live/zeppelin.example.com/. The SSL certificate is stored as fullchain.pem and the private key as privkey.pem.

Let's Encrypt certificates expire after 90 days, so it is recommended to set up automatic renewal of the certificates with a cron job.

Open the cron job file.

sudo crontab -e

Add the following line at the end of the file.

30 5 * * * /usr/bin/certbot renew --quiet

The above cron job runs every day at 5:30 AM. Certbot renews only certificates that are within 30 days of expiry, so running it daily is safe.
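
To see whether a certificate is close to expiry without waiting for the cron job, OpenSSL can inspect it directly. The sketch below relies on the -checkend option of openssl x509, which takes a number of seconds. cert_valid_for_days is a hypothetical wrapper name.

```shell
# Return 0 if the certificate is still valid N days from now.
# cert_valid_for_days is a hypothetical wrapper around "openssl x509 -checkend".
cert_valid_for_days() {
    local cert="$1" days="$2"
    # -checkend exits with 0 if the certificate does not expire within
    # the given number of seconds, and with 1 if it does.
    openssl x509 -checkend "$(( days * 86400 ))" -noout -in "$cert" >/dev/null 2>&1
}

# Usage (run as root, since the live directory is typically not readable
# by regular users):
#   cert_valid_for_days /etc/letsencrypt/live/zeppelin.example.com/fullchain.pem 30 \
#       || echo "certificate expires within 30 days"
```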

Create a new server block file for the Zeppelin site.

sudo nano /etc/nginx/sites-available/zeppelin

Populate the file.

upstream zeppelin {
    server 127.0.0.1:8080;
}

server {
    listen 80;
    server_name zeppelin.example.com;

    # Keep serving ACME challenge files over HTTP so that webroot renewals succeed.
    location /.well-known/acme-challenge/ {
        root /var/www/html;
    }

    location / {
        return 301 https://$host$request_uri;
    }
}

server {
    listen 443 ssl;
    server_name zeppelin.example.com;

    ssl_certificate           /etc/letsencrypt/live/zeppelin.example.com/fullchain.pem;
    ssl_certificate_key       /etc/letsencrypt/live/zeppelin.example.com/privkey.pem;

    ssl_session_cache  builtin:1000  shared:SSL:10m;
    # TLS 1.0 and 1.1 are deprecated, so only TLS 1.2 is enabled.
    ssl_protocols  TLSv1.2;
    ssl_ciphers HIGH:!aNULL:!eNULL:!EXPORT:!CAMELLIA:!DES:!MD5:!PSK:!RC4;
    ssl_prefer_server_ciphers on;

    access_log  /var/log/nginx/zeppelin.access.log;

    location / {
        proxy_pass http://zeppelin;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_set_header X-NginX-Proxy true;
        proxy_redirect off;
    }

    # Zeppelin uses a WebSocket connection under /ws.
    location /ws {
        proxy_pass http://zeppelin/ws;
        proxy_http_version 1.1;
        proxy_set_header Upgrade websocket;
        proxy_set_header Connection upgrade;
        proxy_read_timeout 86400;
    }
}

Activate the configuration file.

sudo ln -s /etc/nginx/sites-available/zeppelin /etc/nginx/sites-enabled/zeppelin

Restart Nginx and Zeppelin so that the changes take effect.

sudo systemctl restart nginx zeppelin

Zeppelin is now accessible on the following address.

https://zeppelin.example.com

By default, there is no authentication enabled, so you can use the application directly.

Since the application is accessible to everyone, the notebooks you create are also accessible to everyone. It is very important to disable anonymous access and enable authentication so that only the authenticated users can access the application.

Disable Anonymous Access

To disable the default anonymous access, copy the configuration file template to its live location.

cd /opt/zeppelin
sudo cp conf/zeppelin-site.xml.template conf/zeppelin-site.xml

Edit the configuration file.

sudo nano conf/zeppelin-site.xml

Find the following lines in the file.

<property>
  <name>zeppelin.anonymous.allowed</name>
  <value>true</value>

Change the value from true to false to disable anonymous access.
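
If you prefer to make this change non-interactively, the edit can be scripted with sed. This is a sketch that assumes the <value> element sits on the line immediately after the <name> element, as in the snippet above, and that GNU sed is available. disable_anonymous is a hypothetical helper name.

```shell
# Flip zeppelin.anonymous.allowed from true to false in a zeppelin-site.xml file.
# disable_anonymous is a hypothetical helper; it edits the file in place.
disable_anonymous() {
    local file="$1"
    # Match the <name> line, move to the next line (n), and rewrite only
    # that line, so other properties with <value>true</value> stay untouched.
    sed -i '/<name>zeppelin\.anonymous\.allowed<\/name>/{
n
s#<value>true</value>#<value>false</value>#
}' "$file"
}

# Usage (run from a shell that has write access to the file, for example as root):
#   disable_anonymous /opt/zeppelin/conf/zeppelin-site.xml
```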

Enable Shiro Authentication

Now that we have disabled anonymous access, we need to enable an authentication mechanism so that authorized users can log in. Apache Zeppelin uses Apache Shiro for authentication. Copy the Shiro configuration file template to its live location.

sudo cp conf/shiro.ini.template conf/shiro.ini

Edit the configuration file.

sudo nano conf/shiro.ini

Find the following lines in the file.

[users]

admin = password1, admin
user1 = password2, role1, role2
user2 = password3, role3
user3 = password4, role2

Each line contains a username, followed by the password and a comma-separated list of roles for that user. For now, we will only use admin and user1. Change the passwords of admin and user1 and disable the other users by commenting them out. You can also change the usernames and roles of the users. To learn more about Apache Shiro users and roles, read the Shiro authorization guide.

Once you have changed the passwords, the code block should look like this.

[users]

admin = StrongPassword, admin
user1 = UserPassword, role1, role2
# user2 = password3, role3
# user3 = password4, role2
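
Because each entry uses commas to separate the password from the roles, it is simplest to pick passwords made only of letters and digits. One way to generate such a value is sketched below. gen_password is a hypothetical helper name, and the result is a hexadecimal string twice as long as the number of random bytes requested.

```shell
# Print a random hexadecimal password built from N random bytes (default 16).
# gen_password is a hypothetical helper; the output has 2 * N characters.
gen_password() {
    local nbytes="${1:-16}"
    # Read the random bytes, dump them as hex, and strip spaces and newlines.
    head -c "$nbytes" /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
}

# Usage:
#   gen_password        # 32 hex characters
#   gen_password 24     # 48 hex characters
```

Use the generated value in place of StrongPassword or UserPassword in the [users] section.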

Now restart Zeppelin to apply the changes.

sudo systemctl restart zeppelin

Authentication is now enabled, and you can log in using the usernames and passwords set in the Shiro configuration file.


