NoteBook Learning (2) -------- Zeppelin Introduction and Installation

Date: 2024-04-01 17:03:52

Zeppelin official website:

http://zeppelin.apache.org/

GitHub repository:

https://github.com/apache/zeppelin

(Based on the official website)

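1. What is Zeppelin

  Zeppelin is a multi-purpose notebook that covers data ingestion, discovery, analytics, visualization, and collaboration. It supports more than 20 language backends through pluggable interpreters and ships with built-in Spark integration.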

2. Installation

  This walkthrough installs Zeppelin 0.8.0.

| Name | Value |
|---|---|
| Oracle JDK | 1.7 (set JAVA_HOME) |
| OS | Mac OSX, Ubuntu 14.X, CentOS 6.X, Windows 7 Pro SP1 |

Requirements: JDK 7 or later, and CentOS 6 or later.
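
The JDK requirement above can be checked before installing. The following is a minimal sketch, not an official tool: the function names `installed_java_version`, `java_major`, and `java_ok_for_zeppelin` are made up for this post, and the check only mirrors the "JDK 7 or later" rule stated here.

```shell
# Print the version string reported by the `java` on PATH, e.g. 1.7.0_80.
# Returns non-zero if no `java` command is found.
installed_java_version() {
    command -v java >/dev/null 2>&1 || return 1
    java -version 2>&1 | awk -F '"' '/version/ { print $2; exit }'
}

# Reduce a version string to its major number: 1.7.0_80 -> 7, 11.0.2 -> 11.
java_major() {
    v="$1"
    case "$v" in
        1.*) v="${v#1.}" ;;   # old scheme: drop the leading "1."
    esac
    printf '%s\n' "${v%%[._-]*}"
}

# Succeed only if the given version string is JDK 7 or later
# (the requirement stated in this post).
java_ok_for_zeppelin() {
    major="$(java_major "$1")"
    [ -n "$major" ] && [ "$major" -ge 7 ]
}
```

A possible use is `java_ok_for_zeppelin "$(installed_java_version)" && echo "JDK is new enough"`.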

  Download the full binary package from:

  http://www.apache.org/dyn/closer.cgi/zeppelin/zeppelin-0.8.0/zeppelin-0.8.0-bin-all.tgz

  After the download finishes, upload the archive to the CentOS server and extract it:

  tar -zxvf zeppelin-0.8.0-bin-all.tgz

  Start Zeppelin from the extracted directory. Note that the default port is 8080:

  bin/zeppelin-daemon.sh start

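  To stop the process later, run bin/zeppelin-daemon.sh stop from the same directory.

  Note that no configuration has been changed at this point, so everything runs with the defaults: JAVA_HOME, -Xmx, -Xms, ...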

   Once Zeppelin has started successfully, open localhost:8080 in a browser

   and the UI page will appear.
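
On a headless server it can be handy to confirm from the shell that the UI is answering. The sketch below polls the URL with `curl`; the function name `wait_for_zeppelin`, the retry count, and the one-second pause are arbitrary choices for illustration, and it assumes `curl` is installed.

```shell
# Poll a URL until it answers with a successful HTTP status.
# $1 = URL (defaults to the default Zeppelin address), $2 = number of attempts.
wait_for_zeppelin() {
    url="${1:-http://localhost:8080}"
    tries="${2:-30}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -f: treat HTTP errors as failures, -s: no progress output
        if curl -fs -o /dev/null --max-time 2 "$url" 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        # pause between attempts, but not after the last one
        [ "$i" -lt "$tries" ] && sleep 1
    done
    return 1
}
```

For example, `bin/zeppelin-daemon.sh start && wait_for_zeppelin && echo "UI is up"`.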

   When finished, stop the process with bin/zeppelin-daemon.sh stop.
Zeppelin can also be registered as a system service.
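
One way to register the service on CentOS 6 is an Upstart job, since `zeppelin-daemon.sh` accepts an `upstart` subcommand. The sketch below only generates the job file; the helper name `write_upstart_conf`, the install path, and the target location `/etc/init/zeppelin.conf` are assumptions to adapt to the actual system.

```shell
# Write an Upstart job definition for Zeppelin.
# $1 = Zeppelin install directory, $2 = file to write.
write_upstart_conf() {
    zeppelin_home="$1"
    out="$2"
    cat > "$out" <<EOF
description "zeppelin"

start on (local-filesystems and net-device-up IFACE!=lo)
stop on shutdown

# Restart the process if it terminates unexpectedly,
# at most 7 times within 30 seconds.
respawn
respawn limit 7 30

chdir $zeppelin_home
exec bin/zeppelin-daemon.sh upstart
EOF
}
```

Running `write_upstart_conf /opt/zeppelin-0.8.0-bin-all /etc/init/zeppelin.conf` as root (the path is hypothetical) would install the job, after which `initctl start zeppelin` should start it.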
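3. Configuration
Zeppelin is configured mainly through two files:
Environment variables: conf/zeppelin-env.sh
Java properties: conf/zeppelin-site.xml
When a setting is defined in both, the environment variable takes priority. The table from the official website follows: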
| zeppelin-env.sh | zeppelin-site.xml | Default value | Description |
|---|---|---|---|
| ZEPPELIN_PORT | zeppelin.server.port | 8080 | Zeppelin server port. Note: Please make sure you're not using the same port as the Zeppelin web application development port (default: 9000). |
| ZEPPELIN_SSL_PORT | zeppelin.server.ssl.port | 8443 | Zeppelin server SSL port (used when ssl environment/property is set to true) |
| ZEPPELIN_MEM | N/A | -Xmx1024m -XX:MaxPermSize=512m | JVM mem options |
| ZEPPELIN_INTP_MEM | N/A | ZEPPELIN_MEM | JVM mem options for interpreter process |
| ZEPPELIN_JAVA_OPTS | N/A | | JVM options |
| ZEPPELIN_ALLOWED_ORIGINS | zeppelin.server.allowed.origins | * | Enables a way to specify a ',' separated list of allowed origins for REST and websockets, e.g. http://localhost:8080 |
| ZEPPELIN_CREDENTIALS_PERSIST | zeppelin.credentials.persist | true | Persist credentials on a JSON file (credentials.json) |
| ZEPPELIN_CREDENTIALS_ENCRYPT_KEY | zeppelin.credentials.encryptKey | | If provided, encrypt passwords on the credentials.json file (passwords will be stored as plain-text otherwise) |
| N/A | zeppelin.anonymous.allowed | true | The anonymous user is allowed by default. |
| ZEPPELIN_SERVER_CONTEXT_PATH | zeppelin.server.context.path | / | Context path of the web application |
| ZEPPELIN_SSL | zeppelin.ssl | false | |
| ZEPPELIN_SSL_CLIENT_AUTH | zeppelin.ssl.client.auth | false | |
| ZEPPELIN_SSL_KEYSTORE_PATH | zeppelin.ssl.keystore.path | keystore | |
| ZEPPELIN_SSL_KEYSTORE_TYPE | zeppelin.ssl.keystore.type | JKS | |
| ZEPPELIN_SSL_KEYSTORE_PASSWORD | zeppelin.ssl.keystore.password | | |
| ZEPPELIN_SSL_KEY_MANAGER_PASSWORD | zeppelin.ssl.key.manager.password | | |
| ZEPPELIN_SSL_TRUSTSTORE_PATH | zeppelin.ssl.truststore.path | | |
| ZEPPELIN_SSL_TRUSTSTORE_TYPE | zeppelin.ssl.truststore.type | | |
| ZEPPELIN_SSL_TRUSTSTORE_PASSWORD | zeppelin.ssl.truststore.password | | |
| ZEPPELIN_NOTEBOOK_HOMESCREEN | zeppelin.notebook.homescreen | | Display note IDs on the Apache Zeppelin homescreen, e.g. 2A94M5J1Z |
| ZEPPELIN_NOTEBOOK_HOMESCREEN_HIDE | zeppelin.notebook.homescreen.hide | false | Hide the note ID set by ZEPPELIN_NOTEBOOK_HOMESCREEN on the Apache Zeppelin homescreen. For further information, please read Customize your Zeppelin homepage. |
| ZEPPELIN_WAR_TEMPDIR | zeppelin.war.tempdir | webapps | Location of the jetty temporary directory |
| ZEPPELIN_NOTEBOOK_DIR | zeppelin.notebook.dir | notebook | The root directory where notebook directories are saved |
| ZEPPELIN_NOTEBOOK_S3_BUCKET | zeppelin.notebook.s3.bucket | zeppelin | S3 Bucket where notebook files will be saved |
| ZEPPELIN_NOTEBOOK_S3_USER | zeppelin.notebook.s3.user | user | User name of an S3 bucket, e.g. bucket/user/notebook/2A94M5J1Z/note.json |
| ZEPPELIN_NOTEBOOK_S3_ENDPOINT | zeppelin.notebook.s3.endpoint | s3.amazonaws.com | Endpoint for the bucket |
| ZEPPELIN_NOTEBOOK_S3_KMS_KEY_ID | zeppelin.notebook.s3.kmsKeyID | | AWS KMS Key ID to use for encrypting data in S3 (optional) |
| ZEPPELIN_NOTEBOOK_S3_EMP | zeppelin.notebook.s3.encryptionMaterialsProvider | | Class name of a custom S3 encryption materials provider implementation to use for encrypting data in S3 (optional) |
| ZEPPELIN_NOTEBOOK_S3_SSE | zeppelin.notebook.s3.sse | false | Save notebooks to S3 with server-side encryption enabled |
| ZEPPELIN_NOTEBOOK_S3_SIGNEROVERRIDE | zeppelin.notebook.s3.signerOverride | | Optional override to control which signature algorithm should be used to sign AWS requests |
| ZEPPELIN_NOTEBOOK_AZURE_CONNECTION_STRING | zeppelin.notebook.azure.connectionString | | The Azure storage account connection string, e.g. DefaultEndpointsProtocol=https;AccountName=<accountName>;AccountKey=<accountKey> |
| ZEPPELIN_NOTEBOOK_AZURE_SHARE | zeppelin.notebook.azure.share | zeppelin | Azure Share where the notebook files will be saved |
| ZEPPELIN_NOTEBOOK_AZURE_USER | zeppelin.notebook.azure.user | user | Optional user name of an Azure file share, e.g. share/user/notebook/2A94M5J1Z/note.json |
| ZEPPELIN_NOTEBOOK_STORAGE | zeppelin.notebook.storage | org.apache.zeppelin.notebook.repo.GitNotebookRepo | Comma separated list of notebook storage locations |
| ZEPPELIN_NOTEBOOK_ONE_WAY_SYNC | zeppelin.notebook.one.way.sync | false | If there are multiple notebook storage locations, should we treat the first one as the only source of truth? |
| ZEPPELIN_NOTEBOOK_PUBLIC | zeppelin.notebook.public | true | Make notebook public (set only owners) by default when created/imported. If set to false will add user to readers and writers as well, making it private and invisible to other users unless permissions are granted. |
| ZEPPELIN_INTERPRETERS | zeppelin.interpreters | org.apache.zeppelin.spark.SparkInterpreter, org.apache.zeppelin.spark.PySparkInterpreter, org.apache.zeppelin.spark.SparkSqlInterpreter, org.apache.zeppelin.spark.DepInterpreter, org.apache.zeppelin.markdown.Markdown, org.apache.zeppelin.shell.ShellInterpreter, ... | Comma separated interpreter configurations [Class]. NOTE: This property is deprecated since Zeppelin-0.6.0 and will not be supported from Zeppelin-0.7.0. |
| ZEPPELIN_INTERPRETER_DIR | zeppelin.interpreter.dir | interpreter | Interpreter directory |
| ZEPPELIN_INTERPRETER_DEP_MVNREPO | zeppelin.interpreter.dep.mvnRepo | http://repo1.maven.org/maven2/ | Remote principal repository for interpreter's additional dependency loading |
| ZEPPELIN_INTERPRETER_OUTPUT_LIMIT | zeppelin.interpreter.output.limit | 102400 | Output message from interpreter exceeding the limit will be truncated |
| ZEPPELIN_INTERPRETER_CONNECT_TIMEOUT | zeppelin.interpreter.connect.timeout | 30000 | Interpreter process connect timeout, in milliseconds |
| ZEPPELIN_DEP_LOCALREPO | zeppelin.dep.localrepo | local-repo | Local repository for dependency loader, e.g. visualization modules of npm |
| ZEPPELIN_HELIUM_NODE_INSTALLER_URL | zeppelin.helium.node.installer.url | https://nodejs.org/dist/ | Remote Node installer url for Helium dependency loader |
| ZEPPELIN_HELIUM_NPM_INSTALLER_URL | zeppelin.helium.npm.installer.url | http://registry.npmjs.org/ | Remote Npm installer url for Helium dependency loader |
| ZEPPELIN_HELIUM_YARNPKG_INSTALLER_URL | zeppelin.helium.yarnpkg.installer.url | https://github.com/yarnpkg/yarn/releases/download/ | Remote Yarn package installer url for Helium dependency loader |
| ZEPPELIN_WEBSOCKET_MAX_TEXT_MESSAGE_SIZE | zeppelin.websocket.max.text.message.size | 1024000 | Size (in characters) of the maximum text message that can be received by websocket |
| ZEPPELIN_SERVER_DEFAULT_DIR_ALLOWED | zeppelin.server.default.dir.allowed | false | Enable directory listings on server |
| ZEPPELIN_NOTEBOOK_GIT_REMOTE_URL | zeppelin.notebook.git.remote.url | | GitHub's repository URL. It could be either the HTTP URL or the SSH URL. For example git@github.com:apache/zeppelin.git |
| ZEPPELIN_NOTEBOOK_GIT_REMOTE_USERNAME | zeppelin.notebook.git.remote.username | token | GitHub username. By default it is `token` to use GitHub's API |
| ZEPPELIN_NOTEBOOK_GIT_REMOTE_ACCESS_TOKEN | zeppelin.notebook.git.remote.access-token | token | GitHub access token to use GitHub's API. If username/password combination is used and not GitHub API, then this value is the password |
| ZEPPELIN_NOTEBOOK_GIT_REMOTE_ORIGIN | zeppelin.notebook.git.remote.origin | origin | GitHub remote name. Default is `origin` |
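
As an illustration of the environment-variable side, the lines below are the kind of content that could go into conf/zeppelin-env.sh. The variable names come from the table; the values (port 8081, 2 GB heap, and the JAVA_HOME path) are hypothetical and only show the syntax.

```shell
# Example entries for conf/zeppelin-env.sh (all values are placeholders).

# Hypothetical JDK location; point this at the real JDK install.
export JAVA_HOME=/usr/lib/jvm/java-1.7.0

# Serve the UI on 8081 instead of the default 8080.
export ZEPPELIN_PORT=8081

# JVM memory options for the Zeppelin server process.
export ZEPPELIN_MEM="-Xmx2048m -XX:MaxPermSize=512m"

# JVM memory options for interpreter processes (defaults to ZEPPELIN_MEM).
export ZEPPELIN_INTP_MEM="-Xmx1024m -XX:MaxPermSize=512m"
```

The file is read when the daemon starts, so a restart with bin/zeppelin-daemon.sh restart is needed for changes to take effect.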
Configuring SSL requires additional settings.
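
To give a rough idea of what those SSL settings look like on the Java-property side, the sketch below generates a small file in the `<configuration>`/`<property>` layout used by conf/zeppelin-site.xml. The helper names `site_property` and `write_ssl_site_xml` are invented for this example, the keystore path and password are supplied by the caller, and values are written without XML escaping, so this is only suitable for simple strings.

```shell
# Print one <property> element in the zeppelin-site.xml layout.
site_property() {
    printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$1" "$2"
}

# Write a minimal SSL-only configuration.
# $1 = output file, $2 = keystore path, $3 = keystore password.
write_ssl_site_xml() {
    out="$1"
    {
        printf '<?xml version="1.0"?>\n<configuration>\n'
        site_property zeppelin.ssl true
        site_property zeppelin.server.ssl.port 8443
        site_property zeppelin.ssl.keystore.path "$2"
        site_property zeppelin.ssl.keystore.type JKS
        site_property zeppelin.ssl.keystore.password "$3"
        printf '</configuration>\n'
    } > "$out"
}
```

Writing to a scratch file first, for example `write_ssl_site_xml /tmp/ssl-site.xml /path/to/keystore changeit`, and then merging the properties into the real conf/zeppelin-site.xml avoids overwriting existing settings.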

4. Zeppelin UI

After opening localhost:8080, the UI page appears.
The official documentation explains the available operations:
https://zeppelin.apache.org/docs/latest/quickstart/explore_ui.html

 On the left, Import note imports an existing notebook.

Create note creates a new notebook. Previously created notebooks are listed there as well; by default they are stored under $ZEPPELIN_HOME/notebook.
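
The saved notebooks can also be inspected from the shell. The sketch below assumes the 0.8.0 on-disk layout of one directory per note ID containing a note.json file (the same layout that appears in the S3 and Azure examples in the table); the function name `list_notes` is made up for this post.

```shell
# Print the ID of every note stored under a notebook directory.
# $1 = notebook directory (defaults to $ZEPPELIN_HOME/notebook).
list_notes() {
    dir="${1:-${ZEPPELIN_HOME:-.}/notebook}"
    for f in "$dir"/*/note.json; do
        # skip the unexpanded glob when the directory holds no notes
        [ -f "$f" ] || continue
        basename "$(dirname "$f")"
    done
}
```

Calling `list_notes` with ZEPPELIN_HOME set prints one note ID per line, such as 2A94M5J1Z.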

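The menu in the upper-right corner gives access to the Shiro configuration, the configuration information, credentials, and interpreters. Interpreter settings can be edited there, for example the Spark path.

Once a note has been created, its editing page opens, where code can be written and executed.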