Spark 1.3.0 Single-Node Installation

Date: 2023-03-08 19:29:05

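1. Test environment:

CentOS 6.6 minimal install; hostname spark-test, IP 10.10.10.26

The host is an OpenStack virtual cloud instance.

Note: the installation order is: log in to Linux -> install the JDK -> install Scala -> install Spark.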


2. Install the JDK

Download the JDK:

Version: jdk-6u45-linux-x64.bin, available from the Oracle website.

Create a /data directory to hold the downloaded files:

# mkdir /data

[root@spark-test data]# ls
jdk-6u45-linux-x64.bin  scala-2.11.6.tgz  spark-1.3.0-bin-hadoop2.4.tgz

Install the JDK:

[root@spark-test data]# chmod u+x jdk-6u45-linux-x64.bin      # add execute permission
[root@spark-test data]# ./jdk-6u45-linux-x64.bin

Configure the environment variables by adding the following to /etc/profile:

[root@spark-test data]# vim /etc/profile

#JAVA VARIABLES START
export JAVA_HOME=/data/jdk1.6.0_45
export PATH=$PATH:$JAVA_HOME/bin
#JAVA VARIABLES END

[root@spark-test data]# source /etc/profile
[root@spark-test data]# java -version
java version "1.6.0_45"
Java(TM) SE Runtime Environment (build 1.6.0_45-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode)
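
Editing /etc/profile by hand works, but repeating the steps can leave duplicate export lines. The following is a minimal sketch of one way to script the change so it can be re-run safely. The helper name append_once and the file /etc/profile.d/java.sh are assumptions made for illustration; they are not part of the setup described here. On CentOS, /etc/profile sources the *.sh files under /etc/profile.d, so a separate file there has the same effect as editing /etc/profile directly.

```shell
#!/bin/sh
# append_once FILE LINE
# Append LINE to FILE unless an identical line is already present.
# (Hypothetical helper name, used only for this illustration.)
append_once() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example usage (run as root; the target path is an assumption):
#   append_once /etc/profile.d/java.sh 'export JAVA_HOME=/data/jdk1.6.0_45'
#   append_once /etc/profile.d/java.sh 'export PATH=$PATH:$JAVA_HOME/bin'
```

The same helper could be reused for the SCALA_HOME and SPARK_HOME entries in the later sections.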

3. Install Scala

Download Scala 2.11.6 from http://www.scala-lang.org/download/2.11.6.html

Extract the Scala archive:

[root@spark-test data]# tar -zxvf  scala-2.11.6.tgz

Configure the environment variables by adding the following to /etc/profile:

[root@spark-test data]# vim /etc/profile

#SCALA VARIABLES START
export SCALA_HOME=/data/scala-2.11.6
export PATH=$PATH:$SCALA_HOME/bin
#SCALA VARIABLES END

[root@spark-test data]# source /etc/profile
[root@spark-test data]# scala -version
Scala code runner version 2.11.6 -- Copyright 2002-2013, LAMP/EPFL

Scala is now configured.

4. Install Spark

Download Spark from the official site: http://spark.apache.org/downloads.html

Choose the prebuilt package (spark-1.3.0-bin-hadoop2.4.tgz).

Extract it:

[root@spark-test data]# tar -zxvf spark-1.3.0-bin-hadoop2.4.tgz

Configure the Spark environment variables by adding the following to /etc/profile:

[root@spark-test data]# vim /etc/profile

#SPARK VARIABLES START
export SPARK_HOME=/data/spark-1.3.0-bin-hadoop2.4
export PATH=$PATH:$SPARK_HOME/bin
#SPARK VARIABLES END

[root@spark-test data]# source /etc/profile

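Change to the conf directory, create spark-env.sh from its template, and list the worker host in the slaves file: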

[root@spark-test conf]# ls
fairscheduler.xml.template   slaves.template
log4j.properties.template    spark-defaults.conf.template
metrics.properties.template  spark-env.sh.template
[root@spark-test conf]# mv spark-env.sh.template spark-env.sh

[root@spark-test conf]# vim spark-env.sh 

export SCALA_HOME=/data/scala-2.11.6
export JAVA_HOME=/data/jdk1.6.0_45
export SPARK_MASTER_IP=10.10.10.26
export SPARK_WORKER_MEMORY=1024m
# Caution: the standalone master listens on port 7077 by default (SPARK_MASTER_PORT), not 7070.
export master=spark://10.10.10.26:7070

[root@spark-test conf]# vim slaves 

spark-test

Start the Spark cluster:

[root@spark-test sbin]# pwd
/data/spark-1.3.0-bin-hadoop2.4/sbin
[root@spark-test sbin]# ./start-all.sh

Verify that the Master and Worker processes are running:

[root@spark-test sbin]# jps
22974 Worker
23395 Jps
22830 Master
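
The manual check above can also be scripted. The following is a minimal sketch, assuming jps prints one "PID Name" pair per line as in the output shown; the function name has_daemons is hypothetical.

```shell
#!/bin/sh
# has_daemons
# Read `jps` output on stdin and succeed only if both a Master line
# and a Worker line are present. (Hypothetical helper name.)
has_daemons() {
    out=$(cat)
    printf '%s\n' "$out" | grep -qE '^[0-9]+ Master$' &&
        printf '%s\n' "$out" | grep -qE '^[0-9]+ Worker$'
}

# Example usage:
#   jps | has_daemons && echo "standalone daemons are up"
```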

 

 

Test:

Change directory:

[root@spark-test bin]# pwd
/data/spark-1.3.0-bin-hadoop2.4/bin

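Run the SparkPi example. The first attempt fails: the driver tries to reach a master at 10.10.10.26:7070, the connection is refused, and after three tries it gives up with "All masters are unresponsive":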

[root@spark-test spark-1.3.0-bin-hadoop2.4]# ./bin/run-example org.apache.spark.examples.SparkPi
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/04/01 11:40:48 INFO SparkContext: Running Spark version 1.3.0
15/04/01 11:40:49 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/04/01 11:40:49 INFO SecurityManager: Changing view acls to: root
15/04/01 11:40:49 INFO SecurityManager: Changing modify acls to: root
15/04/01 11:40:49 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
15/04/01 11:40:49 INFO Slf4jLogger: Slf4jLogger started
15/04/01 11:40:49 INFO Remoting: Starting remoting
15/04/01 11:40:50 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@spark-test.novalocal:58680]
15/04/01 11:40:50 INFO Utils: Successfully started service 'sparkDriver' on port 58680.
15/04/01 11:40:50 INFO SparkEnv: Registering MapOutputTracker
15/04/01 11:40:50 INFO SparkEnv: Registering BlockManagerMaster
15/04/01 11:40:50 INFO DiskBlockManager: Created local directory at /tmp/spark-53cdf980-4803-480f-8936-2b3bb7e2bbfc/blockmgr-c15cfa29-3bfb-4ee8-a0d3-b9735bfe9dea
15/04/01 11:40:50 INFO MemoryStore: MemoryStore started with capacity 265.0 MB
15/04/01 11:40:50 INFO HttpFileServer: HTTP File server directory is /tmp/spark-22f9b0df-bfdb-435d-b504-ab1c52b73556/httpd-244e5d7f-9c1d-48d8-bd95-2ed985ecb3a0
15/04/01 11:40:50 INFO HttpServer: Starting HTTP Server
15/04/01 11:40:50 INFO Server: jetty-8.y.z-SNAPSHOT
15/04/01 11:40:50 INFO AbstractConnector: Started SocketConnector@0.0.0.0:59040
15/04/01 11:40:50 INFO Utils: Successfully started service 'HTTP file server' on port 59040.
15/04/01 11:40:50 INFO SparkEnv: Registering OutputCommitCoordinator
15/04/01 11:40:50 INFO Server: jetty-8.y.z-SNAPSHOT
15/04/01 11:40:50 INFO AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
15/04/01 11:40:50 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/04/01 11:40:50 INFO SparkUI: Started SparkUI at http://spark-test.novalocal:4040
15/04/01 11:40:51 INFO SparkContext: Added JAR file:/data/spark-1.3.0-bin-hadoop2.4/lib/spark-examples-1.3.0-hadoop2.4.0.jar at http://10.10.10.26:59040/jars/spark-examples-1.3.0-hadoop2.4.0.jar with timestamp 1427859651127
15/04/01 11:40:51 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster@10.10.10.26:7070/user/Master...
15/04/01 11:40:51 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@10.10.10.26:7070: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster@10.10.10.26:7070
15/04/01 11:40:51 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@10.10.10.26:7070]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: /10.10.10.26:7070
15/04/01 11:41:11 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster@10.10.10.26:7070/user/Master...
15/04/01 11:41:11 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@10.10.10.26:7070: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster@10.10.10.26:7070
15/04/01 11:41:11 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@10.10.10.26:7070]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: /10.10.10.26:7070
15/04/01 11:41:31 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster@10.10.10.26:7070/user/Master...
15/04/01 11:41:31 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@10.10.10.26:7070: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster@10.10.10.26:7070
15/04/01 11:41:31 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@10.10.10.26:7070]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: /10.10.10.26:7070
15/04/01 11:41:51 ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
15/04/01 11:41:51 ERROR TaskSchedulerImpl: Exiting due to error from cluster scheduler: All masters are unresponsive! Giving up.
15/04/01 11:41:51 WARN SparkDeploySchedulerBackend: Application ID is not initialized yet.
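
The "Connection refused" warnings above mean nothing was listening on 10.10.10.26:7070; a standalone master listens on port 7077 unless SPARK_MASTER_PORT says otherwise. The following is a rough sketch of how such a mismatch could be checked before submitting a job. The function names master_port and can_connect are hypothetical, and can_connect relies on bash's /dev/tcp feature and the coreutils timeout command being available.

```shell
#!/bin/sh
# master_port URL
# Print the port part of a spark://host:port master URL. (Hypothetical helper.)
master_port() {
    url=$1
    hostport=${url#spark://}          # strip the scheme
    printf '%s\n' "${hostport##*:}"   # keep what follows the last colon
}

# can_connect HOST PORT
# Succeed if a TCP connection to HOST:PORT can be opened within 3 seconds.
# Assumes bash (for /dev/tcp) and coreutils `timeout` are installed.
can_connect() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Example usage:
#   port=$(master_port spark://10.10.10.26:7070)
#   can_connect 10.10.10.26 "$port" || echo "nothing is listening on port $port"
```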
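
The second attempt runs in local mode (the log shows the executor starting on localhost, with no attempt to contact a standalone master) and completes, printing the estimate of Pi:

[root@spark-test spark-1.3.0-bin-hadoop2.4]# ./bin/run-example org.apache.spark.examples.SparkPi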
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/04/01 11:53:22 INFO SparkContext: Running Spark version 1.3.0
15/04/01 11:53:22 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/04/01 11:53:22 INFO SecurityManager: Changing view acls to: root
15/04/01 11:53:22 INFO SecurityManager: Changing modify acls to: root
15/04/01 11:53:22 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
15/04/01 11:53:23 INFO Slf4jLogger: Slf4jLogger started
15/04/01 11:53:23 INFO Remoting: Starting remoting
15/04/01 11:53:23 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@spark-test.novalocal:55722]
15/04/01 11:53:23 INFO Utils: Successfully started service 'sparkDriver' on port 55722.
15/04/01 11:53:23 INFO SparkEnv: Registering MapOutputTracker
15/04/01 11:53:23 INFO SparkEnv: Registering BlockManagerMaster
15/04/01 11:53:23 INFO DiskBlockManager: Created local directory at /tmp/spark-d70142c7-effd-40c0-b050-f39d727d6e33/blockmgr-6d5699cc-acf8-4ab9-8b39-dfb5385209e5
15/04/01 11:53:23 INFO MemoryStore: MemoryStore started with capacity 265.0 MB
15/04/01 11:53:23 INFO HttpFileServer: HTTP File server directory is /tmp/spark-5821f748-ecf7-4e24-a593-ff2c2b040b43/httpd-90a05ad6-f73b-4a52-9a61-0ff135f449a9
15/04/01 11:53:23 INFO HttpServer: Starting HTTP Server
15/04/01 11:53:23 INFO Server: jetty-8.y.z-SNAPSHOT
15/04/01 11:53:23 INFO AbstractConnector: Started SocketConnector@0.0.0.0:43969
15/04/01 11:53:23 INFO Utils: Successfully started service 'HTTP file server' on port 43969.
15/04/01 11:53:23 INFO SparkEnv: Registering OutputCommitCoordinator
15/04/01 11:53:23 INFO Server: jetty-8.y.z-SNAPSHOT
15/04/01 11:53:23 INFO AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
15/04/01 11:53:23 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/04/01 11:53:23 INFO SparkUI: Started SparkUI at http://spark-test.novalocal:4040
15/04/01 11:53:23 INFO SparkContext: Added JAR file:/data/spark-1.3.0-bin-hadoop2.4/lib/spark-examples-1.3.0-hadoop2.4.0.jar at http://10.10.10.26:43969/jars/spark-examples-1.3.0-hadoop2.4.0.jar with timestamp 1427860403997
15/04/01 11:53:24 INFO Executor: Starting executor ID <driver> on host localhost
15/04/01 11:53:24 INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@spark-test.novalocal:55722/user/HeartbeatReceiver
15/04/01 11:53:24 INFO NettyBlockTransferService: Server created on 39015
15/04/01 11:53:24 INFO BlockManagerMaster: Trying to register BlockManager
15/04/01 11:53:24 INFO BlockManagerMasterActor: Registering block manager localhost:39015 with 265.0 MB RAM, BlockManagerId(<driver>, localhost, 39015)
15/04/01 11:53:24 INFO BlockManagerMaster: Registered BlockManager
15/04/01 11:53:24 INFO SparkContext: Starting job: reduce at SparkPi.scala:35
15/04/01 11:53:24 INFO DAGScheduler: Got job 0 (reduce at SparkPi.scala:35) with 2 output partitions (allowLocal=false)
15/04/01 11:53:24 INFO DAGScheduler: Final stage: Stage 0(reduce at SparkPi.scala:35)
15/04/01 11:53:24 INFO DAGScheduler: Parents of final stage: List()
15/04/01 11:53:24 INFO DAGScheduler: Missing parents: List()
15/04/01 11:53:24 INFO DAGScheduler: Submitting Stage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:31), which has no missing parents
15/04/01 11:53:24 INFO MemoryStore: ensureFreeSpace(1848) called with curMem=0, maxMem=277842493
15/04/01 11:53:24 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1848.0 B, free 265.0 MB)
15/04/01 11:53:24 INFO MemoryStore: ensureFreeSpace(1296) called with curMem=1848, maxMem=277842493
15/04/01 11:53:24 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1296.0 B, free 265.0 MB)
15/04/01 11:53:24 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:39015 (size: 1296.0 B, free: 265.0 MB)
15/04/01 11:53:24 INFO BlockManagerMaster: Updated info of block broadcast_0_piece0
15/04/01 11:53:24 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:839
15/04/01 11:53:24 INFO DAGScheduler: Submitting 2 missing tasks from Stage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:31)
15/04/01 11:53:24 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
15/04/01 11:53:24 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, PROCESS_LOCAL, 1336 bytes)
15/04/01 11:53:24 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, PROCESS_LOCAL, 1336 bytes)
15/04/01 11:53:24 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
15/04/01 11:53:24 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
15/04/01 11:53:24 INFO Executor: Fetching http://10.10.10.26:43969/jars/spark-examples-1.3.0-hadoop2.4.0.jar with timestamp 1427860403997
15/04/01 11:53:24 INFO Utils: Fetching http://10.10.10.26:43969/jars/spark-examples-1.3.0-hadoop2.4.0.jar to /tmp/spark-7cb47603-adb9-45ea-ad91-e5ddc3c6da41/userFiles-86c97d54-c082-4bb8-bcb3-34b97a432674/fetchFileTemp3928503400699723858.tmp
15/04/01 11:53:25 INFO Executor: Adding file:/tmp/spark-7cb47603-adb9-45ea-ad91-e5ddc3c6da41/userFiles-86c97d54-c082-4bb8-bcb3-34b97a432674/spark-examples-1.3.0-hadoop2.4.0.jar to class loader
15/04/01 11:53:25 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 736 bytes result sent to driver
15/04/01 11:53:25 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 736 bytes result sent to driver
15/04/01 11:53:25 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 1068 ms on localhost (1/2)
15/04/01 11:53:25 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 1028 ms on localhost (2/2)
15/04/01 11:53:25 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
15/04/01 11:53:25 INFO DAGScheduler: Stage 0 (reduce at SparkPi.scala:35) finished in 1.107 s
15/04/01 11:53:25 INFO DAGScheduler: Job 0 finished: reduce at SparkPi.scala:35, took 1.326417 s
Pi is roughly 3.13518
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
15/04/01 11:53:25 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
15/04/01 11:53:25 INFO SparkUI: Stopped Spark web UI at http://spark-test.novalocal:4040
15/04/01 11:53:25 INFO DAGScheduler: Stopping DAGScheduler
15/04/01 11:53:25 INFO MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
15/04/01 11:53:25 INFO MemoryStore: MemoryStore cleared
15/04/01 11:53:25 INFO BlockManager: BlockManager stopped
15/04/01 11:53:25 INFO BlockManagerMaster: BlockManagerMaster stopped
15/04/01 11:53:25 INFO OutputCommitCoordinator$OutputCommitCoordinatorActor: OutputCommitCoordinator stopped!
15/04/01 11:53:25 INFO SparkContext: Successfully stopped SparkContext
15/04/01 11:53:25 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/04/01 11:53:25 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/04/01 11:53:25 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
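
The result is a single line buried in the log output above. The following is a small sketch for pulling out just that line; the function name pi_line is hypothetical. Spark's default log4j configuration writes its log lines to stderr, so redirecting stderr hides most of the noise, and the grep keeps only the result.

```shell
#!/bin/sh
# pi_line
# Read run-example output on stdin and print only the SparkPi result line.
# (Hypothetical helper name.)
pi_line() {
    grep '^Pi is roughly'
}

# Example usage, run from the Spark installation directory:
#   ./bin/run-example org.apache.spark.examples.SparkPi 2>/dev/null | pi_line
```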