Installing Spark 1.2.0 on Docker


It has been a while since my last post; I have had some free time recently, so I decided to write something.

1. What is Docker?

Docker is an open-source project that started in early 2013 as an internal side project at dotCloud. It is implemented in Go, the language released by Google. The project later joined the Linux Foundation, is licensed under Apache 2.0, and its code is maintained on GitHub.

Since being open-sourced, Docker has drawn wide attention and discussion, so much so that dotCloud later renamed itself Docker Inc. Red Hat has integrated Docker support into RHEL 6.5, and Google uses it extensively in its PaaS products.

The goal of the Docker project is a lightweight operating-system-level virtualization solution. Docker is built on technologies such as Linux Containers (LXC).

On top of LXC, Docker adds a further layer of packaging so that users do not have to deal with container management themselves, which makes operation much simpler. Working with a Docker container feels as simple as working with a fast, lightweight virtual machine.

The figure below compares Docker with traditional virtualization: containers implement virtualization at the operating-system level, directly reusing the host's operating system, while the traditional approach virtualizes at the hardware level.

[Figure: Docker vs. traditional virtual machine architecture]

2. Why use Docker?

As an emerging virtualization approach, Docker has many advantages over traditional virtualization.

First, Docker containers start in seconds, far faster than traditional virtual machines. Second, Docker uses system resources very efficiently; a single host can run thousands of Docker containers at the same time.

Apart from the application running inside it, a container consumes almost no extra system resources, so application performance stays high while system overhead stays low. To run 10 different applications, the traditional VM approach needs 10 virtual machines, whereas Docker only needs to start 10 isolated applications.

More specifically, Docker has major advantages in the following areas.

Faster delivery and deployment

For developers and operations (DevOps) staff, the ideal is to create or configure once and run anywhere.

Developers can build a development container from a standard image; once development is done, operations staff can deploy the code using that same container. Docker creates containers quickly, enables rapid application iteration, and keeps the whole process visible, making it easier for other team members to understand how the application is built and how it works. Docker containers are light and fast: startup takes seconds, which saves a great deal of development, testing, and deployment time.

More efficient virtualization

Docker containers do not need an extra hypervisor; virtualization happens at the kernel level, which allows higher performance and efficiency.

Easier migration and scaling

Docker containers run on almost any platform: physical machines, virtual machines, public clouds, private clouds, personal computers, servers, and so on. This compatibility lets users move an application directly from one platform to another.

Simpler management

With Docker, a small modification can replace what used to be a large amount of update work. All changes are distributed and applied incrementally, enabling automated and efficient management.

Summary: containers vs. traditional virtual machines

 

Feature              Container                  Virtual machine
Startup time         Seconds                    Minutes
Disk footprint       Typically MB               Typically GB
Performance          Near native                Below native
Capacity per host    Thousands of containers    Usually a few dozen


3. Installing Docker on CentOS

On CentOS 7, Docker is already included in the CentOS-Extras repository and can be installed directly:

 

$ sudo yum install docker

After installation, start the Docker service and enable it so that it loads automatically at boot:

$ sudo service docker start

$ sudo chkconfig docker on
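Before pulling any images, it is worth a quick sanity check that the daemon is actually up. A short sketch (not part of the original walkthrough; the hello-world smoke test assumes network access to the Docker Hub registry):

```shell
# Confirm the Docker daemon is running and responding.
sudo service docker status
sudo docker info              # prints server details when the daemon is reachable

# Optional smoke test: pulls and runs a tiny official test image.
sudo docker run hello-world
```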

4. Installing Spark

In this article, we want to help you get started by installing the latest Spark release, 1.2.0, with Docker.

Docker and Spark are two of the most hyped technologies of late, so we put them together; the code for the container can be found in our GitHub repository.

4.1 Pull the image from the Docker registry

[root@master ~]# docker pull sequenceiq/spark:1.2.0

Pulling repository sequenceiq/spark

334aabfef5f1: Pulling dependent layers

89b52f216c6c: Download complete

0dd5f7a357f5: Download complete

ae2537991743: Download complete

b38f87063c35: Download complete

36bf8ea12ad2: Download complete

c605a0ffb1d4: Download complete

0bd9464ce7fd: Download complete

7b5528f018cf: Download complete

e8f8ccba56cc: Download complete

d3808d6c73c4: Download complete

36fa609d2102: Download complete

5258b4da874d: Download complete

0bd02d3d7a4b: Download complete

bbad7d38a70e: Download complete

c6fbec816602: Download complete

3f5e48be180b: Download complete

ef4e09c06ac5: Download complete

334aabfef5f1: Download complete

ee2f8cf16677: Download complete

70c2821718e6: Download complete

0b0f13b6c16b: Download complete

8a17a79e13f5: Download complete

d2d8a13706fd: Download complete

dde2d8f01c66: Download complete

0165d67b327e: Download complete

afcddf83915d: Download complete

e0786d842672: Download complete

5c3542c1d6d2: Download complete

c04119d3b78c: Download complete

e2a6f40fbee4: Download complete

7c5e5f584526: Download complete

bbfe93940f8c: Download complete

0dae8995a865: Download complete

bd0a4bca6161: Download complete

5c09c81ffffd: Download complete

89b0655a34d7: Download complete

d2ca8f2c26eb: Download complete

aced545fc0a4: Download complete

82a5db38e8f3: Download complete

cc7d6c137a30: Download complete

f52a6540835d: Download complete

aa33b1563fe1: Download complete

944a6e9c3824: Download complete

f0ec3c14378c: Download complete

48ac51d3df99: Download complete

abfbbcb93f01: Download complete

e1f3493e6f14: Download complete

83ca5ab18a47: Download complete

63966e034d6e: Download complete

8aebb7338718: Download complete

0da4a51ce952: Download complete

2e2ffaf055bc: Download complete

dcdbfb337435: Download complete

865c6212c08c: Download complete

8c791638517c: Download complete

8ef7e34a3049: Download complete

873131a3f2d7: Download complete

944971358eb0: Download complete

d828dda7ad02: Download complete

04cecee6f836: Download complete

42460f40dc71: Download complete

4f1e85a3c877: Download complete

3f212fb7286c: Download complete

5b4955b94732: Download complete

83308e1cae94: Download complete

55bf8341ea4d: Download complete

2f5f4034cbe9: Download complete

6a2c6e8b5d08: Download complete

6047f6052c38: Download complete

Status: Downloaded newer image for sequenceiq/spark:1.2.0

This step takes quite a while, as the image is roughly 2 GB. I plan to export the image to a local file and upload it to Baidu Pan to make downloading easier:

http://pan.baidu.com/s/1dDGREiH

You can then use docker load to import the exported file back into your local image repository, for example:

 sudo docker load --input spark.tar
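The counterpart of `docker load` is `docker save`, which is how the image is exported to a local file in the first place. A sketch of the round-trip (assumes the image has already been pulled; the tarball name matches the one above):

```shell
# Export the pulled image (all layers plus metadata) to a tarball.
sudo docker save -o spark.tar sequenceiq/spark:1.2.0

# Copy spark.tar to the target machine, restore it into the local
# image repository, and confirm it is listed.
sudo docker load --input spark.tar
sudo docker images sequenceiq/spark
```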

4.2 Run the image

Once the image has been pulled from the registry, you can run it.

[root@master ~]# docker run -i -t -h sandbox sequenceiq/spark:1.2.0 /etc/bootstrap.sh -bash
Starting sshd:                                             [  OK  ]
Starting namenodes on [sandbox]
sandbox: starting namenode, logging to /usr/local/hadoop/logs/hadoop-root-namenode-sandbox.out
localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-root-datanode-sandbox.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-root-secondarynamenode-sandbox.out
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn--resourcemanager-sandbox.out
localhost: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-root-nodemanager-sandbox.out
bash-4.1# jps
304 SecondaryNameNode
625 Jps
505 ResourceManager
188 DataNode
112 NameNode
588 NodeManager
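The logs in the next sections reference web UIs on the container's hostname (the YARN ResourceManager on port 8088 and the Spark UI on 4040). If you want to reach those UIs from a browser on the host, one option is to publish the ports when starting the container. This is a sketch with assumed port mappings, not part of the original walkthrough:

```shell
# Same run command as above, but with -p to publish the YARN RM UI (8088),
# the NodeManager UI (8042), and the Spark driver UI (4040) to the host.
docker run -i -t -h sandbox \
  -p 8088:8088 -p 8042:8042 -p 4040:4040 \
  sequenceiq/spark:1.2.0 /etc/bootstrap.sh -bash
```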

4.3 Testing

With everything running, let's verify that the installation works.

bash-4.1# cd /usr/local/spark
bash-4.1# ./bin/spark-shell --master yarn-client --driver-memory 1g --executor-memory 1g --executor-cores 1
Spark assembly has been built with Hive, including Datanucleus jars on classpath
15/02/11 20:56:58 INFO spark.SecurityManager: Changing view acls to: root
15/02/11 20:56:58 INFO spark.SecurityManager: Changing modify acls to: root
15/02/11 20:56:59 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
15/02/11 20:56:59 INFO spark.HttpServer: Starting HTTP Server
15/02/11 20:56:59 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/02/11 20:56:59 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:45752
15/02/11 20:56:59 INFO util.Utils: Successfully started service 'HTTP class server' on port 45752.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.2.0
      /_/


Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_51)
Type in expressions to have them evaluated.
Type :help for more information.
15/02/11 20:57:17 INFO spark.SecurityManager: Changing view acls to: root
15/02/11 20:57:17 INFO spark.SecurityManager: Changing modify acls to: root
15/02/11 20:57:17 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
15/02/11 20:57:18 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/02/11 20:57:19 INFO Remoting: Starting remoting
15/02/11 20:57:20 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@sandbox:58553]
15/02/11 20:57:20 INFO util.Utils: Successfully started service 'sparkDriver' on port 58553.
15/02/11 20:57:20 INFO spark.SparkEnv: Registering MapOutputTracker
15/02/11 20:57:20 INFO spark.SparkEnv: Registering BlockManagerMaster
15/02/11 20:57:20 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-local-20150211205720-f7e6
15/02/11 20:57:20 INFO storage.MemoryStore: MemoryStore started with capacity 530.3 MB
15/02/11 20:57:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/02/11 20:57:24 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-d90ad2bb-e82f-4446-8ccd-e79ff4c6d076
15/02/11 20:57:24 INFO spark.HttpServer: Starting HTTP Server
15/02/11 20:57:24 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/02/11 20:57:24 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:52012
15/02/11 20:57:24 INFO util.Utils: Successfully started service 'HTTP file server' on port 52012.
15/02/11 20:57:25 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/02/11 20:57:25 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
15/02/11 20:57:25 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
15/02/11 20:57:25 INFO ui.SparkUI: Started SparkUI at http://sandbox:4040
15/02/11 20:57:26 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/02/11 20:57:27 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
15/02/11 20:57:27 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
15/02/11 20:57:27 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
15/02/11 20:57:27 INFO yarn.Client: Setting up container launch context for our AM
15/02/11 20:57:27 INFO yarn.Client: Preparing resources for our AM container
15/02/11 20:57:31 WARN yarn.ClientBase: SPARK_JAR detected in the system environment. This variable has been deprecated in favor of the spark.yarn.jar configuration variable.
15/02/11 20:57:31 INFO yarn.Client: Source and destination file systems are the same. Not copying hdfs:/spark/spark-assembly-1.2.0-hadoop2.4.0.jar
15/02/11 20:57:31 INFO yarn.Client: Setting up the launch environment for our AM container
15/02/11 20:57:31 WARN yarn.ClientBase: SPARK_JAR detected in the system environment. This variable has been deprecated in favor of the spark.yarn.jar configuration variable.
15/02/11 20:57:31 INFO spark.SecurityManager: Changing view acls to: root
15/02/11 20:57:31 INFO spark.SecurityManager: Changing modify acls to: root
15/02/11 20:57:31 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
15/02/11 20:57:31 INFO yarn.Client: Submitting application 1 to ResourceManager
15/02/11 20:57:32 INFO impl.YarnClientImpl: Submitted application application_1423706171480_0001
15/02/11 20:57:33 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:33 INFO yarn.Client: 
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: N/A
         ApplicationMaster RPC port: -1
         queue: default
         start time: 1423706251906
         final status: UNDEFINED
         tracking URL: http://sandbox:8088/proxy/application_1423706171480_0001/
         user: root
15/02/11 20:57:34 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:35 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:36 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:37 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:38 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:40 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:41 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:42 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:43 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:44 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:45 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:46 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:47 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:48 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:49 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:50 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:51 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:52 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:53 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:54 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:55 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:56 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:57 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:58 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:57:59 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:58:00 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:58:01 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:58:04 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:58:05 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:58:06 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:58:07 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:58:08 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:58:10 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:58:11 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:58:13 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:58:14 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:58:15 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:58:16 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:58:17 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:58:18 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:58:19 INFO yarn.Client: Application report for application_1423706171480_0001 (state: ACCEPTED)
15/02/11 20:58:19 INFO cluster.YarnClientSchedulerBackend: ApplicationMaster registered as Actor[akka.tcp://sparkYarnAM@sandbox:54672/user/YarnAM#-192648481]
15/02/11 20:58:19 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> sandbox, PROXY_URI_BASES -> http://sandbox:8088/proxy/application_1423706171480_0001), /proxy/application_1423706171480_0001
15/02/11 20:58:19 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
15/02/11 20:58:20 INFO yarn.Client: Application report for application_1423706171480_0001 (state: RUNNING)
15/02/11 20:58:20 INFO yarn.Client: 
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: sandbox
         ApplicationMaster RPC port: 0
         queue: default
         start time: 1423706251906
         final status: UNDEFINED
         tracking URL: http://sandbox:8088/proxy/application_1423706171480_0001/
         user: root
15/02/11 20:58:20 INFO cluster.YarnClientSchedulerBackend: Application application_1423706171480_0001 has started running.
15/02/11 20:58:20 INFO netty.NettyBlockTransferService: Server created on 60949
15/02/11 20:58:20 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/02/11 20:58:20 INFO storage.BlockManagerMasterActor: Registering block manager sandbox:60949 with 530.3 MB RAM, BlockManagerId(<driver>, sandbox, 60949)
15/02/11 20:58:20 INFO storage.BlockManagerMaster: Registered BlockManager
15/02/11 20:58:21 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
15/02/11 20:58:21 INFO repl.SparkILoop: Created spark context..
Spark context available as sc.


scala> 15/02/11 20:58:43 INFO cluster.YarnClientSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@sandbox:37188/user/Executor#375257054] with ID 1
15/02/11 20:58:43 INFO util.RackResolver: Resolved sandbox to /default-rack
15/02/11 20:58:43 INFO cluster.YarnClientSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@sandbox:52808/user/Executor#1782772186] with ID 2
15/02/11 20:58:45 INFO storage.BlockManagerMasterActor: Registering block manager sandbox:55768 with 530.3 MB RAM, BlockManagerId(1, sandbox, 55768)
15/02/11 20:58:45 INFO storage.BlockManagerMasterActor: Registering block manager sandbox:41242 with 530.3 MB RAM, BlockManagerId(2, sandbox, 41242)




scala> sc.parallelize(1 to 1000).count()
15/02/11 20:59:45 INFO spark.SparkContext: Starting job: count at <console>:13
15/02/11 20:59:45 INFO scheduler.DAGScheduler: Got job 0 (count at <console>:13) with 2 output partitions (allowLocal=false)
15/02/11 20:59:45 INFO scheduler.DAGScheduler: Final stage: Stage 0(count at <console>:13)
15/02/11 20:59:45 INFO scheduler.DAGScheduler: Parents of final stage: List()
15/02/11 20:59:45 INFO scheduler.DAGScheduler: Missing parents: List()
15/02/11 20:59:45 INFO scheduler.DAGScheduler: Submitting Stage 0 (ParallelCollectionRDD[0] at parallelize at <console>:13), which has no missing parents
15/02/11 20:59:45 INFO storage.MemoryStore: ensureFreeSpace(1088) called with curMem=0, maxMem=556038881
15/02/11 20:59:45 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1088.0 B, free 530.3 MB)
15/02/11 20:59:45 INFO storage.MemoryStore: ensureFreeSpace(842) called with curMem=1088, maxMem=556038881
15/02/11 20:59:45 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 842.0 B, free 530.3 MB)
15/02/11 20:59:45 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on sandbox:60949 (size: 842.0 B, free: 530.3 MB)
15/02/11 20:59:45 INFO storage.BlockManagerMaster: Updated info of block broadcast_0_piece0
15/02/11 20:59:45 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:838
15/02/11 20:59:45 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from Stage 0 (ParallelCollectionRDD[0] at parallelize at <console>:13)
15/02/11 20:59:45 INFO cluster.YarnClientClusterScheduler: Adding task set 0.0 with 2 tasks
15/02/11 20:59:45 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, sandbox, PROCESS_LOCAL, 1260 bytes)
15/02/11 20:59:45 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, sandbox, PROCESS_LOCAL, 1260 bytes)
15/02/11 20:59:52 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on sandbox:41242 (size: 842.0 B, free: 530.3 MB)
15/02/11 20:59:52 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on sandbox:55768 (size: 842.0 B, free: 530.3 MB)
15/02/11 20:59:52 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 6625 ms on sandbox (1/2)
15/02/11 20:59:52 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 6691 ms on sandbox (2/2)
15/02/11 20:59:52 INFO scheduler.DAGScheduler: Stage 0 (count at <console>:13) finished in 6.695 s
15/02/11 20:59:52 INFO cluster.YarnClientClusterScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool 
15/02/11 20:59:52 INFO scheduler.DAGScheduler: Job 0 finished: count at <console>:13, took 7.036182 s
res0: Long = 1000


scala> exit

4.4 Run a Spark example program

bash-4.1# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client --driver-memory 1g --executor-memory 1g --executor-cores 1 ./lib/spark-examples-1.2.0-hadoop2.4.0.jar
Spark assembly has been built with Hive, including Datanucleus jars on classpath
15/02/11 21:09:37 INFO spark.SecurityManager: Changing view acls to: root
15/02/11 21:09:37 INFO spark.SecurityManager: Changing modify acls to: root
15/02/11 21:09:37 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
15/02/11 21:09:38 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/02/11 21:09:38 INFO Remoting: Starting remoting
15/02/11 21:09:38 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@sandbox:46836]
15/02/11 21:09:38 INFO util.Utils: Successfully started service 'sparkDriver' on port 46836.
15/02/11 21:09:38 INFO spark.SparkEnv: Registering MapOutputTracker
15/02/11 21:09:38 INFO spark.SparkEnv: Registering BlockManagerMaster
15/02/11 21:09:38 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-local-20150211210938-ba7a
15/02/11 21:09:38 INFO storage.MemoryStore: MemoryStore started with capacity 530.3 MB
15/02/11 21:09:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/02/11 21:09:39 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-590b1b3d-95b6-4d7c-bef4-36b0cafeafe9
15/02/11 21:09:39 INFO spark.HttpServer: Starting HTTP Server
15/02/11 21:09:39 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/02/11 21:09:39 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:60161
15/02/11 21:09:39 INFO util.Utils: Successfully started service 'HTTP file server' on port 60161.
15/02/11 21:09:40 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/02/11 21:09:40 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
15/02/11 21:09:40 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
15/02/11 21:09:40 INFO ui.SparkUI: Started SparkUI at http://sandbox:4040
15/02/11 21:09:41 INFO spark.SparkContext: Added JAR file:/usr/local/spark-1.2.0-bin-hadoop2.4/./lib/spark-examples-1.2.0-hadoop2.4.0.jar at http://172.17.0.2:60161/jars/spark-examples-1.2.0-hadoop2.4.0.jar with timestamp 1423706981078
15/02/11 21:09:41 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/02/11 21:09:41 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
15/02/11 21:09:41 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
15/02/11 21:09:41 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
15/02/11 21:09:41 INFO yarn.Client: Setting up container launch context for our AM
15/02/11 21:09:41 INFO yarn.Client: Preparing resources for our AM container
15/02/11 21:09:42 WARN yarn.ClientBase: SPARK_JAR detected in the system environment. This variable has been deprecated in favor of the spark.yarn.jar configuration variable.
15/02/11 21:09:42 INFO yarn.Client: Source and destination file systems are the same. Not copying hdfs:/spark/spark-assembly-1.2.0-hadoop2.4.0.jar
15/02/11 21:09:42 INFO yarn.Client: Setting up the launch environment for our AM container
15/02/11 21:09:42 WARN yarn.ClientBase: SPARK_JAR detected in the system environment. This variable has been deprecated in favor of the spark.yarn.jar configuration variable.
15/02/11 21:09:42 INFO spark.SecurityManager: Changing view acls to: root
15/02/11 21:09:42 INFO spark.SecurityManager: Changing modify acls to: root
15/02/11 21:09:42 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
15/02/11 21:09:42 INFO yarn.Client: Submitting application 3 to ResourceManager
15/02/11 21:09:43 INFO impl.YarnClientImpl: Submitted application application_1423706171480_0003
15/02/11 21:09:44 INFO yarn.Client: Application report for application_1423706171480_0003 (state: ACCEPTED)
15/02/11 21:09:44 INFO yarn.Client: 
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: N/A
         ApplicationMaster RPC port: -1
         queue: default
         start time: 1423706982964
         final status: UNDEFINED
         tracking URL: http://sandbox:8088/proxy/application_1423706171480_0003/
         user: root
15/02/11 21:09:45 INFO yarn.Client: Application report for application_1423706171480_0003 (state: ACCEPTED)
15/02/11 21:09:46 INFO yarn.Client: Application report for application_1423706171480_0003 (state: ACCEPTED)
15/02/11 21:09:47 INFO yarn.Client: Application report for application_1423706171480_0003 (state: ACCEPTED)
15/02/11 21:09:48 INFO yarn.Client: Application report for application_1423706171480_0003 (state: ACCEPTED)
15/02/11 21:09:49 INFO cluster.YarnClientSchedulerBackend: ApplicationMaster registered as Actor[akka.tcp://sparkYarnAM@sandbox:36886/user/YarnAM#250082351]
15/02/11 21:09:49 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> sandbox, PROXY_URI_BASES -> http://sandbox:8088/proxy/application_1423706171480_0003), /proxy/application_1423706171480_0003
15/02/11 21:09:49 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
15/02/11 21:09:49 INFO yarn.Client: Application report for application_1423706171480_0003 (state: RUNNING)
15/02/11 21:09:49 INFO yarn.Client: 
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: sandbox
         ApplicationMaster RPC port: 0
         queue: default
         start time: 1423706982964
         final status: UNDEFINED
         tracking URL: http://sandbox:8088/proxy/application_1423706171480_0003/
         user: root
15/02/11 21:09:49 INFO cluster.YarnClientSchedulerBackend: Application application_1423706171480_0003 has started running.
15/02/11 21:09:49 INFO netty.NettyBlockTransferService: Server created on 56981
15/02/11 21:09:49 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/02/11 21:09:49 INFO storage.BlockManagerMasterActor: Registering block manager sandbox:56981 with 530.3 MB RAM, BlockManagerId(<driver>, sandbox, 56981)
15/02/11 21:09:49 INFO storage.BlockManagerMaster: Registered BlockManager
15/02/11 21:10:03 INFO cluster.YarnClientSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@sandbox:56995/user/Executor#-1663552722] with ID 2
15/02/11 21:10:04 INFO util.RackResolver: Resolved sandbox to /default-rack
15/02/11 21:10:04 INFO cluster.YarnClientSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@sandbox:35154/user/Executor#1336228035] with ID 1
15/02/11 21:10:04 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
15/02/11 21:10:04 INFO spark.SparkContext: Starting job: reduce at SparkPi.scala:35
15/02/11 21:10:04 INFO scheduler.DAGScheduler: Got job 0 (reduce at SparkPi.scala:35) with 2 output partitions (allowLocal=false)
15/02/11 21:10:04 INFO scheduler.DAGScheduler: Final stage: Stage 0(reduce at SparkPi.scala:35)
15/02/11 21:10:04 INFO scheduler.DAGScheduler: Parents of final stage: List()
15/02/11 21:10:04 INFO scheduler.DAGScheduler: Missing parents: List()
15/02/11 21:10:04 INFO scheduler.DAGScheduler: Submitting Stage 0 (MappedRDD[1] at map at SparkPi.scala:31), which has no missing parents
15/02/11 21:10:04 INFO storage.MemoryStore: ensureFreeSpace(1728) called with curMem=0, maxMem=556038881
15/02/11 21:10:05 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1728.0 B, free 530.3 MB)
15/02/11 21:10:05 INFO storage.MemoryStore: ensureFreeSpace(1235) called with curMem=1728, maxMem=556038881
15/02/11 21:10:05 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1235.0 B, free 530.3 MB)
15/02/11 21:10:05 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on sandbox:56981 (size: 1235.0 B, free: 530.3 MB)
15/02/11 21:10:05 INFO storage.BlockManagerMaster: Updated info of block broadcast_0_piece0
15/02/11 21:10:05 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:838
15/02/11 21:10:05 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from Stage 0 (MappedRDD[1] at map at SparkPi.scala:31)
15/02/11 21:10:05 INFO cluster.YarnClientClusterScheduler: Adding task set 0.0 with 2 tasks
15/02/11 21:10:05 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, sandbox, PROCESS_LOCAL, 1335 bytes)
15/02/11 21:10:05 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, sandbox, PROCESS_LOCAL, 1335 bytes)
15/02/11 21:10:06 INFO storage.BlockManagerMasterActor: Registering block manager sandbox:48023 with 530.3 MB RAM, BlockManagerId(2, sandbox, 48023)
15/02/11 21:10:06 INFO storage.BlockManagerMasterActor: Registering block manager sandbox:46354 with 530.3 MB RAM, BlockManagerId(1, sandbox, 46354)
15/02/11 21:10:23 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on sandbox:46354 (size: 1235.0 B, free: 530.3 MB)
15/02/11 21:10:23 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on sandbox:48023 (size: 1235.0 B, free: 530.3 MB)
15/02/11 21:10:24 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 18997 ms on sandbox (1/2)
15/02/11 21:10:24 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 19283 ms on sandbox (2/2)
15/02/11 21:10:24 INFO scheduler.DAGScheduler: Stage 0 (reduce at SparkPi.scala:35) finished in 19.324 s
15/02/11 21:10:24 INFO cluster.YarnClientClusterScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool 
15/02/11 21:10:24 INFO scheduler.DAGScheduler: Job 0 finished: reduce at SparkPi.scala:35, took 20.226582 s
Pi is roughly 3.143
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/kill,null}
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/,null}
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/static,null}
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/threadDump/json,null}
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/threadDump,null}
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/json,null}
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors,null}
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/environment/json,null}
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/environment,null}
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/rdd/json,null}
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/rdd,null}
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/json,null}
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage,null}
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/pool/json,null}
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/pool,null}
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/json,null}
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage,null}
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/json,null}
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages,null}
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs/job/json,null}
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs/job,null}
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs/json,null}
15/02/11 21:10:24 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs,null}
15/02/11 21:10:24 INFO ui.SparkUI: Stopped Spark web UI at http://sandbox:4040
15/02/11 21:10:24 INFO scheduler.DAGScheduler: Stopping DAGScheduler
15/02/11 21:10:24 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
15/02/11 21:10:24 INFO cluster.YarnClientSchedulerBackend: Asking each executor to shut down
15/02/11 21:10:24 INFO cluster.YarnClientSchedulerBackend: Stopped
15/02/11 21:10:25 INFO spark.MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
15/02/11 21:10:25 INFO storage.MemoryStore: MemoryStore cleared
15/02/11 21:10:25 INFO storage.BlockManager: BlockManager stopped
15/02/11 21:10:25 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
15/02/11 21:10:25 INFO spark.SparkContext: Successfully stopped SparkContext
15/02/11 21:10:25 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/02/11 21:10:25 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
bash-4.1# 
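The SparkPi job above estimates Pi by Monte Carlo sampling: it throws random points into the unit square and counts how many land inside the quarter circle, so 4 × hits / samples approximates Pi. The same idea can be sketched without Spark in a few lines of awk-in-shell (the seed and sample count here are arbitrary choices, not values from the example):

```shell
# Monte Carlo estimate of Pi, mirroring what SparkPi computes in parallel.
n=100000
hits=$(awk -v n="$n" 'BEGIN {
  srand(42)                       # fixed seed so the run is repeatable
  for (i = 0; i < n; i++) {
    x = rand(); y = rand()
    if (x * x + y * y <= 1) c++   # point lands inside the quarter circle
  }
  print c
}')
pi=$(awk -v h="$hits" -v n="$n" 'BEGIN { printf "%.3f", 4 * h / n }')
echo "Pi is roughly $pi"
```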
