How do I create multiple SparkContexts in a console?

Date: 2022-06-22 02:31:57

I want to create more than one SparkContext in a console. According to a post on the mailing list, I need to call SparkConf.set('spark.driver.allowMultipleContexts', true). That seems reasonable, but it does not work. Does anyone have experience with this? Thanks a lot.

Below is what I did and the error message; I ran it in an IPython notebook:

from pyspark import SparkConf, SparkContext
conf = SparkConf().setMaster("spark://10.21.208.21:7077").set("spark.driver.allowMultipleContexts", "true")
conf.getAll()
[(u'spark.eventLog.enabled', u'true'),
 (u'spark.driver.allowMultipleContexts', u'true'),
 (u'spark.driver.host', u'10.20.70.80'),
 (u'spark.app.name', u'pyspark-shell'),
 (u'spark.eventLog.dir', u'hdfs://10.21.208.21:8020/sparklog'),
 (u'spark.master', u'spark://10.21.208.21:7077')]

sc1 = SparkContext(conf=conf.setAppName("app 1")) ## this sc succeeds
sc1
<pyspark.context.SparkContext at 0x1b7cf10>

sc2 = SparkContext(conf=conf.setAppName("app 2")) ## this failed
ValueError                                Traceback (most recent call last)
<ipython-input-23-e6dcca5aec38> in <module>()
----> 1 sc2 = SparkContext(conf=conf.setAppName("app 2"))

/usr/local/spark-1.2.0-bin-cdh4/python/pyspark/context.pyc in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc)
    100         """
    101         self._callsite = first_spark_call() or CallSite(None, None, None)
--> 102         SparkContext._ensure_initialized(self, gateway=gateway)
    103         try:
    104             self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,

/usr/local/spark-1.2.0-bin-cdh4/python/pyspark/context.pyc in _ensure_initialized(cls, instance, gateway)
    226                         " created by %s at %s:%s "
    227                         % (currentAppName, currentMaster,
--> 228                             callsite.function, callsite.file, callsite.linenum))
    229                 else:
    230                     SparkContext._active_spark_context = instance

ValueError: Cannot run multiple SparkContexts at once; existing SparkContext(app=app 1, master=spark://10.21.208.21:7077) created by __init__ at <ipython-input-21-fb3adb569241>:1 

4 solutions

#1


3  

This is a PySpark-specific limitation that existed before the spark.driver.allowMultipleContexts configuration was added (which relates to multiple SparkContext objects within a JVM). PySpark disallows multiple active SparkContexts because various parts of its implementation assume that certain components have global shared state.

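In practice, the workaround is to make sure the previous context is fully stopped before building the next one. A minimal sketch of that pattern (the local[*] master and the app names here are placeholders, not taken from the original post):

from pyspark import SparkConf, SparkContext

sc1 = SparkContext(conf=SparkConf().setMaster("local[*]").setAppName("app 1"))
# ... work with sc1 ...

# PySpark allows only one active SparkContext per driver process,
# so stop the first context before constructing the second.
sc1.stop()

sc2 = SparkContext(conf=SparkConf().setMaster("local[*]").setAppName("app 2"))
# ... work with sc2 ...
sc2.stop()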

#2


3  

I was hoping the previous SparkContext could be stopped and closed by calling close()/stop() so that a new one could be created, but I still get the same error.

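As a side note, the PySpark SparkContext exposes stop() rather than close(), and newer releases also let it be used as a context manager so that stop() is guaranteed to run. A rough sketch of that pattern (not from the original answer, and it may not apply to the Spark 1.2.0 build in the question):

from pyspark import SparkConf, SparkContext

# The context manager calls stop() on exit, even if the body raises,
# so the next SparkContext can be created without the "multiple contexts" error.
with SparkContext(conf=SparkConf().setMaster("local[*]").setAppName("job 1")) as sc:
    print(sc.parallelize(range(10)).sum())

with SparkContext(conf=SparkConf().setMaster("local[*]").setAppName("job 2")) as sc:
    print(sc.parallelize(range(10)).count())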

#3


1  

My way:

from pyspark import SparkContext

try:
    # Stop the currently active context, if any; NameError is raised when sc is not defined yet
    sc.stop()
except Exception:
    pass

sc = SparkContext('local', 'pyspark')

# ... your code ...

sc.stop()

#4


0  

Run the function below before creating a new context.

from pyspark import SparkContext

def kill_current_spark_context():
    SparkContext.getOrCreate().stop()  # get the active context (or create one) and stop it
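
A short usage sketch, assuming the helper above has already been defined in the same session (the local[*] master and the app name are placeholders):

from pyspark import SparkConf, SparkContext

kill_current_spark_context()  # make sure no SparkContext is active in this driver
sc = SparkContext(conf=SparkConf().setMaster("local[*]").setAppName("new app"))
# ... your code ...
sc.stop()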
