Unable to load pyspark inside virtualenv

Problem description:

I installed pyspark in a Python virtualenv. I also installed the newly released JupyterLab (http://jupyterlab.readthedocs.io/en/stable/getting_started/installation.html) in the same virtualenv. However, I was unable to launch pyspark from a Jupyter notebook in such a way that the SparkContext variable is available.

First, activate the virtualenv and set the Spark environment variables:

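# Activate the virtualenv
source venv/bin/activate
# Point Spark at the pip-installed pyspark package (adjust python2.7 to match your venv)
export SPARK_HOME={path_to_venv}/lib/python2.7/site-packages/pyspark
# Use JupyterLab as the driver front end when pyspark is launched
export PYSPARK_DRIVER_PYTHON=jupyter-lab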

Before this, make sure you have run pip install pyspark and pip install jupyterlab inside your virtualenv.
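The SPARK_HOME path above hard-codes python2.7 and a {path_to_venv} placeholder, both of which vary between environments. As a rough sketch that is not part of the original answer, the Python snippet below asks the active interpreter where the pyspark package lives and prints a matching export line. The helper name find_pyspark_home is hypothetical, and the snippet assumes it is run with the virtualenv's own Python.

```python
import importlib.util
import os


def find_pyspark_home():
    """Return the directory of the installed pyspark package, or None.

    Hypothetical helper: it only locates the package on disk and does
    not import pyspark or start Spark.
    """
    spec = importlib.util.find_spec("pyspark")
    if spec is None or not spec.submodule_search_locations:
        return None
    # For a regular package the first search location is the package directory.
    return os.path.abspath(list(spec.submodule_search_locations)[0])


home = find_pyspark_home()
if home is None:
    print("pyspark is not importable from this interpreter")
else:
    # Copy this line into the shell where the virtualenv is active.
    print("export SPARK_HOME=" + home)
```

If the snippet reports that pyspark is not importable, the virtualenv is probably not active or pip install pyspark was run against a different interpreter.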

To check, once JupyterLab is open (running pyspark from the same shell should launch it, given the PYSPARK_DRIVER_PYTHON setting above), type sc in a notebook cell. The SparkContext object should be available, and the output should look like this:

SparkContext
Spark UI
Version: v2.2.1
Master: local[*]
AppName: PySparkShell
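The same three fields can also be read programmatically. A minimal sketch, assuming it is pasted into a notebook cell: the function name describe_context is hypothetical, while version, master, and appName are attributes of SparkContext. It reports a hint instead of raising NameError when sc was never created.

```python
def describe_context(namespace):
    """Summarise the SparkContext stored under 'sc' in a namespace dict.

    Hypothetical helper: returns a hint string when 'sc' is missing.
    """
    sc = namespace.get("sc")
    if sc is None:
        return "sc is not defined; was the notebook started through pyspark?"
    return "Spark {} | master {} | app {}".format(sc.version, sc.master, sc.appName)


# In a notebook cell, globals() is the namespace where pyspark places sc.
print(describe_context(globals()))
```

If the hint is printed, the kernel was most likely started directly rather than through pyspark, so the environment variables set earlier never took effect.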