Can Jupyter Notebook be used with AWS Glue instead of Zeppelin?

Question:

I've just started using AWS Glue for my data ETL. I've pulled my data sources into my AWS Glue Data Catalog, and I'm about to create a job for the data from one particular Postgres database I use for testing. I've read online that you can use a Zeppelin notebook when authoring your own job. I haven't used Zeppelin at all, but as a Python developer I've used Jupyter Notebook heavily for data analytics and for teaching myself machine learning. I haven't been able to find an answer anywhere online, so my question is: is there a way to use a Jupyter notebook in place of a Zeppelin notebook when authoring your own AWS Glue jobs?

I think it should be possible if you can set up a Jupyter notebook locally and enable SSH tunneling to AWS Glue. I have seen some reference sites that cover setting up a local Jupyter notebook, enabling SSH tunneling, and so on, though none of them are specific to AWS Glue.
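To make the tunneling idea a bit more concrete, here is a rough Python sketch that only builds a standard `ssh` local port-forward command. Everything specific in it is an assumption for illustration: the helper name `build_tunnel_command`, the `glue` login user, the key path, the host names, and the port numbers are all placeholders, and you would need to take the real values from your own Glue endpoint details. I have not verified this against a Glue setup.

```python
import shlex


def build_tunnel_command(key_path, remote_host, local_port,
                         target_host, target_port, user="glue"):
    """Return the argv list for an ssh local port forward.

    All arguments are placeholders (assumptions); nothing here is
    specific to AWS Glue beyond the default user name, which is
    itself an assumption.
    """
    # -L local_port:target_host:target_port forwards a local port
    # through the SSH connection to target_host:target_port.
    forward = f"{local_port}:{target_host}:{target_port}"
    # -N: do not run a remote command, -T: do not allocate a terminal.
    return ["ssh", "-i", key_path, "-N", "-T", "-L", forward,
            f"{user}@{remote_host}"]


if __name__ == "__main__":
    cmd = build_tunnel_command(
        key_path="~/.ssh/my-key.pem",        # hypothetical key file
        remote_host="endpoint.example.com",  # hypothetical endpoint address
        local_port=8998,                     # hypothetical local port
        target_host="localhost",             # hypothetical forward target
        target_port=8998,                    # hypothetical remote port
    )
    # Print the command so it can be reviewed before running it by hand.
    print(shlex.join(cmd))
```

With the tunnel open in one terminal, the local Jupyter notebook would then talk to the forwarded local port. How the notebook kernel actually connects to Spark on the other side is not covered here and depends on your setup.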