NLTK: Setting up a proxy server

Question:

I'm trying to learn NLTK, the Natural Language Toolkit written in Python, and I want to install a sample data set to run some examples.

My web connection uses a proxy server, and I'm trying to specify the proxy address as follows:

>>> nltk.set_proxy('http://proxy.example.com:3128' ('USERNAME', 'PASSWORD'))
>>> nltk.download()

But I get an error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object is not callable

I decided to set up a ProxyBasicAuthHandler before calling nltk.download():

import urllib2

auth_handler = urllib2.ProxyBasicAuthHandler(urllib2.HTTPPasswordMgrWithDefaultRealm())
auth_handler.add_password(realm=None, uri='http://proxy.example.com:3128/', user='USERNAME', passwd='PASSWORD')
opener = urllib2.build_opener(auth_handler)
urllib2.install_opener(opener)

import nltk
nltk.download()

But now I get HTTP Error 407 - Proxy Authentication Required.

The documentation says that if the proxy is set to None, this function will attempt to detect the system proxy. But that isn't working either.
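As a diagnostic aside, one way to see what "system proxy" detection might find is to ask the standard library directly. The sketch below assumes Python 3, where `urllib.request.getproxies()` plays the role that `urllib2` played in the code above; the helper name `detected_proxies` is hypothetical, and whether NLTK consults exactly this function is an assumption, not something stated here.

```python
import urllib.request


def detected_proxies():
    """Return the proxy mapping the standard library detects.

    On most systems this is read from environment variables such as
    http_proxy and https_proxy; an empty dict means nothing was found.
    """
    return urllib.request.getproxies()


if __name__ == "__main__":
    # Print whatever is detected; an empty result would explain why
    # automatic detection appears to do nothing.
    print(detected_proxies())
```

If the result is empty, automatic detection has nothing to work with, and the proxy has to be passed explicitly.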

How can I install a sample data set for NLTK?

Answer:

The website where you got the code for your first attempt contains an error (I have seen that same error).

The faulty line is

nltk.set_proxy('http://proxy.example.com:3128' ('USERNAME', 'PASSWORD'))

You need a comma to separate the arguments. The correct line should be

nltk.set_proxy('http://proxy.example.com:3128', ('USERNAME', 'PASSWORD'))

This will work fine.
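To see why the missing comma produces that particular TypeError: a string followed directly by a parenthesized tuple is parsed as a call on the string, with the tuple's items as arguments. The minimal sketch below reproduces this without NLTK; the function name `missing_comma_error` is made up for illustration.

```python
def missing_comma_error():
    """Reproduce the TypeError caused by omitting the comma."""
    proxy_url = 'http://proxy.example.com:3128'
    try:
        # Same shape as the faulty line: a string followed directly by
        # (...), which Python treats as calling the string.
        proxy_url('USERNAME', 'PASSWORD')
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(missing_comma_error())  # 'str' object is not callable
```

With the comma in place, the URL and the credentials tuple are passed as two separate arguments and no call on the string is attempted.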
