How do I encode and decode percent-encoded (URL-encoded) strings in Python?

Question:

I wrote a simple application which downloads articles from wiki pages. When I search, for example, for the first name Lech, my code returns strings like Lech_Kaczy%C5%84ski or Lech_Pozna%C5%84 instead of Lech_Kaczyński and Lech_Poznań.

How can I decode those characters to ordinary Polish letters? I tried to use urllib.unquote(text), but then I got Lech_Kaczy\xc5\x84ski and Lech_Pozna\xc5\x84 instead of Lech_Kaczyński and Lech_Poznań.

I have in my code:

# -*- coding: utf-8 -*-
import sys
reload(sys)
sys.setdefaultencoding("utf-8")

But the result is the same (it simply does not work).

Answer:

Try the following:

import urllib
urllib.unquote("Lech_Kaczy%C5%84ski").decode('utf8')

This will return a unicode string:

u'Lech_Kaczy\u0144ski'

which you can then print and process as usual. For example:

print(urllib.unquote("Lech_Kaczy%C5%84ski").decode('utf8'))

This prints:

Lech_Kaczyński
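
The code above is Python 2 (urllib.unquote returns a byte string, which is why the raw result shows \xc5\x84, the two UTF-8 bytes of ń). In Python 3 the same functions live in urllib.parse and work on text directly. A minimal sketch of the round trip, assuming Python 3 and UTF-8 encoded input; the variable names are only illustrative:

```python
from urllib.parse import quote, unquote, unquote_to_bytes

encoded = "Lech_Kaczy%C5%84ski"

# Intermediate step: the percent escapes are just raw bytes.
raw = unquote_to_bytes(encoded)   # b'Lech_Kaczy\xc5\x84ski'

# unquote() decodes those bytes as UTF-8 by default and returns a str.
decoded = unquote(encoded)
print(decoded)                    # Lech_Kaczyński

# quote() goes the other way: non-ASCII characters are encoded as
# UTF-8 and each byte is written as %XX.
round_trip = quote(decoded)
print(round_trip)                 # Lech_Kaczy%C5%84ski
```

If the source uses a different encoding than UTF-8, both quote() and unquote() accept an encoding argument.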
