How do I write a check in Python to see whether a file is valid UTF-8?
Problem description:
As stated in the title, I would like to check whether a given file object (opened as a binary stream) is a valid UTF-8 file.
Does anyone know how?
Thanks
Answer
You can do something like this:
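import codecs

try:
    # errors='strict' raises UnicodeDecodeError on the first invalid byte sequence
    with codecs.open(filename, encoding='utf-8', errors='strict') as f:
        for line in f:
            pass  # reading each line forces the whole file to be decoded
    print("Valid utf-8")
except UnicodeDecodeError:
    print("invalid utf-8")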
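
The snippet above takes a filename, while the question asks about a file object that is already open as a binary stream. One possible way to handle that case is sketched below, using the standard library's incremental decoder so the stream is checked chunk by chunk without loading it all into memory. The function name is_valid_utf8 and the chunk_size parameter are made up for this illustration, and the sketch assumes the stream's read() returns bytes.

```python
import codecs
import io


def is_valid_utf8(stream, chunk_size=64 * 1024):
    """Return True if the binary stream decodes as strict UTF-8.

    `is_valid_utf8` and `chunk_size` are hypothetical names for this sketch.
    """
    decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
    try:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            # The decoder buffers a multi-byte character split across chunks.
            decoder.decode(chunk)
        # final=True raises if the data ends in the middle of a character.
        decoder.decode(b"", final=True)
    except UnicodeDecodeError:
        return False
    return True


# Usage with an in-memory binary stream standing in for an opened file.
print(is_valid_utf8(io.BytesIO("caf\u00e9".encode("utf-8"))))
```

Note that the check consumes the stream, so if you need to read the file again afterwards and the stream is seekable, call stream.seek(0) first.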