Problem using snappy with Avro in Python
Question:
I am reading a .gz file and converting it to Avro format. With codec='deflate' everything works fine, i.e. I am able to convert to Avro format. With codec='snappy' it throws the following error:
raise DataFileException("Unknown codec: %r" % codec)
avro.datafile.DataFileException: Unknown codec: 'snappy'
With deflate (works fine):
writer = DataFileWriter(open(avro_file, "wb"), DatumWriter(), schema, codec='deflate')
With snappy (raises the error):
writer = DataFileWriter(open(avro_file, "wb"), DatumWriter(), schema, codec = "snappy")
A quick response would be helpful.
Thanks.
Answer
From avro/datafile.py:
try:
    import snappy
    has_snappy = True
except ImportError:
    has_snappy = False
...
# Codecs supported by container files:
VALID_CODECS = frozenset(['null', 'deflate'])
if has_snappy:
    VALID_CODECS = frozenset.union(VALID_CODECS, ['snappy'])
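So 'snappy' is only accepted as a codec when the snappy module can be imported. You have to install the python-snappy library (for example, pip install python-snappy).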
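If you want the conversion script to keep working on machines where python-snappy is missing, one option is to do the same import check yourself before choosing a codec. The helper below is a minimal sketch, not part of the avro library: pick_avro_codec is a hypothetical name, and falling back to 'deflate' is an assumption about what you would want in that case.

```python
def pick_avro_codec(preferred="snappy", fallback="deflate"):
    """Return `preferred`, or `fallback` if snappy was requested but is not importable.

    Hypothetical helper: it mirrors the try/except ImportError check that
    avro/datafile.py uses to decide whether 'snappy' is a valid codec.
    """
    if preferred == "snappy":
        try:
            import snappy  # noqa: F401  (provided by the python-snappy package)
        except ImportError:
            return fallback
    return preferred


codec = pick_avro_codec()
print("Using Avro codec:", codec)

# The result can then be passed to the writer from the question, e.g.:
# writer = DataFileWriter(open(avro_file, "wb"), DatumWriter(), schema, codec=codec)
```

This only avoids the "Unknown codec" exception; files written with the fallback are deflate-compressed, so installing python-snappy is still required if you need snappy output.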