Issue while using Snappy with Avro in Python

Problem description:

I am reading a .gz file and converting it to Avro format. With codec='deflate' everything works fine, i.e., I am able to convert to Avro. With codec='snappy' it throws the error below:

raise DataFileException("Unknown codec: %r" % codec)
avro.datafile.DataFileException: Unknown codec: 'snappy'

With deflate (works fine):

writer = DataFileWriter(open(avro_file, "wb"), DatumWriter(), schema, codec='deflate')

With snappy (throws the error):

writer = DataFileWriter(open(avro_file, "wb"), DatumWriter(), schema, codec = "snappy")

A quick response would be helpful.

Thanks.


From avro/datafile.py:

try:
  import snappy
  has_snappy = True
except ImportError:
  has_snappy = False

...

# Codecs supported by container files:
VALID_CODECS = frozenset(['null', 'deflate'])
if has_snappy:
  VALID_CODECS = frozenset.union(VALID_CODECS, ['snappy'])

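So you have to install the python-snappy library (for example, pip install python-snappy). Avro only adds 'snappy' to VALID_CODECS when "import snappy" succeeds; otherwise DataFileWriter rejects it as an unknown codec.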
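If you want the script to keep working on machines where python-snappy is missing, one option is to run the same import check yourself before choosing the codec. This is only a rough sketch: pick_avro_codec is a hypothetical helper name, not part of avro, and it assumes that falling back to 'deflate' is acceptable for your data.

```python
def pick_avro_codec(preferred="snappy", fallback="deflate"):
    """Return `preferred` unless it is 'snappy' and python-snappy is missing.

    Hypothetical helper (not part of avro). It repeats the try/except
    import check that avro/datafile.py uses to decide whether 'snappy'
    is a valid codec.
    """
    if preferred == "snappy":
        try:
            import snappy  # noqa: F401  (provided by the python-snappy package)
        except ImportError:
            return fallback
    return preferred


if __name__ == "__main__":
    codec = pick_avro_codec()
    print("Using codec:", codec)
    # Then pass it to the writer exactly as in the question:
    # writer = DataFileWriter(open(avro_file, "wb"), DatumWriter(), schema, codec=codec)
```

If the helper returns 'deflate' when you expected 'snappy', python-snappy is not importable from the interpreter that runs the script, so check that it was installed into the same environment.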