Reading a `wav` file with `tf.audio.decode_wav`
I am following the tensorflow tutorial for audio recognition at simple_audio. The notebook works very well.
As a next step, I wanted to record my own voice and then run it through the model trained in tensorflow. I first generated a recording:
import sounddevice as sd      # audio recording
from scipy.io import wavfile  # WAV file writing

seconds = 1
sr = 16000
nchannels = 1
# 'filename' is the output path for the recording
myrecording = sd.rec(int(seconds * sr), samplerate=sr, channels=nchannels)
sd.wait()  # block until the recording is finished
wavfile.write(filename, sr, myrecording)
So far so good, and I can play my recording. But when I try to load the file with `tf.audio.decode_wav`, similar to this:
import tensorflow as tf

audio_binary = tf.io.read_file(filename)      # raw bytes of the WAV file
audio, _ = tf.audio.decode_wav(audio_binary)  # decode to a float tensor
I get the following error:
InvalidArgumentError: Bad audio format for WAV: Expected 1 (PCM), but got 3 [Op:DecodeWav]
Any pointers on what might be going wrong are greatly appreciated.
(Would have written this as a comment, but I don't have enough reputation yet)
The default encoding for WAV files is called "16 bit PCM", which means the recorded sound is represented using 16-bit int data before it is written to your WAV file.
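You can double-check what encoding your file actually ended up with by reading it back with scipy: the dtype of the returned array reflects the on-disk sample format. A minimal sketch, assuming `filename` is the same path you passed to `wavfile.write`:

from scipy.io import wavfile

rate, data = wavfile.read(filename)
# int16   -> 16-bit PCM (format code 1), which tf.audio.decode_wav accepts
# float32 -> 32-bit IEEE float (format code 3), which triggers the error above
print(data.dtype)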
`tf.audio.decode_wav()` states in the documentation: "Decode a 16-bit PCM WAV file to a float tensor". Passing a WAV file with any other encoding therefore produces the error you received; in your case the file was written as 32-bit IEEE float (format code 3), which is what `sounddevice` records by default.
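To get a file that `tf.audio.decode_wav` accepts, the recording has to end up as 16-bit PCM. A minimal sketch of two ways to do that, assuming the same `sr` as above and with hypothetical file names ('recording_pcm16.wav', 'recording_float32.wav') used only for illustration:

import numpy as np
import sounddevice as sd
from scipy.io import wavfile

seconds = 1
sr = 16000

# Option 1: record 16-bit integer samples directly, so wavfile.write
# produces a 16-bit PCM file (format code 1).
myrecording = sd.rec(int(seconds * sr), samplerate=sr, channels=1, dtype='int16')
sd.wait()
wavfile.write('recording_pcm16.wav', sr, myrecording)

# Option 2: convert an existing float32 recording (values in [-1.0, 1.0])
# to int16 before writing it out again.
rate, data = wavfile.read('recording_float32.wav')
pcm16 = (np.clip(data, -1.0, 1.0) * 32767).astype(np.int16)
wavfile.write('recording_pcm16.wav', rate, pcm16)

Either way, the resulting file advertises format 1 (PCM) with 16-bit samples, so `tf.audio.decode_wav` should load it without the InvalidArgumentError.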