How to convert a WAV file to float amplitude

Question:

The title says it all:

I have a WAV file (written by PyAudio from an audio input) and I want to convert it to float data corresponding to the sound level (amplitude), so that I can run a Fourier transform on it, etc.

Does anyone have an idea how to convert WAV data to float?

I have identified two decent ways of doing this.

Method 1: using the wavefile module

Use this method if you don't mind installing some extra libraries. That involved a bit of messing around on my Mac, but it was easy on my Ubuntu server.

https://github.com/vokimon/python-wavefile

import sys
import wavefile

# returns the contents of the wav file as a double precision float array
def wav_to_floats(filename = 'file1.wav'):
    w = wavefile.load(filename)
    return w[1][0]

# read the wav file specified as first command line arg
signal = wav_to_floats(sys.argv[1])
print("read " + str(len(signal)) + " frames")
print("in the range " + str(min(signal)) + " to " + str(max(signal)))

Method 2: using the wave module

Use this method if you want fewer module installation hassles.

This reads a WAV file from the filesystem and converts it into floats in the range -1 to 1. It works with 16-bit files; if a file has more than one channel, the samples are returned interleaved, in the same order in which they are stored in the file. For other bit depths, change the 'h' in the argument to struct.unpack according to the table at the bottom of this page:

https://docs.python.org/2/library/struct.html
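As a rough illustration of swapping the format code per bit depth, here is a minimal sketch that picks the struct code from the file's sample width. The name wav_to_floats_any_width and the PCM_FORMATS table are made up for this example, and it assumes plain integer PCM data (8-bit unsigned, 16- and 32-bit signed); any other width is rejected.

```python
import struct
import wave

# Sample width in bytes -> (struct format code, scale, offset).
# Assumption: integer PCM. 8-bit WAV data is unsigned and centred on 128;
# 16-bit and 32-bit data are signed.
PCM_FORMATS = {
    1: ("B", 128.0, 128),
    2: ("h", 32768.0, 0),
    4: ("i", 2147483648.0, 0),
}

def wav_to_floats_any_width(wave_file):
    """Hypothetical helper: read a PCM wav file as floats in the range -1 to 1."""
    w = wave.open(wave_file, "rb")
    try:
        width = w.getsampwidth()
        if width not in PCM_FORMATS:
            raise ValueError("unsupported sample width: %d bytes" % width)
        code, scale, offset = PCM_FORMATS[width]
        raw = w.readframes(w.getnframes())
    finally:
        w.close()
    count = len(raw) // width
    # "<" forces little-endian with standard sizes, which is how WAV stores samples
    values = struct.unpack("<%d%s" % (count, code), raw)
    return [(v - offset) / scale for v in values]
```

Multi-channel files still come back interleaved, exactly as with the 16-bit version.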

It will not work for 24-bit files: struct has no 24-bit data type, so there is no format code that tells struct.unpack what to do.
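If you do need 24-bit files, one possible workaround is to skip struct and decode each 3-byte sample by hand. This is only a sketch, not part of the method above: wav24_to_floats is a made-up name, it relies on int.from_bytes and therefore needs Python 3, and it assumes little-endian signed PCM.

```python
import wave

def wav24_to_floats(wave_file):
    """Hypothetical helper: read a 24-bit PCM wav file as floats in the range -1 to 1."""
    w = wave.open(wave_file, "rb")
    try:
        if w.getsampwidth() != 3:
            raise ValueError("expected a 24-bit (3 bytes per sample) file")
        raw = w.readframes(w.getnframes())
    finally:
        w.close()
    floats = []
    # walk the byte string three bytes at a time
    for i in range(0, len(raw) - 2, 3):
        # decode one little-endian signed 24-bit integer
        value = int.from_bytes(raw[i:i + 3], byteorder="little", signed=True)
        floats.append(value / 8388608.0)  # 8388608 == 2**23
    return floats
```

A pure-Python loop like this is slow on long recordings, so treat it as a starting point rather than a finished solution.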

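import wave
import struct
import sys

def wav_to_floats(wave_file):
    w = wave.open(wave_file)
    nsamples = w.getnframes() * w.getnchannels()
    astr = w.readframes(w.getnframes())
    w.close()
    # convert the binary data to 16-bit signed integers ("h" = short);
    # "<" forces little-endian, which is how WAV files store samples
    a = struct.unpack("<%ih" % nsamples, astr)
    # scale to floats in the range -1 to 1
    a = [float(val) / pow(2, 15) for val in a]
    return a

# read the wav file specified as first command line arg
signal = wav_to_floats(sys.argv[1])
# for multi-channel files this counts interleaved samples, not frames
print("read " + str(len(signal)) + " samples")
print("in the range " + str(min(signal)) + " to " + str(max(signal)))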
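Because multi-channel samples come back interleaved, you will probably want to separate the channels before doing any per-channel analysis. A minimal sketch, assuming you know the channel count (for example from getnchannels() on the opened file); split_channels is a made-up helper name:

```python
def split_channels(samples, nchannels):
    """Hypothetical helper: split an interleaved sample list into one list per channel."""
    if nchannels < 1:
        raise ValueError("nchannels must be at least 1")
    # channel ch owns every nchannels-th sample, starting at index ch
    return [samples[ch::nchannels] for ch in range(nchannels)]
```

For a stereo file, split_channels(signal, 2) gives a two-element list with the first channel's samples followed by the second channel's samples.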