Non-speech noise or sound recognition software?

Problem description:

I'm working on some software for children, and looking to add the ability for the software to respond to a number of non-speech sounds. For instance, clapping, barking, whistling, fart noises, etc.

I've used CMU Sphinx and the Windows Speech API in the past; however, as far as I can tell, neither of these has any support for non-speech noises, and in fact I believe they actively filter them out.

In general I'm looking for "How do I get this functionality" but I suspect it may help if I break it down into three questions that are my guesses for what to search for next:


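  1. Is there a way to use one of the major speech recognition engines to recognize non-word sounds by changing the acoustic model or pronunciation lexicon?

  2. (or) Is there already an existing library that does non-word noise recognition?

  3. (or) I have a little familiarity with Hidden Markov Models and the underlying tech of voice recognition from college, but no good estimate of how difficult it would be to create a very small noise/sound recognizer from scratch (suppose < 20 noises to be recognized). If 1) and 2) fail, any estimate of how long it would take to roll my own?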

Thanks

I don't know of any existing libraries you can use; I suspect you may have to roll your own.

Would this paper be of interest? It has some technical detail, and they seem to be able to recognise claps and differentiate them from whistles.
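
If you do end up rolling your own, here is a rough sketch of one way to start telling a clap from a whistle. To be clear, this is not the method from the paper; it is just my assumption that a clap is a short broadband burst while a whistle is close to a single tone, so the spectral flatness (geometric mean over arithmetic mean of the power spectrum) should be high for the first and near zero for the second. The function names and the 0.1 threshold are made up for illustration, and the test signals are synthetic rather than real recordings.

```python
import numpy as np


def spectral_flatness(signal):
    """Geometric mean / arithmetic mean of the power spectrum.

    Close to 0 for a pure tone, noticeably higher for noise-like sounds.
    """
    windowed = signal * np.hanning(len(signal))
    # Small offset keeps log() finite for empty bins.
    power = np.abs(np.fft.rfft(windowed)) ** 2 + 1e-12
    return float(np.exp(np.mean(np.log(power))) / np.mean(power))


def classify(signal, threshold=0.1):
    """Label a frame as clap-like or whistle-like.

    The threshold is an illustrative guess, not a tuned value.
    """
    if spectral_flatness(signal) > threshold:
        return "clap-like"
    return "whistle-like"


# Synthetic stand-ins for real recordings (assumed 16 kHz mono).
rate = 16000
t = np.arange(2048) / rate
whistle = np.sin(2 * np.pi * 2000 * t)                  # steady 2 kHz tone
rng = np.random.default_rng(0)
clap = rng.standard_normal(2048) * np.exp(-t * 200)     # decaying noise burst

print(classify(whistle))
print(classify(clap))
```

For something like 20 sound classes a single feature will not be enough; you would probably want several features per frame (or MFCCs) fed into a small classifier or per-class HMMs, but the overall shape of the code stays the same.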