Blog for the Open Source JavaHotel project

Sunday, 30 March 2014

Octave, Mel-frequency cepstrum

Introduction
I created an Octave implementation of the Mel-frequency cepstrum, following the approach from this article. The source is available here. To run the code, two additional Octave packages must be installed: nan and signal.
General description
The implementation is controlled by several global variables.
  • NOMELS : the size of the mel filterbank (number of filters)
  • STARTHZ : lower frequency bound
  • LASTHZ : upper frequency bound (usually half of the sampling frequency)
  • FFTSIZE : FFT resolution (usually 512 or 1024)
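To illustrate how these four parameters interact, here is a minimal sketch, written in Python for brevity rather than in the author's Octave. It spaces points evenly on the mel scale between STARTHZ and LASTHZ and maps them to FFT bin indices. The parameter values, the extra SAMPLE_RATE variable, the widely used 2595 * log10(1 + f/700) mel formula, the NOMELS + 2 point count and the floor-based bin mapping are all assumptions for illustration; the original code may differ on each of them.

```python
import math

# Illustrative values only; the post does not state the settings actually used.
NOMELS = 10          # number of mel filters
STARTHZ = 300.0      # lower frequency bound in Hz
LASTHZ = 8000.0      # upper frequency bound in Hz (half of SAMPLE_RATE)
FFTSIZE = 512        # FFT resolution
SAMPLE_RATE = 16000  # assumed sampling rate; not one of the post's globals


def hz_to_mel(hz):
    """Common mel-scale formula: 2595 * log10(1 + hz / 700)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_bin_edges(nomels, starthz, lasthz, fftsize, sample_rate):
    """Return nomels + 2 FFT bin indices, evenly spaced on the mel scale.

    Each triangular filter needs a left edge, a peak and a right edge,
    so nomels filters sharing edges need nomels + 2 points (assumed convention).
    """
    low, high = hz_to_mel(starthz), hz_to_mel(lasthz)
    step = (high - low) / (nomels + 1)
    points_hz = [mel_to_hz(low + i * step) for i in range(nomels + 2)]
    # Map each frequency to the FFT bin that contains it.
    return [int(math.floor((fftsize + 1) * hz / sample_rate)) for hz in points_hz]


edges = mel_bin_edges(NOMELS, STARTHZ, LASTHZ, FFTSIZE, SAMPLE_RATE)
print(edges)
```

A larger FFTSIZE gives finer bins, so neighbouring mel points are less likely to collapse onto the same bin at the low-frequency end.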
The following functions are defined.
  • readsound1 : read a WAV file (only one channel is used if the file has several)
  • applyhunning : apply a Hann (Hanning) window function
  • countmel : convert Hertz value to Mel scale
  • revmel : inverse to countmel
  • calcpoints: calculate NOMELS mel scale points from STARTHZ to LASTHZ
  • calfreq : transform the calcpoints output to Hertz
  • calcbins : transform the calfreq output to the equivalent FFT bins
  • melbins : calcpoints -> calfreq -> calcbins
  • creatembank : create a single mel filter
  • createfilters : create the whole filterbank (a vector of single filters)
  • calculateenergy : compute the mel filterbank energies (energy sums) from the FFT output
  • transformenergy : calculateenergy -> log -> dct
  • transformmel : compute the mel coefficients for an input signal
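The filterbank and the energy -> log -> dct chain listed above can be sketched as follows, in Python rather than Octave. This is not the author's code: the function names, the hard-coded bin edges, the natural logarithm, the small floor that avoids log(0) and the unnormalised DCT-II are assumptions chosen for illustration. The power spectrum is taken as given, so no FFT is computed here.

```python
import math


def triangular_filter(left, center, right, nbins):
    """One triangular filter: rises from left to center, falls from center to right."""
    weights = [0.0] * nbins
    for k in range(left, center):
        weights[k] = (k - left) / (center - left)
    for k in range(center, right + 1):
        weights[k] = (right - k) / (right - center) if right > center else 1.0
    return weights


def filterbank(edges, nbins):
    """Build one filter per consecutive triple of bin edges."""
    return [triangular_filter(edges[i], edges[i + 1], edges[i + 2], nbins)
            for i in range(len(edges) - 2)]


def filter_energies(power_spectrum, filters):
    """Weighted energy sum of the spectrum under each filter."""
    return [sum(w * p for w, p in zip(f, power_spectrum)) for f in filters]


def dct2(values):
    """Unnormalised DCT-II of a list of numbers."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n)]


def mel_coefficients(power_spectrum, filters, floor=1e-10):
    """Filter energies -> log -> DCT, giving cepstral coefficients."""
    energies = filter_energies(power_spectrum, filters)
    log_energies = [math.log(max(e, floor)) for e in energies]
    return dct2(log_energies)


# Toy input: hypothetical bin edges for 4 filters and a flat power spectrum.
edges = [0, 2, 4, 7, 11, 16]
nbins = 17
filters = filterbank(edges, nbins)
spectrum = [1.0] * nbins
coeffs = mel_coefficients(spectrum, filters)
print(coeffs)
```

In a real run the bin edges would come from the mel-spaced points and the spectrum from the FFT of a windowed frame, one frame at a time.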
Usage
I tried to use this package to recognize spoken numbers in voice commands, but failed. As reference voices I downloaded samples from http://dictionary.reference.com/ (for instance, for "one"). As input recordings I tried automatically generated voices from a text-to-speech demo and some natural voices.
The matching function is defined as matchsinglefile. It simply scans through the input signal (using the global variables SLICESIZE and OFFSET) and cuts it into consecutive slices. It then compares the mel coefficients of each slice against the mel coefficients of the reference sound using an rms (Euclidean distance) function.
However, I failed to find a threshold that separates numbers from non-numbers. The whole approach probably needs more elaboration.
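The slice-and-compare idea described above can be sketched as follows, in Python rather than Octave. It is a simplified stand-in for matchsinglefile, not a transcription of it: all function names are hypothetical, and toy_features replaces the real mel coefficient computation so that the example stays self-contained. The RMS difference used here equals the Euclidean distance divided by the square root of the vector length, so both rank slices identically.

```python
import math


def rms_distance(a, b):
    """Root-mean-square difference between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def slice_signal(signal, slice_size, offset):
    """Yield (start, slice) pairs of length slice_size, advancing by offset samples."""
    for start in range(0, len(signal) - slice_size + 1, offset):
        yield start, signal[start:start + slice_size]


def match_slices(signal, reference_features, slice_size, offset, features):
    """Score every slice against the reference; a lower score means a closer match."""
    return [(start, rms_distance(features(chunk), reference_features))
            for start, chunk in slice_signal(signal, slice_size, offset)]


def toy_features(chunk):
    """Hypothetical stand-in for mel coefficients: mean and peak-to-peak range."""
    return [sum(chunk) / len(chunk), max(chunk) - min(chunk)]


# Toy signal with the reference pattern embedded at sample 4.
signal = [0, 0, 0, 0, 1, 3, 1, 3, 0, 0, 0, 0]
reference = toy_features([1, 3, 1, 3])
scores = match_slices(signal, reference, slice_size=4, offset=2, features=toy_features)
best_start, best_score = min(scores, key=lambda s: s[1])
print(best_start, best_score)
```

A detector would then report a match wherever the score falls below some threshold. Picking that threshold is the step that did not work out in the experiment described here.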




