FluidBufMFCC
Classes (extension) | Libraries > FluidCorpusManipulation

FluidBufMFCC : FluidBufProcessor : FluidServerObject : Object

Mel-Frequency Cepstral Coefficients as Spectral Descriptors on a Buffer

Description

Computes a classic timbral spectral descriptor, the Mel-Frequency Cepstral Coefficients (MFCCs), on a Buffer.

MFCC stands for Mel-Frequency Cepstral Coefficients ("cepstral" is pronounced like "kepstral"). This analysis is often used for timbral description and timbral comparison. It compresses the overall spectrum into a smaller number of coefficients that, when taken together, describe the general contour of the spectrum.

The MFCC values are derived by first computing a mel-frequency spectrum, just as in FluidMelBands. numCoeffs coefficients are then calculated by using that mel-frequency spectrum as input to the discrete cosine transform. This means that the shape of the mel-frequency spectrum is compared to a number of cosine wave shapes (different cosine shapes created from different frequencies). Each MFCC value (i.e., "coefficient") represents how similar the mel-frequency spectrum is to one of these cosine shapes.
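The comparison against cosine shapes can be sketched in a few lines of Python, used here only as a neutral illustration language. This is not FluCoMa's implementation: it assumes an unnormalised DCT-II, a common convention that this page does not specify, and `dct_ii` and `mel_spectrum` are hypothetical names.

```python
import math

def dct_ii(values, num_coeffs):
    """Return the first num_coeffs unnormalised DCT-II coefficients of values.

    Coefficient k is the dot product of the input with a cosine that
    completes k half-cycles across the input (k = 0 is a flat line).
    """
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(num_coeffs)
    ]

# Hypothetical 8-band mel spectrum shaped exactly like the k = 1 cosine.
num_bands = 8
mel_spectrum = [math.cos(math.pi * (i + 0.5) / num_bands) for i in range(num_bands)]

coeffs = dct_ii(mel_spectrum, num_coeffs=4)
# The spectrum matches cosine shape 1 and is orthogonal to shapes 0, 2 and 3,
# so only coeffs[1] is far from zero.
```

A real mel-frequency spectrum is a mixture of many such shapes, so in practice every coefficient carries some part of the contour.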

Other than the 0th coefficient, MFCCs are unchanged by differences in the overall energy of the spectrum (which relates to how we perceive loudness). This means that timbres with similar spectral contours, but different volumes, will still have similar MFCC values, other than MFCC 0. To remove any indication of loudness but keep the information about timbre, we can ignore MFCC 0 by setting the parameter startCoeff to 1.
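The loudness behaviour described above can be checked numerically. The Python sketch below is an illustration under stated assumptions, not FluCoMa's code: it assumes the mel-band magnitudes are log-compressed before an unnormalised DCT-II, and that skipping coefficient 0 still returns the requested number of coefficients. `dct_ii`, `mfcc_like` and the band values are hypothetical.

```python
import math

def dct_ii(values, num_coeffs):
    """First num_coeffs unnormalised DCT-II coefficients of values."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(num_coeffs)
    ]

def mfcc_like(mel_magnitudes, num_coeffs, start_coeff=0):
    """Log-compress the bands, take the DCT, then skip start_coeff coefficients.

    Assumption: log compression, so a gain change becomes a constant offset.
    """
    log_bands = [math.log(m) for m in mel_magnitudes]
    return dct_ii(log_bands, num_coeffs + start_coeff)[start_coeff:]

quiet = [0.2, 0.5, 0.9, 0.4, 0.1, 0.05]  # hypothetical mel-band magnitudes
loud = [m * 8.0 for m in quiet]          # same contour, 8 times the gain

quiet_all = mfcc_like(quiet, num_coeffs=4)
loud_all = mfcc_like(loud, num_coeffs=4)
# Only coefficient 0 differs between quiet_all and loud_all.

quiet_timbre = mfcc_like(quiet, num_coeffs=4, start_coeff=1)
loud_timbre = mfcc_like(loud, num_coeffs=4, start_coeff=1)
# With coefficient 0 skipped, the two descriptions agree up to rounding error.
```

The gain multiplies every band by the same factor, which the logarithm turns into a constant added to every band. A constant is exactly the flat k = 0 shape, so it lands entirely in coefficient 0.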

For an interactive explanation of this relationship, visit https://learn.flucoma.org/reference/mfcc/explain.

Read more about FluidBufMFCC on the learn platform.

Class Methods

FluidBufMFCC.process(server, source, startFrame: 0, numFrames: -1, startChan: 0, numChans: -1, features, numCoeffs: 13, numBands: 40, startCoeff: 0, minFreq: 20, maxFreq: 20000, windowSize: 1024, hopSize: -1, fftSize: -1, padding: 1, freeWhenDone: true, action)

FluidBufMFCC.processBlocking(server, source, startFrame: 0, numFrames: -1, startChan: 0, numChans: -1, features, numCoeffs: 13, numBands: 40, startCoeff: 0, minFreq: 20, maxFreq: 20000, windowSize: 1024, hopSize: -1, fftSize: -1, padding: 1, freeWhenDone: true, action)

Processes the source Buffer on the Server. processBlocking will execute directly in the server command FIFO, whereas process will delegate to a separate worker thread. The latter is generally only worthwhile for longer-running jobs where you don't wish to tie up the server.

Arguments:

server

The Server on which the buffers to be processed are allocated.

source

The buffer to use as the source material to be analysed. The different channels of multichannel buffers will be processed sequentially.

startFrame

Where in the source buffer the analysis should start, in samples. The default is 0.

Constraints

  • Minimum: 0

numFrames

How many frames should be analysed. The default of -1 indicates to analyse to the end of the buffer.

startChan

For a multichannel source buffer, which channel should be processed first. The default is 0.

Constraints

  • Minimum: 0

numChans

For a multichannel source buffer, how many channels should be processed. The default of -1 indicates to analyse through the last channel.

features

The destination buffer to write the MFCC analysis into.

numCoeffs

The number of cepstral coefficients to return. The default is 13.

Constraints

  • Minimum: 2
  • Maximum: MIN(numBands, maxnumCoeffs)

numBands

The number of bands that will be perceptually equally distributed between minFreq and maxFreq. The default is 40.

Constraints

  • Minimum: MAX(numCoeffs, 2)
  • Maximum: MIN((FFT Size / 2) + 1 (see fft settings), maxnumBands)

startCoeff

The lowest index of the output cepstral coefficients to return, counting from zero. This can be useful for skipping the 0th coefficient (by setting startCoeff to 1), because the 0th coefficient represents the overall energy in the spectrum, while the remaining coefficients are affected only by the mel-frequency spectral contour, not by overall energy. The default is 0.

Constraints

  • Minimum: 0
  • Maximum: 1

minFreq

The lower bound of the frequency band to use in analysis, in Hz. The default is 20.

Constraints

  • Minimum: 0

maxFreq

The upper bound of the frequency band to use in analysis, in Hz. The default is 20000.

Constraints

  • Minimum: 0

windowSize

The window size. As MFCC computation relies on spectral frames, we need to decide what precision we give it spectrally and temporally. For more information visit https://learn.flucoma.org/learn/fourier-transform/. The default is 1024.

hopSize

The window hop size. As MFCC computation relies on spectral frames, we need to move the window forward. It can be any size, but low overlap will create audible artefacts. The default of -1 sets the hop size to half of windowSize (an overlap of 2).

fftSize

The inner FFT/IFFT size. It should be at least 4 samples long, at least the size of the window, and a power of 2. Making it larger allows an oversampling of the spectral precision. The default of -1 uses the next power of 2 equal to or above windowSize.
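How the -1 placeholders for hopSize and fftSize resolve can be sketched as follows. This Python helper is hypothetical and only mirrors the rules stated above; it is not part of the FluCoMa API.

```python
def resolve_fft_settings(window_size=1024, hop_size=-1, fft_size=-1):
    """Replace the -1 placeholders with the defaults described above."""
    if hop_size == -1:
        hop_size = window_size // 2        # half the window, an overlap of 2
    if fft_size == -1:
        fft_size = 4                       # smallest FFT size mentioned above
        while fft_size < window_size:
            fft_size *= 2                  # next power of 2 at or above the window
    num_bins = fft_size // 2 + 1           # the (FFT Size / 2) + 1 limit on numBands
    return window_size, hop_size, fft_size, num_bins

defaults = resolve_fft_settings()
odd_window = resolve_fft_settings(window_size=1500)
```

With the default window of 1024 this gives a hop of 512 and an FFT size of 1024 (513 bins). A window of 1500 is not a power of 2, so the FFT size becomes 2048 and the window is zero-padded up to it.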

padding

Controls the zero-padding added to either end of the source buffer or segment. Padding ensures all values are analysed. Possible values are:

  • 0: No padding. The first analysis window starts at time 0, and the samples at either end will be tapered by the STFT windowing function.
  • 1: Half the window size. The first sample is centred in the analysis window, ensuring that the start and end of the segment are accounted for in the analysis.
  • 2: Window size minus the hop size. This mode can be useful when the overlap factor (window size / hop size) is greater than 2, to ensure that the input samples at either end of the segment are covered by the same number of analysis frames as the rest of the analysed material.

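The effect of the three padding modes on the first sample can be counted directly. The Python sketch below is an illustration only: it assumes that analysis frames start at minus the padding amount and advance by the hop size, and that the segment is long enough for every frame to fit. `padding_samples` and `frames_covering` are hypothetical helpers.

```python
def padding_samples(mode, window_size, hop_size):
    """Zero-padding added at each end of the segment for a given mode."""
    if mode == 0:
        return 0
    if mode == 1:
        return window_size // 2
    if mode == 2:
        return window_size - hop_size
    raise ValueError("padding mode must be 0, 1 or 2")

def frames_covering(sample, mode, window_size, hop_size):
    """Count the analysis windows that contain the given sample index."""
    pad = padding_samples(mode, window_size, hop_size)
    count = 0
    start = -pad                       # assumed start of the first frame
    while start <= sample:
        if sample < start + window_size:
            count += 1
        start += hop_size
    return count

window_size, hop_size = 1024, 256      # overlap factor of 4
first_sample = [frames_covering(0, mode, window_size, hop_size) for mode in (0, 1, 2)]
mid_sample = frames_covering(5000, 0, window_size, hop_size)
```

With an overlap factor of 4, a sample in the middle of the material is covered by 4 frames. The first sample is covered by 1 frame in mode 0, by 3 in mode 1 and by 4 in mode 2, which is why mode 2 gives the edges the same coverage as the rest.
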
freeWhenDone

Free the server instance when processing is complete. The default is true.

action

A function to be evaluated once the offline process has finished and all Buffer's instance variables have been updated on the client side. The function will be passed [features] as an argument.

Returns:

An instance of the processor

FluidBufMFCC.kr(source, startFrame: 0, numFrames: -1, startChan: 0, numChans: -1, features, numCoeffs: 13, numBands: 40, startCoeff: 0, minFreq: 20, maxFreq: 20000, windowSize: 1024, hopSize: -1, fftSize: -1, padding: 1, trig: 1, blocking: 0)

Trigger the equivalent behaviour to processBlocking / process from a Synth. Can be useful for expressing a sequence of buffer and data processing jobs to execute. Note that the work still executes on the server command FIFO (not the audio thread), and it is the caller's responsibility to manage the sequencing, using the done status of the various UGens.

Arguments:

source

The buffer to use as the source material to be analysed. The different channels of multichannel buffers will be processed sequentially.

startFrame

Where in the source buffer the analysis should start, in samples. The default is 0.

Constraints

  • Minimum: 0

numFrames

How many frames should be analysed. The default of -1 indicates to analyse to the end of the buffer.

startChan

For a multichannel source buffer, which channel should be processed first. The default is 0.

Constraints

  • Minimum: 0

numChans

For a multichannel source buffer, how many channels should be processed. The default of -1 indicates to analyse through the last channel.

features

The destination buffer to write the MFCC analysis into.

numCoeffs

The number of cepstral coefficients to return. The default is 13.

Constraints

  • Minimum: 2
  • Maximum: MIN(numBands, maxnumCoeffs)

numBands

The number of bands that will be perceptually equally distributed between minFreq and maxFreq. The default is 40.

Constraints

  • Minimum: MAX(numCoeffs, 2)
  • Maximum: MIN((FFT Size / 2) + 1 (see fft settings), maxnumBands)

startCoeff

The lowest index of the output cepstral coefficients to return, counting from zero. This can be useful for skipping the 0th coefficient (by setting startCoeff to 1), because the 0th coefficient represents the overall energy in the spectrum, while the remaining coefficients are affected only by the mel-frequency spectral contour, not by overall energy. The default is 0.

Constraints

  • Minimum: 0
  • Maximum: 1

minFreq

The lower bound of the frequency band to use in analysis, in Hz. The default is 20.

Constraints

  • Minimum: 0

maxFreq

The upper bound of the frequency band to use in analysis, in Hz. The default is 20000.

Constraints

  • Minimum: 0

windowSize

The window size. As MFCC computation relies on spectral frames, we need to decide what precision we give it spectrally and temporally. For more information visit https://learn.flucoma.org/learn/fourier-transform/. The default is 1024.

hopSize

The window hop size. As MFCC computation relies on spectral frames, we need to move the window forward. It can be any size, but low overlap will create audible artefacts. The default of -1 sets the hop size to half of windowSize (an overlap of 2).

fftSize

The inner FFT/IFFT size. It should be at least 4 samples long, at least the size of the window, and a power of 2. Making it larger allows an oversampling of the spectral precision. The default of -1 uses the next power of 2 equal to or above windowSize.

padding

Controls the zero-padding added to either end of the source buffer or segment. Padding ensures all values are analysed. Possible values are:

  • 0: No padding. The first analysis window starts at time 0, and the samples at either end will be tapered by the STFT windowing function.
  • 1: Half the window size. The first sample is centred in the analysis window, ensuring that the start and end of the segment are accounted for in the analysis.
  • 2: Window size minus the hop size. This mode can be useful when the overlap factor (window size / hop size) is greater than 2, to ensure that the input samples at either end of the segment are covered by the same number of analysis frames as the rest of the analysed material.

trig

A kr signal that will trigger execution.

blocking

Whether to execute this process directly on the server command FIFO or delegate to a worker thread. See processBlocking/process for caveats.


Instance Methods

.cancel

From superclass: FluidBufProcessor

Cancels non-blocking processing

.wait

From superclass: FluidBufProcessor

When called in the context of a Routine (it won't work otherwise), will block execution until the processor has finished. This can be convenient for writing sequences of processes more linearly than using lots of nested actions.


Examples

Load a lot of MFCC analyses to a data set for later data processing