athena.transform.feats.framepow

This module extracts framepow features per frame.

Module Contents

Classes

Framepow: Compute power of every frame in speech. Return a float tensor with shape (1 * num_frames).
class athena.transform.feats.framepow.Framepow(config: dict)

Bases: athena.transform.feats.base_frontend.BaseFrontend

Compute power of every frame in speech. Return a float tensor with shape (1 * num_frames).

classmethod params(cls, config=None)

Set params.

:param config: contains four optional parameters:

    window_length: Window length in seconds. (float, default = 0.025)
    frame_length: Hop length in seconds. (float, default = 0.010)
    snip_edges: If True, the last frame (shorter than window_length) will be cut off. If False, half of frame_length (1 // 2 frame_length) of data will be padded to the data. (int, default = True)
    remove_dc_offset: Subtract the mean from the waveform on each frame. (bool, default = True)

:return: An object of class HParams, which is a set of hyperparameters as name-value pairs.

call(self, audio_data, sample_rate)

Calculate power of every frame in speech.

:param audio_data: the audio signal from which to compute the spectrum. Should be a (1, N) tensor.
:param sample_rate: the sample rate of the signal we are working with; default is 16 kHz.
:return: A float tensor of size (1 * num_frames) containing the power of every frame in speech.
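To make the parameters above concrete, here is a minimal pure-Python sketch of per-frame power. It is not the Athena implementation: the function name frame_power is hypothetical, power is assumed to be the sum of squared samples in each frame, only the snip_edges = True behaviour (drop the incomplete last frame) is modelled, and the result is a flat list rather than a tensor.

```python
def frame_power(samples, sample_rate=16000, window_length=0.025,
                frame_length=0.010, remove_dc_offset=True):
    """Return the power of each complete frame as a list of floats.

    Hypothetical helper for illustration only. Assumes power means the
    sum of squared samples, and mimics snip_edges=True by discarding
    any trailing samples that do not fill a whole window.
    """
    win = int(round(window_length * sample_rate))  # samples per window
    hop = int(round(frame_length * sample_rate))   # samples per hop
    if len(samples) < win:
        return []
    num_frames = 1 + (len(samples) - win) // hop
    powers = []
    for i in range(num_frames):
        frame = samples[i * hop:i * hop + win]
        if remove_dc_offset:
            # Subtract the per-frame mean before measuring power.
            mean = sum(frame) / win
            frame = [x - mean for x in frame]
        powers.append(sum(x * x for x in frame))
    return powers
```

With the defaults at 16 kHz, each window spans 400 samples and each hop 160 samples, so a one-second signal yields 98 complete frames under these assumptions.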
dim(self)

dim