athena.data.datasets.speech_recognition_kaldiio

audio dataset

Module Contents

Classes

SpeechRecognitionDatasetKaldiIOBuilder
class athena.data.datasets.speech_recognition_kaldiio.SpeechRecognitionDatasetKaldiIOBuilder(config=None)

Bases: athena.data.datasets.base.BaseDatasetBuilder

SpeechRecognitionDatasetKaldiIOBuilder

default_config
num_class

return the number of classes, i.e. the maximum index of the vocabulary + 1
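The "max index + 1" rule can be illustrated with a minimal sketch. The token-to-index dictionary and the function name below are assumptions made for illustration; they are not taken from Athena's implementation.

```python
def num_class_from_vocab(token_to_index):
    """Return the class count as (largest vocabulary index + 1).

    `token_to_index` is a hypothetical mapping from token to integer id.
    """
    if not token_to_index:
        return 0
    return max(token_to_index.values()) + 1


# Indices need not be contiguous; the count still follows the largest one.
example_vocab = {"<unk>": 0, "a": 1, "b": 2, "<eos>": 5}
class_count = num_class_from_vocab(example_vocab)
```

Sizing the output layer by the largest index rather than by the number of tokens keeps every index valid even when the vocabulary has gaps.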

speaker_list

return the speaker list

audio_featurizer_func

return the audio_featurizer function

sample_type
sample_shape
sample_signature
reload_config(self, config)

reload the config

preprocess_data(self, file_dir, apply_sort_filter=True)

Generate a list of tuples (feat_key, speaker).
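As a rough sketch of what a list of (feat_key, speaker) tuples could look like, assuming the feature keys and an optional utterance-to-speaker mapping are already in memory. The function name and the fallback speaker label are hypothetical, not Athena's.

```python
def build_entries(feat_keys, utt2spk=None, default_speaker="unknown"):
    """Pair each feature key with its speaker.

    `default_speaker` is a placeholder chosen for this sketch; it is used
    when no utt2spk mapping is given or a key is missing from it.
    """
    utt2spk = utt2spk or {}
    return [(key, utt2spk.get(key, default_speaker)) for key in feat_keys]


entries = build_entries(["utt1", "utt2"], {"utt1": "spkA"})
```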

load_scps(self, file_dir)

load Kaldi-format feats.scp, labels.scp, and (optionally) utt2spk
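Kaldi-style table files such as feats.scp and utt2spk are plain text with one "key value" pair per line. The sketch below parses such lines into a dictionary. It is an illustrative helper under that assumption, not the method's actual code, and it does not read the feature matrices themselves.

```python
def parse_key_value_lines(lines):
    """Parse 'key value' lines into a dict, skipping blank lines.

    The key is the first whitespace-delimited field; the value is the
    rest of the line (for an scp file, typically a path with an offset).
    """
    table = {}
    for line in lines:
        fields = line.strip().split(None, 1)
        if not fields:
            continue
        key = fields[0]
        value = fields[1].strip() if len(fields) > 1 else ""
        table[key] = value
    return table


# Hypothetical file contents, given inline so the sketch needs no files.
feats = parse_key_value_lines(["utt1 /data/feats.ark:12", "utt2 /data/feats.ark:3456"])
utt2spk = parse_key_value_lines(["utt1 spkA", "", "utt2 spkB"])
```

Since utt2spk is optional, a caller would typically check whether the file exists before parsing it and fall back to an empty mapping otherwise.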

__getitem__(self, index)
__len__(self)

return the number of data samples

filter_sample_by_unk(self)

filter out samples that contain unk
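A minimal sketch of this kind of filtering, assuming each entry carries a list of label ids and that the unknown token has a known integer id. The entry layout and names are hypothetical.

```python
def drop_entries_with_unk(entries, unk_id):
    """Keep only entries whose label ids do not include `unk_id`.

    Each entry is assumed to be a (key, label_ids) tuple for this sketch.
    """
    return [entry for entry in entries if unk_id not in entry[1]]


samples = [("utt1", [3, 4, 5]), ("utt2", [3, 0, 5]), ("utt3", [])]
kept = drop_entries_with_unk(samples, unk_id=0)
```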

filter_sample_by_input_length(self)

filter samples by input length

The length of the filtered samples will be in [min_length, max_length)

Returns: a filtered list of tuples (wav_filename, wav_len, transcripts, speed, speaker)
Return type: entries
filter_sample_by_output_length(self)

filter samples by output length

The length of the filtered samples will be in [min_length, max_length)

Returns: a filtered list of tuples (wav_filename, wav_len, transcripts, speed, speaker)
Return type: entries
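Both length filters keep samples whose length falls in the half-open interval [min_length, max_length). The sketch below shows that boundary behaviour with a generic helper. The function name and the `length_of` callback are assumptions for illustration, not part of the class.

```python
def filter_by_length(entries, length_of, min_length, max_length):
    """Keep entries with min_length <= length_of(entry) < max_length."""
    return [e for e in entries if min_length <= length_of(e) < max_length]


# Hypothetical (key, length) pairs; the second field plays the role of
# the input (or output) length.
candidates = [("a", 9), ("b", 10), ("c", 19), ("d", 20)]
selected = filter_by_length(candidates, lambda e: e[1], min_length=10, max_length=20)
```

The lower bound is inclusive and the upper bound is exclusive, so a sample of exactly max_length is dropped.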
compute_cmvn_if_necessary(self, is_necessary=True)

compute the CMVN (cepstral mean and variance normalization) file if necessary
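CMVN normalizes each feature dimension to zero mean and unit variance using statistics gathered over the data. The following is a pure-Python sketch of global statistics, assuming features are given as lists of frames. How Athena groups the statistics (for example per speaker) and how it stores the file are not covered here, and all names are hypothetical.

```python
import math


def accumulate_cmvn_stats(utterances):
    """Return per-dimension (mean, variance) over all frames.

    `utterances` is an iterable of utterances, each a list of frames,
    each frame a list of floats with the same dimensionality.
    """
    count = 0
    total = None
    total_sq = None
    for frames in utterances:
        for frame in frames:
            if total is None:
                total = [0.0] * len(frame)
                total_sq = [0.0] * len(frame)
            for dim, value in enumerate(frame):
                total[dim] += value
                total_sq[dim] += value * value
            count += 1
    if count == 0:
        raise ValueError("no frames to compute CMVN statistics from")
    mean = [s / count for s in total]
    # Var[x] = E[x^2] - E[x]^2, clamped at zero against rounding error.
    var = [max(sq / count - m * m, 0.0) for sq, m in zip(total_sq, mean)]
    return mean, var


def apply_cmvn(frames, mean, var, eps=1e-8):
    """Normalize frames with the given statistics; eps avoids division by zero."""
    return [
        [(x - m) / math.sqrt(v + eps) for x, m, v in zip(frame, mean, var)]
        for frame in frames
    ]


features = [[[1.0, 2.0], [3.0, 6.0]]]
mean, var = accumulate_cmvn_stats(features)
normalized = apply_cmvn(features[0], mean, var)
```

Accumulating sums and squared sums in one pass means the statistics can be computed without holding all features in memory at once.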