athena.models.fastspeech

Module Contents

Classes

FastSpeech – Reference: Fastspeech: Fast, robust and controllable text to speech
LengthRegulator – Length regulator for feed-forward Transformer.
DurationCalculator – Calculate duration based on teacher model
class athena.models.fastspeech.FastSpeech(data_descriptions, config=None)

Bases: athena.models.base.BaseModel

Reference: Fastspeech: Fast, robust and controllable text to speech (http://papers.nips.cc/paper/8580-fastspeech-fast-robust-and-controllable-text-to-speech.pdf)

default_config
set_teacher_model(self, teacher_model, teacher_type)
restore_from_pretrained_model(self, pretrained_model, model_type='')
get_loss(self, outputs, samples, training=None)
_feedforward_decoder(self, encoder_output, duration_indexes, duration_sequences, output_length, training)
call(self, samples, training: bool = None)
synthesize(self, samples)
class athena.models.fastspeech.LengthRegulator

Bases: tensorflow.keras.layers.Layer

Length regulator for feed-forward Transformer.

inference(self, phoneme_sequences, duration_sequences, alpha=1.0)

Calculate replicated sequences based on duration sequences.

Parameters:
phoneme_sequences – sequences of phoneme features, shape: [batch, x_steps, d_model]
duration_sequences – duration of each phoneme in output frames, shape: [batch, x_steps]
alpha – alpha value to control the speed of speech.

Returns: replicated sequences based on durations, shape: [batch, y_steps, d_model]
Return type: expanded_array
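
The expansion described above can be sketched in NumPy. This is a minimal illustration, not Athena's implementation: the function name `regulate_length` is hypothetical, and it assumes that alpha scales durations multiplicatively (so alpha > 1 slows speech), that scaled durations are rounded to the nearest integer, and that shorter sequences are zero-padded to the longest one in the batch.

```python
import numpy as np

def regulate_length(phoneme_sequences, duration_sequences, alpha=1.0):
    """Repeat each phoneme vector by its duration (hypothetical helper).

    phoneme_sequences:  [batch, x_steps, d_model]
    duration_sequences: [batch, x_steps]
    returns:            [batch, y_steps, d_model], zero-padded
    """
    phonemes = np.asarray(phoneme_sequences, dtype=np.float32)
    # Assumption: alpha multiplies durations, then we round to whole frames.
    scaled = np.asarray(duration_sequences, dtype=np.float32) * alpha
    durations = np.maximum(np.rint(scaled), 0).astype(np.int64)

    # Expand each utterance separately, since output lengths differ.
    expanded = [np.repeat(seq, dur, axis=0) for seq, dur in zip(phonemes, durations)]

    # Zero-pad every utterance to the longest one in the batch.
    y_steps = max(len(e) for e in expanded)
    out = np.zeros((len(expanded), y_steps, phonemes.shape[-1]), dtype=np.float32)
    for i, e in enumerate(expanded):
        out[i, :len(e)] = e
    return out

phonemes = np.array([[[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]])
out = regulate_length(phonemes, [[2, 0, 3]])
print(out.shape)  # (1, 5, 2)
```

A phoneme with duration 0 is dropped from the output entirely, which is why the second vector does not appear in the expanded sequence.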
call(self, phoneme_sequences, duration_indexes, output_length)

Calculate replicated sequences based on duration sequences.

Parameters:
phoneme_sequences – sequences of phoneme features, shape: [batch, x_steps, d_model]
duration_indexes – durations of each frame, shape: [batch, y_steps]
output_length – shape: [batch]

Returns: replicated sequences based on durations, shape: [batch, y_steps, d_model]
Return type: expanded_array
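
One plausible reading of these shapes is that `duration_indexes` stores, for every output frame, the index of the phoneme to copy, and that `output_length` marks where padding begins. The sketch below works under that assumption only; the source does not confirm it, and the function name `expand_by_index` is hypothetical.

```python
import numpy as np

def expand_by_index(phoneme_sequences, duration_indexes, output_length):
    """Gather phoneme vectors per output frame (hypothetical helper).

    phoneme_sequences: [batch, x_steps, d_model]
    duration_indexes:  [batch, y_steps], assumed to hold phoneme indices
    output_length:     [batch], assumed to hold valid frame counts
    returns:           [batch, y_steps, d_model]
    """
    phonemes = np.asarray(phoneme_sequences, dtype=np.float32)
    indexes = np.asarray(duration_indexes, dtype=np.int64)
    lengths = np.asarray(output_length, dtype=np.int64)

    # expanded[b, t] = phonemes[b, indexes[b, t]]
    expanded = np.take_along_axis(phonemes, indexes[:, :, None], axis=1)

    # Zero out frames at or beyond each utterance's output length.
    mask = np.arange(indexes.shape[1])[None, :] < lengths[:, None]
    return expanded * mask[:, :, None]
```

Precomputing the indexes keeps the expansion a single batched gather, which avoids a per-utterance loop during training.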
class athena.models.fastspeech.DurationCalculator(teacher_model=None, teacher_type=None)

Bases: tensorflow.keras.layers.Layer

Calculate duration based on teacher model

call(self, samples)
Parameters: samples – samples from dataset
Returns: batch of durations, shape: [batch, max_input_length].
Return type: durations
_calculate_t2_attentions(self, samples)
_calculate_transformer_attentions(self, samples)
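
In FastSpeech-style training, durations are commonly derived from a teacher's attention by counting, for each input token, how many output frames attend to it most strongly. The following is a minimal sketch of that counting step for a single utterance. It assumes an attention matrix of shape [y_steps, x_steps] has already been selected from the teacher; the function name `durations_from_attention` is hypothetical and this is not Athena's implementation.

```python
import numpy as np

def durations_from_attention(attention, input_length):
    """Count output frames per input token (hypothetical helper).

    attention:    [y_steps, x_steps] attention weights for one utterance
    input_length: number of input tokens (x_steps)
    returns:      [input_length] integer durations
    """
    attention = np.asarray(attention, dtype=np.float32)
    # For each output frame, pick the input token with the largest weight.
    aligned_token = attention.argmax(axis=-1)
    # Count how many frames were assigned to each input token.
    return np.bincount(aligned_token, minlength=input_length)

attention = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.1, 0.7, 0.2],
    [0.0, 0.1, 0.9],
    [0.0, 0.2, 0.8],
]
print(durations_from_attention(attention, input_length=3))  # [2 1 2]
```

The durations always sum to the number of output frames, so feeding them back into a length regulator reproduces the target sequence length.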