athena.models.masked_pc

An implementation of MPC (masked predictive coding)

Module Contents

Classes

MaskedPredictCoding: implementation of the MPC pretraining model
class athena.models.masked_pc.MaskedPredictCoding(data_descriptions, config=None)

Bases: athena.models.base.BaseModel

Implementation of the MPC pretraining model.

:param num_filters: int, the number of filters in the CNN
:param d_model: int, the model dimension
:param num_heads: number of attention heads in the transformer
:param num_encoder_layers: number of layers in the encoder
:param dff: int, the dimension of the feed-forward network
:param rate: dropout rate of the dropout layers
:param chunk_size: number of consecutive masked frames, e.g. 1 or 3
:param keep_probability: probability that a frame is not masked
:param mode: training mode, e.g. MPC for pretraining
:param max_pool_layers: index of the max-pool layers in the encoder; default is -1
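As a rough illustration of how these parameters fit together, the sketch below builds a plain configuration dict and runs a few sanity checks on it. The values are illustrative assumptions, not the confirmed contents of default_config, and check_config is a hypothetical helper that is not part of athena. The divisibility check reflects the usual multi-head attention convention and is assumed to apply here.

```python
# Illustrative values only; NOT the confirmed defaults of default_config.
example_config = {
    "num_filters": 512,
    "d_model": 512,
    "num_heads": 8,
    "num_encoder_layers": 12,
    "dff": 1280,
    "rate": 0.1,
    "chunk_size": 1,
    "keep_probability": 0.85,
    "mode": "MPC",
    "max_pool_layers": -1,
}


def check_config(config):
    """Hypothetical sanity checks (not part of athena); returns a list of problems."""
    problems = []
    # Multi-head attention usually splits d_model evenly across the heads.
    if config["d_model"] % config["num_heads"] != 0:
        problems.append("d_model should be divisible by num_heads")
    # Dropout rate is a probability strictly below 1.
    if not 0.0 <= config["rate"] < 1.0:
        problems.append("rate should be in [0, 1)")
    # keep_probability is the chance that a frame is left unmasked.
    if not 0.0 <= config["keep_probability"] <= 1.0:
        problems.append("keep_probability should be in [0, 1]")
    # chunk_size counts consecutive masked frames, so it must be positive.
    if config["chunk_size"] < 1:
        problems.append("chunk_size should be at least 1")
    return problems


problems = check_config(example_config)
```

A dict like this would presumably be passed as the config argument of MaskedPredictCoding; how it is merged with default_config is not documented on this page.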

default_config
call(self, samples, training: bool = None)

Used for training.

:param samples: a dict with the keys 'input', 'input_length', 'output_length', and 'output'

input: acoustic features (e.g. f-bank), a Tensor of shape (batch, time_len, dim, 1)
Returns:
MPC outputs to fit acoustic features
encoder_outputs: Transformer encoder outputs, Tensor, shape is (batch, seqlen, dim)
get_loss(self, logits, samples, training=None)

Get the MPC loss.

:param logits: MPC output

Returns: MPC L1 loss
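To make the term "L1 loss" concrete, here is a minimal framework-free sketch of a mean absolute error between predicted and target feature frames. It assumes that only the frames within each utterance's valid length are counted; whether get_loss restricts the loss this way, or to masked frames only, is not stated here. The name sketch_l1_loss and its arguments are hypothetical.

```python
def sketch_l1_loss(predictions, targets, lengths):
    """Mean absolute error over valid frames (hypothetical helper, not athena's get_loss).

    predictions, targets: nested lists of shape (batch, time_len, dim)
    lengths: number of valid frames per utterance (assumption: padding is ignored)
    """
    total = 0.0
    count = 0
    for pred_seq, tgt_seq, length in zip(predictions, targets, lengths):
        # Only the first `length` frames of each utterance contribute.
        for pred_frame, tgt_frame in zip(pred_seq[:length], tgt_seq[:length]):
            for p, t in zip(pred_frame, tgt_frame):
                total += abs(p - t)
                count += 1
    return total / count if count else 0.0


predictions = [[[1.0, 2.0], [3.0, 4.0]]]
targets = [[[0.0, 2.0], [0.0, 0.0]]]
loss_all_frames = sketch_l1_loss(predictions, targets, lengths=[2])
loss_first_frame = sketch_l1_loss(predictions, targets, lengths=[1])
```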
compute_logit_length(self, samples)
generate_mpc_mask(self, input_data)

Generate the mask for pretraining.

:param input_data: acoustic features, e.g. f-bank

Returns: mask tensor
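The following sketch shows one plausible way chunk_size and keep_probability could interact when building such a mask. It assumes the time axis is split into consecutive chunks of chunk_size frames and that each chunk is kept with probability keep_probability. This is an assumption for illustration only: the actual generate_mpc_mask may use a different scheme, and sketch_mpc_mask is a hypothetical name.

```python
import random


def sketch_mpc_mask(time_len, chunk_size=1, keep_probability=0.85, seed=None):
    """Return a per-frame mask: 1.0 keeps a frame, 0.0 masks it (hypothetical helper)."""
    rng = random.Random(seed)
    mask = []
    for start in range(0, time_len, chunk_size):
        # One random draw per chunk, so all frames in a chunk share the same fate.
        keep = rng.random() < keep_probability
        # The last chunk may be shorter than chunk_size.
        length = min(chunk_size, time_len - start)
        mask.extend([1.0 if keep else 0.0] * length)
    return mask


# Example: 10 frames, masked in runs of 3 consecutive frames.
mask = sketch_mpc_mask(10, chunk_size=3, keep_probability=0.85, seed=0)
```

Multiplying the input features by such a mask frame by frame would zero out the masked chunks, which the model is then trained to reconstruct.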
prepare_samples(self, samples)

Prepare samples for special data handling. Caution: do not change the shape of samples.