athena.models.masked_pc
An implementation of masked predictive coding (MPC).
Module Contents
Classes
MaskedPredictCoding | Implementation of the MPC pretraining model
-
class athena.models.masked_pc.MaskedPredictCoding(data_descriptions, config=None)
Bases: athena.models.base.BaseModel
Implementation of the MPC pretraining model.
:param num_filters: int, number of filters in the CNN
:param d_model: int, dimension of the model
:param num_heads: number of attention heads in the transformer
:param num_encoder_layers: number of layers in the encoder
:param dff: int, dimension of the feed-forward layer
:param rate: dropout rate
:param chunk_size: number of consecutive frames masked together, e.g. 1 or 3
:param keep_probability: probability that a frame is not masked
:param mode: training mode, e.g. MPC for pretraining
:param max_pool_layers: index of the max-pool layers in the encoder; default is -1
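As a rough illustration of how the parameters above might be supplied, the sketch below builds a plain dictionary keyed by the documented parameter names. It assumes that `config` accepts such a dictionary; the values are illustrative placeholders, not defaults taken from the library, and the constructor call is left commented out because it needs Athena and a dataset.

```python
# Hypothetical configuration sketch. The keys come from the parameter list
# above; the values are illustrative assumptions, not library defaults.
mpc_config = {
    "num_filters": 512,         # number of CNN filters
    "d_model": 512,             # model dimension
    "num_heads": 8,             # attention heads in the transformer
    "num_encoder_layers": 12,   # encoder depth
    "dff": 1280,
    "rate": 0.1,                # dropout rate
    "chunk_size": 1,            # 1 or 3 according to the parameter description
    "keep_probability": 0.8,    # probability that a frame is left unmasked
    "mode": "MPC",              # pretraining mode
    "max_pool_layers": -1,      # documented default
}

# Basic sanity checks on the illustrative values.
assert 0.0 <= mpc_config["keep_probability"] <= 1.0
# Multi-head attention splits d_model evenly across the heads.
assert mpc_config["d_model"] % mpc_config["num_heads"] == 0

# Assumed usage (requires athena and a prepared dataset):
# model = MaskedPredictCoding(data_descriptions, config=mpc_config)
```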
-
default_config
-
call(self, samples, training: bool = None)
Used for training.
:param samples: a dict with the keys 'input', 'input_length', 'output_length', 'output'
input: acoustic features, e.g. f-bank; Tensor of shape (batch, time_len, dim, 1)
Returns:
- MPC outputs to fit acoustic features
- encoder_outputs: Transformer encoder outputs; Tensor of shape (batch, seqlen, dim)
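To make the expected layout of `samples` concrete, here is a minimal sketch that builds a dictionary with the four documented keys using nested Python lists in place of tensors. The sizes, the `shape_of` helper, and the choice to reuse the input features as the output target are assumptions made for illustration only.

```python
def shape_of(nested):
    """Return the shape of a rectangular nested list."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


batch, time_len, dim = 2, 6, 4  # illustrative sizes

# Dummy f-bank-like features with a trailing channel axis of size 1.
features = [
    [[[0.0] for _ in range(dim)] for _ in range(time_len)]
    for _ in range(batch)
]

samples = {
    "input": features,        # (batch, time_len, dim, 1)
    "input_length": [6, 5],   # valid frames per utterance
    "output": features,       # assumption: the target is the input features
    "output_length": [6, 5],
}

print(shape_of(samples["input"]))
```

In real use these entries would be tensors produced by the dataset pipeline rather than Python lists.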
-
get_loss(self, logits, samples, training=None)
Get the MPC loss.
:param logits: MPC output
Returns: MPC L1 loss
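An L1 loss measures the absolute difference between the predicted and the target features. The sketch below is a plain-Python illustration that averages over valid (non-padded) frames. The function name `masked_l1_loss`, the use of lengths to skip padding, and the mean normalization are assumptions; the source does not state how the library normalizes the loss.

```python
def masked_l1_loss(predicted, target, lengths):
    """Mean absolute error over valid (non-padded) frames.

    predicted, target: nested lists of shape (batch, time_len, dim)
    lengths: number of valid frames for each utterance in the batch
    """
    total, count = 0.0, 0
    for pred_utt, tgt_utt, n in zip(predicted, target, lengths):
        # Only the first n frames of each utterance are real data.
        for pred_frame, tgt_frame in zip(pred_utt[:n], tgt_utt[:n]):
            for p, t in zip(pred_frame, tgt_frame):
                total += abs(p - t)
                count += 1
    return total / count if count else 0.0


predicted = [[[1.0, 2.0], [3.0, 4.0]]]
target = [[[0.0, 2.0], [9.0, 9.0]]]

# The second frame is treated as padding and ignored.
loss = masked_l1_loss(predicted, target, lengths=[1])
print(loss)  # 0.5
```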
-
compute_logit_length(self, samples)
-
generate_mpc_mask(self, input_data)
Generate the mask for pretraining.
:param input_data: acoustic features, e.g. f-bank
Returns: mask tensor
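The idea of masking consecutive frames with a keep probability can be sketched in plain Python as follows. This is an illustration of chunk-wise masking under the `chunk_size` and `keep_probability` parameters described for the class, not the library's implementation; the function name `chunked_mask` and the 1 = keep, 0 = masked convention are assumptions.

```python
import random


def chunked_mask(seq_len, chunk_size, keep_probability, rng=random):
    """Return a list of 0/1 flags, one per frame (1 = keep, 0 = masked).

    One random decision is made per chunk, so all frames inside a chunk
    are kept or masked together.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    mask = []
    while len(mask) < seq_len:
        keep = 1 if rng.random() < keep_probability else 0
        mask.extend([keep] * chunk_size)
    # The last chunk may overshoot the sequence; trim it.
    return mask[:seq_len]


rng = random.Random(0)  # seeded for reproducibility
mask = chunked_mask(seq_len=10, chunk_size=3, keep_probability=0.8, rng=rng)
print(mask)
```

Multiplying the acoustic features by such a mask zeroes out the masked frames, which the model is then trained to reconstruct.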
-
prepare_samples(self, samples)
Special data preparation. Caution: do not change the shape of samples.
-