athena.loss
Loss functions used for training Athena models.
Module Contents
Classes
CTCLoss | CTC loss
Seq2SeqSparseCategoricalCrossentropy | Seq2SeqSparseCategoricalCrossentropy loss
MPCLoss | MPC loss
Tacotron2Loss | Tacotron2 loss
GuidedAttentionLoss |
GuidedMultiHeadAttentionLoss | Guided multi-head attention loss function module for multi-head attention
FastSpeechLoss | Used for training FastSpeech
SoftmaxLoss | Softmax loss
AMSoftmaxLoss | Additive Margin Softmax loss
AAMSoftmaxLoss | Additive Angular Margin Softmax loss
ProtoLoss | Prototypical loss
AngleProtoLoss | Angular Prototypical loss
GE2ELoss | Generalized End-to-end loss

class athena.loss.CTCLoss(logits_time_major=False, blank_index=-1, name='CTCLoss')
    Bases: tensorflow.keras.losses.Loss

    CTC loss, implemented with TensorFlow.

    __call__(self, logits, samples, logit_length=None)

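As a rough illustration of what a CTC loss computes, the sketch below implements the standard CTC forward recursion in NumPy for a single utterance. It is not Athena's implementation, which relies on TensorFlow; the function name `ctc_neg_log_likelihood` is hypothetical, and placing the blank at the last class index is an assumption made to mirror `blank_index=-1`.

```python
import numpy as np


def ctc_neg_log_likelihood(log_probs, labels, blank):
    """Negative log-likelihood of `labels` under CTC (illustrative sketch).

    log_probs: [T, C] per-frame log-softmax outputs.
    labels:    list of class indices without blanks.
    blank:     index of the blank class (assumed here to be C - 1).
    """
    num_frames = log_probs.shape[0]
    # Interleave blanks: [blank, l1, blank, l2, ..., blank].
    ext = [blank]
    for label in labels:
        ext += [label, blank]
    num_states = len(ext)

    alpha = np.full((num_frames, num_states), -np.inf)
    alpha[0, 0] = log_probs[0, ext[0]]
    if num_states > 1:
        alpha[0, 1] = log_probs[0, ext[1]]

    for t in range(1, num_frames):
        for s in range(num_states):
            candidates = [alpha[t - 1, s]]
            if s >= 1:
                candidates.append(alpha[t - 1, s - 1])
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                candidates.append(alpha[t - 1, s - 2])
            alpha[t, s] = np.logaddexp.reduce(candidates) + log_probs[t, ext[s]]

    # Valid paths end in the last label or the trailing blank.
    tail = [alpha[num_frames - 1, num_states - 1]]
    if num_states > 1:
        tail.append(alpha[num_frames - 1, num_states - 2])
    return -np.logaddexp.reduce(tail)


# Uniform distribution over 3 classes (2 labels + blank) for 2 frames.
uniform = np.log(np.full((2, 3), 1.0 / 3.0))
loss = ctc_neg_log_likelihood(uniform, labels=[0], blank=2)
# Three of the nine paths ("aa", "a-", "-a") collapse to "a", so loss = log(3).
```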
class athena.loss.Seq2SeqSparseCategoricalCrossentropy(num_classes, eos=-1, by_token=False, by_sequence=True, from_logits=True, label_smoothing=0.0)
    Bases: tensorflow.keras.losses.CategoricalCrossentropy

    Seq2SeqSparseCategoricalCrossentropy loss: categorical cross-entropy calculated at each character for each sequence in a batch.

    __call__(self, logits, samples, logit_length=None)

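The following NumPy sketch shows one way a per-character cross-entropy over padded sequences can be computed. It is an illustration, not Athena's code: the function name, the explicit `lengths` argument, and the reading of `by_token` (normalize by the number of valid tokens) versus the default (normalize by the number of sequences) are assumptions. Label smoothing follows the Keras convention of mixing the one-hot target with a uniform distribution.

```python
import numpy as np


def seq2seq_cross_entropy(logits, labels, lengths, label_smoothing=0.0, by_token=False):
    """Masked token-level cross-entropy (illustrative sketch).

    logits:  [B, T, C] unnormalized scores.
    labels:  [B, T] integer class indices.
    lengths: [B] number of valid steps in each sequence (assumed input).
    """
    num_classes = logits.shape[-1]
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # One-hot targets, optionally smoothed toward the uniform distribution.
    targets = np.eye(num_classes)[labels]
    targets = targets * (1.0 - label_smoothing) + label_smoothing / num_classes
    per_token = -(targets * log_probs).sum(axis=-1)  # [B, T]
    # Ignore positions past the end of each sequence.
    mask = np.arange(labels.shape[1])[None, :] < np.asarray(lengths)[:, None]
    total = (per_token * mask).sum()
    denom = mask.sum() if by_token else labels.shape[0]
    return total / denom


example_logits = np.zeros((2, 3, 4))
example_labels = np.zeros((2, 3), dtype=int)
per_token_loss = seq2seq_cross_entropy(example_logits, example_labels, [3, 1], by_token=True)
per_sequence_loss = seq2seq_cross_entropy(example_logits, example_labels, [3, 1])
```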
class athena.loss.MPCLoss(name='MPCLoss')
    Bases: tensorflow.keras.losses.Loss

    MPC loss: L1 loss over the masked acoustic features in a batch.

    __call__(self, logits, samples, logit_length=None)

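A minimal NumPy sketch of an L1 loss restricted to masked frames is shown below. The function name `masked_l1_loss`, the explicit `mask` argument, and the averaging over masked elements are assumptions for illustration; how Athena derives the mask and normalizes the loss is not stated here.

```python
import numpy as np


def masked_l1_loss(predictions, targets, mask):
    """Mean absolute error over masked frames only (illustrative sketch).

    predictions, targets: [B, T, D] acoustic features.
    mask:                 [B, T], 1 where the input frame was masked.
    """
    mask = np.asarray(mask, dtype=float)[..., None]  # broadcast over D
    abs_err = np.abs(predictions - targets) * mask
    num_elements = mask.sum() * predictions.shape[-1]
    # Guard against an all-zero mask.
    return abs_err.sum() / max(num_elements, 1.0)


preds = np.ones((1, 2, 3))
feats = np.zeros((1, 2, 3))
mpc_loss = masked_l1_loss(preds, feats, mask=[[1, 0]])
```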
class athena.loss.Tacotron2Loss(model, guided_attn_loss_function, regularization_weight=0.0, l1_loss_weight=0.0, mask_decoder=False, pos_weight=1.0, name='Tacotron2Loss')
    Bases: tensorflow.keras.losses.Loss

    Tacotron2 loss.

    __call__(self, outputs, samples, logit_length=None)
        Parameters: outputs – contains the elements below:
            att_ws_stack: shape: [batch, y_steps, x_steps]

class athena.loss.GuidedAttentionLoss(guided_attn_weight, reduction_factor, attn_sigma=0.4, name='GuidedAttentionLoss')
    Bases: tensorflow.keras.losses.Loss

    __call__(self, att_ws_stack, samples)

    _create_attention_masks(self, input_length, output_length)
        Masks created by attention location.

        Parameters:
        - input_length – shape: [batch_size]
        - output_length – shape: [batch_size]
        Returns: masks, shape: [batch_size, 1, y_steps, x_steps]

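For intuition, the sketch below builds a guided attention penalty matrix for a single utterance using the commonly used form 1 - exp(-(y/Y - x/X)^2 / (2 * sigma^2)), which is near zero along the diagonal and grows away from it. That Athena's `_create_attention_masks` uses exactly this formula is an assumption; the function name `guided_attention_mask` is hypothetical, and `sigma` is set to match the `attn_sigma=0.4` default.

```python
import numpy as np


def guided_attention_mask(input_length, output_length, sigma=0.4):
    """Return a [output_length, input_length] penalty matrix (illustrative sketch)."""
    # Relative decoder positions, shape [y_steps, 1].
    y = np.arange(output_length)[:, None] / output_length
    # Relative encoder positions, shape [1, x_steps].
    x = np.arange(input_length)[None, :] / input_length
    # Small penalty near the diagonal, approaching 1 far from it.
    return 1.0 - np.exp(-((y - x) ** 2) / (2.0 * sigma ** 2))


penalty = guided_attention_mask(input_length=3, output_length=6)
```

Multiplying attention weights by such a matrix and averaging penalizes alignments that stray from a roughly monotonic, diagonal path.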
    _create_length_masks(self, input_length, output_length)
        Masks created by input and output length.

        Parameters:
        - input_length – shape: [batch_size]
        - output_length – shape: [batch_size]
        Returns: masks, shape: [batch_size, 1, output_length, input_length]

        Examples:
            output_length: [6, 8]
            input_length: [3, 5]
            masks:
                [[[1, 1, 1, 0, 0],
                  [1, 1, 1, 0, 0],
                  [1, 1, 1, 0, 0],
                  [1, 1, 1, 0, 0],
                  [1, 1, 1, 0, 0],
                  [1, 1, 1, 0, 0],
                  [0, 0, 0, 0, 0],
                  [0, 0, 0, 0, 0]],
                 [[1, 1, 1, 1, 1],
                  [1, 1, 1, 1, 1],
                  [1, 1, 1, 1, 1],
                  [1, 1, 1, 1, 1],
                  [1, 1, 1, 1, 1],
                  [1, 1, 1, 1, 1],
                  [1, 1, 1, 1, 1],
                  [1, 1, 1, 1, 1]]]

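The length-mask example above can be reproduced with a short NumPy sketch. This is an illustration rather than Athena's TensorFlow code; the function name `length_masks` is hypothetical, and the singleton axis is added to match the documented return shape.

```python
import numpy as np


def length_masks(input_length, output_length):
    """Build [batch_size, 1, y_steps, x_steps] masks from lengths (illustrative sketch)."""
    input_length = np.asarray(input_length)
    output_length = np.asarray(output_length)
    x_steps, y_steps = input_length.max(), output_length.max()
    in_mask = np.arange(x_steps)[None, :] < input_length[:, None]    # [B, x_steps]
    out_mask = np.arange(y_steps)[None, :] < output_length[:, None]  # [B, y_steps]
    # A cell is valid only if both its output step and input step are valid.
    masks = out_mask[:, :, None] & in_mask[:, None, :]               # [B, y_steps, x_steps]
    return masks[:, None, :, :].astype(int)


masks = length_masks(input_length=[3, 5], output_length=[6, 8])
# masks[0, 0] has ones in its first 6 rows and first 3 columns; masks[1, 0] is all ones.
```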
class athena.loss.GuidedMultiHeadAttentionLoss(guided_attn_weight, reduction_factor, attn_sigma=0.4, num_heads=2, num_layers=2, name='GuidedMultiHeadAttentionLoss')
    Bases: athena.loss.GuidedAttentionLoss

    Guided multi-head attention loss function module for multi-head attention.

    __call__(self, att_ws_stack, samples)

class athena.loss.FastSpeechLoss(duration_predictor_loss_weight, eps=1.0, use_mask=True, teacher_guide=False)
    Bases: tensorflow.keras.losses.Loss

    Used for training FastSpeech.

    __call__(self, outputs, samples)
        The duration loss is calculated on the log values of the durations to make them closer to Gaussian.

        Parameters:
        - outputs – contains the following elements:
            before_outs: outputs before postnet, shape: [batch, y_steps, feat_dim]
            teacher_outs: teacher outputs, shape: [batch, y_steps, feat_dim]
            after_outs: outputs after postnet, shape: [batch, y_steps, feat_dim]
            duration_sequences: duration predictions from teacher model, shape: [batch, x_steps]
            pred_duration_sequences: duration predictions from trained predictor, shape: [batch, x_steps]
        - samples – samples from dataset

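The sketch below illustrates a duration loss computed in the log domain. Several details are assumptions and not taken from Athena: the function name `log_duration_loss`, the use of squared error, the predictor emitting log-durations directly, and `eps` acting as an offset inside the logarithm so that zero durations are handled.

```python
import math

import numpy as np


def log_duration_loss(pred_log_durations, durations, eps=1.0, mask=None):
    """Squared error between predicted and target log-durations (illustrative sketch).

    pred_log_durations: [B, x_steps] predictions, assumed to be in the log domain.
    durations:          [B, x_steps] target durations in frames.
    mask:               optional [B, x_steps], 1 for valid positions.
    """
    targets = np.log(np.asarray(durations, dtype=float) + eps)
    sq_err = (np.asarray(pred_log_durations, dtype=float) - targets) ** 2
    if mask is None:
        return sq_err.mean()
    mask = np.asarray(mask, dtype=float)
    return (sq_err * mask).sum() / max(mask.sum(), 1.0)


target_durations = np.array([[0.0, math.e - 1.0]])
perfect = log_duration_loss(np.array([[0.0, 1.0]]), target_durations)
off_by_one = log_duration_loss(np.array([[0.0, 0.0]]), target_durations)
```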
class athena.loss.SoftmaxLoss(embedding_size, num_classes, name='SoftmaxLoss')
    Bases: tensorflow.keras.losses.Loss

    Softmax loss. Similar to the implementation at https://github.com/clovaai/voxceleb_trainer.

    __call__(self, outputs, samples, logit_length=None)

class athena.loss.AMSoftmaxLoss(embedding_size, num_classes, m=0.3, s=15, name='AMSoftmaxLoss')
    Bases: tensorflow.keras.losses.Loss

    Additive Margin Softmax loss. See the papers “CosFace: Large Margin Cosine Loss for Deep Face Recognition” and “In defence of metric learning for speaker recognition”. Similar to the implementation at https://github.com/clovaai/voxceleb_trainer.

    __call__(self, outputs, samples, logit_length=None)

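As an illustration of the additive margin idea from CosFace, the NumPy sketch below subtracts the margin `m` from the target-class cosine and scales by `s` before a softmax cross-entropy. The function name `am_softmax_loss` and the explicit `weights` argument are hypothetical; in the class above the weight matrix would presumably be an internal variable sized by `embedding_size` and `num_classes`.

```python
import numpy as np


def am_softmax_loss(embeddings, weights, labels, m=0.3, s=15.0):
    """Additive margin softmax loss (illustrative sketch).

    embeddings: [B, E] speaker embeddings.
    weights:    [E, C] class weight matrix (assumed input).
    labels:     [B] integer class indices.
    """
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=0, keepdims=True)
    cos = emb @ w                                   # [B, C] cosine similarities
    one_hot = np.eye(w.shape[1])[labels]
    # Subtract the margin from the target-class cosine only, then scale.
    logits = s * (cos - m * one_hot)
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(one_hot * log_probs).sum(axis=1).mean()


emb = np.array([[1.0, 0.0]])
class_weights = np.eye(2)
target = np.array([0])
loss_with_margin = am_softmax_loss(emb, class_weights, target, m=0.3, s=15.0)
loss_no_margin = am_softmax_loss(emb, class_weights, target, m=0.0, s=15.0)
```

With the margin applied, the same embedding yields a larger loss, which pushes training toward a wider gap between the target cosine and the others.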
class athena.loss.AAMSoftmaxLoss(embedding_size, num_classes, m=0.3, s=15, easy_margin=False, name='AAMSoftmaxLoss')
    Bases: tensorflow.keras.losses.Loss

    Additive Angular Margin Softmax loss. See the papers “ArcFace: Additive Angular Margin Loss for Deep Face Recognition” and “In defence of metric learning for speaker recognition”. Similar to the implementation at https://github.com/clovaai/voxceleb_trainer.

    __call__(self, outputs, samples, logit_length=None)

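The sketch below shows how an additive angular margin can be applied to cosine similarities, following the form common in public ArcFace implementations: the target logit becomes cos(theta + m), with a fallback when theta + m would exceed pi. The function name `aam_softmax_logits` is hypothetical, and whether Athena handles `easy_margin` in exactly this way is an assumption.

```python
import math

import numpy as np


def aam_softmax_logits(cos, labels, m=0.3, s=15.0, easy_margin=False):
    """Scaled logits with an additive angular margin (illustrative sketch).

    cos:    [B, C] cosine similarities between embeddings and class weights.
    labels: [B] integer class indices.
    """
    sin = np.sqrt(np.clip(1.0 - cos ** 2, 0.0, 1.0))
    # cos(theta + m) via the angle-addition identity.
    phi = cos * math.cos(m) - sin * math.sin(m)
    if easy_margin:
        # Apply the margin only when the angle is below 90 degrees.
        phi = np.where(cos > 0, phi, cos)
    else:
        # Keep the target logit monotonic when theta + m would exceed pi.
        threshold = math.cos(math.pi - m)
        fallback = math.sin(math.pi - m) * m
        phi = np.where(cos > threshold, phi, cos - fallback)
    one_hot = np.eye(cos.shape[1])[labels]
    return s * (one_hot * phi + (1.0 - one_hot) * cos)


cosines = np.array([[math.cos(0.5), 0.2]])
aam_logits = aam_softmax_logits(cosines, np.array([0]), m=0.3, s=1.0)
# The target logit equals cos(0.5 + 0.3); the non-target logit is unchanged.
```

The resulting logits would then be passed to an ordinary softmax cross-entropy.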
class athena.loss.ProtoLoss(name='ProtoLoss')
    Bases: tensorflow.keras.losses.Loss

    Prototypical loss. See the papers “Prototypical Networks for Few-shot Learning” and “In defence of metric learning for speaker recognition”. Similar to the implementation at https://github.com/clovaai/voxceleb_trainer.

    __call__(self, outputs, samples=None, logit_length=None)
        Parameters: outputs – [batch_size, num_speaker_utts, embedding_size]

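A minimal NumPy sketch of a prototypical loss over the `[batch_size, num_speaker_utts, embedding_size]` layout is given below. It assumes, as in the referenced voxceleb_trainer code, that each row holds utterances from one speaker, the first utterance is the query, and the mean of the rest is the prototype; the function name is hypothetical and Athena's details may differ.

```python
import numpy as np


def prototypical_loss(outputs):
    """Prototypical loss with squared Euclidean distance (illustrative sketch).

    outputs: [batch_size, num_speaker_utts, embedding_size], one speaker per row.
    """
    queries = outputs[:, 0, :]                    # [B, E]
    prototypes = outputs[:, 1:, :].mean(axis=1)   # [B, E]
    diff = queries[:, None, :] - prototypes[None, :, :]
    logits = -(diff ** 2).sum(axis=-1)            # [B, B] negative squared distances
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Each query should be closest to its own speaker's prototype.
    return -np.mean(np.diag(log_probs))


separated = np.array([[[0.0, 0.0], [0.0, 0.0]],
                      [[10.0, 0.0], [10.0, 0.0]]])
confused = np.array([[[10.0, 0.0], [0.0, 0.0]],
                     [[0.0, 0.0], [10.0, 0.0]]])
good_loss = prototypical_loss(separated)
bad_loss = prototypical_loss(confused)
```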
class athena.loss.AngleProtoLoss(init_w=10.0, init_b=-5.0, name='AngleProtoLoss')
    Bases: tensorflow.keras.losses.Loss

    Angular Prototypical loss. See the paper “In defence of metric learning for speaker recognition”. Similar to the implementation at https://github.com/clovaai/voxceleb_trainer.

    __call__(self, outputs, samples=None, logit_length=None)
        Parameters: outputs – [batch_size, num_speaker_utts, embedding_size]

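The angular variant can be sketched by replacing the distance with a cosine similarity that is scaled by `w` and shifted by `b`. In the class above these presumably start at `init_w` and `init_b` and are learned; here they are fixed arguments, and the function name and the query/prototype split are assumptions for illustration.

```python
import numpy as np


def angular_prototypical_loss(outputs, w=10.0, b=-5.0):
    """Angular prototypical loss with fixed scale and bias (illustrative sketch).

    outputs: [batch_size, num_speaker_utts, embedding_size], one speaker per row.
    """
    queries = outputs[:, 0, :]
    prototypes = outputs[:, 1:, :].mean(axis=1)
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    logits = w * (q @ p.T) + b                    # [B, B] scaled cosine similarities
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))


orthogonal_speakers = np.array([[[1.0, 0.0], [1.0, 0.0]],
                                [[0.0, 1.0], [0.0, 1.0]]])
angle_loss = angular_prototypical_loss(orthogonal_speakers)
```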
class athena.loss.GE2ELoss(init_w=10.0, init_b=-5.0, name='GE2ELoss')
    Bases: tensorflow.keras.losses.Loss

    Generalized End-to-end loss. See the papers “Generalized End-to-end Loss for Speaker Verification” and “In defence of metric learning for speaker recognition”. Similar to the implementation at https://github.com/clovaai/voxceleb_trainer.

    __call__(self, outputs, samples=None, logit_length=None)
        Parameters: outputs – [batch_size, num_speaker_utts, embedding_size]