athena.tools.ctc_scorer

CTC scorer used in joint CTC/attention decoding.

Module Contents

Classes

CTCPrefixScorer CTC one-pass decoding, based on “HYBRID CTC/ATTENTION ARCHITECTURE FOR END-TO-END SPEECH RECOGNITION”
class athena.tools.ctc_scorer.CTCPrefixScorer(eos, ctc_beam, num_classes, blank=-1, ctc_weight=0.25)

CTC one-pass decoding. The algorithm is based on “HYBRID CTC/ATTENTION ARCHITECTURE FOR END-TO-END SPEECH RECOGNITION”.

initial_state(self, init_cand_states, x)

Initialize states and add init_state and init_score to init_cand_states.

Parameters:
  • init_cand_states – CandidateHolder.cand_states
  • x – log softmax value from ctc_logits, shape: (beam, T, num_classes)

Returns: init_cand_states
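As an illustration of what an initial state can look like, here is a minimal pure-Python sketch of the empty-prefix state from the prefix-scoring algorithm in the cited paper. The helper name `empty_prefix_state` and the list-of-pairs layout are assumptions made for this example; they are not Athena's internal representation. For an empty prefix, the non-blank entry is impossible at every frame and the blank entry is the running sum of blank log-probabilities.

```python
import math

NEG_INF = float("-inf")


def empty_prefix_state(log_probs, blank):
    """Return one [r_nonblank, r_blank] pair per frame for the empty prefix.

    log_probs: T rows, each holding num_classes log-probabilities
               (one sequence of the log softmax output, not a whole beam).
    blank:     index of the blank class (hypothetical helper argument).
    """
    state = []
    running_blank = 0.0
    for frame in log_probs:
        # The empty prefix survives only while every frame emits blank.
        running_blank += frame[blank]
        state.append([NEG_INF, running_blank])
    return state


# Two frames, two classes: index 0 is blank, index 1 is the label "a".
probs = [[0.6, 0.4], [0.5, 0.5]]
log_probs = [[math.log(p) for p in frame] for frame in probs]
state = empty_prefix_state(log_probs, blank=0)
print(math.exp(state[1][1]))  # probability of blank at both frames, about 0.3
```

The class signature defaults to blank=-1; in this sketch a negative index simply selects the last class of each row.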

score(self, candidate_holder, new_scores)

Compute the CTC one-pass decoding score from the logits of the CTC module. The scoring function shares a common interface.

Parameters:
  • candidate_holder – CandidateHolder
  • new_scores – the score from other models

Returns:
  • ctc_score_result – shape: (beam, num_classes)
  • cand_states – the updated CandidateHolder.cand_states
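The roles of ctc_beam and ctc_weight can be sketched as follows. This is an assumption-laden illustration, not Athena's code: it assumes the top ctc_beam candidates are pre-selected from the other models' scores, and that the scores are combined by the log-linear interpolation described in the cited paper. The helper names `top_candidates` and `joint_score` are hypothetical.

```python
def top_candidates(scores, ctc_beam):
    """Indices of the ctc_beam highest entries in scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:ctc_beam]


def joint_score(att_logp, ctc_logp, ctc_weight):
    """Interpolate attention and CTC log-probabilities (assumed form)."""
    return ctc_weight * ctc_logp + (1.0 - ctc_weight) * att_logp


# Scores from another model for four classes (made-up numbers).
new_scores = [-2.3, -0.4, -1.2, -3.0]
cs = top_candidates(new_scores, ctc_beam=2)
print(cs)  # [1, 2]
print(joint_score(att_logp=-1.0, ctc_logp=-2.0, ctc_weight=0.25))  # -1.25
```

Restricting CTC scoring to a few candidates keeps the per-step cost proportional to ctc_beam rather than num_classes.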

cand_score(self, y, cs, r_prev)

  • r – the probability of the output seq containing the predicted label given the current input seqs, shape: [input_length, 2, ctc_beam]
  • r[:, 0] – the prediction of the t-th frame is not blank
  • r[:, 1] – the prediction of the t-th frame is blank
  • log_phi – the probability that the last predicted label is not created by the t-th frame
  • log_psi – the sum of all log_phi’s, the prefix probability, shape: [ctc_beam]

Parameters:
  • y – cand_seq
  • cs – top_ctc_candidates
  • r_prev – ctc_pre_state
Returns:
  • log_psi – ctc_score
  • new_state
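To make the quantities r, log_phi and log_psi concrete, the following is a simplified pure-Python sketch of the prefix-score recursion from the cited paper, for a single candidate label rather than a batch of ctc_beam candidates. The function `prefix_score`, its arguments, and the list-based state layout are assumptions for illustration; Athena's cand_score may differ in details.

```python
import math

NEG_INF = float("-inf")


def logaddexp(a, b):
    """Stable log(exp(a) + exp(b)) that tolerates -inf inputs."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def prefix_score(log_probs, prefix, c, r_prev, blank, eos):
    """Score prefix + [c]; returns (log_psi, r) for the extended prefix.

    log_probs: T rows of per-class log-probabilities.
    r_prev:    T pairs [r_nonblank, r_blank] belonging to `prefix`.
    """
    if c == eos:
        # Ending here: total probability of `prefix` at the last frame.
        return logaddexp(r_prev[-1][0], r_prev[-1][1]), r_prev

    # log_phi[t]: `prefix` is complete at frame t and a following c is new.
    # A repeated label needs a blank in between, so only r_blank counts.
    repeat = bool(prefix) and prefix[-1] == c
    log_phi = [rb if repeat else logaddexp(rn, rb) for rn, rb in r_prev]

    # Frame 0 can emit c only when the prefix is empty.
    r = [[NEG_INF if prefix else log_probs[0][c], NEG_INF]]
    log_psi = r[0][0]
    for t in range(1, len(log_probs)):
        rn = logaddexp(r[t - 1][0], log_phi[t - 1]) + log_probs[t][c]
        rb = logaddexp(r[t - 1][0], r[t - 1][1]) + log_probs[t][blank]
        r.append([rn, rb])
        log_psi = logaddexp(log_psi, log_phi[t - 1] + log_probs[t][c])
    return log_psi, r


# Two frames; classes: 0 = blank, 1 = "a", 2 = eos (never emitted here).
probs = [[0.6, 0.4, 0.0], [0.5, 0.5, 0.0]]
log_probs = [[math.log(p) if p > 0 else NEG_INF for p in row] for row in probs]

# State of the empty prefix: cumulative blank log-probability per frame.
r_empty, acc = [], 0.0
for row in log_probs:
    acc += row[0]
    r_empty.append([NEG_INF, acc])

log_psi_a, r_a = prefix_score(log_probs, [], 1, r_empty, blank=0, eos=2)
log_psi_aa, _ = prefix_score(log_probs, [1], 1, r_a, blank=0, eos=2)
log_p_end, _ = prefix_score(log_probs, [1], 2, r_a, blank=0, eos=2)
print(math.exp(log_psi_a))   # about 0.7: every path except blank, blank
print(math.exp(log_psi_aa))  # 0.0: "aa" needs at least three frames
```

In this toy case every path except (blank, blank) produces an output starting with "a", which is why the prefix probability is 1 - 0.6 × 0.5 = 0.7.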