athena.tools.ctc_scorer
CTC scorer used in joint CTC decoding.
Module Contents
Classes
CTCPrefixScorer: CTC one-pass decoding, based on “Hybrid CTC/Attention Architecture for End-to-End Speech Recognition”
-
class athena.tools.ctc_scorer.CTCPrefixScorer(eos, ctc_beam, num_classes, blank=-1, ctc_weight=0.25)
CTC one-pass decoding. The algorithm is based on “Hybrid CTC/Attention Architecture for End-to-End Speech Recognition”.
-
initial_state(self, init_cand_states, x)
Initialize the states and add init_state and init_score to init_cand_states.
Parameters: - init_cand_states – CandidateHolder.cand_states
- x – log-softmax values computed from ctc_logits, shape: (beam, T, num_classes)
Returns: init_cand_states
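The initial state can be illustrated with a small NumPy sketch. It follows the usual CTC prefix-scoring convention, in which the empty prefix can only be produced by emitting blanks, and it is not taken from the Athena source: the helper name `init_ctc_state`, the `LOG_ZERO` constant, and the (beam, T, 2) state layout are assumptions made for illustration.

```python
import numpy as np

LOG_ZERO = -1e10  # finite stand-in for log(0); an assumption of this sketch


def init_ctc_state(x, blank=-1):
    """Hypothetical helper, not the Athena API.

    x: (beam, T, num_classes) log-softmax of the CTC logits.
    Returns r of shape (beam, T, 2) for the empty prefix:
    r[..., 0] is the non-blank state, r[..., 1] is the blank state.
    """
    beam, num_frames, _ = x.shape
    r = np.full((beam, num_frames, 2), LOG_ZERO)
    # The empty prefix is only reachable by emitting blank on every frame
    # so far, so the blank state is a running sum of blank log-probs.
    r[:, :, 1] = np.cumsum(x[:, :, blank], axis=1)
    return r
```

With `blank=-1`, NumPy indexing selects the last class, which matches the default in the class signature.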
-
score(self, candidate_holder, new_scores)
Compute the CTC one-pass decoding score from the logits of the CTC module. The scoring function shares a common interface.
Parameters: - candidate_holder – CandidateHolder
- new_scores – the scores from other models
Returns: - ctc_score_result – shape: (beam, num_classes)
- cand_states – the updated CandidateHolder.cand_states
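How the scores from other models might be combined with the CTC score is sketched below for a single hypothesis. The sketch assumes the log-linear interpolation described in the hybrid CTC/attention paper and a pre-selection of the `ctc_beam` best labels. Whether Athena combines the scores in exactly this way is not stated here, and `joint_rescore` and `ctc_delta_fn` are hypothetical names.

```python
import numpy as np


def joint_rescore(new_scores, ctc_delta_fn, ctc_beam, ctc_weight=0.25):
    """Hypothetical helper, not the Athena API.

    new_scores:   (num_classes,) next-label log-probs from another model
    ctc_delta_fn: callable mapping candidate ids (K,) to CTC log-score gains (K,)
    Returns the kept candidate ids and their combined scores.
    """
    # Keep only the ctc_beam best labels so that CTC scoring stays cheap.
    cand_ids = np.argsort(-new_scores)[:ctc_beam]
    ctc_delta = ctc_delta_fn(cand_ids)
    # Log-linear interpolation (assumed form) of the two scores.
    combined = (1.0 - ctc_weight) * new_scores[cand_ids] + ctc_weight * ctc_delta
    return cand_ids, combined
```

In this form, only `ctc_beam` labels per hypothesis need a CTC prefix score, which is the reason for limiting the candidate set before scoring.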
-
cand_score(self, y, cs, r_prev)
- r: the probability of the output sequence containing the predicted label, given the current input sequence, shape: [input_length, 2, ctc_beam]
- r[:, 0]: the prediction of the t-th frame is not blank
- r[:, 1]: the prediction of the t-th frame is blank
- log_phi: the probability that the last predicted label is not created by the t-th frame
- log_psi: the sum of all log_phi’s, the prefix probability, shape: [ctc_beam]
Parameters: - y – cand_seq
- cs – top_ctc_candidates
- r_prev – ctc_pre_state
Returns: - log_psi – ctc_score
- new_state
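The recursion over r, log_phi and log_psi can be sketched in NumPy for a single hypothesis. This is a minimal illustration of CTC prefix scoring as described in the hybrid CTC/attention paper, not the Athena implementation: the function name `prefix_score`, the `LOG_ZERO` constant, the unbatched shapes, and the requirement that `blank` be a non-negative index are all assumptions of the sketch.

```python
import numpy as np

LOG_ZERO = -1e10  # finite stand-in for log(0); an assumption of this sketch


def prefix_score(x, y, cs, r_prev, blank, eos):
    """Hypothetical single-hypothesis helper, not the Athena API.

    x:      (T, num_classes) log-softmax of the CTC logits
    y:      current prefix as a list of label ids, starting with <sos>
    cs:     (K,) integer array of candidate next labels
    r_prev: (T, 2) state of prefix y (column 0: non-blank, column 1: blank)
    Returns log_psi of shape (K,) and the new states r of shape (T, 2, K).
    """
    num_frames = x.shape[0]
    n_out = len(y) - 1                    # prefix length without <sos>
    xs = x[:, cs]                         # (T, K) candidate log-probs per frame
    r = np.full((num_frames, 2, len(cs)), LOG_ZERO)
    if n_out == 0:
        r[0, 0] = xs[0]                   # first label emitted on frame 0

    # Log-probability of prefix y over frames 0..t, in either state.
    r_sum = np.logaddexp(r_prev[:, 0], r_prev[:, 1])
    log_phi = np.repeat(r_sum[:, None], len(cs), axis=1)
    if n_out > 0:
        # A repeated label needs a blank in between, so only the blank
        # state of the prefix may precede it.
        log_phi = np.where(cs[None, :] == y[-1], r_prev[:, 1:2], log_phi)

    start = max(n_out, 1)
    log_psi = r[start - 1, 0].copy()
    for t in range(start, num_frames):
        r[t, 0] = np.logaddexp(r[t - 1, 0], log_phi[t - 1]) + xs[t]
        r[t, 1] = np.logaddexp(r[t - 1, 0], r[t - 1, 1]) + x[t, blank]
        log_psi = np.logaddexp(log_psi, log_phi[t - 1] + xs[t])

    log_psi[cs == eos] = r_sum[-1]        # <eos>: prefix y is the full output
    log_psi[cs == blank] = LOG_ZERO       # blank is never a valid next label
    return log_psi, r
```

To extend a hypothesis by candidate `cs[k]`, the slice `r[:, :, k]` would be passed as `r_prev` on the next call, with `cs[k]` appended to `y`.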
-