k2
add_epsilon_self_loops
- k2.add_epsilon_self_loops(fsa, ret_arc_map=False)[source]
Add epsilon self-loops to an Fsa or FsaVec.
This is required when composing using a composition method that does not treat epsilons specially, if the other FSA has epsilons in it.
- Parameters
  - fsa (Fsa) – The input FSA. It can be either a single FSA or an FsaVec.
  - ret_arc_map (bool) – If False, return the resulting Fsa. If True, return an extra arc map.
- Return type
  Union[Fsa, Tuple[Fsa, Tensor]]
- Returns
If ret_arc_map is False, return an instance of
Fsa
that has an epsilon self-loop on every non-final state. If ret_arc_map is True, it returns an extra arc_map. arc_map[i] is the arc index in the input fsa that corresponds to the i-th arc in the resulting Fsa. arc_map[i] is -1 if the i-th arc in the resulting Fsa has no counterpart in the input fsa.
arc_sort
- k2.arc_sort(fsa, ret_arc_map=False)[source]
Sort arcs of every state.
Note
Arcs are sorted by labels first, and then by dest states.
Caution
If the input fsa is already arc sorted, we return it directly. Otherwise, a new sorted fsa is returned.
- Parameters
  - fsa (Fsa) – The input FSA.
  - ret_arc_map (bool) – True to return an extra arc_map (a 1-D tensor with dtype torch.int32). arc_map[i] is the arc index in the input fsa that corresponds to the i-th arc in the output Fsa.
- Return type
  Union[Fsa, Tuple[Fsa, Tensor]]
- Returns
If ret_arc_map is False, return the sorted FSA. It is the same as the input fsa if the input fsa is arc sorted. Otherwise, a new sorted fsa is returned and the input fsa is NOT modified. If ret_arc_map is True, an extra arc map is also returned.
Example: Sort a single FSA
#!/usr/bin/env python3

import k2

s = '''
0 1 1 4 0.1
0 1 3 5 0.2
0 1 2 3 0.3
0 2 5 2 0.4
0 2 4 1 0.5
1 2 2 3 0.6
1 2 3 1 0.7
1 2 1 2 0.8
2 3 -1 -1 0.9
3
'''
fsa = k2.Fsa.from_str(s, acceptor=False)
fsa.draw('arc_sort_single_before.svg', title='Before k2.arc_sort')
sorted_fsa = k2.arc_sort(fsa)
sorted_fsa.draw('arc_sort_single_after.svg', title='After k2.arc_sort')

# If you want to sort by aux_labels, you can use
inverted_fsa = k2.invert(fsa)
sorted_fsa_2 = k2.arc_sort(inverted_fsa)
sorted_fsa_2 = k2.invert(sorted_fsa_2)
sorted_fsa_2.draw('arc_sort_single_after_aux_labels.svg',
                  title='After k2.arc_sort by aux_labels')
cat
- k2.cat(srcs)[source]
Concatenate a list of FsaVec into a single FsaVec.
Caution
Only common tensor attributes are kept in the output FsaVec. For non-tensor attributes, only one copy is kept in the output FsaVec: the copy from the FsaVec with the lowest index in srcs.
- Parameters
  - srcs (List[Fsa]) – A list of FsaVec. Each element MUST be an FsaVec.
- Return type
  Fsa
- Returns
Return a single FsaVec concatenated from the input FsaVecs.
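Example
A minimal illustrative sketch, assuming two FsaVecs built from linear FSAs:
import k2

vec1 = k2.create_fsa_vec([k2.linear_fsa([1, 2]), k2.linear_fsa([3])])
vec2 = k2.create_fsa_vec([k2.linear_fsa([4, 5, 6])])
combined = k2.cat([vec1, vec2])
print(combined.shape)  # (3, None, None): three FSAs in a single FsaVec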
closure
compose
- k2.compose(a_fsa, b_fsa, treat_epsilons_specially=True, inner_labels=None)[source]
Compute the composition of two FSAs.
When treat_epsilons_specially is True, this function works only on CPU. When treat_epsilons_specially is False and both a_fsa and b_fsa are on GPU, then this function works on GPU; in this case, the two input FSAs do not need to be arc sorted.
Note
a_fsa.aux_labels is required to be defined and it can be either a torch.Tensor or a ragged tensor of type k2.RaggedTensor. If it is a ragged tensor, then it requires that a_fsa.requires_grad is False.
For both FSAs, the aux_labels attribute is interpreted as output labels, (olabels), and the composition involves matching the olabels of a_fsa with the ilabels of b_fsa. This is implemented by intersecting the inverse of a_fsa (a_fsa_inv) with b_fsa, then replacing the ilabels of the result with the original ilabels on a_fsa which are now the aux_labels of a_fsa_inv. If b_fsa.aux_labels is not defined, b_fsa is treated as an acceptor (as in OpenFST), i.e. its olabels and ilabels are assumed to be the same.
Refer to k2.intersect() for how we assign the attributes of the output FSA.
- Parameters
  - a_fsa (Fsa) – The first input FSA. It can be either a single FSA or an FsaVec.
  - b_fsa (Fsa) – The second input FSA. It can be either a single FSA or an FsaVec.
  - treat_epsilons_specially (bool) – If True, epsilons will be treated as epsilon, meaning epsilon arcs can match with an implicit epsilon self-loop. If False, epsilons will be treated as real, normal symbols (to have them treated as epsilons in this case you may have to add epsilon self-loops to whichever of the inputs is naturally epsilon-free).
  - inner_labels (Optional[str]) – If specified (and if a_fsa has aux_labels), the labels that we matched on, which would normally be discarded, will instead be copied to this attribute name.
Caution
b_fsa has to be arc sorted if the function runs on CPU.
- Return type
Fsa
- Returns
The result of composing a_fsa and b_fsa. len(out_fsa.shape) is 2 if and only if the two input FSAs are single FSAs; otherwise, len(out_fsa.shape) is 3.
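Example
A minimal illustrative sketch on CPU (not from the original k2 examples): a maps 1 -> 2 and b maps 2 -> 3, so the composition maps 1 -> 3; b is arc-sorted as the CPU path requires.
import k2

a = k2.Fsa.from_str('''
0 1 1 2 0.1
1 2 -1 -1 0.2
2
''', acceptor=False)

b = k2.arc_sort(k2.Fsa.from_str('''
0 1 2 3 0.3
1 2 -1 -1 0.4
2
''', acceptor=False))

out = k2.compose(a, b, inner_labels='inner')
print(out.labels)      # ilabels come from a
print(out.inner)       # the matched labels (olabels of a / ilabels of b)
print(out.aux_labels)  # olabels come from b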
compose_arc_maps
- k2.compose_arc_maps(step1_arc_map, step2_arc_map)[source]
Compose arc maps from two Fsa operations.
It implements:
ans_arc_map[i] = step1_arc_map[step2_arc_map[i]] if step2_arc_map[i] is not -1
ans_arc_map[i] = -1 if step2_arc_map[i] is -1
for i in 0 to step2_arc_map.numel() - 1.
- Parameters
  - step1_arc_map (Tensor) – A 1-D tensor with dtype torch.int32 from the first Fsa operation.
  - step2_arc_map (Tensor) – A 1-D tensor with dtype torch.int32 from the second Fsa operation.
- Return type
Tensor
- Returns
Return a 1-D tensor with dtype torch.int32. It has the same number of elements as step2_arc_map. That is, ans_arc_map.shape == step2_arc_map.shape.
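The semantics above can be written as a short pure-PyTorch reference (an illustrative sketch only; the real k2.compose_arc_maps is implemented in C++/CUDA):
import torch

def compose_arc_maps_ref(step1_arc_map: torch.Tensor,
                         step2_arc_map: torch.Tensor) -> torch.Tensor:
    # Entries that are -1 in step2_arc_map stay -1 in the answer.
    ans = torch.full_like(step2_arc_map, -1)
    valid = step2_arc_map != -1
    ans[valid] = step1_arc_map[step2_arc_map[valid].long()]
    return ans

step1 = torch.tensor([10, 11, 12], dtype=torch.int32)
step2 = torch.tensor([2, -1, 0], dtype=torch.int32)
print(compose_arc_maps_ref(step1, step2))  # tensor([12, -1, 10], dtype=torch.int32)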
connect
- k2.connect(fsa)[source]
Connect an FSA.
Removes states that are neither accessible nor co-accessible.
Note
A state is not accessible if it is not reachable from the start state. A state is not co-accessible if it cannot reach the final state.
Caution
If the input FSA is already connected, it is returned directly. Otherwise, a new connected FSA is returned.
- Parameters
  - fsa (Fsa) – The input FSA to be connected.
- Return type
  Fsa
- Returns
An FSA that is connected.
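Example
A minimal illustrative sketch: state 2 below has no path to the final state, so k2.connect removes it together with the arc entering it.
import k2

s = '''
0 1 1 0.1
0 2 2 0.2
1 3 -1 0.3
3
'''
fsa = k2.Fsa.from_str(s)
connected = k2.connect(fsa)
print(fsa.num_arcs, connected.num_arcs)  # 3 2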
convert_dense_to_fsa_vec
create_fsa_vec
- k2.create_fsa_vec(fsas)[source]
Create an FsaVec from a list of FSAs.
We use the following rules to set the attributes of the output FsaVec:
- For tensor attributes, we assume that all input FSAs have the same attribute name and the values are concatenated.
- For non-tensor attributes, if any two of the input FSAs have the same attribute name, then we assume that their attribute values are equal and the output FSA will inherit the attribute.
- Parameters
  - fsas – A list of Fsa. Each element must be a single FSA.
- Returns
An instance of Fsa that represents an FsaVec.
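Example
A minimal illustrative sketch:
import k2

fsa_a = k2.linear_fsa([1, 2, 3])
fsa_b = k2.linear_fsa([4, 5])
fsa_vec = k2.create_fsa_vec([fsa_a, fsa_b])
print(fsa_vec.shape)  # (2, None, None)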
create_sparse
- k2.create_sparse(rows, cols, values, size=None, min_col_index=None)[source]
This is a utility function that creates a (torch) sparse matrix likely intended to represent posteriors. The likely usage is something like (for example):
post = k2.create_sparse(fsa.seqframe, fsa.phones, fsa.get_arc_post(True,True).exp(), min_col_index=1)
(assuming seqframe and phones were integer-valued attributes of fsa).
- Parameters
  - rows (Tensor) – Row indexes of the sparse matrix (a torch.Tensor), which must have values >= 0; likely fsa.seqframe. Must have rows.dim == 1. Will be converted to dtype=torch.long.
  - cols (Tensor) – Column indexes of the sparse matrix, with the same shape as rows. Will be converted to dtype=torch.long.
  - values (Tensor) – Values of the sparse matrix, likely of dtype float or double, with the same shape as rows and cols.
  - size (Optional[Tuple[int, int]]) – Optional. If not None, it is assumed to be a tuple containing (num_frames, highest_phone_plus_one).
  - min_col_index (Optional[int]) – If provided, before the sparse tensor is constructed we will filter out elements with cols[i] < min_col_index. Will likely be 0 or 1, if set. This is necessary if col_indexes may have values less than 0, or if you want to filter out 0 values (e.g. as representing blanks).
- Returns
Returns a torch.Tensor that is sparse with coo (coordinate) format, i.e. layout=torch.sparse_coo (which is actually the only sparse format that torch currently supports).
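Example
A minimal illustrative sketch using plain tensors as stand-ins for fsa.seqframe, fsa.phones and the arc posteriors (building a real lattice is out of scope here):
import torch
import k2

rows = torch.tensor([0, 0, 1, 2], dtype=torch.int32)
cols = torch.tensor([0, 3, 2, 1], dtype=torch.int32)
values = torch.tensor([0.5, 0.2, 0.9, 0.4])
post = k2.create_sparse(rows, cols, values, min_col_index=1)
print(post)  # a sparse COO tensor; the entry with column 0 is filtered out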
ctc_graph
- k2.ctc_graph(symbols, modified=False, device='cpu')[source]
Construct ctc graphs from symbols.
Note
The scores of arcs in the returned FSA are all 0.
- Parameters
symbols (
Union
[List
[List
[int
]],RaggedTensor
]) –It can be one of the following types:
A list of list-of-integers, e..g, [ [1, 2], [1, 2, 3] ]
An instance of
k2.RaggedTensor
. Must have num_axes == 2.
standard – Option to specify the type of CTC topology: “standard” or “simplified”, where the “standard” one makes the blank mandatory between a pair of identical symbols. Default True.
device (
Union
[device
,str
,None
]) – Optional. It can be either a string (e.g., ‘cpu’, ‘cuda:0’) or a torch.device. By default, the returned FSA is on CPU. If symbols is an instance ofk2.RaggedTensor
, the returned FSA will on the same device as k2.RaggedTensor.
- Return type
Fsa
- Returns
An FsaVec containing the returned CTC graphs, with Dim0() equal to len(symbols) (for List[List[int]]) or symbols.dim0 (for k2.RaggedTensor).
Example 1
#!/usr/bin/env python3

import k2

isym = k2.SymbolTable.from_str('''
blk 0
a 1
b 2
c 3
''')

osym = k2.SymbolTable.from_str('''
a 1
b 2
c 3
''')

fsa = k2.ctc_graph([[1, 2, 2, 3]], modified=False)
fsa_modified = k2.ctc_graph([[1, 2, 2, 3]], modified=True)

fsa.labels_sym = isym
fsa.aux_labels_sym = osym

fsa_modified.labels_sym = isym
fsa_modified.aux_labels_sym = osym

# fsa is an FsaVec, so we use fsa[0] to visualize the first Fsa
fsa[0].draw('ctc_graph.svg',
            title='CTC graph for the string "abbc" (modified=False)')
fsa_modified[0].draw('modified_ctc_graph.svg',
                     title='CTC graph for the string "abbc" (modified=True)')
Example 2 Construct a CTC graph using composition
#!/usr/bin/env python3

# Construct a CTC graph by composition

import k2

isym = k2.SymbolTable.from_str('''
blk 0
a 1
b 2
c 3
''')

osym = k2.SymbolTable.from_str('''
a 1
b 2
c 3
''')

linear_fsa = k2.linear_fsa([1, 2, 2, 3])
linear_fsa.labels_sym = isym

ctc_topo = k2.ctc_topo(max_token=3, modified=False)
ctc_topo_modified = k2.ctc_topo(max_token=3, modified=True)

ctc_topo.labels_sym = isym
ctc_topo.aux_labels_sym = osym

ctc_topo_modified.labels_sym = isym
ctc_topo_modified.aux_labels_sym = osym

ctc_graph = k2.compose(ctc_topo, linear_fsa)
ctc_graph_modified = k2.compose(ctc_topo_modified, linear_fsa)

linear_fsa.draw('linear_fsa.svg', title='Linear FSA of the string "abbc"')
ctc_topo.draw('ctc_topo.svg', title='CTC topology')
ctc_topo_modified.draw('ctc_topo_modified.svg', title='Modified CTC topology')

ctc_graph.draw('ctc_topo_compose_linear_fsa.svg',
               title='k2.compose(ctc_topo, linear_fsa)')

ctc_graph_modified.draw('ctc_topo_modified_compose_linear_fsa.svg',
                        title='k2.compose(ctc_topo_modified, linear_fsa)')
ctc_loss
- k2.ctc_loss(decoding_graph, dense_fsa_vec, output_beam=10, delay_penalty=0.0, reduction='sum', use_double_scores=True, target_lengths=None)[source]
Compute the CTC loss given a decoding graph and a dense fsa vector.
- Parameters
  - decoding_graph (Fsa) – An FsaVec. It can be the composition result of a ctc topology and a transcript.
  - dense_fsa_vec (DenseFsaVec) – It represents the neural network output. Refer to the help information in k2.DenseFsaVec.
  - output_beam (float) – Beam to prune output, similar to lattice-beam in Kaldi. Relative to the best path of output.
  - delay_penalty (float) – A constant to penalize symbol delay, which is used to make symbols emit earlier for streaming models. It is almost the same as the delay_penalty in our rnnt_loss; see https://github.com/k2-fsa/k2/issues/955 and https://arxiv.org/pdf/2211.00490.pdf for more details.
  - reduction (Literal['none', 'mean', 'sum']) – Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied. 'mean': divide the output losses by the target lengths and then take the mean over the batch. 'sum': sum the output losses over the batch.
  - use_double_scores (bool) – True to use double precision floating point in computing the total scores. False to use single precision.
  - target_lengths (Optional[Tensor]) – Used only when reduction is 'mean'. It is a 1-D tensor of batch size representing lengths of the targets, e.g., number of phones or number of word pieces in a sentence.
- Return type
Tensor
- Returns
If reduction is none, return a 1-D tensor with size equal to batch size. If reduction is mean or sum, return a scalar.
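Example
A hedged illustrative sketch on random data (the loss value itself is meaningless here); the decoding graph is arc-sorted before use, which is a safe default for intersection-based operations:
import torch
import k2

nnet_output = torch.randn(1, 10, 5, requires_grad=True)
log_probs = nnet_output.log_softmax(dim=-1)
# One supervision segment: sequence 0, start frame 0, duration 10.
supervision_segments = torch.tensor([[0, 0, 10]], dtype=torch.int32)
dense_fsa_vec = k2.DenseFsaVec(log_probs, supervision_segments)

decoding_graph = k2.arc_sort(k2.ctc_graph([[1, 2]]))
loss = k2.ctc_loss(decoding_graph, dense_fsa_vec, output_beam=10)
print(loss)      # a scalar, since reduction defaults to 'sum'
loss.backward()  # gradients flow back to nnet_output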
ctc_topo
- k2.ctc_topo(max_token, modified=False, device=None)[source]
Create a CTC topology.
A token which appears once on the right side (i.e. olabels) may appear multiple times on the left side (ilabels), possibly with epsilons in between. When 0 appears on the left side, it represents the blank symbol; when it appears on the right side, it indicates an epsilon. That is, 0 has two meanings here.
A standard CTC topology is the conventional one, where there is a mandatory blank between two repeated neighboring symbols. A non-standard, i.e., modified CTC topology, imposes no such constraint.
See https://github.com/k2-fsa/k2/issues/746#issuecomment-856421616 and https://github.com/k2-fsa/snowfall/pull/209 for more details.
- Parameters
  - max_token (int) – The maximum token ID (inclusive). We assume that token IDs are contiguous (from 1 to max_token). 0 represents blank.
  - modified (bool) – If False, create a standard CTC topology. Otherwise, create a modified CTC topology.
  - device (Union[device, str, None]) – Optional. It can be either a string (e.g., 'cpu', 'cuda:0') or a torch.device. If it is None, then the returned FSA is on CPU.
- Return type
Fsa
- Returns
Return either a standard or a modified CTC topology as an FSA, depending on whether modified is False or True.
Example
#!/usr/bin/env python3

import k2

isym = k2.SymbolTable.from_str('''
blk 0
a 1
b 2
c 3
''')

osym = k2.SymbolTable.from_str('''
a 1
b 2
c 3
''')

fsa = k2.ctc_topo(max_token=3, modified=False)
fsa_modified = k2.ctc_topo(max_token=3, modified=True)

fsa.labels_sym = isym
fsa.aux_labels_sym = osym

fsa_modified.labels_sym = isym
fsa_modified.aux_labels_sym = osym

fsa.draw('ctc_topo.svg',
         title='CTC topology with max_token=3 (modified=False)')
fsa_modified.draw('modified_ctc_topo.svg',
                  title='CTC topology with max_token=3 (modified=True)')
determinize
- k2.determinize(fsa, weight_pushing_type=<DeterminizeWeightPushingType.kNoWeightPushing: 2>)[source]
Determinize the input Fsa.
Caution
It works only on CPU.
Any weight_pushing_type value other than kNoWeightPushing causes the ‘arc_derivs’ to not accurately reflect the real derivatives, although this will not matter as long as the derivatives ultimately derive from FSA operations such as getting total scores or arc posteriors, which are insensitive to pushing.
- Parameters
  - fsa (Fsa) – The input FSA. It can be either a single FSA or an FsaVec. Must be connected. It's also expected to be epsilon-free, but this is not checked; in any case, epsilon will be treated as a normal symbol.
  - weight_pushing_type (DeterminizeWeightPushingType) – An enum value that determines what kind of weight pushing is desired; default kNoWeightPushing.
    - kTropicalWeightPushing: use tropical semiring (actually, max on scores) for weight pushing.
    - kLogWeightPushing: use log semiring (actually, log-sum on scores) for weight pushing.
    - kNoWeightPushing: do no weight pushing; this will cause some delay in scores being emitted, and the weights created in this way will correspond exactly to those that would be produced by the arc_derivs.
    For decoding graph creation, we recommend kLogWeightPushing.
- Return type
Fsa
- Returns
The resulting Fsa. It is equivalent to the input fsa under the tropical semiring but will be deterministic. It will be the same as the input fsa if the input fsa has property kFsaPropertiesArcSortedAndDeterministic. Otherwise, a new deterministic fsa is returned and the input fsa is NOT modified.
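Example
A minimal illustrative sketch on CPU: the input is nondeterministic because two arcs with label 1 leave state 0.
import k2

s = '''
0 1 1 0.1
0 2 1 0.2
1 3 2 0.3
2 3 3 0.4
3 4 -1 0.5
4
'''
fsa = k2.Fsa.from_str(s)
det = k2.determinize(fsa)
print(fsa.num_arcs, det.num_arcs)  # the two label-1 arcs are merged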
do_rnnt_pruning
- k2.do_rnnt_pruning(am, lm, ranges)[source]
Prune the output of the encoder (am) and the prediction network (lm) with ranges generated by get_rnnt_prune_ranges.
- Parameters
  - am (Tensor) – The encoder output, with shape (B, T, encoder_dim).
  - lm (Tensor) – The prediction network output, with shape (B, S + 1, decoder_dim).
  - ranges (Tensor) – A tensor containing the symbol indexes for each frame that we want to keep. Its shape is (B, T, s_range); see the docs in get_rnnt_prune_ranges for more details of this tensor.
- Return type
  Tuple[Tensor, Tensor]
- Returns
Return the pruned am and lm with shape (B, T, s_range, C)
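Example
A hedged illustrative sketch with a hand-made ranges tensor; in practice ranges comes from get_rnnt_prune_ranges applied to the (px, py) gradients of a simple RNN-T loss:
import torch
import k2

B, T, S, s_range, dim = 1, 4, 3, 2, 8
am = torch.randn(B, T, dim)      # encoder output
lm = torch.randn(B, S + 1, dim)  # prediction network output
# Hand-made ranges for illustration; each row is begin_symbol + arange(s_range).
ranges = torch.tensor([[[0, 1], [0, 1], [1, 2], [2, 3]]])
am_pruned, lm_pruned = k2.do_rnnt_pruning(am, lm, ranges)
print(am_pruned.shape, lm_pruned.shape)  # torch.Size([1, 4, 2, 8]) twice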
expand_ragged_attributes
- k2.expand_ragged_attributes(fsas, ret_arc_map=False, ragged_attribute_names=None)[source]
Turn ragged labels attached to this FSA into linear (Tensor) labels, expanding arcs into sequences of arcs as necessary to achieve this. Supports autograd. If fsas had no ragged attributes, returns fsas itself.
Caution
This function will ensure that for final-arcs in the returned fsa, the corresponding labels for all ragged attributes are -1; it will add an extra arc at the end if necessary to ensure this, if the original ragged attributes did not have -1 as their final element on final-arcs. (Note: our intention is that -1's on final arcs, like filler symbols, are removed when making attributes ragged; this is what fsa_from_unary_function_ragged() does if remove_filler==True, the default.)
- Parameters
  - fsas (Fsa) – The source Fsa.
  - ret_arc_map (bool) – If true, will return a pair (new_fsas, arc_map) with arc_map a tensor of int32 that maps from arcs in the result to arcs in fsas, with -1's for newly created arcs. If false, just returns new_fsas.
  - ragged_attribute_names (Optional[List[str]]) – If specified, just this list of ragged attributes will be expanded to linear tensor attributes, and the rest will stay ragged.
- Return type
  Union[Fsa, Tuple[Fsa, Tensor]]
get_aux_labels
- k2.get_aux_labels(best_paths)[source]
Extract aux_labels from the best-path FSAs and remove 0s and -1s.
- Parameters
  - best_paths (Fsa) – An Fsa with best_paths.arcs.num_axes() == 3, i.e. containing multiple FSAs, which is expected to be the result of shortest_path (otherwise the returned values won't be meaningful).
- Return type
  List[List[int]]
- Returns
Returns a list of lists of int, containing the label sequences we decoded.
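Example
A minimal illustrative sketch: take the best path of a tiny transducer and read off its aux_labels.
import k2

s = '''
0 1 1 10 0.1
1 2 2 0 0.2
2 3 -1 -1 0.3
3
'''
fsa = k2.Fsa.from_str(s, acceptor=False)
fsa_vec = k2.create_fsa_vec([fsa])
best_paths = k2.shortest_path(fsa_vec, use_double_scores=True)
print(k2.get_aux_labels(best_paths))  # [[10]]; 0s and -1s are removed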
get_best_matching_stats
- k2.get_best_matching_stats(tokens, scores, counts, eos, min_token, max_token, max_order)[source]
For “query” sentences, this function gets the mean and variance of scores from the best matching words-in-context in a set of provided “key” sentences. This matching process matches the word and the words preceding it, looking for the highest-order match it can find (it’s intended for approximating the scores of models that see only left-context, like language models). The intended application is in estimating the scores of hypothesized transcripts, when we have actually computed the scores for only a subset of the hypotheses.
Caution
This function only runs on CPU for now.
- Parameters
  - tokens (RaggedTensor) – A ragged tensor of int32_t with 2 or 3 axes. If 2 axes, this represents a collection of key and query sequences. If 3 axes, this represents a set of such collections.
    2-axis example:
      [ [ the, cat, said, eos ], [ the, cat, fed, eos ] ]
    3-axis example:
      [ [ [ the, cat, said, eos ], [ the, cat, fed, eos ] ], [ [ hi, my, name, is, eos ], [ bye, my, name, is, eos ] ], ... ]
    where the words would actually be represented as integers. The eos symbol is required if this code is to work as intended (otherwise this code will not be able to recognize when we have reached the beginnings of sentences when comparing histories). bos symbols are allowed but not required.
  - scores (Tensor) – A 1-D torch.Tensor with scores.size() == tokens.NumElements(); this is the item for which we are requesting best-matching values (as means and variances in case there are multiple best matches). In our anticipated use, these would represent scores of words in the sentences, but they could represent anything.
  - counts (Tensor) – A 1-D torch.Tensor with counts.size() == tokens.NumElements(), containing 1 for words that are considered "keys" and 0 for words that are considered "queries". Typically some entire sentences will be keys and others will be queries.
  - eos (int) – The value of the eos (end of sentence) symbol; internally, this is used as an extra padding value before the first sentence in each collection, so that it can act like a "bos" symbol.
  - min_token (int) – The lowest possible token value, including the bos symbol (e.g., might be -1).
  - max_token (int) – The maximum possible token value. Be careful not to set this too large; the implementation contains a part which takes time and space O(max_token - min_token).
  - max_order (int) – The maximum n-gram order to ever return in the ngram_order output; the output will be the minimum of max_order and the actual order matched, or max_order if we matched all the way to the beginning of both sentences. The main reason this is needed is that we need a finite number to return at the beginning of sentences.
- Return type
  Tuple[Tensor, Tensor, Tensor, Tensor]
- Returns
- Returns a tuple of four torch.tensor (mean, var, counts_out, ngram_order)
- mean:
For query positions, will contain the mean of the scores at the best matching key positions, or zero if that is undefined because there are no key positions at all. For key positions, you can treat the output as being undefined (actually they are treated the same as queries, but won’t match with only themselves because we don’t match at singleton intervals).
- var:
Like mean, but contains the (centered) variance of the best matching positions.
- counts_out:
The number of key positions that contributed to the mean and var statistics. This should only be zero if counts was all zero.
- ngram_order:
The n-gram order corresponding to the best matching positions found at each query position, up to a maximum of max_order; will be max_order if we matched all the way to the beginning of a sentence.
get_lattice
- k2.get_lattice(log_prob, log_prob_len, decoding_graph, search_beam=20, output_beam=8, min_active_states=30, max_active_states=10000, subsampling_factor=1)[source]
Get the decoding lattice from a decoding graph and log_softmax output.
- Parameters
  - log_prob (Tensor) – Output from a log_softmax layer, of shape (N, T, C).
  - log_prob_len (Tensor) – A tensor of shape (N,) containing the number of valid frames in log_prob before padding.
  - decoding_graph (Fsa) – An Fsa, the decoding graph. It can be either an HLG or an H. You can use ctc_topo() to build an H.
  - search_beam (float) – Decoding beam, e.g. 20. Smaller is faster, larger is more exact (less pruning). This is the default value; it may be modified by min_active_states and max_active_states.
  - output_beam (float) – Beam to prune output, similar to lattice-beam in Kaldi. Relative to the best path of output.
  - min_active_states (int) – Minimum number of FSA states that are allowed to be active on any given frame for any given intersection/composition task. This is advisory, in that it will try not to have fewer than this number active. Set it to zero if there is no constraint.
  - max_active_states (int) – Maximum number of FSA states that are allowed to be active on any given frame for any given intersection/composition task. This is advisory, in that it will try not to exceed that but may not always succeed. You can use a very large number if no constraint is needed.
  - subsampling_factor (int) – The subsampling factor of the model.
- Return type
Fsa
- Returns
An FsaVec containing the decoding result. It has axes [utt][state][arc].
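Example
A hedged illustrative sketch on random log-probs, using a plain CTC topology as the decoding graph H:
import torch
import k2

log_prob = torch.randn(1, 20, 5).log_softmax(dim=-1)
log_prob_len = torch.tensor([20], dtype=torch.int32)
H = k2.arc_sort(k2.ctc_topo(max_token=4))
lattice = k2.get_lattice(log_prob, log_prob_len, H)
print(lattice.shape)  # an FsaVec with axes [utt][state][arc]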
get_rnnt_logprobs
- k2.get_rnnt_logprobs(lm, am, symbols, termination_symbol, rnnt_type='regular', boundary=None)[source]
Reduces RNN-T problem (the simple case, where joiner network is just addition), to a compact, standard form that can then be given (with boundaries) to mutual_information_recursion(). This function is called from rnnt_loss_simple(), but may be useful for other purposes.
- Parameters
  - lm (Tensor) – Language model part of un-normalized logprobs of symbols, to be added to the acoustic model part before normalizing. Of shape [B][S+1][C], where B is the batch size, S is the maximum sequence length of the symbol sequence, possibly including the EOS symbol; and C is the size of the symbol vocabulary, including the termination/next-frame symbol. Conceptually, lm[b][s] is a vector of length [C] representing the "language model" part of the un-normalized logprobs of symbols, given all symbols earlier than s in the sequence. The reason we still need this for position S is that we may still be emitting the termination/next-frame symbol at this point.
  - am (Tensor) – Acoustic-model part of un-normalized logprobs of symbols, to be added to the language-model part before normalizing. Of shape [B][T][C], where B is the batch size, T is the maximum sequence length of the acoustic sequences (in frames); and C is the size of the symbol vocabulary, including the termination/next-frame symbol. It reflects the "acoustic" part of the probability of any given symbol appearing next on this frame.
  - symbols (Tensor) – A LongTensor of shape [B][S], containing the symbols at each position of the sequence.
  - termination_symbol (int) – The identity of the termination symbol; must be in {0..C-1}.
  - boundary (Optional[Tensor]) – An optional LongTensor of shape [B, 4] with elements interpreted as [begin_symbol, begin_frame, end_symbol, end_frame]; treated as [0, 0, S, T] if boundary is not supplied. Most likely you will want begin_symbol and begin_frame to be zero.
  - rnnt_type (str) – Specifies the type of rnnt paths: regular, modified or constrained.
    - regular: The regular rnnt, which takes you to the next frame only if emitting a blank (i.e., emitting a symbol does not take you to the next frame).
    - modified: A modified version of rnnt that takes you to the next frame whether emitting a blank or a non-blank symbol.
    - constrained: A version like the modified one that goes to the next frame when you emit a non-blank symbol, but this is done by "forcing" you to take the blank transition from the next context on the current frame, e.g. if we emit c given "a b" context, we are forced to emit "blank" given "b c" context on the current frame.
- Return type
  Tuple[Tensor, Tensor]
- Returns
(px, py) (the names are quite arbitrary):
  px: logprobs, of shape [B][S][T+1] if rnnt_type is regular, [B][S][T] if rnnt_type is not regular.
  py: logprobs, of shape [B][S+1][T]
in the recursion:

  p[b,0,0] = 0.0
  if rnnt_type == "regular":
      p[b,s,t] = log_add(p[b,s-1,t] + px[b,s-1,t],
                         p[b,s,t-1] + py[b,s,t-1])
  if rnnt_type != "regular":
      p[b,s,t] = log_add(p[b,s-1,t-1] + px[b,s-1,t-1],
                         p[b,s,t-1] + py[b,s,t-1])

where p[b][s][t] is the "joint score" of the pair of subsequences of length s and t respectively. px[b][s][t] represents the probability of extending the subsequences of length (s,t) by one in the s direction, given the particular symbol, and py[b][s][t] represents the probability of extending the subsequences of length (s,t) by one in the t direction, i.e. of emitting the termination/next-frame symbol.
If rnnt_type == "regular", px[:,:,T] equals -infinity, meaning on the "one-past-the-last" frame we cannot emit any symbols. This is simply a way of incorporating the probability of the termination symbol on the last frame.
get_rnnt_logprobs_joint
- k2.get_rnnt_logprobs_joint(logits, symbols, termination_symbol, rnnt_type='regular', boundary=None)[source]
Reduces RNN-T problem to a compact, standard form that can then be given (with boundaries) to mutual_information_recursion(). This function is called from rnnt_loss().
- Parameters
  - logits (Tensor) – The output of the joiner network, with shape (B, T, S + 1, C), i.e. (batch, time_seq_len, symbol_seq_len + 1, num_classes).
  - symbols (Tensor) – A LongTensor of shape [B][S], containing the symbols at each position of the sequence.
  - termination_symbol (int) – The identity of the termination symbol; must be in {0..C-1}.
  - boundary (Optional[Tensor]) – An optional LongTensor of shape [B, 4] with elements interpreted as [begin_symbol, begin_frame, end_symbol, end_frame]; treated as [0, 0, S, T] if boundary is not supplied. Most likely you will want begin_symbol and begin_frame to be zero.
  - rnnt_type (str) – Specifies the type of rnnt paths: regular, modified or constrained.
    - regular: The regular rnnt, which takes you to the next frame only if emitting a blank (i.e., emitting a symbol does not take you to the next frame).
    - modified: A modified version of rnnt that takes you to the next frame whether emitting a blank or a non-blank symbol.
    - constrained: A version like the modified one that goes to the next frame when you emit a non-blank symbol, but this is done by "forcing" you to take the blank transition from the next context on the current frame, e.g. if we emit c given "a b" context, we are forced to emit "blank" given "b c" context on the current frame.
- Return type
  Tuple[Tensor, Tensor]
- Returns
(px, py) (the names are quite arbitrary):
  px: logprobs, of shape [B][S][T+1] if rnnt_type is regular, [B][S][T] if rnnt_type is not regular.
  py: logprobs, of shape [B][S+1][T]
in the recursion:

  p[b,0,0] = 0.0
  if rnnt_type == "regular":
      p[b,s,t] = log_add(p[b,s-1,t] + px[b,s-1,t],
                         p[b,s,t-1] + py[b,s,t-1])
  if rnnt_type != "regular":
      p[b,s,t] = log_add(p[b,s-1,t-1] + px[b,s-1,t-1],
                         p[b,s,t-1] + py[b,s,t-1])

where p[b][s][t] is the "joint score" of the pair of subsequences of length s and t respectively. px[b][s][t] represents the probability of extending the subsequences of length (s,t) by one in the s direction, given the particular symbol, and py[b][s][t] represents the probability of extending the subsequences of length (s,t) by one in the t direction, i.e. of emitting the termination/next-frame symbol.
If rnnt_type == "regular", px[:,:,T] equals -infinity, meaning on the "one-past-the-last" frame we cannot emit any symbols. This is simply a way of incorporating the probability of the termination symbol on the last frame.
get_rnnt_logprobs_pruned
- k2.get_rnnt_logprobs_pruned(logits, symbols, ranges, termination_symbol, boundary, rnnt_type='regular')[source]
Construct px, py for mutual_information_recursion with pruned output.
- Parameters
  - logits (Tensor) – The pruned output of the joiner network, with shape (B, T, s_range, C).
  - symbols (Tensor) – The symbol sequences, a LongTensor of shape [B][S], with elements in {0..C-1}.
  - ranges (Tensor) – A tensor containing the symbol ids for each frame that we want to keep. It is a LongTensor of shape [B][T][s_range], where ranges[b,t,0] contains the begin symbol 0 <= s <= S - s_range + 1, such that logits[b,t,:,:] represents the logits with positions s, s + 1, ... s + s_range - 1. See docs in get_rnnt_prune_ranges() for more details of what ranges contains.
  - termination_symbol (int) – The termination symbol, with 0 <= termination_symbol < C.
  - boundary (Tensor) – An optional LongTensor of shape [B, 4] with elements interpreted as [begin_symbol, begin_frame, end_symbol, end_frame]; treated as [0, 0, S, T] if boundary is not supplied. Most likely you will want begin_symbol and begin_frame to be zero.
  - rnnt_type (str) – Specifies the type of rnnt paths: regular, modified or constrained.
    - regular: The regular rnnt, which takes you to the next frame only if emitting a blank (i.e., emitting a symbol does not take you to the next frame).
    - modified: A modified version of rnnt that takes you to the next frame whether emitting a blank or a non-blank symbol.
    - constrained: A version like the modified one that goes to the next frame when you emit a non-blank symbol, but this is done by "forcing" you to take the blank transition from the next context on the current frame, e.g. if we emit c given "a b" context, we are forced to emit "blank" given "b c" context on the current frame.
- Return type
  Tuple[Tensor, Tensor]
- Returns
(px, py) (the names are quite arbitrary):
  px: logprobs, of shape [B][S][T+1] if rnnt_type is regular, [B][S][T] if rnnt_type is not regular.
  py: logprobs, of shape [B][S+1][T]
in the recursion:

  p[b,0,0] = 0.0
  if rnnt_type == "regular":
      p[b,s,t] = log_add(p[b,s-1,t] + px[b,s-1,t],
                         p[b,s,t-1] + py[b,s,t-1])
  if rnnt_type != "regular":
      p[b,s,t] = log_add(p[b,s-1,t-1] + px[b,s-1,t-1],
                         p[b,s,t-1] + py[b,s,t-1])

where p[b][s][t] is the "joint score" of the pair of subsequences of length s and t respectively. px[b][s][t] represents the probability of extending the subsequences of length (s,t) by one in the s direction, given the particular symbol, and py[b][s][t] represents the probability of extending the subsequences of length (s,t) by one in the t direction, i.e. of emitting the termination/next-frame symbol.
If rnnt_type == "regular", px[:,:,T] equals -infinity, meaning on the "one-past-the-last" frame we cannot emit any symbols. This is simply a way of incorporating the probability of the termination symbol on the last frame.
get_rnnt_logprobs_smoothed
- k2.get_rnnt_logprobs_smoothed(lm, am, symbols, termination_symbol, lm_only_scale=0.1, am_only_scale=0.1, boundary=None, rnnt_type='regular')[source]
Reduces RNN-T problem (the simple case, where joiner network is just addition), to a compact, standard form that can then be given (with boundaries) to mutual_information_recursion(). This version allows you to make the loss-function one of the form:
lm_only_scale * lm_probs + am_only_scale * am_probs + (1-lm_only_scale-am_only_scale) * combined_probs
where lm_probs and am_probs are the probabilities given the lm and acoustic model independently.
This function is called from rnnt_loss_smoothed(), but may be useful for other purposes.
- Parameters
  - lm (Tensor) – Language model part of un-normalized logprobs of symbols, to be added to the acoustic model part before normalizing. Of shape [B][S+1][C], where B is the batch size, S is the maximum sequence length of the symbol sequence, possibly including the EOS symbol; and C is the size of the symbol vocabulary, including the termination/next-frame symbol. Conceptually, lm[b][s] is a vector of length [C] representing the "language model" part of the un-normalized logprobs of symbols, given all symbols earlier than s in the sequence. The reason we still need this for position S is that we may still be emitting the termination/next-frame symbol at this point.
  - am (Tensor) – Acoustic-model part of un-normalized logprobs of symbols, to be added to the language-model part before normalizing. Of shape [B][T][C], where B is the batch size, T is the maximum sequence length of the acoustic sequences (in frames); and C is the size of the symbol vocabulary, including the termination/next-frame symbol. It reflects the "acoustic" part of the probability of any given symbol appearing next on this frame.
  - symbols (Tensor) – A LongTensor of shape [B][S], containing the symbols at each position of the sequence.
  - termination_symbol (int) – The identity of the termination symbol; must be in {0..C-1}.
  - lm_only_scale (float) – The scale on the "LM-only" part of the loss.
  - am_only_scale (float) – The scale on the "AM-only" part of the loss, for which we use an "averaged" LM (averaged over all histories, so effectively unigram).
  - boundary (Optional[Tensor]) – An optional LongTensor of shape [B, 4] with elements interpreted as [begin_symbol, begin_frame, end_symbol, end_frame]; treated as [0, 0, S, T] if boundary is not supplied. Most likely you will want begin_symbol and begin_frame to be zero.
  - rnnt_type (str) – Specifies the type of rnnt paths: regular, modified or constrained.
    - regular: The regular rnnt, which takes you to the next frame only if emitting a blank (i.e., emitting a symbol does not take you to the next frame).
    - modified: A modified version of rnnt that takes you to the next frame whether emitting a blank or a non-blank symbol.
    - constrained: A version like the modified one that goes to the next frame when you emit a non-blank symbol, but this is done by "forcing" you to take the blank transition from the next context on the current frame, e.g. if we emit c given "a b" context, we are forced to emit "blank" given "b c" context on the current frame.
- Return type
  Tuple[Tensor, Tensor]
- Returns
(px, py) (the names are quite arbitrary):
  px: logprobs, of shape [B][S][T+1] if rnnt_type == "regular", [B][S][T] if rnnt_type != "regular".
  py: logprobs, of shape [B][S+1][T]
in the recursion:

  p[b,0,0] = 0.0
  if rnnt_type == "regular":
      p[b,s,t] = log_add(p[b,s-1,t] + px[b,s-1,t],
                         p[b,s,t-1] + py[b,s,t-1])
  if rnnt_type != "regular":
      p[b,s,t] = log_add(p[b,s-1,t-1] + px[b,s-1,t-1],
                         p[b,s,t-1] + py[b,s,t-1])

where p[b][s][t] is the "joint score" of the pair of subsequences of length s and t respectively. px[b][s][t] represents the probability of extending the subsequences of length (s,t) by one in the s direction, given the particular symbol, and py[b][s][t] represents the probability of extending the subsequences of length (s,t) by one in the t direction, i.e. of emitting the termination/next-frame symbol. px[:,:,T] equals -infinity, meaning on the "one-past-the-last" frame we cannot emit any symbols. This is simply a way of incorporating the probability of the termination symbol on the last frame.
get_rnnt_prune_ranges
- k2.get_rnnt_prune_ranges(px_grad, py_grad, boundary, s_range)[source]
Get the pruning ranges of normal rnnt loss according to the grads of px and py returned by mutual_information_recursion.
For each sequence with T frames, we will generate a tensor with shape (T, s_range) indicating which symbols will be taken into consideration for each frame. For example, for a sequence with 10 frames whose corresponding symbols are [A B C D E F], if s_range equals 3, one possible ranges tensor is:
[[0, 1, 2],
 [0, 1, 2],
 [0, 1, 2],
 [0, 1, 2],
 [1, 2, 3],
 [1, 2, 3],
 [1, 2, 3],
 [3, 4, 5],
 [3, 4, 5],
 [3, 4, 5]]
which means we only consider [A B C] at frame 0, 1, 2, 3, and [B C D] at frame 4, 5, 6, [D E F] at frame 7, 8, 9.
We can consider only a limited number of symbols because frames and symbols are monotonically aligned; theoretically, only a particular range of symbols can be generated at a particular frame.
Note
For the generated tensor ranges (assuming batch size is 1), ranges[:, 0] is a monotonically increasing tensor from 0 to len(symbols) - s_range and it satisfies ranges[t+1, 0] - ranges[t, 0] < s_range, which means we won't skip any symbols.
- Parameters
  - px_grad (Tensor) – The gradient of px; see docs in mutual_information_recursion for more details of px.
  - py_grad (Tensor) – The gradient of py; see docs in mutual_information_recursion for more details of py.
  - boundary (Tensor) – A LongTensor of shape [B, 4] with elements interpreted as [begin_symbol, begin_frame, end_symbol, end_frame].
  - s_range (int) – How many symbols to keep for each frame.
- Return type
Tensor
- Returns
A tensor with the shape of (B, T, s_range) containing the indexes of the kept symbols for each frame.
get_rnnt_prune_ranges_deprecated
- k2.get_rnnt_prune_ranges_deprecated(px_grad, py_grad, boundary, s_range)[source]
Get the pruning ranges of normal rnnt loss according to the grads of px and py returned by mutual_information_recursion.
For each sequence with T frames, we will generate a tensor with shape (T, s_range) indicating which symbols will be taken into consideration for each frame. For example, for a sequence with 10 frames whose corresponding symbols are [A B C D E F], if s_range equals 3, one possible ranges tensor is:
[[0, 1, 2],
 [0, 1, 2],
 [0, 1, 2],
 [0, 1, 2],
 [1, 2, 3],
 [1, 2, 3],
 [1, 2, 3],
 [3, 4, 5],
 [3, 4, 5],
 [3, 4, 5]]
which means we only consider [A B C] at frame 0, 1, 2, 3, and [B C D] at frame 4, 5, 6, [D E F] at frame 7, 8, 9.
We can consider only a limited number of symbols because frames and symbols are monotonically aligned; theoretically, only a particular range of symbols can be generated at a particular frame.
Note
For the generated tensor ranges (assuming batch size is 1), ranges[:, 0] is a monotonically increasing tensor from 0 to len(symbols) - s_range and it satisfies ranges[t+1, 0] - ranges[t, 0] < s_range, which means we won't skip any symbols.
- Parameters
  - px_grad (Tensor) – The gradient of px; see docs in mutual_information_recursion for more details of px.
  - py_grad (Tensor) – The gradient of py; see docs in mutual_information_recursion for more details of py.
  - boundary (Tensor) – A LongTensor of shape [B, 4] with elements interpreted as [begin_symbol, begin_frame, end_symbol, end_frame].
  - s_range (int) – How many symbols to keep for each frame.
- Return type
Tensor
- Returns
A tensor with the shape of (B, T, s_range) containing the indexes of the kept symbols for each frame.
index_add
- k2.index_add(index, value, in_out)[source]
It implements in_out[index[i]] += value[i].
Caution
It has similar semantics to torch.Tensor.index_add_, except that:
- index.dtype == torch.int32
- -1 <= index[i] < in_out.shape[0]
- index[i] == -1 is ignored.
- index has to be a 1-D contiguous tensor.
Caution
in_out is modified in-place.
Caution
This function does NOT support autograd.
- Parameters
  - index (Tensor) – A 1-D contiguous tensor with dtype torch.int32. Must satisfy -1 <= index[i] < in_out.shape[0].
  - value (Tensor) – A 1-D or 2-D tensor with dtype torch.int32, torch.float32, or torch.float64. Must satisfy index.shape[0] == value.shape[0].
  - in_out (Tensor) – A 1-D or 2-D tensor with the same dtype as value. It satisfies in_out.shape[1] == value.shape[1] if it is a 2-D tensor.
- Return type
None
- Returns
Return None.
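Example
A minimal illustrative sketch of the semantics; note the -1 entry is skipped.
import torch
import k2

index = torch.tensor([0, 2, -1, 0], dtype=torch.int32)
value = torch.tensor([1.0, 2.0, 3.0, 4.0])
in_out = torch.zeros(3)
k2.index_add(index, value, in_out)
print(in_out)  # tensor([5., 0., 2.])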
index_fsa
- k2.index_fsa(src, indexes)[source]
Select a list of FSAs from src with a 1-D tensor.
- Parameters
  - src (Fsa) – An FsaVec.
  - indexes (Tensor) – A 1-D torch.Tensor of dtype torch.int32 containing the ids of FSAs to select.
- Return type
Fsa
- Returns
Return an FsaVec containing only those FSAs specified by indexes.
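Example
A minimal illustrative sketch; the same FSA may be selected more than once.
import torch
import k2

fsa_vec = k2.create_fsa_vec([k2.linear_fsa([1]), k2.linear_fsa([2, 3])])
indexes = torch.tensor([1, 0, 1], dtype=torch.int32)
selected = k2.index_fsa(fsa_vec, indexes)
print(selected.shape)  # (3, None, None)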
index_select
- k2.index_select(src, index, default_value=0)[source]
Returns a new tensor which indexes the input tensor along dimension 0 using the entries in index.
If the entry in index is -1, then the corresponding entry in the returned tensor is 0.
Caution
index.dtype == torch.int32 and index.ndim == 1.
- Parameters
  - src (Tensor) – The input tensor. Either 1-D or 2-D with dtype torch.int32, torch.int64, torch.float32, or torch.float64.
  - index (Tensor) – A 1-D tensor of dtype torch.int32 containing the indexes. If an entry is -1, the corresponding entry in the returned value is 0. The elements of index should be in the range [-1..src.shape[0]-1].
  - default_value (float) – Used only when src is a 1-D tensor. It sets ans[i] to default_value if index[i] is -1.
- Return type
Tensor
- Returns
A tensor with shape (index.numel(), *src.shape[1:]) and the same dtype as src, e.g. if src.ndim == 1, ans.shape would be (index.shape[0],); if src.ndim == 2, ans.shape would be (index.shape[0], src.shape[1]). Will satisfy ans[i] == src[index[i]] if src.ndim == 1, or ans[i, j] == src[index[i], j] if src.ndim == 2, except for entries where index[i] == -1, which will be zero.
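Example
A minimal illustrative sketch with a 1-D src:
import torch
import k2

src = torch.tensor([10.0, 20.0, 30.0])
index = torch.tensor([2, -1, 0], dtype=torch.int32)
print(k2.index_select(src, index))  # tensor([30., 0., 10.])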
intersect
- k2.intersect(a_fsa, b_fsa, treat_epsilons_specially=True, ret_arc_maps=False)[source]
Compute the intersection of two FSAs.
When treat_epsilons_specially is True, this function works only on CPU. When treat_epsilons_specially is False and both a_fsa and b_fsa are on GPU, then this function works on GPU; in this case, the two input FSAs do not need to be arc sorted.
- Parameters
  - a_fsa (Fsa) – The first input FSA. It can be either a single FSA or an FsaVec.
  - b_fsa (Fsa) – The second input FSA. It can be either a single FSA or an FsaVec. If both a_fsa and b_fsa are FsaVec, they must contain the same number of FSAs.
  - treat_epsilons_specially (bool) – If True, epsilons will be treated as epsilon, meaning epsilon arcs can match with an implicit epsilon self-loop. If False, epsilons will be treated as real, normal symbols (to have them treated as epsilons in this case you may have to add epsilon self-loops to whichever of the inputs is naturally epsilon-free).
  - ret_arc_maps (bool) – If False, return the resulting Fsa. If True, return a tuple containing three entries:
    - the resulting Fsa
    - a_arc_map, a 1-D torch.Tensor with dtype torch.int32. a_arc_map[i] is the arc index in a_fsa that corresponds to the i-th arc in the resulting Fsa. a_arc_map[i] is -1 if the i-th arc in the resulting Fsa has no corresponding arc in a_fsa.
    - b_arc_map, a 1-D torch.Tensor with dtype torch.int32. b_arc_map[i] is the arc index in b_fsa that corresponds to the i-th arc in the resulting Fsa. b_arc_map[i] is -1 if the i-th arc in the resulting Fsa has no corresponding arc in b_fsa.
Caution
The two input FSAs MUST be arc sorted if treat_epsilons_specially is True.
Caution
The rules for assigning the attributes of the output Fsa are as follows:
(1) For attributes where only one source (a_fsa or b_fsa) has that attribute: Copy via arc_map, or use zero if arc_map has -1. This rule works for both floating point and integer attributes.
(2) For attributes where both sources (a_fsa and b_fsa) have that attribute: For floating point attributes: sum via arc_maps, or use zero if arc_map has -1. For integer attributes, it’s not supported for now (the attributes will be discarded and will not be kept in the output FSA).
- Return type
  Union[Fsa, Tuple[Fsa, Tensor, Tensor]]
- Returns
If ret_arc_maps is False, return the result of intersecting a_fsa and b_fsa. len(out_fsa.shape) is 2 if and only if the two input FSAs are single FSAs; otherwise, len(out_fsa.shape) is 3. If ret_arc_maps is True, it returns additionally two arc_maps: a_arc_map and b_arc_map.
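Example
A minimal illustrative sketch on CPU; both inputs are arc-sorted because treat_epsilons_specially defaults to True:
import k2

a = k2.arc_sort(k2.linear_fsa([1, 2, 3]))
b = k2.arc_sort(k2.linear_fsa([1, 2, 3]))
out, a_arc_map, b_arc_map = k2.intersect(a, b, ret_arc_maps=True)
print(out.labels)  # tensor([ 1,  2,  3, -1], dtype=torch.int32)
print(a_arc_map, b_arc_map)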
intersect_dense
- k2.intersect_dense(a_fsas, b_fsas, output_beam, max_states=15000000, max_arcs=1073741824, a_to_b_map=None, seqframe_idx_name=None, frame_idx_name=None)[source]
Intersect array of FSAs on CPU/GPU.
Caution
a_fsas MUST be arc sorted.
- Parameters
  - a_fsas (Fsa) – Input FsaVec, i.e., decoding graphs, one per sequence. It might just be a linear sequence of phones, or might be something more complicated. Must have a_fsas.shape[0] == b_fsas.dim0() if a_to_b_map is None. Otherwise, must have a_fsas.shape[0] == a_to_b_map.shape[0].
  - b_fsas (DenseFsaVec) – Input FSAs that correspond to neural network output.
  - output_beam (float) – Beam to prune output, similar to lattice-beam in Kaldi. Relative to the best path of output.
  - max_states (int) – The max number of states to prune the output, mainly to avoid out-of-memory and numerical overflow; default 15,000,000.
  - max_arcs (int) – The max number of arcs to prune the output, mainly to avoid out-of-memory and numerical overflow; default 1073741824 (2^30).
  - a_to_b_map (Optional[Tensor]) – Maps from FSA-index in a to FSA-index in b to use for it. If None, then we expect the number of FSAs in a_fsas to equal b_fsas.dim0(). If set, then it should be a Tensor with ndim=1 and dtype=torch.int32, with a_to_b_map.shape[0] equal to the number of FSAs in a_fsas (i.e. a_fsas.shape[0] if len(a_fsas.shape) == 3, else 1); and elements 0 <= i < b_fsas.dim0().
  - seqframe_idx_name (Optional[str]) – If set (e.g. to 'seqframe'), an attribute in the output will be created that encodes the sequence-index and the frame-index within that sequence; this is equivalent to a row-index into b_fsas.values, or, equivalently, an element in b_fsas.shape.
  - frame_idx_name (Optional[str]) – If set (e.g. to 'frame'), an attribute in the output will be created that contains the frame-index within the corresponding sequence.
- Return type
Fsa
- Returns
The result of the intersection, pruned to output_beam; this pruning is exact, as it uses forward and backward scores.
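Example
A hedged illustrative sketch on random data; a_fsas is arc-sorted as required:
import torch
import k2

log_probs = torch.randn(1, 8, 4).log_softmax(dim=-1)
supervision_segments = torch.tensor([[0, 0, 8]], dtype=torch.int32)
dense = k2.DenseFsaVec(log_probs, supervision_segments)

a_fsas = k2.arc_sort(k2.ctc_graph([[1, 2]]))  # an FsaVec with one graph
lattice = k2.intersect_dense(a_fsas, dense, output_beam=10)
print(lattice.shape)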
intersect_dense_pruned
- k2.intersect_dense_pruned(a_fsas, b_fsas, search_beam, output_beam, min_active_states, max_active_states, seqframe_idx_name=None, frame_idx_name=None, allow_partial=False)[source]
Intersect array of FSAs on CPU/GPU.
Caution
a_fsas MUST be arc sorted.
- Parameters
  - a_fsas (Fsa) – Input FsaVec, i.e., decoding graphs, one per sequence. It might just be a linear sequence of phones, or might be something more complicated. Must have either a_fsas.shape[0] == b_fsas.dim0(), or a_fsas.shape[0] == 1, in which case the graph is shared.
  - b_fsas (DenseFsaVec) – Input FSAs that correspond to neural network output.
  - search_beam (float) – Decoding beam, e.g. 20. Smaller is faster, larger is more exact (less pruning). This is the default value; it may be modified by min_active_states and max_active_states.
  - output_beam (float) – Beam to prune output, similar to lattice-beam in Kaldi. Relative to the best path of output.
  - min_active_states (int) – Minimum number of FSA states that are allowed to be active on any given frame for any given intersection/composition task. This is advisory, in that it will try not to have fewer than this number active. Set it to zero if there is no constraint.
  - max_active_states (int) – Maximum number of FSA states that are allowed to be active on any given frame for any given intersection/composition task. This is advisory, in that it will try not to exceed that but may not always succeed. You can use a very large number if no constraint is needed.
  - allow_partial (bool) – If true and no final state is active on the last frame, we will treat all states on the last frame as final states. If false, we only care about the real final states in the decoding graph on the last frame when generating the lattice.
  - seqframe_idx_name (Optional[str]) – If set (e.g. to 'seqframe'), an attribute in the output will be created that encodes the sequence-index and the frame-index within that sequence; this is equivalent to a row-index into b_fsas.values, or, equivalently, an element in b_fsas.shape.
  - frame_idx_name (Optional[str]) – If set (e.g. to 'frame'), an attribute in the output will be created that contains the frame-index within the corresponding sequence.
- Return type
Fsa
- Returns
The result of the intersection.
intersect_device
- k2.intersect_device(a_fsas, b_fsas, b_to_a_map, sorted_match_a=False, ret_arc_maps=False)[source]
Compute the intersection of two FsaVecs treating epsilons as real, normal symbols.
This function supports both CPU and GPU. But it is very slow on CPU. That's why this function name ends with _device. It is intended for GPU. See k2.intersect(), which is a more general interface (it will call the same underlying code, IntersectDevice(), if the inputs are on GPU and a_fsas is arc-sorted).
Caution
Epsilons are treated as real, normal symbols.
Hint
The two inputs do not need to be arc-sorted.
Refer to k2.intersect() for how we assign the attributes of the output FsaVec.
- Parameters
  - a_fsas (Fsa) – An FsaVec (must have 3 axes, i.e., len(a_fsas.shape) == 3).
  - b_fsas (Fsa) – An FsaVec (must have 3 axes) on the same device as a_fsas.
  - b_to_a_map (Tensor) – A 1-D torch.Tensor with dtype torch.int32 on the same device as a_fsas. Map from FSA-id in b_fsas to the corresponding FSA-id in a_fsas that we want to compose it with. E.g. might be an identity map, or all-to-zero, or something the user chooses.
    Requires:
    - b_to_a_map.shape[0] == b_fsas.shape[0]
    - 0 <= b_to_a_map[i] < a_fsas.shape[0]
  - sorted_match_a (bool) – If true, the arcs of a_fsas must be sorted by label (checked by calling code via properties), and we'll use a matching approach that requires this.
  - ret_arc_maps (bool) – If False, return the resulting Fsa. If True, return a tuple containing three entries:
    - the resulting Fsa
    - a_arc_map, a 1-D torch.Tensor with dtype torch.int32. a_arc_map[i] is the arc index in a_fsas that corresponds to the i-th arc in the resulting Fsa. a_arc_map[i] is -1 if the i-th arc in the resulting Fsa has no corresponding arc in a_fsas.
    - b_arc_map, a 1-D torch.Tensor with dtype torch.int32. b_arc_map[i] is the arc index in b_fsas that corresponds to the i-th arc in the resulting Fsa. b_arc_map[i] is -1 if the i-th arc in the resulting Fsa has no corresponding arc in b_fsas.
- Return type
  Union[Fsa, Tuple[Fsa, Tensor, Tensor]]
- Returns
If ret_arc_maps is False, return intersected FsaVec; will satisfy ans.shape == b_fsas.shape. If ret_arc_maps is True, it returns additionally two arc maps: a_arc_map and b_arc_map.
invert
- k2.invert(fsa, ret_arc_map=False)[source]
Invert an FST, swapping the labels in the FSA with the auxiliary labels.
- Parameters
  - fsa (Fsa) – The input FSA. It can be either a single FSA or an FsaVec.
  - ret_arc_map (bool) – True to return an extra arc map, which is a 1-D tensor with dtype torch.int32. The returned arc_map[i] is the arc index in the input fsa that corresponds to the i-th arc in the returned fsa. arc_map[i] is -1 if the i-th arc in the returned fsa has no counterpart in the input fsa.
- Return type
  Union[Fsa, Tuple[Fsa, Tensor]]
- Returns
If ret_arc_map is False, return the inverted Fsa; it is top-sorted if fsa is top-sorted. If ret_arc_map is True, return an extra arc map.
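Example
A minimal illustrative sketch:
import k2

s = '''
0 1 1 2 0.1
1 2 -1 -1 0.2
2
'''
fsa = k2.Fsa.from_str(s, acceptor=False)
inverted = k2.invert(fsa)
print(fsa.labels, fsa.aux_labels)            # [1, -1] and [2, -1]
print(inverted.labels, inverted.aux_labels)  # swapped: [2, -1] and [1, -1]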
is_rand_equivalent
- k2.is_rand_equivalent(a, b, log_semiring, beam=inf, treat_epsilons_specially=True, delta=1e-06, npath=100)[source]
Check if the Fsa a appears to be equivalent to b by randomly checking some symbol sequences in them.
Caution
It works only on CPU.
- Parameters
  - a (Fsa) – One of the input FSAs. It can be either a single FSA or an FsaVec. Must be top-sorted and on CPU.
  - b (Fsa) – The other input FSA. It must have the same NumAxes() as a. Must be top-sorted and on CPU.
  - log_semiring (bool) – The semiring to be used for all weight measurements; if false then we use 'max' on alternative paths; if true we use 'log-add'.
  - beam (float) – beam > 0 that affects pruning; the algorithm will only check paths within beam of the total score of the lattice (for the tropical semiring, it's the max weight over all paths from the start state to the final state; for the log semiring, it's the log-sum of probs over all paths) in a or b.
  - treat_epsilons_specially (bool) – We'll do intersection between the generated path and a or b when checking equivalence. Generally, if it's true, we will treat epsilons as epsilon when doing intersection; otherwise, epsilons will just be treated as any other symbol.
  - delta (float) – Tolerance for path weights to check the equivalence. If abs(weights_a - weights_b) <= delta, we say the two paths are equivalent.
  - npath (int) – The number of paths that will be generated to check the equivalence of a and b.
- Return type
bool
- Returns
True if the Fsa a appears to be equivalent to b by randomly generating npath paths from one of them and then checking if the symbol sequence exists in the other one and if the total weight for that symbol sequence is the same in both FSAs.
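Example
A minimal illustrative sketch on CPU with top-sorted inputs:
import k2

a = k2.arc_sort(k2.linear_fsa([1, 2, 3]))
b = k2.arc_sort(k2.linear_fsa([1, 2, 3]))
c = k2.arc_sort(k2.linear_fsa([1, 2, 4]))
print(k2.is_rand_equivalent(a, b, log_semiring=True))  # True
print(k2.is_rand_equivalent(a, c, log_semiring=True))  # False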
joint_mutual_information_recursion
- k2.joint_mutual_information_recursion(px, py, boundary=None)[source]
A recursion that is useful for modifications of RNN-T and similar loss functions, where the recursion probabilities have a number of terms and you want them reported separately. See mutual_information_recursion() for more documentation of the basic aspects of this.
- Parameters
  - px (Sequence[Tensor]) – A sequence of Tensors, each of the same shape [B][S][T+1].
  - py (Sequence[Tensor]) – A sequence of Tensors, each of the same shape [B][S+1][T]; the sequence must be the same length as px.
  - boundary (Optional[Tensor]) – Optionally, a LongTensor of shape [B][4] containing rows [s_begin, t_begin, s_end, t_end], with 0 <= s_begin <= s_end <= S and 0 <= t_begin <= t_end < T, defaulting to [0, 0, S, T]. These are the beginning and one-past-the-last positions in the x and y sequences respectively, and can be used if not all sequences are of the same length.
- Return type
  Sequence[Tensor]
- Returns
a Tensor of shape (len(px), B), whose sum over dim 0 is the total log-prob of the recursion mentioned below, per sequence. The first element of the sequence of length len(px) is “special”, in that it has an offset term reflecting the difference between sum-of-log and log-of-sum; for more interpretable loss values, the “main” part of your loss function should be first.
The recursion below applies if boundary == None, when it defaults to (0, 0, S, T); where px_sum, py_sum are the sums of the elements of px and py:
p = tensor of shape (B, S+1, T+1), containing -infinity
p[b,0,0] = 0.0
# do the following in a loop over s and t:
p[b,s,t] = log_add(p[b,s-1,t] + px_sum[b,s-1,t],
                   p[b,s,t-1] + py_sum[b,s,t-1])   # (if s > 0 or t > 0)
return p[:][S][T]
This function lets you implement the above recursion efficiently, except that it gives you a breakdown of the contribution from all the elements of px and py separately. As noted above, the first element of the sequence is “special”.
levenshtein_alignment
- k2.levenshtein_alignment(refs, hyps, hyp_to_ref_map, sorted_match_ref=False)[source]
Get the Levenshtein alignment of two FsaVecs.
This function supports both CPU and GPU. But it is very slow on CPU.
- Parameters
refs (
Fsa
) – An FsaVec (must have 3 axes, i.e., len(refs.shape) == 3). It is the output Fsa of levenshtein_graph()
.hyps (
Fsa
) – An FsaVec (must have 3 axes) on the same device as refs. It is the output Fsa of levenshtein_graph()
.hyp_to_ref_map (
Tensor
) –A 1-D torch.Tensor with dtype torch.int32 on the same device as refs. Map from FSA-id in hyps to the corresponding FSA-id in refs that we want to get the Levenshtein alignment with. E.g. it might be an identity map, or all-to-zero, or something the user chooses.
- Requires
hyp_to_ref_map.shape[0] == hyps.shape[0]
0 <= hyp_to_ref_map[i] < refs.shape[0]
sorted_match_ref (
bool
) – If true, the arcs of refs must be sorted by label (checked by calling code via properties), and we’ll use a matching approach that requires this.
- Return type
Fsa
- Returns
Returns an FsaVec containing the alignment information and satisfying ans.Dim0() == hyps.Dim0(). Two attributes named ref_labels and hyp_labels will be added to the returned FsaVec. ref_labels contains the aligned sequences of refs and hyp_labels contains the aligned sequences of hyps. You can get the Levenshtein distance by calling get_tot_scores on the returned FsaVec.
Examples
>>> hyps = k2.levenshtein_graph([[1, 2, 3], [1, 3, 3, 2]])
>>> refs = k2.levenshtein_graph([[1, 2, 4]])
>>> alignment = k2.levenshtein_alignment(
...     refs, hyps,
...     hyp_to_ref_map=torch.tensor([0, 0], dtype=torch.int32),
...     sorted_match_ref=True)
>>> alignment.labels
tensor([ 1,  2,  0, -1,  1,  0,  0,  0, -1], dtype=torch.int32)
>>> alignment.ref_labels
tensor([ 1,  2,  4, -1,  1,  2,  4,  0, -1], dtype=torch.int32)
>>> alignment.hyp_labels
tensor([ 1,  2,  3, -1,  1,  3,  3,  2, -1], dtype=torch.int32)
>>> -alignment.get_tot_scores(use_double_scores=False, log_semiring=False)
tensor([1., 3.])
levenshtein_graph
- k2.levenshtein_graph(symbols, ins_del_score=- 0.501, device='cpu')[source]
Construct Levenshtein graphs from symbols.
See https://github.com/k2-fsa/k2/pull/828 for more details about Levenshtein graphs.
- Parameters
symbols (
Union
[RaggedTensor
,List
[List
[int
]]]) –It can be one of the following types:
A list of list-of-integers, e.g., [ [1, 2], [1, 2, 3] ]
An instance of
k2.RaggedTensor
. Must have num_axes == 2 and with dtype torch.int32.
ins_del_score (
float
) – The score on the self-loop arcs in the graphs. The main idea of this score is to set an insertion and deletion penalty, which will affect the shortest-path search procedure.device (
Union
[device
,str
,None
]) – Optional. It can be either a string (e.g., ‘cpu’, ‘cuda:0’) or a torch.device. By default, the returned FSA is on CPU. If symbols is an instance ofk2.RaggedTensor
, the returned FSA will be on the same device as symbols.
- Return type
Fsa
- Returns
An FsaVec containing the Levenshtein graphs, with Dim0() equal to len(symbols) (when symbols is a List[List[int]]) or symbols.dim0 (when symbols is a k2.RaggedTensor).
linear_fsa
- k2.linear_fsa(labels, device=None)[source]
Construct a linear FSA from labels.
Note
The scores of arcs in the returned FSA are all 0.
- Parameters
labels (
Union
[List
[int
],List
[List
[int
]],RaggedTensor
]) –It can be one of the following types:
A list of integers, e.g., [1, 2, 3]
A list of list-of-integers, e.g., [ [1, 2], [1, 2, 3] ]
An instance of
k2.RaggedTensor
. Must have num_axes == 2.
device (
Union
[device
,str
,None
]) – Optional. It can be either a string (e.g., ‘cpu’, ‘cuda:0’) or a torch.device. If it isNone
, then the returned FSA is on CPU. It has to be None iflabels
is an instance ofk2.RaggedTensor
.
- Return type
Fsa
- Returns
If labels is a list of integers, return an FSA.
If labels is a list of list-of-integers, return an FsaVec.
If labels is an instance of k2.RaggedTensor, return an FsaVec.
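Example: a minimal sketch covering the three input types (the label values are arbitrary):

import k2

fsa = k2.linear_fsa([1, 2, 3])           # a single FSA
fsa_vec = k2.linear_fsa([[1, 2], [3]])   # an FsaVec with two FSAs
ragged = k2.RaggedTensor([[1, 2], [3]])
fsa_vec2 = k2.linear_fsa(ragged)         # an FsaVec on the same device as ragged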
linear_fsa_with_self_loops
- k2.linear_fsa_with_self_loops(fsas)[source]
Create a linear FSA with epsilon self-loops by first removing epsilon transitions from the input linear FSA.
- Parameters
fsas (
Fsa
) – An FSA or an FsaVec. It MUST be a linear FSA or a vector of linear FSAs.- Returns
Return an FSA or FsaVec, where each FSA contains epsilon self-loops but contains no epsilon transitions for arcs that are not self-loops.
linear_fst
- k2.linear_fst(labels, aux_labels)[source]
Construct a linear FST from labels and their corresponding auxiliary labels.
Note
The scores of arcs in the returned FST are all 0.
- Parameters
labels (
Union
[List
[int
],List
[List
[int
]]]) – A list of integers or a list of list of integers.aux_labels (
Union
[List
[int
],List
[List
[int
]]]) – A list of integers or a list of list of integers.
- Return type
Fsa
- Returns
An FST if labels is a list of integers. A vector of FSTs (FsaVec) if the input is a list of lists of integers.
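Example: a minimal sketch (the labels and aux_labels below are arbitrary):

import k2

fst = k2.linear_fst([2, 5, 8], [3, 6, 9])              # a single FST
fst_vec = k2.linear_fst([[1, 2], [3]], [[4, 5], [6]])  # an FsaVec of FSTs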
linear_fst_with_self_loops
- k2.linear_fst_with_self_loops(fsts)[source]
Create a linear FST with epsilon self-loops by first removing epsilon transitions from the input linear FST.
Note
The main difference to
linear_fsa_with_self_loops()
is that aux_labels and scores are also kept here.- Parameters
fsts – An FST or an FstVec. It MUST be a linear FST or a vector of linear FSTs.
- Returns
Return an FST or FstVec, where each FST contains epsilon self-loops but contains no epsilon transitions for arcs that are not self-loops.
mutual_information_recursion
- k2.mutual_information_recursion(px, py, boundary=None, return_grad=False)[source]
A recursion that is useful in computing mutual information between two sequences of real vectors, but may be useful more generally in sequence-to-sequence tasks where monotonic alignment between pairs of sequences is desired. The definitions of the arguments are definitions that would be used when computing this type of mutual information, but you can also view them as arbitrary quantities and just make use of the formula computed by this function.
- Parameters
px (
Tensor
) –A torch.Tensor of some floating point type, with shape
[B][S][T+1]
, whereB
is the batch size,S
is the length of thex
sequence (including representations ofEOS
symbols but notBOS
symbols), andT
is the length of they
sequence (including representations ofEOS
symbols but notBOS
symbols). In the mutual information application,px[b][s][t]
would represent the following log odds ratio; ignoring the b index on the right to make the notation more compact:px[b][s][t] = log [ p(x_s | x_{0..s-1}, y_{0..t-1}) / p(x_s) ]
This expression also implicitly includes the log-probability of choosing to generate an
x
value as opposed to ay
value. In practice it might be computed asa + b
, wherea
is the log probability of choosing to extend the sequence of length(s,t)
with anx
as opposed to ay
value; andb
might in practice be of the form:log(N exp f(x_s, y_{t-1}) / sum_t' exp f(x_s, y_t'))
where
N
is the number of terms that the sum overt'
included, which might include some or all of the other sequences as well as this one.Note
we don’t require
px
andpy
to be contiguous, but the code assumes for optimization purposes that theT
axis has stride 1.py (
Tensor
) –A torch.Tensor of the same dtype as
px
, with shape[B][S+1][T]
, representing:py[b][s][t] = log [ p(y_t | x_{0..s-1}, y_{0..t-1}) / p(y_t) ]
This function does not treat
x
andy
differently; the only difference is that for optimization purposes we assume the last axis (thet
axis) has stride of 1; this is true ifpx
andpy
are contiguous.boundary (
Optional
[Tensor
]) – If supplied, a torch.LongTensor of shape[B][4]
, where each row contains[s_begin, t_begin, s_end, t_end]
, with0 <= s_begin <= s_end <= S
and0 <= t_begin <= t_end < T
(this implies that empty sequences are allowed). If not supplied, the values[0, 0, S, T]
will be assumed. These are the beginning and one-past-the-last positions in thex
andy
sequences respectively, and can be used if not all sequences are of the same length.return_grad (
bool
) – Whether to return grads ofpx
andpy
, this grad standing for the occupation probability; it is the output of backward with a fake gradient, where the fake gradient is the same as the gradient you’d get if you did torch.autograd.grad(scores.sum(), [px, py]). This is useful for implementing the pruned version of the rnnt loss.
- Return type
Union
[Tuple
[Tensor
,Tuple
[Tensor
,Tensor
]],Tensor
]- Returns
Returns a torch.Tensor of shape
[B]
, containing the log of the mutual information between the b’th pair of sequences. This is defined by the following recursion onp[b,s,t]
(wherep
is of shape[B,S+1,T+1]
), representing a mutual information between sub-sequences of lengthss
andt
:
p[b,0,0] = 0.0
if !modified:
    p[b,s,t] = log_add(p[b,s-1,t] + px[b,s-1,t],
                       p[b,s,t-1] + py[b,s,t-1])
if modified:
    p[b,s,t] = log_add(p[b,s-1,t-1] + px[b,s-1,t-1],
                       p[b,s,t-1] + py[b,s,t-1])
where we handle edge cases by treating quantities with negative indexes as -infinity. The extension to cases where the boundaries are specified should be obvious; it just works on shorter sequences with offsets into
px
andpy
.
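Example: a minimal sketch with random inputs; the shapes follow the parameter docs above and the values are arbitrary:

import torch
import k2

B, S, T = 2, 3, 4
px = torch.randn(B, S, T + 1)  # [B][S][T+1]
py = torch.randn(B, S + 1, T)  # [B][S+1][T]
mi = k2.mutual_information_recursion(px, py)  # a tensor of shape [B]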
mwer_loss
- k2.mwer_loss(lattice, ref_texts, nbest_scale=0.5, num_paths=200, temperature=1.0, use_double_scores=True, reduction='sum')[source]
Compute the Minimum Word Error Rate (MWER) loss given a lattice and the corresponding ref_texts.
- Parameters
lattice – An FsaVec with axes [utt][state][arc].
ref_texts –
- It can be one of the following types:
A list of list-of-integers, e.g., [ [1, 2], [1, 2, 3] ]
An instance of
k2.RaggedTensor
. Must have num_axes == 2 and with dtype torch.int32.
nbest_scale – Scale lattice.score before passing it to
k2.random_paths()
. A smaller value leads to more unique paths, at the risk of failing to sample the path with the best score.num_paths – Number of paths to sample from the lattice using
k2.random_paths()
.temperature – For long utterances, the dynamic range of scores will be too large and the posteriors will be mostly 0 or 1. To prevent this it might be a good idea to have an extra argument that functions like a temperature. We scale the logprobs by 1/temperature before doing the normalization.
use_double_scores – True to use double precision floating point. False to use single precision.
reduction (
Literal
[‘none’, ‘mean’, ‘sum’]) –Specifies the reduction to apply to the output: ‘none’ | ‘sum’ | ‘mean’.
‘none’: no reduction will be applied. The returned loss is a k2.RaggedTensor, with loss.tot_size(0) == batch_size and loss.tot_size(1) == total_num_paths_of_current_batch. If you want the MWER loss for each utterance, just do loss_per_utt = loss.sum(); then loss_per_utt.shape[0] should be batch_size. See more example usages in ‘k2/python/tests/mwer_test.py’.
‘sum’: sum the loss of each path over the whole batch.
‘mean’: divide the above ‘sum’ by the total number of paths over the whole batch.
- Return type
Union
[Tensor
,RaggedTensor
]- Returns
Minimum Word Error Rate loss.
one_best_decoding
- k2.one_best_decoding(lattice, use_double_scores=True)[source]
Get the best path from a lattice.
- Parameters
lattice (
Fsa
) – The decoding lattice returned byget_lattice()
.use_double_scores (
bool
) – True to use double precision floating point in the computation. False to use single precision.
- Return type
Fsa
- Returns
An FsaVec containing linear paths.
properties_to_str
prune_on_arc_post
- k2.prune_on_arc_post(fsas, threshold_prob, use_double_scores)[source]
Remove arcs whose posteriors are less than the given threshold.
- Parameters
fsas (
Fsa
) – An FsaVec. Must have 3 axes.threshold_prob (
float
) – Arcs whose posteriors are less than this value are removed. Note: 0 < threshold_prob < 1.use_double_scores (
bool
) – True to use double precision during computation; False to use single precision.
- Return type
Fsa
- Returns
Return a pruned FsaVec.
pruned_ranges_to_lattice
- k2.pruned_ranges_to_lattice(ranges: torch.Tensor, frames: torch.Tensor, symbols: torch.Tensor, logits: torch.Tensor) Tuple[_k2.RaggedArc, torch.Tensor]
random_fsa
- k2.random_fsa(acyclic=True, max_symbol=50, min_num_arcs=0, max_num_arcs=1000)[source]
Generate a random Fsa.
- Parameters
acyclic (
bool
) – If true, generated Fsa will be acyclic.max_symbol (
int
) – Maximum symbol on arcs. Generated arc symbols will be in the range [-1, max_symbol]; note -1 is kFinalSymbol. Must be at least 0.min_num_arcs (
int
) – Minimum number of arcs; must be at least 0.max_num_arcs (
int
) – Maximum number of arcs; must be >= min_num_arcs.
- Return type
Fsa
random_fsa_vec
- k2.random_fsa_vec(min_num_fsas=1, max_num_fsas=1000, acyclic=True, max_symbol=50, min_num_arcs=0, max_num_arcs=1000)[source]
Generate a random FsaVec.
- Parameters
min_num_fsas (
int
) – Minimum number of fsas we’ll generate in the returned FsaVec; must be at least 1.max_num_fsas (
int
) – Maximum number of fsas we’ll generate in the returned FsaVec; must be >= min_num_fsas.acyclic (
bool
) – If true, generated Fsas will be acyclic.max_symbol (
int
) – Maximum symbol on arcs. Generated arcs’ symbols will be in the range [-1, max_symbol]; note -1 is kFinalSymbol. Must be at least 0.min_num_arcs (
int
) – Minimum number of arcs in each Fsa; must be at least 0.max_num_arcs (
int
) – Maximum number of arcs in each Fsa; must be >= min_num_arcs.
- Return type
Fsa
random_paths
- k2.random_paths(fsas, use_double_scores, num_paths)[source]
Compute pseudo-random paths through the FSAs in this vector of FSAs (this object must have 3 axes, self.arcs.num_axes() == 3)
Caution
It does not support autograd.
Caution
Do not be confused by the function name. There is no randomness at all, thus no seed. It uses a deterministic algorithm internally, similar to arithmetic coding (see https://en.wikipedia.org/wiki/Arithmetic_coding).
Look into the C++ implementation code for more details.
- Parameters
fsas (
Fsa
) – A FsaVec, i.e., len(fsas.shape) == 3use_double_scores (
bool
) – If true, do computation with double-precision, else float (single-precision)num_paths (
int
) – Number of paths requested through each FSA. FSAs that have no successful paths will have zero paths returned.
- Returns
[fsa][path][arc_pos]; the final sub-lists (indexed with arc_pos) are sequences of arcs starting from the start state and terminating in the final state. The values are arc_idx012, i.e. arc indexes.
- Return type
Returns a k2.RaggedTensor (dtype is torch.int32) with 3 axes
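Example: a minimal sketch (the FSA below is made-up example data):

import k2

fsa = k2.Fsa.from_str('''
0 1 1 0.1
0 1 2 0.2
1 2 -1 0.0
2
''')
fsa_vec = k2.create_fsa_vec([fsa])
paths = k2.random_paths(fsa_vec, use_double_scores=True, num_paths=2)
# paths is a k2.RaggedTensor with axes [fsa][path][arc_pos];
# its values are arc_idx012 into fsa_vec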
remove_epsilon
- k2.remove_epsilon(fsa)[source]
Remove epsilons (symbol zero) in the input Fsa.
Caution
Call
k2.connect()
if you are using a GPU version.- Parameters
fsa (
Fsa
) –The input FSA. It can be either a single FSA or an FsaVec. Works either for CPU or GPU, but the algorithm is different. We can only use the CPU algorithm if the input is top-sorted, and the GPU algorithm, while it works for CPU, may not be very fast.
fsa must be free of epsilon loops that have score greater than 0.
- Return type
Fsa
- Returns
The resulting Fsa is equivalent to the input fsa under the tropical semiring but will be epsilon-free. Any linear tensor attributes, such as ‘aux_labels’, will have been turned into ragged labels after removing fillers (i.e. labels whose value equals fsa.XXX_filler if the attribute name is XXX), counting -1’s on final-arcs as fillers even if the filler value for that attribute is not -1.
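Example: a minimal sketch on CPU with a top-sorted input (the FSA below is made-up data; label 0 is epsilon):

import k2

fsa = k2.Fsa.from_str('''
0 1 0 0.1
1 2 1 0.2
2 3 -1 0.3
3
''')
fsa_no_eps = k2.remove_epsilon(fsa)  # equivalent under the tropical semiring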
remove_epsilon_and_add_self_loops
- k2.remove_epsilon_and_add_self_loops(fsa, remove_filler=True)[source]
Remove epsilons (symbol zero) in the input Fsa, and then add epsilon self-loops to all states in the input Fsa (usually as a preparation for intersection with treat_epsilons_specially=0).
Caution
Call
k2.connect()
if you are using a GPU version.- Parameters
fsa (
Fsa
) – The input FSA. It can be either a single FSA or an FsaVec.remove_filler (
bool
) – If true, we will remove any filler values of attributes when converting linear to ragged attributes.
- Return type
Fsa
- Returns
The resulting Fsa. See
remove_epsilon()
for details. The only epsilons will be epsilon self-loops on all states.
remove_epsilon_self_loops
- k2.remove_epsilon_self_loops(fsa)[source]
Remove epsilon self-loops of an Fsa or an FsaVec.
Caution
Unlike
remove_epsilon()
, this function removes only epsilon self-loops.- Parameters
fsa (
Fsa
) – The input FSA. It can be either a single FSA or an FsaVec.- Return type
Fsa
- Returns
An instance of
Fsa
that has no epsilon self-loops on any non-final state.
replace_fsa
- k2.replace_fsa(src, index, symbol_begin_range=1, ret_arc_map=False)[source]
Replace arcs in the index FSA with the corresponding FSAs in a vector of FSAs (src). Arcs in index whose label satisfies symbol_begin_range <= label < symbol_begin_range + src.Dim0() will be replaced with the FSA indexed label - symbol_begin_range in src. The destination state of the arc in index is identified with the final-state of the corresponding FSA in src, and the arc in index will become an epsilon arc leading to a new state in the output that is a copy of the start-state of the corresponding FSA in src. Arcs with labels outside this range are just copied. Labels on final-arcs in src (which will be -1) are set to 0 (epsilon) in the result FSA.
Caution
Attributes of the result are inherited from index and src via arc_map_index and arc_map_src. If there are attributes with the same name, only attributes with dtype torch.float32 are supported; the other kinds of attributes are discarded. See docs in fsa_from_binary_function_tensor for details.
- Parameters
src (
Fsa
) – Fsa that we’ll be inserting into the result, MUST have 3 axes.index (
Fsa
) – The Fsa that is to be replaced. It can be a single FSA or a vector of FSAs.symbol_begin_range – Beginning of the range of symbols that are to be replaced with Fsas.
ret_arc_map (
bool
) – if true, will return a tuple (new_fsas, arc_map_index, arc_map_src) with arc_map_index and arc_map_src tensors of int32 that maps from arcs in the result to arcs in index and src , with -1’s for the arcs not mapped. If false, just returns new_fsas.
- Return type
Union
[Fsa
,Tuple
[Fsa
,Tensor
,Tensor
]]
reverse
- k2.reverse(fsa)[source]
Reverse the input Fsa. If the input Fsa accepts string ‘x’ with weight ‘x.weight’, then the reversed Fsa accepts the reverse of string ‘x’ with weight ‘x.weight.reverse’. As the Fsas of k2 run on the log semiring or tropical semiring, ‘weight.reverse’ will equal the original ‘weight’.
- Parameters
fsa (
Fsa
) – The input FSA. It can be either a single FSA or an FsaVec.- Return type
Fsa
- Returns
An instance of
Fsa
which has been reversed.
rnnt_loss
- k2.rnnt_loss(logits, symbols, termination_symbol, boundary=None, rnnt_type='regular', delay_penalty=0.0, reduction='mean')[source]
A normal RNN-T loss, which uses a ‘joiner’ network output as input, i.e. a 4-dimensional tensor.
- Parameters
logits (
Tensor
) – The output of joiner network, with shape (B, T, S + 1, C), i.e. batch, time_seq_len, symbol_seq_len+1, num_classessymbols (
Tensor
) – The symbol sequences, a LongTensor of shape [B][S], and elements in {0..C-1}.termination_symbol (
int
) – the termination symbol, with 0 <= termination_symbol < Cboundary (
Optional
[Tensor
]) – an optional LongTensor of shape [B, 4] with elements interpreted as [begin_symbol, begin_frame, end_symbol, end_frame] that is treated as [0, 0, S, T] if boundary is not supplied. Most likely you will want begin_symbol and begin_frame to be zero.rnnt_type (
str
) –Specifies the type of rnnt paths: regular, modified or constrained.
regular: The regular rnnt that takes you to the next frame only when emitting a blank (i.e., emitting a symbol does not take you to the next frame).
modified: A modified version of rnnt that takes you to the next frame whether you emit a blank or a non-blank symbol.
constrained: A version like the modified one that goes to the next frame when you emit a non-blank symbol, but this is done by “forcing” you to take the blank transition from the next context on the current frame, e.g. if we emit c given “a b” context, we are forced to emit “blank” given “b c” context on the current frame.
delay_penalty (
float
) – A constant value to penalize symbol delay, this may be needed when training with time masking, to avoid the time-masking encouraging the network to delay symbols. See https://github.com/k2-fsa/k2/issues/955 for more details.reduction (
Optional
[str
]) – Specifies the reduction to apply to the output: none, mean or sum. none: no reduction will be applied. mean: apply torch.mean over the batches. sum: the output will be summed. Default: mean
- Return type
Tensor
- Returns
If reduction is none, returns a tensor of shape (B,), containing the total RNN-T loss values for each element of the batch; otherwise a scalar with the reduction applied.
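Example: a minimal sketch with random data; the shapes follow the parameter docs above and the termination (blank) symbol is assumed to be 0:

import torch
import k2

B, T, S, C = 2, 10, 5, 20
logits = torch.randn(B, T, S + 1, C)
symbols = torch.randint(1, C, (B, S))  # non-blank symbols in {1..C-1}
loss = k2.rnnt_loss(logits, symbols, termination_symbol=0)  # scalar, since reduction defaults to 'mean'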
rnnt_loss_pruned
- k2.rnnt_loss_pruned(logits, symbols, ranges, termination_symbol, boundary=None, rnnt_type='regular', delay_penalty=0.0, reduction='mean', use_hat_loss=False)[source]
A RNN-T loss with pruning, which uses the output of a pruned ‘joiner’ network as input, i.e. a 4 dimensions tensor with shape (B, T, s_range, C), s_range means the number of symbols kept for each frame.
- Parameters
logits (
Tensor
) – The pruned output of joiner network, with shape (B, T, s_range, C), i.e. batch, time_seq_len, prune_range, num_classessymbols (
Tensor
) – A LongTensor of shape [B][S], containing the symbols at each position of the sequence.ranges (
Tensor
) – A tensor containing the symbol ids for each frame that we want to keep. It is a LongTensor of shape[B][T][s_range]
, whereranges[b,t,0]
contains the begin symbol0 <= s <= S - s_range +1
, such thatlogits[b,t,:,:]
represents the logits with positionss, s + 1, ... s + s_range - 1
. See docs inget_rnnt_prune_ranges()
for more details of what ranges contains.termination_symbol (
int
) – The identity of the termination symbol, must be in {0..C-1}boundary (
Optional
[Tensor
]) – a LongTensor of shape [B, 4] with elements interpreted as [begin_symbol, begin_frame, end_symbol, end_frame] that is treated as [0, 0, S, T] if boundary is not supplied. Most likely you will want begin_symbol and begin_frame to be zero.rnnt_type (
str
) –Specifies the type of rnnt paths: regular, modified or constrained.
regular: The regular rnnt that takes you to the next frame only when emitting a blank (i.e., emitting a symbol does not take you to the next frame).
modified: A modified version of rnnt that takes you to the next frame whether you emit a blank or a non-blank symbol.
constrained: A version like the modified one that goes to the next frame when you emit a non-blank symbol, but this is done by “forcing” you to take the blank transition from the next context on the current frame, e.g. if we emit c given “a b” context, we are forced to emit “blank” given “b c” context on the current frame.
delay_penalty (
float
) – A constant value to penalize symbol delay, this may be needed when training with time masking, to avoid the time-masking encouraging the network to delay symbols. See https://github.com/k2-fsa/k2/issues/955 for more details.reduction (
Optional
[str
]) – Specifies the reduction to apply to the output: none, mean or sum. none: no reduction will be applied. mean: apply torch.mean over the batches. sum: the output will be summed. Default: meanuse_hat_loss (
bool
) – If True, we compute the Hybrid Autoregressive Transducer (HAT) loss from https://arxiv.org/abs/2003.07705. This is a variant of RNN-T that models the blank distribution separately as a Bernoulli distribution, and the non-blanks are modeled as a multinomial. This formulation may be useful for performing internal LM estimation, as described in the paper.
- Return type
Tensor
- Returns
If reduction is none, returns a tensor of shape (B,), containing the total RNN-T loss values for each sequence of the batch, otherwise a scalar with the reduction applied.
rnnt_loss_simple
- k2.rnnt_loss_simple(lm, am, symbols, termination_symbol, boundary=None, rnnt_type='regular', delay_penalty=0.0, reduction='mean', return_grad=False)[source]
A simple case of the RNN-T loss, where the ‘joiner’ network is just addition.
- Parameters
lm (
Tensor
) – language-model part of unnormalized log-probs of symbols, with shape (B, S+1, C), i.e. batch, symbol_seq_len+1, num_classesam (
Tensor
) – acoustic-model part of unnormalized log-probs of symbols, with shape (B, T, C), i.e. batch, frame, num_classessymbols (
Tensor
) – the symbol sequences, a LongTensor of shape [B][S], and elements in {0..C-1}.termination_symbol (
int
) – the termination symbol, with 0 <= termination_symbol < Cboundary (
Optional
[Tensor
]) – an optional LongTensor of shape [B, 4] with elements interpreted as [begin_symbol, begin_frame, end_symbol, end_frame] that is treated as [0, 0, S, T] if boundary is not supplied. Most likely you will want begin_symbol and begin_frame to be zero.rnnt_type (
str
) –Specifies the type of rnnt paths: regular, modified or constrained.
regular: The regular rnnt that takes you to the next frame only when emitting a blank (i.e., emitting a symbol does not take you to the next frame).
modified: A modified version of rnnt that takes you to the next frame whether you emit a blank or a non-blank symbol.
constrained: A version like the modified one that goes to the next frame when you emit a non-blank symbol, but this is done by “forcing” you to take the blank transition from the next context on the current frame, e.g. if we emit c given “a b” context, we are forced to emit “blank” given “b c” context on the current frame.
delay_penalty (
float
) – A constant value to penalize symbol delay, this may be needed when training with time masking, to avoid the time-masking encouraging the network to delay symbols. See https://github.com/k2-fsa/k2/issues/955 for more details.reduction (
Optional
[str
]) – Specifies the reduction to apply to the output: none, mean or sum. none: no reduction will be applied. mean: apply torch.mean over the batches. sum: the output will be summed. Default: meanreturn_grad (
bool
) – Whether to return grads of px and py; this grad standing for the occupation probability is the output of backward with a fake gradient, where the fake gradient is the same as the gradient you’d get if you did torch.autograd.grad(-loss.sum(), [px, py]); note, the loss here is the loss with reduction “none”. This is useful for implementing the pruned version of the rnnt loss.
- Return type
Union
[Tensor
,Tuple
[Tensor
,Tuple
[Tensor
,Tensor
]]]- Returns
If return_grad is False, returns a tensor of shape (B,) containing the total RNN-T loss values for each element of the batch if reduction equals “none”, otherwise a scalar with the reduction applied. If return_grad is True, the grads of px and py, which are the output of backward with a fake gradient (see above), will be returned too, and the returned value will be a tuple like (loss, (px_grad, py_grad)).
rnnt_loss_smoothed
- k2.rnnt_loss_smoothed(lm, am, symbols, termination_symbol, lm_only_scale=0.1, am_only_scale=0.1, boundary=None, rnnt_type='regular', delay_penalty=0.0, reduction='mean', return_grad=False)[source]
A simple case of the RNN-T loss, where the ‘joiner’ network is just addition.
- Parameters
lm (
Tensor
) – language-model part of unnormalized log-probs of symbols, with shape (B, S+1, C), i.e. batch, symbol_seq_len+1, num_classes. These are assumed to be well-normalized, in the sense that we could use them as probabilities separately from the am scoresam (
Tensor
) – acoustic-model part of unnormalized log-probs of symbols, with shape (B, T, C), i.e. batch, frame, num_classessymbols (
Tensor
) – the symbol sequences, a LongTensor of shape [B][S], and elements in {0..C-1}.termination_symbol (
int
) – the termination symbol, with 0 <= termination_symbol < Clm_only_scale (
float
) – the scale on the “LM-only” part of the loss.am_only_scale (
float
) – the scale on the “AM-only” part of the loss, for which we use an “averaged” LM (averaged over all histories, so effectively unigram).boundary (
Optional
[Tensor
]) – a LongTensor of shape [B, 4] with elements interpreted as [begin_symbol, begin_frame, end_symbol, end_frame] that is treated as [0, 0, S, T] if boundary is not supplied. Most likely you will want begin_symbol and begin_frame to be zero.rnnt_type (
str
) –Specifies the type of rnnt paths: regular, modified or constrained.
regular: The regular rnnt that takes you to the next frame only when emitting a blank (i.e., emitting a symbol does not take you to the next frame).
modified: A modified version of rnnt that takes you to the next frame whether you emit a blank or a non-blank symbol.
constrained: A version like the modified one that goes to the next frame when you emit a non-blank symbol, but this is done by “forcing” you to take the blank transition from the next context on the current frame, e.g. if we emit c given “a b” context, we are forced to emit “blank” given “b c” context on the current frame.
delay_penalty (
float
) – A constant value to penalize symbol delay, this may be needed when training with time masking, to avoid the time-masking encouraging the network to delay symbols. See https://github.com/k2-fsa/k2/issues/955 for more details.reduction (
Optional
[str
]) – Specifies the reduction to apply to the output: none, mean or sum. none: no reduction will be applied. mean: apply torch.mean over the batches. sum: the output will be summed. Default: meanreturn_grad (
bool
) – Whether to return grads of px and py; this grad standing for the occupation probability is the output of backward with a fake gradient, where the fake gradient is the same as the gradient you’d get if you did torch.autograd.grad(-loss.sum(), [px, py]); note, the loss here is the loss with reduction “none”. This is useful for implementing the pruned version of the rnnt loss.
- Return type
Union
[Tuple
[Tensor
,Tuple
[Tensor
,Tensor
]],Tensor
]- Returns
If return_grad is False, returns a tensor of shape (B,) containing the total RNN-T loss values for each element of the batch if reduction equals “none”, otherwise a scalar with the reduction applied. If return_grad is True, the grads of px and py, which are the output of backward with a fake gradient (see above), will be returned too, and the returned value will be a tuple like (loss, (px_grad, py_grad)).
shortest_path
- k2.shortest_path(fsa, use_double_scores)[source]
Return the shortest paths as linear FSAs from the start state to the final state in the tropical semiring.
Note
It uses the opposite sign. That is, it uses max instead of min.
- Parameters
fsa (
Fsa
) – The input FSA. It can be either a single FSA or an FsaVec.use_double_scores (
bool
) – False to use float, i.e., single precision floating point, for scores. True to use double.
- Return type
Fsa
- Returns
An FsaVec containing the best paths as linear FSAs.
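Example: a minimal sketch (the FSA below is made-up data; the arc with label 2 has the larger score, so it is on the returned path):

import k2

fsa = k2.Fsa.from_str('''
0 1 1 0.1
0 1 2 0.5
1 2 -1 0.2
2
''')
best = k2.shortest_path(fsa, use_double_scores=True)  # an FsaVec of linear FSAs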
simple_ragged_index_select
- k2.simple_ragged_index_select(src: torch.Tensor, indexes: k2::RaggedAny) torch.Tensor
swoosh_l
- k2.swoosh_l(x: torch.Tensor, dropout_prob: float = 0.0) torch.Tensor
Compute
swoosh_l(x) = log(1 + exp(x-4)) - 0.08x - 0.035
, and optionally apply dropout. If x.requires_grad is True, it returns dropout(swoosh_l(x)). In order to reduce memory, the function derivative swoosh_l'(x)
is encoded into 8 bits. If x.requires_grad is False, it returns swoosh_l(x)
.- Parameters
x – A Tensor.
dropout_prob – A float number. The default value is 0.
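Example: a minimal sketch (x is arbitrary random data):

import torch
import k2

x = torch.randn(100, requires_grad=True)
y = k2.swoosh_l(x, dropout_prob=0.1)  # dropout(swoosh_l(x)); derivative stored in 8 bits
z = k2.swoosh_l(x.detach())           # plain swoosh_l(x), no dropout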
swoosh_l_forward
- k2.swoosh_l_forward(x: torch.Tensor) torch.Tensor
Compute
swoosh_l(x) = log(1 + exp(x-4)) - 0.08x - 0.035
.- Parameters
x – A Tensor.
swoosh_l_forward_and_deriv
- k2.swoosh_l_forward_and_deriv(x: torch.Tensor) Tuple[torch.Tensor, torch.Tensor]
Compute
swoosh_l(x) = log(1 + exp(x-4)) - 0.08x - 0.035
, and also the derivativeswoosh_l'(x) = 0.92 - 1 / (1 + exp(x-4))
.Note
\[\begin{split}\text{swoosh_l}'(x) &= -0.08 + \exp(x-4) / (1 + \exp(x-4)) \\ &= -0.08 + (1 - 1 / (1 + \exp(x-4))) \\ &= 0.92 - 1 / (1 + \exp(x-4))\end{split}\]1 + exp(x-4)
might be infinity, but1 / (1 + exp(x-4))
will be 0 in that case. This is partly why we rearranged the expression above, to avoid infinity / infinity = nan.- Parameters
x – A Tensor.
swoosh_r
- k2.swoosh_r(x: torch.Tensor, dropout_prob: float = 0.0) torch.Tensor
Compute
swoosh_r(x) = log(1 + exp(x-1)) - 0.08x - 0.313261687
, and optionally apply dropout. If x.requires_grad is True, it returns dropout(swoosh_r(x)). In order to reduce memory, the function derivative swoosh_r'(x)
is encoded into 8 bits. If x.requires_grad is False, it returns swoosh_r(x)
.- Parameters
x – A Tensor.
dropout_prob – A float number. The default value is 0.
swoosh_r_forward
- k2.swoosh_r_forward(x: torch.Tensor) torch.Tensor
Compute
swoosh_r(x) = log(1 + exp(x-1)) - 0.08x - 0.313261687
.- Parameters
x – A Tensor.
swoosh_r_forward_and_deriv
- k2.swoosh_r_forward_and_deriv(x: torch.Tensor) Tuple[torch.Tensor, torch.Tensor]
Compute
swoosh_r(x) = log(1 + exp(x-1)) - 0.08x - 0.313261687
, and also the derivativeswoosh_r'(x) = 0.92 - 1 / (1 + exp(x-1))
.Note
\[\begin{split}\text{swoosh_r}'(x) &= -0.08 + \exp(x-1) / (1 + \exp(x-1)) \\ &= -0.08 + (1 - 1 / (1 + \exp(x-1))) \\ &= 0.92 - 1 / (1 + \exp(x-1))\end{split}\]1 + exp(x-1)
might be infinity, but1 / (1 + exp(x-1))
will be 0 in that case. This is partly why we rearranged the expression above, to avoid infinity / infinity = nan.- Parameters
x – A Tensor.
to_dot
- k2.to_dot(fsa, title=None)[source]
Visualize an Fsa via graphviz.
Note
Graphviz is needed only when this function is called.
- Parameters
fsa (
Fsa
) – The input FSA to be visualized.title (
Optional
[str
]) – Optional. The title of the resulting visualization.
- Return type
Digraph
- Returns
a Digraph from graphviz.
to_str
- k2.to_str(fsa, openfst=False)[source]
Convert an Fsa to a string. This version prints out all integer labels and integer ragged labels on the same line as each arc, the same format accepted by Fsa.from_str().
Note
The returned string can be used to construct an Fsa with Fsa.from_str(), but you would need to know the names of the auxiliary labels and ragged labels.
- Parameters
openfst (
bool
) – Optional. If true, we negate the scores during the conversion.- Return type
str
- Returns
A string representation of the Fsa.
to_str_simple
- k2.to_str_simple(fsa, openfst=False)[source]
Convert an Fsa to a string. This is less complete than Fsa.to_str(), fsa.__str__(), or to_str_full(): it prints only fsa.aux_labels (no ragged labels) and does not print any other attributes. It is used in testing.
Note
The returned string can be used to construct an Fsa. See also to_str().
- Parameters
openfst (
bool
) – Optional. If true, we negate the scores during the conversion.- Return type
str
- Returns
A string representation of the Fsa.
to_tensor
- k2.to_tensor(fsa)[source]
Convert an Fsa to a Tensor.
You can save the tensor to disk and read it later to construct an Fsa.
Note
The returned Tensor contains only the transition rules, e.g., arcs. You may want to save its aux_labels separately if any.
- Parameters
fsa (
Fsa
) – The input Fsa.- Return type
Tensor
- Returns
A torch.Tensor of dtype torch.int32. It is a 2-D tensor if the input is a single FSA. It is a 1-D tensor if the input is a vector of FSAs.
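Example: a sketch of a save/load round trip (the file name is arbitrary):

import torch
import k2

fsa = k2.Fsa.from_str('''
0 1 1 0.1
1 2 -1 0.2
2
''')
tensor = k2.to_tensor(fsa)
torch.save(tensor, 'fsa.pt')
fsa2 = k2.Fsa(torch.load('fsa.pt'))  # aux_labels, if any, must be saved separately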
top_sort
- k2.top_sort(fsa)[source]
Sort an FSA topologically.
Note
It returns a new FSA. The input FSA is NOT changed.
- Parameters
fsa (
Fsa
) – The input FSA to be sorted. It can be either a single FSA or a vector of FSAs.- Return type
Fsa
- Returns
It returns a single FSA if the input is a single FSA; it returns a vector of FSAs if the input is a vector of FSAs.
trivial_graph
- k2.trivial_graph(max_token, device=None)[source]
Create a trivial graph which has only two states. On state 0, there are max_token self-loops (i.e., a loop for each symbol from 1 to max_token), and state 1 is the final state.
- Parameters
max_token (
int
) – The maximum token ID (inclusive). We assume that token IDs are contiguous (from 1 to max_token).device (
Union
[device
,str
,None
]) – Optional. It can be either a string (e.g., ‘cpu’, ‘cuda:0’) or a torch.device. If it is None, then the returned FSA is on CPU.
- Return type
Fsa
- Returns
Returns the expected trivial graph on the given device. Note: The returned graph does not contain arcs with label being 0.
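Example: a minimal sketch:

import k2

graph = k2.trivial_graph(3)
# State 0 has self-loops with labels 1, 2 and 3, plus an arc with label -1
# to the final state 1.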
union
CtcLoss
forward
- CtcLoss.forward(decoding_graph, dense_fsa_vec, delay_penalty=0.0, target_lengths=None)[source]
Compute the CTC loss given a decoding graph and a dense fsa vector.
- Parameters
decoding_graph (
Fsa
) – An FsaVec. It can be the composition result of a CTC topology and a transcript.dense_fsa_vec (
DenseFsaVec
) – It represents the neural network output. Refer to the help information ink2.DenseFsaVec
.delay_penalty (
float
) – A constant to penalize symbol delay, which is used to make symbols emit earlier for streaming models. It is almost the same as the delay_penalty in our rnnt_loss. See https://github.com/k2-fsa/k2/issues/955 and https://arxiv.org/pdf/2211.00490.pdf for more details.target_lengths (
Optional
[Tensor
]) – Used only when reduction is mean. It is a 1-D tensor of batch size representing lengths of the targets, e.g., number of phones or number of word pieces in a sentence.
- Return type
Tensor
- Returns
If reduction is none, return a 1-D tensor with size equal to batch size. If reduction is mean or sum, return a scalar.
DecodeStateInfo
DenseFsaVec
__init__
- DenseFsaVec.__init__(log_probs, supervision_segments, allow_truncate=0)[source]
Construct a DenseFsaVec from neural net log-softmax outputs.
- Parameters
log_probs (
Tensor
) – A 3-D tensor of dtype torch.float32 with shape (N, T, C), where N is the number of sequences, T the maximum input length, and C the number of output classes.supervision_segments (
Tensor
) –A 2-D CPU tensor of dtype torch.int32 with 3 columns. Each row contains information for a supervision segment. Column 0 is the sequence_index indicating which sequence this segment comes from; column 1 specifies the start_frame of this segment within the sequence; column 2 contains the duration of this segment.
Note
0 < start_frame + duration <= T + allow_truncate
0 <= start_frame < T
duration > 0
Caution
If the resulting dense fsa vec is used as an input to k2.intersect_dense, then the last column, i.e., the duration column, has to be sorted in decreasing order. That is, the first supervision_segment (the first row) has the largest duration. Otherwise, you don’t need to sort the last column.
k2.intersect_dense is often used in the training stage, so you should usually sort dense fsa vecs by its duration in training. k2.intersect_dense_pruned is usually used in the decoding stage, so you don’t need to sort dense fsa vecs in decoding.
allow_truncate (
int
) – If not zero, it truncates at most this number of frames from duration in case start_frame + duration > T.
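Example: a minimal sketch with random log-probs; the shapes and durations are arbitrary, with durations sorted in decreasing order so that the result could be fed to k2.intersect_dense:

import torch
import k2

N, T, C = 2, 10, 5
log_probs = torch.randn(N, T, C).log_softmax(dim=-1)
supervision_segments = torch.tensor(
    [[0, 0, 10],  # sequence 0: start frame 0, duration 10
     [1, 0, 8]],  # sequence 1: start frame 0, duration 8
    dtype=torch.int32)
dense_fsa_vec = k2.DenseFsaVec(log_probs, supervision_segments)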
_from_dense_fsa_vec
- classmethod DenseFsaVec._from_dense_fsa_vec(dense_fsa_vec, scores)[source]
Construct a DenseFsaVec from _k2.DenseFsaVec and scores.
Note
It is intended for internal use. Users will normally not use it.
- Parameters
dense_fsa_vec (
DenseFsaVec
) – An instance of _k2.DenseFsaVec.scores (
Tensor
) – The scores of _k2.DenseFsaVec for back propagation.
- Return type
DenseFsaVec
- Returns
An instance of DenseFsaVec.
to
- DenseFsaVec.to(device)[source]
Move the DenseFsaVec onto a given device.
- Parameters
device (
Union
[device
,str
]) – An instance of torch.device or a string that can be used to construct a torch.device, e.g., ‘cpu’, ‘cuda:0’. It supports only cpu and cuda devices.- Return type
DenseFsaVec
- Returns
Returns a new DenseFsaVec which is this object copied to the given device (or this object itself, if the device was the same).
device
- DenseFsaVec.device
- Return type
device
duration
- DenseFsaVec.duration
Return the duration (on CPU) of each seq.
- Return type
Tensor
DeterminizeWeightPushingType
name
- DeterminizeWeightPushingType.name
value
- DeterminizeWeightPushingType.value
Fsa
__getattr__
- Fsa.__getattr__(name)[source]
Note: for attributes that exist as properties, e.g. self.labels, self.properties, self.requires_grad, we won’t reach this code because Python checks the class dict before calling getattr. The same is true for instance attributes such as self.{_tensor_attr,_non_tensor_attr,_cache,_properties}
The ‘virtual’ members of this class are those in self._tensor_attr and self._non_tensor_attr.
- Return type
Any
__getitem__
__init__
- Fsa.__init__(arcs, aux_labels=None, properties=None)[source]
Build an Fsa from a tensor with optional aux_labels.
It is useful when loading an Fsa from file.
- Parameters
arcs (
Union
[Tensor
,RaggedArc
When arcs is an instance of torch.Tensor, it is a torch tensor of dtype torch.int32 with 4 columns. Each row represents an arc. Column 0 is the src_state, column 1 the dest_state, column 2 the label, and column 3 the score. When arcs is an instance of _k2.RaggedArc, it is a Ragged containing _k2.Arc returned by internal functions (i.e. C++/CUDA functions) or obtained from another Fsa object via fsa.arcs.
Caution
Scores are floats and their binary pattern is reinterpreted as integers and saved in a tensor of dtype torch.int32.
aux_labels (
Union
[Tensor
,RaggedTensor
,None
]) – Optional. If not None, it associates an aux_label with every arc, so it has as many entries as there are arcs. It is a 1-D tensor of dtype torch.int32 or a k2.RaggedTensor whose dim0 equals the number of arcs.properties – Tensor properties if known (should only be provided by internal code, as they are not checked; intended for use by
clone()
)
- Returns
An instance of Fsa.
__setattr__
__str__
_get_arc_post
- Fsa._get_arc_post(use_double_scores, log_semiring)[source]
Compute scores on arcs, representing log probabilities; with log_semiring=True you could call these log posteriors, but if log_semiring=False they can only be interpreted as the difference between the best-path score and the score of the best path that includes this arc.
This version is not differentiable; see also
get_arc_post()
.- Parameters
use_double_scores (
bool
) – if True, use double precision.log_semiring (
bool
) – if True, use log semiring, else tropical.
- Return type
Tensor
- Returns
A torch.Tensor with shape equal to (num_arcs,) and non-positive elements.
_get_backward_scores
- Fsa._get_backward_scores(use_double_scores, log_semiring)[source]
Compute backward-scores, i.e. total weight (or best-path weight) from each state to the final state.
For internal k2 use. Not differentiable.
See also
get_backward_scores()
which is differentiable.- Parameters
use_double_scores (
bool
) – True to use double precision floating point. False to use single precision.log_semiring (
bool
) – True to use log semiring (log-sum), false to use tropical (i.e. max on scores).
- Return type
Tensor
- Returns
A torch.Tensor with shape equal to (num_states,)
_get_entering_arcs
_get_forward_scores
- Fsa._get_forward_scores(use_double_scores, log_semiring)[source]
Get (and compute if necessary) cached property self.forward_scores_xxx_yyy (where xxx indicates float-type and yyy indicates semiring).
For use by internal k2 code; returns the total score from start-state to each state. Not differentiable; see
get_forward_scores()
which is the differentiable version.- Parameters
use_double_scores (
bool
) – True to use double precision floating point. False to use single precision.log_semiring (
bool
) – True to use log semiring (log-sum), false to use tropical (i.e. max on scores).
- Return type
Tensor
_get_tot_scores
- Fsa._get_tot_scores(use_double_scores, log_semiring)[source]
Compute total-scores (one per FSA) as the best-path score.
This version is not differentiable; see also
get_tot_scores()
which is differentiable.- Parameters
use_double_scores (
bool
) – If True, use double precision floating point; false; else single precision.log_semiring (
bool
) – True to use log semiring (log-sum), false to use tropical (i.e. max on scores).
- Return type
Tensor
_invalidate_cache_
- Fsa._invalidate_cache_(scores_only=True)[source]
Intended for internal use only so its name begins with an underline.
Also, it changes self in-place.
Currently, it is used only when the scores field is re-assigned.
- Parameters
scores_only (
bool
) – If True, it invalidates only cached entries related to scores. If False, the whole cache is invalidated.- Return type
None
as_dict
convert_attr_to_ragged
- Fsa.convert_attr_to_ragged_(name, remove_eps=True)[source]
Convert the attribute given by name from a 1-D torch.tensor to a k2.RaggedTensor.
Caution
This function ends with an underscore, meaning it changes the FSA in-place.
- Parameters
name (
str
) – The attribute name. This attribute is expected to be a 1-D tensor with dtype torch.int32.remove_eps (
bool
) – True to remove 0s in the resulting ragged tensor.
- Return type
Fsa
- Returns
Return self.
draw
- Fsa.draw(filename, title=None)[source]
Render FSA as an image via graphviz, and return the Digraph object; and optionally save to file filename. filename must have a suffix that graphviz understands, such as pdf, svg or png.
Note
You need to install graphviz to use this function:
pip install graphviz
- Parameters
filename (
Optional
[str
]) – Filename to (optionally) save to, e.g. ‘foo.png’, ‘foo.svg’, ‘foo.pdf’ (must have a suffix that graphviz understands).title (
Optional
[str
]) – Title to be displayed in image, e.g. ‘A simple FSA example’
- Return type
Digraph
from_openfst
- classmethod Fsa.from_openfst(s, acceptor=None, num_aux_labels=None, aux_label_names=None, ragged_label_names=[])[source]
Create an Fsa from a string in OpenFST format (or a slightly more general format, if num_aux_labels > 1). See also
from_str()
.The given string s consists of lines with the following format:
src_state dest_state label [aux_label1 aux_label2...] [cost]
(the cost defaults to 0.0 if not present).
The line for the final state consists of two fields:
final_state [cost]
Note
Fields are separated by space(s), tab(s) or both. The cost field is a float, while other fields are integers.
There might be multiple final states. Also, OpenFst may omit the cost if it is 0.0.
Caution
We use cost here to indicate that its value will be negated so that we can get scores. That is, score = -1 * cost.
Note
At most one of acceptor, num_aux_labels, and aux_label_names must be supplied; if none are supplied, acceptor format is assumed.
- Parameters
s (
str
) – The input string. Refer to the above comment for its format.acceptor (
Optional
[bool
]) – Set to true to denote acceptor format which is num_aux_labels == 0, or false to denote transducer format (i.e. num_aux_labels == 1 with name ‘aux_labels’).num_aux_labels (
Optional
[int
]) – The number of auxiliary labels to expect on each line (in addition to the ‘acceptor’ label); it is 1 for traditional transducers but can be any non-negative number.aux_label_names (
Optional
[List
[str
]]) – If provided, the length of this list dictates the number of aux_labels. By default the names are ‘aux_labels’, ‘aux_labels2’, ‘aux_labels3’ and so on.ragged_label_names (
List
[str
]) – If provided, expect this number of ragged labels, in the order of this list. It is advisable that this list be in alphabetical order, so that the format when we write back to a string will be unchanged.
- Return type
Fsa
from_str
- classmethod Fsa.from_str(s, acceptor=None, num_aux_labels=None, aux_label_names=None, ragged_label_names=[], openfst=False)[source]
Create an Fsa from a string in the k2 or OpenFst format. (See also
from_openfst()
).The given string s consists of lines with the following format:
src_state dest_state label [aux_label1 aux_label2...] [score]
The line for the final state consists of only one field:
final_state
Note
Fields are separated by space(s), tab(s) or both. The score field is a float, while other fields are integers.
Caution
The first column has to be non-decreasing.
Caution
The final state has the largest state number. There is ONLY ONE final state. All arcs that are connected to the final state have label -1. If there are aux_labels, they are also -1 for arcs entering the final state.
Note
At most one of acceptor, num_aux_labels, and aux_label_names must be supplied; if none are supplied, acceptor format is assumed.
- Parameters
s (
str
) – The input string. Refer to the above comment for its format.acceptor (
Optional
[bool
]) – Set to true to denote acceptor format which is num_aux_labels == 0, or false to denote transducer format (i.e. num_aux_labels == 1 with name ‘aux_labels’).num_aux_labels (
Optional
[int
]) – The number of auxiliary labels to expect on each line (in addition to the ‘acceptor’ label); it is 1 for traditional transducers but can be any non-negative number. The names of the aux_labels default to ‘aux_labels’, then ‘aux_labels2’, ‘aux_labels3’ and so on.aux_label_names (
Optional
[List
[str
]]) – If provided, the length of this list dictates the number of aux_labels and this list dictates their names.ragged_label_names (
List
[str
]) – If provided, expect this number of ragged labels, in the order of this list. It is advisable that this list be in alphabetical order, so that the format when we write back to a string will be unchanged.openfst (
bool
) – If true, will expect the OpenFST format (costs not scores, i.e. negated; final-probs rather than final-state specified).
- Return type
Fsa
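Example: a minimal sketch of the acceptor and transducer forms (the strings are made-up data):

import k2

acceptor = k2.Fsa.from_str('''
0 1 1 0.5
1 2 -1 0.0
2
''')
transducer = k2.Fsa.from_str('''
0 1 1 10 0.5
1 2 -1 -1 0.0
2
''', acceptor=False)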
get_arc_post
- Fsa.get_arc_post(use_double_scores, log_semiring)[source]
Compute scores on arcs, representing log probabilities; with log_semiring=True you could call these log posteriors, but if log_semiring=False they can only be interpreted as the difference between the best-path score and the score of the best path that includes this arc. This version is differentiable; see also
_get_arc_post()
.Caution
Because of how the autograd mechanics works and the need to avoid circular references, this is not cached; it’s best to store it if you’ll need it multiple times.
- Parameters
use_double_scores (
bool
) – if True, use double precision.log_semiring (
bool
) – if True, use log semiring, else tropical.
- Return type
Tensor
- Returns
A torch.Tensor with shape equal to (num_arcs,) and non-positive elements.
get_backward_scores
- Fsa.get_backward_scores(use_double_scores, log_semiring)[source]
Compute backward-scores, i.e. total weight (or best-path weight) from each state to the final state.
Supports autograd.
- Parameters
use_double_scores (
bool
) – if True, use double precision.log_semiring (
bool
) – if True, use log semiring, else tropical.
- Return type
Tensor
- Returns
A torch.Tensor with shape equal to (num_states,)
get_filler
- Fsa.get_filler(attribute_name)[source]
Return the filler value associated with an attribute name.
This is 0 unless otherwise specified, but you can override it by, for example, doing:
fsa.foo_filler = -1
which will mean the “filler” for attribute fsa.foo is -1; and this will get propagated when you do FSA operations, like any other non-tensor attribute. The filler is the value that means “nothing is here” (like epsilon).
- Caution::
you should use a value that is castable to float and back to integer without loss of precision, because currently the default_value parameter of index_select in ./ops.py is a float.
- Return type
Union
[int
,float
]
get_forward_scores
- Fsa.get_forward_scores(use_double_scores, log_semiring)[source]
Compute forward-scores, i.e. total weight (or best-path weight) from start state to each state.
Supports autograd.
- Parameters
use_double_scores (
bool
) – if True, use double precision.log_semiring (
bool
) – if True, use log semiring, else tropical.
- Return type
Tensor
- Returns
A torch.Tensor with shape equal to (num_states,)
get_tot_scores
- Fsa.get_tot_scores(use_double_scores, log_semiring)[source]
Compute total-scores (one per FSA) as the best-path score.
This version is differentiable.
- Parameters
use_double_scores (
bool
) – True to use double precision floating point; False to use single precision.log_semiring (
bool
) – True to use log semiring (log-sum), false to use tropical (i.e. max on scores).
- Return type
Tensor
invert
invert_
- Fsa.invert_()[source]
Swap the labels and aux_labels.
If there are symbol tables associated with labels and aux_labels, they are also swapped.
It is an error if the FSA contains no aux_labels.
Caution
The function name ends with an underscore which means this is an in-place operation.
- Return type
Fsa
- Returns
Return self.
rename_tensor_attribute
- Fsa.rename_tensor_attribute_(src_name, dest_name)[source]
Rename a tensor attribute (or, as a special case ‘labels’), and also rename non-tensor attributes that are associated with it, i.e. that have it as a prefix.
- Parameters
src_name (
str
) – The original name; it must exist as a tensor attribute, e.g. ‘aux_labels’, or, as a special case, equal ‘labels’; special attributes ‘labels’ and ‘scores’ are allowed but won’t be deleted.dest_name (
str
) – The new name that we are renaming it to. If it already existed as a tensor attribute, it will be rewritten; and any previously existing non-tensor attributes that have this as a prefix will be deleted. As a special case, may equal ‘labels’.
- Return type
Fsa
- Returns
Return self.
- Note::
It is OK if src_name and/or dest_name equals ‘labels’ or ‘scores’, but these special attributes won’t be deleted.
requires_grad_
- Fsa.requires_grad_(requires_grad)[source]
Change if autograd should record operations on this FSA:
Sets the scores’s requires_grad attribute in-place.
Returns this FSA.
You can test whether this object has the requires_grad property true or false by accessing
requires_grad
(handled in__getattr__()
).Caution
This is an in-place operation as you can see that the function name ends with _.
- Parameters
requires_grad (
bool
) – If autograd should record operations on this FSA or not.- Return type
Fsa
- Returns
This FSA itself.
set_scores_stochastic
- Fsa.set_scores_stochastic_(scores)[source]
Normalize the given scores and assign them to self.scores.
- Parameters
scores – Tensor of scores of dtype torch.float32, with shape equal to self.scores.shape (one axis). Will be normalized so that, after exponentiating, the scores on the arcs leaving each state (that has at least one leaving arc) sum to 1.
Caution
The function name ends with an underline indicating this function will modify self in-place.
- Return type
None
to
- Fsa.to(device)[source]
Move the FSA onto a given device.
- Parameters
device (
Union
[str
,device
]) – An instance of torch.device or a string that can be used to construct a torch.device, e.g., ‘cpu’, ‘cuda:0’. It supports only cpu and cuda devices.- Return type
Fsa
- Returns
Returns a new Fsa which is this object copied to the given device (or this object itself, if the device was the same)
device
- Fsa.device
- Return type
device
grad
- Fsa.grad
- Return type
Tensor
num_arcs
- Fsa.num_arcs
Return the number of arcs in this Fsa.
- Return type
int
properties
- Fsa.properties
- Return type
int
properties_str
- Fsa.properties_str
- Return type
str
requires_grad
- Fsa.requires_grad
- Return type
bool
shape
- Fsa.shape
Returns: (num_states, None) if this is an Fsa; (num_fsas, None, None) if this is an FsaVec.
- Return type
Tuple
[int
, …]
MWERLoss
forward
- MWERLoss.forward(lattice, ref_texts, nbest_scale, num_paths)[source]
Compute the Minimum Word Error Rate (MWER) loss given a lattice and the corresponding ref_texts.
- Parameters
lattice (
Fsa
) – An FsaVec with axes [utt][state][arc].ref_texts (
Union
[RaggedTensor
,List
[List
[int
]]]) –- It can be one of the following types:
A list of list-of-integers, e.g., [ [1, 2], [1, 2, 3] ]
An instance of
k2.RaggedTensor
. Must have num_axes == 2 and with dtype torch.int32.
nbest_scale (
float
) – Scale lattice.score before passing it tok2.random_paths()
. A smaller value leads to more unique paths, at the risk of failing to sample the path with the best score.num_paths (
int
) – Number of paths to sample from the lattice usingk2.random_paths()
.
- Return type
Union
[Tensor
,RaggedTensor
]- Returns
Minimum Word Error Rate loss.
Nbest
from_lattice
- static Nbest.from_lattice(lattice, num_paths, use_double_scores=True, nbest_scale=0.5)[source]
Construct an Nbest object by sampling num_paths from a lattice.
Each sampled path is a linear FSA.
We assume lattice.labels contains token IDs and lattice.aux_labels contains word IDs.
- Parameters
lattice (
Fsa
) – An FsaVec with axes [utt][state][arc].num_paths (
int
) – Number of paths to sample from the lattice usingk2.random_paths()
.use_double_scores (
bool
) – True to use double precision ink2.random_paths()
. False to use single precision.nbest_scale (
float
) – Scale lattice.score before passing it tok2.random_paths()
. A smaller value leads to more unique paths, at the risk of failing to sample the path with the best score.
- Return type
Nbest
- Returns
Return an Nbest instance.
intersect
- Nbest.intersect(lats)[source]
Intersect this Nbest object with a lattice and get 1-best path from the resulting FsaVec.
Caution
We assume FSAs in self.fsa don’t have epsilon self-loops. We also assume self.fsa.labels and lats.labels are token IDs.
- Parameters
lats (
Fsa
) – An FsaVec. It can be the return value of whole_lattice_rescoring()
.- Return type
Nbest
- Returns
Return a new Nbest. This new Nbest shares the same shape with self, while its fsa is the 1-best path from intersecting self.fsa and lats.
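Continuing the sketch above (lats is assumed to be, e.g., the output of whole_lattice_rescoring()):
# rescore: keep, for every path, its 1-best alignment against `lats`
rescored = nbest.intersect(lats)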
top_k
- Nbest.top_k(k)[source]
Get a subset of paths in the Nbest. The resulting Nbest is regular in that each sequence (i.e., utterance) has the same number of paths (k).
We select the top-k paths according to the total_scores of each path. If an utterance has fewer than k paths, then its last path, after sorting by total_scores in descending order, is repeated so that each utterance has exactly k paths.
- Parameters
k (
int
) – Number of paths in each utterance.- Return type
Nbest
- Returns
Return a new Nbest with a regular shape.
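Continuing the sketch above: keep the k best paths per utterance, ranked by total_scores.
best4 = nbest.top_k(4)
# best4 has a regular shape: exactly 4 paths per utterance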
total_scores
OnlineDenseIntersecter
__init__
- OnlineDenseIntersecter.__init__(decoding_graph, num_streams, search_beam, output_beam, min_active_states, max_active_states, allow_partial=True)[source]
Create a new online intersecter object.
- Parameters
decoding_graph (
Fsa
) – The decoding graph used in this intersecter.num_streams (
int
) – How many streams this intersecter can handle in parallel.search_beam (
float
) – Decoding beam, e.g. 20. Smaller is faster, larger is more exact (less pruning). This is the default value; it may be modified by
min_active_states
and max_active_states
.
output_beam (
float
) – Pruning beam for the output of intersection (vs. best path); equivalent to kaldi’s lattice-beam. E.g. 8.min_active_states (
int
) – Minimum number of FSA states that are allowed to be active on any given frame for any given intersection/composition task. This is advisory, in that it will try not to have fewer than this number active. Set it to zero if there is no constraint.max_active_states (
int
) – Maximum number of FSA states that are allowed to be active on any given frame for any given intersection/composition task. This is advisory, in that it will try not to exceed that but may not always succeed. You can use a very large number if no constraint is needed.
Examples
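A hedged streaming sketch. Assumptions: decoding_graph is a prepared decoding Fsa (e.g. an HLG graph) built elsewhere; log_probs_chunks yields per-chunk nnet output of shape (num_streams, num_frames, vocab_size); and the DenseFsaVec is built with a supervision_segments tensor whose rows are (stream_idx, start_frame, num_frames), following the usual k2 convention:
import torch
import k2

intersecter = k2.OnlineDenseIntersecter(
    decoding_graph=decoding_graph,   # assumed to be prepared elsewhere
    num_streams=2,
    search_beam=20.0,
    output_beam=8.0,
    min_active_states=30,
    max_active_states=10000,
)
decode_states = [None, None]         # new sequences: no history yet
for log_probs in log_probs_chunks:   # each chunk: (2, num_frames, vocab)
    n = log_probs.size(1)
    supervision_segments = torch.tensor(
        [[0, 0, n], [1, 0, n]], dtype=torch.int32)
    dense_fsas = k2.DenseFsaVec(log_probs, supervision_segments)
    lattice, decode_states = intersecter.decode(dense_fsas, decode_states)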
decode
- OnlineDenseIntersecter.decode(dense_fsas, decode_states)[source]
Does intersection/composition for the current chunk of nnet output (given by a DenseFsaVec); the sequences in a chunk may come from different sources.
- Parameters
dense_fsas (
DenseFsaVec
) – The neural-net output, with each frame containing the log-likes of each modeling unit.
decode_states (
List
[DecodeStateInfo
]) – A list of history decoding states for the current batch of sequences; its length equals dense_fsas.dim0()
(i.e. batch size). Each element in decode_states
belongs to the sequence at the corresponding position in the current batch. For a new sequence (i.e. one with no history states), just put None
at the corresponding position.- Return type
Tuple
[Fsa
,List
[DecodeStateInfo
]]- Returns
Return a tuple containing an Fsa and a list of new decoding states. The Fsa, which has 3 axes (i.e. (batch, state, arc)), contains the output lattices. See the example in the constructor for more info about how to use the list of new decoding states.
num_streams
- OnlineDenseIntersecter.num_streams
- Return type
int
RaggedShape
__eq__
- RaggedShape.__eq__(self: _k2.ragged.RaggedShape, other: _k2.ragged.RaggedShape) bool
Return
True
if two shapes are equal. Otherwise, return False
.Caution
The two shapes have to be on the same device. Otherwise, it throws an exception.
>>> import k2.ragged as k2r >>> shape1 = k2r.RaggedShape('[ [] [x] ]') >>> shape2 = k2r.RaggedShape('[ [x] [x] ]') >>> shape3 = k2r.RaggedShape('[ [x] [x] ]') >>> shape1 == shape2 False >>> shape3 == shape2 True
- Parameters
other – The shape that we want to compare with
self
.- Returns
Return
True
if the two shapes are the same. Return False
otherwise.
__getitem__
- RaggedShape.__getitem__(self: _k2.ragged.RaggedShape, i: int) _k2.ragged.RaggedShape
Select the i-th sublist along axis 0.
Note
It requires that this shape has at least 3 axes.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [[x] [x x]] [[x x x] [] [x x]] ]') >>> shape[0] [ [ x ] [ x x ] ] >>> shape[1] [ [ x x x ] [ ] [ x x ] ]
- Parameters
i – The i-th sublist along axis 0.
- Returns
Return a new ragged shape with one fewer axis.
__init__
- RaggedShape.__init__(self: _k2.ragged.RaggedShape, s: str) None
Construct a ragged shape from a string.
An example string for a ragged shape with 2 axes is:
[ [x x] [ ] [x] ]
An example string for a ragged shape with 3 axes is:
[ [[x] []] [[x] [x x]] ]
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [x] [] [x x] ]') >>> shape [ [ x ] [ ] [ x x ] ] >>> shape2 = k2r.RaggedShape('[ [[x] [] [x x]] [[]] ]') >>> shape2 [ [ [ x ] [ ] [ x x ] ] [ [ ] ] ]
__ne__
- RaggedShape.__ne__(self: _k2.ragged.RaggedShape, other: _k2.ragged.RaggedShape) bool
Return
True
if two shapes are not equal. Otherwise, return False
.Caution
The two shapes have to be on the same device. Otherwise, it throws an exception.
>>> import k2.ragged as k2r >>> shape1 = k2r.RaggedShape('[ [] [x] ]') >>> shape2 = k2r.RaggedShape('[ [x] [x] ]') >>> shape3 = k2r.RaggedShape('[ [x] [x] ]') >>> shape1 != shape2 True >>> shape2 != shape3 False
- Parameters
other – The shape that we want to compare with
self
.- Returns
Return
True
if the two shapes are not equal. Return False
otherwise.
__repr__
- RaggedShape.__repr__(self: _k2.ragged.RaggedShape) str
Return a string representation of this shape.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [x] [] [x x ] ]') >>> print(shape) [ [ x ] [ ] [ x x ] ] >>> shape [ [ x ] [ ] [ x x ] ]
__str__
- RaggedShape.__str__(self: _k2.ragged.RaggedShape) str
Return a string representation of this shape.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [x] [] [x x ] ]') >>> print(shape) [ [ x ] [ ] [ x x ] ] >>> shape [ [ x ] [ ] [ x x ] ]
compose
- RaggedShape.compose(self: _k2.ragged.RaggedShape, other: _k2.ragged.RaggedShape) _k2.ragged.RaggedShape
Compose
self
with a given shape.Caution
other
andself
MUST be on the same device.Hint
In order to compose
self
withother
, it has to satisfyself.tot_size(self.num_axes - 1) == other.dim0
Example 1:
>>> import k2.ragged as k2r >>> shape1 = k2r.RaggedShape('[ [x x] [x] ]') >>> shape2 = k2r.RaggedShape('[ [x x x] [x x] [] ]') >>> shape1.compose(shape2) [ [ [ x x x ] [ x x ] ] [ [ ] ] ]
Example 2:
>>> import k2.ragged as k2r >>> shape1 = k2r.RaggedShape('[ [[x x] [x x x] []] [[x] [x x x x]] ]') >>> shape2 = k2r.RaggedShape('[ [x] [x x x] [] [] [x x] [x] [] [x x x x] [] [x x] ]') >>> shape1.compose(shape2) [ [ [ [ x ] [ x x x ] ] [ [ ] [ ] [ x x ] ] [ ] ] [ [ [ x ] ] [ [ ] [ x x x x ] [ ] [ x x ] ] ] ] >>> shape1.tot_size(shape1.num_axes - 1) 10 >>> shape2.dim0 10
- Parameters
other – The other shape that is to be composed with
self
.- Returns
Return a composed ragged shape.
get_layer
- RaggedShape.get_layer(self: _k2.ragged.RaggedShape, arg0: int) _k2.ragged.RaggedShape
Returns a sub-shape of
self
.>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [[x x] [x] []] [[] [x x x] [x]] [[]] ]') >>> shape.get_layer(0) [ [ x x x ] [ x x x ] [ x ] ] >>> shape.get_layer(1) [ [ x x ] [ x ] [ ] [ ] [ x x x ] [ x ] [ ] ]
- Parameters
layer – Layer that is desired, from
0 .. src.num_axes - 2
(inclusive).- Returns
This returned shape will have
num_axes == 2
, the minimal case of aRaggedShape
.
index
- RaggedShape.index(self: _k2.ragged.RaggedShape, axis: int, indexes: torch.Tensor, need_value_indexes: bool = True) Tuple[_k2.ragged.RaggedShape, Optional[torch.Tensor]]
Indexing operation on a ragged shape, returns
self[indexes]
, where elements of indexes
are interpreted as indexes into axis axis
of self.Caution
indexes
is a 1-D tensor and indexes.dtype == torch.int32
.Example 1:
>>> shape = k2r.RaggedShape('[ [x x] [x] [x x x] ]') >>> value = torch.arange(6, dtype=torch.float32) * 10 >>> ragged = k2r.RaggedTensor(shape, value) >>> ragged [ [ 0 10 ] [ 20 ] [ 30 40 50 ] ] >>> i = torch.tensor([0, 2, 1], dtype=torch.int32) >>> sub_shape, value_indexes = shape.index(axis=0, indexes=i, need_value_indexes=True) >>> sub_shape [ [ x x ] [ x x x ] [ x ] ] >>> value_indexes tensor([0, 1, 3, 4, 5, 2], dtype=torch.int32) >>> ragged.data[value_indexes.long()] tensor([ 0., 10., 30., 40., 50., 20.]) >>> k = torch.tensor([0, -1, 1, 0, 2, -1], dtype=torch.int32) >>> sub_shape2, value_indexes2 = shape.index(axis=0, indexes=k, need_value_indexes=True) >>> sub_shape2 [ [ x x ] [ ] [ x ] [ x x ] [ x x x ] [ ] ] >>> value_indexes2 tensor([0, 1, 2, 0, 1, 3, 4, 5], dtype=torch.int32)
Example 2:
>>> import torch >>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [[x x] [x]] [[] [x x x] [x]] [[x] [] [] [x x]] ]') >>> i = torch.tensor([0, 1, 3, 5, 7, 8], dtype=torch.int32) >>> shape.index(axis=1, indexes=i) ([ [ [ x x ] [ x ] ] [ [ x x x ] ] [ [ x ] [ ] [ x x ] ] ], tensor([0, 1, 2, 3, 4, 5, 7, 8, 9], dtype=torch.int32))
- Parameters
axis – The axis to be indexed. Must satisfy
0 <= axis < self.num_axes
.indexes – Array of indexes, which will be interpreted as indexes into axis
axis
ofself
, i.e. with 0 <= indexes[i] < self.tot_size(axis)
. Note that if axis
is 0, then -1 is also a valid entry in indexes
, in which case, an empty list is returned.need_value_indexes –
If
True
, it will return a torch.Tensor containing the indexes into ragged_tensor.data
that ans.data
has, as in ans.data = ragged_tensor.data[value_indexes]
, where ragged_tensor
uses self
as its shape.Caution
It is currently not allowed to change the order on axes less than
axis
, i.e. if axis > 0
, we require: IsMonotonic(self.row_ids(axis)[indexes])
.
- Returns
Return an indexed ragged shape.
max_size
- RaggedShape.max_size(self: _k2.ragged.RaggedShape, axis: int) int
Return the maximum number of elements of any sublist at the given axis.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [] [x] [x x] [x x x] [x x x x] ]') >>> shape.max_size(1) 4 >>> shape = k2r.RaggedShape('[ [[x x] [x] [] [] []] [[x]] [[x x x x]] ]') >>> shape.max_size(1) 5 >>> shape.max_size(2) 4
- Parameters
axis –
Compute the max size of this axis.
Caution
axis
has to be greater than 0.- Returns
Return the maximum number of elements of sublists at the given
axis
.
numel
- RaggedShape.numel(self: _k2.ragged.RaggedShape) int
Return the number of elements in this shape.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [x] [] [x x x x x]]') >>> shape.numel() 6 >>> shape2 = k2r.RaggedShape('[ [[x x] [x] [] [] []] [[x]] [[x x x x]] ]') >>> shape2.numel() 8 >>> shape3 = k2r.RaggedShape('[ [x x x] [x] ]') >>> shape3.numel() 4
- Returns
Return the number of elements in this shape.
Hint
It’s the number of
x
’s.
regular_ragged_shape
- static RaggedShape.regular_ragged_shape(dim0: int, dim1: int) _k2.ragged.RaggedShape
Create a ragged shape with 2 axes that has a regular structure.
>>> import k2.ragged as k2r >>> shape1 = k2r.RaggedShape.regular_ragged_shape(dim0=2, dim1=3) >>> shape1 [ [ x x x ] [ x x x ] ] >>> shape2 = k2r.regular_ragged_shape(dim0=3, dim1=2) >>> shape2 [ [ x x ] [ x x ] [ x x ] ]
- Parameters
dim0 – Number of entries at axis 0.
dim1 – Number of entries in each sublist at axis 1.
- Returns
Return a ragged shape on CPU.
remove_axis
- RaggedShape.remove_axis(self: _k2.ragged.RaggedShape, axis: int) _k2.ragged.RaggedShape
Remove a certain axis.
Caution
self.num_axes
MUST be greater than 2.>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [[x] [] [x x]] [[x x x] [x x x x]] [[] [] []]]') >>> shape.remove_axis(0) [ [ x ] [ ] [ x x ] [ x x x ] [ x x x x ] [ ] [ ] [ ] ] >>> shape.remove_axis(1) [ [ x x x ] [ x x x x x x x ] [ ] ]
- Parameters
axis – The axis to be removed.
- Returns
Return a ragged shape with one fewer axis.
row_ids
- RaggedShape.row_ids(self: _k2.ragged.RaggedShape, axis: int) torch.Tensor
Return the row ids of a certain
axis
.>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [x x] [] [x x x] ]') >>> shape.row_ids(1) tensor([0, 0, 2, 2, 2], dtype=torch.int32) >>> shape2 = k2r.RaggedShape('[ [[x] [] [x x]] [[x x x] [x] [x x x x] [] []] ]') >>> shape2.row_ids(1) tensor([0, 0, 0, 1, 1, 1, 1, 1], dtype=torch.int32) >>> shape2.row_ids(2) tensor([0, 2, 2, 3, 3, 3, 4, 5, 5, 5, 5], dtype=torch.int32)
- Parameters
axis – The axis whose row ids are to be returned.
Hint –
axis >= 1
.
- Returns
Return the row ids of the given
axis
.
row_splits
- RaggedShape.row_splits(self: _k2.ragged.RaggedShape, axis: int) torch.Tensor
Return the row splits of a certain
axis
.>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [x x] [] [x x x] ]') >>> shape.row_splits(1) tensor([0, 2, 2, 5], dtype=torch.int32) >>> shape2 = k2r.RaggedShape('[ [[x] [] [x x]] [[x x x] [x] [x x x x] [] []] ]') >>> shape2.row_splits(1) tensor([0, 3, 8], dtype=torch.int32) >>> shape2.row_splits(2) tensor([ 0, 1, 1, 3, 6, 7, 11, 11, 11], dtype=torch.int32)
- Parameters
axis – The axis whose row splits are to be returned.
Hint –
axis >= 1
.
- Returns
Return the row splits of the given
axis
.
to
- RaggedShape.to(self: _k2.ragged.RaggedShape, device: object) _k2.ragged.RaggedShape
Move this shape to the specified device.
Hint
If the shape is already on the specified device, the returned shape shares the underlying memory with
self
.>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[[x]]') >>> shape.device device(type='cpu') >>> import torch >>> shape2 = shape.to(torch.device('cuda', 0)) >>> shape2.device device(type='cuda', index=0) >>> shape [ [ x ] ] >>> shape2 [ [ x ] ]
- Parameters
device – An instance of
torch.device
. It can be either a CPU device or a CUDA device.- Returns
Return a shape on the given device.
tot_size
- RaggedShape.tot_size(self: _k2.ragged.RaggedShape, axis: int) int
Return the number of elements at a certain axis.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [x] [x x] [x x x] []]') >>> shape.tot_size(1) 6 >>> shape.numel() 6 >>> shape2 = k2r.RaggedShape('[ [[x]] [[x x]] [[x x x]] [[]] [[]] [[]] [[]] ]') >>> shape2.tot_size(1) 7 >>> shape2 = k2r.RaggedShape('[ [[x]] [[x x]] [[x x x]] [[]] [[]] [[]] [[] []] ]') >>> shape2.tot_size(1) 8 >>> shape2.tot_size(2) 6 >>> shape2.numel() 6
- Parameters
axis – Return the number of elements for this
axis
.- Returns
Return the number of elements at
axis
.
tot_sizes
- RaggedShape.tot_sizes(self: _k2.ragged.RaggedShape) tuple
Return total sizes of every axis in a tuple.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [x] [ ] [x x x x]]') >>> shape.dim0 3 >>> shape.tot_size(1) 5 >>> shape.tot_sizes() (3, 5) >>> shape2 = k2r.RaggedShape('[ [[x] []] [[x x x x]]]') >>> shape2.dim0 2 >>> shape2.tot_size(1) 3 >>> shape2.tot_size(2) 5 >>> shape2.tot_sizes() (2, 3, 5)
- Returns
Return a tuple containing the total sizes of each axis.
ans[i]
is the total size of axisi
(fori > 0
). Fori=0
, it is thedim0
of this shape.
device
- RaggedShape.device
Return the device of this shape.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[[]]') >>> shape.device device(type='cpu') >>> import torch >>> shape2 = shape.to(torch.device('cuda', 0)) >>> shape2.device device(type='cuda', index=0)
dim0
- RaggedShape.dim0
Return number of sublists at axis 0.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [x] [] [x x x x x]]') >>> shape.dim0 3 >>> shape2 = k2r.RaggedShape('[ [[x] []] [[]] [[x] [x x] [x x x]] [[]]]') >>> shape2.dim0 4
num_axes
- RaggedShape.num_axes
Return the number of axes of this shape.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[[] []]') >>> shape.num_axes 2 >>> shape2 = k2r.RaggedShape('[ [[]] [[]]]') >>> shape2.num_axes 3
RaggedTensor
__eq__
- RaggedTensor.__eq__(self: _k2.ragged.RaggedTensor, other: _k2.ragged.RaggedTensor) bool
Compare two ragged tensors.
Caution
The two tensors MUST have the same dtype. Otherwise, it throws an exception.
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1]]) >>> b = a.clone() >>> a == b True >>> c = a.to(torch.float32) >>> try: ... c == b ... except RuntimeError: ... print("raised exception")
- Parameters
other – The tensor to be compared.
- Returns
Return
True
if the two tensors are equal. Return False
otherwise.
__getitem__
- RaggedTensor.__getitem__(*args, **kwargs)
Overloaded function.
__getitem__(self: _k2.ragged.RaggedTensor, i: int) -> object
Select the i-th sublist along axis 0.
Caution
Support for autograd is to be implemented.
Example 1:
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor('[ [[1 3] [] [9]] [[8]] ]') >>> a RaggedTensor([[[1, 3], [], [9]], [[8]]], dtype=torch.int32) >>> a[0] RaggedTensor([[1, 3], [], [9]], dtype=torch.int32) >>> a[1] RaggedTensor([[8]], dtype=torch.int32)
Example 2:
>>> a = k2r.RaggedTensor('[ [1 3] [9] [8] ]') >>> a RaggedTensor([[1, 3], [9], [8]], dtype=torch.int32) >>> a[0] tensor([1, 3], dtype=torch.int32) >>> a[1] tensor([9], dtype=torch.int32)
- Parameters
i – The i-th sublist along axis 0.
- Returns
Return a new ragged tensor with one fewer axis. If num_axes == 2, the return value will be a 1D tensor.
__getitem__(self: _k2.ragged.RaggedTensor, key: slice) -> _k2.ragged.RaggedTensor
Slices sublists along axis 0 with the given range. Only a slicing step of 1 is supported.
Caution
Support for autograd is to be implemented.
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor('[ [[1 3] [] [9]] [[8]] [[10 11]] ]') >>> a RaggedTensor([[[1, 3], [], [9]], [[8]], [[10, 11]]], dtype=torch.int32) >>> a[0:2] RaggedTensor([[[1, 3], [], [9]], [[8]]], dtype=torch.int32) >>> a[1:2] RaggedTensor([[[8]]], dtype=torch.int32)
- Parameters
key – Slice containing integer constants.
- Returns
Return a new ragged tensor with the same axes as original ragged tensor, but only contains the sublists within the range.
__getitem__(self: _k2.ragged.RaggedTensor, key: torch.Tensor) -> _k2.ragged.RaggedTensor
Slice a ragged tensor along axis 0 using a 1-D torch.int32 tensor.
Example 1:
>>> import k2 >>> a = k2.RaggedTensor([[1, 2, 0], [0, 1], [2, 3]]) >>> b = k2.RaggedTensor([[10, 20], [300], [-10, 0, -1], [-2, 4, 5]]) >>> a[0] tensor([1, 2, 0], dtype=torch.int32) >>> b[a[0]] RaggedTensor([[300], [-10, 0, -1], [10, 20]], dtype=torch.int32) >>> a[1] tensor([0, 1], dtype=torch.int32) >>> b[a[1]] RaggedTensor([[10, 20], [300]], dtype=torch.int32) >>> a[2] tensor([2, 3], dtype=torch.int32) >>> b[a[2]] RaggedTensor([[-10, 0, -1], [-2, 4, 5]], dtype=torch.int32)
Example 2:
>>> import torch >>> import k2 >>> a = k2.RaggedTensor([ [[1], [2, 3], [0]], [[], [2]], [[10, 20]] ]) >>> i = torch.tensor([0, 2, 1, 0], dtype=torch.int32) >>> a[i] RaggedTensor([[[1], [2, 3], [0]], [[10, 20]], [[], [2]], [[1], [2, 3], [0]]], dtype=torch.int32)
- Parameters
key – A 1-D torch.int32 tensor containing the indexes to select along axis 0.
- Returns
Return a new ragged tensor with the same number of axes as
self
but only contains the specified sublists.
__getstate__
- RaggedTensor.__getstate__(self: k2.RaggedTensor) tuple
Requires a tensor with 2 or 3 axes. Other numbers of axes are not implemented yet.
This method is to support
pickle
, e.g., used bytorch.save()
. You are not expected to call it by yourself.- Returns
If this tensor has 2 axes, return a tuple containing (self.row_splits(1), “row_ids1”, self.values). If this tensor has 3 axes, return a tuple containing (self.row_splits(1), “row_ids1”, self.row_splits(2), “row_ids2”, self.values)
Note
“row_ids1” and “row_ids2” in the returned value are for backward compatibility.
__init__
- RaggedTensor.__init__(*args, **kwargs)
Overloaded function.
__init__(self: _k2.ragged.RaggedTensor, data: list, dtype: object = None, device: object = ‘cpu’) -> None
Create a ragged tensor with an arbitrary number of axes.
Note
A ragged tensor has at least two axes.
Example 1:
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([ [1, 2], [5], [], [9] ]) >>> a RaggedTensor([[1, 2], [5], [], [9]], dtype=torch.int32) >>> a.dtype torch.int32 >>> b = k2r.RaggedTensor([ [1, 3.0], [] ]) >>> b RaggedTensor([[1, 3], []], dtype=torch.float32) >>> b.dtype torch.float32 >>> c = k2r.RaggedTensor([ [1] ], dtype=torch.float64) >>> c RaggedTensor([[1]], dtype=torch.float64) >>> c.dtype torch.float64 >>> d = k2r.RaggedTensor([ [[1], [2, 3]], [[4], []] ]) >>> d RaggedTensor([[[1], [2, 3]], [[4], []]], dtype=torch.int32) >>> d.num_axes 3 >>> e = k2r.RaggedTensor([]) >>> e RaggedTensor([], dtype=torch.int32) >>> e.num_axes 2 >>> e.shape.row_splits(1) tensor([0], dtype=torch.int32) >>> e.shape.row_ids(1) tensor([], dtype=torch.int32)
Example 2:
>>> k2r.RaggedTensor([ [[1, 2]], [], [[]] ]) RaggedTensor([[[1, 2]], [], [[]]], dtype=torch.int32) >>> k2r.RaggedTensor([ [[1, 2]], [], [[]] ], device='cuda:0') RaggedTensor([[[1, 2]], [], [[]]], device='cuda:0', dtype=torch.int32)
- Parameters
data – A list of sublist(s) of integers or real numbers. It can have an arbitrary number of axes (at least two).
dtype – Optional. If None, it infers the dtype from
data
automatically, which is eithertorch.int32
ortorch.float32
. Supported dtypes are:torch.int32
,torch.float32
, andtorch.float64
.device – It can be either an instance of
torch.device
or a string representing a torch device. Example values are:"cpu"
,"cuda:0"
,torch.device("cpu")
,torch.device("cuda", 0)
.
__init__(self: _k2.ragged.RaggedTensor, data: list, dtype: object = None, device: str = ‘cpu’) -> None
Create a ragged tensor with an arbitrary number of axes.
Note
A ragged tensor has at least two axes.
Example 1:
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([ [1, 2], [5], [], [9] ]) >>> a RaggedTensor([[1, 2], [5], [], [9]], dtype=torch.int32) >>> a.dtype torch.int32 >>> b = k2r.RaggedTensor([ [1, 3.0], [] ]) >>> b RaggedTensor([[1, 3], []], dtype=torch.float32) >>> b.dtype torch.float32 >>> c = k2r.RaggedTensor([ [1] ], dtype=torch.float64) >>> c RaggedTensor([[1]], dtype=torch.float64) >>> c.dtype torch.float64 >>> d = k2r.RaggedTensor([ [[1], [2, 3]], [[4], []] ]) >>> d RaggedTensor([[[1], [2, 3]], [[4], []]], dtype=torch.int32) >>> d.num_axes 3 >>> e = k2r.RaggedTensor([]) >>> e RaggedTensor([], dtype=torch.int32) >>> e.num_axes 2 >>> e.shape.row_splits(1) tensor([0], dtype=torch.int32) >>> e.shape.row_ids(1) tensor([], dtype=torch.int32)
Example 2:
>>> k2r.RaggedTensor([ [[1, 2]], [], [[]] ]) RaggedTensor([[[1, 2]], [], [[]]], dtype=torch.int32) >>> k2r.RaggedTensor([ [[1, 2]], [], [[]] ], device='cuda:0') RaggedTensor([[[1, 2]], [], [[]]], device='cuda:0', dtype=torch.int32)
- Parameters
data – A list of sublist(s) of integers or real numbers. It can have an arbitrary number of axes (at least two).
dtype – Optional. If None, it infers the dtype from
data
automatically, which is eithertorch.int32
ortorch.float32
. Supported dtypes are:torch.int32
,torch.float32
, andtorch.float64
.device – It can be either an instance of
torch.device
or a string representing a torch device. Example values are:"cpu"
,"cuda:0"
,torch.device("cpu")
,torch.device("cuda", 0)
.
__init__(self: _k2.ragged.RaggedTensor, s: str, dtype: object = None, device: object = ‘cpu’) -> None
Create a ragged tensor from its string representation.
Fields are separated by space(s) or comma(s).
An example string for a 2-axis ragged tensor is given below:
[ [1] [2] [3, 4], [5 6 7, 8] ]
An example string for a 3-axis ragged tensor is given below:
[ [[1]] [[]] ]
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor('[ [1] [] [3 4] ]') >>> a RaggedTensor([[1], [], [3, 4]], dtype=torch.int32) >>> a.num_axes 2 >>> a.dtype torch.int32 >>> b = k2r.RaggedTensor('[ [[] [3]] [[10]] ]', dtype=torch.float32) >>> b RaggedTensor([[[], [3]], [[10]]], dtype=torch.float32) >>> b.dtype torch.float32 >>> b.num_axes 3 >>> c = k2r.RaggedTensor('[[1.]]') >>> c.dtype torch.float32 >>> d = k2r.RaggedTensor('[[1.]]', device='cuda:0') >>> d RaggedTensor([[1]], device='cuda:0', dtype=torch.float32)
Note
Number of spaces or commas in
s
does not affect the result. Of course, numbers have to be separated by at least one space or comma.- Parameters
s – A string representation of a ragged tensor.
dtype – The desired dtype of the tensor. If it is
None
, it tries to infer the correct dtype froms
, which is assumed to be eithertorch.int32
ortorch.float32
. Supported dtypes are:torch.int32
,torch.float32
, andtorch.float64
.device – It can be either an instance of
torch.device
or a string representing a torch device. Example values are:"cpu"
,"cuda:0"
,torch.device("cpu")
,torch.device("cuda", 0)
.
__init__(self: _k2.ragged.RaggedTensor, s: str, dtype: object = None, device: str = ‘cpu’) -> None
Create a ragged tensor from its string representation.
Fields are separated by space(s) or comma(s).
An example string for a 2-axis ragged tensor is given below:
[ [1] [2] [3, 4], [5 6 7, 8] ]
An example string for a 3-axis ragged tensor is given below:
[ [[1]] [[]] ]
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor('[ [1] [] [3 4] ]') >>> a RaggedTensor([[1], [], [3, 4]], dtype=torch.int32) >>> a.num_axes 2 >>> a.dtype torch.int32 >>> b = k2r.RaggedTensor('[ [[] [3]] [[10]] ]', dtype=torch.float32) >>> b RaggedTensor([[[], [3]], [[10]]], dtype=torch.float32) >>> b.dtype torch.float32 >>> b.num_axes 3 >>> c = k2r.RaggedTensor('[[1.]]') >>> c.dtype torch.float32 >>> d = k2r.RaggedTensor('[[1.]]', device='cuda:0') >>> d RaggedTensor([[1]], device='cuda:0', dtype=torch.float32)
Note
Number of spaces or commas in
s
does not affect the result. Of course, numbers have to be separated by at least one space or comma.- Parameters
s – A string representation of a ragged tensor.
dtype – The desired dtype of the tensor. If it is
None
, it tries to infer the correct dtype froms
, which is assumed to be eithertorch.int32
ortorch.float32
. Supported dtypes are:torch.int32
,torch.float32
, andtorch.float64
.device – It can be either an instance of
torch.device
or a string representing a torch device. Example values are:"cpu"
,"cuda:0"
,torch.device("cpu")
,torch.device("cuda", 0)
.
__init__(self: _k2.ragged.RaggedTensor, shape: _k2.ragged.RaggedShape, value: torch.Tensor) -> None
Create a ragged tensor from a shape and a value.
>>> import torch >>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [x x] [] [x x x] ]') >>> value = torch.tensor([10, 0, 20, 30, 40], dtype=torch.float32) >>> ragged = k2r.RaggedTensor(shape, value) >>> ragged RaggedTensor([[10, 0], [], [20, 30, 40]], dtype=torch.float32)
- Parameters
shape – The shape of the tensor.
value – The value of the tensor.
__init__(self: _k2.ragged.RaggedTensor, tensor: torch.Tensor) -> None
Create a ragged tensor from a torch tensor.
Note
It turns a regular tensor into a ragged tensor.
Caution
The input tensor has to have more than 1 dimension. That is
tensor.ndim > 1
.Also, if the input tensor is contiguous,
self
will share the underlying memory with it. Otherwise, memory of the input tensor is copied to createself
.Supported dtypes of the input tensor are:
torch.int32
,torch.float32
, andtorch.float64
.Example 1:
>>> import torch >>> import k2.ragged as k2r >>> a = torch.arange(6, dtype=torch.int32).reshape(2, 3) >>> b = k2r.RaggedTensor(a) >>> a tensor([[0, 1, 2], [3, 4, 5]], dtype=torch.int32) >>> b RaggedTensor([[0, 1, 2], [3, 4, 5]], dtype=torch.int32) >>> a.is_contiguous() True >>> a[0, 0] = 10 >>> b RaggedTensor([[10, 1, 2], [3, 4, 5]], dtype=torch.int32) >>> b.values[1] = -2 >>> a tensor([[10, -2, 2], [ 3, 4, 5]], dtype=torch.int32)
Example 2:
>>> import k2.ragged as k2r >>> a = torch.arange(24, dtype=torch.int32).reshape(2, 12)[:, ::4] >>> a tensor([[ 0, 4, 8], [12, 16, 20]], dtype=torch.int32) >>> a.is_contiguous() False >>> b = k2r.RaggedTensor(a) >>> b RaggedTensor([[0, 4, 8], [12, 16, 20]], dtype=torch.int32) >>> a[0, 0] = 10 >>> b RaggedTensor([[0, 4, 8], [12, 16, 20]], dtype=torch.int32) >>> a tensor([[10, 4, 8], [12, 16, 20]], dtype=torch.int32)
Example 3:
>>> import torch >>> import k2.ragged as k2r >>> a = torch.arange(24, dtype=torch.float32).reshape(2, 3, 4) >>> a tensor([[[ 0., 1., 2., 3.], [ 4., 5., 6., 7.], [ 8., 9., 10., 11.]], [[12., 13., 14., 15.], [16., 17., 18., 19.], [20., 21., 22., 23.]]]) >>> b = k2r.RaggedTensor(a) >>> b RaggedTensor([[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]], [[12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]], dtype=torch.float32) >>> b.dtype torch.float32 >>> c = torch.tensor([[1, 2]], device='cuda:0', dtype=torch.float32) >>> k2r.RaggedTensor(c) RaggedTensor([[1, 2]], device='cuda:0', dtype=torch.float32)
- Parameters
tensor – An N-D (N > 1) tensor.
__ne__
- RaggedTensor.__ne__(self: _k2.ragged.RaggedTensor, other: _k2.ragged.RaggedTensor) bool
Compare two ragged tensors.
Caution
The two tensors MUST have the same dtype. Otherwise, it throws an exception.
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1, 2], [3]]) >>> b = a.clone() >>> b != a False >>> c = k2r.RaggedTensor([[1], [2], [3]]) >>> c != a True
- Parameters
other – The tensor to be compared.
- Returns
Return
True
if the two tensors are NOT equal. Return False
otherwise.
__repr__
- RaggedTensor.__repr__(self: _k2.ragged.RaggedTensor) str
Return a string representation of this tensor.
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1], [2, 3], []]) >>> a RaggedTensor([[1], [2, 3], []], dtype=torch.int32) >>> str(a) 'RaggedTensor([[1],\n [2, 3],\n []], dtype=torch.int32)' >>> b = k2r.RaggedTensor([[1, 2]], device='cuda:0') >>> b RaggedTensor([[1, 2]], device='cuda:0', dtype=torch.int32)
__setstate__
- RaggedTensor.__setstate__(self: k2.RaggedTensor, arg0: tuple) None
Set the content of this class from
arg0
.This method is to support
pickle
, e.g., used by torch.load(). You are not expected to call it by yourself.- Parameters
arg0 – It is the return value from the method
__getstate__
.
__str__
- RaggedTensor.__str__(self: _k2.ragged.RaggedTensor) str
Return a string representation of this tensor.
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1], [2, 3], []]) >>> a RaggedTensor([[1], [2, 3], []], dtype=torch.int32) >>> str(a) 'RaggedTensor([[1],\n [2, 3],\n []], dtype=torch.int32)' >>> b = k2r.RaggedTensor([[1, 2]], device='cuda:0') >>> b RaggedTensor([[1, 2]], device='cuda:0', dtype=torch.int32)
add
- RaggedTensor.add(self: _k2.ragged.RaggedTensor, value: torch.Tensor, alpha: object) _k2.ragged.RaggedTensor
Add value scaled by alpha to source ragged tensor over the last axis.
It implements:
dest[…][i][j] = src[…][i][j] + alpha * value[i]
>>> import k2.ragged as k2r >>> import torch >>> src = k2r.RaggedTensor([[1, 3], [1], [2, 8]], dtype=torch.int32) >>> value = torch.tensor([1, 2, 3], dtype=torch.int32) >>> src.add(value, 1) RaggedTensor([[2, 4], [3], [5, 11]], dtype=torch.int32) >>> src.add(value, -1) RaggedTensor([[0, 2], [-1], [-1, 5]], dtype=torch.int32)
- Parameters
value – The value to be added to the
self
, whose dimension MUST equal the number of sublists along the last dimension of self
.alpha – The number used to scale value before adding it to
self
.
- Returns
Returns a new RaggedTensor, sharing the same dtype and device with
self
.
arange
- RaggedTensor.arange(self: _k2.ragged.RaggedTensor, axis: int, begin: int, end: int) _k2.ragged.RaggedTensor
Return a sub-range of
self
containing indexes begin
through end - 1
along axis axis
of self
.The
axis
argument may be confusing; its behavior is equivalent to: for i in range(axis): self = self.remove_axis(0) return self.arange(0, begin, end)
Caution
The returned tensor shares the underlying memory with
self
.Example 1
>>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([ [[1], [], [2]], [[], [4, 5], []], [[], [1]], [[]] ]) >>> a RaggedTensor([[[1], [], [2]], [[], [4, 5], []], [[], [1]], [[]]], dtype=torch.int32) >>> a.num_axes 3 >>> b = a.arange(axis=0, begin=1, end=3) >>> b RaggedTensor([[[], [4, 5], []], [[], [1]]], dtype=torch.int32) >>> b.num_axes 3 >>> c = a.arange(axis=0, begin=1, end=2) >>> c RaggedTensor([[[], [4, 5], []]], dtype=torch.int32) >>> c.num_axes 3 >>> d = a.arange(axis=1, begin=0, end=4) >>> d RaggedTensor([[1], [], [2], []], dtype=torch.int32) >>> d.num_axes 2 >>> e = a.arange(axis=1, begin=2, end=5) >>> e RaggedTensor([[2], [], [4, 5]], dtype=torch.int32) >>> e.num_axes 2
Example 2
>>> a = k2r.RaggedTensor([ [[[], [1], [2, 3]],[[5, 8], [], [9]]], [[[10], [0], []]], [[[], [], [1]]] ]) >>> a.num_axes 4 >>> b = a.arange(axis=0, begin=0, end=2) >>> b RaggedTensor([[[[], [1], [2, 3]], [[5, 8], [], [9]]], [[[10], [0], []]]], dtype=torch.int32) >>> b.num_axes 4 >>> c = a.arange(axis=1, begin=1, end=3) >>> c RaggedTensor([[[5, 8], [], [9]], [[10], [0], []]], dtype=torch.int32) >>> c.num_axes 3 >>> d = a.arange(axis=2, begin=0, end=5) >>> d RaggedTensor([[], [1], [2, 3], [5, 8], []], dtype=torch.int32) >>> d.num_axes 2
Example 3
>>> a = k2r.RaggedTensor([[0], [1], [2], [], [3]]) >>> a RaggedTensor([[0], [1], [2], [], [3]], dtype=torch.int32) >>> a.num_axes 2 >>> b = a.arange(axis=0, begin=1, end=4) >>> b RaggedTensor([[1], [2], []], dtype=torch.int32) >>> b.values[0] = -1 >>> a RaggedTensor([[0], [-1], [2], [], [3]], dtype=torch.int32)
- Parameters
axis – The axis to which
begin
and end
correspond.begin – The beginning of the range (inclusive).
end – The end of the range (exclusive).
argmax
- RaggedTensor.argmax(self: _k2.ragged.RaggedTensor, initial_value: object = None) torch.Tensor
Return a tensor containing maximum value indexes within each sub-list along the last axis of
self
, i.e. the max taken over the last axis. The index is -1 if the sub-list was empty or all values in the sub-list are less than initial_value
.>>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([ [3, -1], [], [], [] ]) >>> a.argmax() tensor([ 0, -1, -1, -1], dtype=torch.int32) >>> b = a.argmax(initial_value=0) >>> b tensor([ 0, -1, -1, -1], dtype=torch.int32) >>> c = k2r.RaggedTensor([ [3, 0, 2, 5, 1], [], [1, 3, 8, 2, 0] ]) >>> c.argmax() tensor([ 3, -1, 7], dtype=torch.int32) >>> d = c.argmax(initial_value=0) >>> d tensor([ 3, -1, 7], dtype=torch.int32) >>> c.values[3], c.values[7] (tensor(5, dtype=torch.int32), tensor(8, dtype=torch.int32)) >>> c.argmax(initial_value=6) tensor([-1, -1, 7], dtype=torch.int32) >>> c.to('cuda:0').argmax(0) tensor([ 3, -1, 7], device='cuda:0', dtype=torch.int32) >>> import torch >>> c.to(torch.float32).argmax(0) tensor([ 3, -1, 7], dtype=torch.int32)
- Parameters
initial_value – A base value to compare. If values in a sublist are all less than this value, then the
argmax
of this sublist is -1. If a sublist is empty, theargmax
of it is also -1. If it isNone
, the lowest value ofself.dtype
is used.- Returns
Return a 1-D
torch.int32
tensor. It is on the same device asself
.
cat
- static RaggedTensor.cat(srcs: List[_k2.ragged.RaggedTensor], axis: int) _k2.ragged.RaggedTensor
Concatenate a list of ragged tensors over a specified axis.
Example 1
>>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1], [], [2, 3]]) >>> k2r.cat([a, a], axis=0) RaggedTensor([[1], [], [2, 3], [1], [], [2, 3]], dtype=torch.int32) >>> k2r.cat((a, a), axis=1) RaggedTensor([[1, 1], [], [2, 3, 2, 3]], dtype=torch.int32)
Example 2
>>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1, 3], [], [5, 8], [], [9]]) >>> b = k2r.RaggedTensor([[0], [1, 8], [], [-1], [10]]) >>> c = k2r.cat([a, b], axis=0) >>> c RaggedTensor([[1, 3], [], [5, 8], [], [9], [0], [1, 8], [], [-1], [10]], dtype=torch.int32) >>> c.num_axes 2 >>> d = k2r.cat([a, b], axis=1) >>> d RaggedTensor([[1, 3, 0], [1, 8], [5, 8], [-1], [9, 10]], dtype=torch.int32) >>> d.num_axes 2 >>> k2r.RaggedTensor.cat([a, b], axis=1) RaggedTensor([[1, 3, 0], [1, 8], [5, 8], [-1], [9, 10]], dtype=torch.int32) >>> k2r.cat((b, a), axis=0) RaggedTensor([[0], [1, 8], [], [-1], [10], [1, 3], [], [5, 8], [], [9]], dtype=torch.int32)
- Parameters
srcs – A list (or a tuple) of ragged tensors to concatenate. They MUST all have the same dtype and be on the same device.
axis – Only 0 and 1 are supported right now. If it is 1, then
srcs[i].dim0
must all have the same value.
- Returns
Return a concatenated tensor.
clone
- RaggedTensor.clone(self: _k2.ragged.RaggedTensor) _k2.ragged.RaggedTensor
Return a copy of this tensor.
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1, 2], [3]]) >>> b = a >>> c = a.clone() >>> a RaggedTensor([[1, 2], [3]], dtype=torch.int32) >>> b.values[0] = 10 >>> a RaggedTensor([[10, 2], [3]], dtype=torch.int32) >>> c RaggedTensor([[1, 2], [3]], dtype=torch.int32) >>> c.values[0] = -1 >>> c RaggedTensor([[-1, 2], [3]], dtype=torch.int32) >>> a RaggedTensor([[10, 2], [3]], dtype=torch.int32) >>> b RaggedTensor([[10, 2], [3]], dtype=torch.int32)
index
- RaggedTensor.index(*args, **kwargs)
Overloaded function.
index(self: _k2.ragged.RaggedTensor, indexes: _k2.ragged.RaggedTensor) -> _k2.ragged.RaggedTensor
Index a ragged tensor with a ragged tensor.
Example 1:
>>> import k2.ragged as k2r >>> src = k2r.RaggedTensor([[10, 11], [12, 13.5]]) >>> indexes = k2r.RaggedTensor([[0, 1]]) >>> src.index(indexes) RaggedTensor([[[10, 11], [12, 13.5]]], dtype=torch.float32) >>> i = k2r.RaggedTensor([[0], [1], [0, 0]]) >>> src.index(i) RaggedTensor([[[10, 11]], [[12, 13.5]], [[10, 11], [10, 11]]], dtype=torch.float32)
Example 2:
>>> import k2.ragged as k2r >>> src = k2r.RaggedTensor([ [[1, 0], [], [2]], [[], [3], [0, 0, 1]], [[1, 2], [-1]]]) >>> i = k2r.RaggedTensor([[[0, 2], [1]], [[0]]]) >>> src.index(i) RaggedTensor([[[[[1, 0], [], [2]], [[1, 2], [-1]]], [[[], [3], [0, 0, 1]]]], [[[[1, 0], [], [2]]]]], dtype=torch.int32)
- Parameters
indexes –
Its values must satisfy
0 <= values[i] < self.dim0
.Caution
Its dtype has to be
torch.int32
.- Returns
Return indexed tensor.
index(self: _k2.ragged.RaggedTensor, indexes: torch.Tensor, axis: int, need_value_indexes: bool = False) -> Tuple[_k2.ragged.RaggedTensor, Optional[torch.Tensor]]
Indexing operation on ragged tensor, returns
self[indexes]
, where the elements ofindexes
are interpreted as indexes into axisaxis
ofself
.Caution
indexes
is a 1-D tensor andindexes.dtype == torch.int32
.Example 1:
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[0, 2, 3], [], [0, 1, 2], [], [], [3, -1.25]]) >>> i = torch.tensor([2, 0, 3, 5], dtype=torch.int32) >>> b, value_indexes = a.index(i, axis=0, need_value_indexes=True) >>> b RaggedTensor([[0, 1, 2], [0, 2, 3], [], [3, -1.25]], dtype=torch.float32) >>> value_indexes tensor([3, 4, 5, 0, 1, 2, 6, 7], dtype=torch.int32) >>> a.values[value_indexes.long()] tensor([ 0.0000, 1.0000, 2.0000, 0.0000, 2.0000, 3.0000, 3.0000, -1.2500]) >>> k = torch.tensor([2, -1, 0], dtype=torch.int32) >>> a.index(k, axis=0, need_value_indexes=True) (RaggedTensor([[0, 1, 2], [], [0, 2, 3]], dtype=torch.float32), tensor([3, 4, 5, 0, 1, 2], dtype=torch.int32))
Example 2:
>>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([ [[1, 3], [], [2]], [[5, 8], [], [-1], [2]] ]) >>> i = torch.tensor([0, 2, 1, 6, 3, 5, 4], dtype=torch.int32) >>> a.shape.row_ids(1)[i.long()] tensor([0, 0, 0, 1, 1, 1, 1], dtype=torch.int32) >>> b, value_indexes = a.index(i, axis=1, need_value_indexes=True) >>> b RaggedTensor([[[1, 3], [2], []], [[2], [5, 8], [-1], []]], dtype=torch.int32) >>> value_indexes tensor([0, 1, 2, 6, 3, 4, 5], dtype=torch.int32) >>> a.values[value_indexes.long()] tensor([ 1, 3, 2, 2, 5, 8, -1], dtype=torch.int32)
- Parameters
indexes –
Array of indexes, which will be interpreted as indexes into axis
axis
of self
, i.e. with 0 <= indexes[i] < self.tot_size(axis)
. Note that if axis
is 0, then -1 is also a valid entry in indexes
; -1 as an index results in an empty list (as if it were the index of a position in self
that had an empty list at that point).Caution
It is currently not allowed to change the order on axes less than
axis
, i.e. if axis > 0
, we require: IsMonotonic(self.shape.row_ids(axis)[indexes])
.axis – The axis to be indexed. Must satisfy
0 <= axis < self.num_axes
.need_value_indexes – If
True
, it will return a torch.Tensor containing the indexes into self.values
that ans.values
has, as in ans.values = self.values[value_indexes]
.
- Returns
Return a tuple containing:
a ragged tensor, sharing the same dtype and device with self;
None if need_value_indexes is False, otherwise a 1-D torch.Tensor of dtype torch.int32 containing the indexes into self.values that ans.values has.
logsumexp
- RaggedTensor.logsumexp(self: _k2.ragged.RaggedTensor, initial_value: float = - inf) torch.Tensor
Compute the logsumexp of sublists over the last axis of this tensor.
Note
It is similar to torch.logsumexp except it accepts a ragged tensor. See https://pytorch.org/docs/stable/generated/torch.logsumexp.html for definition of logsumexp.
Note
If a sublist is empty, the logsumexp for it is the provided
initial_value
.Note
This operation only supports float type input, i.e., with dtype being torch.float32 or torch.float64.
>>> import torch >>> import k2 >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[-0.25, -0.25, -0.25, -0.25], [], [-0.5, -0.5]], dtype=torch.float32) >>> a.requires_grad_(True) RaggedTensor([[-0.25, -0.25, -0.25, -0.25], [], [-0.5, -0.5]], dtype=torch.float32) >>> b = a.logsumexp() >>> b tensor([1.1363, -inf, 0.1931], grad_fn=<LogSumExpFunction>>) >>> c = b.sum() >>> c tensor(-inf, grad_fn=<SumBackward0>) >>> c.backward() >>> a.grad tensor([0.2500, 0.2500, 0.2500, 0.2500, 0.5000, 0.5000]) >>> >>> # if a is a 3-d ragged tensor >>> a = k2r.RaggedTensor([[[-0.25, -0.25, -0.25, -0.25]], [[], [-0.5, -0.5]]], dtype=torch.float32) >>> a.requires_grad_(True) RaggedTensor([[[-0.25, -0.25, -0.25, -0.25]], [[], [-0.5, -0.5]]], dtype=torch.float32) >>> b = a.logsumexp() >>> b tensor([1.1363, -inf, 0.1931], grad_fn=<LogSumExpFunction>>) >>> c = b.sum() >>> c tensor(-inf, grad_fn=<SumBackward0>) >>> c.backward() >>> a.grad tensor([0.2500, 0.2500, 0.2500, 0.2500, 0.5000, 0.5000])
- Parameters
initial_value – If a sublist is empty, its logsumexp is this value.
- Returns
Return a 1-D tensor with the same dtype of this tensor containing the computed logsumexp.
max
- RaggedTensor.max(self: _k2.ragged.RaggedTensor, initial_value: object = None) torch.Tensor
Return a tensor containing the maximum of each sub-list along the last axis of
self
. The max is taken over the last axis orinitial_value
, whichever was larger.>>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([ [[1, 3, 0], [2, 5, -1, 1, 3], [], []], [[1, 8, 9, 2], [], [2, 4, 6, 8]] ]) >>> a.max() tensor([ 3, 5, -2147483648, -2147483648, 9, -2147483648, 8], dtype=torch.int32) >>> a.max(initial_value=-10) tensor([ 3, 5, -10, -10, 9, -10, 8], dtype=torch.int32) >>> a.max(initial_value=7) tensor([7, 7, 7, 7, 9, 7, 8], dtype=torch.int32) >>> import torch >>> a.to(torch.float32).max(-3) tensor([ 3., 5., -3., -3., 9., -3., 8.]) >>> a.to('cuda:0').max(-2) tensor([ 3, 5, -2, -2, 9, -2, 8], device='cuda:0', dtype=torch.int32)
- Parameters
initial_value – The base value to compare. If values in a sublist are all less than this value, then the max of this sublist is
initial_value
. If a sublist is empty, its max is alsoinitial_value
.- Returns
Return 1-D tensor containing the max value of each sublist. It shares the same dtype and device with
self
.
min
- RaggedTensor.min(self: _k2.ragged.RaggedTensor, initial_value: object = None) torch.Tensor
Return a tensor containing the minimum of each sub-list along the last axis of
self
. The min is taken over the last axis orinitial_value
, whichever was smaller.>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([ [[1, 3, 0], [2, 5, -1, 1, 3], [], []], [[1, 8, 9, 2], [], [2, 4, 6, 8]] ], dtype=torch.float32) >>> a.min() tensor([ 0.0000e+00, -1.0000e+00, 3.4028e+38, 3.4028e+38, 1.0000e+00, 3.4028e+38, 2.0000e+00]) >>> a.min(initial_value=float('inf')) tensor([ 0., -1., inf, inf, 1., inf, 2.]) >>> a.min(100) tensor([ 0., -1., 100., 100., 1., 100., 2.]) >>> a.to(torch.int32).min(20) tensor([ 0, -1, 20, 20, 1, 20, 2], dtype=torch.int32) >>> a.to('cuda:0').min(15) tensor([ 0., -1., 15., 15., 1., 15., 2.], device='cuda:0')
- Parameters
initial_value – The base value to compare. If values in a sublist are all larger than this value, then the minimum of this sublist is
initial_value
. If a sublist is empty, its minimum is alsoinitial_value
.- Returns
Return 1-D tensor containing the minimum of each sublist. It shares the same dtype and device with
self
.
normalize
- RaggedTensor.normalize(self: _k2.ragged.RaggedTensor, use_log: bool) _k2.ragged.RaggedTensor
Normalize a ragged tensor over the last axis.
If
use_log
isTrue
, the normalization per sublist is done as follows:Compute the log sum per sublist
2. Subtract the log sum computed above from the sublist and return it
If
use_log
isFalse
, the normalization per sublist is done as follows:Compute the sum per sublist
Divide the sublist by the above sum and return the resulting sublist
Note
If a sublist contains 3 elements
[a, b, c]
, then the log sum is defined as:s = log(exp(a) + exp(b) + exp(c))
The resulting sublist looks like below if
use_log
isTrue
:[a - s, b - s, c - s]
If
use_log
isFalse
, the resulting sublist looks like:[a/(a+b+c), b/(a+b+c), c/(a+b+c)]
>>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[0.1, 0.3], [], [1], [0.2, 0.8]]) >>> a.normalize(use_log=False) RaggedTensor([[0.25, 0.75], [], [1], [0.2, 0.8]], dtype=torch.float32) >>> a.normalize(use_log=True) RaggedTensor([[-0.798139, -0.598139], [], [0], [-1.03749, -0.437488]], dtype=torch.float32) >>> b = k2r.RaggedTensor([ [[0.1, 0.3], []], [[1], [0.2, 0.8]] ]) >>> b.normalize(use_log=False) RaggedTensor([[[0.25, 0.75], []], [[1], [0.2, 0.8]]], dtype=torch.float32) >>> b.normalize(use_log=True) RaggedTensor([[[-0.798139, -0.598139], []], [[0], [-1.03749, -0.437488]]], dtype=torch.float32) >>> a.num_axes 2 >>> b.num_axes 3 >>> import torch >>> (torch.tensor([0.1, 0.3]).exp() / torch.tensor([0.1, 0.3]).exp().sum()).log() tensor([-0.7981, -0.5981])
- Parameters
use_log – It indicates which kind of normalization to be applied.
- Returns
Returns a new ragged tensor, sharing the same dtype and device with
self
.
numel
- RaggedTensor.numel(self: _k2.ragged.RaggedTensor) int
- Returns
Return the number of elements in this tensor. It equals
self.values.numel()
.
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1], [], [3, 4, 5, 6]]) >>> a.numel() 5 >>> b = k2r.RaggedTensor('[ [[1] [] []] [[2 3]]]') >>> b.numel() 3 >>> c = k2r.RaggedTensor('[[1] [] [3 4 5 6]]') >>> c.numel() 5
pad
- RaggedTensor.pad(self: _k2.ragged.RaggedTensor, mode: str, padding_value: object) torch.Tensor
Pad a ragged tensor with 2 axes to a 2-D torch tensor.
For example, if
self
has the following values:[ [1 2 3] [4] [5 6 7 8] ]
Then it returns a 2-D tensor as follows if
padding_value
is 0 and mode isconstant
:tensor([[1, 2, 3, 0], [4, 0, 0, 0], [5, 6, 7, 8]])
Caution
It requires that
self.num_axes == 2
.>>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1], [], [2, 3], [5, 8, 9, 8, 2]]) >>> a.pad(mode='constant', padding_value=-1) tensor([[ 1, -1, -1, -1, -1], [-1, -1, -1, -1, -1], [ 2, 3, -1, -1, -1], [ 5, 8, 9, 8, 2]], dtype=torch.int32) >>> a.pad(mode='replicate', padding_value=-1) tensor([[ 1, 1, 1, 1, 1], [-1, -1, -1, -1, -1], [ 2, 3, 3, 3, 3], [ 5, 8, 9, 8, 2]], dtype=torch.int32)
- Parameters
mode – Valid values are:
constant
,replicate
. If it isconstant
, the givenpadding_value
is used for filling. If it isreplicate
, the last entry in a list is used for filling. If a list is empty, then the given padding_value is also used for filling.padding_value – The filling value.
- Returns
A 2-D torch tensor, sharing the same dtype and device with
self
.
remove_axis
- RaggedTensor.remove_axis(self: _k2.ragged.RaggedTensor, axis: int) _k2.ragged.RaggedTensor
Remove an axis; if it is not the first or last axis, this is done by appending lists (effectively the axis is combined with the following axis). If it is the last axis it is just removed and the number of elements may be changed.
Caution
The tensor has to have more than two axes.
Example 1:
>>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([ [[1], [], [0, -1]], [[], [2, 3], []], [[0]], [[]] ]) >>> a RaggedTensor([[[1], [], [0, -1]], [[], [2, 3], []], [[0]], [[]]], dtype=torch.int32) >>> a.num_axes 3 >>> b = a.remove_axis(0) >>> b RaggedTensor([[1], [], [0, -1], [], [2, 3], [], [0], []], dtype=torch.int32) >>> c = a.remove_axis(1) >>> c RaggedTensor([[1, 0, -1], [2, 3], [0], []], dtype=torch.int32)
Example 2:
>>> a = k2r.RaggedTensor([ [[[1], [], [2]]], [[[3, 4], [], [5, 6], []]], [[[], [0]]] ]) >>> a.num_axes 4 >>> a RaggedTensor([[[[1], [], [2]]], [[[3, 4], [], [5, 6], []]], [[[], [0]]]], dtype=torch.int32) >>> b = a.remove_axis(0) >>> b RaggedTensor([[[1], [], [2]], [[3, 4], [], [5, 6], []], [[], [0]]], dtype=torch.int32) >>> c = a.remove_axis(1) >>> c RaggedTensor([[[1], [], [2]], [[3, 4], [], [5, 6], []], [[], [0]]], dtype=torch.int32) >>> d = a.remove_axis(2) >>> d RaggedTensor([[[1, 2]], [[3, 4, 5, 6]], [[0]]], dtype=torch.int32)
- Parameters
axis – The axis to remove.
- Returns
Return a ragged tensor with one fewer axis.
remove_values_eq
- RaggedTensor.remove_values_eq(self: _k2.ragged.RaggedTensor, target: object) _k2.ragged.RaggedTensor
Returns a ragged tensor after removing all ‘values’ that equal a provided target. Leaves all layers of the shape except for the last one unaffected.
>>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1, 2, 3, 0, 3, 2], [], [3, 2, 3], [3]]) >>> a RaggedTensor([[1, 2, 3, 0, 3, 2], [], [3, 2, 3], [3]], dtype=torch.int32) >>> b = a.remove_values_eq(3) >>> b RaggedTensor([[1, 2, 0, 2], [], [2], []], dtype=torch.int32) >>> c = a.remove_values_eq(2) >>> c RaggedTensor([[1, 3, 0, 3], [], [3, 3], [3]], dtype=torch.int32)
- Parameters
target – The target value to delete.
- Returns
Return a ragged tensor whose values don’t contain the
target
.
remove_values_leq
- RaggedTensor.remove_values_leq(self: _k2.ragged.RaggedTensor, cutoff: object) _k2.ragged.RaggedTensor
Returns a ragged tensor after removing all ‘values’ that are equal to or less than a provided cutoff. Leaves all layers of the shape except for the last one unaffected.
>>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1, 2, 3, 0, 3, 2], [], [3, 2, 3], [3]]) >>> a RaggedTensor([[1, 2, 3, 0, 3, 2], [], [3, 2, 3], [3]], dtype=torch.int32) >>> b = a.remove_values_leq(3) >>> b RaggedTensor([[], [], [], []], dtype=torch.int32) >>> c = a.remove_values_leq(2) >>> c RaggedTensor([[3, 3], [], [3, 3], [3]], dtype=torch.int32) >>> d = a.remove_values_leq(1) >>> d RaggedTensor([[2, 3, 3, 2], [], [3, 2, 3], [3]], dtype=torch.int32)
- Parameters
cutoff – Values less than or equal to this
cutoff
are deleted.- Returns
Return a ragged tensor whose values are all above
cutoff
.
requires_grad_
- RaggedTensor.requires_grad_(self: _k2.ragged.RaggedTensor, requires_grad: bool = True) _k2.ragged.RaggedTensor
Change if autograd should record operations on this tensor: Set this tensor’s
requires_grad
attribute in-place.Note
If this tensor is not a float tensor, PyTorch will throw a RuntimeError exception.
Caution
This method ends with an underscore, meaning it changes this tensor in-place.
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1]], dtype=torch.float64) >>> a.requires_grad False >>> a.requires_grad_(True) RaggedTensor([[1]], dtype=torch.float64) >>> a.requires_grad True
- Parameters
requires_grad – If autograd should record operations on this tensor.
- Returns
Return this tensor.
sort_
- RaggedTensor.sort_(self: _k2.ragged.RaggedTensor, descending: bool = False, need_new2old_indexes: bool = False) Optional[torch.Tensor]
Sort a ragged tensor over the last axis in-place.
Caution
sort_
ends with an underscore, meaning this operation changesself
in-place.>>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([ [1, 3, 0], [2, 5, 3], [], [1, 3, 0.] ]) >>> a_clone = a.clone() >>> b = a.sort_(descending=True, need_new2old_indexes=True) >>> b tensor([1, 0, 2, 4, 5, 3, 7, 6, 8], dtype=torch.int32) >>> a RaggedTensor([[3, 1, 0], [5, 3, 2], [], [3, 1, 0]], dtype=torch.float32) >>> a_clone.values[b.long()] tensor([3., 1., 0., 5., 3., 2., 3., 1., 0.]) >>> a_clone = a.clone() >>> c = a.sort_(descending=False, need_new2old_indexes=True) >>> c tensor([2, 1, 0, 5, 4, 3, 8, 7, 6], dtype=torch.int32) >>> a RaggedTensor([[0, 1, 3], [2, 3, 5], [], [0, 1, 3]], dtype=torch.float32) >>> a_clone.values[c.long()] tensor([0., 1., 3., 2., 3., 5., 0., 1., 3.])
- Parameters
descending –
True
to sort in descending order.False
to sort in ascending order.need_new2old_indexes – If
True
, also returns a 1-D tensor containing the indexes mapping from the sorted elements to the unsorted elements. We can use self.clone().values[returned_tensor]
(cloning before the sort, as in the example above) to get the sorted values.
- Returns
If
need_new2old_indexes
is False, returns None. Otherwise, returns a 1-D tensor of dtypetorch.int32
.
sum
- RaggedTensor.sum(self: _k2.ragged.RaggedTensor, initial_value: float = 0) torch.Tensor
Compute the sum of sublists over the last axis of this tensor.
Note
If a sublist is empty, the sum for it is the provided
initial_value
.Note
This operation supports autograd if this tensor is a float tensor, i.e., with dtype being torch.float32 or torch.float64.
>>> import k2.ragged as k2r >>> a = k2r.RaggedTensor('[ [[1 2] [] [5]] [[10]] ]', dtype=torch.float32) >>> a.requires_grad_(True) RaggedTensor([[[1, 2], [], [5]], [[10]]], dtype=torch.float32) >>> b = a.sum() >>> c = (b * torch.arange(4)).sum() >>> c.backward() >>> a.grad tensor([0., 0., 2., 3.]) >>> b tensor([ 3., 0., 5., 10.], grad_fn=<SumFunction>>) >>> c tensor(40., grad_fn=<SumBackward0>)
- Parameters
initial_value – This value is added to the sum of each sublist. So when a sublist is empty, its sum is this value.
- Returns
Return a 1-D tensor with the same dtype of this tensor containing the computed sum.
to
- RaggedTensor.to(*args, **kwargs)
Overloaded function.
to(self: _k2.ragged.RaggedTensor, device: object) -> _k2.ragged.RaggedTensor
Transfer this tensor to a given device.
Note
If
self
is already on the specified device, return a ragged tensor sharing the underlying memory withself
. Otherwise, a new tensor is returned.>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1], [2, 3]]) >>> a.device device(type='cpu') >>> b = a.to(torch.device('cuda', 0)) >>> b.device device(type='cuda', index=0)
- Parameters
device – The target device to move this tensor.
- Returns
Return a tensor on the given device.
to(self: _k2.ragged.RaggedTensor, device: str) -> _k2.ragged.RaggedTensor
Transfer this tensor to a given device.
Note
If
self
is already on the specified device, return a ragged tensor sharing the underlying memory withself
. Otherwise, a new tensor is returned.>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1]]) >>> a.device device(type='cpu') >>> b = a.to('cuda:0') >>> b.device device(type='cuda', index=0) >>> c = b.to('cpu') >>> c.device device(type='cpu') >>> d = c.to('cuda:1') >>> d.device device(type='cuda', index=1)
- Parameters
device – The target device to move this tensor. Note: The device is represented as a string. Valid strings are: “cpu”, “cuda:0”, “cuda:1”, etc.
- Returns
Return a tensor on the given device.
to(self: _k2.ragged.RaggedTensor, dtype: torch::dtype) -> _k2.ragged.RaggedTensor
Convert this tensor to a specific dtype.
Note
If
self
is already of the specified dtype, return a ragged tensor sharing the underlying memory withself
. Otherwise, a new tensor is returned.>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1], [2, 3, 5]]) >>> a.dtype torch.int32 >>> b = a.to(torch.float64) >>> b.dtype torch.float64
Caution
Currently, we only support dtypes
torch.int32
,torch.float32
, andtorch.float64
. We can support other types if needed.- Parameters
dtype – The dtype this tensor should be converted to.
- Returns
Return a tensor of the given dtype.
to_str_simple
- RaggedTensor.to_str_simple(self: _k2.ragged.RaggedTensor) str
Convert a ragged tensor to a string representation, which is more compact than
self.__str__
.An example output is given below:
RaggedTensor([[[1, 2, 3], [], [0]], [[2], [3, 10.5]]], dtype=torch.float32)
>>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([ [[1, 2, 3], [], [0]], [[2], [3, 10.5]] ]) >>> a RaggedTensor([[[1, 2, 3], [], [0]], [[2], [3, 10.5]]], dtype=torch.float32) >>> str(a) 'RaggedTensor([[[1, 2, 3],\n [],\n [0]],\n [[2],\n [3, 10.5]]], dtype=torch.float32)' >>> a.to_str_simple() 'RaggedTensor([[[1, 2, 3], [], [0]], [[2], [3, 10.5]]], dtype=torch.float32)'
tolist
- RaggedTensor.tolist(self: _k2.ragged.RaggedTensor) list
Turn a ragged tensor into a list of lists [of lists..].
Hint
You can pass the returned list to the constructor of RaggedTensor.
>>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([ [[], [1, 2], [3], []], [[5, 6, 7]], [[], [0, 2, 3], [], []]]) >>> a.tolist() [[[], [1, 2], [3], []], [[5, 6, 7]], [[], [0, 2, 3], [], []]] >>> b = k2r.RaggedTensor(a.tolist()) >>> a == b True >>> c = k2r.RaggedTensor([[1.], [2.], [], [3.25, 2.5]]) >>> c.tolist() [[1.0], [2.0], [], [3.25, 2.5]]
- Returns
A list of lists [of lists …] containing the same elements and structure as self.
tot_size
- RaggedTensor.tot_size(self: _k2.ragged.RaggedTensor, axis: int) int
Return the number of elements at a given axis. If axis is 0, it is equivalent to the property dim0.
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor('[ [1 2 3] [] [5 8 ] ]') >>> a.tot_size(0) 3 >>> a.tot_size(1) 5 >>> b = k2r.RaggedTensor('[ [[1 2 3] [] [5 8]] [[] [1 5 9 10 -1] [] [] []] ]') >>> b.tot_size(0) 2 >>> b.tot_size(1) 8 >>> b.tot_size(2) 10
unique
- RaggedTensor.unique(self: _k2.ragged.RaggedTensor, need_num_repeats: bool = False, need_new2old_indexes: bool = False) Tuple[_k2.ragged.RaggedTensor, Optional[_k2.ragged.RaggedTensor], Optional[torch.Tensor]]
If self has two axes, this will return the unique sublists (in a possibly different order, but without repeats). If self has 3 axes, it will do the above separately for each index on axis 0; if there are more than 3 axes, the earliest axes will be ignored.
Caution
It does not completely guarantee that all unique sequences will be present in the output, as it relies on hashing and ignores collisions. If several sequences have the same hash, only one of them is kept, even if the actual content in the sequence is different.
Caution
Even if there are no repeated sequences, the output may be different from self. That is, new2old_indexes may NOT be an identity map even if nothing was removed.
Example 1
>>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[3, 1], [3], [1], [1], [3, 1], [2]]) >>> a.unique() (RaggedTensor([[1], [2], [3], [3, 1]], dtype=torch.int32), None, None) >>> a.unique(need_num_repeats=True, need_new2old_indexes=True) (RaggedTensor([[1], [2], [3], [3, 1]], dtype=torch.int32), RaggedTensor([[2, 1, 1, 2]], dtype=torch.int32), tensor([2, 5, 1, 0], dtype=torch.int32)) >>> a.unique(need_num_repeats=True) (RaggedTensor([[1], [2], [3], [3, 1]], dtype=torch.int32), RaggedTensor([[2, 1, 1, 2]], dtype=torch.int32), None) >>> a.unique(need_new2old_indexes=True) (RaggedTensor([[1], [2], [3], [3, 1]], dtype=torch.int32), None, tensor([2, 5, 1, 0], dtype=torch.int32))
Example 2
>>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[[1, 2], [2, 1], [1, 2], [1, 2]], [[3], [2], [0, 1], [2]], [[], [2, 3], [], [3]] ]) >>> a.unique() (RaggedTensor([[[1, 2], [2, 1]], [[2], [3], [0, 1]], [[], [3], [2, 3]]], dtype=torch.int32), None, None) >>> a.unique(need_num_repeats=True, need_new2old_indexes=True) (RaggedTensor([[[1, 2], [2, 1]], [[2], [3], [0, 1]], [[], [3], [2, 3]]], dtype=torch.int32), RaggedTensor([[3, 1], [2, 1, 1], [2, 1, 1]], dtype=torch.int32), tensor([ 0, 1, 5, 4, 6, 8, 11, 9], dtype=torch.int32)) >>> a.unique(need_num_repeats=True) (RaggedTensor([[[1, 2], [2, 1]], [[2], [3], [0, 1]], [[], [3], [2, 3]]], dtype=torch.int32), RaggedTensor([[3, 1], [2, 1, 1], [2, 1, 1]], dtype=torch.int32), None) >>> a.unique(need_new2old_indexes=True) (RaggedTensor([[[1, 2], [2, 1]], [[2], [3], [0, 1]], [[], [3], [2, 3]]], dtype=torch.int32), None, tensor([ 0, 1, 5, 4, 6, 8, 11, 9], dtype=torch.int32))
Example 3
>>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1], [3], [2]]) >>> a.unique(True, True) (RaggedTensor([[1], [2], [3]], dtype=torch.int32), RaggedTensor([[1, 1, 1]], dtype=torch.int32), tensor([0, 2, 1], dtype=torch.int32))
- Parameters
need_num_repeats – If True, it also returns the number of repeats of each sequence.
need_new2old_indexes –
If true, it returns an extra 1-D tensor new2old_indexes. If src has 2 axes, this tensor contains src_idx0; if src has 3 axes, this tensor contains src_idx01.
Caution
For repeated sublists, only one of them is kept. The choice of which one to keep is deterministic and is an implementation detail.
- Returns
ans: A ragged tensor with the same number of axes as self and possibly fewer elements due to removing repeated sequences on the last axis (and with the last-but-one indexes possibly in a different order).
num_repeats: A tensor containing the number of repeats of each returned sequence if need_num_repeats is True; it is None otherwise. If it is not None, num_repeats.num_axes is always 2. If ans.num_axes is 2, then num_repeats.dim0 == 1 and num_repeats.numel() == ans.dim0. If ans.num_axes is 3, then num_repeats.dim0 == ans.dim0 and num_repeats.numel() == ans.tot_size(1).
new2old_indexes: A 1-D tensor whose i-th element specifies the input sublist that the i-th output sublist corresponds to.
- Return type
A tuple containing (ans, num_repeats, new2old_indexes) as described above.
device
- RaggedTensor.device
Return the device of this tensor.
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1]]) >>> a.device device(type='cpu') >>> b = a.to(torch.device('cuda', 0)) >>> b.device device(type='cuda', index=0) >>> b.device == torch.device('cuda:0') True
dim0
- RaggedTensor.dim0
Return number of sublists at axis 0.
>>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([ [1, 2], [3], [], [], [] ]) >>> a.dim0 5 >>> b = k2r.RaggedTensor('[ [[]] [[] []]]') >>> b.dim0 2
dtype
- RaggedTensor.dtype
Return the dtype of this tensor.
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1], []]) >>> a.dtype torch.int32 >>> a = a.to(torch.float32) >>> a.dtype torch.float32 >>> b = k2r.RaggedTensor([[3]], dtype=torch.float64) >>> b.dtype torch.float64
grad
- RaggedTensor.grad
This attribute is None by default. PyTorch will set it during backward(). The attribute will contain the computed gradients, and future calls to backward() will accumulate (add) gradients into it.
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1, 2], [3], [5, 6], []], dtype=torch.float32) >>> a.requires_grad_(True) RaggedTensor([[1, 2], [3], [5, 6], []], dtype=torch.float32) >>> b = a.sum() >>> b tensor([ 3., 3., 11., 0.], grad_fn=<SumFunction>) >>> c = b * torch.arange(4) >>> c.sum().backward() >>> a.grad tensor([0., 0., 1., 2., 2.])
is_cuda
- RaggedTensor.is_cuda
- Returns
Return True if the tensor is stored on the GPU, False otherwise.
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1]]) >>> a.is_cuda False >>> b = a.to(torch.device('cuda', 0)) >>> b.is_cuda True
num_axes
- RaggedTensor.num_axes
Return the number of axes of this tensor.
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor('[ [] [] [] [] ]') >>> a.num_axes 2 >>> b = k2r.RaggedTensor('[ [[] []] [[]] ]') >>> b.num_axes 3 >>> c = k2r.RaggedTensor('[ [ [[] [1]] [[3 4] []] ] [ [[1]] [[2] [3 4]] ] ]') >>> c.num_axes 4
- Returns
Return number of axes of this tensor, which is at least 2.
requires_grad
- RaggedTensor.requires_grad
Return True if gradients need to be computed for this tensor. Return False otherwise.
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1]], dtype=torch.float32) >>> a.requires_grad False >>> a.requires_grad = True >>> a.requires_grad True
shape
- RaggedTensor.shape
Return the shape of this tensor.
>>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([ [1, 2], [], [3] ]) >>> a.shape [ [ x x ] [ ] [ x ] ] >>> type(a.shape) <class '_k2.ragged.RaggedShape'>
values
- RaggedTensor.values
Return the underlying memory as a 1-D tensor.
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1, 2], [], [5], [], [8, 9, 10]]) >>> a.values tensor([ 1, 2, 5, 8, 9, 10], dtype=torch.int32) >>> isinstance(a.values, torch.Tensor) True >>> a.values[-2] = -1 >>> a RaggedTensor([[1, 2], [], [5], [], [8, -1, 10]], dtype=torch.int32) >>> a.values[3] = -3 >>> a RaggedTensor([[1, 2], [], [5], [], [-3, -1, 10]], dtype=torch.int32) >>> a.values[2] = -2 >>> a RaggedTensor([[1, 2], [], [-2], [], [-3, -1, 10]], dtype=torch.int32)
RnntDecodingConfig
__init__
- RnntDecodingConfig.__init__(self: _k2.RnntDecodingConfig, vocab_size: int, decoder_history_len: int, beam: float, max_states: int, max_contexts: int) None
Construct a RnntDecodingConfig object; it contains the parameters needed by RNN-T decoding.
- Parameters
vocab_size – The number of symbols we are using; equal to the largest symbol plus one.
decoder_history_len – The number of symbols of history the decoder takes; this will normally be one or two (a “stateless decoder”). Our RNN-T decoding setup does not support unlimited decoder context such as with LSTMs.
beam – beam imposes a limit on the score of a state, relative to the best-scoring state on the same frame. E.g. 10.
max_states – max_states is a limit on the number of distinct states that we allow per frame, per stream; the number of states will not be allowed to exceed this limit.
max_contexts – max_contexts is a limit on the number of distinct contexts that we allow per frame, per stream; the number of contexts will not be allowed to exceed this limit.
beam
- RnntDecodingConfig.beam
decoder_history_len
- RnntDecodingConfig.decoder_history_len
max_contexts
- RnntDecodingConfig.max_contexts
max_states
- RnntDecodingConfig.max_states
vocab_size
- RnntDecodingConfig.vocab_size
RnntDecodingStream
__init__
- RnntDecodingStream.__init__(fsa)[source]
Create a new rnnt decoding stream.
Every sequence (e.g., wave data) needs a decoding stream; this function is expected to be called when a new sequence arrives. We support different decoding graphs for different streams.
- Parameters
fsa (Fsa) – The decoding graph used in this stream.
- Returns
An RNN-T decoding stream object, which will be combined into RnntDecodingStreams to do decoding together with other sequences in parallel.
__str__
RnntDecodingStreams
__init__
- RnntDecodingStreams.__init__(src_streams, config)[source]
Combine multiple RnntDecodingStream objects into an RnntDecodingStreams object, so that all the streams can be decoded in parallel.
- Parameters
src_streams (List[RnntDecodingStream]) – A list of RnntDecodingStream objects to be combined.
config (RnntDecodingConfig) – A configuration object containing decoding parameters such as vocab_size, decoder_history_len, beam, max_states, max_contexts, etc.
- Returns
Return an RnntDecodingStreams object.
__str__
advance
- RnntDecodingStreams.advance(logprobs)[source]
Advance decoding streams by one frame.
- Parameters
logprobs (Tensor) – A tensor of shape [tot_contexts][num_symbols], containing log-probs of symbols given the contexts output by get_contexts(). It satisfies logprobs.shape[0] == shape.tot_size(1), where shape is returned by get_contexts().
None
format_output
- RnntDecodingStreams.format_output(num_frames, allow_partial=False, log_probs=None, t2s2c_shape=None)[source]
Generate the lattice Fsa from the results decoded so far.
Note
The attributes of the generated lattice are a union of the attributes of all the decoding graphs. For example, if self contains three individual streams, each with its own decoding graph, where graph[0] has attributes attr1, attr2; graph[1] has attributes attr1, attr3; and graph[2] has attributes attr3, attr4; then the generated lattice has attributes attr1, attr2, attr3, attr4.
- Parameters
num_frames (List[int]) – A list containing the number of frames we want to gather for each stream (note: it cannot exceed the number of frames we have ever received for the corresponding stream). It MUST satisfy len(num_frames) == self.num_streams.
allow_partial (bool) – If true and there is no active final state, we will treat all the states on the last frame as final states. If false, we only care about the real final states in the decoding graph on the last frame when generating the lattice. Default False.
log_probs (Optional[Tensor]) – A tensor of shape [t2s2c_shape.tot_size(2)][num_symbols]. It is a stacked tensor of the logprobs passed to the function advance during decoding.
t2s2c_shape (Optional[RaggedShape]) – Short for time2stream2context_shape; it describes the shape of log_probs used to generate the lattice. Used to generate arc_map_token and make the whole decoding process differentiable.
- Return type
Fsa
- Returns
Return the lattice Fsa with all the attributes propagated. The returned Fsa has 3 axes with fsa.dim0==self.num_streams.
get_contexts
- RnntDecodingStreams.get_contexts()[source]
This function must be called prior to evaluating the joiner network for a particular frame. It tells the calling code for which contexts it must evaluate the joiner network.
- Return type
Tuple[RaggedShape, Tensor]
- Returns
Return a two-element tuple containing a RaggedShape and a tensor.
- shape:
A RaggedShape with 2 axes, representing [stream][context].
- contexts:
A tensor of shape [tot_contexts][decoder_history_len], where tot_contexts == shape.tot_size(1) and decoder_history_len comes from the config; it represents the number of symbols in the context of the decoder network (assumed to be finite). It contains token ids into the vocabulary (i.e., 0 <= value < vocab_size). Its dtype is torch.int32.
terminate_and_flush_to_streams
- RnntDecodingStreams.terminate_and_flush_to_streams()[source]
Terminate the decoding process of the current RnntDecodingStreams object. It will update the decoding states and store the decoding results obtained so far in each of the individual streams.
Note
We cannot decode with this object anymore after calling terminate_and_flush_to_streams().
- Return type
None
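Putting the pieces together, here is a minimal sketch of the whole decoding flow. It assumes k2.trivial_graph is available to build a trivial decoding graph, and it feeds uniform fake log-probs where a real system would evaluate its joiner network; treat it as an illustration of the call order, not a working recognizer.
import torch
import k2

vocab_size = 10
config = k2.RnntDecodingConfig(vocab_size=vocab_size,
                               decoder_history_len=2,
                               beam=10.0,
                               max_states=32,
                               max_contexts=4)
graph = k2.trivial_graph(vocab_size - 1)  # assumed helper; any decoding graph works
src_streams = [k2.RnntDecodingStream(graph) for _ in range(2)]
streams = k2.RnntDecodingStreams(src_streams, config)

num_frames = [3, 3]
for t in range(max(num_frames)):
    shape, contexts = streams.get_contexts()
    # contexts has shape [tot_contexts][decoder_history_len]; a real joiner
    # network would be evaluated on it. We substitute uniform log-probs.
    logprobs = torch.full((shape.tot_size(1), vocab_size), -2.3)
    streams.advance(logprobs)

streams.terminate_and_flush_to_streams()
lattice = streams.format_output(num_frames)  # an FsaVec with dim0 == 2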
SymbolTable
add
- SymbolTable.add(symbol, index=None)[source]
Add a new symbol to the SymbolTable.
- Parameters
symbol (~Symbol) – The symbol to be added.
index (Optional[int]) – Optional int id to which the symbol should be assigned. If the id is already in use, a ValueError will be raised.
- Return type
int
- Returns
The int id to which the symbol has been assigned.
from_file
- static SymbolTable.from_file(filename)[source]
Build a symbol table from file.
Every line in the symbol table file has two fields separated by space(s), tab(s) or both. The following is an example file:
<eps> 0
a 1
b 2
c 3
- Parameters
filename (str) – Name of the symbol table file. Its format is documented above.
- Return type
SymbolTable
- Returns
An instance of SymbolTable.
from_str
- static SymbolTable.from_str(s)[source]
Build a symbol table from a string.
The string consists of lines. Every line has two fields separated by space(s), tab(s) or both. The first field is the symbol and the second the integer id of the symbol.
- Parameters
s (str) – The input string with the format described above.
- Return type
SymbolTable
- Returns
An instance of SymbolTable.
get
- SymbolTable.get(k)[source]
Get a symbol for an id, or get an id for a symbol.
- Parameters
k (Union[int, ~Symbol]) – If it is an id, it tries to find the symbol corresponding to the id; if it is a symbol, it tries to find the id corresponding to the symbol.
- Return type
Union[~Symbol, int]
- Returns
An id or a symbol depending on the given k.
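For instance (a small sketch combining from_str, get, and add; the exact id picked by add when index is omitted is an implementation detail):
import k2

sym = k2.SymbolTable.from_str('a 1\nb 2')
assert sym.get('a') == 1   # symbol -> id
assert sym.get(2) == 'b'   # id -> symbol
new_id = sym.add('c')      # 'c' is assigned an unused id, which is returned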
merge
to_file
- SymbolTable.to_file(filename)[source]
Serialize the SymbolTable to a file.
Every line in the symbol table file has two fields separated by space(s), tab(s) or both. The following is an example file:
<eps> 0
a 1
b 2
c 3
- Parameters
filename (str) – Name of the symbol table file. Its format is documented above.
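A minimal round-trip through a file (a sketch; it assumes the current directory is writable):
import k2

sym = k2.SymbolTable.from_str('a 1\nb 2')
sym.to_file('tokens.txt')
sym2 = k2.SymbolTable.from_file('tokens.txt')
assert sym2.get('b') == 2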
ids
- SymbolTable.ids
Returns a list of integer IDs corresponding to the symbols.
- Return type
List[int]
symbols
- SymbolTable.symbols
Returns a list of symbols (e.g., strings) corresponding to the integer IDs.
- Return type
List[~Symbol]
k2.ragged
cat
- k2.ragged.cat(srcs: List[_k2.ragged.RaggedTensor], axis: int) _k2.ragged.RaggedTensor
Concatenate a list of ragged tensors along a specified axis.
Example 1
>>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1], [], [2, 3]]) >>> k2r.cat([a, a], axis=0) RaggedTensor([[1], [], [2, 3], [1], [], [2, 3]], dtype=torch.int32) >>> k2r.cat((a, a), axis=1) RaggedTensor([[1, 1], [], [2, 3, 2, 3]], dtype=torch.int32)
Example 2
>>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1, 3], [], [5, 8], [], [9]]) >>> b = k2r.RaggedTensor([[0], [1, 8], [], [-1], [10]]) >>> c = k2r.cat([a, b], axis=0) >>> c RaggedTensor([[1, 3], [], [5, 8], [], [9], [0], [1, 8], [], [-1], [10]], dtype=torch.int32) >>> c.num_axes 2 >>> d = k2r.cat([a, b], axis=1) >>> d RaggedTensor([[1, 3, 0], [1, 8], [5, 8], [-1], [9, 10]], dtype=torch.int32) >>> d.num_axes 2 >>> k2r.RaggedTensor.cat([a, b], axis=1) RaggedTensor([[1, 3, 0], [1, 8], [5, 8], [-1], [9, 10]], dtype=torch.int32) >>> k2r.cat((b, a), axis=0) RaggedTensor([[0], [1, 8], [], [-1], [10], [1, 3], [], [5, 8], [], [9]], dtype=torch.int32)
- Parameters
srcs – A list (or a tuple) of ragged tensors to concatenate. They MUST all have the same dtype and be on the same device.
axis – Only 0 and 1 are supported right now. If it is 1, then srcs[i].dim0 must all have the same value.
- Returns
Return a concatenated tensor.
create_ragged_shape2
- k2.ragged.create_ragged_shape2(row_splits: Optional[torch.Tensor] = None, row_ids: Optional[torch.Tensor] = None, cached_tot_size: int = -1) _k2.ragged.RaggedShape
Construct a RaggedShape from row_ids and/or row_splits vectors. For the overall concepts, please see comments in k2/csrc/utils.h.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[[x x] [x]]') >>> k2r.create_ragged_shape2(shape.row_splits(1), shape.row_ids(1)) [ [ x x ] [ x ] ]
- Parameters
row_splits – Optional. A 1-D torch.Tensor with dtype torch.int32. If None, you have to specify row_ids.
row_ids – Optional. A 1-D torch.Tensor with dtype torch.int32. If None, you have to specify row_splits.
cached_tot_size – The number of elements (the length of row_ids, even if row_ids is not provided); it is identical to the last element of row_splits, but providing it can avoid a GPU-to-CPU transfer.
- Returns
An instance of RaggedShape, with ans.num_axes == 2.
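As a small sketch, the shape [ [x x] [] [x x x] ] can be built directly from its row_splits vector (row_ids is then derived internally):
import torch
import k2.ragged as k2r

row_splits = torch.tensor([0, 2, 2, 5], dtype=torch.int32)
shape = k2r.create_ragged_shape2(row_splits=row_splits)
print(shape)  # [ [ x x ] [ ] [ x x x ] ]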
create_ragged_tensor
- k2.ragged.create_ragged_tensor(*args, **kwargs)
Overloaded function.
create_ragged_tensor(data: list, dtype: object = None, device: object = 'cpu') -> _k2.ragged.RaggedTensor
Create a ragged tensor with arbitrary number of axes.
Note
A ragged tensor has at least two axes.
Hint
The returned tensor is on CPU.
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.create_ragged_tensor([ [1, 2], [5], [], [9] ]) >>> a RaggedTensor([[1, 2], [5], [], [9]], dtype=torch.int32) >>> a.dtype torch.int32 >>> b = k2r.create_ragged_tensor([ [1, 3.0], [] ]) >>> b RaggedTensor([[1, 3], []], dtype=torch.float32) >>> b.dtype torch.float32 >>> c = k2r.create_ragged_tensor([ [1] ], dtype=torch.float64) >>> c.dtype torch.float64 >>> d = k2r.create_ragged_tensor([ [[1], [2, 3]], [[4], []] ]) >>> d RaggedTensor([[[1], [2, 3]], [[4], []]], dtype=torch.int32) >>> d.num_axes 3 >>> e = k2r.create_ragged_tensor([]) >>> e RaggedTensor([], dtype=torch.int32) >>> e.num_axes 2 >>> e.shape.row_splits(1) tensor([0], dtype=torch.int32) >>> e.shape.row_ids(1) tensor([], dtype=torch.int32) >>> f = k2r.create_ragged_tensor([ [1, 2], [], [3] ], device=torch.device('cuda', 0)) >>> f RaggedTensor([[1, 2], [], [3]], device='cuda:0', dtype=torch.int32) >>> e = k2r.create_ragged_tensor([[1], []], device='cuda:1') >>> e RaggedTensor([[1], []], device='cuda:1', dtype=torch.int32)
- Parameters
data – A list of sublist(s) of integers or real numbers. It can have an arbitrary number of axes (at least two).
dtype – Optional. If None, it infers the dtype from data automatically, which is either torch.int32 or torch.float32. Supported dtypes are: torch.int32, torch.float32, and torch.float64.
device – It can be either an instance of torch.device or a string representing a torch device. Example values are: "cpu", "cuda:0", torch.device("cpu"), torch.device("cuda", 0).
- Returns
Return a ragged tensor.
create_ragged_tensor(data: list, dtype: object = None, device: str = 'cpu') -> _k2.ragged.RaggedTensor
Create a ragged tensor with arbitrary number of axes.
Note
A ragged tensor has at least two axes.
Hint
The returned tensor is on CPU.
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.create_ragged_tensor([ [1, 2], [5], [], [9] ]) >>> a RaggedTensor([[1, 2], [5], [], [9]], dtype=torch.int32) >>> a.dtype torch.int32 >>> b = k2r.create_ragged_tensor([ [1, 3.0], [] ]) >>> b RaggedTensor([[1, 3], []], dtype=torch.float32) >>> b.dtype torch.float32 >>> c = k2r.create_ragged_tensor([ [1] ], dtype=torch.float64) >>> c.dtype torch.float64 >>> d = k2r.create_ragged_tensor([ [[1], [2, 3]], [[4], []] ]) >>> d RaggedTensor([[[1], [2, 3]], [[4], []]], dtype=torch.int32) >>> d.num_axes 3 >>> e = k2r.create_ragged_tensor([]) >>> e RaggedTensor([], dtype=torch.int32) >>> e.num_axes 2 >>> e.shape.row_splits(1) tensor([0], dtype=torch.int32) >>> e.shape.row_ids(1) tensor([], dtype=torch.int32) >>> f = k2r.create_ragged_tensor([ [1, 2], [], [3] ], device=torch.device('cuda', 0)) >>> f RaggedTensor([[1, 2], [], [3]], device='cuda:0', dtype=torch.int32) >>> e = k2r.create_ragged_tensor([[1], []], device='cuda:1') >>> e RaggedTensor([[1], []], device='cuda:1', dtype=torch.int32)
- Parameters
data – A list of sublist(s) of integers or real numbers. It can have an arbitrary number of axes (at least two).
dtype – Optional. If None, it infers the dtype from data automatically, which is either torch.int32 or torch.float32. Supported dtypes are: torch.int32, torch.float32, and torch.float64.
device – It can be either an instance of torch.device or a string representing a torch device. Example values are: "cpu", "cuda:0", torch.device("cpu"), torch.device("cuda", 0).
- Returns
Return a ragged tensor.
create_ragged_tensor(s: str, dtype: object = None, device: object = 'cpu') -> _k2.ragged.RaggedTensor
Create a ragged tensor from its string representation.
Fields are separated by space(s) or comma(s).
An example string for a 2-axis ragged tensor is given below:
[ [1] [2] [3, 4], [5 6 7, 8] ]
An example string for a 3-axis ragged tensor is given below:
[ [[1]] [[]] ]
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.create_ragged_tensor('[ [1] [] [3 4] ]') >>> a RaggedTensor([[1], [], [3, 4]], dtype=torch.int32) >>> a.num_axes 2 >>> a.dtype torch.int32 >>> b = k2r.create_ragged_tensor('[ [[] [3]] [[10]] ]', dtype=torch.float32) >>> b [ [ [ ] [ 3 ] ] [ [ 10 ] ] ] >>> b.dtype torch.float32 >>> b.num_axes 3 >>> c = k2r.create_ragged_tensor('[[1.]]') >>> c.dtype torch.float32
Note
The number of spaces or commas in s does not affect the result. Of course, numbers have to be separated by at least one space or comma.
- Parameters
s – A string representation of a ragged tensor.
dtype – The desired dtype of the tensor. If it is None, it tries to infer the correct dtype from s, which is assumed to be either torch.int32 or torch.float32. Supported dtypes are: torch.int32, torch.float32, and torch.float64.
device – It can be either an instance of torch.device or a string representing a torch device. Example values are: "cpu", "cuda:0", torch.device("cpu"), torch.device("cuda", 0).
- Returns
Return a ragged tensor.
create_ragged_tensor(s: str, dtype: object = None, device: str = 'cpu') -> _k2.ragged.RaggedTensor
Create a ragged tensor from its string representation.
Fields are separated by space(s) or comma(s).
An example string for a 2-axis ragged tensor is given below:
[ [1] [2] [3, 4], [5 6 7, 8] ]
An example string for a 3-axis ragged tensor is given below:
[ [[1]] [[]] ]
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.create_ragged_tensor('[ [1] [] [3 4] ]') >>> a RaggedTensor([[1], [], [3, 4]], dtype=torch.int32) >>> a.num_axes 2 >>> a.dtype torch.int32 >>> b = k2r.create_ragged_tensor('[ [[] [3]] [[10]] ]', dtype=torch.float32) >>> b [ [ [ ] [ 3 ] ] [ [ 10 ] ] ] >>> b.dtype torch.float32 >>> b.num_axes 3 >>> c = k2r.create_ragged_tensor('[[1.]]') >>> c.dtype torch.float32
Note
The number of spaces or commas in s does not affect the result. Of course, numbers have to be separated by at least one space or comma.
- Parameters
s – A string representation of a ragged tensor.
dtype – The desired dtype of the tensor. If it is None, it tries to infer the correct dtype from s, which is assumed to be either torch.int32 or torch.float32. Supported dtypes are: torch.int32, torch.float32, and torch.float64.
device – It can be either an instance of torch.device or a string representing a torch device. Example values are: "cpu", "cuda:0", torch.device("cpu"), torch.device("cuda", 0).
- Returns
Return a ragged tensor.
create_ragged_tensor(tensor: torch.Tensor) -> _k2.ragged.RaggedTensor
Create a ragged tensor from a torch tensor.
Note
It turns a regular tensor into a ragged tensor.
Caution
The input tensor has to have more than 1 dimension. That is, tensor.ndim > 1.
Also, if the input tensor is contiguous, self will share the underlying memory with it. Otherwise, the memory of the input tensor is copied to create self.
Supported dtypes of the input tensor are: torch.int32, torch.float32, and torch.float64.
Example 1:
>>> import torch >>> import k2.ragged as k2r >>> a = torch.arange(6, dtype=torch.int32).reshape(2, 3) >>> b = k2r.create_ragged_tensor(a) >>> a tensor([[0, 1, 2], [3, 4, 5]], dtype=torch.int32) >>> b RaggedTensor([[0, 1, 2], [3, 4, 5]], dtype=torch.int32) >>> b.dtype torch.int32 >>> a.is_contiguous() True >>> a[0, 0] = 10 >>> b RaggedTensor([[10, 1, 2], [3, 4, 5]], dtype=torch.int32) >>> b.values[1] = -2 >>> a tensor([[10, -2, 2], [ 3, 4, 5]], dtype=torch.int32)
Example 2:
>>> import k2.ragged as k2r >>> a = torch.arange(24, dtype=torch.int32).reshape(2, 12)[:, ::4] >>> a tensor([[ 0, 4, 8], [12, 16, 20]], dtype=torch.int32) >>> a.is_contiguous() False >>> b = k2r.create_ragged_tensor(a) >>> b RaggedTensor([[0, 4, 8], [12, 16, 20]], dtype=torch.int32) >>> b.dtype torch.int32 >>> a[0, 0] = 10 >>> b RaggedTensor([[0, 4, 8], [12, 16, 20]], dtype=torch.int32) >>> a tensor([[10, 4, 8], [12, 16, 20]], dtype=torch.int32)
Example 3:
>>> import torch >>> import k2.ragged as k2r >>> a = torch.arange(24, dtype=torch.float32).reshape(2, 3, 4) >>> a tensor([[[ 0., 1., 2., 3.], [ 4., 5., 6., 7.], [ 8., 9., 10., 11.]], [[12., 13., 14., 15.], [16., 17., 18., 19.], [20., 21., 22., 23.]]]) >>> b = k2r.create_ragged_tensor(a) >>> b RaggedTensor([[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]], [[12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]], dtype=torch.float32)
- Parameters
tensor – An N-D (N > 1) tensor.
- Returns
Return a ragged tensor.
index
- k2.ragged.index(src: torch.Tensor, indexes: _k2.ragged.RaggedTensor, default_value: object = None) _k2.ragged.RaggedTensor
Use a ragged tensor to index a 1-D torch tensor.
>>> import torch >>> import k2.ragged as k2r >>> i = k2r.RaggedTensor([ [1, 5, 3], [0, 2] ]) >>> src = torch.arange(6, dtype=torch.int32) * 10 >>> src tensor([ 0, 10, 20, 30, 40, 50], dtype=torch.int32) >>> k2r.index(src, i) RaggedTensor([[10, 50, 30], [0, 20]], dtype=torch.int32) >>> k = k2r.RaggedTensor([ [[1, 5, 3], [0]], [[0, 2], [1, 3]] ]) >>> k2r.index(src, k) RaggedTensor([[[10, 50, 30], [0]], [[0, 20], [10, 30]]], dtype=torch.int32) >>> n = k2r.RaggedTensor([ [1, -1], [-1, 0], [-1] ]) >>> k2r.index(src, n) RaggedTensor([[10, 0], [0, 0], [0]], dtype=torch.int32) >>> k2r.index(src, n, default_value=-2) RaggedTensor([[10, -2], [-2, 0], [-2]], dtype=torch.int32)
- Parameters
src – A 1-D torch tensor.
indexes – A ragged tensor with dtype torch.int32.
default_value – Used only when an entry in indexes is -1, in which case it returns default_value, as -1 is not a valid index. If it is None and an entry in indexes is -1, 0 is returned.
- Returns
Return a ragged tensor with the same dtype and device as src.
index_and_sum
- k2.ragged.index_and_sum(src: torch.Tensor, indexes: _k2.ragged.RaggedTensor) torch.Tensor
Index a 1-D tensor with a ragged tensor of indexes, perform a sum-per-sublist operation, and return the resulting 1-D tensor.
>>> import torch >>> import k2.ragged as k2r >>> i = k2r.RaggedTensor([[1, 3, 5], [0, 2, 3]]) >>> src = torch.arange(6, dtype=torch.float32) * 10 >>> src tensor([ 0., 10., 20., 30., 40., 50.]) >>> k2r.index_and_sum(src, i) tensor([90., 50.]) >>> k = k2r.RaggedTensor([[1, -1, 2], [-1], [2, 5, -1]]) >>> k2r.index_and_sum(src, k) tensor([30., 0., 70.])
- Parameters
src – A 1-D tensor.
indexes – A ragged tensor with two axes. Its dtype MUST be torch.int32. For instance, it can be the arc map returned from the function remove_epsilon. If an index is -1, it contributes 0 to the sum of its sublist.
- Returns
Return a 1-D tensor with the same dtype and device as src.
random_ragged_shape
- k2.ragged.random_ragged_shape(set_row_ids: bool = False, min_num_axes: int = 2, max_num_axes: int = 4, min_num_elements: int = 0, max_num_elements: int = 2000) _k2.ragged.RaggedShape
Generate a random ragged shape subject to the given constraints; useful, e.g., for testing.
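A small sketch of how it might be used, assuming the bounds behave as the parameter names suggest (inclusive ranges):
import k2.ragged as k2r

shape = k2r.random_ragged_shape(min_num_axes=2, max_num_axes=3,
                                min_num_elements=1, max_num_elements=10)
assert 2 <= shape.num_axes <= 3
assert 1 <= shape.numel() <= 10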
regular_ragged_shape
- k2.ragged.regular_ragged_shape(dim0: int, dim1: int) _k2.ragged.RaggedShape
Create a ragged shape with 2 axes that has a regular structure.
>>> import k2.ragged as k2r >>> shape1 = k2r.RaggedShape.regular_ragged_shape(dim0=2, dim1=3) >>> shape1 [ [ x x x ] [ x x x ] ] >>> shape2 = k2r.regular_ragged_shape(dim0=3, dim1=2) >>> shape2 [ [ x x ] [ x x ] [ x x ] ]
- Parameters
dim0 – Number of entries at axis 0.
dim1 – Number of entries in each sublist at axis 1.
- Returns
Return a ragged shape on CPU.
RaggedShape
__eq__
- RaggedShape.__eq__(self: _k2.ragged.RaggedShape, other: _k2.ragged.RaggedShape) bool
Return True if two shapes are equal. Otherwise, return False.
Caution
The two shapes have to be on the same device. Otherwise, it throws an exception.
>>> import k2.ragged as k2r >>> shape1 = k2r.RaggedShape('[ [] [x] ]') >>> shape2 = k2r.RaggedShape('[ [x] [x] ]') >>> shape3 = k2r.RaggedShape('[ [x] [x] ]') >>> shape1 == shape2 False >>> shape3 == shape2 True
- Parameters
other – The shape that we want to compare with self.
- Returns
Return True if the two shapes are the same. Return False otherwise.
__getitem__
- RaggedShape.__getitem__(self: _k2.ragged.RaggedShape, i: int) _k2.ragged.RaggedShape
Select the i-th sublist along axis 0.
Note
It requires that this shape has at least 3 axes.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [[x] [x x]] [[x x x] [] [x x]] ]') >>> shape[0] [ [ x ] [ x x ] ] >>> shape[1] [ [ x x x ] [ ] [ x x ] ]
- Parameters
i – The i-th sublist along axis 0.
- Returns
Return a new ragged shape with one fewer axis.
__init__
- RaggedShape.__init__(self: _k2.ragged.RaggedShape, s: str) None
Construct a ragged shape from a string.
An example string for a ragged shape with 2 axes is:
[ [x x] [ ] [x] ]
An example string for a ragged shape with 3 axes is:
[ [[x] []] [[x] [x x]] ]
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [x] [] [x x] ]') >>> shape [ [ x ] [ ] [ x x ] ] >>> shape2 = k2r.RaggedShape('[ [[x] [] [x x]] [[]] ]') >>> shape2 [ [ [ x ] [ ] [ x x ] ] [ [ ] ] ]
__ne__
- RaggedShape.__ne__(self: _k2.ragged.RaggedShape, other: _k2.ragged.RaggedShape) bool
Return True if two shapes are not equal. Otherwise, return False.
Caution
The two shapes have to be on the same device. Otherwise, it throws an exception.
>>> import k2.ragged as k2r >>> shape1 = k2r.RaggedShape('[ [] [x] ]') >>> shape2 = k2r.RaggedShape('[ [x] [x] ]') >>> shape3 = k2r.RaggedShape('[ [x] [x] ]') >>> shape1 != shape2 True >>> shape2 != shape3 False
- Parameters
other – The shape that we want to compare with self.
- Returns
Return True if the two shapes are not equal. Return False otherwise.
__repr__
- RaggedShape.__repr__(self: _k2.ragged.RaggedShape) str
Return a string representation of this shape.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [x] [] [x x ] ]') >>> print(shape) [ [ x ] [ ] [ x x ] ] >>> shape [ [ x ] [ ] [ x x ] ]
__str__
- RaggedShape.__str__(self: _k2.ragged.RaggedShape) str
Return a string representation of this shape.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [x] [] [x x ] ]') >>> print(shape) [ [ x ] [ ] [ x x ] ] >>> shape [ [ x ] [ ] [ x x ] ]
compose
- RaggedShape.compose(self: _k2.ragged.RaggedShape, other: _k2.ragged.RaggedShape) _k2.ragged.RaggedShape
Compose self with a given shape.
Caution
other and self MUST be on the same device.
Hint
In order to compose self with other, it has to satisfy self.tot_size(self.num_axes - 1) == other.dim0
Example 1:
>>> import k2.ragged as k2r >>> shape1 = k2r.RaggedShape('[ [x x] [x] ]') >>> shape2 = k2r.RaggedShape('[ [x x x] [x x] [] ]') >>> shape1.compose(shape2) [ [ [ x x x ] [ x x ] ] [ [ ] ] ]
Example 2:
>>> import k2.ragged as k2r >>> shape1 = k2r.RaggedShape('[ [[x x] [x x x] []] [[x] [x x x x]] ]') >>> shape2 = k2r.RaggedShape('[ [x] [x x x] [] [] [x x] [x] [] [x x x x] [] [x x] ]') >>> shape1.compose(shape2) [ [ [ [ x ] [ x x x ] ] [ [ ] [ ] [ x x ] ] [ ] ] [ [ [ x ] ] [ [ ] [ x x x x ] [ ] [ x x ] ] ] ] >>> shape1.tot_size(shape1.num_axes - 1) 10 >>> shape2.dim0 10
- Parameters
other – The other shape that is to be composed with self.
- Returns
Return a composed ragged shape.
get_layer
- RaggedShape.get_layer(self: _k2.ragged.RaggedShape, arg0: int) _k2.ragged.RaggedShape
Return a sub-shape of self.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [[x x] [x] []] [[] [x x x] [x]] [[]] ]') >>> shape.get_layer(0) [ [ x x x ] [ x x x ] [ x ] ] >>> shape.get_layer(1) [ [ x x ] [ x ] [ ] [ ] [ x x x ] [ x ] [ ] ]
- Parameters
layer – The layer that is desired, from 0 .. src.num_axes - 2 (inclusive).
- Returns
The returned shape will have num_axes == 2, the minimal case of a RaggedShape.
index
- RaggedShape.index(self: _k2.ragged.RaggedShape, axis: int, indexes: torch.Tensor, need_value_indexes: bool = True) Tuple[_k2.ragged.RaggedShape, Optional[torch.Tensor]]
Indexing operation on a ragged shape; returns self[indexes], where elements of indexes are interpreted as indexes into axis axis of self.
Caution
indexes is a 1-D tensor and indexes.dtype == torch.int32.
Example 1:
>>> shape = k2r.RaggedShape('[ [x x] [x] [x x x] ]') >>> value = torch.arange(6, dtype=torch.float32) * 10 >>> ragged = k2r.RaggedTensor(shape, value) >>> ragged [ [ 0 10 ] [ 20 ] [ 30 40 50 ] ] >>> i = torch.tensor([0, 2, 1], dtype=torch.int32) >>> sub_shape, value_indexes = shape.index(axis=0, indexes=i, need_value_indexes=True) >>> sub_shape [ [ x x ] [ x x x ] [ x ] ] >>> value_indexes tensor([0, 1, 3, 4, 5, 2], dtype=torch.int32) >>> ragged.data[value_indexes.long()] tensor([ 0., 10., 30., 40., 50., 20.]) >>> k = torch.tensor([0, -1, 1, 0, 2, -1], dtype=torch.int32) >>> sub_shape2, value_indexes2 = shape.index(axis=0, indexes=k, need_value_indexes=True) >>> sub_shape2 [ [ x x ] [ ] [ x ] [ x x ] [ x x x ] [ ] ] >>> value_indexes2 tensor([0, 1, 2, 0, 1, 3, 4, 5], dtype=torch.int32)
Example 2:
>>> import torch >>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [[x x] [x]] [[] [x x x] [x]] [[x] [] [] [x x]] ]') >>> i = torch.tensor([0, 1, 3, 5, 7, 8], dtype=torch.int32) >>> shape.index(axis=1, indexes=i) ([ [ [ x x ] [ x ] ] [ [ x x x ] ] [ [ x ] [ ] [ x x ] ] ], tensor([0, 1, 2, 3, 4, 5, 7, 8, 9], dtype=torch.int32))
- Parameters
axis – The axis to be indexed. Must satisfy 0 <= axis < self.num_axes.
indexes – Array of indexes, which will be interpreted as indexes into axis axis of self, i.e. with 0 <= indexes[i] < self.tot_size(axis). Note that if axis is 0, then -1 is also a valid entry in indexes, in which case an empty list is returned.
need_value_indexes – If True, it will return a torch.Tensor containing the indexes into ragged_tensor.data that ans.data has, as in ans.data = ragged_tensor.data[value_indexes], where ragged_tensor uses self as its shape.
Caution
It is currently not allowed to change the order on axes less than axis, i.e. if axis > 0, we require: IsMonotonic(self.row_ids(axis)[indexes]).
- Returns
Return an indexed ragged shape.
max_size
- RaggedShape.max_size(self: _k2.ragged.RaggedShape, axis: int) int
Return the maximum number of elements of any sublist at the given axis.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [] [x] [x x] [x x x] [x x x x] ]') >>> shape.max_size(1) 4 >>> shape = k2r.RaggedShape('[ [[x x] [x] [] [] []] [[x]] [[x x x x]] ]') >>> shape.max_size(1) 5 >>> shape.max_size(2) 4
- Parameters
axis – Compute the max size of this axis.
Caution
axis has to be greater than 0.
- Returns
Return the maximum number of elements of sublists at the given axis.
numel
- RaggedShape.numel(self: _k2.ragged.RaggedShape) int
Return the number of elements in this shape.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [x] [] [x x x x x]]') >>> shape.numel() 6 >>> shape2 = k2r.RaggedShape('[ [[x x] [x] [] [] []] [[x]] [[x x x x]] ]') >>> shape2.numel() 8 >>> shape3 = k2r.RaggedShape('[ [x x x] [x] ]') >>> shape3.numel() 4
- Returns
Return the number of elements in this shape.
Hint
It is the number of x's.
regular_ragged_shape
- static RaggedShape.regular_ragged_shape(dim0: int, dim1: int) _k2.ragged.RaggedShape
Create a ragged shape with 2 axes that has a regular structure.
>>> import k2.ragged as k2r >>> shape1 = k2r.RaggedShape.regular_ragged_shape(dim0=2, dim1=3) >>> shape1 [ [ x x x ] [ x x x ] ] >>> shape2 = k2r.regular_ragged_shape(dim0=3, dim1=2) >>> shape2 [ [ x x ] [ x x ] [ x x ] ]
- Parameters
dim0 – Number of entries at axis 0.
dim1 – Number of entries in each sublist at axis 1.
- Returns
Return a ragged shape on CPU.
remove_axis
- RaggedShape.remove_axis(self: _k2.ragged.RaggedShape, axis: int) _k2.ragged.RaggedShape
Remove a certain axis.
Caution
self.num_axes MUST be greater than 2.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [[x] [] [x x]] [[x x x] [x x x x]] [[] [] []]]') >>> shape.remove_axis(0) [ [ x ] [ ] [ x x ] [ x x x ] [ x x x x ] [ ] [ ] [ ] ] >>> shape.remove_axis(1) [ [ x x x ] [ x x x x x x x ] [ ] ]
- Parameters
axis – The axis to be removed.
- Returns
Return a ragged shape with one fewer axis.
row_ids
- RaggedShape.row_ids(self: _k2.ragged.RaggedShape, axis: int) torch.Tensor
Return the row ids of a certain axis.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [x x] [] [x x x] ]') >>> shape.row_ids(1) tensor([0, 0, 2, 2, 2], dtype=torch.int32) >>> shape2 = k2r.RaggedShape('[ [[x] [] [x x]] [[x x x] [x] [x x x x] [] []] ]') >>> shape2.row_ids(1) tensor([0, 0, 0, 1, 1, 1, 1, 1], dtype=torch.int32) >>> shape2.row_ids(2) tensor([0, 2, 2, 3, 3, 3, 4, 5, 5, 5, 5], dtype=torch.int32)
- Parameters
axis – The axis whose row ids are to be returned. Hint: axis >= 1.
- Returns
Return the row ids of the given axis.
row_splits
- RaggedShape.row_splits(self: _k2.ragged.RaggedShape, axis: int) torch.Tensor
Return the row splits of a certain axis.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [x x] [] [x x x] ]') >>> shape.row_splits(1) tensor([0, 2, 2, 5], dtype=torch.int32) >>> shape2 = k2r.RaggedShape('[ [[x] [] [x x]] [[x x x] [x] [x x x x] [] []] ]') >>> shape2.row_splits(1) tensor([0, 3, 8], dtype=torch.int32) >>> shape2.row_splits(2) tensor([ 0, 1, 1, 3, 6, 7, 11, 11, 11], dtype=torch.int32)
- Parameters
axis – The axis whose row splits are to be returned. Hint: axis >= 1.
- Returns
Return the row splits of the given axis.
to
- RaggedShape.to(self: _k2.ragged.RaggedShape, device: object) _k2.ragged.RaggedShape
Move this shape to the specified device.
Hint
If the shape is already on the specified device, the returned shape shares the underlying memory with self.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[[x]]') >>> shape.device device(type='cpu') >>> import torch >>> shape2 = shape.to(torch.device('cuda', 0)) >>> shape2.device device(type='cuda', index=0) >>> shape [ [ x ] ] >>> shape2 [ [ x ] ]
- Parameters
device – An instance of torch.device. It can be either a CPU device or a CUDA device.
- Returns
Return a shape on the given device.
tot_size
- RaggedShape.tot_size(self: _k2.ragged.RaggedShape, axis: int) int
Return the number of elements at a certain axis.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [x] [x x] [x x x] []]') >>> shape.tot_size(1) 6 >>> shape.numel() 6 >>> shape2 = k2r.RaggedShape('[ [[x]] [[x x]] [[x x x]] [[]] [[]] [[]] [[]] ]') >>> shape2.tot_size(1) 7 >>> shape2 = k2r.RaggedShape('[ [[x]] [[x x]] [[x x x]] [[]] [[]] [[]] [[] []] ]') >>> shape2.tot_size(1) 8 >>> shape2.tot_size(2) 6 >>> shape2.numel() 6
- Parameters
axis – Return the number of elements for this axis.
- Returns
Return the number of elements at axis.
tot_sizes
- RaggedShape.tot_sizes(self: _k2.ragged.RaggedShape) tuple
Return total sizes of every axis in a tuple.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [x] [ ] [x x x x]]') >>> shape.dim0 3 >>> shape.tot_size(1) 5 >>> shape.tot_sizes() (3, 5) >>> shape2 = k2r.RaggedShape('[ [[x] []] [[x x x x]]]') >>> shape2.dim0 2 >>> shape2.tot_size(1) 3 >>> shape2.tot_size(2) 5 >>> shape2.tot_sizes() (2, 3, 5)
- Returns
Return a tuple containing the total sizes of each axis.
ans[i] is the total size of axis i (for i > 0). For i = 0, it is the dim0 of this shape.
device
- RaggedShape.device
Return the device of this shape.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[[]]') >>> shape.device device(type='cpu') >>> import torch >>> shape2 = shape.to(torch.device('cuda', 0)) >>> shape2.device device(type='cuda', index=0)
dim0
- RaggedShape.dim0
Return number of sublists at axis 0.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [x] [] [x x x x x]]') >>> shape.dim0 3 >>> shape2 = k2r.RaggedShape('[ [[x] []] [[]] [[x] [x x] [x x x]] [[]]]') >>> shape2.dim0 4
num_axes
- RaggedShape.num_axes
Return the number of axes of this shape.
>>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[[] []]') >>> shape.num_axes 2 >>> shape2 = k2r.RaggedShape('[ [[]] [[]]]') >>> shape2.num_axes 3
RaggedTensor
__eq__
- RaggedTensor.__eq__(self: _k2.ragged.RaggedTensor, other: _k2.ragged.RaggedTensor) bool
Compare two ragged tensors.
Caution
The two tensors MUST have the same dtype. Otherwise, it throws an exception.
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1]]) >>> b = a.clone() >>> a == b True >>> c = a.to(torch.float32) >>> try: ... c == b ... except RuntimeError: ... print("raised exception") raised exception
- Parameters
other – The tensor to be compared.
- Returns
Return True if the two tensors are equal. Return False otherwise.
__getitem__
- RaggedTensor.__getitem__(*args, **kwargs)
Overloaded function.
__getitem__(self: _k2.ragged.RaggedTensor, i: int) -> object
Select the i-th sublist along axis 0.
Caution
Support for autograd is to be implemented.
Example 1:
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor('[ [[1 3] [] [9]] [[8]] ]') >>> a RaggedTensor([[[1, 3], [], [9]], [[8]]], dtype=torch.int32) >>> a[0] RaggedTensor([[1, 3], [], [9]], dtype=torch.int32) >>> a[1] RaggedTensor([[8]], dtype=torch.int32)
Example 2:
>>> a = k2r.RaggedTensor('[ [1 3] [9] [8] ]') >>> a RaggedTensor([[1, 3], [9], [8]], dtype=torch.int32) >>> a[0] tensor([1, 3], dtype=torch.int32) >>> a[1] tensor([9], dtype=torch.int32)
- Parameters
i – The i-th sublist along axis 0.
- Returns
Return a new ragged tensor with one fewer axis. If num_axes == 2, the return value will be a 1-D tensor.
__getitem__(self: _k2.ragged.RaggedTensor, key: slice) -> _k2.ragged.RaggedTensor
Slice sublists along axis 0 with the given range. Only a slicing step equal to 1 is supported.
Caution
Support for autograd is to be implemented.
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor('[ [[1 3] [] [9]] [[8]] [[10 11]] ]') >>> a RaggedTensor([[[1, 3], [], [9]], [[8]], [[10, 11]]], dtype=torch.int32) >>> a[0:2] RaggedTensor([[[1, 3], [], [9]], [[8]]], dtype=torch.int32) >>> a[1:2] RaggedTensor([[[8]]], dtype=torch.int32)
- Parameters
key – Slice containing integer constants.
- Returns
Return a new ragged tensor with the same number of axes as the original ragged tensor, containing only the sublists within the range.
__getitem__(self: _k2.ragged.RaggedTensor, key: torch.Tensor) -> _k2.ragged.RaggedTensor
Slice a ragged tensor along axis 0 using a 1-D torch.int32 tensor.
Example 1:
>>> import k2 >>> a = k2.RaggedTensor([[1, 2, 0], [0, 1], [2, 3]]) >>> b = k2.RaggedTensor([[10, 20], [300], [-10, 0, -1], [-2, 4, 5]]) >>> a[0] tensor([1, 2, 0], dtype=torch.int32) >>> b[a[0]] RaggedTensor([[300], [-10, 0, -1], [10, 20]], dtype=torch.int32) >>> a[1] tensor([0, 1], dtype=torch.int32) >>> b[a[1]] RaggedTensor([[10, 20], [300]], dtype=torch.int32) >>> a[2] tensor([2, 3], dtype=torch.int32) >>> b[a[2]] RaggedTensor([[-10, 0, -1], [-2, 4, 5]], dtype=torch.int32)
Example 2:
>>> import torch >>> import k2 >>> a = k2.RaggedTensor([ [[1], [2, 3], [0]], [[], [2]], [[10, 20]] ]) >>> i = torch.tensor([0, 2, 1, 0], dtype=torch.int32) >>> a[i] RaggedTensor([[[1], [2, 3], [0]], [[10, 20]], [[], [2]], [[1], [2, 3], [0]]], dtype=torch.int32)
- Parameters
key – A 1-D torch.int32 tensor containing the indexes to select along axis 0.
- Returns
Return a new ragged tensor with the same number of axes as self, containing only the specified sublists.
__getstate__
- RaggedTensor.__getstate__(self: k2.RaggedTensor) tuple
Requires a tensor with 2 axes or 3 axes. Other numbers of axes are not implemented yet.
This method is to support pickle, e.g., used by torch.save(). You are not expected to call it yourself.
- Returns
If this tensor has 2 axes, return a tuple containing (self.row_splits(1), "row_ids1", self.values). If this tensor has 3 axes, return a tuple containing (self.row_splits(1), "row_ids1", self.row_splits(2), "row_ids2", self.values).
Note
"row_ids1" and "row_ids2" in the returned value are for backward compatibility.
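Since __getstate__ and __setstate__ are provided, a RaggedTensor can be serialized with torch.save and restored with torch.load. A minimal sketch (on newer PyTorch versions torch.load may need weights_only=False for non-tensor objects):
import torch
import k2.ragged as k2r

a = k2r.RaggedTensor([[1, 2], [3]])
torch.save(a, 'ragged.pt')
b = torch.load('ragged.pt')
assert a == b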
__init__
- RaggedTensor.__init__(*args, **kwargs)
Overloaded function.
__init__(self: _k2.ragged.RaggedTensor, data: list, dtype: object = None, device: object = 'cpu') -> None
Create a ragged tensor with arbitrary number of axes.
Note
A ragged tensor has at least two axes.
Example 1:
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([ [1, 2], [5], [], [9] ]) >>> a RaggedTensor([[1, 2], [5], [], [9]], dtype=torch.int32) >>> a.dtype torch.int32 >>> b = k2r.RaggedTensor([ [1, 3.0], [] ]) >>> b RaggedTensor([[1, 3], []], dtype=torch.float32) >>> b.dtype torch.float32 >>> c = k2r.RaggedTensor([ [1] ], dtype=torch.float64) >>> c RaggedTensor([[1]], dtype=torch.float64) >>> c.dtype torch.float64 >>> d = k2r.RaggedTensor([ [[1], [2, 3]], [[4], []] ]) >>> d RaggedTensor([[[1], [2, 3]], [[4], []]], dtype=torch.int32) >>> d.num_axes 3 >>> e = k2r.RaggedTensor([]) >>> e RaggedTensor([], dtype=torch.int32) >>> e.num_axes 2 >>> e.shape.row_splits(1) tensor([0], dtype=torch.int32) >>> e.shape.row_ids(1) tensor([], dtype=torch.int32)
Example 2:
>>> k2r.RaggedTensor([ [[1, 2]], [], [[]] ]) RaggedTensor([[[1, 2]], [], [[]]], dtype=torch.int32) >>> k2r.RaggedTensor([ [[1, 2]], [], [[]] ], device='cuda:0') RaggedTensor([[[1, 2]], [], [[]]], device='cuda:0', dtype=torch.int32)
- Parameters
data – A list of sublist(s) of integers or real numbers. It can have an arbitrary number of axes (at least two).
dtype – Optional. If None, it infers the dtype from data automatically, which is either torch.int32 or torch.float32. Supported dtypes are: torch.int32, torch.float32, and torch.float64.
device – It can be either an instance of torch.device or a string representing a torch device. Example values are: "cpu", "cuda:0", torch.device("cpu"), torch.device("cuda", 0).
__init__(self: _k2.ragged.RaggedTensor, data: list, dtype: object = None, device: str = 'cpu') -> None
Create a ragged tensor with arbitrary number of axes.
Note
A ragged tensor has at least two axes.
Example 1:
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([ [1, 2], [5], [], [9] ]) >>> a RaggedTensor([[1, 2], [5], [], [9]], dtype=torch.int32) >>> a.dtype torch.int32 >>> b = k2r.RaggedTensor([ [1, 3.0], [] ]) >>> b RaggedTensor([[1, 3], []], dtype=torch.float32) >>> b.dtype torch.float32 >>> c = k2r.RaggedTensor([ [1] ], dtype=torch.float64) >>> c RaggedTensor([[1]], dtype=torch.float64) >>> c.dtype torch.float64 >>> d = k2r.RaggedTensor([ [[1], [2, 3]], [[4], []] ]) >>> d RaggedTensor([[[1], [2, 3]], [[4], []]], dtype=torch.int32) >>> d.num_axes 3 >>> e = k2r.RaggedTensor([]) >>> e RaggedTensor([], dtype=torch.int32) >>> e.num_axes 2 >>> e.shape.row_splits(1) tensor([0], dtype=torch.int32) >>> e.shape.row_ids(1) tensor([], dtype=torch.int32)
Example 2:
>>> k2r.RaggedTensor([ [[1, 2]], [], [[]] ]) RaggedTensor([[[1, 2]], [], [[]]], dtype=torch.int32) >>> k2r.RaggedTensor([ [[1, 2]], [], [[]] ], device='cuda:0') RaggedTensor([[[1, 2]], [], [[]]], device='cuda:0', dtype=torch.int32)
- Parameters
data – A list of sublist(s) of integers or real numbers. It can have an arbitrary number of axes (at least two).
dtype – Optional. If None, it infers the dtype from data automatically, which is either torch.int32 or torch.float32. Supported dtypes are: torch.int32, torch.float32, and torch.float64.
device – It can be either an instance of torch.device or a string representing a torch device. Example values are: "cpu", "cuda:0", torch.device("cpu"), torch.device("cuda", 0).
__init__(self: _k2.ragged.RaggedTensor, s: str, dtype: object = None, device: object = 'cpu') -> None
Create a ragged tensor from its string representation.
Fields are separated by space(s) or comma(s).
An example string for a 2-axis ragged tensor is given below:
[ [1] [2] [3, 4], [5 6 7, 8] ]
An example string for a 3-axis ragged tensor is given below:
[ [[1]] [[]] ]
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor('[ [1] [] [3 4] ]') >>> a RaggedTensor([[1], [], [3, 4]], dtype=torch.int32) >>> a.num_axes 2 >>> a.dtype torch.int32 >>> b = k2r.RaggedTensor('[ [[] [3]] [[10]] ]', dtype=torch.float32) >>> b RaggedTensor([[[], [3]], [[10]]], dtype=torch.float32) >>> b.dtype torch.float32 >>> b.num_axes 3 >>> c = k2r.RaggedTensor('[[1.]]') >>> c.dtype torch.float32 >>> d = k2r.RaggedTensor('[[1.]]', device='cuda:0') >>> d RaggedTensor([[1]], device='cuda:0', dtype=torch.float32)
Note
The number of spaces or commas in s does not affect the result. Of course, numbers have to be separated by at least one space or comma.
- Parameters
s – A string representation of a ragged tensor.
dtype – The desired dtype of the tensor. If it is None, it tries to infer the correct dtype from s, which is assumed to be either torch.int32 or torch.float32. Supported dtypes are: torch.int32, torch.float32, and torch.float64.
device – It can be either an instance of torch.device or a string representing a torch device. Example values are: "cpu", "cuda:0", torch.device("cpu"), torch.device("cuda", 0).
__init__(self: _k2.ragged.RaggedTensor, s: str, dtype: object = None, device: str = 'cpu') -> None
Create a ragged tensor from its string representation.
Fields are separated by space(s) or comma(s).
An example string for a 2-axis ragged tensor is given below:
[ [1] [2] [3, 4], [5 6 7, 8] ]
An example string for a 3-axis ragged tensor is given below:
[ [[1]] [[]] ]
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor('[ [1] [] [3 4] ]') >>> a RaggedTensor([[1], [], [3, 4]], dtype=torch.int32) >>> a.num_axes 2 >>> a.dtype torch.int32 >>> b = k2r.RaggedTensor('[ [[] [3]] [[10]] ]', dtype=torch.float32) >>> b RaggedTensor([[[], [3]], [[10]]], dtype=torch.float32) >>> b.dtype torch.float32 >>> b.num_axes 3 >>> c = k2r.RaggedTensor('[[1.]]') >>> c.dtype torch.float32 >>> d = k2r.RaggedTensor('[[1.]]', device='cuda:0') >>> d RaggedTensor([[1]], device='cuda:0', dtype=torch.float32)
Note
The number of spaces or commas in s does not affect the result. Of course, numbers have to be separated by at least one space or comma.
- Parameters
s – A string representation of a ragged tensor.
dtype – The desired dtype of the tensor. If it is None, it tries to infer the correct dtype from s, which is assumed to be either torch.int32 or torch.float32. Supported dtypes are: torch.int32, torch.float32, and torch.float64.
device – It can be either an instance of torch.device or a string representing a torch device. Example values are: "cpu", "cuda:0", torch.device("cpu"), torch.device("cuda", 0).
__init__(self: _k2.ragged.RaggedTensor, shape: _k2.ragged.RaggedShape, value: torch.Tensor) -> None
Create a ragged tensor from a shape and a value.
>>> import torch >>> import k2.ragged as k2r >>> shape = k2r.RaggedShape('[ [x x] [] [x x x] ]') >>> value = torch.tensor([10, 0, 20, 30, 40], dtype=torch.float32) >>> ragged = k2r.RaggedTensor(shape, value) >>> ragged RaggedTensor([[10, 0], [], [20, 30, 40]], dtype=torch.float32)
- Parameters
shape – The shape of the tensor.
value – The value of the tensor.
__init__(self: _k2.ragged.RaggedTensor, tensor: torch.Tensor) -> None
Create a ragged tensor from a torch tensor.
Note
It turns a regular tensor into a ragged tensor.
Caution
The input tensor has to have more than 1 dimension. That is, tensor.ndim > 1.
Also, if the input tensor is contiguous, self will share the underlying memory with it. Otherwise, the memory of the input tensor is copied to create self.
Supported dtypes of the input tensor are: torch.int32, torch.float32, and torch.float64.
Example 1:
>>> import torch >>> import k2.ragged as k2r >>> a = torch.arange(6, dtype=torch.int32).reshape(2, 3) >>> b = k2r.RaggedTensor(a) >>> a tensor([[0, 1, 2], [3, 4, 5]], dtype=torch.int32) >>> b RaggedTensor([[0, 1, 2], [3, 4, 5]], dtype=torch.int32) >>> a.is_contiguous() True >>> a[0, 0] = 10 >>> b RaggedTensor([[10, 1, 2], [3, 4, 5]], dtype=torch.int32) >>> b.values[1] = -2 >>> a tensor([[10, -2, 2], [ 3, 4, 5]], dtype=torch.int32)
Example 2:
>>> import k2.ragged as k2r >>> a = torch.arange(24, dtype=torch.int32).reshape(2, 12)[:, ::4] >>> a tensor([[ 0, 4, 8], [12, 16, 20]], dtype=torch.int32) >>> a.is_contiguous() False >>> b = k2r.RaggedTensor(a) >>> b RaggedTensor([[0, 4, 8], [12, 16, 20]], dtype=torch.int32) >>> a[0, 0] = 10 >>> b RaggedTensor([[0, 4, 8], [12, 16, 20]], dtype=torch.int32) >>> a tensor([[10, 4, 8], [12, 16, 20]], dtype=torch.int32)
Example 3:
>>> import torch >>> import k2.ragged as k2r >>> a = torch.arange(24, dtype=torch.float32).reshape(2, 3, 4) >>> a tensor([[[ 0., 1., 2., 3.], [ 4., 5., 6., 7.], [ 8., 9., 10., 11.]], [[12., 13., 14., 15.], [16., 17., 18., 19.], [20., 21., 22., 23.]]]) >>> b = k2r.RaggedTensor(a) >>> b RaggedTensor([[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]], [[12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]], dtype=torch.float32) >>> b.dtype torch.float32 >>> c = torch.tensor([[1, 2]], device='cuda:0', dtype=torch.float32) >>> k2r.RaggedTensor(c) RaggedTensor([[1, 2]], device='cuda:0', dtype=torch.float32)
- Parameters
tensor – An N-D (N > 1) tensor.
__ne__
- RaggedTensor.__ne__(self: _k2.ragged.RaggedTensor, other: _k2.ragged.RaggedTensor) bool
Compare two ragged tensors.
Caution
The two tensors MUST have the same dtype. Otherwise, it throws an exception.
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1, 2], [3]]) >>> b = a.clone() >>> b != a False >>> c = k2r.RaggedTensor([[1], [2], [3]]) >>> c != a True
- Parameters
other – The tensor to be compared.
- Returns
Return True if the two tensors are NOT equal. Return False otherwise.
__repr__
- RaggedTensor.__repr__(self: _k2.ragged.RaggedTensor) str
Return a string representation of this tensor.
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1], [2, 3], []]) >>> a RaggedTensor([[1], [2, 3], []], dtype=torch.int32) >>> str(a) 'RaggedTensor([[1],\n [2, 3],\n []], dtype=torch.int32)' >>> b = k2r.RaggedTensor([[1, 2]], device='cuda:0') >>> b RaggedTensor([[1, 2]], device='cuda:0', dtype=torch.int32)
__setstate__
- RaggedTensor.__setstate__(self: k2.RaggedTensor, arg0: tuple) None
Set the content of this class from arg0.
This method is to support pickle, e.g., used by torch.load(). You are not expected to call it yourself.
- Parameters
arg0 – It is the return value from the method __getstate__.
__str__
- RaggedTensor.__str__(self: _k2.ragged.RaggedTensor) str
Return a string representation of this tensor.
>>> import torch >>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([[1], [2, 3], []]) >>> a RaggedTensor([[1], [2, 3], []], dtype=torch.int32) >>> str(a) 'RaggedTensor([[1],\n [2, 3],\n []], dtype=torch.int32)' >>> b = k2r.RaggedTensor([[1, 2]], device='cuda:0') >>> b RaggedTensor([[1, 2]], device='cuda:0', dtype=torch.int32)
add
- RaggedTensor.add(self: _k2.ragged.RaggedTensor, value: torch.Tensor, alpha: object) _k2.ragged.RaggedTensor
Add value, scaled by alpha, to the source ragged tensor over the last axis.
It implements:
dest[…][i][j] = src[…][i][j] + alpha * value[i]
>>> import k2.ragged as k2r >>> import torch >>> src = k2r.RaggedTensor([[1, 3], [1], [2, 8]], dtype=torch.int32) >>> value = torch.tensor([1, 2, 3], dtype=torch.int32) >>> src.add(value, 1) RaggedTensor([[2, 4], [3], [5, 11]], dtype=torch.int32) >>> src.add(value, -1) RaggedTensor([[0, 2], [-1], [-1, 5]], dtype=torch.int32)
- Parameters
value – The value to be added to self; its dimension MUST equal the number of sublists along the last dimension of self.
alpha – The number used to scale value before adding it to self.
- Returns
Return a new RaggedTensor, sharing the same dtype and device as self.
arange
- RaggedTensor.arange(self: _k2.ragged.RaggedTensor, axis: int, begin: int, end: int) _k2.ragged.RaggedTensor
Return a sub-range of self containing indexes begin through end - 1 along axis axis of self.
The axis argument may be confusing; its behavior is equivalent to:
for i in range(axis): self = self.remove_axis(0)
return self.arange(0, begin, end)
Caution
The returned tensor shares the underlying memory with self.
Example 1
>>> import k2.ragged as k2r >>> a = k2r.RaggedTensor([ [[1], [], [2]], [[], [4, 5], []], [[], [1]], [[]] ]) >>> a RaggedTensor([[[1], [], [2]], [[], [4, 5], []], [[], [1]], [[]]], dtype=torch.int32) >>> a.num_axes 3 >>> b = a.arange(axis=0, begin=1, end=3) >>> b RaggedTensor([[[], [4, 5], []], [[], [1]]], dtype=torch.int32) >>> b.num_axes 3 >>> c = a.arange(axis=0, begin=1, end=2) >>> c RaggedTensor([[[], [4, 5], []]], dtype=torch.int32) >>> c.num_axes 3 >>> d = a.arange(axis=1, begin=0, end=4) >>> d RaggedTensor([[1], [], [2], []], dtype=torch.int32) >>> d.num_axes 2 >>> e = a.arange(axis=1, begin=2, end=5) >>> e RaggedTensor([[2], [], [4, 5]], dtype=torch.int32) >>> e.num_axes 2
Example 2
>>> a = k2r.RaggedTensor([ [[[], [1], [2, 3]],[[5, 8], [], [9]]], [[[10], [0], []]], [[[], [], [1]]] ]) >>> a.num_axes 4 >>> b = a.arange(axis=0, begin=0, end=2) >>> b RaggedTensor([[[[], [1], [2, 3]], [[5, 8], [], [9]]], [[[10], [0], []]]], dtype=torch.int32) >>> b.num_axes 4 >>> c = a.arange(axis=1, begin=1, end=3) >>> c RaggedTensor([[[5, 8], [], [9]], [[10], [0], []]], dtype=torch.int32) >>> c.num_axes 3 >>> d = a.arange(axis=2, begin=0, end=5) >>> d RaggedTensor([[], [1], [2, 3], [5, 8], []], dtype=torch.int32) >>> d.num_axes 2
Example 3
>>> a = k2r.RaggedTensor([[0], [1], [2], [], [3]]) >>> a RaggedTensor([[0], [1], [2], [], [3]], dtype=torch.int32) >>> a.num_axes 2 >>> b = a.arange(axis=0, begin=1, end=4) >>> b RaggedTensor([[1], [2], []], dtype=torch.int32) >>> b.values[0] = -1 >>> a RaggedTensor([[0], [-1], [2], [], [3]], dtype=torch.int32)
- Parameters
axis – The axis to which begin and end correspond.
begin – The beginning of the range (inclusive).