k2

add_epsilon_self_loops

k2.add_epsilon_self_loops(fsa, ret_arc_map=False)[source]

Add epsilon self-loops to an Fsa or FsaVec.

This is required when composing using a composition method that does not treat epsilons specially, if the other FSA has epsilons in it.

Parameters
  • fsa (Fsa) – The input FSA. It can be either a single FSA or an FsaVec.

  • ret_arc_map (bool) – If False, return the resulting Fsa. If True, return an extra arc map.

Return type

Union[Fsa, Tuple[Fsa, Tensor]]

Returns

If ret_arc_map is False, return an instance of Fsa that has an epsilon self-loop on every non-final state. If ret_arc_map is True, it returns an extra arc_map. arc_map[i] is the arc index in the input fsa that corresponds to the i-th arc in the resulting Fsa. arc_map[i] is -1 if the i-th arc in the resulting Fsa has no counterpart in the input fsa.
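
For instance, a minimal sketch (the two toy FSAs below are made up for illustration) of preparing an epsilon-free FSA for epsilon-unaware intersection:

import k2

# `b` contains an epsilon arc (label 0); `a` is epsilon-free.
a = k2.Fsa.from_str('''
0 1 1 0.1
1 2 -1 0.2
2
''')
b = k2.Fsa.from_str('''
0 1 0 0.3
1 2 1 0.4
2 3 -1 0.5
3
''')

# treat_epsilons_specially=False does not handle epsilons, so the
# epsilon-free input needs epsilon self-loops added first.
a_loops = k2.add_epsilon_self_loops(a)
ans = k2.intersect(a_loops, b, treat_epsilons_specially=False)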

arc_sort

k2.arc_sort(fsa, ret_arc_map=False)[source]

Sort arcs of every state.

Note

Arcs are sorted by labels first, and then by dest states.

Caution

If the input fsa is already arc sorted, we return it directly. Otherwise, a new sorted fsa is returned.

Parameters
  • fsa (Fsa) – The input FSA.

  • ret_arc_map (bool) – True to return an extra arc_map (a 1-D tensor with dtype being torch.int32). arc_map[i] is the arc index in the input fsa that corresponds to the i-th arc in the output Fsa.

Return type

Union[Fsa, Tuple[Fsa, Tensor]]

Returns

If ret_arc_map is False, return the sorted FSA. It is the same as the input fsa if the input fsa is arc sorted. Otherwise, a new sorted fsa is returned and the input fsa is NOT modified. If ret_arc_map is True, an extra arc map is also returned.

Example: Sort a single FSA

Listing 15 Sort a single FSA
#!/usr/bin/env python3

import k2

s = '''
0 1 1 4 0.1
0 1 3 5 0.2
0 1 2 3 0.3
0 2 5 2 0.4
0 2 4 1 0.5
1 2 2 3 0.6
1 2 3 1 0.7
1 2 1 2 0.8
2 3 -1 -1 0.9
3
'''
fsa = k2.Fsa.from_str(s, acceptor=False)
fsa.draw('arc_sort_single_before.svg', title='Before k2.arc_sort')
sorted_fsa = k2.arc_sort(fsa)
sorted_fsa.draw('arc_sort_single_after.svg', title='After k2.arc_sort')

# If you want to sort by aux_labels, you can use
inverted_fsa = k2.invert(fsa)
sorted_fsa_2 = k2.arc_sort(inverted_fsa)
sorted_fsa_2 = k2.invert(sorted_fsa_2)
sorted_fsa_2.draw('arc_sort_single_after_aux_labels.svg',
                  title='After k2.arc_sort by aux_labels')
Fig.: Fsa before k2.arc_sort
Fig.: Fsa after k2.arc_sort
Fig.: Fsa after k2.arc_sort by aux_labels

cat

k2.cat(srcs)[source]

Concatenate a list of FsaVec into a single FsaVec.

Caution

Only common tensor attributes are kept in the output FsaVec. For non-tensor attributes, only one copy is kept in the output FsaVec: the one from the FsaVec with the lowest index in srcs.

Parameters

srcs (List[Fsa]) – A list of FsaVec. Each element MUST be an FsaVec.

Return type

Fsa

Returns

Return a single FsaVec concatenated from the input FsaVecs.
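
For instance, a minimal sketch (toy FsaVecs for illustration):

import k2

a = k2.create_fsa_vec([k2.linear_fsa([1, 2]), k2.linear_fsa([3])])
b = k2.create_fsa_vec([k2.linear_fsa([4, 5, 6])])
ans = k2.cat([a, b])
assert ans.shape[0] == 3  # the 2 FSAs from `a` followed by the 1 from `b`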

closure

k2.closure(fsa)[source]

Compute the Kleene closure of the input FSA.

Parameters

fsa (Fsa) – The input FSA. It has to be a single FSA. That is, len(fsa.shape) == 2.

Return type

Fsa

Returns

The resulting FSA which is the Kleene closure of the input FSA.
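
For instance, a minimal sketch:

import k2

fsa = k2.linear_fsa([1, 2, 3])
fsa_star = k2.closure(fsa)
# fsa_star accepts the empty string, "1 2 3", "1 2 3 1 2 3", ...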

compose

k2.compose(a_fsa, b_fsa, treat_epsilons_specially=True, inner_labels=None)[source]

Compute the composition of two FSAs.

When treat_epsilons_specially is True, this function works only on CPU. When treat_epsilons_specially is False and both a_fsa and b_fsa are on GPU, then this function works on GPU; in this case, the two input FSAs do not need to be arc sorted.

Note

a_fsa.aux_labels is required to be defined and it can be either a torch.Tensor or a ragged tensor of type k2.RaggedTensor. If it is a ragged tensor, then it requires that a_fsa.requires_grad is False.

For both FSAs, the aux_labels attribute is interpreted as output labels (olabels), and the composition involves matching the olabels of a_fsa with the ilabels of b_fsa. This is implemented by intersecting the inverse of a_fsa (a_fsa_inv) with b_fsa, then replacing the ilabels of the result with the original ilabels of a_fsa, which are now the aux_labels of a_fsa_inv. If b_fsa.aux_labels is not defined, b_fsa is treated as an acceptor (as in OpenFST), i.e. its olabels and ilabels are assumed to be the same.

Refer to k2.intersect() for how we assign the attributes of the output FSA.

Parameters
  • a_fsa (Fsa) – The first input FSA. It can be either a single FSA or an FsaVec.

  • b_fsa (Fsa) – The second input FSA. it can be either a single FSA or an FsaVec.

  • treat_epsilons_specially (bool) – If True, epsilons will be treated as epsilons, meaning epsilon arcs can match with an implicit epsilon self-loop. If False, epsilons will be treated as real, normal symbols (to have them treated as epsilons in this case you may have to add epsilon self-loops to whichever of the inputs is naturally epsilon-free).

  • inner_labels (Optional[str]) – If specified (and if a_fsa has aux_labels), the labels that we matched on, which would normally be discarded, will instead be copied to this attribute name.

Caution

b_fsa has to be arc sorted if the function runs on CPU.

Return type

Fsa

Returns

The result of composing a_fsa and b_fsa. len(out_fsa.shape) is 2 if and only if the two input FSAs are single FSAs; otherwise, len(out_fsa.shape) is 3.
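
For instance, a minimal sketch (the two toy transducers below are made up for illustration):

import k2

# a maps 1->10, 2->20; b maps 10->100, 20->200.
a = k2.Fsa.from_str('''
0 1 1 10 0.1
1 2 2 20 0.2
2 3 -1 -1 0.3
3
''', acceptor=False)
b = k2.Fsa.from_str('''
0 1 10 100 0.4
1 2 20 200 0.5
2 3 -1 -1 0.6
3
''', acceptor=False)
b = k2.arc_sort(b)  # b must be arc sorted on CPU

ans = k2.compose(a, b, inner_labels='phones')
# ans maps 1->100 and 2->200; the matched labels (10, 20) are kept
# on the attribute ans.phones because inner_labels was given.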

compose_arc_maps

k2.compose_arc_maps(step1_arc_map, step2_arc_map)[source]

Compose arc maps from two Fsa operations.

It implements:

  • ans_arc_map[i] = step1_arc_map[step2_arc_map[i]] if step2_arc_map[i] is not -1

  • ans_arc_map[i] = -1 if step2_arc_map[i] is -1

for i in 0 to step2_arc_map.numel() - 1.

Parameters
  • step1_arc_map (Tensor) – A 1-D tensor with dtype torch.int32 from the first Fsa operation.

  • step2_arc_map (Tensor) – A 1-D tensor with dtype torch.int32 from the second Fsa operation.

Return type

Tensor

Returns

Return a 1-D tensor with dtype torch.int32. It has the same number of elements as step2_arc_map. That is, ans_arc_map.shape == step2_arc_map.shape.
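
For instance, a minimal sketch of the semantics with hand-made arc maps:

import torch
import k2

step1_arc_map = torch.tensor([0, 2, -1, 3], dtype=torch.int32)
step2_arc_map = torch.tensor([1, -1, 3, 0], dtype=torch.int32)
ans = k2.compose_arc_maps(step1_arc_map, step2_arc_map)
# ans[i] = step1_arc_map[step2_arc_map[i]] unless step2_arc_map[i]
# is -1, so ans is [2, -1, 3, 0]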

connect

k2.connect(fsa)[source]

Connect an FSA.

Removes states that are neither accessible nor co-accessible.

Note

A state is not accessible if it is not reachable from the start state. A state is not co-accessible if it cannot reach the final state.

Caution

If the input FSA is already connected, it is returned directly. Otherwise, a new connected FSA is returned.

Parameters

fsa (Fsa) – The input FSA to be connected.

Return type

Fsa

Returns

An FSA that is connected.

convert_dense_to_fsa_vec

k2.convert_dense_to_fsa_vec(dense_fsa_vec)[source]

Convert a DenseFsaVec to an FsaVec.

Caution

Intended for use in testing/debug mode only. This operation is NOT differentiable.

Parameters

dense_fsa_vec (DenseFsaVec) – DenseFsaVec to convert.

Return type

Fsa

Returns

The converted FsaVec.

create_fsa_vec

k2.create_fsa_vec(fsas)[source]

Create an FsaVec from a list of FSAs.

We use the following rules to set the attributes of the output FsaVec:

  • For tensor attributes, we assume that all input FSAs have the same attribute name and the values are concatenated.

  • For non-tensor attributes, if any two of the input FSAs have the same attribute name, then we assume that their attribute values are equal and the output FSA will inherit the attribute.

Parameters

fsas – A list of Fsa. Each element must be a single FSA.

Returns

An instance of Fsa that represents an FsaVec.
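
For instance, a minimal sketch:

import k2

fsa1 = k2.linear_fsa([1, 2, 3])
fsa2 = k2.linear_fsa([4, 5])
fsa_vec = k2.create_fsa_vec([fsa1, fsa2])
assert len(fsa_vec.shape) == 3 and fsa_vec.shape[0] == 2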

create_sparse

k2.create_sparse(rows, cols, values, size=None, min_col_index=None)[source]

This is a utility function that creates a torch sparse matrix, likely intended to represent posteriors. Typical usage is something like:

post = k2.create_sparse(fsa.seqframe, fsa.phones,
                        fsa.get_arc_post(True,True).exp(),
                        min_col_index=1)

(assuming seqframe and phones were integer-valued attributes of fsa).

Parameters
  • rows (Tensor) – Row indexes of the sparse matrix (a torch.Tensor), which must have values >= 0; likely fsa.seqframe. Must be 1-D. Will be converted to dtype=torch.long

  • cols (Tensor) – Column indexes of the sparse matrix, with the same shape as rows. Will be converted to dtype=torch.long

  • values (Tensor) – Values of the sparse matrix, likely of dtype float or double, with the same shape as rows and cols.

  • size (Optional[Tuple[int, int]]) – Optional. If not None, it is assumed to be a tuple containing (num_frames, highest_phone_plus_one)

  • min_col_index (Optional[int]) – If provided, before the sparse tensor is constructed we will filter out elements with cols[i] < min_col_index. Will likely be 0 or 1, if set. This is necessary if cols may have values less than 0, or if you want to filter out 0 values (e.g. as representing blanks).

Returns

Returns a torch.Tensor that is sparse with coo (coordinate) format, i.e. layout=torch.sparse_coo (which is actually the only sparse format that torch currently supports).

ctc_graph

k2.ctc_graph(symbols, modified=False, device='cpu')[source]

Construct ctc graphs from symbols.

Note

The scores of arcs in the returned FSA are all 0.

Parameters
  • symbols (Union[List[List[int]], RaggedTensor]) –

    It can be one of the following types:

    • A list of list-of-integers, e.g., [ [1, 2], [1, 2, 3] ]

    • An instance of k2.RaggedTensor. Must have num_axes == 2.

  • modified (bool) – If False, create standard CTC graphs, where the blank is mandatory between a pair of identical symbols. Otherwise, create modified CTC graphs without that constraint. Default False.

  • device (Union[device, str, None]) – Optional. It can be either a string (e.g., ‘cpu’, ‘cuda:0’) or a torch.device. By default, the returned FSA is on CPU. If symbols is an instance of k2.RaggedTensor, the returned FSA will be on the same device as symbols.

Return type

Fsa

Returns

An FsaVec containing the constructed CTC graphs. Its dim0 equals len(symbols) if symbols is a list of lists, or symbols.dim0 if symbols is a k2.RaggedTensor.

Example 1

Listing 16 Usage of k2.ctc_graph
#!/usr/bin/env python3

import k2

isym = k2.SymbolTable.from_str('''
blk 0
a 1
b 2
c 3
''')

osym = k2.SymbolTable.from_str('''
a 1
b 2
c 3
''')

fsa = k2.ctc_graph([[1, 2, 2, 3]], modified=False)
fsa_modified = k2.ctc_graph([[1, 2, 2, 3]], modified=True)

fsa.labels_sym = isym
fsa.aux_labels_sym = osym

fsa_modified.labels_sym = isym
fsa_modified.aux_labels_sym = osym

# fsa is an FsaVec, so we use fsa[0] to visualize the first Fsa
fsa[0].draw('ctc_graph.svg',
            title='CTC graph for the string "abbc" (modified=False)')
fsa_modified[0].draw('modified_ctc_graph.svg',
                     title='CTC graph for the string "abbc" (modified=True)')
Fig. 20 CTC graph (modified=False). Note: there is a mandatory blank between states 3 and 5.

Fig. 21 CTC graph (modified=True). Note: there is no mandatory blank between states 3 and 5.

Example 2 Construct a CTC graph using composition

Listing 17 Construct a CTC graph using composition
#!/usr/bin/env python3

# Construct a CTC graph by intersection

import k2

isym = k2.SymbolTable.from_str('''
blk 0
a 1
b 2
c 3
''')

osym = k2.SymbolTable.from_str('''
a 1
b 2
c 3
''')

linear_fsa = k2.linear_fsa([1, 2, 2, 3])
linear_fsa.labels_sym = isym

ctc_topo = k2.ctc_topo(max_token=3, modified=False)
ctc_topo_modified = k2.ctc_topo(max_token=3, modified=True)

ctc_topo.labels_sym = isym
ctc_topo.aux_labels_sym = osym

ctc_topo_modified.labels_sym = isym
ctc_topo_modified.aux_labels_sym = osym

ctc_graph = k2.compose(ctc_topo, linear_fsa)
ctc_graph_modified = k2.compose(ctc_topo_modified, linear_fsa)

linear_fsa.draw('linear_fsa.svg', title='Linear FSA of the string "abbc"')
ctc_topo.draw('ctc_topo.svg', title='CTC topology')
ctc_topo_modified.draw('ctc_topo_modified.svg', title='Modified CTC topology')

ctc_graph.draw('ctc_topo_compose_linear_fsa.svg',
               title='k2.compose(ctc_topo, linear_fsa)')

ctc_graph_modified.draw('ctc_topo_modified_compose_linear_fsa.svg',
                        title='k2.compose(ctc_topo_modified, linear_fsa)')
Fig.: Linear FSA of the string "abbc"
Fig.: CTC topology (modified=False)
Fig.: k2.compose(ctc_topo, linear_fsa)
Fig.: CTC topology (modified=True)
Fig.: k2.compose(ctc_topo_modified, linear_fsa)

ctc_loss

k2.ctc_loss(decoding_graph, dense_fsa_vec, output_beam=10, delay_penalty=0.0, reduction='sum', use_double_scores=True, target_lengths=None)[source]

Compute the CTC loss given a decoding graph and a dense fsa vector.

Parameters
  • decoding_graph (Fsa) – An FsaVec. It can be the composition result of a ctc topology and a transcript.

  • dense_fsa_vec (DenseFsaVec) – It represents the neural network output. Refer to the help information in k2.DenseFsaVec.

  • output_beam (float) – Beam to prune output, similar to lattice-beam in Kaldi. Relative to best path of output.

  • delay_penalty (float) – A constant to penalize symbol delay, which is used to make symbol emit earlier for streaming models. It is almost the same as the delay_penalty in our rnnt_loss, See https://github.com/k2-fsa/k2/issues/955 and https://arxiv.org/pdf/2211.00490.pdf for more details.

  • reduction (Literal[‘none’, ‘mean’, ‘sum’]) – Specifies the reduction to apply to the output: ‘none’ | ‘mean’ | ‘sum’. ‘none’: no reduction will be applied, ‘mean’: the output losses will be divided by the target lengths and then the mean over the batch is taken. ‘sum’: sum the output losses over batches.

  • use_double_scores (bool) – True to use double precision floating point in computing the total scores. False to use single precision.

  • target_lengths (Optional[Tensor]) – Used only when reduction is mean. It is a 1-D tensor of batch size representing lengths of the targets, e.g., number of phones or number of word pieces in a sentence.

Return type

Tensor

Returns

If reduction is none, return a 1-D tensor with size equal to batch size. If reduction is mean or sum, return a scalar.
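
For instance, a minimal sketch (batch size, frame counts and vocabulary size below are made up; see k2.DenseFsaVec for the meaning of supervision_segments):

import torch
import k2

# Two utterances with 4 and 3 valid frames; 5 classes (0 = blank).
log_prob = torch.randn(2, 4, 5).log_softmax(dim=-1)
supervision_segments = torch.tensor([[0, 0, 4], [1, 0, 3]],
                                    dtype=torch.int32)
dense_fsa_vec = k2.DenseFsaVec(log_prob, supervision_segments)

# One CTC graph per utterance, built from the transcripts.
decoding_graph = k2.ctc_graph([[1, 2], [3]], modified=False)
decoding_graph = k2.arc_sort(decoding_graph)  # arc sort to be safe

loss = k2.ctc_loss(decoding_graph, dense_fsa_vec, reduction='sum')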

ctc_topo

k2.ctc_topo(max_token, modified=False, device=None)[source]

Create a CTC topology.

A token which appears once on the right side (i.e. olabels) may appear multiple times on the left side (ilabels), possibly with epsilons in between. When 0 appears on the left side, it represents the blank symbol; when it appears on the right side, it indicates an epsilon. That is, 0 has two meanings here.

A standard CTC topology is the conventional one, where there is a mandatory blank between two repeated neighboring symbols. A non-standard, i.e., modified CTC topology, imposes no such constraint.

See https://github.com/k2-fsa/k2/issues/746#issuecomment-856421616 and https://github.com/k2-fsa/snowfall/pull/209 for more details.

Parameters
  • max_token (int) – The maximum token ID (inclusive). We assume that token IDs are contiguous (from 1 to max_token). 0 represents blank.

  • modified (bool) – If False, create a standard CTC topology. Otherwise, create a modified CTC topology.

  • device (Union[device, str, None]) – Optional. It can be either a string (e.g., ‘cpu’, ‘cuda:0’) or a torch.device. If it is None, then the returned FSA is on CPU.

Return type

Fsa

Returns

Return either a standard or a modified CTC topology as an FSA, depending on whether modified is False or True.

Example

Listing 18 Usage of k2.ctc_topo
#!/usr/bin/env python3

import k2

isym = k2.SymbolTable.from_str('''
blk 0
a 1
b 2
c 3
''')

osym = k2.SymbolTable.from_str('''
a 1
b 2
c 3
''')

fsa = k2.ctc_topo(max_token=3, modified=False)
fsa_modified = k2.ctc_topo(max_token=3, modified=True)

fsa.labels_sym = isym
fsa.aux_labels_sym = osym

fsa_modified.labels_sym = isym
fsa_modified.aux_labels_sym = osym

fsa.draw('ctc_topo.svg',
         title='CTC topology with max_token=3 (modified=False)')
fsa_modified.draw('modified_ctc_topo.svg',
                  title='CTC topology with max_token=3 (modified=True)')
Fig.: CTC topology with max_token=3 (modified=False)
Fig.: CTC topology with max_token=3 (modified=True)

determinize

k2.determinize(fsa, weight_pushing_type=<DeterminizeWeightPushingType.kNoWeightPushing: 2>)[source]

Determinize the input Fsa.

Caution

  • It works only on CPU.

  • Any weight_pushing_type value other than kNoWeightPushing causes the ‘arc_derivs’ to not accurately reflect the real derivatives, although this will not matter as long as the derivatives ultimately derive from FSA operations such as getting total scores or arc posteriors, which are insensitive to pushing.

Parameters
  • fsa (Fsa) – The input FSA. It can be either a single FSA or an FsaVec. Must be connected. It’s also expected to be epsilon-free, but this is not checked; in any case, epsilon will be treated as a normal symbol.

  • weight_pushing_type (DeterminizeWeightPushingType) –

    An enum value that determines what kind of weight pushing is desired; default kNoWeightPushing.

    kTropicalWeightPushing: use the tropical semiring (actually, max on scores) for weight pushing.

    kLogWeightPushing: use the log semiring (actually, log-sum on scores) for weight pushing.

    kNoWeightPushing: do no weight pushing; this will cause some delay in scores being emitted, and the weights created in this way will correspond exactly to those that would be produced by the arc_derivs.

    For decoding graph creation, we recommend kLogWeightPushing.

Return type

Fsa

Returns

The resulting Fsa; it is equivalent to the input fsa under the tropical semiring but will be deterministic. It will be the same as the input fsa if the input fsa has property kFsaPropertiesArcSortedAndDeterministic. Otherwise, a new deterministic fsa is returned and the input fsa is NOT modified.
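
For instance, a minimal sketch (the toy FSA below is made up for illustration):

import k2

# Non-deterministic: two arcs with label 1 leave state 0.
fsa = k2.Fsa.from_str('''
0 1 1 0.1
0 2 1 0.2
1 3 2 0.3
2 3 3 0.4
3 4 -1 0.5
4
''')
fsa = k2.connect(fsa)      # the input must be connected
det = k2.determinize(fsa)  # works on CPU only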

do_rnnt_pruning

k2.do_rnnt_pruning(am, lm, ranges)[source]

Prune the output of encoder(am) and prediction network(lm) with ranges generated by get_rnnt_prune_ranges.

Parameters
  • am (Tensor) – The encoder output, with shape (B, T, encoder_dim)

  • lm (Tensor) – The prediction network output, with shape (B, S + 1, decoder_dim)

  • ranges (Tensor) – A tensor containing the symbol indexes for each frame that we want to keep. Its shape is (B, T, s_range), see the docs in get_rnnt_prune_ranges for more details of this tensor.

Return type

Tuple[Tensor, Tensor]

Returns

Return the pruned am and lm, with shapes (B, T, s_range, encoder_dim) and (B, T, s_range, decoder_dim) respectively.
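
For instance, a minimal sketch (shapes are made up; ranges would normally come from get_rnnt_prune_ranges(), but here it is built by hand for illustration):

import torch
import k2

B, T, S, s_range = 2, 10, 6, 3
am = torch.randn(B, T, 8)      # (B, T, encoder_dim)
lm = torch.randn(B, S + 1, 8)  # (B, S + 1, decoder_dim)

# A monotonically increasing start position for each frame.
starts = torch.arange(T) * (S - s_range + 1) // T
ranges = starts.view(1, T, 1).expand(B, T, 1) + torch.arange(s_range)

pruned_am, pruned_lm = k2.do_rnnt_pruning(am, lm, ranges)
# pruned_am: (B, T, s_range, 8), pruned_lm: (B, T, s_range, 8)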

expand_ragged_attributes

k2.expand_ragged_attributes(fsas, ret_arc_map=False, ragged_attribute_names=None)[source]

Turn ragged labels attached to this FSA into linear (Tensor) labels, expanding arcs into sequences of arcs as necessary to achieve this. Supports autograd. If fsas had no ragged attributes, returns fsas itself.

Caution

This function will ensure that for final-arcs in the returned fsa, the corresponding labels for all ragged attributes are -1; it will add an extra arc at the end if necessary to ensure this, if the original ragged attributes did not have -1 as their final element on final-arcs. (Note: our intention is that -1’s on final arcs, like filler symbols, are removed when making attributes ragged; this is what fsa_from_unary_function_ragged() does if remove_filler == True, the default.)

Parameters
  • fsas (Fsa) – The source Fsa

  • ret_arc_map (bool) – If true, will return a pair (new_fsas, arc_map) with arc_map a tensor of int32 that maps from arcs in the result to arcs in fsas, with -1’s for newly created arcs. If false, just returns new_fsas.

  • ragged_attribute_names (Optional[List[str]]) – If specified, just this list of ragged attributes will be expanded to linear tensor attributes, and the rest will stay ragged.

Return type

Union[Fsa, Tuple[Fsa, Tensor]]

get_aux_labels

k2.get_aux_labels(best_paths)[source]

Extract aux_labels from the best-path FSAs and remove 0s and -1s.

Parameters

best_paths (Fsa) – An Fsa with best_paths.arcs.num_axes() == 3, i.e. containing multiple FSAs, which is expected to be the result of shortest_path (otherwise the returned values won’t be meaningful).

Return type

List[List[int]]

Returns

Returns a list of lists of int, containing the label sequences we decoded.
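
For instance, a minimal sketch (a tiny transducer FsaVec standing in for a real decoding lattice):

import k2

fsa = k2.Fsa.from_str('''
0 1 1 5 0.1
0 1 2 6 0.2
1 2 -1 -1 0.3
2
''', acceptor=False)
lattice = k2.create_fsa_vec([fsa])

best_paths = k2.shortest_path(lattice, use_double_scores=True)
hyps = k2.get_aux_labels(best_paths)  # [[6]]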

get_best_matching_stats

k2.get_best_matching_stats(tokens, scores, counts, eos, min_token, max_token, max_order)[source]

For “query” sentences, this function gets the mean and variance of scores from the best matching words-in-context in a set of provided “key” sentences. This matching process matches the word and the words preceding it, looking for the highest-order match it can find (it’s intended for approximating the scores of models that see only left-context, like language models). The intended application is in estimating the scores of hypothesized transcripts, when we have actually computed the scores for only a subset of the hypotheses.

Caution

This function only runs on CPU for now.

Parameters
  • tokens (RaggedTensor) –

    A ragged tensor of int32_t with 2 or 3 axes. If 2 axes, this represents a collection of key and query sequences. If 3 axes, this represents a set of such collections.

    2-axis example:

    [ [ the, cat, said, eos ], [ the, cat, fed, eos ] ]
    

    3-axis example:

    [ [ [ the, cat, said, eos ], [ the, cat, fed, eos ] ],
      [ [ hi, my, name, is, eos ], [ bye, my, name, is, eos ] ], ... ]
    

    where the words would actually be represented as integers. The eos symbol is required if this code is to work as intended (otherwise it will not be able to recognize when we have reached the beginning of a sentence when comparing histories). bos symbols are allowed but not required.

  • scores (Tensor) – A 1-D torch.Tensor with scores.size() == tokens.NumElements(); this is the item for which we are requesting best-matching values (as means and variances in case there are multiple best matches). In our anticipated use, these would represent scores of words in the sentences, but they could represent anything.

  • counts (Tensor) – A 1-D torch.Tensor with counts.size() == tokens.NumElements(), containing 1 for words that are considered “keys” and 0 for words that are considered “queries”. Typically some entire sentences will be keys and others will be queries.

  • eos (int) – The value of the eos (end of sentence) symbol; internally, this is used as an extra padding value before the first sentence in each collection, so that it can act like a “bos” symbol.

  • min_token (int) – The lowest possible token value, including the bos symbol (e.g., might be -1).

  • max_token (int) – The maximum possible token value. Be careful not to set this too large: the implementation contains a part which takes time and space O(max_token - min_token).

  • max_order (int) – The maximum n-gram order to ever return in the ngram_order output; the output will be the minimum of max_order and the actual order matched; or max_order if we matched all the way to the beginning of both sentences. The main reason this is needed is that we need a finite number to return at the beginning of sentences.

Return type

Tuple[Tensor, Tensor, Tensor, Tensor]

Returns

Returns a tuple of four torch.tensor (mean, var, counts_out, ngram_order)
mean:

For query positions, will contain the mean of the scores at the best matching key positions, or zero if that is undefined because there are no key positions at all. For key positions, you can treat the output as being undefined (actually they are treated the same as queries, but won’t match with only themselves because we don’t match at singleton intervals).

var:

Like mean, but contains the (centered) variance of the best matching positions.

counts_out:

The number of key positions that contributed to the mean and var statistics. This should only be zero if counts was all zero.

ngram_order:

The n-gram order corresponding to the best matching positions found at each query position, up to a maximum of max_order; will be max_order if we matched all the way to the beginning of a sentence.

get_lattice

k2.get_lattice(log_prob, log_prob_len, decoding_graph, search_beam=20, output_beam=8, min_active_states=30, max_active_states=10000, subsampling_factor=1)[source]

Get the decoding lattice from a decoding graph and log_softmax output.

Parameters
  • log_prob (Tensor) – Output from a log_softmax layer, of shape (N, T, C).

  • log_prob_len (Tensor) – A tensor of shape (N,) containing the number of valid frames in log_prob before padding.
  • decoding_graph (Fsa) – An Fsa, the decoding graph. It can be either an HLG or an H. You can use ctc_topo() to build an H.

  • search_beam (float) – Decoding beam, e.g. 20. Smaller is faster, larger is more exact (less pruning). This is the default value; it may be modified by min_active_states and max_active_states.

  • output_beam (float) – Beam to prune output, similar to lattice-beam in Kaldi. Relative to best path of output.

  • min_active_states (int) – Minimum number of FSA states that are allowed to be active on any given frame for any given intersection/composition task. This is advisory, in that it will try not to have fewer than this number active. Set it to zero if there is no constraint.

  • max_active_states (int) – Maximum number of FSA states that are allowed to be active on any given frame for any given intersection/composition task. This is advisory, in that it will try not to exceed that but may not always succeed. You can use a very large number if no constraint is needed.

  • subsampling_factor (int) – The subsampling factor of the model.

Return type

Fsa

Returns

An FsaVec containing the decoding result. It has axes [utt][state][arc].
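
For instance, a minimal sketch (random scores standing in for real network output):

import torch
import k2

N, T, C = 1, 20, 4  # batch, frames, classes (0 = blank)
log_prob = torch.randn(N, T, C).log_softmax(dim=-1)
log_prob_len = torch.tensor([T], dtype=torch.int32)

H = k2.arc_sort(k2.ctc_topo(max_token=C - 1))  # an H graph as the decoding graph
lattice = k2.get_lattice(log_prob, log_prob_len, H)
best_paths = k2.shortest_path(lattice, use_double_scores=True)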

get_rnnt_logprobs

k2.get_rnnt_logprobs(lm, am, symbols, termination_symbol, rnnt_type='regular', boundary=None)[source]

Reduces the RNN-T problem (the simple case, where the joiner network is just addition) to a compact, standard form that can then be given (with boundaries) to mutual_information_recursion(). This function is called from rnnt_loss_simple(), but may be useful for other purposes.

Parameters
  • lm (Tensor) –

    Language model part of un-normalized logprobs of symbols, to be added to acoustic model part before normalizing. Of shape:

    [B][S+1][C]
    

    where B is the batch size, S is the maximum sequence length of the symbol sequence, possibly including the EOS symbol; and C is size of the symbol vocabulary, including the termination/next-frame symbol. Conceptually, lm[b][s] is a vector of length [C] representing the “language model” part of the un-normalized logprobs of symbols, given all symbols earlier than s in the sequence. The reason we still need this for position S is that we may still be emitting the termination/next-frame symbol at this point.

  • am (Tensor) –

    Acoustic-model part of un-normalized logprobs of symbols, to be added to language-model part before normalizing. Of shape:

    [B][T][C]
    

    where B is the batch size, T is the maximum sequence length of the acoustic sequences (in frames); and C is size of the symbol vocabulary, including the termination/next-frame symbol. It reflects the “acoustic” part of the probability of any given symbol appearing next on this frame.

  • symbols (Tensor) – A LongTensor of shape [B][S], containing the symbols at each position of the sequence.

  • termination_symbol (int) – The identity of the termination symbol, must be in {0..C-1}

  • boundary (Optional[Tensor]) – An optional LongTensor of shape [B, 4] with elements interpreted as [begin_symbol, begin_frame, end_symbol, end_frame] that is treated as [0, 0, S, T] if boundary is not supplied. Most likely you will want begin_symbol and begin_frame to be zero.

  • rnnt_type (str) –

    Specifies the type of rnnt paths: regular, modified or constrained.

    regular: The regular rnnt that takes you to the next frame only if emitting a blank (i.e., emitting a symbol does not take you to the next frame).

    modified: A modified version of rnnt that takes you to the next frame whether emitting a blank or a non-blank symbol.

    constrained: A version like the modified one that will go to the next frame when you emit a non-blank symbol, but this is done by “forcing” you to take the blank transition from the next context on the current frame, e.g. if we emit c given “a b” context, we are forced to emit “blank” given “b c” context on the current frame.

Return type

Tuple[Tensor, Tensor]

Returns

(px, py) (the names are quite arbitrary).

px: logprobs, of shape [B][S][T+1] if rnnt_type == "regular",
               [B][S][T] if rnnt_type != "regular".

py: logprobs, of shape [B][S+1][T]

in the recursion:

p[b,0,0] = 0.0
if rnnt_type == "regular":
   p[b,s,t] = log_add(p[b,s-1,t] + px[b,s-1,t],
                      p[b,s,t-1] + py[b,s,t-1])
if rnnt_type != "regular":
   p[b,s,t] = log_add(p[b,s-1,t-1] + px[b,s-1,t-1],
                      p[b,s,t-1] + py[b,s,t-1])

where p[b][s][t] is the "joint score" of the pair of subsequences of length s and t respectively. px[b][s][t] represents the probability of extending the subsequences of length (s,t) by one in the s direction, given the particular symbol, and py[b][s][t] represents the probability of extending the subsequences of length (s,t) by one in the t direction, i.e. of emitting the termination/next-frame symbol.

If rnnt_type == "regular", px[:,:,T] equals -infinity, meaning on the "one-past-the-last" frame we cannot emit any symbols. This is simply a way of incorporating the probability of the termination symbol on the last frame.
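
For instance, a minimal sketch of feeding the result to mutual_information_recursion() (random scores and made-up shapes; this is roughly what rnnt_loss_simple() does internally):

import torch
import k2

B, S, T, C = 2, 5, 10, 20
lm = torch.randn(B, S + 1, C)
am = torch.randn(B, T, C)
symbols = torch.randint(1, C, (B, S))
boundary = torch.tensor([[0, 0, S, T]] * B)

px, py = k2.get_rnnt_logprobs(lm, am, symbols, termination_symbol=0,
                              boundary=boundary)
loss = -k2.mutual_information_recursion(px, py, boundary)  # shape (B,)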

get_rnnt_logprobs_joint

k2.get_rnnt_logprobs_joint(logits, symbols, termination_symbol, rnnt_type='regular', boundary=None)[source]

Reduces the RNN-T problem to a compact, standard form that can then be given (with boundaries) to mutual_information_recursion(). This function is called from rnnt_loss().

Parameters
  • logits (Tensor) – The output of joiner network, with shape (B, T, S + 1, C), i.e. batch, time_seq_len, symbol_seq_len+1, num_classes

  • symbols (Tensor) – A LongTensor of shape [B][S], containing the symbols at each position of the sequence.

  • termination_symbol (int) – The identity of the termination symbol, must be in {0..C-1}

  • boundary (Optional[Tensor]) – An optional LongTensor of shape [B, 4] with elements interpreted as [begin_symbol, begin_frame, end_symbol, end_frame] that is treated as [0, 0, S, T] if boundary is not supplied. Most likely you will want begin_symbol and begin_frame to be zero.

  • rnnt_type (str) –

    Specifies the type of rnnt paths: regular, modified or constrained.

    regular: The regular rnnt that takes you to the next frame only if emitting a blank (i.e., emitting a symbol does not take you to the next frame).

    modified: A modified version of rnnt that takes you to the next frame whether emitting a blank or a non-blank symbol.

    constrained: A version like the modified one that will go to the next frame when you emit a non-blank symbol, but this is done by “forcing” you to take the blank transition from the next context on the current frame, e.g. if we emit c given “a b” context, we are forced to emit “blank” given “b c” context on the current frame.

Return type

Tuple[Tensor, Tensor]

Returns

(px, py) (the names are quite arbitrary):

px: logprobs, of shape [B][S][T+1] if rnnt_type is regular,
                       [B][S][T] if rnnt_type is not regular.
py: logprobs, of shape [B][S+1][T]

in the recursion:

p[b,0,0] = 0.0
if rnnt_type == "regular":
   p[b,s,t] = log_add(p[b,s-1,t] + px[b,s-1,t],
                      p[b,s,t-1] + py[b,s,t-1])
if rnnt_type != "regular":
   p[b,s,t] = log_add(p[b,s-1,t-1] + px[b,s-1,t-1],
                      p[b,s,t-1] + py[b,s,t-1])

where p[b][s][t] is the “joint score” of the pair of subsequences of length s and t respectively. px[b][s][t] represents the probability of extending the subsequences of length (s,t) by one in the s direction, given the particular symbol, and py[b][s][t] represents the probability of extending the subsequences of length (s,t) by one in the t direction, i.e. of emitting the termination/next-frame symbol.

if rnnt_type == “regular”, px[:,:,T] equals -infinity, meaning on the “one-past-the-last” frame we cannot emit any symbols. This is simply a way of incorporating the probability of the termination symbol on the last frame.

get_rnnt_logprobs_pruned

k2.get_rnnt_logprobs_pruned(logits, symbols, ranges, termination_symbol, boundary, rnnt_type='regular')[source]

Construct px, py for mutual_information_recursion with pruned output.

Parameters
  • logits (Tensor) – The pruned output of joiner network, with shape (B, T, s_range, C)

  • symbols (Tensor) – The symbol sequences, a LongTensor of shape [B][S], and elements in {0..C-1}.

  • ranges (Tensor) – A tensor containing the symbol ids for each frame that we want to keep. It is a LongTensor of shape [B][T][s_range], where ranges[b,t,0] contains the begin symbol 0 <= s <= S - s_range + 1, such that logits[b,t,:,:] represents the logits with positions s, s + 1, ... s + s_range - 1. See docs in get_rnnt_prune_ranges() for more details of what ranges contains.

  • termination_symbol (int) – the termination symbol, with 0 <= termination_symbol < C

  • boundary (Tensor) – An optional LongTensor of shape [B, 4] with elements interpreted as [begin_symbol, begin_frame, end_symbol, end_frame] that is treated as [0, 0, S, T] if boundary is not supplied. Most likely you will want begin_symbol and begin_frame to be zero.

  • rnnt_type (str) –

    Specifies the type of rnnt paths: regular, modified or constrained.

    regular: The regular rnnt that takes you to the next frame only if emitting a blank (i.e., emitting a symbol does not take you to the next frame).

    modified: A modified version of rnnt that takes you to the next frame whether emitting a blank or a non-blank symbol.

    constrained: A version like the modified one that will go to the next frame when you emit a non-blank symbol, but this is done by “forcing” you to take the blank transition from the next context on the current frame, e.g. if we emit c given “a b” context, we are forced to emit “blank” given “b c” context on the current frame.

Return type

Tuple[Tensor, Tensor]

Returns

(px, py) (the names are quite arbitrary):

px: logprobs, of shape [B][S][T+1] if rnnt_type is regular,
                       [B][S][T] if rnnt_type is not regular.
py: logprobs, of shape [B][S+1][T]

in the recursion:

p[b,0,0] = 0.0
if rnnt_type == "regular":
   p[b,s,t] = log_add(p[b,s-1,t] + px[b,s-1,t],
                      p[b,s,t-1] + py[b,s,t-1])
if rnnt_type != "regular":
   p[b,s,t] = log_add(p[b,s-1,t-1] + px[b,s-1,t-1],
                      p[b,s,t-1] + py[b,s,t-1])

where p[b][s][t] is the “joint score” of the pair of subsequences of length s and t respectively. px[b][s][t] represents the probability of extending the subsequences of length (s,t) by one in the s direction, given the particular symbol, and py[b][s][t] represents the probability of extending the subsequences of length (s,t) by one in the t direction, i.e. of emitting the termination/next-frame symbol.

if rnnt_type == “regular”, px[:,:,T] equals -infinity, meaning on the “one-past-the-last” frame we cannot emit any symbols. This is simply a way of incorporating the probability of the termination symbol on the last frame.

get_rnnt_logprobs_smoothed

k2.get_rnnt_logprobs_smoothed(lm, am, symbols, termination_symbol, lm_only_scale=0.1, am_only_scale=0.1, boundary=None, rnnt_type='regular')[source]

Reduces the RNN-T problem (the simple case, where the joiner network is just addition) to a compact, standard form that can then be given (with boundaries) to mutual_information_recursion(). This version allows you to make the loss function one of the form:

lm_only_scale * lm_probs +
am_only_scale * am_probs +
(1-lm_only_scale-am_only_scale) * combined_probs

where lm_probs and am_probs are the probabilities given the lm and acoustic model independently.

This function is called from rnnt_loss_smoothed(), but may be useful for other purposes.

Parameters
  • lm (Tensor) –

    Language model part of un-normalized logprobs of symbols, to be added to acoustic model part before normalizing. Of shape:

    [B][S+1][C]
    

    where B is the batch size, S is the maximum sequence length of the symbol sequence, possibly including the EOS symbol; and C is size of the symbol vocabulary, including the termination/next-frame symbol. Conceptually, lm[b][s] is a vector of length [C] representing the “language model” part of the un-normalized logprobs of symbols, given all symbols earlier than s in the sequence. The reason we still need this for position S is that we may still be emitting the termination/next-frame symbol at this point.

  • am (Tensor) –

    Acoustic-model part of un-normalized logprobs of symbols, to be added to language-model part before normalizing. Of shape:

    [B][T][C]
    

    where B is the batch size, T is the maximum sequence length of the acoustic sequences (in frames); and C is size of the symbol vocabulary, including the termination/next-frame symbol. It reflects the “acoustic” part of the probability of any given symbol appearing next on this frame.

  • symbols (Tensor) – A LongTensor of shape [B][S], containing the symbols at each position of the sequence.

  • termination_symbol (int) – The identity of the termination symbol, must be in {0..C-1}

  • lm_only_scale (float) – the scale on the “LM-only” part of the loss.

  • am_only_scale (float) – the scale on the “AM-only” part of the loss, for which we use an “averaged” LM (averaged over all histories, so effectively unigram).

  • boundary (Optional[Tensor]) – An optional LongTensor of shape [B, 4] with elements interpreted as [begin_symbol, begin_frame, end_symbol, end_frame] that is treated as [0, 0, S, T] if boundary is not supplied. Most likely you will want begin_symbol and begin_frame to be zero.

  • rnnt_type (str) –

    Specifies the type of rnnt paths: regular, modified or constrained.

    regular: The regular rnnt that takes you to the next frame only if emitting a blank (i.e., emitting a symbol does not take you to the next frame).

    modified: A modified version of rnnt that takes you to the next frame whether emitting a blank or a non-blank symbol.

    constrained: A version like the modified one that will go to the next frame when you emit a non-blank symbol, but this is done by “forcing” you to take the blank transition from the next context on the current frame, e.g. if we emit c given “a b” context, we are forced to emit “blank” given “b c” context on the current frame.

Return type

Tuple[Tensor, Tensor]

Returns

(px, py) (the names are quite arbitrary).
px: logprobs, of shape [B][S][T+1] if rnnt_type == "regular",
               [B][S][T] if rnnt_type != "regular".

py: logprobs, of shape [B][S+1][T]

in the recursion:

p[b,0,0] = 0.0
if rnnt_type == "regular":
   p[b,s,t] = log_add(p[b,s-1,t] + px[b,s-1,t],
                      p[b,s,t-1] + py[b,s,t-1])
if rnnt_type != "regular":
   p[b,s,t] = log_add(p[b,s-1,t-1] + px[b,s-1,t-1],
                      p[b,s,t-1] + py[b,s,t-1])

where p[b][s][t] is the "joint score" of the pair of subsequences of length s and t respectively. px[b][s][t] represents the probability of extending the subsequences of length (s,t) by one in the s direction, given the particular symbol, and py[b][s][t] represents the probability of extending the subsequences of length (s,t) by one in the t direction, i.e. of emitting the termination/next-frame symbol.

px[:,:,T] equals -infinity, meaning on the "one-past-the-last" frame we cannot emit any symbols. This is simply a way of incorporating the probability of the termination symbol on the last frame.

get_rnnt_prune_ranges

k2.get_rnnt_prune_ranges(px_grad, py_grad, boundary, s_range)[source]

Get the pruning ranges of normal rnnt loss according to the grads of px and py returned by mutual_information_recursion.

For each sequence with T frames, we will generate a tensor with shape (T, s_range) indicating which symbols will be taken into consideration for each frame. For example, for a sequence with 10 frames whose corresponding symbols are [A B C D E F], if s_range equals 3, one possible ranges tensor is:

[[0, 1, 2], [0, 1, 2], [0, 1, 2], [0, 1, 2], [1, 2, 3],
 [1, 2, 3], [1, 2, 3], [3, 4, 5], [3, 4, 5], [3, 4, 5]]

which means we only consider [A B C] at frames 0, 1, 2 and 3, [B C D] at frames 4, 5 and 6, and [D E F] at frames 7, 8 and 9.

We can consider only a limited number of symbols because frames and symbols are monotonically aligned; in theory, only a particular range of symbols can be generated at a particular frame.

Note

For the generated tensor ranges (assuming batch size is 1), ranges[:, 0] is a monotonically increasing tensor from 0 to len(symbols) - s_range and it satisfies ranges[t+1, 0] - ranges[t, 0] < s_range, which means we won’t skip any symbols.

Parameters
  • px_grad (Tensor) – The gradient of px, see docs in mutual_information_recursion for more details of px.

  • py_grad (Tensor) – The gradient of py, see docs in mutual_information_recursion for more details of py.

  • boundary (Tensor) – a LongTensor of shape [B, 4] with elements interpreted as [begin_symbol, begin_frame, end_symbol, end_frame]

  • s_range (int) – How many symbols to keep for each frame.

Return type

Tensor

Returns

A tensor with the shape of (B, T, s_range) containing the indexes of the kept symbols for each frame.
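
For instance, a minimal sketch (random scores; rnnt_loss_simple() with its return_grad option is assumed here as the source of the px/py gradients):

import torch
import k2

B, S, T, C = 2, 5, 10, 20
lm = torch.randn(B, S + 1, C)
am = torch.randn(B, T, C)
symbols = torch.randint(1, C, (B, S))
boundary = torch.tensor([[0, 0, S, T]] * B)

loss, (px_grad, py_grad) = k2.rnnt_loss_simple(
    lm, am, symbols, termination_symbol=0, boundary=boundary,
    return_grad=True)
ranges = k2.get_rnnt_prune_ranges(px_grad, py_grad, boundary, s_range=3)
# ranges: (B, T, 3), the symbol positions kept on each frame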

get_rnnt_prune_ranges_deprecated

k2.get_rnnt_prune_ranges_deprecated(px_grad, py_grad, boundary, s_range)[source]

Get the pruning ranges of normal rnnt loss according to the grads of px and py returned by mutual_information_recursion.

For each sequence with T frames, we will generate a tensor with shape (T, s_range) indicating which symbols will be taken into consideration for each frame. For example, for a sequence with 10 frames whose corresponding symbols are [A B C D E F], if s_range equals 3, one possible ranges tensor is:

[[0, 1, 2], [0, 1, 2], [0, 1, 2], [0, 1, 2], [1, 2, 3],
 [1, 2, 3], [1, 2, 3], [3, 4, 5], [3, 4, 5], [3, 4, 5]]

which means we only consider [A B C] at frames 0, 1, 2 and 3, [B C D] at frames 4, 5 and 6, and [D E F] at frames 7, 8 and 9.

We can consider only a limited number of symbols because frames and symbols are monotonically aligned; in theory, only a particular range of symbols can be generated at a particular frame.

Note

For the generated tensor ranges (assuming batch size is 1), ranges[:, 0] is a monotonically increasing tensor from 0 to len(symbols) - s_range and it satisfies ranges[t+1, 0] - ranges[t, 0] < s_range, which means we won’t skip any symbols.

Parameters
  • px_grad (Tensor) – The gradient of px, see docs in mutual_information_recursion for more details of px.

  • py_grad (Tensor) – The gradient of py, see docs in mutual_information_recursion for more details of py.

  • boundary (Tensor) – a LongTensor of shape [B, 4] with elements interpreted as [begin_symbol, begin_frame, end_symbol, end_frame]

  • s_range (int) – How many symbols to keep for each frame.

Return type

Tensor

Returns

A tensor with the shape of (B, T, s_range) containing the indexes of the kept symbols for each frame.

index_add

k2.index_add(index, value, in_out)[source]

It implements in_out[index[i]] += value[i].

Caution

It has similar semantics with torch.Tensor.index_add_ except that:

  • index.dtype == torch.int32

  • -1 <= index[i] < in_out.shape[0]

  • index[i] == -1 is ignored.

  • index has to be a 1-D contiguous tensor.

Caution

in_out is modified in-place.

Caution

This function does NOT support autograd.

Parameters
  • index (Tensor) – A 1-D contiguous tensor with dtype torch.int32. Must satisfy -1 <= index[i] < in_out.shape[0]

  • value (Tensor) – A 1-D or 2-D tensor with dtype torch.int32, torch.float32, or torch.float64. Must satisfy index.shape[0] == value.shape[0]

  • in_out (Tensor) – A 1-D or 2-D tensor with the same dtype as value. It satisfies in_out.shape[1] == value.shape[1] if it is a 2-D tensor.

Return type

None

Returns

Return None.
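
For instance, a minimal sketch:

import torch
import k2

index = torch.tensor([0, 2, -1, 0], dtype=torch.int32)
value = torch.tensor([1.0, 2.0, 3.0, 4.0])
in_out = torch.zeros(3)
k2.index_add(index, value, in_out)
# in_out is now [5.0, 0.0, 2.0]; the entry with index -1 was ignored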

index_fsa

k2.index_fsa(src, indexes)[source]

Select a list of FSAs from src with a 1-D tensor.

Parameters
  • src (Fsa) – An FsaVec.

  • indexes (Tensor) – A 1-D torch.Tensor of dtype torch.int32 containing the ids of FSAs to select.

Return type

Fsa

Returns

Return an FsaVec containing only those FSAs specified by indexes.

index_select

k2.index_select(src, index, default_value=0)[source]

Returns a new tensor which indexes the input tensor along dimension 0 using the entries in index.

If the entry in index is -1, then the corresponding entry in the returned tensor is 0.

Caution

index.dtype == torch.int32 and index.ndim == 1.

Parameters
  • src (Tensor) – The input tensor. Either 1-D or 2-D with dtype torch.int32, torch.int64, torch.float32, or torch.float64.

  • index (Tensor) – 1-D tensor of dtype torch.int32 containing the indexes. If an entry is -1, the corresponding entry in the returned value is 0. The elements of index should be in the range [-1..src.shape[0]-1].

  • default_value (float) – Used only when src is a 1-D tensor. It sets ans[i] to default_value if index[i] is -1.

Return type

Tensor

Returns

A tensor with shape (index.numel(), *src.shape[1:]) and dtype the same as src, e.g. if src.ndim == 1, ans.shape would be (index.shape[0],); if src.ndim == 2, ans.shape would be (index.shape[0], src.shape[1]). Will satisfy ans[i] == src[index[i]] if src.ndim == 1, or ans[i, j] == src[index[i], j] if src.ndim == 2, except for entries where index[i] == -1 which will be zero.
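
For instance, a minimal sketch:

import torch
import k2

src = torch.tensor([10.0, 20.0, 30.0])
index = torch.tensor([2, -1, 0, 1], dtype=torch.int32)
ans = k2.index_select(src, index)
# ans is [30.0, 0.0, 10.0, 20.0]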

intersect

k2.intersect(a_fsa, b_fsa, treat_epsilons_specially=True, ret_arc_maps=False)[source]

Compute the intersection of two FSAs.

When treat_epsilons_specially is True, this function works only on CPU. When treat_epsilons_specially is False and both a_fsa and b_fsa are on GPU, then this function works on GPU; in this case, the two input FSAs do not need to be arc sorted.

Parameters
  • a_fsa (Fsa) – The first input FSA. It can be either a single FSA or an FsaVec.

  • b_fsa (Fsa) – The second input FSA. it can be either a single FSA or an FsaVec. If both a_fsa and b_fsa are FsaVec, they must contain the same number of FSAs.

  • treat_epsilons_specially (bool) – If True, epsilons will be treated as epsilons, meaning epsilon arcs can match with an implicit epsilon self-loop. If False, epsilons will be treated as real, normal symbols (to have them treated as epsilons in this case you may have to add epsilon self-loops to whichever of the inputs is naturally epsilon-free).

  • ret_arc_maps (bool) –

    If False, return the resulting Fsa. If True, return a tuple containing three entries:

    • the resulting Fsa

    • a_arc_map, a 1-D torch.Tensor with dtype torch.int32. a_arc_map[i] is the arc index in a_fsa that corresponds to the i-th arc in the resulting Fsa. a_arc_map[i] is -1 if the i-th arc in the resulting Fsa has no corresponding arc in a_fsa.

    • b_arc_map, a 1-D torch.Tensor with dtype torch.int32. b_arc_map[i] is the arc index in b_fsa that corresponds to the i-th arc in the resulting Fsa. b_arc_map[i] is -1 if the i-th arc in the resulting Fsa has no corresponding arc in b_fsa.

Caution

The two input FSAs MUST be arc sorted if treat_epsilons_specially is True.

Caution

The rules for assigning the attributes of the output Fsa are as follows:

  • (1) For attributes where only one source (a_fsa or b_fsa) has that attribute: Copy via arc_map, or use zero if arc_map has -1. This rule works for both floating point and integer attributes.

  • (2) For attributes where both sources (a_fsa and b_fsa) have that attribute: For floating point attributes: sum via arc_maps, or use zero if arc_map has -1. For integer attributes, it’s not supported for now (the attributes will be discarded and will not be kept in the output FSA).

Return type

Union[Fsa, Tuple[Fsa, Tensor, Tensor]]

Returns

If ret_arc_maps is False, return the result of intersecting a_fsa and b_fsa. len(out_fsa.shape) is 2 if and only if the two input FSAs are single FSAs; otherwise, len(out_fsa.shape) is 3. If ret_arc_maps is True, it returns additionally two arc_maps: a_arc_map and b_arc_map.
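
For instance, a minimal sketch (two toy acceptors made up for illustration):

import k2

a = k2.Fsa.from_str('''
0 1 1 0.1
0 1 2 0.2
1 2 -1 0.3
2
''')
b = k2.Fsa.from_str('''
0 1 1 1.0
1 2 -1 0.5
2
''')
a = k2.arc_sort(a)  # both inputs must be arc sorted on CPU
b = k2.arc_sort(b)
ans = k2.intersect(a, b)  # accepts only the string "1"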

intersect_dense

k2.intersect_dense(a_fsas, b_fsas, output_beam, max_states=15000000, max_arcs=1073741824, a_to_b_map=None, seqframe_idx_name=None, frame_idx_name=None)[source]

Intersect array of FSAs on CPU/GPU.

Caution

a_fsas MUST be arc sorted.

Parameters
  • a_fsas (Fsa) – Input FsaVec, i.e., decoding graphs, one per sequence. It might just be a linear sequence of phones, or might be something more complicated. Must have a_fsas.shape[0] == b_fsas.dim0() if a_to_b_map is None. Otherwise, must have a_fsas.shape[0] == a_to_b_map.shape[0]

  • b_fsas (DenseFsaVec) – Input FSAs that correspond to neural network output.

  • output_beam (float) – Beam to prune output, similar to lattice-beam in Kaldi. Relative to best path of output.

  • max_states (int) – The max number of states to prune the output, mainly to avoid out-of-memory and numerical overflow, default 15,000,000.

  • max_arcs (int) – The max number of arcs to prune the output, mainly to avoid out-of-memory and numerical overflow, default 1073741824(2^30).

  • a_to_b_map (Optional[Tensor]) – Maps from FSA-index in a to FSA-index in b to use for it. If None, then we expect the number of FSAs in a_fsas to equal b_fsas.dim0(). If set, then it should be a Tensor with ndim=1 and dtype=torch.int32, with a_to_b_map.shape[0] equal to the number of FSAs in a_fsas (i.e. a_fsas.shape[0] if len(a_fsas.shape) == 3, else 1); and elements 0 <= i < b_fsas.dim0().

  • seqframe_idx_name (Optional[str]) – If set (e.g. to ‘seqframe’), an attribute in the output will be created that encodes the sequence-index and the frame-index within that sequence; this is equivalent to a row-index into b_fsas.values, or, equivalently, an element in b_fsas.shape.

  • frame_idx_name (Optional[str]) – If set (e.g. to ‘frame’), an attribute in the output will be created that contains the frame-index within the corresponding sequence.

Return type

Fsa

Returns

The result of the intersection, pruned to output_beam. This pruning is exact: it uses forward and backward scores.

intersect_dense_pruned

k2.intersect_dense_pruned(a_fsas, b_fsas, search_beam, output_beam, min_active_states, max_active_states, seqframe_idx_name=None, frame_idx_name=None, allow_partial=False)[source]

Intersect array of FSAs on CPU/GPU.

Caution

a_fsas MUST be arc sorted.

Parameters
  • a_fsas (Fsa) – Input FsaVec, i.e., decoding graphs, one per sequence. It might just be a linear sequence of phones, or might be something more complicated. Must have either a_fsas.shape[0] == b_fsas.dim0(), or a_fsas.shape[0] == 1 in which case the graph is shared.

  • b_fsas (DenseFsaVec) – Input FSAs that correspond to neural network output.

  • search_beam (float) – Decoding beam, e.g. 20. Smaller is faster, larger is more exact (less pruning). This is the default value; it may be modified by min_active_states and max_active_states.

  • output_beam (float) – Beam to prune output, similar to lattice-beam in Kaldi. Relative to best path of output.

  • min_active_states (int) – Minimum number of FSA states that are allowed to be active on any given frame for any given intersection/composition task. This is advisory, in that it will try not to have fewer than this number active. Set it to zero if there is no constraint.

  • max_active_states (int) – Maximum number of FSA states that are allowed to be active on any given frame for any given intersection/composition task. This is advisory, in that it will try not to exceed that but may not always succeed. You can use a very large number if no constraint is needed.

  • allow_partial (bool) – If True and there is no final state active on the last frame, we will treat all the states on the last frame as final states. If False, we only care about the real final state in the decoding graph on the last frame when generating the lattice.

Parameters
  • seqframe_idx_name (Optional[str]) – If set (e.g. to ‘seqframe’), an attribute in the output will be created that encodes the sequence-index and the frame-index within that sequence; this is equivalent to a row-index into b_fsas.values, or, equivalently, an element in b_fsas.shape.

  • frame_idx_name (Optional[str]) – If set (e.g. to ‘frame’), an attribute in the output will be created that contains the frame-index within the corresponding sequence.

Return type

Fsa

Returns

The result of the intersection.
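
For instance, a minimal sketch (random scores standing in for real network output; a shared CTC topology is used as the decoding graph):

import torch
import k2

N, T, C = 1, 20, 4
log_prob = torch.randn(N, T, C).log_softmax(dim=-1)
supervision_segments = torch.tensor([[0, 0, T]], dtype=torch.int32)
dense_fsa_vec = k2.DenseFsaVec(log_prob, supervision_segments)

graph = k2.arc_sort(k2.ctc_topo(max_token=C - 1))  # a_fsas MUST be arc sorted
a_fsas = k2.create_fsa_vec([graph])                # shared graph: shape[0] == 1

lattice = k2.intersect_dense_pruned(
    a_fsas, dense_fsa_vec, search_beam=20, output_beam=8,
    min_active_states=30, max_active_states=10000)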

intersect_device

k2.intersect_device(a_fsas, b_fsas, b_to_a_map, sorted_match_a=False, ret_arc_maps=False)[source]

Compute the intersection of two FsaVecs treating epsilons as real, normal symbols.

This function supports both CPU and GPU. But it is very slow on CPU. That’s why this function name ends with _device. It is intended for GPU. See k2.intersect() which is a more general interface (it will call the same underlying code, IntersectDevice(), if the inputs are on GPU and a_fsas is arc-sorted).

Caution

Epsilons are treated as real, normal symbols.

Hint

The two inputs do not need to be arc-sorted.

Refer to k2.intersect() for how we assign the attributes of the output FsaVec.

Parameters
  • a_fsas (Fsa) – An FsaVec (must have 3 axes, i.e., len(a_fsas.shape) == 3).

  • b_fsas (Fsa) – An FsaVec (must have 3 axes) on the same device as a_fsas.

  • b_to_a_map (Tensor) –

    A 1-D torch.Tensor with dtype torch.int32 on the same device as a_fsas. Map from FSA-id in b_fsas to the corresponding FSA-id in a_fsas that we want to compose it with. E.g. might be an identity map, or all-to-zero, or something the user chooses.

    Requires
    • b_to_a_map.shape[0] == b_fsas.shape[0]

    • 0 <= b_to_a_map[i] < a_fsas.shape[0]

  • sorted_match_a (bool) – If true, the arcs of a_fsas must be sorted by label (checked by calling code via properties), and we’ll use a matching approach that requires this.

  • ret_arc_maps (bool) –

    If False, return the resulting Fsa. If True, return a tuple containing three entries:

    • the resulting Fsa

    • a_arc_map, a 1-D torch.Tensor with dtype torch.int32. a_arc_map[i] is the arc index in a_fsas that corresponds to the i-th arc in the resulting Fsa. a_arc_map[i] is -1 if the i-th arc in the resulting Fsa has no corresponding arc in a_fsas.

    • b_arc_map, a 1-D torch.Tensor with dtype torch.int32. b_arc_map[i] is the arc index in b_fsas that corresponds to the i-th arc in the resulting Fsa. b_arc_map[i] is -1 if the i-th arc in the resulting Fsa has no corresponding arc in b_fsas.

Return type

Union[Fsa, Tuple[Fsa, Tensor, Tensor]]

Returns

If ret_arc_maps is False, return intersected FsaVec; will satisfy ans.shape == b_fsas.shape. If ret_arc_maps is True, it returns additionally two arc maps: a_arc_map and b_arc_map.

invert

k2.invert(fsa, ret_arc_map=False)[source]

Invert an FST, swapping the labels in the FSA with the auxiliary labels.

Parameters
  • fsa (Fsa) – The input FSA. It can be either a single FSA or an FsaVec.

  • ret_arc_map (bool) – True to return an extra arc map, which is a 1-D tensor with dtype torch.int32. The returned arc_map[i] is the arc index in the input fsa that corresponds to the i-th arc in the returned fsa. arc_map[i] is -1 if the i-th arc in the returned fsa has no counterpart in the input fsa.

Return type

Union[Fsa, Tuple[Fsa, Tensor]]

Returns

If ret_arc_map is False, return the inverted Fsa, it’s top-sorted if fsa is top-sorted. If ret_arc_map is True, return an extra arc map.
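
For instance, a minimal sketch:

import k2

fsa = k2.Fsa.from_str('''
0 1 1 10 0.1
1 2 -1 -1 0.2
2
''', acceptor=False)
inverted = k2.invert(fsa)
# labels and aux_labels are swapped: the arc from state 0 now has
# label 10 and aux_label 1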

is_rand_equivalent

k2.is_rand_equivalent(a, b, log_semiring, beam=inf, treat_epsilons_specially=True, delta=1e-06, npath=100)[source]

Check if the Fsa a appears to be equivalent to b by randomly checking some symbol sequences in them.

Caution

It works only on CPU.

Parameters
  • a (Fsa) – One of the input FSA. It can be either a single FSA or an FsaVec. Must be top-sorted and on CPU.

  • b (Fsa) – The other input FSA. It must have the same NumAxes() as a. Must be top-sorted and on CPU.

  • log_semiring (bool) – The semiring to be used for all weight measurements; if false then we use ‘max’ on alternative paths; if true we use ‘log-add’.

  • beam (float) – beam > 0 that affects pruning; the algorithm will only check paths within beam of the total score of the lattice (for tropical semiring, it’s max weight over all paths from start state to final state; for log semiring, it’s log-sum probs over all paths) in a or b.

  • treat_epsilons_specially (bool) – We’ll do intersection between the generated paths and a or b when checking equivalence. Generally, if it is True, we will treat epsilons as epsilons when doing intersection; otherwise, epsilons will just be treated as any other symbol.

  • delta (float) – Tolerance for path weights when checking equivalence. If abs(weights_a - weights_b) <= delta, we say the two paths are equivalent.

  • npath (int) – The number of paths to generate for checking the equivalence of a and b.

Return type

bool

Returns

True if the Fsa a appears to be equivalent to b by randomly generating npath paths from one of them and then checking if the symbol sequence exists in the other one and if the total weight for that symbol sequence is the same in both FSAs.
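
A minimal sketch: two top-sorted CPU FSAs that accept the same weighted strings, with arcs listed in different orders:

import k2

a = k2.Fsa.from_str('''
0 1 1 0.5
0 1 2 0.3
1 2 -1 0.0
2
''')
b = k2.Fsa.from_str('''
0 1 2 0.3
0 1 1 0.5
1 2 -1 0.0
2
''')
assert k2.is_rand_equivalent(a, b, log_semiring=True)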

joint_mutual_information_recursion

k2.joint_mutual_information_recursion(px, py, boundary=None)[source]

A recursion that is useful for modifications of RNN-T and similar loss functions, where the recursion probabilities have a number of terms and you want them reported separately. See mutual_information_recursion() for more documentation of the basic aspects of this.

Parameters
  • px (Sequence[Tensor]) – a sequence of Tensors, each of the same shape [B][S][T+1]

  • py (Sequence[Tensor]) – a sequence of Tensors, each of the same shape [B][S+1][T]; the sequence must be the same length as px.

  • boundary (Optional[Tensor]) – optionally, a LongTensor of shape [B][4] containing rows [s_begin, t_begin, s_end, t_end], with 0 <= s_begin <= s_end <= S and 0 <= t_begin <= t_end < T, defaulting to [0, 0, S, T]. These are the beginning and one-past-the-last positions in the x and y sequences respectively, and can be used if not all sequences are of the same length.

Return type

Sequence[Tensor]

Returns

a Tensor of shape (len(px), B), whose sum over dim 0 is the total log-prob of the recursion mentioned below, per sequence. The first element of the sequence of length len(px) is “special”, in that it has an offset term reflecting the difference between sum-of-log and log-of-sum; for more interpretable loss values, the “main” part of your loss function should be first.

The recursion below applies if boundary == None, when it defaults to (0, 0, S, T); where px_sum, py_sum are the sums of the elements of px and py:

p = tensor of shape (B, S+1, T+1), containing -infinity
p[b,0,0] = 0.0
# do the following in a loop over s and t:
p[b,s,t] = log_add(p[b,s-1,t] + px_sum[b,s-1,t],
                   p[b,s,t-1] + py_sum[b,s,t-1])
           (if s > 0 or t > 0)
return p[:][S][T]

This function lets you implement the above recursion efficiently, except that it gives you a breakdown of the contribution from all the elements of px and py separately. As noted above, the first element of the sequence is “special”.

levenshtein_alignment

k2.levenshtein_alignment(refs, hyps, hyp_to_ref_map, sorted_match_ref=False)[source]

Get the levenshtein alignment of two FsaVecs.

This function supports both CPU and GPU. But it is very slow on CPU.

Parameters
  • refs (Fsa) – An FsaVec (must have 3 axes, i.e., len(refs.shape) == 3). It is the output of levenshtein_graph().

  • hyps (Fsa) – An FsaVec (must have 3 axes) on the same device as refs. It is the output of levenshtein_graph().

  • hyp_to_ref_map (Tensor) –

    A 1-D torch.Tensor with dtype torch.int32 on the same device as refs. Map from FSA-id in hyps to the corresponding FSA-id in refs that we want to get the levenshtein alignment with. E.g. it might be an identity map, or all-to-zero, or something the user chooses.

    Requires
    • hyp_to_ref_map.shape[0] == hyps.shape[0]

    • 0 <= hyp_to_ref_map[i] < refs.shape[0]

  • sorted_match_ref (bool) – If true, the arcs of refs must be sorted by label (checked by calling code via properties), and we’ll use a matching approach that requires this.

Return type

Fsa

Returns

Returns an FsaVec containing the alignment information and satisfying ans.Dim0() == hyps.Dim0(). Two attributes named ref_labels and hyp_labels will be added to the returned FsaVec. ref_labels contains the aligned sequences of refs and hyp_labels contains the aligned sequences of hyps. You can get the levenshtein distance by calling get_tot_scores on the returned FsaVec.

Examples

>>> hyps = k2.levenshtein_graph([[1, 2, 3], [1, 3, 3, 2]])
>>> refs = k2.levenshtein_graph([[1, 2, 4]])
>>> alignment = k2.levenshtein_alignment(
        refs, hyps,
        hyp_to_ref_map=torch.tensor([0, 0], dtype=torch.int32),
        sorted_match_ref=True)
>>> alignment.labels
tensor([ 1,  2,  0, -1,  1,  0,  0,  0, -1], dtype=torch.int32)
>>> alignment.ref_labels
tensor([ 1,  2,  4, -1,  1,  2,  4,  0, -1], dtype=torch.int32)
>>> alignment.hyp_labels
tensor([ 1,  2,  3, -1,  1,  3,  3,  2, -1], dtype=torch.int32)
>>> -alignment.get_tot_scores(
        use_double_scores=False, log_semiring=False)
tensor([1., 3.])

levenshtein_graph

k2.levenshtein_graph(symbols, ins_del_score=-0.501, device='cpu')[source]

Construct levenshtein graphs from symbols.

See https://github.com/k2-fsa/k2/pull/828 for more details about levenshtein graph.

Parameters
  • symbols (Union[RaggedTensor, List[List[int]]]) –

    It can be one of the following types:

    • A list of list-of-integers, e.g., [ [1, 2], [1, 2, 3] ]

    • An instance of k2.RaggedTensor. Must have num_axes == 2 and with dtype torch.int32.

  • ins_del_score (float) – The score on the self-loop arcs in the graphs. The main purpose of this score is to set an insertion and deletion penalty, which affects the shortest-path search procedure.

  • device (Union[device, str, None]) – Optional. It can be either a string (e.g., ‘cpu’, ‘cuda:0’) or a torch.device. By default, the returned FSA is on CPU. If symbols is an instance of k2.RaggedTensor, the returned FSA will be on the same device as symbols.

Return type

Fsa

Returns

An FsaVec containing the levenshtein graphs, with Dim0() equal to len(symbols) (for List[List[int]]) or symbols.dim0 (for k2.RaggedTensor).

linear_fsa

k2.linear_fsa(labels, device=None)[source]

Construct a linear FSA from labels.

Note

The scores of arcs in the returned FSA are all 0.

Parameters
  • labels (Union[List[int], List[List[int]], RaggedTensor]) –

    It can be one of the following types:

    • A list of integers, e.g., [1, 2, 3]

    • A list of list-of-integers, e.g., [ [1, 2], [1, 2, 3] ]

    • An instance of k2.RaggedTensor. Must have num_axes == 2.

  • device (Union[device, str, None]) – Optional. It can be either a string (e.g., ‘cpu’, ‘cuda:0’) or a torch.device. If it is None, then the returned FSA is on CPU. It has to be None if labels is an instance of k2.RaggedTensor.

Return type

Fsa

Returns

  • If labels is a list of integers, return an FSA

  • If labels is a list of list-of-integers, return an FsaVec

  • If labels is an instance of k2.RaggedTensor, return an FsaVec
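
A minimal example showing the three kinds of inputs described above:

import k2

fsa = k2.linear_fsa([1, 2, 3])            # a single FSA
fsa_vec = k2.linear_fsa([[1, 2], [3]])    # an FsaVec with 2 FSAs
ragged = k2.RaggedTensor([[1, 2], [3]])
fsa_vec2 = k2.linear_fsa(ragged)          # also an FsaVec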

linear_fsa_with_self_loops

k2.linear_fsa_with_self_loops(fsas)[source]

Create a linear FSA with epsilon self-loops by first removing epsilon transitions from the input linear FSA.

Parameters

fsas (Fsa) – An FSA or an FsaVec. It MUST be a linear FSA or a vector of linear FSAs.

Returns

Return an FSA or FsaVec, where each FSA contains epsilon self-loops but no epsilon transitions on arcs that are not self-loops.

linear_fst

k2.linear_fst(labels, aux_labels)[source]

Construct a linear FST from labels and their corresponding auxiliary labels.

Note

The scores of arcs in the returned FST are all 0.

Parameters
  • labels (Union[List[int], List[List[int]]]) – A list of integers or a list of list of integers.

  • aux_labels (Union[List[int], List[List[int]]]) – A list of integers or a list of list of integers.

Return type

Fsa

Returns

An FST if labels is a list of integers. A vector of FSTs (FsaVec) if the input is a list of list of integers.
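
A minimal example:

import k2

fst = k2.linear_fst([1, 2, 3], [4, 5, 6])            # a single FST
fsts = k2.linear_fst([[1, 2], [3]], [[4, 5], [6]])   # a vector of FSTs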

linear_fst_with_self_loops

k2.linear_fst_with_self_loops(fsts)[source]

Create a linear FST with epsilon self-loops by first removing epsilon transitions from the input linear FST.

Note

The main difference to linear_fsa_with_self_loops() is that aux_labels and scores are also kept here.

Parameters

fsts – An FST or an FsaVec. It MUST be a linear FST or a vector of linear FSTs.

Returns

Return an FST or a vector of FSTs, where each FST contains epsilon self-loops but no epsilon transitions on arcs that are not self-loops.

mutual_information_recursion

k2.mutual_information_recursion(px, py, boundary=None, return_grad=False)[source]

A recursion that is useful in computing mutual information between two sequences of real vectors, but may be useful more generally in sequence-to-sequence tasks where monotonic alignment between pairs of sequences is desired. The definitions of the arguments are definitions that would be used when computing this type of mutual information, but you can also view them as arbitrary quantities and just make use of the formula computed by this function.

Parameters
  • px (Tensor) –

    A torch.Tensor of some floating point type, with shape [B][S][T+1], where B is the batch size, S is the length of the x sequence (including representations of EOS symbols but not BOS symbols), and T is the length of the y sequence (including representations of EOS symbols but not BOS symbols). In the mutual information application, px[b][s][t] would represent the following log odds ratio; ignoring the b index on the right to make the notation more compact:

    px[b][s][t] =  log [ p(x_s | x_{0..s-1}, y_{0..t-1}) / p(x_s) ]
    

    This expression also implicitly includes the log-probability of choosing to generate an x value as opposed to a y value. In practice it might be computed as a + b, where a is the log probability of choosing to extend the sequence of length (s,t) with an x as opposed to a y value; and b might in practice be of the form:

    log(N exp f(x_s, y_{t-1}) / sum_t'  exp f(x_s, y_t'))
    

    where N is the number of terms that the sum over t' included, which might include some or all of the other sequences as well as this one.

    Note

    We don’t require px and py to be contiguous, but for optimization purposes the code assumes that the T axis has stride 1.

  • py (Tensor) –

    A torch.Tensor of the same dtype as px, with shape [B][S+1][T], representing:

    py[b][s][t] =  log [ p(y_t | x_{0..s-1}, y_{0..t-1}) / p(y_t) ]
    

    This function does not treat x and y differently; the only difference is that for optimization purposes we assume the last axis (the t axis) has stride of 1; this is true if px and py are contiguous.

  • boundary (Optional[Tensor]) – If supplied, a torch.LongTensor of shape [B][4], where each row contains [s_begin, t_begin, s_end, t_end], with 0 <= s_begin <= s_end <= S and 0 <= t_begin <= t_end < T (this implies that empty sequences are allowed). If not supplied, the values [0, 0, S, T] will be assumed. These are the beginning and one-past-the-last positions in the x and y sequences respectively, and can be used if not all sequences are of the same length.

  • return_grad (bool) – Whether to return the grads of px and py. These grads, standing for the occupation probability, are the output of the backward pass with a fake gradient; the fake gradient is the same as the gradient you’d get if you did torch.autograd.grad((scores.sum()), [px, py]). This is useful for implementing the pruned version of the rnnt loss.

Return type

Union[Tuple[Tensor, Tuple[Tensor, Tensor]], Tensor]

Returns

Returns a torch.Tensor of shape [B], containing the log of the mutual information between the b’th pair of sequences. This is defined by the following recursion on p[b,s,t] (where p is of shape [B,S+1,T+1]), representing a mutual information between sub-sequences of lengths s and t:

     p[b,0,0] = 0.0
if !modified:
     p[b,s,t] = log_add(p[b,s-1,t] + px[b,s-1,t],
                        p[b,s,t-1] + py[b,s,t-1])
if modified:
     p[b,s,t] = log_add(p[b,s-1,t-1] + px[b,s-1,t-1],
                        p[b,s,t-1] + py[b,s,t-1])

where we handle edge cases by treating quantities with negative indexes as -infinity. The extension to cases where the boundaries are specified should be obvious; it just works on shorter sequences with offsets into px and py.
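
A minimal sketch with random inputs of the documented shapes (a real application would compute px and py from model outputs):

import torch
import k2

B, S, T = 2, 3, 5
px = torch.randn(B, S, T + 1)  # contiguous, so the T axis has stride 1
py = torch.randn(B, S + 1, T)
boundary = torch.tensor([[0, 0, S, T]] * B, dtype=torch.int64)
scores = k2.mutual_information_recursion(px, py, boundary=boundary)
print(scores.shape)  # torch.Size([2]), one value per sequence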

mwer_loss

k2.mwer_loss(lattice, ref_texts, nbest_scale=0.5, num_paths=200, temperature=1.0, use_double_scores=True, reduction='sum')[source]

Compute the Minimum Word Error Rate (MWER) loss given a lattice and the corresponding ref_texts.

Parameters
  • lattice – An FsaVec with axes [utt][state][arc].

  • ref_texts

    It can be one of the following types:
    • A list of list-of-integers, e.g., [ [1, 2], [1, 2, 3] ]

    • An instance of k2.RaggedTensor. Must have num_axes == 2 and with dtype torch.int32.

  • nbest_scale – Scale lattice.score before passing it to k2.random_paths(). A smaller value leads to more unique paths, at the risk of failing to sample the path with the best score.

  • num_paths – Number of paths to sample from the lattice using k2.random_paths().

  • temperature – For long utterances, the dynamic range of scores will be too large and the posteriors will be mostly 0 or 1. To prevent this it might be a good idea to have an extra argument that functions like a temperature: we scale the logprobs by it before doing the normalization.

  • use_double_scores – True to use double precision floating point. False to use single precision.

  • reduction (Literal[‘none’, ‘mean’, ‘sum’]) –

    Specifies the reduction to apply to the output: ‘none’ | ‘sum’ | ‘mean’.

    ‘none’: no reduction will be applied. The returned ‘loss’ is a k2.RaggedTensor, with loss.tot_size(0) == batch_size and loss.tot_size(1) == total_num_paths_of_current_batch. If you want the MWER loss for each utterance, just do loss_per_utt = loss.sum(); then loss_per_utt.shape[0] equals batch_size. See more example usages in ‘k2/python/tests/mwer_test.py’.

    ‘sum’: sum the loss of each path over the whole batch.

    ‘mean’: divide the above ‘sum’ by the total number of paths over the whole batch.

Return type

Union[Tensor, RaggedTensor]

Returns

Minimum Word Error Rate loss.

one_best_decoding

k2.one_best_decoding(lattice, use_double_scores=True)[source]

Get the best path from a lattice.

Parameters
  • lattice (Fsa) – The decoding lattice returned by get_lattice().

  • use_double_scores (bool) – True to use double precision floating point in the computation. False to use single precision.

Return type

Fsa

Returns

An FsaVec containing linear paths.

properties_to_str

k2.properties_to_str(p)[source]

Convert properties to a string for debugging purposes.

Parameters

p (int) – An integer returned by get_properties().

Return type

str

Returns

A string representation of the input properties.

prune_on_arc_post

k2.prune_on_arc_post(fsas, threshold_prob, use_double_scores)[source]

Remove arcs whose posteriors are less than the given threshold.

Parameters
  • fsas (Fsa) – An FsaVec. Must have 3 axes.

  • threshold_prob (float) – Arcs whose posteriors are less than this value are removed. Note: 0 < threshold_prob < 1.

  • use_double_scores (bool) – True to use double precision during computation; False to use single precision.

Return type

Fsa

Returns

Return a pruned FsaVec.

pruned_ranges_to_lattice

k2.pruned_ranges_to_lattice(ranges: torch.Tensor, frames: torch.Tensor, symbols: torch.Tensor, logits: torch.Tensor) Tuple[_k2.RaggedArc, torch.Tensor]

random_fsa

k2.random_fsa(acyclic=True, max_symbol=50, min_num_arcs=0, max_num_arcs=1000)[source]

Generate a random Fsa.

Parameters
  • acyclic (bool) – If true, generated Fsa will be acyclic.

  • max_symbol (int) – Maximum symbol on arcs. Generated arc symbols will be in the range [-1, max_symbol]; note -1 is kFinalSymbol. Must be at least 0.

  • min_num_arcs (int) – Minimum number of arcs; must be at least 0.

  • max_num_arcs (int) – Maximum number of arcs; must be >= min_num_arcs.

Return type

Fsa

random_fsa_vec

k2.random_fsa_vec(min_num_fsas=1, max_num_fsas=1000, acyclic=True, max_symbol=50, min_num_arcs=0, max_num_arcs=1000)[source]

Generate a random FsaVec.

Parameters
  • min_num_fsas (int) – Minimum number of FSAs we’ll generate in the returned FsaVec; must be at least 1.

  • max_num_fsas (int) – Maximum number of FSAs we’ll generate in the returned FsaVec; must be >= min_num_fsas.

  • acyclic (bool) – If true, generated Fsas will be acyclic.

  • max_symbol (int) – Maximum symbol on arcs. Generated arcs’ symbols will be in the range [-1, max_symbol]; note -1 is kFinalSymbol. Must be at least 0.

  • min_num_arcs (int) – Minimum number of arcs in each Fsa; must be at least 0.

  • max_num_arcs (int) – Maximum number of arcs in each Fsa; must be >= min_num_arcs.

Return type

Fsa

random_paths

k2.random_paths(fsas, use_double_scores, num_paths)[source]

Compute pseudo-random paths through the FSAs in this vector of FSAs (this object must have 3 axes, self.arcs.num_axes() == 3)

Caution

It does not support autograd.

Caution

Do not be confused by the function name. There is no randomness at all, thus no seed. It uses a deterministic algorithm internally, similar to arithmetic coding (see https://en.wikipedia.org/wiki/Arithmetic_coding).

Look into the C++ implementation code for more details.

Parameters
  • fsas (Fsa) – A FsaVec, i.e., len(fsas.shape) == 3

  • use_double_scores (bool) – If true, do computation with double-precision, else float (single-precision)

  • num_paths (int) – Number of paths requested through each FSA. FSAs that have no successful paths will have zero paths returned.

Return type

k2.RaggedTensor (dtype torch.int32) with 3 axes

Returns

A ragged tensor with axes [fsa][path][arc_pos]; the final sub-lists (indexed with arc_pos) are sequences of arcs starting from the start state and terminating in the final state. The values are arc_idx012, i.e. arc indexes.

remove_epsilon

k2.remove_epsilon(fsa)[source]

Remove epsilons (symbol zero) in the input Fsa.

Caution

Call k2.connect() if you are using a GPU version.

Parameters

fsa (Fsa) –

The input FSA. It can be either a single FSA or an FsaVec. Works either for CPU or GPU, but the algorithm is different. We can only use the CPU algorithm if the input is top-sorted, and the GPU algorithm, while it works for CPU, may not be very fast.

fsa must be free of epsilon loops that have score greater than 0.

Return type

Fsa

Returns

The resulting Fsa is equivalent to the input fsa under the tropical semiring but will be epsilon-free. Any linear tensor attributes, such as ‘aux_labels’, will have been turned into ragged labels after removing fillers (i.e. labels whose value equals fsa.XXX_filler if the attribute name is XXX), counting -1’s on final-arcs as fillers even if the filler value for that attribute is not -1.
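
A minimal example with a top-sorted CPU input containing one epsilon (label 0) arc:

import k2

s = '''
0 1 0 0.1
1 2 1 0.2
2 3 -1 0.3
3
'''
fsa = k2.Fsa.from_str(s)
fsa_no_eps = k2.remove_epsilon(fsa)  # an equivalent FSA without the epsilon arc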

remove_epsilon_and_add_self_loops

k2.remove_epsilon_and_add_self_loops(fsa, remove_filler=True)[source]

Remove epsilons (symbol zero) in the input Fsa, and then add epsilon self-loops to all states in the input Fsa (usually as a preparation for intersection with treat_epsilons_specially=0).

Caution

Call k2.connect() if you are using a GPU version.

Parameters
  • fsa (Fsa) – The input FSA. It can be either a single FSA or an FsaVec.

  • remove_filler (bool) – If true, we will remove any filler values of attributes when converting linear to ragged attributes.

Return type

Fsa

Returns

The resulting Fsa. See remove_epsilon() for details. The only epsilons will be epsilon self-loops on all states.

remove_epsilon_self_loops

k2.remove_epsilon_self_loops(fsa)[source]

Remove epsilon self-loops of an Fsa or an FsaVec.

Caution

Unlike remove_epsilon(), this function removes only epsilon self-loops.

Parameters

fsa (Fsa) – The input FSA. It can be either a single FSA or an FsaVec.

Return type

Fsa

Returns

An instance of Fsa with all epsilon self-loops removed.

replace_fsa

k2.replace_fsa(src, index, symbol_begin_range=1, ret_arc_map=False)[source]

Replace arcs in the index FSA with the corresponding FSAs in a vector of FSAs (src). Arcs in index whose labels satisfy symbol_begin_range <= label < symbol_begin_range + src.Dim0() are replaced with the FSA indexed label - symbol_begin_range in src. The destination state of such an arc in index is identified with the final state of the corresponding FSA in src, and the arc in index becomes an epsilon arc leading to a new state in the output that is a copy of the start state of the corresponding FSA in src. Arcs with labels outside this range are just copied. Labels on final-arcs in src (which will be -1) are set to 0 (epsilon) in the result FSA.

Caution

Attributes of the result are inherited from index and src via arc_map_index and arc_map_src. If there are attributes with the same name, only the attributes with dtype torch.float32 are supported; other kinds of attributes are discarded. See docs in fsa_from_binary_function_tensor for details.

Parameters
  • src (Fsa) – Fsa that we’ll be inserting into the result, MUST have 3 axes.

  • index (Fsa) – The Fsa that is to be replaced, It can be a single FSA or a vector of FSAs.

  • symbol_begin_range – Beginning of the range of symbols that are to be replaced with FSAs.

  • ret_arc_map (bool) – If true, will return a tuple (new_fsas, arc_map_index, arc_map_src), where arc_map_index and arc_map_src are int32 tensors that map from arcs in the result to arcs in index and src, with -1’s for arcs that are not mapped. If false, just returns new_fsas.

Return type

Union[Fsa, Tuple[Fsa, Tensor, Tensor]]

reverse

k2.reverse(fsa)[source]

Reverse the input Fsa. If the input Fsa accepts string ‘x’ with weight ‘x.weight’, then the reversed Fsa accepts the reverse of string ‘x’ with weight ‘x.weight.reverse’. Since k2 Fsas run on the log semiring or tropical semiring, ‘weight.reverse’ equals the original ‘weight’.

Parameters

fsa (Fsa) – The input FSA. It can be either a single FSA or an FsaVec.

Return type

Fsa

Returns

An instance of Fsa which has been reversed.

rnnt_loss

k2.rnnt_loss(logits, symbols, termination_symbol, boundary=None, rnnt_type='regular', delay_penalty=0.0, reduction='mean')[source]

A normal RNN-T loss, which uses a ‘joiner’ network output as input, i.e. a 4-dimensional tensor.

Parameters
  • logits (Tensor) – The output of joiner network, with shape (B, T, S + 1, C), i.e. batch, time_seq_len, symbol_seq_len+1, num_classes

  • symbols (Tensor) – The symbol sequences, a LongTensor of shape [B][S], and elements in {0..C-1}.

  • termination_symbol (int) – the termination symbol, with 0 <= termination_symbol < C

  • boundary (Optional[Tensor]) – an optional LongTensor of shape [B, 4] with elements interpreted as [begin_symbol, begin_frame, end_symbol, end_frame] that is treated as [0, 0, S, T] if boundary is not supplied. Most likely you will want begin_symbol and begin_frame to be zero.

  • rnnt_type (str) –

    Specifies the type of rnnt paths: regular, modified or constrained.

    regular: The regular rnnt, which takes you to the next frame only when emitting a blank (i.e., emitting a symbol does not take you to the next frame).

    modified: A modified version of rnnt that takes you to the next frame whether emitting a blank or a non-blank symbol.

    constrained: A version like the modified one that goes to the next frame when you emit a non-blank symbol, but this is done by “forcing” you to take the blank transition from the next context on the current frame, e.g. if we emit c given “a b” context, we are forced to emit “blank” given “b c” context on the current frame.

  • delay_penalty (float) – A constant value to penalize symbol delay, this may be needed when training with time masking, to avoid the time-masking encouraging the network to delay symbols. See https://github.com/k2-fsa/k2/issues/955 for more details.

  • reduction (Optional[str]) – Specifies the reduction to apply to the output: none, mean or sum. none: no reduction will be applied. mean: apply torch.mean over the batches. sum: the output will be summed. Default: mean

Return type

Tensor

Returns

If reduction is none, returns a tensor of shape (B,), containing the total RNN-T loss values for each element of the batch; otherwise a scalar with the reduction applied.
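
A minimal sketch with random tensors of the documented shapes; 0 is used as the termination (blank) symbol:

import torch
import k2

B, T, S, C = 2, 10, 5, 20
logits = torch.randn(B, T, S + 1, C)
symbols = torch.randint(1, C, (B, S), dtype=torch.int64)
boundary = torch.tensor([[0, 0, S, T]] * B, dtype=torch.int64)
loss = k2.rnnt_loss(logits, symbols, termination_symbol=0, boundary=boundary)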

rnnt_loss_pruned

k2.rnnt_loss_pruned(logits, symbols, ranges, termination_symbol, boundary=None, rnnt_type='regular', delay_penalty=0.0, reduction='mean', use_hat_loss=False)[source]

An RNN-T loss with pruning, which uses the output of a pruned ‘joiner’ network as input, i.e. a 4-dimensional tensor with shape (B, T, s_range, C), where s_range is the number of symbols kept for each frame.

Parameters
  • logits (Tensor) – The pruned output of joiner network, with shape (B, T, s_range, C), i.e. batch, time_seq_len, prune_range, num_classes

  • symbols (Tensor) – A LongTensor of shape [B][S], containing the symbols at each position of the sequence.

  • ranges (Tensor) – A tensor containing the symbol ids for each frame that we want to keep. It is a LongTensor of shape [B][T][s_range], where ranges[b,t,0] contains the begin symbol 0 <= s <= S - s_range + 1, such that logits[b,t,:,:] represents the logits at positions s, s + 1, ..., s + s_range - 1. See docs in get_rnnt_prune_ranges() for more details of what ranges contains.

  • termination_symbol (int) – The identity of the termination symbol, must be in {0..C-1}

  • boundary (Optional[Tensor]) – a LongTensor of shape [B, 4] with elements interpreted as [begin_symbol, begin_frame, end_symbol, end_frame] that is treated as [0, 0, S, T] if boundary is not supplied. Most likely you will want begin_symbol and begin_frame to be zero.

  • rnnt_type (str) –

    Specifies the type of rnnt paths: regular, modified or constrained.

    regular: The regular rnnt, which takes you to the next frame only when emitting a blank (i.e., emitting a symbol does not take you to the next frame).

    modified: A modified version of rnnt that takes you to the next frame whether emitting a blank or a non-blank symbol.

    constrained: A version like the modified one that goes to the next frame when you emit a non-blank symbol, but this is done by “forcing” you to take the blank transition from the next context on the current frame, e.g. if we emit c given “a b” context, we are forced to emit “blank” given “b c” context on the current frame.

  • delay_penalty (float) – A constant value to penalize symbol delay, this may be needed when training with time masking, to avoid the time-masking encouraging the network to delay symbols. See https://github.com/k2-fsa/k2/issues/955 for more details.

  • reduction (Optional[str]) – Specifies the reduction to apply to the output: none, mean or sum. none: no reduction will be applied. mean: apply torch.mean over the batches. sum: the output will be summed. Default: mean

  • use_hat_loss (bool) – If True, we compute the Hybrid Autoregressive Transducer (HAT) loss from https://arxiv.org/abs/2003.07705. This is a variant of RNN-T that models the blank distribution separately as a Bernoulli distribution, and the non-blanks are modeled as a multinomial. This formulation may be useful for performing internal LM estimation, as described in the paper.

Return type

Tensor

Returns

If reduction is none, returns a tensor of shape (B,), containing the total RNN-T loss values for each sequence of the batch; otherwise a scalar with the reduction applied.
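
A sketch of the typical pruned pipeline, assuming the helpers k2.get_rnnt_prune_ranges() (referenced above) and k2.do_rnnt_pruning() are available; the joiner here is plain addition rather than a real network:

import torch
import k2

B, T, S, C, s_range = 2, 10, 5, 20, 3
lm = torch.randn(B, S + 1, C)
am = torch.randn(B, T, C)
symbols = torch.randint(1, C, (B, S), dtype=torch.int64)
boundary = torch.tensor([[0, 0, S, T]] * B, dtype=torch.int64)

# occupation-probability grads from the simple loss drive the pruning
simple_loss, (px_grad, py_grad) = k2.rnnt_loss_simple(
    lm, am, symbols, termination_symbol=0, boundary=boundary,
    reduction='sum', return_grad=True)

ranges = k2.get_rnnt_prune_ranges(px_grad, py_grad, boundary, s_range)
am_pruned, lm_pruned = k2.do_rnnt_pruning(am=am, lm=lm, ranges=ranges)

logits = am_pruned + lm_pruned  # shape (B, T, s_range, C)
pruned_loss = k2.rnnt_loss_pruned(
    logits, symbols, ranges, termination_symbol=0, boundary=boundary)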

rnnt_loss_simple

k2.rnnt_loss_simple(lm, am, symbols, termination_symbol, boundary=None, rnnt_type='regular', delay_penalty=0.0, reduction='mean', return_grad=False)[source]

A simple case of the RNN-T loss, where the ‘joiner’ network is just addition.

Parameters
  • lm (Tensor) – language-model part of unnormalized log-probs of symbols, with shape (B, S+1, C), i.e. batch, symbol_seq_len+1, num_classes

  • am (Tensor) – acoustic-model part of unnormalized log-probs of symbols, with shape (B, T, C), i.e. batch, frame, num_classes

  • symbols (Tensor) – the symbol sequences, a LongTensor of shape [B][S], and elements in {0..C-1}.

  • termination_symbol (int) – the termination symbol, with 0 <= termination_symbol < C

  • boundary (Optional[Tensor]) – an optional LongTensor of shape [B, 4] with elements interpreted as [begin_symbol, begin_frame, end_symbol, end_frame] that is treated as [0, 0, S, T] if boundary is not supplied. Most likely you will want begin_symbol and begin_frame to be zero.

  • rnnt_type (str) –

    Specifies the type of rnnt paths: regular, modified or constrained.

    regular: The regular rnnt, which takes you to the next frame only when emitting a blank (i.e., emitting a symbol does not take you to the next frame).

    modified: A modified version of rnnt that takes you to the next frame whether emitting a blank or a non-blank symbol.

    constrained: A version like the modified one that goes to the next frame when you emit a non-blank symbol, but this is done by “forcing” you to take the blank transition from the next context on the current frame, e.g. if we emit c given “a b” context, we are forced to emit “blank” given “b c” context on the current frame.

  • delay_penalty (float) – A constant value to penalize symbol delay, this may be needed when training with time masking, to avoid the time-masking encouraging the network to delay symbols. See https://github.com/k2-fsa/k2/issues/955 for more details.

  • reduction (Optional[str]) – Specifies the reduction to apply to the output: none, mean or sum. none: no reduction will be applied. mean: apply torch.mean over the batches. sum: the output will be summed. Default: mean

  • return_grad (bool) – Whether to return the grads of px and py. These grads, standing for the occupation probability, are the output of the backward pass with a fake gradient; the fake gradient is the same as the gradient you’d get if you did torch.autograd.grad((-loss.sum()), [px, py]), where the loss is computed with reduction “none”. This is useful for implementing the pruned version of the rnnt loss.

Return type

Union[Tensor, Tuple[Tensor, Tuple[Tensor, Tensor]]]

Returns

If return_grad is False, returns a tensor of shape (B,) containing the total RNN-T loss values for each element of the batch if reduction equals “none”, otherwise a scalar with the reduction applied. If return_grad is True, the grads of px and py, which are the output of backward with a fake gradient (see above), will be returned too, and the returned value will be a tuple like (loss, (px_grad, py_grad)).

rnnt_loss_smoothed

k2.rnnt_loss_smoothed(lm, am, symbols, termination_symbol, lm_only_scale=0.1, am_only_scale=0.1, boundary=None, rnnt_type='regular', delay_penalty=0.0, reduction='mean', return_grad=False)[source]

A smoothed variant of the simple RNN-T loss, where the ‘joiner’ network is just addition; the loss is smoothed by interpolation with ‘LM-only’ and ‘AM-only’ terms (see lm_only_scale and am_only_scale below).

Parameters
  • lm (Tensor) – language-model part of unnormalized log-probs of symbols, with shape (B, S+1, C), i.e. batch, symbol_seq_len+1, num_classes. These are assumed to be well-normalized, in the sense that we could use them as probabilities separately from the am scores

  • am (Tensor) – acoustic-model part of unnormalized log-probs of symbols, with shape (B, T, C), i.e. batch, frame, num_classes

  • symbols (Tensor) – the symbol sequences, a LongTensor of shape [B][S], and elements in {0..C-1}.

  • termination_symbol (int) – the termination symbol, with 0 <= termination_symbol < C

  • lm_only_scale (float) – the scale on the “LM-only” part of the loss.

  • am_only_scale (float) – the scale on the “AM-only” part of the loss, for which we use an “averaged” LM (averaged over all histories, so effectively unigram).

  • boundary (Optional[Tensor]) – a LongTensor of shape [B, 4] with elements interpreted as [begin_symbol, begin_frame, end_symbol, end_frame] that is treated as [0, 0, S, T] if boundary is not supplied. Most likely you will want begin_symbol and begin_frame to be zero.

  • rnnt_type (str) –

    Specifies the type of rnnt paths: regular, modified or constrained.

    regular: The regular rnnt, which takes you to the next frame only when emitting a blank (i.e., emitting a symbol does not take you to the next frame).

    modified: A modified version of rnnt that takes you to the next frame whether emitting a blank or a non-blank symbol.

    constrained: A version like the modified one that goes to the next frame when you emit a non-blank symbol, but this is done by “forcing” you to take the blank transition from the next context on the current frame, e.g. if we emit c given “a b” context, we are forced to emit “blank” given “b c” context on the current frame.

  • delay_penalty (float) – A constant value to penalize symbol delay, this may be needed when training with time masking, to avoid the time-masking encouraging the network to delay symbols. See https://github.com/k2-fsa/k2/issues/955 for more details.

  • reduction (Optional[str]) – Specifies the reduction to apply to the output: none, mean or sum. none: no reduction will be applied. mean: apply torch.mean over the batches. sum: the output will be summed. Default: mean

  • return_grad (bool) – Whether to return the grads of px and py. These grads, standing for the occupation probability, are the output of the backward pass with a fake gradient; the fake gradient is the same as the gradient you’d get if you did torch.autograd.grad((-loss.sum()), [px, py]), where the loss is computed with reduction “none”. This is useful for implementing the pruned version of the rnnt loss.

Return type

Union[Tuple[Tensor, Tuple[Tensor, Tensor]], Tensor]

Returns

If return_grad is False, returns a tensor of shape (B,) containing the total RNN-T loss values for each element of the batch if reduction equals “none”, otherwise a scalar with the reduction applied. If return_grad is True, the grads of px and py, which are the output of backward with a fake gradient (see above), will be returned too, and the returned value will be a tuple like (loss, (px_grad, py_grad)).

shortest_path

k2.shortest_path(fsa, use_double_scores)[source]

Return the shortest paths as linear FSAs from the start state to the final state in the tropical semiring.

Note

It uses the opposite sign convention. That is, it uses max instead of min.

Parameters
  • fsa (Fsa) – The input FSA. It can be either a single FSA or an FsaVec.

  • use_double_scores (bool) – False to use float, i.e., single precision floating point, for scores. True to use double.

Return type

Fsa

Returns

An FsaVec containing the best paths as linear FSAs.
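
A minimal example; in the tropical semiring the arc with score 1.5 lies on the best path:

import k2

s = '''
0 1 1 0.5
0 1 2 1.5
1 2 -1 0.0
2
'''
fsa = k2.Fsa.from_str(s)
best = k2.shortest_path(fsa, use_double_scores=True)  # an FsaVec of linear FSAs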

simple_ragged_index_select

k2.simple_ragged_index_select(src: torch.Tensor, indexes: k2::RaggedAny) torch.Tensor

swoosh_l

k2.swoosh_l(x: torch.Tensor, dropout_prob: float = 0.0) torch.Tensor

Compute swoosh_l(x) = log(1 + exp(x-4)) - 0.08x - 0.035, and optionally apply dropout. If x.requires_grad is True, it returns dropout(swoosh_l(x)). In order to reduce memory, the derivative swoosh_l'(x) is encoded into 8 bits. If x.requires_grad is False, it returns swoosh_l(x).

Parameters
  • x – A Tensor.

  • dropout_prob – A float number. The default value is 0.
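
A reference implementation of the formula above in plain PyTorch (a sketch for clarity only; it does not reproduce k2's 8-bit derivative encoding or the fused dropout):

import torch

def swoosh_l_ref(x: torch.Tensor) -> torch.Tensor:
    # swoosh_l(x) = log(1 + exp(x - 4)) - 0.08 * x - 0.035;
    # softplus computes log(1 + exp(.)) in a numerically stable way
    return torch.nn.functional.softplus(x - 4.0) - 0.08 * x - 0.035

y = swoosh_l_ref(torch.randn(8))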

swoosh_l_forward

k2.swoosh_l_forward(x: torch.Tensor) torch.Tensor

Compute swoosh_l(x) = log(1 + exp(x-4)) - 0.08x - 0.035.

Parameters

x – A Tensor.

swoosh_l_forward_and_deriv

k2.swoosh_l_forward_and_deriv(x: torch.Tensor) Tuple[torch.Tensor, torch.Tensor]

Compute swoosh_l(x) = log(1 + exp(x-4)) - 0.08x - 0.035, and also the derivative swoosh_l'(x) = 0.92 - 1 / (1 + exp(x-4)).

Note

\[\begin{split}\text{swoosh_l}'(x) &= -0.08 + \exp(x-4) / (1 + \exp(x-4)) \\ &= -0.08 + (1 - 1 / (1 + \exp(x-4))) \\ &= 0.92 - 1 / (1 + \exp(x-4))\end{split}\]

1 + exp(x-4) might be infinity, but 1 / (1 + exp(x-4)) will be 0 in that case. This is partly why we rearranged the expression above, to avoid infinity / infinity = nan.

Parameters

x – A Tensor.

swoosh_r

k2.swoosh_r(x: torch.Tensor, dropout_prob: float = 0.0) torch.Tensor

Compute swoosh_r(x) = log(1 + exp(x-1)) - 0.08x - 0.313261687, and optionally apply dropout. If x.requires_grad is True, it returns dropout(swoosh_r(x)). In order to reduce memory, the derivative swoosh_r'(x) is encoded into 8 bits. If x.requires_grad is False, it returns swoosh_r(x).

Parameters
  • x – A Tensor.

  • dropout_prob – A float number. The default value is 0.

swoosh_r_forward

k2.swoosh_r_forward(x: torch.Tensor) torch.Tensor

Compute swoosh_r(x) = log(1 + exp(x-1)) - 0.08x - 0.313261687.

Parameters

x – A Tensor.

swoosh_r_forward_and_deriv

k2.swoosh_r_forward_and_deriv(x: torch.Tensor) Tuple[torch.Tensor, torch.Tensor]

Compute swoosh_r(x) = log(1 + exp(x-1)) - 0.08x - 0.313261687, and also the derivative swoosh_r'(x) = 0.92 - 1 / (1 + exp(x-1)).

Note

\[\begin{split}\text{swoosh_r}'(x) &= -0.08 + \exp(x-1) / (1 + \exp(x-1)) \\ &= -0.08 + (1 - 1 / (1 + \exp(x-1))) \\ &= 0.92 - 1 / (1 + \exp(x-1))\end{split}\]

1 + exp(x-1) might be infinity, but 1 / (1 + exp(x-1)) will be 0 in that case. This is partly why we rearranged the expression above, to avoid infinity / infinity = nan.

Parameters

x – A Tensor.

to_dot

k2.to_dot(fsa, title=None)[source]

Visualize an Fsa via graphviz.

Note

Graphviz is needed only when this function is called.

Parameters
  • fsa (Fsa) – The input FSA to be visualized.

  • title (Optional[str]) – Optional. The title of the resulting visualization.

Return type

Digraph

Returns

a Digraph from graphviz.

to_str

k2.to_str(fsa, openfst=False)[source]

Convert an Fsa to a string. This version prints out all integer labels and integer ragged labels on the same line as each arc, the same format accepted by Fsa.from_str().

Note

The returned string can be used to construct an Fsa with Fsa.from_str(), but you would need to know the names of the auxiliary labels and ragged labels.

Parameters

openfst (bool) – Optional. If true, we negate the scores during the conversion.

Return type

str

Returns

A string representation of the Fsa.

to_str_simple

k2.to_str_simple(fsa, openfst=False)[source]

Convert an Fsa to a string. This is less complete than Fsa.to_str(), fsa.__str__(), or to_str_full(): it prints only fsa.aux_labels and no ragged labels, and does not print any other attributes. It is used in testing.

Note

The returned string can be used to construct an Fsa. See also to_str().

Parameters

openfst (bool) – Optional. If true, we negate the scores during the conversion.

Return type

str

Returns

A string representation of the Fsa.

to_tensor

k2.to_tensor(fsa)[source]

Convert an Fsa to a Tensor.

You can save the tensor to disk and read it later to construct an Fsa.

Note

The returned Tensor contains only the transition rules, e.g., arcs. You may want to save its aux_labels separately if any.

Parameters

fsa (Fsa) – The input Fsa.

Return type

Tensor

Returns

A torch.Tensor of dtype torch.int32. It is a 2-D tensor if the input is a single FSA. It is a 1-D tensor if the input is a vector of FSAs.
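
A save/load round-trip sketch; reconstructing with k2.Fsa(...) from the saved tensor relies on Fsa.__init__ documented later on this page (the file path is illustrative):

import torch
import k2

fsa = k2.linear_fsa([1, 2, 3])
t = k2.to_tensor(fsa)
torch.save(t, 'fsa.pt')
fsa2 = k2.Fsa(torch.load('fsa.pt'))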

top_sort

k2.top_sort(fsa)[source]

Sort an FSA topologically.

Note

It returns a new FSA. The input FSA is NOT changed.

Parameters

fsa (Fsa) – The input FSA to be sorted. It can be either a single FSA or a vector of FSAs.

Return type

Fsa

Returns

It returns a single FSA if the input is a single FSA; it returns a vector of FSAs if the input is a vector of FSAs.

trivial_graph

k2.trivial_graph(max_token, device=None)[source]

Create a trivial graph that has only two states. On state 0, there are max_token self-loops (i.e., a loop for each symbol from 1 to max_token), and state 1 is the final state.

Parameters
  • max_token (int) – The maximum token ID (inclusive). We assume that token IDs are contiguous (from 1 to max_token).

  • device (Union[device, str, None]) – Optional. It can be either a string (e.g., ‘cpu’, ‘cuda:0’) or a torch.device. If it is None, then the returned FSA is on CPU.

Return type

Fsa

Returns

Returns the expected trivial graph on the given device. Note: The returned graph does not contain arcs with label being 0.

union

k2.union(fsas)[source]

Compute the union of an FsaVec.

Caution

We require that every fsa in fsas is non-empty, i.e., contains at least two states.

Parameters

fsas (Fsa) – An FsaVec. That is, len(fsas.shape) == 3.

Return type

Fsa

Returns

A single Fsa that is the union of the input fsas.
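
A minimal example:

import k2

fsas = k2.linear_fsa([[1, 2], [3]])  # an FsaVec with two non-empty FSAs
u = k2.union(fsas)                   # a single Fsa accepting either string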

CtcLoss

forward

CtcLoss.forward(decoding_graph, dense_fsa_vec, delay_penalty=0.0, target_lengths=None)[source]

Compute the CTC loss given a decoding graph and a dense fsa vector.

Parameters
  • decoding_graph (Fsa) – An FsaVec. It can be the composition result of a CTC topology and a transcript.

  • dense_fsa_vec (DenseFsaVec) – It represents the neural network output. Refer to the help information in k2.DenseFsaVec.

  • delay_penalty (float) – A constant to penalize symbol delay, which is used to make symbol emit earlier for streaming models. It is almost the same as the delay_penalty in our rnnt_loss, See https://github.com/k2-fsa/k2/issues/955 and https://arxiv.org/pdf/2211.00490.pdf for more details.

  • target_lengths (Optional[Tensor]) – Used only when reduction is mean. It is a 1-D tensor of batch size representing lengths of the targets, e.g., number of phones or number of word pieces in a sentence.

Return type

Tensor

Returns

If reduction is none, return a 1-D tensor with size equal to batch size. If reduction is mean or sum, return a scalar.
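
A minimal sketch, assuming the graph builder k2.ctc_graph() and the functional form k2.ctc_loss() are available; the log-probs are random stand-ins for network output:

import torch
import k2

N, T, C = 2, 20, 10
log_probs = torch.randn(N, T, C).log_softmax(dim=-1)
# one segment per sequence: (sequence_index, start_frame, duration),
# sorted by decreasing duration as required by k2.intersect_dense
supervision_segments = torch.tensor([[0, 0, 20], [1, 0, 15]], dtype=torch.int32)
dense_fsa_vec = k2.DenseFsaVec(log_probs, supervision_segments)

decoding_graph = k2.ctc_graph([[1, 2, 3], [4, 5]])
loss = k2.ctc_loss(decoding_graph, dense_fsa_vec)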

DecodeStateInfo

DenseFsaVec

__init__

DenseFsaVec.__init__(log_probs, supervision_segments, allow_truncate=0)[source]

Construct a DenseFsaVec from neural net log-softmax outputs.

Parameters
  • log_probs (Tensor) – A 3-D tensor of dtype torch.float32 with shape (N, T, C), where N is the number of sequences, T the maximum input length, and C the number of output classes.

  • supervision_segments (Tensor) –

    A 2-D CPU tensor of dtype torch.int32 with 3 columns. Each row contains information for a supervision segment. Column 0 is the sequence_index indicating which sequence this segment comes from; column 1 specifies the start_frame of this segment within the sequence; column 2 contains the duration of this segment.

    Note

    • 0 < start_frame + duration <= T + allow_truncate

    • 0 <= start_frame < T

    • duration > 0

    Caution

    If the resulting dense fsa vec is used as an input to k2.intersect_dense, then the last column, i.e., the duration column, has to be sorted in decreasing order. That is, the first supervision_segment (the first row) has the largest duration. Otherwise, you don’t need to sort the last column.

    k2.intersect_dense is often used in the training stage, so you should usually sort dense fsa vecs by duration in training. k2.intersect_dense_pruned is usually used in the decoding stage, so you don’t need to sort dense fsa vecs in decoding.

  • allow_truncate (int) – If not zero, it truncates at most this number of frames from duration in case start_frame + duration > T.

_from_dense_fsa_vec

classmethod DenseFsaVec._from_dense_fsa_vec(dense_fsa_vec, scores)[source]

Construct a DenseFsaVec from _k2.DenseFsaVec and scores.

Note

It is intended for internal use. Users will normally not use it.

Parameters
  • dense_fsa_vec (DenseFsaVec) – An instance of _k2.DenseFsaVec.

  • scores (Tensor) – The scores of _k2.DenseFsaVec for back propagation.

Return type

DenseFsaVec

Returns

An instance of DenseFsaVec.

to

DenseFsaVec.to(device)[source]

Move the DenseFsaVec onto a given device.

Parameters

device (Union[device, str]) – An instance of torch.device or a string that can be used to construct a torch.device, e.g., ‘cpu’, ‘cuda:0’. It supports only cpu and cuda devices.

Return type

DenseFsaVec

Returns

Returns a new DenseFsaVec which is this object copied to the given device (or this object itself, if the device was the same).

device

DenseFsaVec.device
Return type

device

duration

DenseFsaVec.duration

Return the duration (on CPU) of each seq.

Return type

Tensor

DeterminizeWeightPushingType

name

DeterminizeWeightPushingType.name

value

DeterminizeWeightPushingType.value

Fsa

__getattr__

Fsa.__getattr__(name)[source]

Note: for attributes that exist as properties, e.g. self.labels, self.properties, self.requires_grad, we won’t reach this code because Python checks the class dict before calling __getattr__. The same is true for instance attributes such as self.{_tensor_attr,_non_tensor_attr,_cache,_properties}

The ‘virtual’ members of this class are those in self._tensor_attr and self._non_tensor_attr.

Return type

Any

__getitem__

Fsa.__getitem__(i)[source]

Get the i-th FSA.

Caution

self has to be an FsaVec, i.e. len(self.shape) == 3

Parameters

i (int) – The i-th FSA to select. 0 <= i < self.arcs.dim0().

Return type

Fsa

Returns

The i-th FSA. Note it is a single FSA.

__init__

Fsa.__init__(arcs, aux_labels=None, properties=None)[source]

Build an Fsa from a tensor with optional aux_labels.

It is useful when loading an Fsa from file.

Parameters
  • arcs (Union[Tensor, RaggedArc]) –

    When arcs is an instance of torch.Tensor, it is a torch tensor of dtype torch.int32 with 4 columns. Each row represents an arc. Column 0 is the src_state, column 1 the dest_state, column 2 the label, and column 3 the score. When arcs is an instance of _k2.RaggedArc, it is a Ragged containing _k2.Arc, returned by internal functions (i.e. C++/CUDA functions) or obtained from another Fsa object via fsa.arcs.

    Caution

    Scores are floats and their binary pattern is reinterpreted as integers and saved in a tensor of dtype torch.int32.

  • aux_labels (Union[Tensor, RaggedTensor, None]) – Optional. If not None, it associates an aux_label with every arc, so it has as many entries as there are arcs. It is a 1-D tensor of dtype torch.int32 or a k2.RaggedTensor whose dim0 equals the number of arcs.

  • properties – Tensor properties if known (should only be provided by internal code, as they are not checked; intended for use by clone())

Returns

An instance of Fsa.
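
A minimal sketch showing the score-bit reinterpretation described in the caution above:

import torch
import k2

# each row: src_state, dest_state, label, score-bits
arcs = torch.tensor([[0, 1, 2, 0],
                     [1, 2, -1, 0]], dtype=torch.int32)
scores = torch.tensor([0.5, 0.0], dtype=torch.float32)
arcs[:, 3] = scores.view(torch.int32)  # reinterpret float bits as int32
fsa = k2.Fsa(arcs)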

__setattr__

Fsa.__setattr__(name, value)[source]

Caution

We save a reference to value. If you need to change value afterwards, please consider passing a copy of it.

Parameters
  • name (str) – Name of the attribute.

  • value (Any) – Value of the attribute.

Return type

None

__str__

Fsa.__str__()[source]

Return a string representation of this object

For visualization and debug only.

Return type

str

_get_arc_post

Fsa._get_arc_post(use_double_scores, log_semiring)[source]

Compute scores on arcs, representing log probabilities; with log_semiring=True you could call these log posteriors, but if log_semiring=False they can only be interpreted as the difference between the best-path score and the score of the best path that includes this arc.

This version is not differentiable; see also get_arc_post().

Parameters
  • use_double_scores (bool) – if True, use double precision.

  • log_semiring (bool) – if True, use log semiring, else tropical.

Return type

Tensor

Returns

A torch.Tensor with shape equal to (num_arcs,) and non-positive elements.

_get_backward_scores

Fsa._get_backward_scores(use_double_scores, log_semiring)[source]

Compute backward-scores, i.e. total weight (or best-path weight) from each state to the final state.

For internal k2 use. Not differentiable.

See also get_backward_scores() which is differentiable.

Parameters
  • use_double_scores (bool) – True to use double precision floating point. False to use single precision.

  • log_semiring (bool) – True to use log semiring (log-sum), false to use tropical (i.e. max on scores).

Return type

Tensor

Returns

A torch.Tensor with shape equal to (num_states,)

_get_entering_arcs

Fsa._get_entering_arcs(use_double_scores)[source]

Compute, for each state, the index of the best arc entering it.

For internal k2 use.

Parameters

use_double_scores (bool) – True to use double precision floating point. False to use single precision.

Return type

Tensor

_get_forward_scores

Fsa._get_forward_scores(use_double_scores, log_semiring)[source]

Get (and compute if necessary) cached property self.forward_scores_xxx_yyy (where xxx indicates float-type and yyy indicates semiring).

For use by internal k2 code; returns the total score from start-state to each state. Not differentiable; see get_forward_scores() which is the differentiable version.

Parameters
  • use_double_scores (bool) – True to use double precision floating point. False to use single precision.

  • log_semiring (bool) – True to use log semiring (log-sum), false to use tropical (i.e. max on scores).

Return type

Tensor

_get_tot_scores

Fsa._get_tot_scores(use_double_scores, log_semiring)[source]

Compute total-scores (one per FSA) as the best-path score.

This version is not differentiable; see also get_tot_scores() which is differentiable.

Parameters
  • use_double_scores (bool) – If True, use double precision floating point; else single precision.

  • log_semiring (bool) – True to use log semiring (log-sum), false to use tropical (i.e. max on scores).

Return type

Tensor

_invalidate_cache_

Fsa._invalidate_cache_(scores_only=True)[source]

Intended for internal use only, so its name begins with an underscore.

Also, it changes self in-place.

Currently, it is used only when the scores field is re-assigned.

Parameters

scores_only (bool) – If True, it invalidates only cached entries related to scores. If False, the whole cache is invalidated.

Return type

None

as_dict

Fsa.as_dict()[source]

Convert this Fsa to a dict (probably for purposes of serialization, e.g., with torch.save).

Caution

self.requires_grad attribute is not saved.

Return type

Dict[str, Any]

Returns

A dict that can be used to reconstruct this FSA by using Fsa.from_dict.

convert_attr_to_ragged

Fsa.convert_attr_to_ragged_(name, remove_eps=True)[source]

Convert the attribute given by name from a 1-D torch.tensor to a k2.RaggedTensor.

Caution

This function ends with an underscore, meaning it changes the FSA in-place.

Parameters
  • name (str) – The attribute name. This attribute is expected to be a 1-D tensor with dtype torch.int32.

  • remove_eps (bool) – True to remove 0s in the resulting ragged tensor.

Return type

Fsa

Returns

Return self.

draw

Fsa.draw(filename, title=None)[source]

Render FSA as an image via graphviz, and return the Digraph object; and optionally save to file filename. filename must have a suffix that graphviz understands, such as pdf, svg or png.

Note

You need to install graphviz to use this function:

pip install graphviz
Parameters
  • filename (Optional[str]) – Filename to (optionally) save to, e.g. ‘foo.png’, ‘foo.svg’, ‘foo.pdf’ (must have a suffix that graphviz understands).

  • title (Optional[str]) – Title to be displayed in image, e.g. ‘A simple FSA example’

Return type

Digraph

from_openfst

classmethod Fsa.from_openfst(s, acceptor=None, num_aux_labels=None, aux_label_names=None, ragged_label_names=[])[source]

Create an Fsa from a string in OpenFST format (or a slightly more general format, if num_aux_labels > 1). See also from_str().

The given string s consists of lines with the following format:

src_state dest_state label [aux_label1 aux_label2...] [cost]

(the cost defaults to 0.0 if not present).

The line for the final state consists of two fields:

final_state [cost]

Note

Fields are separated by space(s), tab(s) or both. The cost field is a float, while other fields are integers.

There might be multiple final states. Also, OpenFst may omit the cost if it is 0.0.

Caution

We use cost here to indicate that its value will be negated so that we can get scores. That is, score = -1 * cost.

Note

At most one of acceptor, num_aux_labels, and aux_label_names must be supplied; if none are supplied, acceptor format is assumed.

Parameters
  • s (str) – The input string. Refer to the above comment for its format.

  • acceptor (Optional[bool]) – Set to true to denote acceptor format which is num_aux_labels == 0, or false to denote transducer format (i.e. num_aux_labels == 1 with name ‘aux_labels’).

  • num_aux_labels (Optional[int]) – The number of auxiliary labels to expect on each line (in addition to the ‘acceptor’ label); it is 1 for traditional transducers but can be any non-negative number.

  • aux_label_names (Optional[List[str]]) – If provided, the length of this list dictates the number of aux_labels. By default the names are ‘aux_labels’, ‘aux_labels2’, ‘aux_labels3’ and so on.

  • ragged_label_names (List[str]) – If provided, expect this number of ragged labels, in the order of this list. It is advisable that this list be in alphabetical order, so that the format when we write back to a string will be unchanged.

Return type

Fsa
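
A minimal example in OpenFST transducer format; note the cost 0.5 becomes score -0.5:

import k2

s = '''
0 1 2 3 0.5
1 2.0
'''
fsa = k2.Fsa.from_openfst(s, acceptor=False)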

from_str

classmethod Fsa.from_str(s, acceptor=None, num_aux_labels=None, aux_label_names=None, ragged_label_names=[], openfst=False)[source]

Create an Fsa from a string in the k2 or OpenFst format. (See also from_openfst()).

The given string s consists of lines with the following format:

src_state dest_state label [aux_label1 aux_label2...] [score]

The line for the final state consists of only one field:

final_state

Note

Fields are separated by space(s), tab(s) or both. The score field is a float, while other fields are integers.

Caution

The first column has to be non-decreasing.

Caution

The final state has the largest state number. There is ONLY ONE final state. All arcs that are connected to the final state have label -1. If there are aux_labels, they are also -1 for arcs entering the final state.

Note

At most one of acceptor, num_aux_labels, and aux_label_names must be supplied; if none are supplied, acceptor format is assumed.

Parameters
  • s (str) – The input string. Refer to the above comment for its format.

  • acceptor (Optional[bool]) – Set to true to denote acceptor format which is num_aux_labels == 0, or false to denote transducer format (i.e. num_aux_labels == 1 with name ‘aux_labels’).

  • num_aux_labels (Optional[int]) – The number of auxiliary labels to expect on each line (in addition to the ‘acceptor’ label); it is 1 for traditional transducers but can be any non-negative number. The names of the aux_labels default to ‘aux_labels’, then ‘aux_labels2’, ‘aux_labels3’ and so on.

  • aux_label_names (Optional[List[str]]) – If provided, the length of this list dictates the number of aux_labels and this list dictates their names.

  • ragged_label_names (List[str]) – If provided, expect this number of ragged labels, in the order of this list. It is advisable that this list be in alphabetical order, so that the format when we write back to a string will be unchanged.

  • openfst (bool) – If true, will expect the OpenFST format (costs not scores, i.e. negated; final-probs rather than final-state specified).

Return type

Fsa

get_arc_post

Fsa.get_arc_post(use_double_scores, log_semiring)[source]

Compute scores on arcs, representing log probabilities; with log_semiring=True you could call these log posteriors, but if log_semiring=False they can only be interpreted as the difference between the best-path score and the score of the best path that includes this arc. This version is differentiable; see also _get_arc_post().

Caution

Because of how the autograd mechanics works and the need to avoid circular references, this is not cached; it’s best to store it if you’ll need it multiple times.

Parameters
  • use_double_scores (bool) – if True, use double precision.

  • log_semiring (bool) – if True, use log semiring, else tropical.

Return type

Tensor

Returns

A torch.Tensor with shape equal to (num_arcs,) and non-positive elements.

get_backward_scores

Fsa.get_backward_scores(use_double_scores, log_semiring)[source]

Compute backward-scores, i.e. total weight (or best-path weight) from each state to the final state.

Supports autograd.

Parameters
  • use_double_scores (bool) – if True, use double precision.

  • log_semiring (bool) – if True, use log semiring, else tropical.

Return type

Tensor

Returns

A torch.Tensor with shape equal to (num_states,)

get_filler

Fsa.get_filler(attribute_name)[source]

Return the filler value associated with attribute names.

This is 0 unless otherwise specified, but you can override this by for example, doing:

fsa.foo_filler = -1

which will mean the “filler” for attribute fsa.foo is -1; and this will get propagated when you do FSA operations, like any other non-tensor attribute. The filler is the value that means “nothing is here” (like epsilon).

Caution

you should use a value that is castable to float and back to integer without loss of precision, because currently the default_value parameter of index_select in ./ops.py is a float.

Return type

Union[int, float]

get_forward_scores

Fsa.get_forward_scores(use_double_scores, log_semiring)[source]

Compute forward-scores, i.e. total weight (or best-path weight) from start state to each state.

Supports autograd.

Parameters
  • use_double_scores (bool) – if True, use double precision.

  • log_semiring (bool) – if True, use log semiring, else tropical.

Return type

Tensor

Returns

A torch.Tensor with shape equal to (num_states,)

get_tot_scores

Fsa.get_tot_scores(use_double_scores, log_semiring)[source]

Compute total-scores (one per FSA) as the best-path score.

This version is differentiable.

Parameters
  • use_double_scores (bool) – True to use double precision floating point; False to use single precision.

  • log_semiring (bool) – True to use log semiring (log-sum), False to use tropical (i.e. max on scores).

Return type

Tensor
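
Because this version is differentiable, the total score can be used directly as a training objective; a minimal sketch (assuming fsa holds scores that should receive gradients):

fsa.requires_grad_(True)
tot_scores = fsa.get_tot_scores(use_double_scores=True, log_semiring=True)
loss = -tot_scores.sum()
loss.backward()
print(fsa.grad)  # gradient w.r.t. fsa.scores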

invert

Fsa.invert()[source]

Swap the labels and aux_labels.

If there are symbol tables associated with labels and aux_labels, they are also swapped.

It is an error if the FSA contains no aux_labels.

Return type

Fsa

Returns

Return a new Fsa.
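
A short sketch (the transducer below maps label 2 to aux_label 10):

import k2

s = '''
0 1 2 10 0.5
1 2 -1 -1 0.0
2
'''
fsa = k2.Fsa.from_str(s, acceptor=False)
inverted = fsa.invert()
# labels and aux_labels are swapped in the returned Fsa
print(inverted.labels)      # now holds 10 and -1
print(inverted.aux_labels)  # now holds 2 and -1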

invert_

Fsa.invert_()[source]

Swap the labels and aux_labels.

If there are symbol tables associated with labels and aux_labels, they are also swapped.

It is an error if the FSA contains no aux_labels.

Caution

The function name ends with an underscore which means this is an in-place operation.

Return type

Fsa

Returns

Return self.

rename_tensor_attribute_

Fsa.rename_tensor_attribute_(src_name, dest_name)[source]

Rename a tensor attribute (or, as a special case, ‘labels’), and also rename non-tensor attributes that are associated with it, i.e. that have it as a prefix.

Parameters
  • src_name (str) – The original name; it must exist as a tensor attribute, e.g. ‘aux_labels’, or, as a special case, equal ‘labels’. The special attributes ‘labels’ and ‘scores’ are allowed but won’t be deleted.

  • dest_name (str) – The new name, that we are renaming it to. If it already existed as a tensor attribute, it will be rewritten; and any previously existing non-tensor attributes that have this as a prefix will be deleted. As a special case, may equal ‘labels’.

Return type

Fsa

Returns

Return self.

Note

It is OK if src_name and/or dest_name equals ‘labels’ or ‘scores’, but these special attributes won’t be deleted.

requires_grad_

Fsa.requires_grad_(requires_grad)[source]

Change whether autograd should record operations on this FSA:

Sets the requires_grad attribute of scores in-place.

Returns this FSA.

You can check whether autograd is enabled for this FSA by accessing its requires_grad property (handled in __getattr__()).

Caution

This is an in-place operation as you can see that the function name ends with _.

Parameters

requires_grad (bool) – If autograd should record operations on this FSA or not.

Return type

Fsa

Returns

This FSA itself.
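
A sketch:

fsa.requires_grad_(True)
assert fsa.requires_grad is True
assert fsa.scores.requires_grad is True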

set_scores_stochastic_

Fsa.set_scores_stochastic_(scores)[source]

Normalize the given scores and assign them to self.scores.

Parameters

scores – Tensor of scores of dtype torch.float32, with shape equal to self.scores.shape (one axis). It will be normalized so that, for each state with at least one arc leaving it, the exponentiated scores of the leaving arcs sum to 1.

Caution

The function name ends with an underscore, indicating that this function modifies self in-place.

Return type

None
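
A sketch (assuming fsa is a single Fsa; randn_like merely supplies arbitrary raw scores):

import torch

fsa.set_scores_stochastic_(torch.randn_like(fsa.scores))
# For every state with at least one leaving arc, the exponentiated
# scores of its leaving arcs now sum to 1.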

to

Fsa.to(device)[source]

Move the FSA onto a given device.

Parameters

device (Union[str, device]) – An instance of torch.device or a string that can be used to construct a torch.device, e.g., ‘cpu’, ‘cuda:0’. It supports only cpu and cuda devices.

Return type

Fsa

Returns

Returns a new Fsa which is this object copied to the given device (or this object itself, if the device was the same).
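
A sketch:

import torch

fsa_cpu = fsa.to('cpu')
if torch.cuda.is_available():
    fsa_gpu = fsa.to(torch.device('cuda', 0))
    assert fsa_gpu.device == torch.device('cuda', 0)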

device

Fsa.device
Return type

device

grad

Fsa.grad
Return type

Tensor

num_arcs

Fsa.num_arcs

Return the number of arcs in this Fsa.

Return type

int

properties

Fsa.properties
Return type

int

properties_str

Fsa.properties_str
Return type

str

requires_grad

Fsa.requires_grad
Return type

bool

shape

Fsa.shape

Returns: (num_states, None) if this is an Fsa; (num_fsas, None, None) if this is an FsaVec.

Return type

Tuple[int, …]

MWERLoss

forward

MWERLoss.forward(lattice, ref_texts, nbest_scale, num_paths)[source]

Compute the Minimum Word Error loss given a lattice and corresponding ref_texts.

Parameters
  • lattice (Fsa) – An FsaVec with axes [utt][state][arc].

  • ref_texts (Union[RaggedTensor, List[List[int]]]) –

    It can be one of the following types:
    • A list of list-of-integers, e.g., [ [1, 2], [1, 2, 3] ]

    • An instance of k2.RaggedTensor. Must have num_axes == 2 and with dtype torch.int32.

  • nbest_scale (float) – Scale lattice.score before passing it to k2.random_paths(). A smaller value leads to more unique paths, at the risk of failing to sample the path with the best score.

  • num_paths (int) – Number of paths to sample from the lattice using k2.random_paths().

Return type

Union[Tensor, RaggedTensor]

Returns

Minimum Word Error Rate loss.
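
A hedged usage sketch (assuming lattice is an FsaVec obtained from decoding and that the loss object can be constructed with its default arguments; all names here are illustrative):

loss_fn = k2.MWERLoss()
loss = loss_fn(lattice, ref_texts=[[1, 2], [1, 2, 3]],
               nbest_scale=0.5, num_paths=200)
loss.backward()  # assuming lattice.scores requires grad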

Nbest

from_lattice

static Nbest.from_lattice(lattice, num_paths, use_double_scores=True, nbest_scale=0.5)[source]

Construct an Nbest object by sampling num_paths from a lattice.

Each sampled path is a linear FSA.

We assume lattice.labels contains token IDs and lattice.aux_labels contains word IDs.

Parameters
  • lattice (Fsa) – An FsaVec with axes [utt][state][arc].

  • num_paths (int) – Number of paths to sample from the lattice using k2.random_paths().

  • use_double_scores (bool) – True to use double precision in k2.random_paths(). False to use single precision.

  • nbest_scale (float) – Scale lattice.score before passing it to k2.random_paths(). A smaller value leads to more unique paths, at the risk of failing to sample the path with the best score.

Return type

Nbest

Returns

Return an Nbest instance.
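
A sketch of a typical n-best pipeline built on this method (assuming lattice is an FsaVec with token-ID labels and word-ID aux_labels as stated above, and that Nbest is accessible as k2.Nbest):

nbest = k2.Nbest.from_lattice(lattice, num_paths=100,
                              use_double_scores=True, nbest_scale=0.5)
scores = nbest.total_scores()  # ragged, axes [utt][path_scores]
best = nbest.top_k(4)          # keep the 4 best paths per utterance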

intersect

Nbest.intersect(lats)[source]

Intersect this Nbest object with a lattice and get 1-best path from the resulting FsaVec.

Caution

We assume FSAs in self.fsa don’t have epsilon self-loops. We also assume self.fsa.labels and lats.labels are token IDs.

Parameters

lats (Fsa) – An FsaVec. It can be the return value of whole_lattice_rescoring().

Return type

Nbest

Returns

Return a new Nbest. This new Nbest shares the same shape with self, while its fsa is the 1-best path from intersecting self.fsa and lats.

top_k

Nbest.top_k(k)[source]

Get a subset of paths in the Nbest. The resulting Nbest is regular in that each sequence (i.e., utterance) has the same number of paths (k).

We select the top-k paths according to the total_scores of each path. If an utterance has fewer than k paths, then its last path, after sorting by tot_scores in descending order, is repeated so that each utterance has exactly k paths.

Parameters

k (int) – Number of paths in each utterance.

Return type

Nbest

Returns

Return a new Nbest with a regular shape.

total_scores

Nbest.total_scores()[source]

Get total scores of the FSAs in this Nbest.

Note

Since FSAs in Nbest are just linear FSAs, the log semiring and the tropical semiring produce the same total scores.

Return type

RaggedTensor

Returns

Return a ragged tensor with two axes [utt][path_scores].

OnlineDenseIntersecter

__init__

OnlineDenseIntersecter.__init__(decoding_graph, num_streams, search_beam, output_beam, min_active_states, max_active_states, allow_partial=True)[source]

Create a new online intersecter object.

Parameters
  • decoding_graph (Fsa) – The decoding graph used in this intersecter.

  • num_streams (int) – How many streams this intersecter can handle in parallel.

  • search_beam (float) – Decoding beam, e.g. 20. Smaller is faster, larger is more exact (less pruning). This is the default value; it may be modified by min_active_states and max_active_states.

  • output_beam (float) – Pruning beam for the output of intersection (vs. best path); equivalent to kaldi’s lattice-beam. E.g. 8.

  • min_active_states (int) – Minimum number of FSA states that are allowed to be active on any given frame for any given intersection/composition task. This is advisory, in that it will try not to have fewer than this number active. Set it to zero if there is no constraint.

  • max_active_states (int) – Maximum number of FSA states that are allowed to be active on any given frame for any given intersection/composition task. This is advisory, in that it will try not to exceed that but may not always succeed. You can use a very large number if no constraint is needed.

Examples
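
A hedged sketch of streaming use (decoding_graph and get_next_chunk() are illustrative placeholders; each chunk must be a DenseFsaVec):

intersecter = k2.OnlineDenseIntersecter(
    decoding_graph, num_streams=2, search_beam=20.0, output_beam=8.0,
    min_active_states=30, max_active_states=10000)

decode_states = [None, None]  # two new sequences, no decoding history yet
for dense_fsas in get_next_chunk():
    lattice, decode_states = intersecter.decode(dense_fsas, decode_states)
# lattice has axes (batch, state, arc) and holds the current output lattices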

decode

OnlineDenseIntersecter.decode(dense_fsas, decode_states)[source]

Do intersection/composition for the current chunk of nnet_output (given by a DenseFsaVec); the sequences in a chunk may come from different sources.

Parameters
  • dense_fsas (DenseFsaVec) – The neural-net output, with each frame containing the log-likes of each modeling unit.

  • decode_states (List[DecodeStateInfo]) – A list of history decoding states for the current batch of sequences; its length equals dense_fsas.dim0() (i.e. the batch size). Each element in decode_states belongs to the sequence at the corresponding position in the current batch. For a new sequence (i.e. one with no history states), put None at the corresponding position.

Return type

Tuple[Fsa, List[DecodeStateInfo]]

Returns

Return a tuple containing an Fsa and a list of new decoding states. The Fsa, which has 3 axes (i.e. (batch, state, arc)), contains the output lattices. See the example in the constructor for more info about how to use the list of new decoding states.

num_streams

OnlineDenseIntersecter.num_streams
Return type

int

RaggedShape

__eq__

RaggedShape.__eq__(self: _k2.ragged.RaggedShape, other: _k2.ragged.RaggedShape) bool

Return True if two shapes are equal. Otherwise, return False.

Caution

The two shapes have to be on the same device. Otherwise, it throws an exception.

>>> import k2.ragged as k2r
>>> shape1 = k2r.RaggedShape('[ [] [x] ]')
>>> shape2 = k2r.RaggedShape('[ [x] [x] ]')
>>> shape3 = k2r.RaggedShape('[ [x] [x] ]')
>>> shape1 == shape2
False
>>> shape3 == shape2
True
Parameters

other – The shape that we want to compare with self.

Returns

Return True if the two shapes are the same. Return False otherwise.

__getitem__

RaggedShape.__getitem__(self: _k2.ragged.RaggedShape, i: int) _k2.ragged.RaggedShape

Select the i-th sublist along axis 0.

Note

It requires that this shape has at least 3 axes.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [[x] [x x]] [[x x x] [] [x x]] ]')
>>> shape[0]
[ [ x ] [ x x ] ]
>>> shape[1]
[ [ x x x ] [ ] [ x x ] ]
Parameters

i – The i-th sublist along axis 0.

Returns

Return a new ragged shape with one fewer axis.

__init__

RaggedShape.__init__(self: _k2.ragged.RaggedShape, s: str) None

Construct a ragged shape from a string.

An example string for a ragged shape with 2 axes is:

[ [x x] [ ] [x] ]

An example string for a ragged shape with 3 axes is:

[ [[x] []] [[x] [x x]] ]
>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [x] [] [x x] ]')
>>> shape
[ [ x ] [ ] [ x x ] ]
>>> shape2 = k2r.RaggedShape('[ [[x] [] [x x]] [[]] ]')
>>> shape2
[ [ [ x ] [ ] [ x x ] ] [ [ ] ] ]

__ne__

RaggedShape.__ne__(self: _k2.ragged.RaggedShape, other: _k2.ragged.RaggedShape) bool

Return True if two shapes are not equal. Otherwise, return False.

Caution

The two shapes have to be on the same device. Otherwise, it throws an exception.

>>> import k2.ragged as k2r
>>> shape1 = k2r.RaggedShape('[ [] [x] ]')
>>> shape2 = k2r.RaggedShape('[ [x] [x] ]')
>>> shape3 = k2r.RaggedShape('[ [x] [x] ]')
>>> shape1 != shape2
True
>>> shape2 != shape3
False
Parameters

other – The shape that we want to compare with self.

Returns

Return True if the two shapes are not equal. Return False otherwise.

__repr__

RaggedShape.__repr__(self: _k2.ragged.RaggedShape) str

Return a string representation of this shape.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [x] [] [x x ] ]')
>>> print(shape)
[ [ x ] [ ] [ x x ] ]
>>> shape
[ [ x ] [ ] [ x x ] ]

__str__

RaggedShape.__str__(self: _k2.ragged.RaggedShape) str

Return a string representation of this shape.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [x] [] [x x ] ]')
>>> print(shape)
[ [ x ] [ ] [ x x ] ]
>>> shape
[ [ x ] [ ] [ x x ] ]

compose

RaggedShape.compose(self: _k2.ragged.RaggedShape, other: _k2.ragged.RaggedShape) _k2.ragged.RaggedShape

Compose self with a given shape.

Caution

other and self MUST be on the same device.

Hint

In order to compose self with other, it has to satisfy self.tot_size(self.num_axes - 1) == other.dim0

Example 1:

>>> import k2.ragged as k2r
>>> shape1 = k2r.RaggedShape('[ [x x] [x] ]')
>>> shape2 = k2r.RaggedShape('[ [x x x] [x x] [] ]')
>>> shape1.compose(shape2)
[ [ [ x x x ] [ x x ] ] [ [ ] ] ]

Example 2:

>>> import k2.ragged as k2r
>>> shape1 = k2r.RaggedShape('[ [[x x] [x x x] []] [[x] [x x x x]] ]')
>>> shape2 = k2r.RaggedShape('[ [x] [x x x] [] [] [x x] [x] [] [x x x x] [] [x x] ]')
>>> shape1.compose(shape2)
[ [ [ [ x ] [ x x x ] ] [ [ ] [ ] [ x x ] ] [ ] ] [ [ [ x ] ] [ [ ] [ x x x x ] [ ] [ x x ] ] ] ]
>>> shape1.tot_size(shape1.num_axes - 1)
10
>>> shape2.dim0
10
Parameters

other – The other shape that is to be composed with self.

Returns

Return a composed ragged shape.

get_layer

RaggedShape.get_layer(self: _k2.ragged.RaggedShape, arg0: int) _k2.ragged.RaggedShape

Returns a sub-shape of self.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [[x x] [x] []] [[] [x x x] [x]] [[]] ]')
>>> shape.get_layer(0)
[ [ x x x ] [ x x x ] [ x ] ]
>>> shape.get_layer(1)
[ [ x x ] [ x ] [ ] [ ] [ x x x ] [ x ] [ ] ]
Parameters

layer – Layer that is desired, from 0 .. src.num_axes - 2 (inclusive).

Returns

This returned shape will have num_axes == 2, the minimal case of a RaggedShape.

index

RaggedShape.index(self: _k2.ragged.RaggedShape, axis: int, indexes: torch.Tensor, need_value_indexes: bool = True) Tuple[_k2.ragged.RaggedShape, Optional[torch.Tensor]]

Indexing operation on a ragged shape, returns self[indexes], where elements of indexes are interpreted as indexes into axis axis of self.

Caution

indexes is a 1-D tensor and indexes.dtype == torch.int32.

Example 1:

>>> shape = k2r.RaggedShape('[ [x x] [x] [x x x] ]')
>>> value = torch.arange(6, dtype=torch.float32) * 10
>>> ragged = k2r.RaggedTensor(shape, value)
>>> ragged
[ [ 0 10 ] [ 20 ] [ 30 40 50 ] ]
>>> i = torch.tensor([0, 2, 1], dtype=torch.int32)
>>> sub_shape, value_indexes = shape.index(axis=0, indexes=i, need_value_indexes=True)
>>> sub_shape
[ [ x x ] [ x x x ] [ x ] ]
>>> value_indexes
tensor([0, 1, 3, 4, 5, 2], dtype=torch.int32)
>>> ragged.data[value_indexes.long()]
tensor([ 0., 10., 30., 40., 50., 20.])
>>> k = torch.tensor([0, -1, 1, 0, 2, -1], dtype=torch.int32)
>>> sub_shape2, value_indexes2 = shape.index(axis=0, indexes=k, need_value_indexes=True)
>>> sub_shape2
[ [ x x ] [ ] [ x ] [ x x ] [ x x x ] [ ] ]
>>> value_indexes2
tensor([0, 1, 2, 0, 1, 3, 4, 5], dtype=torch.int32)

Example 2:

>>> import torch
>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [[x x] [x]] [[] [x x x] [x]] [[x] [] [] [x x]] ]')
>>> i = torch.tensor([0, 1, 3, 5, 7, 8], dtype=torch.int32)
>>> shape.index(axis=1, indexes=i)
([ [ [ x x ] [ x ] ] [ [ x x x ] ] [ [ x ] [ ] [ x x ] ] ], tensor([0, 1, 2, 3, 4, 5, 7, 8, 9], dtype=torch.int32))
Parameters
  • axis – The axis to be indexed. Must satisfy 0 <= axis < self.num_axes.

  • indexes – Array of indexes, which will be interpreted as indexes into axis axis of self, i.e. with 0 <= indexes[i] < self.tot_size(axis). Note that if axis is 0, then -1 is also a valid entry in indexes, in which case an empty list is returned.

  • need_value_indexes

    If True, it will return a torch.Tensor containing the indexes into ragged_tensor.data that ans.data has, as in ans.data = ragged_tensor.data[value_indexes], where ragged_tensor uses self as its shape.

    Caution

    It is currently not allowed to change the order on axes less than axis, i.e. if axis > 0, we require: IsMonotonic(self.row_ids(axis)[indexes]).

Returns

Return an indexed ragged shape.

max_size

RaggedShape.max_size(self: _k2.ragged.RaggedShape, axis: int) int

Return the maximum number of elements of any sublist at the given axis.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [] [x] [x x] [x x x] [x x x x] ]')
>>> shape.max_size(1)
4
>>> shape = k2r.RaggedShape('[ [[x x] [x] [] [] []] [[x]] [[x x x x]] ]')
>>> shape.max_size(1)
5
>>> shape.max_size(2)
4
Parameters

axis – Compute the max size of this axis.

Caution

axis has to be greater than 0.

Returns

Return the maximum number of elements of sublists at the given axis.

numel

RaggedShape.numel(self: _k2.ragged.RaggedShape) int

Return the number of elements in this shape.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [x] [] [x x x x x]]')
>>> shape.numel()
6
>>> shape2 = k2r.RaggedShape('[ [[x x] [x] [] [] []] [[x]] [[x x x x]] ]')
>>> shape2.numel()
8
>>> shape3 = k2r.RaggedShape('[ [x x x] [x] ]')
>>> shape3.numel()
4
Returns

Return the number of elements in this shape.

Hint

It’s the number of x’s.

regular_ragged_shape

static RaggedShape.regular_ragged_shape(dim0: int, dim1: int) _k2.ragged.RaggedShape

Create a ragged shape with 2 axes that has a regular structure.

>>> import k2.ragged as k2r
>>> shape1 = k2r.RaggedShape.regular_ragged_shape(dim0=2, dim1=3)
>>> shape1
[ [ x x x ] [ x x x ] ]
>>> shape2 = k2r.regular_ragged_shape(dim0=3, dim1=2)
>>> shape2
[ [ x x ] [ x x ] [ x x ] ]
Parameters
  • dim0 – Number of entries at axis 0.

  • dim1 – Number of entries in each sublist at axis 1.

Returns

Return a ragged shape on CPU.

remove_axis

RaggedShape.remove_axis(self: _k2.ragged.RaggedShape, axis: int) _k2.ragged.RaggedShape

Remove a certain axis.

Caution

self.num_axes MUST be greater than 2.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [[x] [] [x x]] [[x x x] [x x x x]] [[] [] []]]')
>>> shape.remove_axis(0)
[ [ x ] [ ] [ x x ] [ x x x ] [ x x x x ] [ ] [ ] [ ] ]
>>> shape.remove_axis(1)
[ [ x x x ] [ x x x x x x x ] [ ] ]
Parameters

axis – The axis to be removed.

Returns

Return a ragged shape with one fewer axis.

row_ids

RaggedShape.row_ids(self: _k2.ragged.RaggedShape, axis: int) torch.Tensor

Return the row ids of a certain axis.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [x x] [] [x x x] ]')
>>> shape.row_ids(1)
tensor([0, 0, 2, 2, 2], dtype=torch.int32)
>>> shape2 = k2r.RaggedShape('[ [[x] [] [x x]] [[x x x] [x] [x x x x] [] []] ]')
>>> shape2.row_ids(1)
tensor([0, 0, 0, 1, 1, 1, 1, 1], dtype=torch.int32)
>>> shape2.row_ids(2)
tensor([0, 2, 2, 3, 3, 3, 4, 5, 5, 5, 5], dtype=torch.int32)
Parameters

axis – The axis whose row ids are to be returned.

Hint

axis >= 1.

Returns

Return the row ids of the given axis.

row_splits

RaggedShape.row_splits(self: _k2.ragged.RaggedShape, axis: int) torch.Tensor

Return the row splits of a certain axis.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [x x] [] [x x x] ]')
>>> shape.row_splits(1)
tensor([0, 2, 2, 5], dtype=torch.int32)
>>> shape2 = k2r.RaggedShape('[ [[x] [] [x x]] [[x x x] [x] [x x x x] [] []] ]')
>>> shape2.row_splits(1)
tensor([0, 3, 8], dtype=torch.int32)
>>> shape2.row_splits(2)
tensor([ 0,  1,  1,  3,  6,  7, 11, 11, 11], dtype=torch.int32)
Parameters

axis – The axis whose row splits are to be returned.

Hint

axis >= 1.

Returns

Return the row splits of the given axis.

to

RaggedShape.to(self: _k2.ragged.RaggedShape, device: object) _k2.ragged.RaggedShape

Move this shape to the specified device.

Hint

If the shape is already on the specified device, the returned shape shares the underlying memory with self.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[[x]]')
>>> shape.device
device(type='cpu')
>>> import torch
>>> shape2 = shape.to(torch.device('cuda', 0))
>>> shape2.device
device(type='cuda', index=0)
>>> shape
[ [ x ] ]
>>> shape2
[ [ x ] ]
Parameters

device – An instance of torch.device. It can be either a CPU device or a CUDA device.

Returns

Return a shape on the given device.

tot_size

RaggedShape.tot_size(self: _k2.ragged.RaggedShape, axis: int) int

Return the number of elements at a certain axis.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [x] [x x] [x x x] []]')
>>> shape.tot_size(1)
6
>>> shape.numel()
6
>>> shape2 = k2r.RaggedShape('[ [[x]] [[x x]] [[x x x]] [[]] [[]] [[]] [[]] ]')
>>> shape2.tot_size(1)
7
>>> shape2 = k2r.RaggedShape('[ [[x]] [[x x]] [[x x x]] [[]] [[]] [[]] [[] []] ]')
>>> shape2.tot_size(1)
8
>>> shape2.tot_size(2)
6
>>> shape2.numel()
6
Parameters

axis – Return the number of elements for this axis.

Returns

Return the number of elements at axis.

tot_sizes

RaggedShape.tot_sizes(self: _k2.ragged.RaggedShape) tuple

Return total sizes of every axis in a tuple.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [x] [ ] [x x x x]]')
>>> shape.dim0
3
>>> shape.tot_size(1)
5
>>> shape.tot_sizes()
(3, 5)
>>> shape2 = k2r.RaggedShape('[ [[x] []] [[x x x x]]]')
>>> shape2.dim0
2
>>> shape2.tot_size(1)
3
>>> shape2.tot_size(2)
5
>>> shape2.tot_sizes()
(2, 3, 5)
Returns

Return a tuple containing the total sizes of each axis. ans[i] is the total size of axis i (for i > 0). For i=0, it is the dim0 of this shape.

device

RaggedShape.device

Return the device of this shape.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[[]]')
>>> shape.device
device(type='cpu')
>>> import torch
>>> shape2 = shape.to(torch.device('cuda', 0))
>>> shape2.device
device(type='cuda', index=0)

dim0

RaggedShape.dim0

Return number of sublists at axis 0.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [x] [] [x x x x x]]')
>>> shape.dim0
3
>>> shape2 = k2r.RaggedShape('[ [[x] []] [[]] [[x] [x x] [x x x]] [[]]]')
>>> shape2.dim0
4

num_axes

RaggedShape.num_axes

Return the number of axes of this shape.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[[] []]')
>>> shape.num_axes
2
>>> shape2 = k2r.RaggedShape('[ [[]] [[]]]')
>>> shape2.num_axes
3

RaggedTensor

__eq__

RaggedTensor.__eq__(self: _k2.ragged.RaggedTensor, other: _k2.ragged.RaggedTensor) bool

Compare two ragged tensors.

Caution

The two tensors MUST have the same dtype. Otherwise, it throws an exception.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1]])
>>> b = a.clone()
>>> a ==  b
True
>>> c = a.to(torch.float32)
>>> try:
...   c == b
... except RuntimeError:
...   print("raised exception")
raised exception
Parameters

other – The tensor to be compared.

Returns

Return True if the two tensors are equal. Return False otherwise.

__getitem__

RaggedTensor.__getitem__(*args, **kwargs)

Overloaded function.

  1. __getitem__(self: _k2.ragged.RaggedTensor, i: int) -> object

Select the i-th sublist along axis 0.

Caution

Support for autograd is to be implemented.

Example 1:

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor('[ [[1 3] [] [9]]  [[8]] ]')
>>> a
RaggedTensor([[[1, 3],
               [],
               [9]],
              [[8]]], dtype=torch.int32)
>>> a[0]
RaggedTensor([[1, 3],
              [],
              [9]], dtype=torch.int32)
>>> a[1]
RaggedTensor([[8]], dtype=torch.int32)

Example 2:

>>> a = k2r.RaggedTensor('[ [1 3] [9] [8] ]')
>>> a
RaggedTensor([[1, 3],
              [9],
              [8]], dtype=torch.int32)
>>> a[0]
tensor([1, 3], dtype=torch.int32)
>>> a[1]
tensor([9], dtype=torch.int32)
Parameters

i – The i-th sublist along axis 0.

Returns

Return a new ragged tensor with one fewer axis. If num_axes == 2, the return value will be a 1D tensor.

  2. __getitem__(self: _k2.ragged.RaggedTensor, key: slice) -> _k2.ragged.RaggedTensor

Slices sublists along axis 0 with the given range. Only support slicing step equals to 1.

Caution

Support for autograd is to be implemented.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor('[ [[1 3] [] [9]]  [[8]] [[10 11]] ]')
>>> a
RaggedTensor([[[1, 3],
               [],
               [9]],
              [[8]],
              [[10, 11]]], dtype=torch.int32)
>>> a[0:2]
RaggedTensor([[[1, 3],
               [],
               [9]],
              [[8]]], dtype=torch.int32)
>>> a[1:2]
RaggedTensor([[[8]]], dtype=torch.int32)
Parameters

key – Slice containing integer constants.

Returns

Return a new ragged tensor with the same axes as original ragged tensor, but only contains the sublists within the range.

  3. __getitem__(self: _k2.ragged.RaggedTensor, key: torch.Tensor) -> _k2.ragged.RaggedTensor

Slice a ragged tensor along axis 0 using a 1-D torch.int32 tensor.

Example 1:

>>> import k2
>>> a = k2.RaggedTensor([[1, 2, 0], [0, 1], [2, 3]])
>>> b = k2.RaggedTensor([[10, 20], [300], [-10, 0, -1], [-2, 4, 5]])
>>> a[0]
tensor([1, 2, 0], dtype=torch.int32)
>>> b[a[0]]
RaggedTensor([[300],
              [-10, 0, -1],
              [10, 20]], dtype=torch.int32)
>>> a[1]
tensor([0, 1], dtype=torch.int32)
>>> b[a[1]]
RaggedTensor([[10, 20],
              [300]], dtype=torch.int32)
>>> a[2]
tensor([2, 3], dtype=torch.int32)
>>> b[a[2]]
RaggedTensor([[-10, 0, -1],
              [-2, 4, 5]], dtype=torch.int32)

Example 2:

>>> import torch
>>> import k2
>>> a = k2.RaggedTensor([ [[1], [2, 3], [0]], [[], [2]], [[10, 20]] ])
>>> i = torch.tensor([0, 2, 1, 0], dtype=torch.int32)
>>> a[i]
RaggedTensor([[[1],
               [2, 3],
               [0]],
              [[10, 20]],
              [[],
               [2]],
              [[1],
               [2, 3],
               [0]]], dtype=torch.int32)
Parameters

key – A 1-D torch.int32 tensor containing the indexes to select along axis 0.

Returns

Return a new ragged tensor with the same number of axes as self but only contains the specified sublists.

__getstate__

RaggedTensor.__getstate__(self: k2.RaggedTensor) tuple

Requires a tensor with 2 axes or 3 axes. Other numbers of axes are not implemented yet.

This method is to support pickle, e.g., used by torch.save(). You are not expected to call it by yourself.

Returns

If this tensor has 2 axes, return a tuple containing (self.row_splits(1), “row_ids1”, self.values). If this tensor has 3 axes, return a tuple containing (self.row_splits(1), “row_ids1”, self.row_splits(2), “row_ids2”, self.values)

Note

“row_ids1” and “row_ids2” in the returned value are for backward compatibility.

__init__

RaggedTensor.__init__(*args, **kwargs)

Overloaded function.

  1. __init__(self: _k2.ragged.RaggedTensor, data: list, dtype: object = None, device: object = ‘cpu’) -> None

Create a ragged tensor with arbitrary number of axes.

Note

A ragged tensor has at least two axes.

Example 1:

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([ [1, 2], [5], [], [9] ])
>>> a
RaggedTensor([[1, 2],
              [5],
              [],
              [9]], dtype=torch.int32)
>>> a.dtype
torch.int32
>>> b = k2r.RaggedTensor([ [1, 3.0], [] ])
>>> b
RaggedTensor([[1, 3],
              []], dtype=torch.float32)
>>> b.dtype
torch.float32
>>> c = k2r.RaggedTensor([ [1] ], dtype=torch.float64)
>>> c
RaggedTensor([[1]], dtype=torch.float64)
>>> c.dtype
torch.float64
>>> d = k2r.RaggedTensor([ [[1], [2, 3]], [[4], []] ])
>>> d
RaggedTensor([[[1],
               [2, 3]],
              [[4],
               []]], dtype=torch.int32)
>>> d.num_axes
3
>>> e = k2r.RaggedTensor([])
>>> e
RaggedTensor([], dtype=torch.int32)
>>> e.num_axes
2
>>> e.shape.row_splits(1)
tensor([0], dtype=torch.int32)
>>> e.shape.row_ids(1)
tensor([], dtype=torch.int32)

Example 2:

>>> k2r.RaggedTensor([ [[1, 2]], [], [[]] ])
RaggedTensor([[[1, 2]],
              [],
              [[]]], dtype=torch.int32)
>>> k2r.RaggedTensor([ [[1, 2]], [], [[]] ], device='cuda:0')
RaggedTensor([[[1, 2]],
              [],
              [[]]], device='cuda:0', dtype=torch.int32)
Parameters
  • data – A list of sublist(s) of integers or real numbers. It can have an arbitrary number of axes (at least two).

  • dtype – Optional. If None, it infers the dtype from data automatically, which is either torch.int32 or torch.float32. Supported dtypes are: torch.int32, torch.float32, and torch.float64.

  • device – It can be either an instance of torch.device or a string representing a torch device. Example values are: "cpu", "cuda:0", torch.device("cpu"), torch.device("cuda", 0).

  2. __init__(self: _k2.ragged.RaggedTensor, data: list, dtype: object = None, device: str = ‘cpu’) -> None

Create a ragged tensor with arbitrary number of axes.

Note

A ragged tensor has at least two axes.

Example 1:

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([ [1, 2], [5], [], [9] ])
>>> a
RaggedTensor([[1, 2],
              [5],
              [],
              [9]], dtype=torch.int32)
>>> a.dtype
torch.int32
>>> b = k2r.RaggedTensor([ [1, 3.0], [] ])
>>> b
RaggedTensor([[1, 3],
              []], dtype=torch.float32)
>>> b.dtype
torch.float32
>>> c = k2r.RaggedTensor([ [1] ], dtype=torch.float64)
>>> c
RaggedTensor([[1]], dtype=torch.float64)
>>> c.dtype
torch.float64
>>> d = k2r.RaggedTensor([ [[1], [2, 3]], [[4], []] ])
>>> d
RaggedTensor([[[1],
               [2, 3]],
              [[4],
               []]], dtype=torch.int32)
>>> d.num_axes
3
>>> e = k2r.RaggedTensor([])
>>> e
RaggedTensor([], dtype=torch.int32)
>>> e.num_axes
2
>>> e.shape.row_splits(1)
tensor([0], dtype=torch.int32)
>>> e.shape.row_ids(1)
tensor([], dtype=torch.int32)

Example 2:

>>> k2r.RaggedTensor([ [[1, 2]], [], [[]] ])
RaggedTensor([[[1, 2]],
              [],
              [[]]], dtype=torch.int32)
>>> k2r.RaggedTensor([ [[1, 2]], [], [[]] ], device='cuda:0')
RaggedTensor([[[1, 2]],
              [],
              [[]]], device='cuda:0', dtype=torch.int32)
Parameters
  • data – A list of sublist(s) of integers or real numbers. It can have an arbitrary number of axes (at least two).

  • dtype – Optional. If None, it infers the dtype from data automatically, which is either torch.int32 or torch.float32. Supported dtypes are: torch.int32, torch.float32, and torch.float64.

  • device – It can be either an instance of torch.device or a string representing a torch device. Example values are: "cpu", "cuda:0", torch.device("cpu"), torch.device("cuda", 0).

  3. __init__(self: _k2.ragged.RaggedTensor, s: str, dtype: object = None, device: object = ‘cpu’) -> None

Create a ragged tensor from its string representation.

Fields are separated by space(s) or comma(s).

An example string for a 2-axis ragged tensor is given below:

[ [1] [2] [3, 4], [5 6 7, 8] ]

An example string for a 3-axis ragged tensor is given below:

[ [[1]] [[]] ]
>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor('[ [1] [] [3 4] ]')
>>> a
RaggedTensor([[1],
              [],
              [3, 4]], dtype=torch.int32)
>>> a.num_axes
2
>>> a.dtype
torch.int32
>>> b = k2r.RaggedTensor('[ [[] [3]]  [[10]] ]', dtype=torch.float32)
>>> b
RaggedTensor([[[],
               [3]],
              [[10]]], dtype=torch.float32)
>>> b.dtype
torch.float32
>>> b.num_axes
3
>>> c = k2r.RaggedTensor('[[1.]]')
>>> c.dtype
torch.float32
>>> d = k2r.RaggedTensor('[[1.]]', device='cuda:0')
>>> d
RaggedTensor([[1]], device='cuda:0', dtype=torch.float32)

Note

The number of spaces or commas in s does not affect the result, though numbers have to be separated by at least one space or comma.

Parameters
  • s – A string representation of a ragged tensor.

  • dtype – The desired dtype of the tensor. If it is None, it tries to infer the correct dtype from s, which is assumed to be either torch.int32 or torch.float32. Supported dtypes are: torch.int32, torch.float32, and torch.float64.

  • device – It can be either an instance of torch.device or a string representing a torch device. Example values are: "cpu", "cuda:0", torch.device("cpu"), torch.device("cuda", 0).

  4. __init__(self: _k2.ragged.RaggedTensor, s: str, dtype: object = None, device: str = ‘cpu’) -> None

Create a ragged tensor from its string representation.

Fields are separated by space(s) or comma(s).

An example string for a 2-axis ragged tensor is given below:

[ [1] [2] [3, 4], [5 6 7, 8] ]

An example string for a 3-axis ragged tensor is given below:

[ [[1]] [[]] ]
>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor('[ [1] [] [3 4] ]')
>>> a
RaggedTensor([[1],
              [],
              [3, 4]], dtype=torch.int32)
>>> a.num_axes
2
>>> a.dtype
torch.int32
>>> b = k2r.RaggedTensor('[ [[] [3]]  [[10]] ]', dtype=torch.float32)
>>> b
RaggedTensor([[[],
               [3]],
              [[10]]], dtype=torch.float32)
>>> b.dtype
torch.float32
>>> b.num_axes
3
>>> c = k2r.RaggedTensor('[[1.]]')
>>> c.dtype
torch.float32
>>> d = k2r.RaggedTensor('[[1.]]', device='cuda:0')
>>> d
RaggedTensor([[1]], device='cuda:0', dtype=torch.float32)

Note

The number of spaces or commas in s does not affect the result, though numbers have to be separated by at least one space or comma.

Parameters
  • s – A string representation of a ragged tensor.

  • dtype – The desired dtype of the tensor. If it is None, it tries to infer the correct dtype from s, which is assumed to be either torch.int32 or torch.float32. Supported dtypes are: torch.int32, torch.float32, and torch.float64.

  • device – It can be either an instance of torch.device or a string representing a torch device. Example values are: "cpu", "cuda:0", torch.device("cpu"), torch.device("cuda", 0).

  5. __init__(self: _k2.ragged.RaggedTensor, shape: _k2.ragged.RaggedShape, value: torch.Tensor) -> None

Create a ragged tensor from a shape and a value.

>>> import torch
>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [x x] [] [x x x] ]')
>>> value = torch.tensor([10, 0, 20, 30, 40], dtype=torch.float32)
>>> ragged = k2r.RaggedTensor(shape, value)
>>> ragged
RaggedTensor([[10, 0],
              [],
              [20, 30, 40]], dtype=torch.float32)
Parameters
  • shape – The shape of the tensor.

  • value – The value of the tensor.

  6. __init__(self: _k2.ragged.RaggedTensor, tensor: torch.Tensor) -> None

Create a ragged tensor from a torch tensor.

Note

It turns a regular tensor into a ragged tensor.

Caution

The input tensor has to have more than 1 dimension. That is tensor.ndim > 1.

Also, if the input tensor is contiguous, self will share the underlying memory with it. Otherwise, memory of the input tensor is copied to create self.

Supported dtypes of the input tensor are: torch.int32, torch.float32, and torch.float64.

Example 1:

>>> import torch
>>> import k2.ragged as k2r
>>> a = torch.arange(6, dtype=torch.int32).reshape(2, 3)
>>> b = k2r.RaggedTensor(a)
>>> a
tensor([[0, 1, 2],
        [3, 4, 5]], dtype=torch.int32)
>>> b
RaggedTensor([[0, 1, 2],
              [3, 4, 5]], dtype=torch.int32)
>>> a.is_contiguous()
True
>>> a[0, 0] = 10
>>> b
RaggedTensor([[10, 1, 2],
              [3, 4, 5]], dtype=torch.int32)
>>> b.values[1] = -2
>>> a
tensor([[10, -2,  2],
        [ 3,  4,  5]], dtype=torch.int32)

Example 2:

>>> import k2.ragged as k2r
>>> a = torch.arange(24, dtype=torch.int32).reshape(2, 12)[:, ::4]
>>> a
tensor([[ 0,  4,  8],
        [12, 16, 20]], dtype=torch.int32)
>>> a.is_contiguous()
False
>>> b = k2r.RaggedTensor(a)
>>> b
RaggedTensor([[0, 4, 8],
              [12, 16, 20]], dtype=torch.int32)
>>> a[0, 0] = 10
>>> b
RaggedTensor([[0, 4, 8],
              [12, 16, 20]], dtype=torch.int32)
>>> a
tensor([[10,  4,  8],
        [12, 16, 20]], dtype=torch.int32)

Example 3:

>>> import torch
>>> import k2.ragged as k2r
>>> a = torch.arange(24, dtype=torch.float32).reshape(2, 3, 4)
>>> a
tensor([[[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]],
        [[12., 13., 14., 15.],
         [16., 17., 18., 19.],
         [20., 21., 22., 23.]]])
>>> b = k2r.RaggedTensor(a)
>>> b
RaggedTensor([[[0, 1, 2, 3],
               [4, 5, 6, 7],
               [8, 9, 10, 11]],
              [[12, 13, 14, 15],
               [16, 17, 18, 19],
               [20, 21, 22, 23]]], dtype=torch.float32)
>>> b.dtype
torch.float32
>>> c = torch.tensor([[1, 2]], device='cuda:0', dtype=torch.float32)
>>> k2r.RaggedTensor(c)
RaggedTensor([[1, 2]], device='cuda:0', dtype=torch.float32)
Parameters

tensor – An N-D (N > 1) tensor.

__ne__

RaggedTensor.__ne__(self: _k2.ragged.RaggedTensor, other: _k2.ragged.RaggedTensor) bool

Compare two ragged tensors.

Caution

The two tensors MUST have the same dtype. Otherwise, it throws an exception.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1, 2], [3]])
>>> b = a.clone()
>>> b != a
False
>>> c = k2r.RaggedTensor([[1], [2], [3]])
>>> c != a
True
Parameters

other – The tensor to be compared.

Returns

Return True if the two tensors are NOT equal. Return False otherwise.

__repr__

RaggedTensor.__repr__(self: _k2.ragged.RaggedTensor) str

Return a string representation of this tensor.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1], [2, 3], []])
>>> a
RaggedTensor([[1],
              [2, 3],
              []], dtype=torch.int32)
>>> str(a)
'RaggedTensor([[1],\n              [2, 3],\n              []], dtype=torch.int32)'
>>> b = k2r.RaggedTensor([[1, 2]], device='cuda:0')
>>> b
RaggedTensor([[1, 2]], device='cuda:0', dtype=torch.int32)

__setstate__

RaggedTensor.__setstate__(self: k2.RaggedTensor, arg0: tuple) None

Set the content of this class from arg0.

This method is to support pickle, e.g., used by torch.load(). You are not expected to call it by yourself.

Parameters

arg0 – It is the return value from the method __getstate__.

__str__

RaggedTensor.__str__(self: _k2.ragged.RaggedTensor) str

Return a string representation of this tensor.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1], [2, 3], []])
>>> a
RaggedTensor([[1],
              [2, 3],
              []], dtype=torch.int32)
>>> str(a)
'RaggedTensor([[1],\n              [2, 3],\n              []], dtype=torch.int32)'
>>> b = k2r.RaggedTensor([[1, 2]], device='cuda:0')
>>> b
RaggedTensor([[1, 2]], device='cuda:0', dtype=torch.int32)

add

RaggedTensor.add(self: _k2.ragged.RaggedTensor, value: torch.Tensor, alpha: object) _k2.ragged.RaggedTensor

Add value scaled by alpha to source ragged tensor over the last axis.

It implements:

dest[…][i][j] = src[…][i][j] + alpha * value[i]

>>> import k2.ragged as k2r
>>> import torch
>>> src = k2r.RaggedTensor([[1, 3], [1], [2, 8]], dtype=torch.int32)
>>> value = torch.tensor([1, 2, 3], dtype=torch.int32)
>>> src.add(value, 1)
RaggedTensor([[2, 4],
              [3],
              [5, 11]], dtype=torch.int32)
>>> src.add(value, -1)
RaggedTensor([[0, 2],
              [-1],
              [-1, 5]], dtype=torch.int32)
Parameters
  • value – The value to be added to self; its dimension MUST equal the number of sublists along the last dimension of self.

  • alpha – The number used to scale value before adding it to self.

Returns

Returns a new RaggedTensor, sharing the same dtype and device with self.

arange

RaggedTensor.arange(self: _k2.ragged.RaggedTensor, axis: int, begin: int, end: int) _k2.ragged.RaggedTensor

Return a sub-range of self containing indexes begin through end - 1 along axis axis of self.

The axis argument may be confusing; its behavior is equivalent to:

for i in range(axis):
  self = self.remove_axis(0)

return self.arange(0, begin, end)

Caution

The returned tensor shares the underlying memory with self.

Example 1

>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([ [[1], [], [2]], [[], [4, 5], []], [[], [1]], [[]] ])
>>> a
RaggedTensor([[[1],
               [],
               [2]],
              [[],
               [4, 5],
               []],
              [[],
               [1]],
              [[]]], dtype=torch.int32)
>>> a.num_axes
3
>>> b = a.arange(axis=0, begin=1, end=3)
>>> b
RaggedTensor([[[],
               [4, 5],
               []],
              [[],
               [1]]], dtype=torch.int32)
>>> b.num_axes
3
>>> c = a.arange(axis=0, begin=1, end=2)
>>> c
RaggedTensor([[[],
               [4, 5],
               []]], dtype=torch.int32)
>>> c.num_axes
3
>>> d = a.arange(axis=1, begin=0, end=4)
>>> d
RaggedTensor([[1],
              [],
              [2],
              []], dtype=torch.int32)
>>> d.num_axes
2
>>> e = a.arange(axis=1, begin=2, end=5)
>>> e
RaggedTensor([[2],
              [],
              [4, 5]], dtype=torch.int32)
>>> e.num_axes
2

Example 2

>>> a = k2r.RaggedTensor([ [[[], [1], [2, 3]],[[5, 8], [], [9]]], [[[10], [0], []]], [[[], [], [1]]] ])
>>> a.num_axes
4
>>> b = a.arange(axis=0, begin=0, end=2)
>>> b
RaggedTensor([[[[],
                [1],
                [2, 3]],
               [[5, 8],
                [],
                [9]]],
              [[[10],
                [0],
                []]]], dtype=torch.int32)
>>> b.num_axes
4
>>> c = a.arange(axis=1, begin=1, end=3)
>>> c
RaggedTensor([[[5, 8],
               [],
               [9]],
              [[10],
               [0],
               []]], dtype=torch.int32)
>>> c.num_axes
3
>>> d = a.arange(axis=2, begin=0, end=5)
>>> d
RaggedTensor([[],
              [1],
              [2, 3],
              [5, 8],
              []], dtype=torch.int32)
>>> d.num_axes
2

Example 3

>>> a = k2r.RaggedTensor([[0], [1], [2], [], [3]])
>>> a
RaggedTensor([[0],
              [1],
              [2],
              [],
              [3]], dtype=torch.int32)
>>> a.num_axes
2
>>> b = a.arange(axis=0, begin=1, end=4)
>>> b
RaggedTensor([[1],
              [2],
              []], dtype=torch.int32)
>>> b.values[0] = -1
>>> a
RaggedTensor([[0],
              [-1],
              [2],
              [],
              [3]], dtype=torch.int32)
Parameters
  • axis – The axis to which begin and end correspond.

  • begin – The beginning of the range (inclusive).

  • end – The end of the range (exclusive).

argmax

RaggedTensor.argmax(self: _k2.ragged.RaggedTensor, initial_value: object = None) torch.Tensor

Return a tensor containing maximum value indexes within each sub-list along the last axis of self, i.e. the max taken over the last axis. The index is -1 if the sub-list was empty or all values in the sub-list are less than initial_value.

>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([ [3, -1], [], [], [] ])
>>> a.argmax()
tensor([ 0, -1, -1, -1], dtype=torch.int32)
>>> b = a.argmax(initial_value=0)
>>> b
tensor([ 0, -1, -1, -1], dtype=torch.int32)
>>> c = k2r.RaggedTensor([ [3, 0, 2, 5, 1], [], [1, 3, 8, 2, 0] ])
>>> c.argmax()
tensor([ 3, -1,  7], dtype=torch.int32)
>>> d = c.argmax(initial_value=0)
>>> d
tensor([ 3, -1,  7], dtype=torch.int32)
>>> c.values[3], c.values[7]
(tensor(5, dtype=torch.int32), tensor(8, dtype=torch.int32))
>>> c.argmax(initial_value=6)
tensor([-1, -1,  7], dtype=torch.int32)
>>> c.to('cuda:0').argmax(0)
tensor([ 3, -1,  7], device='cuda:0', dtype=torch.int32)
>>> import torch
>>> c.to(torch.float32).argmax(0)
tensor([ 3, -1,  7], dtype=torch.int32)
Parameters

initial_value – A base value to compare. If values in a sublist are all less than this value, then the argmax of this sublist is -1. If a sublist is empty, the argmax of it is also -1. If it is None, the lowest value of self.dtype is used.

Returns

Return a 1-D torch.int32 tensor. It is on the same device as self.

cat

static RaggedTensor.cat(srcs: List[_k2.ragged.RaggedTensor], axis: int) _k2.ragged.RaggedTensor

Concatenate a list of ragged tensors over a specified axis.

Example 1

>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1], [], [2, 3]])
>>> k2r.cat([a, a], axis=0)
RaggedTensor([[1],
              [],
              [2, 3],
              [1],
              [],
              [2, 3]], dtype=torch.int32)
>>> k2r.cat((a, a), axis=1)
RaggedTensor([[1, 1],
              [],
              [2, 3, 2, 3]], dtype=torch.int32)

Example 2

>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1, 3], [], [5, 8], [], [9]])
>>> b = k2r.RaggedTensor([[0], [1, 8], [], [-1], [10]])
>>> c = k2r.cat([a, b], axis=0)
>>> c
RaggedTensor([[1, 3],
              [],
              [5, 8],
              [],
              [9],
              [0],
              [1, 8],
              [],
              [-1],
              [10]], dtype=torch.int32)
>>> c.num_axes
2
>>> d = k2r.cat([a, b], axis=1)
>>> d
RaggedTensor([[1, 3, 0],
              [1, 8],
              [5, 8],
              [-1],
              [9, 10]], dtype=torch.int32)
>>> d.num_axes
2
>>> k2r.RaggedTensor.cat([a, b], axis=1)
RaggedTensor([[1, 3, 0],
              [1, 8],
              [5, 8],
              [-1],
              [9, 10]], dtype=torch.int32)
>>> k2r.cat((b, a), axis=0)
RaggedTensor([[0],
              [1, 8],
              [],
              [-1],
              [10],
              [1, 3],
              [],
              [5, 8],
              [],
              [9]], dtype=torch.int32)
Parameters
  • srcs – A list (or a tuple) of ragged tensors to concatenate. They MUST all have the same dtype and be on the same device.

  • axis – Only 0 and 1 are supported right now. If it is 1, then srcs[i].dim0 must all have the same value.

Returns

Return a concatenated tensor.

clone

RaggedTensor.clone(self: _k2.ragged.RaggedTensor) _k2.ragged.RaggedTensor

Return a copy of this tensor.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1, 2], [3]])
>>> b = a
>>> c = a.clone()
>>> a
RaggedTensor([[1, 2],
              [3]], dtype=torch.int32)
>>> b.values[0] = 10
>>> a
RaggedTensor([[10, 2],
              [3]], dtype=torch.int32)
>>> c
RaggedTensor([[1, 2],
              [3]], dtype=torch.int32)
>>> c.values[0] = -1
>>> c
RaggedTensor([[-1, 2],
              [3]], dtype=torch.int32)
>>> a
RaggedTensor([[10, 2],
              [3]], dtype=torch.int32)
>>> b
RaggedTensor([[10, 2],
              [3]], dtype=torch.int32)

index

RaggedTensor.index(*args, **kwargs)

Overloaded function.

  1. index(self: _k2.ragged.RaggedTensor, indexes: _k2.ragged.RaggedTensor) -> _k2.ragged.RaggedTensor

Index a ragged tensor with a ragged tensor.

Example 1:

>>> import k2.ragged as k2r
>>> src = k2r.RaggedTensor([[10, 11], [12, 13.5]])
>>> indexes = k2r.RaggedTensor([[0, 1]])
>>> src.index(indexes)
RaggedTensor([[[10, 11],
               [12, 13.5]]], dtype=torch.float32)
>>> i = k2r.RaggedTensor([[0], [1], [0, 0]])
>>> src.index(i)
RaggedTensor([[[10, 11]],
              [[12, 13.5]],
              [[10, 11],
               [10, 11]]], dtype=torch.float32)

Example 2:

>>> import k2.ragged as k2r
>>> src = k2r.RaggedTensor([ [[1, 0], [], [2]], [[], [3], [0, 0, 1]], [[1, 2], [-1]]])
>>> i = k2r.RaggedTensor([[[0, 2], [1]], [[0]]])
>>> src.index(i)
RaggedTensor([[[[[1, 0],
                 [],
                 [2]],
                [[1, 2],
                 [-1]]],
               [[[],
                 [3],
                 [0, 0, 1]]]],
              [[[[1, 0],
                 [],
                 [2]]]]], dtype=torch.int32)
Parameters

indexes

Its values must satisfy 0 <= values[i] < self.dim0.

Caution

Its dtype has to be torch.int32.

Returns

Return indexed tensor.

  2. index(self: _k2.ragged.RaggedTensor, indexes: torch.Tensor, axis: int, need_value_indexes: bool = False) -> Tuple[_k2.ragged.RaggedTensor, Optional[torch.Tensor]]

Indexing operation on ragged tensor, returns self[indexes], where the elements of indexes are interpreted as indexes into axis axis of self.

Caution

indexes is a 1-D tensor and indexes.dtype == torch.int32.

Example 1:

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[0, 2, 3], [], [0, 1, 2], [], [], [3, -1.25]])
>>> i = torch.tensor([2, 0, 3, 5], dtype=torch.int32)
>>> b, value_indexes = a.index(i, axis=0, need_value_indexes=True)
>>> b
RaggedTensor([[0, 1, 2],
              [0, 2, 3],
              [],
              [3, -1.25]], dtype=torch.float32)
>>> value_indexes
tensor([3, 4, 5, 0, 1, 2, 6, 7], dtype=torch.int32)
>>> a.values[value_indexes.long()]
tensor([ 0.0000,  1.0000,  2.0000,  0.0000,  2.0000,  3.0000,  3.0000, -1.2500])
>>> k = torch.tensor([2, -1, 0], dtype=torch.int32)
>>> a.index(k, axis=0, need_value_indexes=True)
(RaggedTensor([[0, 1, 2],
              [],
              [0, 2, 3]], dtype=torch.float32), tensor([3, 4, 5, 0, 1, 2], dtype=torch.int32))

Example 2:

>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([ [[1, 3], [], [2]], [[5, 8], [], [-1], [2]] ])
>>> i = torch.tensor([0, 2, 1, 6, 3, 5, 4], dtype=torch.int32)
>>> a.shape.row_ids(1)[i.long()]
tensor([0, 0, 0, 1, 1, 1, 1], dtype=torch.int32)
>>> b, value_indexes = a.index(i, axis=1, need_value_indexes=True)
>>> b
RaggedTensor([[[1, 3],
               [2],
               []],
              [[2],
               [5, 8],
               [-1],
               []]], dtype=torch.int32)
>>> value_indexes
tensor([0, 1, 2, 6, 3, 4, 5], dtype=torch.int32)
>>> a.values[value_indexes.long()]
tensor([ 1,  3,  2,  2,  5,  8, -1], dtype=torch.int32)
Parameters
  • indexes

    Array of indexes, which will be interpreted as indexes into axis axis of self, i.e. with 0 <= indexes[i] < self.tot_size(axis). Note that if axis is 0, then -1 is also a valid entry in indexes, which will result in an empty list (as if it were the index into a position in self that had an empty list at that point).

    Caution

    It is currently not allowed to change the order on axes less than axis, i.e. if axis > 0, we require: IsMonotonic(self.shape.row_ids(axis)[indexes]).

  • axis – The axis to be indexed. Must satisfy 0 <= axis < self.num_axes.

  • need_value_indexes – If True, it will return a torch.Tensor containing the indexes into self.values that ans.values has, as in ans.values = self.values[value_indexes].

Returns

Return a tuple containing:

  • A ragged tensor, sharing the same dtype and device with self.

  • None if need_value_indexes is False; otherwise a 1-D torch.Tensor of dtype torch.int32 containing the indexes into self.values that ans.values has.

logsumexp

RaggedTensor.logsumexp(self: _k2.ragged.RaggedTensor, initial_value: float = - inf) torch.Tensor

Compute the logsumexp of sublists over the last axis of this tensor.

Note

It is similar to torch.logsumexp except it accepts a ragged tensor. See https://pytorch.org/docs/stable/generated/torch.logsumexp.html for definition of logsumexp.

Note

If a sublist is empty, the logsumexp for it is the provided initial_value.

Note

This operation only supports float type input, i.e., with dtype being torch.float32 or torch.float64.

>>> import torch
>>> import k2
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[-0.25, -0.25, -0.25, -0.25], [], [-0.5, -0.5]], dtype=torch.float32)
>>> a.requires_grad_(True)
RaggedTensor([[-0.25, -0.25, -0.25, -0.25],
              [],
              [-0.5, -0.5]], dtype=torch.float32)
>>> b = a.logsumexp()
>>> b
tensor([1.1363,   -inf, 0.1931], grad_fn=<LogSumExpFunction>)
>>> c = b.sum()
>>> c
tensor(-inf, grad_fn=<SumBackward0>)
>>> c.backward()
>>> a.grad
tensor([0.2500, 0.2500, 0.2500, 0.2500, 0.5000, 0.5000])
>>>
>>> # if a is a 3-d ragged tensor
>>> a = k2r.RaggedTensor([[[-0.25, -0.25, -0.25, -0.25]], [[], [-0.5, -0.5]]], dtype=torch.float32)
>>> a.requires_grad_(True)
RaggedTensor([[[-0.25, -0.25, -0.25, -0.25]],
              [[],
               [-0.5, -0.5]]], dtype=torch.float32)
>>> b = a.logsumexp()
>>> b
tensor([1.1363,   -inf, 0.1931], grad_fn=<LogSumExpFunction>)
>>> c = b.sum()
>>> c
tensor(-inf, grad_fn=<SumBackward0>)
>>> c.backward()
>>> a.grad
tensor([0.2500, 0.2500, 0.2500, 0.2500, 0.5000, 0.5000])
Parameters

initial_value – If a sublist is empty, its logsumexp is this value.

Returns

Return a 1-D tensor with the same dtype as this tensor containing the computed logsumexp.

max

RaggedTensor.max(self: _k2.ragged.RaggedTensor, initial_value: object = None) torch.Tensor

Return a tensor containing the maximum of each sub-list along the last axis of self. The max is taken over the last axis or initial_value, whichever was larger.

>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([ [[1, 3, 0], [2, 5, -1, 1, 3], [], []], [[1, 8, 9, 2], [], [2, 4, 6, 8]] ])
>>> a.max()
tensor([          3,           5, -2147483648, -2147483648,           9,
        -2147483648,           8], dtype=torch.int32)
>>> a.max(initial_value=-10)
tensor([  3,   5, -10, -10,   9, -10,   8], dtype=torch.int32)
>>> a.max(initial_value=7)
tensor([7, 7, 7, 7, 9, 7, 8], dtype=torch.int32)
>>> import torch
>>> a.to(torch.float32).max(-3)
tensor([ 3.,  5., -3., -3.,  9., -3.,  8.])
>>> a.to('cuda:0').max(-2)
tensor([ 3,  5, -2, -2,  9, -2,  8], device='cuda:0', dtype=torch.int32)
Parameters

initial_value – The base value to compare. If values in a sublist are all less than this value, then the max of this sublist is initial_value. If a sublist is empty, its max is also initial_value.

Returns

Return 1-D tensor containing the max value of each sublist. It shares the same dtype and device with self.

min

RaggedTensor.min(self: _k2.ragged.RaggedTensor, initial_value: object = None) torch.Tensor

Return a tensor containing the minimum of each sub-list along the last axis of self. The min is taken over the last axis or initial_value, whichever was smaller.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([ [[1, 3, 0], [2, 5, -1, 1, 3], [], []], [[1, 8, 9, 2], [], [2, 4, 6, 8]] ], dtype=torch.float32)
>>> a.min()
tensor([ 0.0000e+00, -1.0000e+00,  3.4028e+38,  3.4028e+38,  1.0000e+00,
         3.4028e+38,  2.0000e+00])
>>> a.min(initial_value=float('inf'))
tensor([ 0., -1., inf, inf,  1., inf,  2.])
>>> a.min(100)
tensor([  0.,  -1., 100., 100.,   1., 100.,   2.])
>>> a.to(torch.int32).min(20)
tensor([ 0, -1, 20, 20,  1, 20,  2], dtype=torch.int32)
>>> a.to('cuda:0').min(15)
tensor([ 0., -1., 15., 15.,  1., 15.,  2.], device='cuda:0')
Parameters

initial_value – The base value to compare. If values in a sublist are all larger than this value, then the minimum of this sublist is initial_value. If a sublist is empty, its minimum is also initial_value.

Returns

Return 1-D tensor containing the minimum of each sublist. It shares the same dtype and device with self.

normalize

RaggedTensor.normalize(self: _k2.ragged.RaggedTensor, use_log: bool) _k2.ragged.RaggedTensor

Normalize a ragged tensor over the last axis.

If use_log is True, the normalization per sublist is done as follows:

  1. Compute the log sum per sublist

  2. Subtract the log sum computed above from the sublist and return it

If use_log is False, the normalization per sublist is done as follows:

  1. Compute the sum per sublist

  2. Divide the sublist by the above sum and return the resulting sublist

Note

If a sublist contains 3 elements [a, b, c], then the log sum is defined as:

s = log(exp(a) + exp(b) + exp(c))

The resulting sublist looks like below if use_log is True:

[a - s, b - s, c - s]

If use_log is False, the resulting sublist looks like:

[a/(a+b+c), b/(a+b+c), c/(a+b+c)]
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[0.1, 0.3], [], [1], [0.2, 0.8]])
>>> a.normalize(use_log=False)
RaggedTensor([[0.25, 0.75],
              [],
              [1],
              [0.2, 0.8]], dtype=torch.float32)
>>> a.normalize(use_log=True)
RaggedTensor([[-0.798139, -0.598139],
              [],
              [0],
              [-1.03749, -0.437488]], dtype=torch.float32)
>>> b = k2r.RaggedTensor([ [[0.1, 0.3], []], [[1], [0.2, 0.8]] ])
>>> b.normalize(use_log=False)
RaggedTensor([[[0.25, 0.75],
               []],
              [[1],
               [0.2, 0.8]]], dtype=torch.float32)
>>> b.normalize(use_log=True)
RaggedTensor([[[-0.798139, -0.598139],
               []],
              [[0],
               [-1.03749, -0.437488]]], dtype=torch.float32)
>>> a.num_axes
2
>>> b.num_axes
3
>>> import torch
>>> (torch.tensor([0.1, 0.3]).exp() / torch.tensor([0.1, 0.3]).exp().sum()).log()
tensor([-0.7981, -0.5981])
Parameters

use_log – It indicates which kind of normalization to be applied.

Returns

Returns a ragged tensor, sharing the same dtype and device with self.

numel

RaggedTensor.numel(self: _k2.ragged.RaggedTensor) int
Returns

Return the number of elements in this tensor. It equals self.values.numel().

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1], [], [3, 4, 5, 6]])
>>> a.numel()
5
>>> b = k2r.RaggedTensor('[ [[1] [] []]  [[2 3]]]')
>>> b.numel()
3
>>> c = k2r.RaggedTensor('[[1] [] [3 4 5 6]]')
>>> c.numel()
5

pad

RaggedTensor.pad(self: _k2.ragged.RaggedTensor, mode: str, padding_value: object) torch.Tensor

Pad a ragged tensor with 2 axes to a 2-D torch tensor.

For example, if self has the following values:

[ [1 2 3] [4] [5 6 7 8] ]

Then it returns a 2-D tensor as follows if padding_value is 0 and mode is constant:

tensor([[1, 2, 3, 0],
        [4, 0, 0, 0],
        [5, 6, 7, 8]])

Caution

It requires that self.num_axes == 2.

>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1], [], [2, 3], [5, 8, 9, 8, 2]])
>>> a.pad(mode='constant', padding_value=-1)
tensor([[ 1, -1, -1, -1, -1],
        [-1, -1, -1, -1, -1],
        [ 2,  3, -1, -1, -1],
        [ 5,  8,  9,  8,  2]], dtype=torch.int32)
>>> a.pad(mode='replicate', padding_value=-1)
tensor([[ 1,  1,  1,  1,  1],
        [-1, -1, -1, -1, -1],
        [ 2,  3,  3,  3,  3],
        [ 5,  8,  9,  8,  2]], dtype=torch.int32)
Parameters
  • mode – Valid values are: constant, replicate. If it is constant, the given padding_value is used for filling. If it is replicate, the last entry in a list is used for filling. If a list is empty, then the given padding_value is also used for filling.

  • padding_value – The filling value.

Returns

A 2-D torch tensor, sharing the same dtype and device as self.

remove_axis

RaggedTensor.remove_axis(self: _k2.ragged.RaggedTensor, axis: int) _k2.ragged.RaggedTensor

Remove an axis; if it is not the last axis, this is done by appending lists (effectively the axis is combined with the following axis). If it is the last axis, it is just removed and the number of elements may change.

Caution

The tensor has to have more than two axes.

Example 1:

>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([ [[1], [], [0, -1]], [[], [2, 3], []], [[0]], [[]] ])
>>> a
RaggedTensor([[[1],
               [],
               [0, -1]],
              [[],
               [2, 3],
               []],
              [[0]],
              [[]]], dtype=torch.int32)
>>> a.num_axes
3
>>> b = a.remove_axis(0)
>>> b
RaggedTensor([[1],
              [],
              [0, -1],
              [],
              [2, 3],
              [],
              [0],
              []], dtype=torch.int32)
>>> c = a.remove_axis(1)
>>> c
RaggedTensor([[1, 0, -1],
              [2, 3],
              [0],
              []], dtype=torch.int32)

Example 2:

>>> a = k2r.RaggedTensor([ [[[1], [], [2]]], [[[3, 4], [], [5, 6], []]], [[[], [0]]] ])
>>> a.num_axes
4
>>> a
RaggedTensor([[[[1],
                [],
                [2]]],
              [[[3, 4],
                [],
                [5, 6],
                []]],
              [[[],
                [0]]]], dtype=torch.int32)
>>> b = a.remove_axis(0)
>>> b
RaggedTensor([[[1],
               [],
               [2]],
              [[3, 4],
               [],
               [5, 6],
               []],
              [[],
               [0]]], dtype=torch.int32)
>>> c = a.remove_axis(1)
>>> c
RaggedTensor([[[1],
               [],
               [2]],
              [[3, 4],
               [],
               [5, 6],
               []],
              [[],
               [0]]], dtype=torch.int32)
>>> d = a.remove_axis(2)
>>> d
RaggedTensor([[[1, 2]],
              [[3, 4, 5, 6]],
              [[0]]], dtype=torch.int32)
Parameters

axis – The axis to remove.

Returns

Return a ragged tensor with one fewer axis.

remove_values_eq

RaggedTensor.remove_values_eq(self: _k2.ragged.RaggedTensor, target: object) _k2.ragged.RaggedTensor

Returns a ragged tensor after removing all ‘values’ that equal a provided target. Leaves all layers of the shape except for the last one unaffected.

>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1, 2, 3, 0, 3, 2], [], [3, 2, 3], [3]])
>>> a
RaggedTensor([[1, 2, 3, 0, 3, 2],
              [],
              [3, 2, 3],
              [3]], dtype=torch.int32)
>>> b = a.remove_values_eq(3)
>>> b
RaggedTensor([[1, 2, 0, 2],
              [],
              [2],
              []], dtype=torch.int32)
>>> c = a.remove_values_eq(2)
>>> c
RaggedTensor([[1, 3, 0, 3],
              [],
              [3, 3],
              [3]], dtype=torch.int32)
Parameters

target – The target value to delete.

Returns

Return a ragged tensor whose values don’t contain the target.

remove_values_leq

RaggedTensor.remove_values_leq(self: _k2.ragged.RaggedTensor, cutoff: object) _k2.ragged.RaggedTensor

Returns a ragged tensor after removing all ‘values’ that are equal to or less than a provided cutoff. Leaves all layers of the shape except for the last one unaffected.

>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1, 2, 3, 0, 3, 2], [], [3, 2, 3], [3]])
>>> a
RaggedTensor([[1, 2, 3, 0, 3, 2],
              [],
              [3, 2, 3],
              [3]], dtype=torch.int32)
>>> b = a.remove_values_leq(3)
>>> b
RaggedTensor([[],
              [],
              [],
              []], dtype=torch.int32)
>>> c = a.remove_values_leq(2)
>>> c
RaggedTensor([[3, 3],
              [],
              [3, 3],
              [3]], dtype=torch.int32)
>>> d = a.remove_values_leq(1)
>>> d
RaggedTensor([[2, 3, 3, 2],
              [],
              [3, 2, 3],
              [3]], dtype=torch.int32)
Parameters

cutoff – Values less than or equal to this cutoff are deleted.

Returns

Return a ragged tensor whose values are all above cutoff.

requires_grad_

RaggedTensor.requires_grad_(self: _k2.ragged.RaggedTensor, requires_grad: bool = True) _k2.ragged.RaggedTensor

Change if autograd should record operations on this tensor: Set this tensor’s requires_grad attribute in-place.

Note

If this tensor is not a float tensor, PyTorch will throw a RuntimeError exception.

Caution

This method ends with an underscore, meaning it changes this tensor in-place.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1]], dtype=torch.float64)
>>> a.requires_grad
False
>>> a.requires_grad_(True)
RaggedTensor([[1]], dtype=torch.float64)
>>> a.requires_grad
True
Parameters

requires_grad – If autograd should record operations on this tensor.

Returns

Return this tensor.

sort_

RaggedTensor.sort_(self: _k2.ragged.RaggedTensor, descending: bool = False, need_new2old_indexes: bool = False) Optional[torch.Tensor]

Sort a ragged tensor over the last axis in-place.

Caution

sort_ ends with an underscore, meaning this operation changes self in-place.

>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([ [1, 3, 0], [2, 5, 3], [], [1, 3, 0.] ])
>>> a_clone = a.clone()
>>> b = a.sort_(descending=True, need_new2old_indexes=True)
>>> b
tensor([1, 0, 2, 4, 5, 3, 7, 6, 8], dtype=torch.int32)
>>> a
RaggedTensor([[3, 1, 0],
              [5, 3, 2],
              [],
              [3, 1, 0]], dtype=torch.float32)
>>> a_clone.values[b.long()]
tensor([3., 1., 0., 5., 3., 2., 3., 1., 0.])
>>> a_clone = a.clone()
>>> c = a.sort_(descending=False, need_new2old_indexes=True)
>>> c
tensor([2, 1, 0, 5, 4, 3, 8, 7, 6], dtype=torch.int32)
>>> a
RaggedTensor([[0, 1, 3],
              [2, 3, 5],
              [],
              [0, 1, 3]], dtype=torch.float32)
>>> a_clone.values[c.long()]
tensor([0., 1., 3., 2., 3., 5., 0., 1., 3.])
Parameters
  • descendingTrue to sort in descending order. False to sort in ascending order.

  • need_new2old_indexes – If True, also returns a 1-D tensor, containing the indexes mapping from the sorted elements to the unsorted elements. We can use self.clone().values[returned_tensor] to get a sorted tensor.

Returns

If need_new2old_indexes is False, returns None. Otherwise, returns a 1-D tensor of dtype torch.int32.

sum

RaggedTensor.sum(self: _k2.ragged.RaggedTensor, initial_value: float = 0) torch.Tensor

Compute the sum of sublists over the last axis of this tensor.

Note

If a sublist is empty, the sum for it is the provided initial_value.

Note

This operation supports autograd if this tensor is a float tensor, i.e., with dtype being torch.float32 or torch.float64.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor('[ [[1 2] [] [5]]  [[10]] ]', dtype=torch.float32)
>>> a.requires_grad_(True)
RaggedTensor([[[1, 2],
               [],
               [5]],
              [[10]]], dtype=torch.float32)
>>> b = a.sum()
>>> c = (b * torch.arange(4)).sum()
>>> c.backward()
>>> a.grad
tensor([0., 0., 2., 3.])
>>> b
tensor([ 3.,  0.,  5., 10.], grad_fn=<SumFunction>)
>>> c
tensor(40., grad_fn=<SumBackward0>)
Parameters

initial_value – This value is added to the sum of each sublist. So when a sublist is empty, its sum is this value.

Returns

Return a 1-D tensor with the same dtype as this tensor, containing the computed sum.

to

RaggedTensor.to(*args, **kwargs)

Overloaded function.

  1. to(self: _k2.ragged.RaggedTensor, device: object) -> _k2.ragged.RaggedTensor

Transfer this tensor to a given device.

Note

If self is already on the specified device, return a ragged tensor sharing the underlying memory with self. Otherwise, a new tensor is returned.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1], [2, 3]])
>>> a.device
device(type='cpu')
>>> b = a.to(torch.device('cuda', 0))
>>> b.device
device(type='cuda', index=0)
Parameters

device – The target device to move this tensor.

Returns

Return a tensor on the given device.

  2. to(self: _k2.ragged.RaggedTensor, device: str) -> _k2.ragged.RaggedTensor

Transfer this tensor to a given device.

Note

If self is already on the specified device, return a ragged tensor sharing the underlying memory with self. Otherwise, a new tensor is returned.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1]])
>>> a.device
device(type='cpu')
>>> b = a.to('cuda:0')
>>> b.device
device(type='cuda', index=0)
>>> c = b.to('cpu')
>>> c.device
device(type='cpu')
>>> d = c.to('cuda:1')
>>> d.device
device(type='cuda', index=1)
Parameters

device – The target device to move this tensor. Note: The device is represented as a string. Valid strings are: “cpu”, “cuda:0”, “cuda:1”, etc.

Returns

Return a tensor on the given device.

  3. to(self: _k2.ragged.RaggedTensor, dtype: torch::dtype) -> _k2.ragged.RaggedTensor

Convert this tensor to a specific dtype.

Note

If self is already of the specified dtype, return a ragged tensor sharing the underlying memory with self. Otherwise, a new tensor is returned.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1], [2, 3, 5]])
>>> a.dtype
torch.int32
>>> b = a.to(torch.float64)
>>> b.dtype
torch.float64

Caution

Currently, only the dtypes torch.int32, torch.float32, and torch.float64 are supported. We can support other types if needed.

Parameters

dtype – The dtype this tensor should be converted to.

Returns

Return a tensor of the given dtype.

to_str_simple

RaggedTensor.to_str_simple(self: _k2.ragged.RaggedTensor) str

Convert a ragged tensor to a string representation, which is more compact than self.__str__.

An example output is given below:

RaggedTensor([[[1, 2, 3], [], [0]], [[2], [3, 10.5]]], dtype=torch.float32)
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([ [[1, 2, 3], [], [0]], [[2], [3, 10.5]] ])
>>> a
RaggedTensor([[[1, 2, 3],
               [],
               [0]],
              [[2],
               [3, 10.5]]], dtype=torch.float32)
>>> str(a)
'RaggedTensor([[[1, 2, 3],\n               [],\n               [0]],\n              [[2],\n               [3, 10.5]]], dtype=torch.float32)'
>>> a.to_str_simple()
'RaggedTensor([[[1, 2, 3], [], [0]], [[2], [3, 10.5]]], dtype=torch.float32)'

tolist

RaggedTensor.tolist(self: _k2.ragged.RaggedTensor) list

Turn a ragged tensor into a list of lists [of lists..].

Hint

You can pass the returned list to the constructor of RaggedTensor.

>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([ [[], [1, 2], [3], []], [[5, 6, 7]], [[], [0, 2, 3], [], []]])
>>> a.tolist()
[[[], [1, 2], [3], []], [[5, 6, 7]], [[], [0, 2, 3], [], []]]
>>> b = k2r.RaggedTensor(a.tolist())
>>> a == b
True
>>> c = k2r.RaggedTensor([[1.], [2.], [], [3.25, 2.5]])
>>> c.tolist()
[[1.0], [2.0], [], [3.25, 2.5]]
Returns

A list of list of lists [of lists …] containing the same elements and structure as self.

tot_size

RaggedTensor.tot_size(self: _k2.ragged.RaggedTensor, axis: int) int

Return the number of elements of a given axis. If axis is 0, it’s equivalent to the property dim0.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor('[ [1 2 3] [] [5 8 ] ]')
>>> a.tot_size(0)
3
>>> a.tot_size(1)
5
>>> import k2.ragged as k2r
>>> b = k2r.RaggedTensor('[ [[1 2 3] [] [5 8]] [[] [1 5 9 10 -1] [] [] []] ]')
>>> b.tot_size(0)
2
>>> b.tot_size(1)
8
>>> b.tot_size(2)
10

unique

RaggedTensor.unique(self: _k2.ragged.RaggedTensor, need_num_repeats: bool = False, need_new2old_indexes: bool = False) Tuple[_k2.ragged.RaggedTensor, Optional[_k2.ragged.RaggedTensor], Optional[torch.Tensor]]

If self has two axes, this will return the unique sub-lists (in a possibly different order, but without repeats). If self has 3 axes, it will do the above but separately for each index on axis 0; if more than 3 axes, the earliest axes will be ignored.

Caution

It does not completely guarantee that all unique sequences will be present in the output, as it relies on hashing and ignores collisions. If several sequences have the same hash, only one of them is kept, even if the actual content in the sequence is different.

Caution

Even if there are no repeated sequences, the output may be different from self. That is, new2old_indexes may NOT be an identity map even if nothing was removed.

Example 1

>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[3, 1], [3], [1], [1], [3, 1], [2]])
>>> a.unique()
(RaggedTensor([[1],
              [2],
              [3],
              [3, 1]], dtype=torch.int32), None, None)
>>> a.unique(need_num_repeats=True, need_new2old_indexes=True)
(RaggedTensor([[1],
              [2],
              [3],
              [3, 1]], dtype=torch.int32), RaggedTensor([[2, 1, 1, 2]], dtype=torch.int32), tensor([2, 5, 1, 0], dtype=torch.int32))
>>> a.unique(need_num_repeats=True)
(RaggedTensor([[1],
              [2],
              [3],
              [3, 1]], dtype=torch.int32), RaggedTensor([[2, 1, 1, 2]], dtype=torch.int32), None)
>>> a.unique(need_new2old_indexes=True)
(RaggedTensor([[1],
              [2],
              [3],
              [3, 1]], dtype=torch.int32), None, tensor([2, 5, 1, 0], dtype=torch.int32))

Example 2

>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[[1, 2], [2, 1], [1, 2], [1, 2]], [[3], [2], [0, 1], [2]], [[], [2, 3], [], [3]] ])
>>> a.unique()
(RaggedTensor([[[1, 2],
               [2, 1]],
              [[2],
               [3],
               [0, 1]],
              [[],
               [3],
               [2, 3]]], dtype=torch.int32), None, None)
>>> a.unique(need_num_repeats=True, need_new2old_indexes=True)
(RaggedTensor([[[1, 2],
               [2, 1]],
              [[2],
               [3],
               [0, 1]],
              [[],
               [3],
               [2, 3]]], dtype=torch.int32), RaggedTensor([[3, 1],
              [2, 1, 1],
              [2, 1, 1]], dtype=torch.int32), tensor([ 0,  1,  5,  4,  6,  8, 11,  9], dtype=torch.int32))
>>> a.unique(need_num_repeats=True)
(RaggedTensor([[[1, 2],
               [2, 1]],
              [[2],
               [3],
               [0, 1]],
              [[],
               [3],
               [2, 3]]], dtype=torch.int32), RaggedTensor([[3, 1],
              [2, 1, 1],
              [2, 1, 1]], dtype=torch.int32), None)
>>> a.unique(need_new2old_indexes=True)
(RaggedTensor([[[1, 2],
               [2, 1]],
              [[2],
               [3],
               [0, 1]],
              [[],
               [3],
               [2, 3]]], dtype=torch.int32), None, tensor([ 0,  1,  5,  4,  6,  8, 11,  9], dtype=torch.int32))

Example 3

>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1], [3], [2]])
>>> a.unique(True, True)
(RaggedTensor([[1],
              [2],
              [3]], dtype=torch.int32), RaggedTensor([[1, 1, 1]], dtype=torch.int32), tensor([0, 2, 1], dtype=torch.int32))
Parameters
  • need_num_repeats – If True, it also returns the number of repeats of each sequence.

  • need_new2old_indexes

If True, it returns an extra 1-D tensor new2old_indexes. If src has 2 axes, this tensor contains src_idx0; if src has 3 axes, this tensor contains src_idx01.

    Caution

    For repeated sublists, only one of them is kept. The choice of which one to keep is deterministic and is an implementation detail.

Returns

  • ans: A ragged tensor with the same number of axes as self and possibly fewer elements due to removing repeated sequences on the last axis (and with the last-but-one indexes possibly in a different order).

  • num_repeats: A tensor containing number of repeats of each returned sequence if need_num_repeats is True; it is None otherwise. If it is not None, num_repeats.num_axes is always 2. If ans.num_axes is 2, then num_repeats.dim0 == 1 and num_repeats.numel() == ans.dim0. If ans.num_axes is 3, then num_repeats.dim0 == ans.dim0 and num_repeats.numel() == ans.tot_size(1).

  • new2old_indexes: A 1-D tensor whose i-th element specifies the input sublist that the i-th output sublist corresponds to.

Return type

Returns a tuple containing

device

RaggedTensor.device

Return the device of this tensor.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1]])
>>> a.device
device(type='cpu')
>>> b = a.to(torch.device('cuda', 0))
>>> b.device
device(type='cuda', index=0)
>>> b.device == torch.device('cuda:0')
True

dim0

RaggedTensor.dim0

Return number of sublists at axis 0.

>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([ [1, 2], [3], [], [], [] ])
>>> a.dim0
5
>>> b = k2r.RaggedTensor('[ [[]] [[] []]]')
>>> b.dim0
2

dtype

RaggedTensor.dtype

Return the dtype of this tensor.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1], []])
>>> a.dtype
torch.int32
>>> a = a.to(torch.float32)
>>> a.dtype
torch.float32
>>> b = k2r.RaggedTensor([[3]], dtype=torch.float64)
>>> b.dtype
torch.float64

grad

RaggedTensor.grad

This attribute is None by default. PyTorch will set it during backward().

The attribute will contain the gradients computed and future calls to backward() will accumulate (add) gradients into it.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1, 2], [3], [5, 6], []], dtype=torch.float32)
>>> a.requires_grad_(True)
RaggedTensor([[1, 2],
              [3],
              [5, 6],
              []], dtype=torch.float32)
>>> b = a.sum()
>>> b
tensor([ 3.,  3., 11.,  0.], grad_fn=<SumFunction>)
>>> c = b * torch.arange(4)
>>> c.sum().backward()
>>> a.grad
tensor([0., 0., 1., 2., 2.])

is_cuda

RaggedTensor.is_cuda
Returns

Return True if the tensor is stored on the GPU, False otherwise.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1]])
>>> a.is_cuda
False
>>> b = a.to(torch.device('cuda', 0))
>>> b.is_cuda
True

num_axes

RaggedTensor.num_axes

Return the number of axes of this tensor.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor('[ [] [] [] [] ]')
>>> a.num_axes
2
>>> b = k2r.RaggedTensor('[ [[] []] [[]] ]')
>>> b.num_axes
3
>>> c = k2r.RaggedTensor('[ [ [[] [1]] [[3 4] []] ]  [ [[1]] [[2] [3 4]] ] ]')
>>> c.num_axes
4
Returns

Return number of axes of this tensor, which is at least 2.

requires_grad

RaggedTensor.requires_grad

Return True if gradients need to be computed for this tensor. Return False otherwise.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1]], dtype=torch.float32)
>>> a.requires_grad
False
>>> a.requires_grad = True
>>> a.requires_grad
True

shape

RaggedTensor.shape

Return the shape of this tensor.

>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([ [1, 2], [], [3] ])
>>> a.shape
[ [ x x ] [ ] [ x ] ]
>>> type(a.shape)
<class '_k2.ragged.RaggedShape'>

values

RaggedTensor.values

Return the underlying memory as a 1-D tensor.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1, 2], [], [5], [], [8, 9, 10]])
>>> a.values
tensor([ 1,  2,  5,  8,  9, 10], dtype=torch.int32)
>>> isinstance(a.values, torch.Tensor)
True
>>> a.values[-2] = -1
>>> a
RaggedTensor([[1, 2],
              [],
              [5],
              [],
              [8, -1, 10]], dtype=torch.int32)
>>> a.values[3] = -3
>>> a
RaggedTensor([[1, 2],
              [],
              [5],
              [],
              [-3, -1, 10]], dtype=torch.int32)
>>> a.values[2] = -2
>>> a
RaggedTensor([[1, 2],
              [],
              [-2],
              [],
              [-3, -1, 10]], dtype=torch.int32)

RnntDecodingConfig

__init__

RnntDecodingConfig.__init__(self: _k2.RnntDecodingConfig, vocab_size: int, decoder_history_len: int, beam: float, max_states: int, max_contexts: int) None

Construct an RnntDecodingConfig object; it contains the parameters needed by RNN-T decoding.

Parameters
  • vocab_size – It indicates how many symbols we are using; it equals the largest symbol plus one.

  • decoder_history_len – The number of symbols of history the decoder takes; it will normally be one or two (“stateless decoder”). Our RNN-T decoding setup does not support unlimited decoder context such as with LSTMs.

  • beambeam imposes a limit on the score of a state, relative to the best-scoring state on the same frame. E.g. 10.

  • max_statesmax_states is a limit on the number of distinct states that we allow per frame, per stream; the number of states will not be allowed to exceed this limit.

  • max_contextsmax_contexts is a limit on the number of distinct contexts that we allow per frame, per stream; the number of contexts will not be allowed to exceed this limit.
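For example, a config might be constructed as follows (the values shown are illustrative only, not recommendations):

>>> import k2
>>> config = k2.RnntDecodingConfig(vocab_size=500, decoder_history_len=2,
...                                beam=10.0, max_states=64, max_contexts=8)
>>> config.vocab_size
500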

beam

RnntDecodingConfig.beam

decoder_history_len

RnntDecodingConfig.decoder_history_len

max_contexts

RnntDecodingConfig.max_contexts

max_states

RnntDecodingConfig.max_states

vocab_size

RnntDecodingConfig.vocab_size

RnntDecodingStream

__init__

RnntDecodingStream.__init__(fsa)[source]

Create a new RNN-T decoding stream.

Every sequence (wave data) needs a decoding stream; this function is expected to be called when a new sequence arrives. We support different decoding graphs for different streams.

Parameters

fsa (Fsa) – The decoding graph used in this stream.

Returns

An RNN-T decoding stream object, which will be combined into an RnntDecodingStreams object to do decoding together with other sequences in parallel.
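For example, a minimal sketch (a trivial decoding graph built with k2.trivial_graph stands in for a real one; vocab_size is illustrative):

>>> import k2
>>> vocab_size = 500
>>> decoding_graph = k2.trivial_graph(vocab_size - 1)
>>> stream = k2.RnntDecodingStream(decoding_graph)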

__str__

RnntDecodingStream.__str__()[source]

Return a string representation of this object.

For visualization and debug only.

Return type

str

RnntDecodingStreams

__init__

RnntDecodingStreams.__init__(src_streams, config)[source]

Combine multiple RnntDecodingStream objects into an RnntDecodingStreams object, so that all the streams can be decoded in parallel.

Parameters
  • src_streams (List[RnntDecodingStream]) – A list of RnntDecodingStream objects to be combined.

  • config (RnntDecodingConfig) – A configuration object which contains decoding parameters like vocab-size, decoder_history_len, beam, max_states, max_contexts etc.

Returns

Return a RnntDecodingStreams object.

__str__

RnntDecodingStreams.__str__()[source]

Return a string representation of this object.

For visualization and debug only.

Return type

str

advance

RnntDecodingStreams.advance(logprobs)[source]

Advance decoding streams by one frame.

Parameters

logprobs (Tensor) – A tensor of shape [tot_contexts][num_symbols], containing the log-probs of symbols given the contexts output by get_contexts(). It satisfies logprobs.Dim0() == shape.TotSize(1), where shape is returned by get_contexts().

Return type

None

format_output

RnntDecodingStreams.format_output(num_frames, allow_partial=False, log_probs=None, t2s2c_shape=None)[source]

Generate the lattice Fsa obtained so far.

Note

The attributes of the generated lattice are a union of the attributes of all the decoding graphs. For example, if self contains three individual streams, each with its own decoding graph, where graph[0] has attributes attr1, attr2; graph[1] has attributes attr1, attr3; and graph[2] has attributes attr3, attr4; then the generated lattice has attributes attr1, attr2, attr3, attr4.

Parameters
  • num_frames (List[int]) – A list containing the number of frames we want to gather for each stream (note: it must not exceed the number of frames we have received for the corresponding stream). It MUST satisfy len(num_frames) == self.num_streams.

  • allow_partial (bool) – If True and no final state is active, we treat all the states on the last frame as final states. If False, we only consider real final states in the decoding graph on the last frame when generating the lattice. Default: False.

  • log_probs (Optional[Tensor]) – A tensor of shape [t2s2c_shape.tot_size(2)][num_symbols]. It is the stacked tensor of logprobs passed to advance during decoding.

  • t2s2c_shape (Optional[RaggedShape]) – It is short for time2stream2context_shape, which describes the shape of the log_probs used to generate the lattice. It is used to generate arc_map_token and to make the whole decoding process differentiable.

Return type

Fsa

Returns

Return the lattice Fsa with all the attributes propagated. The returned Fsa has 3 axes with fsa.dim0==self.num_streams.

get_contexts

RnntDecodingStreams.get_contexts()[source]

This function must be called prior to evaluating the joiner network for a particular frame. It tells the calling code for which contexts it must evaluate the joiner network.

Return type

Tuple[RaggedShape, Tensor]

Returns

Return a two-element tuple containing a RaggedShape and a tensor.

shape:

A RaggedShape with 2 axes, representing [stream][context].

contexts:

A tensor of shape [tot_contexts][decoder_history_len], where tot_contexts == shape.TotSize(1) and decoder_history_len comes from the config; it is the number of symbols in the context of the decoder network (assumed to be finite). The tensor contains token ids into the vocabulary (i.e., 0 <= value < vocab_size). Its dtype is torch.int32.

terminate_and_flush_to_streams

RnntDecodingStreams.terminate_and_flush_to_streams()[source]

Terminate the decoding process of the current RnntDecodingStreams object. It will update the decoding states and store the decoding results obtained so far into each of the individual streams.

Note

We can not decode with this object anymore after calling terminate_and_flush_to_streams().

Return type

None
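Putting the pieces together, the expected calling sequence is get_contexts() followed by advance() for each frame, then terminate_and_flush_to_streams() and finally format_output(). The following is a minimal, self-contained sketch (all sizes are illustrative, and random log-probs stand in for a real joiner network):

>>> import torch
>>> import k2
>>> vocab_size = 10
>>> config = k2.RnntDecodingConfig(vocab_size=vocab_size, decoder_history_len=2,
...                                beam=5.0, max_states=32, max_contexts=4)
>>> graph = k2.trivial_graph(vocab_size - 1)
>>> streams = k2.RnntDecodingStreams(
...     [k2.RnntDecodingStream(graph), k2.RnntDecodingStream(graph)], config)
>>> num_frames = [4, 4]
>>> for t in range(max(num_frames)):
...     shape, contexts = streams.get_contexts()
...     # A real system would evaluate the joiner network on `contexts` here;
...     # random log-probs serve as a stand-in.
...     logprobs = torch.randn(contexts.shape[0], vocab_size).log_softmax(dim=-1)
...     streams.advance(logprobs)
>>> streams.terminate_and_flush_to_streams()
>>> lattice = streams.format_output(num_frames, allow_partial=True)  # an FsaVec with one lattice per stream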

SymbolTable

add

SymbolTable.add(symbol, index=None)[source]

Add a new symbol to the SymbolTable.

Parameters
  • symbol (~Symbol) – The symbol to be added.

  • index (Optional[int]) – An optional int id to which the symbol should be assigned. If the given id is not available (i.e., already taken), a ValueError will be raised.

Return type

int

Returns

The int id to which the symbol has been assigned.
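For example (a small sketch; from_str is documented below):

>>> import k2
>>> sym = k2.SymbolTable.from_str('<eps> 0\na 1\nb 2')
>>> sym.add('c')
3
>>> sym.add('d', index=10)
10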

from_file

static SymbolTable.from_file(filename)[source]

Build a symbol table from file.

Every line in the symbol table file has two fields separated by space(s), tab(s) or both. The following is an example file:

<eps> 0
a 1
b 2
c 3
Parameters

filename (str) – Name of the symbol table file. Its format is documented above.

Return type

SymbolTable

Returns

An instance of SymbolTable.

from_str

static SymbolTable.from_str(s)[source]

Build a symbol table from a string.

The string consists of lines. Every line has two fields separated by space(s), tab(s) or both. The first field is the symbol and the second the integer id of the symbol.

Parameters

s (str) – The input string with the format described above.

Return type

SymbolTable

Returns

An instance of SymbolTable.

get

SymbolTable.get(k)[source]

Get a symbol for an id or get an id for a symbol.

Parameters

k (Union[int, ~Symbol]) – If it is an id, it tries to find the symbol corresponding to the id; if it is a symbol, it tries to find the id corresponding to the symbol.

Return type

Union[~Symbol, int]

Returns

An id or a symbol depending on the given k.
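For example:

>>> import k2
>>> sym = k2.SymbolTable.from_str('<eps> 0\na 1\nb 2')
>>> sym.get('a')
1
>>> sym.get(2)
'b'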

merge

SymbolTable.merge(other)[source]

Create a union of two SymbolTables. Raises an AssertionError if the same IDs are occupied by different symbols.

Parameters

other (SymbolTable) – A symbol table to merge with self.

Return type

SymbolTable

Returns

A new symbol table.
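For example (both tables map id 0 to the same symbol, so there is no conflict):

>>> import k2
>>> a = k2.SymbolTable.from_str('<eps> 0\na 1')
>>> b = k2.SymbolTable.from_str('<eps> 0\nb 2')
>>> c = a.merge(b)
>>> c.get('b')
2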

to_file

SymbolTable.to_file(filename)[source]

Serialize the SymbolTable to a file.

Every line in the symbol table file has two fields separated by space(s), tab(s) or both. The following is an example file:

<eps> 0
a 1
b 2
c 3
Parameters

filename (str) – Name of the symbol table file. Its format is documented above.

ids

SymbolTable.ids

Returns a list of integer IDs corresponding to the symbols.

Return type

List[int]

symbols

SymbolTable.symbols

Returns a list of symbols (e.g., strings) corresponding to the integer IDs.

Return type

List[~Symbol]
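For example (a small sketch; both properties are assumed to return sorted lists):

>>> import k2
>>> sym = k2.SymbolTable.from_str('<eps> 0\na 1\nb 2')
>>> sym.ids
[0, 1, 2]
>>> sym.symbols
['<eps>', 'a', 'b']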

k2.ragged

cat

k2.ragged.cat(srcs: List[_k2.ragged.RaggedTensor], axis: int) _k2.ragged.RaggedTensor

Concatenate a list of ragged tensors along a specified axis.

Example 1

>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1], [], [2, 3]])
>>> k2r.cat([a, a], axis=0)
RaggedTensor([[1],
              [],
              [2, 3],
              [1],
              [],
              [2, 3]], dtype=torch.int32)
>>> k2r.cat((a, a), axis=1)
RaggedTensor([[1, 1],
              [],
              [2, 3, 2, 3]], dtype=torch.int32)

Example 2

>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1, 3], [], [5, 8], [], [9]])
>>> b = k2r.RaggedTensor([[0], [1, 8], [], [-1], [10]])
>>> c = k2r.cat([a, b], axis=0)
>>> c
RaggedTensor([[1, 3],
              [],
              [5, 8],
              [],
              [9],
              [0],
              [1, 8],
              [],
              [-1],
              [10]], dtype=torch.int32)
>>> c.num_axes
2
>>> d = k2r.cat([a, b], axis=1)
>>> d
RaggedTensor([[1, 3, 0],
              [1, 8],
              [5, 8],
              [-1],
              [9, 10]], dtype=torch.int32)
>>> d.num_axes
2
>>> k2r.RaggedTensor.cat([a, b], axis=1)
RaggedTensor([[1, 3, 0],
              [1, 8],
              [5, 8],
              [-1],
              [9, 10]], dtype=torch.int32)
>>> k2r.cat((b, a), axis=0)
RaggedTensor([[0],
              [1, 8],
              [],
              [-1],
              [10],
              [1, 3],
              [],
              [5, 8],
              [],
              [9]], dtype=torch.int32)
Parameters
  • srcs – A list (or a tuple) of ragged tensors to concatenate. They MUST all have the same dtype and be on the same device.

  • axis – Only 0 and 1 are supported right now. If it is 1, then srcs[i].dim0 must all have the same value.

Returns

Return a concatenated tensor.

create_ragged_shape2

k2.ragged.create_ragged_shape2(row_splits: Optional[torch.Tensor] = None, row_ids: Optional[torch.Tensor] = None, cached_tot_size: int = -1) _k2.ragged.RaggedShape

Construct a RaggedShape from row_ids and/or row_splits vectors. For the overall concepts, please see comments in k2/csrc/utils.h.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[[x x] [x]]')
>>> k2r.create_ragged_shape2(shape.row_splits(1), shape.row_ids(1))
[ [ x x ] [ x ] ]
Parameters
  • row_splits – Optional. A 1-D torch.Tensor with dtype torch.int32. If None, you have to specify row_ids.

  • row_ids – Optional. A 1-D torch.Tensor with dtype torch.int32. If None, you have to specify row_splits.

  • cached_tot_size – The number of elements (the length of row_ids, even if row_ids is not provided); it is identical to the last element of row_splits, but providing it can avoid a GPU-to-CPU transfer.

Returns

An instance of RaggedShape, with ans.num_axes == 2.
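It is also possible to supply only one of the two vectors; e.g., building a shape from row_splits alone:

>>> import torch
>>> import k2.ragged as k2r
>>> row_splits = torch.tensor([0, 2, 2, 5], dtype=torch.int32)
>>> k2r.create_ragged_shape2(row_splits=row_splits)
[ [ x x ] [ ] [ x x x ] ]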

create_ragged_tensor

k2.ragged.create_ragged_tensor(*args, **kwargs)

Overloaded function.

  1. create_ragged_tensor(data: list, dtype: object = None, device: object = ‘cpu’) -> _k2.ragged.RaggedTensor

Create a ragged tensor with an arbitrary number of axes.

Note

A ragged tensor has at least two axes.

Hint

The returned tensor is on CPU.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.create_ragged_tensor([ [1, 2], [5], [], [9] ])
>>> a
RaggedTensor([[1, 2],
              [5],
              [],
              [9]], dtype=torch.int32)
>>> a.dtype
torch.int32
>>> b = k2r.create_ragged_tensor([ [1, 3.0], [] ])
>>> b
RaggedTensor([[1, 3],
              []], dtype=torch.float32)
>>> b.dtype
torch.float32
>>> c = k2r.create_ragged_tensor([ [1] ], dtype=torch.float64)
>>> c.dtype
torch.float64
>>> d = k2r.create_ragged_tensor([ [[1], [2, 3]], [[4], []] ])
>>> d
RaggedTensor([[[1],
               [2, 3]],
              [[4],
               []]], dtype=torch.int32)
>>> d.num_axes
3
>>> e = k2r.create_ragged_tensor([])
>>> e
RaggedTensor([], dtype=torch.int32)
>>> e.num_axes
2
>>> e.shape.row_splits(1)
tensor([0], dtype=torch.int32)
>>> e.shape.row_ids(1)
tensor([], dtype=torch.int32)
>>> f = k2r.create_ragged_tensor([ [1, 2], [], [3] ], device=torch.device('cuda', 0))
>>> f
RaggedTensor([[1, 2],
              [],
              [3]], device='cuda:0', dtype=torch.int32)
>>> e = k2r.create_ragged_tensor([[1], []], device='cuda:1')
>>> e
RaggedTensor([[1],
              []], device='cuda:1', dtype=torch.int32)
Parameters
  • data – A list of sublist(s) of integers or real numbers. It can have an arbitrary number of axes (at least two).

  • dtype – Optional. If None, it infers the dtype from data automatically, which is either torch.int32 or torch.float32. Supported dtypes are: torch.int32, torch.float32, and torch.float64.

  • device – It can be either an instance of torch.device or a string representing a torch device. Example values are: "cpu", "cuda:0", torch.device("cpu"), torch.device("cuda", 0).

Returns

Return a ragged tensor.

  2. create_ragged_tensor(data: list, dtype: object = None, device: str = ‘cpu’) -> _k2.ragged.RaggedTensor

Create a ragged tensor with an arbitrary number of axes.

Note

A ragged tensor has at least two axes.

Hint

The returned tensor is on CPU.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.create_ragged_tensor([ [1, 2], [5], [], [9] ])
>>> a
RaggedTensor([[1, 2],
              [5],
              [],
              [9]], dtype=torch.int32)
>>> a.dtype
torch.int32
>>> b = k2r.create_ragged_tensor([ [1, 3.0], [] ])
>>> b
RaggedTensor([[1, 3],
              []], dtype=torch.float32)
>>> b.dtype
torch.float32
>>> c = k2r.create_ragged_tensor([ [1] ], dtype=torch.float64)
>>> c.dtype
torch.float64
>>> d = k2r.create_ragged_tensor([ [[1], [2, 3]], [[4], []] ])
>>> d
RaggedTensor([[[1],
               [2, 3]],
              [[4],
               []]], dtype=torch.int32)
>>> d.num_axes
3
>>> e = k2r.create_ragged_tensor([])
>>> e
RaggedTensor([], dtype=torch.int32)
>>> e.num_axes
2
>>> e.shape.row_splits(1)
tensor([0], dtype=torch.int32)
>>> e.shape.row_ids(1)
tensor([], dtype=torch.int32)
>>> f = k2r.create_ragged_tensor([ [1, 2], [], [3] ], device=torch.device('cuda', 0))
>>> f
RaggedTensor([[1, 2],
              [],
              [3]], device='cuda:0', dtype=torch.int32)
>>> e = k2r.create_ragged_tensor([[1], []], device='cuda:1')
>>> e
RaggedTensor([[1],
              []], device='cuda:1', dtype=torch.int32)
Parameters
  • data – A list of sublist(s) of integers or real numbers. It can have an arbitrary number of axes (at least two).

  • dtype – Optional. If None, it infers the dtype from data automatically, which is either torch.int32 or torch.float32. Supported dtypes are: torch.int32, torch.float32, and torch.float64.

  • device – It can be either an instance of torch.device or a string representing a torch device. Example values are: "cpu", "cuda:0", torch.device("cpu"), torch.device("cuda", 0).

Returns

Return a ragged tensor.

  3. create_ragged_tensor(s: str, dtype: object = None, device: object = ‘cpu’) -> _k2.ragged.RaggedTensor

Create a ragged tensor from its string representation.

Fields are separated by space(s) or comma(s).

An example string for a 2-axis ragged tensor is given below:

[ [1] [2] [3, 4], [5 6 7, 8] ]

An example string for a 3-axis ragged tensor is given below:

[ [[1]] [[]] ]
>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.create_ragged_tensor('[ [1] [] [3 4] ]')
>>> a
RaggedTensor([[1],
              [],
              [3, 4]], dtype=torch.int32)
>>> a.num_axes
2
>>> a.dtype
torch.int32
>>> b = k2r.create_ragged_tensor('[ [[] [3]]  [[10]] ]', dtype=torch.float32)
>>> b
[ [ [ ] [ 3 ] ] [ [ 10 ] ] ]
>>> b.dtype
torch.float32
>>> b.num_axes
3
>>> c = k2r.create_ragged_tensor('[[1.]]')
>>> c.dtype
torch.float32

Note

The number of spaces or commas in s does not affect the result. Of course, numbers have to be separated by at least one space or comma.

Parameters
  • s – A string representation of a ragged tensor.

  • dtype – The desired dtype of the tensor. If it is None, it tries to infer the correct dtype from s, which is assumed to be either torch.int32 or torch.float32. Supported dtypes are: torch.int32, torch.float32, and torch.float64.

  • device – It can be either an instance of torch.device or a string representing a torch device. Example values are: "cpu", "cuda:0", torch.device("cpu"), torch.device("cuda", 0).

Returns

Return a ragged tensor.

  4. create_ragged_tensor(s: str, dtype: object = None, device: str = ‘cpu’) -> _k2.ragged.RaggedTensor

Create a ragged tensor from its string representation.

Fields are separated by space(s) or comma(s).

An example string for a 2-axis ragged tensor is given below:

[ [1] [2] [3, 4], [5 6 7, 8] ]

An example string for a 3-axis ragged tensor is given below:

[ [[1]] [[]] ]
>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.create_ragged_tensor('[ [1] [] [3 4] ]')
>>> a
RaggedTensor([[1],
              [],
              [3, 4]], dtype=torch.int32)
>>> a.num_axes
2
>>> a.dtype
torch.int32
>>> b = k2r.create_ragged_tensor('[ [[] [3]]  [[10]] ]', dtype=torch.float32)
>>> b
[ [ [ ] [ 3 ] ] [ [ 10 ] ] ]
>>> b.dtype
torch.float32
>>> b.num_axes
3
>>> c = k2r.create_ragged_tensor('[[1.]]')
>>> c.dtype
torch.float32

Note

The number of spaces or commas in s does not affect the result. Of course, numbers have to be separated by at least one space or comma.

Parameters
  • s – A string representation of a ragged tensor.

  • dtype – The desired dtype of the tensor. If it is None, it tries to infer the correct dtype from s, which is assumed to be either torch.int32 or torch.float32. Supported dtypes are: torch.int32, torch.float32, and torch.float64.

  • device – It can be either an instance of torch.device or a string representing a torch device. Example values are: "cpu", "cuda:0", torch.device("cpu"), torch.device("cuda", 0).

Returns

Return a ragged tensor.

  5. create_ragged_tensor(tensor: torch.Tensor) -> _k2.ragged.RaggedTensor

Create a ragged tensor from a torch tensor.

Note

It turns a regular tensor into a ragged tensor.

Caution

The input tensor has to have more than 1 dimension. That is, tensor.ndim > 1.

Also, if the input tensor is contiguous, the returned ragged tensor will share the underlying memory with it. Otherwise, the memory of the input tensor is copied.

Supported dtypes of the input tensor are: torch.int32, torch.float32, and torch.float64.

Example 1:

>>> import torch
>>> import k2.ragged as k2r
>>> a = torch.arange(6, dtype=torch.int32).reshape(2, 3)
>>> b = k2r.create_ragged_tensor(a)
>>> a
tensor([[0, 1, 2],
        [3, 4, 5]], dtype=torch.int32)
>>> b
RaggedTensor([[0, 1, 2],
              [3, 4, 5]], dtype=torch.int32)
>>> b.dtype
torch.int32
>>> a.is_contiguous()
True
>>> a[0, 0] = 10
>>> b
RaggedTensor([[10, 1, 2],
              [3, 4, 5]], dtype=torch.int32)
>>> b.values[1] = -2
>>> a
tensor([[10, -2,  2],
        [ 3,  4,  5]], dtype=torch.int32)

Example 2:

>>> import k2.ragged as k2r
>>> a = torch.arange(24, dtype=torch.int32).reshape(2, 12)[:, ::4]
>>> a
tensor([[ 0,  4,  8],
        [12, 16, 20]], dtype=torch.int32)
>>> a.is_contiguous()
False
>>> b = k2r.create_ragged_tensor(a)
>>> b
RaggedTensor([[0, 4, 8],
              [12, 16, 20]], dtype=torch.int32)
>>> b.dtype
torch.int32
>>> a[0, 0] = 10
>>> b
RaggedTensor([[0, 4, 8],
              [12, 16, 20]], dtype=torch.int32)
>>> a
tensor([[10,  4,  8],
        [12, 16, 20]], dtype=torch.int32)

Example 3:

>>> import torch
>>> import k2.ragged as k2r
>>> a = torch.arange(24, dtype=torch.float32).reshape(2, 3, 4)
>>> a
tensor([[[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]],
        [[12., 13., 14., 15.],
         [16., 17., 18., 19.],
         [20., 21., 22., 23.]]])
>>> b = k2r.create_ragged_tensor(a)
>>> b
RaggedTensor([[[0, 1, 2, 3],
               [4, 5, 6, 7],
               [8, 9, 10, 11]],
              [[12, 13, 14, 15],
               [16, 17, 18, 19],
               [20, 21, 22, 23]]], dtype=torch.float32)
Parameters

tensor – An N-D (N > 1) tensor.

Returns

Return a ragged tensor.

index

k2.ragged.index(src: torch.Tensor, indexes: _k2.ragged.RaggedTensor, default_value: object = None) _k2.ragged.RaggedTensor

Use a ragged tensor to index a 1-d torch tensor.

>>> import torch
>>> import k2.ragged as k2r
>>> i = k2r.RaggedTensor([ [1, 5, 3], [0, 2] ])
>>> src = torch.arange(6, dtype=torch.int32) * 10
>>> src
tensor([ 0, 10, 20, 30, 40, 50], dtype=torch.int32)
>>> k2r.index(src, i)
RaggedTensor([[10, 50, 30],
              [0, 20]], dtype=torch.int32)
>>> k = k2r.RaggedTensor([ [[1, 5, 3], [0]], [[0, 2], [1, 3]] ])
>>> k2r.index(src, k)
RaggedTensor([[[10, 50, 30],
               [0]],
              [[0, 20],
               [10, 30]]], dtype=torch.int32)
>>> n = k2r.RaggedTensor([ [1, -1], [-1, 0], [-1] ])
>>> k2r.index(src, n)
RaggedTensor([[10, 0],
              [0, 0],
              [0]], dtype=torch.int32)
>>> k2r.index(src, n, default_value=-2)
RaggedTensor([[10, -2],
              [-2, 0],
              [-2]], dtype=torch.int32)
Parameters
  • src – A 1-D torch tensor.

  • indexes – A ragged tensor with dtype torch.int32.

  • default_value – Used only when an entry in indexes is -1, in which case default_value is returned, as -1 is not a valid index. If default_value is None and an entry in indexes is -1, 0 is returned.

Returns

Return a ragged tensor with the same dtype and device as src.

index_and_sum

k2.ragged.index_and_sum(src: torch.Tensor, indexes: _k2.ragged.RaggedTensor) torch.Tensor

Index a 1-D tensor with a ragged tensor of indexes, perform a sum-per-sublist operation, and return the resulting 1-D tensor.

>>> import torch
>>> import k2.ragged as k2r
>>> i = k2r.RaggedTensor([[1, 3, 5], [0, 2, 3]])
>>> src = torch.arange(6, dtype=torch.float32) * 10
>>> src
tensor([ 0., 10., 20., 30., 40., 50.])
>>> k2r.index_and_sum(src, i)
tensor([90., 50.])
>>> k = k2r.RaggedTensor([[1, -1, 2], [-1], [2, 5, -1]])
>>> k2r.index_and_sum(src, k)
tensor([30.,  0., 70.])
Parameters
  • src – A 1-D tensor.

  • indexes – A ragged tensor with two axes. Its dtype MUST be torch.int32. For instance, it can be the arc map returned from the function remove_epsilon. If an index is -1, it contributes 0 to the sum.

Returns

Return a 1-D tensor with the same dtype and device as src.

random_ragged_shape

k2.ragged.random_ragged_shape(set_row_ids: bool = False, min_num_axes: int = 2, max_num_axes: int = 4, min_num_elements: int = 0, max_num_elements: int = 2000) _k2.ragged.RaggedShape

Create a random RaggedShape whose number of axes lies in [min_num_axes, max_num_axes] and whose number of elements lies in [min_num_elements, max_num_elements]; mainly useful for testing. (It wraps the C++ function RandomRaggedShape.)
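For example (a sketch; the shape varies from run to run, but with min_num_axes == max_num_axes == 2 the number of axes is fixed, and the bounds are assumed inclusive):

>>> import k2.ragged as k2r
>>> shape = k2r.random_ragged_shape(min_num_axes=2, max_num_axes=2,
...                                 min_num_elements=0, max_num_elements=10)
>>> shape.num_axes
2
>>> 0 <= shape.numel() <= 10
True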

regular_ragged_shape

k2.ragged.regular_ragged_shape(dim0: int, dim1: int) _k2.ragged.RaggedShape

Create a ragged shape with 2 axes that has a regular structure.

>>> import k2.ragged as k2r
>>> shape1 = k2r.RaggedShape.regular_ragged_shape(dim0=2, dim1=3)
>>> shape1
[ [ x x x ] [ x x x ] ]
>>> shape2 = k2r.regular_ragged_shape(dim0=3, dim1=2)
>>> shape2
[ [ x x ] [ x x ] [ x x ] ]
Parameters
  • dim0 – Number of entries at axis 0.

  • dim1 – Number of entries in each sublist at axis 1.

Returns

Return a ragged shape on CPU.

RaggedShape

__eq__

RaggedShape.__eq__(self: _k2.ragged.RaggedShape, other: _k2.ragged.RaggedShape) bool

Return True if two shapes are equal. Otherwise, return False.

Caution

The two shapes have to be on the same device. Otherwise, it throws an exception.

>>> import k2.ragged as k2r
>>> shape1 = k2r.RaggedShape('[ [] [x] ]')
>>> shape2 = k2r.RaggedShape('[ [x] [x] ]')
>>> shape3 = k2r.RaggedShape('[ [x] [x] ]')
>>> shape1 == shape2
False
>>> shape3 == shape2
True
Parameters

other – The shape that we want to compare with self.

Returns

Return True if the two shapes are the same. Return False otherwise.

__getitem__

RaggedShape.__getitem__(self: _k2.ragged.RaggedShape, i: int) _k2.ragged.RaggedShape

Select the i-th sublist along axis 0.

Note

It requires that this shape has at least 3 axes.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [[x] [x x]] [[x x x] [] [x x]] ]')
>>> shape[0]
[ [ x ] [ x x ] ]
>>> shape[1]
[ [ x x x ] [ ] [ x x ] ]
Parameters

i – The i-th sublist along axis 0.

Returns

Return a new ragged shape with one fewer axis.

__init__

RaggedShape.__init__(self: _k2.ragged.RaggedShape, s: str) None

Construct a ragged shape from a string.

An example string for a ragged shape with 2 axes is:

[ [x x] [ ] [x] ]

An example string for a ragged shape with 3 axes is:

[ [[x] []] [[x] [x x]] ]
>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [x] [] [x x] ]')
>>> shape
[ [ x ] [ ] [ x x ] ]
>>> shape2 = k2r.RaggedShape('[ [[x] [] [x x]] [[]] ]')
>>> shape2
[ [ [ x ] [ ] [ x x ] ] [ [ ] ] ]

__ne__

RaggedShape.__ne__(self: _k2.ragged.RaggedShape, other: _k2.ragged.RaggedShape) bool

Return True if two shapes are not equal. Otherwise, return False.

Caution

The two shapes have to be on the same device. Otherwise, it throws an exception.

>>> import k2.ragged as k2r
>>> shape1 = k2r.RaggedShape('[ [] [x] ]')
>>> shape2 = k2r.RaggedShape('[ [x] [x] ]')
>>> shape3 = k2r.RaggedShape('[ [x] [x] ]')
>>> shape1 != shape2
True
>>> shape2 != shape3
False
Parameters

other – The shape that we want to compare with self.

Returns

Return True if the two shapes are not equal. Return False otherwise.

__repr__

RaggedShape.__repr__(self: _k2.ragged.RaggedShape) str

Return a string representation of this shape.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [x] [] [x x ] ]')
>>> print(shape)
[ [ x ] [ ] [ x x ] ]
>>> shape
[ [ x ] [ ] [ x x ] ]

__str__

RaggedShape.__str__(self: _k2.ragged.RaggedShape) str

Return a string representation of this shape.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [x] [] [x x ] ]')
>>> print(shape)
[ [ x ] [ ] [ x x ] ]
>>> shape
[ [ x ] [ ] [ x x ] ]

compose

RaggedShape.compose(self: _k2.ragged.RaggedShape, other: _k2.ragged.RaggedShape) _k2.ragged.RaggedShape

Compose self with a given shape.

Caution

other and self MUST be on the same device.

Hint

In order to compose self with other, it has to satisfy self.tot_size(self.num_axes - 1) == other.dim0

Example 1:

>>> import k2.ragged as k2r
>>> shape1 = k2r.RaggedShape('[ [x x] [x] ]')
>>> shape2 = k2r.RaggedShape('[ [x x x] [x x] [] ]')
>>> shape1.compose(shape2)
[ [ [ x x x ] [ x x ] ] [ [ ] ] ]

Example 2:

>>> import k2.ragged as k2r
>>> shape1 = k2r.RaggedShape('[ [[x x] [x x x] []] [[x] [x x x x]] ]')
>>> shape2 = k2r.RaggedShape('[ [x] [x x x] [] [] [x x] [x] [] [x x x x] [] [x x] ]')
>>> shape1.compose(shape2)
[ [ [ [ x ] [ x x x ] ] [ [ ] [ ] [ x x ] ] [ ] ] [ [ [ x ] ] [ [ ] [ x x x x ] [ ] [ x x ] ] ] ]
>>> shape1.tot_size(shape1.num_axes - 1)
10
>>> shape2.dim0
10
Parameters

other – The other shape that is to be composed with self.

Returns

Return a composed ragged shape.

get_layer

RaggedShape.get_layer(self: _k2.ragged.RaggedShape, arg0: int) _k2.ragged.RaggedShape

Returns a sub-shape of self.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [[x x] [x] []] [[] [x x x] [x]] [[]] ]')
>>> shape.get_layer(0)
[ [ x x x ] [ x x x ] [ x ] ]
>>> shape.get_layer(1)
[ [ x x ] [ x ] [ ] [ ] [ x x x ] [ x ] [ ] ]
Parameters

layer – Layer that is desired, from 0 .. src.num_axes - 2 (inclusive).

Returns

This returned shape will have num_axes == 2, the minimal case of a RaggedShape.

index

RaggedShape.index(self: _k2.ragged.RaggedShape, axis: int, indexes: torch.Tensor, need_value_indexes: bool = True) Tuple[_k2.ragged.RaggedShape, Optional[torch.Tensor]]

Indexing operation on a ragged shape: returns self[indexes], where elements of indexes are interpreted as indexes into axis axis of self.

Caution

indexes is a 1-D tensor and indexes.dtype == torch.int32.

Example 1:

>>> import torch
>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [x x] [x] [x x x] ]')
>>> value = torch.arange(6, dtype=torch.float32) * 10
>>> ragged = k2r.RaggedTensor(shape, value)
>>> ragged
[ [ 0 10 ] [ 20 ] [ 30 40 50 ] ]
>>> i = torch.tensor([0, 2, 1], dtype=torch.int32)
>>> sub_shape, value_indexes = shape.index(axis=0, indexes=i, need_value_indexes=True)
>>> sub_shape
[ [ x x ] [ x x x ] [ x ] ]
>>> value_indexes
tensor([0, 1, 3, 4, 5, 2], dtype=torch.int32)
>>> ragged.data[value_indexes.long()]
tensor([ 0., 10., 30., 40., 50., 20.])
>>> k = torch.tensor([0, -1, 1, 0, 2, -1], dtype=torch.int32)
>>> sub_shape2, value_indexes2 = shape.index(axis=0, indexes=k, need_value_indexes=True)
>>> sub_shape2
[ [ x x ] [ ] [ x ] [ x x ] [ x x x ] [ ] ]
>>> value_indexes2
tensor([0, 1, 2, 0, 1, 3, 4, 5], dtype=torch.int32)

Example 2:

>>> import torch
>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [[x x] [x]] [[] [x x x] [x]] [[x] [] [] [x x]] ]')
>>> i = torch.tensor([0, 1, 3, 5, 7, 8], dtype=torch.int32)
>>> shape.index(axis=1, indexes=i)
([ [ [ x x ] [ x ] ] [ [ x x x ] ] [ [ x ] [ ] [ x x ] ] ], tensor([0, 1, 2, 3, 4, 5, 7, 8, 9], dtype=torch.int32))
Parameters
  • axis – The axis to be indexed. Must satisfy 0 <= axis < self.num_axes.

  • indexes – Array of indexes, which will be interpreted as indexes into axis axis of self, i.e. with 0 <= indexes[i] < self.tot_size(axis). Note that if axis is 0, then -1 is also a valid entry in indexes, in which case an empty list is returned.

  • need_value_indexes

    If True, it will return a torch.Tensor containing the indexes into ragged_tensor.data that ans.data has, as in ans.data = ragged_tensor.data[value_indexes], where ragged_tensor uses self as its shape.

    Caution

    It is currently not allowed to change the order on axes less than axis, i.e. if axis > 0, we require: IsMonotonic(self.row_ids(axis)[indexes]).

Returns

Return a tuple containing the indexed ragged shape and, if need_value_indexes is True, a 1-D tensor of value indexes; otherwise the second element is None.

max_size

RaggedShape.max_size(self: _k2.ragged.RaggedShape, axis: int) int

Return the maximum number of elements of any sublist at the given axis.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [] [x] [x x] [x x x] [x x x x] ]')
>>> shape.max_size(1)
4
>>> shape = k2r.RaggedShape('[ [[x x] [x] [] [] []] [[x]] [[x x x x]] ]')
>>> shape.max_size(1)
5
>>> shape.max_size(2)
4
Parameters

axis – The axis whose maximum sublist size is to be computed.

Caution

axis has to be greater than 0.

Returns

Return the maximum number of elements of sublists at the given axis.

numel

RaggedShape.numel(self: _k2.ragged.RaggedShape) int

Return the number of elements in this shape.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [x] [] [x x x x x]]')
>>> shape.numel()
6
>>> shape2 = k2r.RaggedShape('[ [[x x] [x] [] [] []] [[x]] [[x x x x]] ]')
>>> shape2.numel()
8
>>> shape3 = k2r.RaggedShape('[ [x x x] [x] ]')
>>> shape3.numel()
4
Returns

Return the number of elements in this shape.

Hint

It’s the number of x’s.

regular_ragged_shape

static RaggedShape.regular_ragged_shape(dim0: int, dim1: int) _k2.ragged.RaggedShape

Create a ragged shape with 2 axes that has a regular structure.

>>> import k2.ragged as k2r
>>> shape1 = k2r.RaggedShape.regular_ragged_shape(dim0=2, dim1=3)
>>> shape1
[ [ x x x ] [ x x x ] ]
>>> shape2 = k2r.regular_ragged_shape(dim0=3, dim1=2)
>>> shape2
[ [ x x ] [ x x ] [ x x ] ]
Parameters
  • dim0 – Number of entries at axis 0.

  • dim1 – Number of entries in each sublist at axis 1.

Returns

Return a ragged shape on CPU.

remove_axis

RaggedShape.remove_axis(self: _k2.ragged.RaggedShape, axis: int) _k2.ragged.RaggedShape

Remove a certain axis.

Caution

self.num_axes MUST be greater than 2.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [[x] [] [x x]] [[x x x] [x x x x]] [[] [] []]]')
>>> shape.remove_axis(0)
[ [ x ] [ ] [ x x ] [ x x x ] [ x x x x ] [ ] [ ] [ ] ]
>>> shape.remove_axis(1)
[ [ x x x ] [ x x x x x x x ] [ ] ]
Parameters

axis – The axis to be removed.

Returns

Return a ragged shape with one fewer axis.

row_ids

RaggedShape.row_ids(self: _k2.ragged.RaggedShape, axis: int) torch.Tensor

Return the row ids of a certain axis.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [x x] [] [x x x] ]')
>>> shape.row_ids(1)
tensor([0, 0, 2, 2, 2], dtype=torch.int32)
>>> shape2 = k2r.RaggedShape('[ [[x] [] [x x]] [[x x x] [x] [x x x x] [] []] ]')
>>> shape2.row_ids(1)
tensor([0, 0, 0, 1, 1, 1, 1, 1], dtype=torch.int32)
>>> shape2.row_ids(2)
tensor([0, 2, 2, 3, 3, 3, 4, 5, 5, 5, 5], dtype=torch.int32)
Parameters

axis – The axis whose row ids are to be returned.

Hint

axis >= 1.

Returns

Return the row ids of the given axis.

row_splits

RaggedShape.row_splits(self: _k2.ragged.RaggedShape, axis: int) torch.Tensor

Return the row splits of a certain axis.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [x x] [] [x x x] ]')
>>> shape.row_splits(1)
tensor([0, 2, 2, 5], dtype=torch.int32)
>>> shape2 = k2r.RaggedShape('[ [[x] [] [x x]] [[x x x] [x] [x x x x] [] []] ]')
>>> shape2.row_splits(1)
tensor([0, 3, 8], dtype=torch.int32)
>>> shape2.row_splits(2)
tensor([ 0,  1,  1,  3,  6,  7, 11, 11, 11], dtype=torch.int32)
Parameters

axis – The axis whose row splits are to be returned.

Hint

axis >= 1.

Returns

Return the row splits of the given axis.

to

RaggedShape.to(self: _k2.ragged.RaggedShape, device: object) _k2.ragged.RaggedShape

Move this shape to the specified device.

Hint

If the shape is already on the specified device, the returned shape shares the underlying memory with self.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[[x]]')
>>> shape.device
device(type='cpu')
>>> import torch
>>> shape2 = shape.to(torch.device('cuda', 0))
>>> shape2.device
device(type='cuda', index=0)
>>> shape
[ [ x ] ]
>>> shape2
[ [ x ] ]
Parameters

device – An instance of torch.device. It can be either a CPU device or a CUDA device.

Returns

Return a shape on the given device.

tot_size

RaggedShape.tot_size(self: _k2.ragged.RaggedShape, axis: int) int

Return the number of elements at a certain axis.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [x] [x x] [x x x] []]')
>>> shape.tot_size(1)
6
>>> shape.numel()
6
>>> shape2 = k2r.RaggedShape('[ [[x]] [[x x]] [[x x x]] [[]] [[]] [[]] [[]] ]')
>>> shape2.tot_size(1)
7
>>> shape2 = k2r.RaggedShape('[ [[x]] [[x x]] [[x x x]] [[]] [[]] [[]] [[] []] ]')
>>> shape2.tot_size(1)
8
>>> shape2.tot_size(2)
6
>>> shape2.numel()
6
Parameters

axis – Return the number of elements for this axis.

Returns

Return the number of elements at axis.

tot_sizes

RaggedShape.tot_sizes(self: _k2.ragged.RaggedShape) tuple

Return total sizes of every axis in a tuple.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [x] [ ] [x x x x]]')
>>> shape.dim0
3
>>> shape.tot_size(1)
5
>>> shape.tot_sizes()
(3, 5)
>>> shape2 = k2r.RaggedShape('[ [[x] []] [[x x x x]]]')
>>> shape2.dim0
2
>>> shape2.tot_size(1)
3
>>> shape2.tot_size(2)
5
>>> shape2.tot_sizes()
(2, 3, 5)
Returns

Return a tuple containing the total sizes of each axis. ans[i] is the total size of axis i (for i > 0). For i=0, it is the dim0 of this shape.

device

RaggedShape.device

Return the device of this shape.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[[]]')
>>> shape.device
device(type='cpu')
>>> import torch
>>> shape2 = shape.to(torch.device('cuda', 0))
>>> shape2.device
device(type='cuda', index=0)

dim0

RaggedShape.dim0

Return number of sublists at axis 0.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [x] [] [x x x x x]]')
>>> shape.dim0
3
>>> shape2 = k2r.RaggedShape('[ [[x] []] [[]] [[x] [x x] [x x x]] [[]]]')
>>> shape2.dim0
4

num_axes

RaggedShape.num_axes

Return the number of axes of this shape.

>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[[] []]')
>>> shape.num_axes
2
>>> shape2 = k2r.RaggedShape('[ [[]] [[]]]')
>>> shape2.num_axes
3

RaggedTensor

__eq__

RaggedTensor.__eq__(self: _k2.ragged.RaggedTensor, other: _k2.ragged.RaggedTensor) bool

Compare two ragged tensors.

Caution

The two tensors MUST have the same dtype. Otherwise, it throws an exception.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1]])
>>> b = a.clone()
>>> a == b
True
>>> c = a.to(torch.float32)
>>> try:
...   c == b
... except RuntimeError:
...   print("raised exception")
raised exception
Parameters

other – The tensor to be compared.

Returns

Return True if the two tensors are equal. Return False otherwise.

__getitem__

RaggedTensor.__getitem__(*args, **kwargs)

Overloaded function.

  1. __getitem__(self: _k2.ragged.RaggedTensor, i: int) -> object

Select the i-th sublist along axis 0.

Caution

Support for autograd is to be implemented.

Example 1:

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor('[ [[1 3] [] [9]]  [[8]] ]')
>>> a
RaggedTensor([[[1, 3],
               [],
               [9]],
              [[8]]], dtype=torch.int32)
>>> a[0]
RaggedTensor([[1, 3],
              [],
              [9]], dtype=torch.int32)
>>> a[1]
RaggedTensor([[8]], dtype=torch.int32)

Example 2:

>>> a = k2r.RaggedTensor('[ [1 3] [9] [8] ]')
>>> a
RaggedTensor([[1, 3],
              [9],
              [8]], dtype=torch.int32)
>>> a[0]
tensor([1, 3], dtype=torch.int32)
>>> a[1]
tensor([9], dtype=torch.int32)
Parameters

i – The i-th sublist along axis 0.

Returns

Return a new ragged tensor with one fewer axis. If num_axes == 2, the return value will be a 1D tensor.

  2. __getitem__(self: _k2.ragged.RaggedTensor, key: slice) -> _k2.ragged.RaggedTensor

Slice sublists along axis 0 within the given range. Only a slicing step of 1 is supported.

Caution

Support for autograd is to be implemented.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor('[ [[1 3] [] [9]]  [[8]] [[10 11]] ]')
>>> a
RaggedTensor([[[1, 3],
               [],
               [9]],
              [[8]],
              [[10, 11]]], dtype=torch.int32)
>>> a[0:2]
RaggedTensor([[[1, 3],
               [],
               [9]],
              [[8]]], dtype=torch.int32)
>>> a[1:2]
RaggedTensor([[[8]]], dtype=torch.int32)
Parameters

key – Slice containing integer constants.

Returns

Return a new ragged tensor with the same number of axes as the original ragged tensor, containing only the sublists within the given range.

  3. __getitem__(self: _k2.ragged.RaggedTensor, key: torch.Tensor) -> _k2.ragged.RaggedTensor

Slice a ragged tensor along axis 0 using a 1-D torch.int32 tensor.

Example 1:

>>> import k2
>>> a = k2.RaggedTensor([[1, 2, 0], [0, 1], [2, 3]])
>>> b = k2.RaggedTensor([[10, 20], [300], [-10, 0, -1], [-2, 4, 5]])
>>> a[0]
tensor([1, 2, 0], dtype=torch.int32)
>>> b[a[0]]
RaggedTensor([[300],
              [-10, 0, -1],
              [10, 20]], dtype=torch.int32)
>>> a[1]
tensor([0, 1], dtype=torch.int32)
>>> b[a[1]]
RaggedTensor([[10, 20],
              [300]], dtype=torch.int32)
>>> a[2]
tensor([2, 3], dtype=torch.int32)
>>> b[a[2]]
RaggedTensor([[-10, 0, -1],
              [-2, 4, 5]], dtype=torch.int32)

Example 2:

>>> import torch
>>> import k2
>>> a = k2.RaggedTensor([ [[1], [2, 3], [0]], [[], [2]], [[10, 20]] ])
>>> i = torch.tensor([0, 2, 1, 0], dtype=torch.int32)
>>> a[i]
RaggedTensor([[[1],
               [2, 3],
               [0]],
              [[10, 20]],
              [[],
               [2]],
              [[1],
               [2, 3],
               [0]]], dtype=torch.int32)
Parameters

key – A 1-D torch.int32 tensor containing the indexes to select along axis 0.

Returns

Return a new ragged tensor with the same number of axes as self but only contains the specified sublists.

__getstate__

RaggedTensor.__getstate__(self: k2.RaggedTensor) → tuple

Requires a tensor with 2 or 3 axes; other numbers of axes are not implemented yet.

This method exists to support pickling, e.g., it is used by torch.save(). You are not expected to call it yourself.

Returns

If this tensor has 2 axes, return a tuple containing (self.row_splits(1), "row_ids1", self.values). If this tensor has 3 axes, return a tuple containing (self.row_splits(1), "row_ids1", self.row_splits(2), "row_ids2", self.values).

Note

"row_ids1" and "row_ids2" in the returned value are for backward compatibility.
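
Although you would not call __getstate__ directly, it is what makes a ragged tensor picklable. A minimal round-trip sketch (recent versions of torch.load may additionally require weights_only=False for custom classes):

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1, 2], [3]])
>>> torch.save(a, 'ragged.pt')
>>> b = torch.load('ragged.pt')
>>> b == a
True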

__init__

RaggedTensor.__init__(*args, **kwargs)

Overloaded function.

  1. __init__(self: _k2.ragged.RaggedTensor, data: list, dtype: object = None, device: object = 'cpu') -> None

Create a ragged tensor with an arbitrary number of axes.

Note

A ragged tensor has at least two axes.

Example 1:

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([ [1, 2], [5], [], [9] ])
>>> a
RaggedTensor([[1, 2],
              [5],
              [],
              [9]], dtype=torch.int32)
>>> a.dtype
torch.int32
>>> b = k2r.RaggedTensor([ [1, 3.0], [] ])
>>> b
RaggedTensor([[1, 3],
              []], dtype=torch.float32)
>>> b.dtype
torch.float32
>>> c = k2r.RaggedTensor([ [1] ], dtype=torch.float64)
>>> c
RaggedTensor([[1]], dtype=torch.float64)
>>> c.dtype
torch.float64
>>> d = k2r.RaggedTensor([ [[1], [2, 3]], [[4], []] ])
>>> d
RaggedTensor([[[1],
               [2, 3]],
              [[4],
               []]], dtype=torch.int32)
>>> d.num_axes
3
>>> e = k2r.RaggedTensor([])
>>> e
RaggedTensor([], dtype=torch.int32)
>>> e.num_axes
2
>>> e.shape.row_splits(1)
tensor([0], dtype=torch.int32)
>>> e.shape.row_ids(1)
tensor([], dtype=torch.int32)

Example 2:

>>> k2r.RaggedTensor([ [[1, 2]], [], [[]] ])
RaggedTensor([[[1, 2]],
              [],
              [[]]], dtype=torch.int32)
>>> k2r.RaggedTensor([ [[1, 2]], [], [[]] ], device='cuda:0')
RaggedTensor([[[1, 2]],
              [],
              [[]]], device='cuda:0', dtype=torch.int32)
Parameters
  • data – A list of sublists of integers or real numbers. It can have an arbitrary number of axes (at least two).

  • dtype – Optional. If None, it infers the dtype from data automatically, which is either torch.int32 or torch.float32. Supported dtypes are: torch.int32, torch.float32, and torch.float64.

  • device – It can be either an instance of torch.device or a string representing a torch device. Example values are: "cpu", "cuda:0", torch.device("cpu"), torch.device("cuda", 0).

  2. __init__(self: _k2.ragged.RaggedTensor, data: list, dtype: object = None, device: str = 'cpu') -> None

Create a ragged tensor with an arbitrary number of axes.

This overload is identical to overload 1 above, except that device is given as a string (e.g., "cpu", "cuda:0") rather than as a torch.device.

  3. __init__(self: _k2.ragged.RaggedTensor, s: str, dtype: object = None, device: object = 'cpu') -> None

Create a ragged tensor from its string representation.

Fields are separated by space(s) or comma(s).

An example string for a 2-axis ragged tensor is given below:

[ [1] [2] [3, 4], [5 6 7, 8] ]

An example string for a 3-axis ragged tensor is given below:

[ [[1]] [[]] ]
>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor('[ [1] [] [3 4] ]')
>>> a
RaggedTensor([[1],
              [],
              [3, 4]], dtype=torch.int32)
>>> a.num_axes
2
>>> a.dtype
torch.int32
>>> b = k2r.RaggedTensor('[ [[] [3]]  [[10]] ]', dtype=torch.float32)
>>> b
RaggedTensor([[[],
               [3]],
              [[10]]], dtype=torch.float32)
>>> b.dtype
torch.float32
>>> b.num_axes
3
>>> c = k2r.RaggedTensor('[[1.]]')
>>> c.dtype
torch.float32
>>> d = k2r.RaggedTensor('[[1.]]', device='cuda:0')
>>> d
RaggedTensor([[1]], device='cuda:0', dtype=torch.float32)

Note

The number of spaces or commas in s does not affect the result, as long as numbers are separated by at least one space or comma.

Parameters
  • s – A string representation of a ragged tensor.

  • dtype – The desired dtype of the tensor. If it is None, it tries to infer the correct dtype from s, which is assumed to be either torch.int32 or torch.float32. Supported dtypes are: torch.int32, torch.float32, and torch.float64.

  • device – It can be either an instance of torch.device or a string representing a torch device. Example values are: "cpu", "cuda:0", torch.device("cpu"), torch.device("cuda", 0).

  4. __init__(self: _k2.ragged.RaggedTensor, s: str, dtype: object = None, device: str = 'cpu') -> None

Create a ragged tensor from its string representation.

This overload is identical to overload 3 above, except that device is given as a string (e.g., "cpu", "cuda:0") rather than as a torch.device.

  5. __init__(self: _k2.ragged.RaggedTensor, shape: _k2.ragged.RaggedShape, value: torch.Tensor) -> None

Create a ragged tensor from a shape and a value.

>>> import torch
>>> import k2.ragged as k2r
>>> shape = k2r.RaggedShape('[ [x x] [] [x x x] ]')
>>> value = torch.tensor([10, 0, 20, 30, 40], dtype=torch.float32)
>>> ragged = k2r.RaggedTensor(shape, value)
>>> ragged
RaggedTensor([[10, 0],
              [],
              [20, 30, 40]], dtype=torch.float32)
Parameters
  • shape – The shape of the tensor.

  • value – The value of the tensor.
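
Presumably value must supply exactly one entry for each placeholder x in shape, i.e., value.numel() == shape.numel(); a quick check of that assumption with the objects above:

>>> shape.numel()
5
>>> value.numel()
5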

  6. __init__(self: _k2.ragged.RaggedTensor, tensor: torch.Tensor) -> None

Create a ragged tensor from a torch tensor.

Note

It turns a regular tensor into a ragged tensor.

Caution

The input tensor must have more than one dimension, i.e., tensor.ndim > 1.

Also, if the input tensor is contiguous, self will share the underlying memory with it; otherwise, the memory of the input tensor is copied to create self.

Supported dtypes of the input tensor are: torch.int32, torch.float32, and torch.float64.

Example 1:

>>> import torch
>>> import k2.ragged as k2r
>>> a = torch.arange(6, dtype=torch.int32).reshape(2, 3)
>>> b = k2r.RaggedTensor(a)
>>> a
tensor([[0, 1, 2],
        [3, 4, 5]], dtype=torch.int32)
>>> b
RaggedTensor([[0, 1, 2],
              [3, 4, 5]], dtype=torch.int32)
>>> a.is_contiguous()
True
>>> a[0, 0] = 10
>>> b
RaggedTensor([[10, 1, 2],
              [3, 4, 5]], dtype=torch.int32)
>>> b.values[1] = -2
>>> a
tensor([[10, -2,  2],
        [ 3,  4,  5]], dtype=torch.int32)

Example 2:

>>> import k2.ragged as k2r
>>> a = torch.arange(24, dtype=torch.int32).reshape(2, 12)[:, ::4]
>>> a
tensor([[ 0,  4,  8],
        [12, 16, 20]], dtype=torch.int32)
>>> a.is_contiguous()
False
>>> b = k2r.RaggedTensor(a)
>>> b
RaggedTensor([[0, 4, 8],
              [12, 16, 20]], dtype=torch.int32)
>>> a[0, 0] = 10
>>> b
RaggedTensor([[0, 4, 8],
              [12, 16, 20]], dtype=torch.int32)
>>> a
tensor([[10,  4,  8],
        [12, 16, 20]], dtype=torch.int32)

Example 3:

>>> import torch
>>> import k2.ragged as k2r
>>> a = torch.arange(24, dtype=torch.float32).reshape(2, 3, 4)
>>> a
tensor([[[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]],
        [[12., 13., 14., 15.],
         [16., 17., 18., 19.],
         [20., 21., 22., 23.]]])
>>> b = k2r.RaggedTensor(a)
>>> b
RaggedTensor([[[0, 1, 2, 3],
               [4, 5, 6, 7],
               [8, 9, 10, 11]],
              [[12, 13, 14, 15],
               [16, 17, 18, 19],
               [20, 21, 22, 23]]], dtype=torch.float32)
>>> b.dtype
torch.float32
>>> c = torch.tensor([[1, 2]], device='cuda:0', dtype=torch.float32)
>>> k2r.RaggedTensor(c)
RaggedTensor([[1, 2]], device='cuda:0', dtype=torch.float32)
Parameters

tensor – An N-D (N > 1) tensor.

__ne__

RaggedTensor.__ne__(self: _k2.ragged.RaggedTensor, other: _k2.ragged.RaggedTensor) → bool

Compare two ragged tensors.

Caution

The two tensors MUST have the same dtype. Otherwise, it throws an exception.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1, 2], [3]])
>>> b = a.clone()
>>> b != a
False
>>> c = k2r.RaggedTensor([[1], [2], [3]])
>>> c != a
True
Parameters

other – The tensor to be compared.

Returns

Return True if the two tensors are NOT equal. Return False otherwise.

__repr__

RaggedTensor.__repr__(self: _k2.ragged.RaggedTensor) → str

Return a string representation of this tensor.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1], [2, 3], []])
>>> a
RaggedTensor([[1],
              [2, 3],
              []], dtype=torch.int32)
>>> str(a)
'RaggedTensor([[1],\n              [2, 3],\n              []], dtype=torch.int32)'
>>> b = k2r.RaggedTensor([[1, 2]], device='cuda:0')
>>> b
RaggedTensor([[1, 2]], device='cuda:0', dtype=torch.int32)

__setstate__

RaggedTensor.__setstate__(self: k2.RaggedTensor, arg0: tuple) → None

Set the content of this class from arg0.

This method exists to support pickling, e.g., it is used by torch.load(). You are not expected to call it yourself.

Parameters

arg0 – It is the return value from the method __getstate__.

__str__

RaggedTensor.__str__(self: _k2.ragged.RaggedTensor) → str

Return a string representation of this tensor.

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([[1], [2, 3], []])
>>> a
RaggedTensor([[1],
              [2, 3],
              []], dtype=torch.int32)
>>> str(a)
'RaggedTensor([[1],\n              [2, 3],\n              []], dtype=torch.int32)'
>>> b = k2r.RaggedTensor([[1, 2]], device='cuda:0')
>>> b
RaggedTensor([[1, 2]], device='cuda:0', dtype=torch.int32)

add

RaggedTensor.add(self: _k2.ragged.RaggedTensor, value: torch.Tensor, alpha: object) → _k2.ragged.RaggedTensor

Add value, scaled by alpha, to the source ragged tensor over the last axis.

It implements:

dest[…][i][j] = src[…][i][j] + alpha * value[i]

>>> import k2.ragged as k2r
>>> import torch
>>> src = k2r.RaggedTensor([[1, 3], [1], [2, 8]], dtype=torch.int32)
>>> value = torch.tensor([1, 2, 3], dtype=torch.int32)
>>> src.add(value, 1)
RaggedTensor([[2, 4],
              [3],
              [5, 11]], dtype=torch.int32)
>>> src.add(value, -1)
RaggedTensor([[0, 2],
              [-1],
              [-1, 5]], dtype=torch.int32)
Parameters
  • value – The value to be added to self. Its dimension MUST equal the number of sublists along the last axis of self.

  • alpha – The number used to scale value before adding it to self.

Returns

Return a new RaggedTensor sharing the same dtype and device as self.
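
One more sketch following the same formula, with a different scale factor:

>>> src.add(value, 2)
RaggedTensor([[3, 5],
              [5],
              [8, 14]], dtype=torch.int32)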

arange

RaggedTensor.arange(self: _k2.ragged.RaggedTensor, axis: int, begin: int, end: int) → _k2.ragged.RaggedTensor

Return a sub-range of self containing indexes begin through end - 1 along the given axis of self.

The axis argument may be confusing; its behavior is equivalent to:

for i in range(axis):
  self = self.remove_axis(0)

return self.arange(0, begin, end)
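
A hedged sanity check of this equivalence, assuming RaggedTensor.remove_axis() behaves as documented elsewhere in this API:

>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor('[ [[1] [] [2]] [[] [4 5] []] ]')
>>> a.arange(axis=1, begin=0, end=4) == a.remove_axis(0).arange(axis=0, begin=0, end=4)
True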

Caution

The returned tensor shares the underlying memory with self.

Example 1

>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor([ [[1], [], [2]], [[], [4, 5], []], [[], [1]], [[]] ])
>>> a
RaggedTensor([[[1],
               [],
               [2]],
              [[],
               [4, 5],
               []],
              [[],
               [1]],
              [[]]], dtype=torch.int32)
>>> a.num_axes
3
>>> b = a.arange(axis=0, begin=1, end=3)
>>> b
RaggedTensor([[[],
               [4, 5],
               []],
              [[],
               [1]]], dtype=torch.int32)
>>> b.num_axes
3
>>> c = a.arange(axis=0, begin=1, end=2)
>>> c
RaggedTensor([[[],
               [4, 5],
               []]], dtype=torch.int32)
>>> c.num_axes
3
>>> d = a.arange(axis=1, begin=0, end=4)
>>> d
RaggedTensor([[1],
              [],
              [2],
              []], dtype=torch.int32)
>>> d.num_axes
2
>>> e = a.arange(axis=1, begin=2, end=5)
>>> e
RaggedTensor([[2],
              [],
              [4, 5]], dtype=torch.int32)
>>> e.num_axes
2

Example 2

>>> a = k2r.RaggedTensor([ [[[], [1], [2, 3]],[[5, 8], [], [9]]], [[[10], [0], []]], [[[], [], [1]]] ])
>>> a.num_axes
4
>>> b = a.arange(axis=0, begin=0, end=2)
>>> b
RaggedTensor([[[[],
                [1],
                [2, 3]],
               [[5, 8],
                [],
                [9]]],
              [[[10],
                [0],
                []]]], dtype=torch.int32)
>>> b.num_axes
4
>>> c = a.arange(axis=1, begin=1, end=3)
>>> c
RaggedTensor([[[5, 8],
               [],
               [9]],
              [[10],
               [0],
               []]], dtype=torch.int32)
>>> c.num_axes
3
>>> d = a.arange(axis=2, begin=0, end=5)
>>> d
RaggedTensor([[],
              [1],
              [2, 3],
              [5, 8],
              []], dtype=torch.int32)
>>> d.num_axes
2

Example 3

>>> a = k2r.RaggedTensor([[0], [1], [2], [], [3]])
>>> a
RaggedTensor([[0],
              [1],
              [2],
              [],
              [3]], dtype=torch.int32)
>>> a.num_axes
2
>>> b = a.arange(axis=0, begin=1, end=4)
>>> b
RaggedTensor([[1],
              [2],
              []], dtype=torch.int32)
>>> b.values[0] = -1
>>> a
RaggedTensor([[0],
              [-1],
              [2],
              [],
              [3]], dtype=torch.int32)
Parameters
  • axis – The axis to which begin and end correspond.

  • begin – The beginning of the range (inclusive).