TensorFlow 2.3



One-hot encodes a text into a list of word indexes of size n.

This function takes a string of text as input and returns a list of integers, one for each word (or token) in the input string.

input_text Input text (string).
n int. Size of vocabulary.
filters list (or concatenation) of characters to filter out, such as punctuation. Default: !"#$%&()*+,-./:;<=>?@[\]^_`{|}~\t\n, which includes basic punctuation, tabs, and newlines.
lower boolean. Whether to set the text to lowercase.
split str. Separator for word splitting.
Returns a list of integers in [1, n]. Each integer encodes a word (uniqueness is not guaranteed: different words may be assigned the same integer).
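
The behavior described above can be illustrated with a minimal pure-Python sketch. It is not the TensorFlow implementation: the name one_hot_sketch, the use of MD5 as a stable hash, and the choice to reserve index 0 (so indices fall in 1..n-1, inside the documented range) are all assumptions made for illustration.

```python
import hashlib

# Default filter set as listed in the argument description above.
DEFAULT_FILTERS = '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n'


def one_hot_sketch(input_text, n, filters=DEFAULT_FILTERS, lower=True, split=" "):
    """Hypothetical stand-in: hash each word of input_text into 1..n-1."""
    if n < 2:
        raise ValueError("n must be at least 2")
    if lower:
        input_text = input_text.lower()
    # Replace every filtered character with the separator, then split
    # and drop the empty strings left behind by repeated separators.
    table = str.maketrans({ch: split for ch in filters})
    words = [w for w in input_text.translate(table).split(split) if w]

    def stable_hash(word):
        # Assumption: MD5 is used only so results repeat across runs.
        return int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16)

    # Index 0 is left unused in this sketch.
    return [stable_hash(w) % (n - 1) + 1 for w in words]


# The same word always maps to the same index.
encoded = one_hot_sketch("The cat sat on the mat.", n=50)
print(encoded)

# With n=4 only three indices are available, so six distinct words
# must share indices: uniqueness is not guaranteed.
small = one_hot_sketch("one two three four five six", n=4)
print(small)
```

Because the indices come from hashing rather than from a fitted vocabulary, a larger n lowers the chance that two different words collide but never removes it.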

© 2020 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.