Loads the IMDB dataset.
tf.keras.datasets.imdb.load_data( path='imdb.npz', num_words=None, skip_top=0, maxlen=None, seed=113, start_char=1, oov_char=2, index_from=3, **kwargs )
Args:
  path: where to cache the data (relative to the default Keras cache directory).
  num_words: max number of words to include. Words are ranked by how often they occur (in the training set) and only the most frequent words are kept.
  skip_top: skip the top N most frequently occurring words (which may not be informative).
  maxlen: sequences longer than this will be filtered out.
  seed: random seed for sample shuffling.
  start_char: the start of a sequence will be marked with this character. Set to 1 because 0 is usually the padding character.
  oov_char: words that were cut out because of the num_words or skip_top limits will be replaced with this character.
  index_from: index actual words with this index and higher.
  **kwargs: used for backwards compatibility.

Returns:
  Tuple of Numpy arrays: (x_train, y_train), (x_test, y_test).

Raises:
  ValueError: in case maxlen is so low that no input sequence could be kept.
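Because of the reserved indices (0 for padding, start_char, oov_char) and the index_from offset, mapping a returned sequence back to words requires shifting the word ranks. A minimal sketch of that decoding step, using the default start_char=1, oov_char=2, index_from=3; the tiny word_index dict here is hypothetical, standing in for the one returned by tf.keras.datasets.imdb.get_word_index():

```python
# Reserved ids under the default load_data settings:
# 0 = padding, 1 = start_char, 2 = oov_char; real words begin at index_from = 3.
INDEX_FROM = 3
START_CHAR = 1
OOV_CHAR = 2

def decode_review(sequence, word_index,
                  index_from=INDEX_FROM, start_char=START_CHAR, oov_char=OOV_CHAR):
    """Map a sequence of integer ids back to tokens.

    word_index maps word -> rank (1-based); ids in the returned
    sequences are rank + index_from, so we invert with that shift.
    """
    inverted = {rank + index_from: word for word, rank in word_index.items()}
    tokens = []
    for idx in sequence:
        if idx == start_char:
            tokens.append("<START>")
        elif idx == oov_char:
            tokens.append("<OOV>")
        else:
            tokens.append(inverted.get(idx, "<OOV>"))
    return " ".join(tokens)

# Hypothetical tiny word index, for illustration only.
word_index = {"the": 1, "movie": 2, "great": 3}
print(decode_review([1, 4, 5, 2, 6], word_index))
# -> <START> the movie <OOV> great
```

The same shift in reverse (rank + index_from) is needed if you want to feed your own indexed text to a model trained on this dataset.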
Note that the 'out of vocabulary' character is only used for words that were present in the training set but were excluded because they did not make the num_words cut. Words that were not seen in the training set but appear in the test set have simply been skipped.
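The replace-with-oov_char rule can be mimicked locally. This is a simplified sketch (not the library's internal code) that applies the same windowing to an already-indexed toy sequence: ids below skip_top or at or above num_words become oov_char, everything else passes through unchanged:

```python
def limit_vocab(sequence, num_words, skip_top=0, oov_char=2):
    """Replace ids outside the [skip_top, num_words) window with oov_char.

    Mirrors the behaviour described above: frequent-but-skipped words and
    words beyond the num_words cutoff both collapse to the same oov id.
    """
    return [w if skip_top <= w < num_words else oov_char for w in sequence]

# Toy sequence: ids 1, 4, 9, 5 fall inside the window; 20 does not.
print(limit_vocab([1, 4, 9, 5, 20], num_words=10))
# -> [1, 4, 9, 5, 2]
```

Test-set words never seen in training have no id at all, so (as noted above) they are dropped outright rather than mapped to oov_char.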
© 2020 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.