Reads CSV files into a dataset.
Compat aliases for migration
See Migration guide for more details.
tf.data.experimental.make_csv_dataset(
    file_pattern, batch_size, column_names=None, column_defaults=None,
    label_name=None, select_columns=None, field_delim=',',
    use_quote_delim=True, na_value='', header=True, num_epochs=None,
    shuffle=True, shuffle_buffer_size=10000, shuffle_seed=None,
    prefetch_buffer_size=None, num_parallel_reads=None, sloppy=False,
    num_rows_for_inference=100, compression_type=None, ignore_errors=False
)
Reads CSV files into a dataset, where each element is a (features, labels) tuple that corresponds to a batch of CSV rows. The features dictionary maps feature column names to Tensors containing the corresponding feature data, and labels is a Tensor containing the batch's label data.
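The element structure described above can be pictured with a short plain-Python sketch. This is not how TensorFlow implements the function: `batch_csv_rows` is a hypothetical helper, the sample data is made up, and plain lists of strings stand in for Tensors.

```python
import csv
import io


def batch_csv_rows(csv_text, batch_size, label_name):
    """Group CSV rows into (features, labels) batches of plain Python lists.

    Hypothetical helper for illustration only; the real function returns
    a dataset whose values are Tensors, not lists.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    batches = []
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        # Column-oriented layout: one list of values per column name.
        features = {name: [row[name] for row in chunk] for name in reader.fieldnames}
        # The label column is split off from the features dictionary.
        labels = features.pop(label_name)
        batches.append((features, labels))
    return batches


SAMPLE_CSV = "age,fare,survived\n22,7.25,0\n38,71.28,1\n26,7.92,1\n"
batches = batch_csv_rows(SAMPLE_CSV, batch_size=2, label_name="survived")
first_features, first_labels = batches[0]
print(first_features)  # {'age': ['22', '38'], 'fare': ['7.25', '71.28']}
print(first_labels)    # ['0', '1']
```

Each batch is column-oriented: one entry per feature column, each holding the values for all rows in that batch, with the label column returned separately.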
| file_pattern | List of files or patterns of file paths containing CSV records. See |
| batch_size | An int representing the number of records to combine in a single batch. |
| column_names | An optional list of strings that corresponds to the CSV columns, in order. One per column of the input record. If this is not provided, the column names are inferred from the first row of the records. These names will be the keys of the features dict of each dataset element. |
| column_defaults | An optional list of default values for the CSV fields. One item per selected column of the input record. Each item in the list is either a valid CSV dtype (float32, float64, int32, int64, or string), or a |
| label_name | An optional string corresponding to the label column. If provided, the data for this column is returned as a separate |
| select_columns | An optional list of integer indices or string column names that specifies a subset of columns of CSV data to select. If column names are provided, these must correspond to names provided in |
| field_delim | An optional |
| use_quote_delim | An optional bool. Defaults to |
| na_value | Additional string to recognize as NA/NaN. |
| header | A bool that indicates whether the first rows of provided CSV files correspond to header lines with column names, and should not be included in the data. |
| num_epochs | An int specifying the number of times this dataset is repeated. If None, cycles through the dataset forever. |
| shuffle | A bool that indicates whether the input should be shuffled. |
| shuffle_buffer_size | Buffer size to use for shuffling. A large buffer size ensures better shuffling, but increases memory usage and startup time. |
| shuffle_seed | Randomization seed to use for shuffling. |
| prefetch_buffer_size | An int specifying the number of feature batches to prefetch for performance improvement. Recommended value is the number of batches consumed per training step. Defaults to auto-tune. |
| num_parallel_reads | Number of threads used to read CSV records from files. If >1, the results will be interleaved. Defaults to |
| sloppy | If |
| num_rows_for_inference | Number of rows of a file to use for type inference if column_defaults is not provided. If None, reads all the rows of all the files. Defaults to 100. |
| compression_type | (Optional.) A |
| ignore_errors | (Optional.) If |
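The trade-off noted for shuffle_buffer_size (better shuffling versus more memory and a longer startup) follows from how buffer-based shuffling works in general. The sketch below is a plain-Python illustration of that technique, not TensorFlow's actual implementation; `buffered_shuffle` is a hypothetical name.

```python
import random


def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in pseudo-random order using a fixed-size buffer.

    Illustrative only: an element can move at most buffer_size - 1
    positions earlier, so a small buffer gives a weak shuffle.
    """
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Startup cost: nothing is emitted until the buffer is full.
            buffer.append(item)
            continue
        # Emit a random buffered element and put the new item in its slot.
        index = rng.randrange(buffer_size)
        yield buffer[index]
        buffer[index] = item
    # Input exhausted: drain the remaining elements in random order.
    rng.shuffle(buffer)
    yield from buffer


rows = list(range(12))
print(list(buffered_shuffle(rows, buffer_size=1, seed=0)))   # buffer of 1: input order is unchanged
print(list(buffered_shuffle(rows, buffer_size=12, seed=0)))  # buffer holds everything: full shuffle
```

A buffer at least as large as the dataset gives a uniform shuffle, at the cost of holding every record in memory before the first one is emitted.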
| A dataset, where each element is a (features, labels) tuple that corresponds to a batch of |
| ||If any of the arguments is malformed.|
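A minimal usage sketch, assuming TensorFlow 2.x with eager execution. The file name and its contents are made up for illustration, and the file is written to a temporary directory so the example is self-contained.

```python
import os
import tempfile

import tensorflow as tf

# Write a small CSV file so the example is self-contained (made-up data).
csv_path = os.path.join(tempfile.mkdtemp(), "passengers.csv")
with open(csv_path, "w") as f:
    f.write("age,fare,survived\n22,7.25,0\n38,71.28,1\n26,7.92,1\n")

dataset = tf.data.experimental.make_csv_dataset(
    csv_path,
    batch_size=2,
    label_name="survived",  # returned separately from the features dict
    num_epochs=1,           # read the file once instead of cycling forever
    shuffle=False,          # keep file order so the output is predictable
)

# Each element is a (features, labels) tuple for one batch of rows.
for features, labels in dataset:
    print({name: tensor.numpy() for name, tensor in features.items()})
    print(labels.numpy())
```

Setting num_epochs=1 matters here: with the default of None the dataset repeats forever, so the loop would never terminate.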
© 2020 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.