Loader for the species distribution dataset from Phillips et al. (2006).
Read more in the User Guide.
data_home : str or path-like, default=None
Specify another download and cache folder for the datasets. By default, all scikit-learn data is stored in ‘~/scikit_learn_data’ subfolders.
download_if_missing : bool, default=True
If False, raise an OSError if the data is not locally available instead of trying to download it from the source site.
n_retries : int, default=3
Number of retries when HTTP errors are encountered.
Added in version 1.5.
delay : float, default=1.0
Number of seconds between retries.
Added in version 1.5.
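The caching and retry options can be combined into a cache-first loading pattern. The following is a minimal sketch, not part of scikit-learn: the helper `load_cache_first` is hypothetical, and it assumes the keyword names `data_home`, `download_if_missing`, `n_retries` and `delay` from the `fetch_species_distributions` signature (the last two require version 1.5 or later). The fetch function is passed in as an argument so that the logic can be exercised without network access.

```python
def load_cache_first(fetch, data_home=None, n_retries=5, delay=2.0):
    """Hypothetical helper: use the local cache if possible, else download.

    `fetch` is expected to behave like
    sklearn.datasets.fetch_species_distributions.
    """
    try:
        # With download_if_missing=False, a missing cache raises OSError
        # instead of triggering a download.
        return fetch(data_home=data_home, download_if_missing=False)
    except OSError:
        # Fall back to downloading, retrying on HTTP errors and waiting
        # `delay` seconds between attempts.
        return fetch(
            data_home=data_home,
            download_if_missing=True,
            n_retries=n_retries,
            delay=delay,
        )


# Usage (needs scikit-learn >= 1.5; may need network access on first call):
# from sklearn.datasets import fetch_species_distributions
# species = load_cache_first(fetch_species_distributions)
```

Trying the cache first keeps repeated runs offline and makes the download step explicit, at the cost of one extra call when the data is not yet cached.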
data : Bunch
Dictionary-like object, with the following attributes.
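coverages : array
These represent the 14 features measured at each point of the map grid. The latitude/longitude values for the grid are discussed below. Missing data is represented by the value -9999.
train : record array
The training points for the data. Each point has three fields:
train['species'] is the species name
train['dd long'] is the longitude, in degrees
train['dd lat'] is the latitude, in degrees
test : record array
The test points for the data. Same format as the training data.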
Nx, Ny : int
The number of longitudes (x) and latitudes (y) in the grid.
x_left_lower_corner, y_left_lower_corner : float
The (x, y) position of the lower-left corner, in degrees.
grid_size : float
The spacing between points of the grid, in degrees.
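The grid description above (point counts, lower-left corner, spacing) and the -9999 missing-value sentinel can be turned into coordinates and cleaned layers roughly as follows. This is a sketch under stated assumptions: the helper names `grid_coordinates` and `replace_missing` are hypothetical, the numbers are illustrative rather than read from the dataset, and the first grid point is assumed to lie exactly on the lower-left corner (other code may apply a different offset).

```python
MISSING = -9999  # sentinel used for missing coverage values


def grid_coordinates(nx, ny, x_corner, y_corner, grid_size):
    """Return (longitudes, latitudes) of the grid points, in degrees.

    Assumption: the first grid point lies exactly on the lower-left corner.
    """
    longitudes = [x_corner + i * grid_size for i in range(nx)]
    latitudes = [y_corner + j * grid_size for j in range(ny)]
    return longitudes, latitudes


def replace_missing(layer, fill=None):
    """Return a copy of one 2-D feature layer with -9999 replaced by `fill`."""
    return [[fill if value == MISSING else value for value in row] for row in layer]


# Illustrative values only; they are not read from the dataset.
lons, lats = grid_coordinates(nx=4, ny=3, x_corner=-94.8, y_corner=-56.05, grid_size=0.05)
layer = [[12, -9999, 7], [-9999, 3, 5]]
cleaned = replace_missing(layer)
print(len(lons), len(lats))
print(cleaned)
```

With the NumPy arrays returned by the loader, `numpy.ma.masked_equal(layer, -9999)` is a common alternative for masking the sentinel instead of replacing it.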
This dataset represents the geographic distribution of species. It is provided by Phillips et al. (2006).
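The two species are:
"Bradypus variegatus", the brown-throated sloth.
"Microryzomys minutus", also known as the forest small rice rat, a rodent that lives in Peru, Colombia, Ecuador, and Venezuela.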
>>> from sklearn.datasets import fetch_species_distributions
>>> species = fetch_species_distributions()
>>> species.train[:5]
array([(b'microryzomys_minutus', -64.7   , -17.85  ),
       (b'microryzomys_minutus', -67.8333, -16.3333),
       (b'microryzomys_minutus', -67.8833, -16.3   ),
       (b'microryzomys_minutus', -67.8   , -16.2667),
       (b'microryzomys_minutus', -67.9833, -15.9   )],
      dtype=[('species', 'S22'), ('dd long', '<f4'), ('dd lat', '<f4')])
For a more extended example, see the "Species distribution modeling" example.
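Because species names are stored as byte strings next to the longitude and latitude, a common first step is to group the points by species. The sketch below is illustrative only: the helper `points_by_species` is hypothetical, and the sample rows are copied from the output shown above rather than loaded from the dataset. It assumes each record unpacks into a (species, longitude, latitude) triple.

```python
from collections import defaultdict


def points_by_species(records):
    """Group (longitude, latitude) pairs by decoded species name."""
    grouped = defaultdict(list)
    for name, lon, lat in records:
        # Species names are byte strings (dtype 'S22'); decode them to str.
        key = name.decode("ascii") if isinstance(name, bytes) else str(name)
        grouped[key].append((float(lon), float(lat)))
    return dict(grouped)


# Rows copied from the example output above, as plain tuples.
sample = [
    (b"microryzomys_minutus", -64.7, -17.85),
    (b"microryzomys_minutus", -67.8333, -16.3333),
    (b"microryzomys_minutus", -67.8833, -16.3),
    (b"microryzomys_minutus", -67.8, -16.2667),
    (b"microryzomys_minutus", -67.9833, -15.9),
]

grouped = points_by_species(sample)
for species_name, points in grouped.items():
    print(species_name, len(points))
```

The same function is intended to accept the training or test records directly, on the assumption that they iterate as three-field records.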