
statsmodels.nonparametric.kernel_density.EstimatorSettings

class statsmodels.nonparametric.kernel_density.EstimatorSettings(efficient=False, randomize=False, n_res=25, n_sub=50, return_median=True, return_only_bw=False, n_jobs=-1)

Object to specify settings for density estimation or regression.

EstimatorSettings has several properties that control how bandwidth estimation behaves for the KDEMultivariate, KDEMultivariateConditional, KernelReg and CensoredKernelReg classes.

Parameters:

efficient (bool, optional) – If True, the bandwidth is estimated efficiently, by taking smaller sub-samples and estimating the scaling factor of each sub-sample. This is useful for large samples (nobs >> 300) and/or multiple variables (k_vars > 3). If False (default), all data is used at the same time.
randomize (bool, optional) – If True, the bandwidth is estimated from n_res random resamples (with replacement) of size n_sub drawn from the full sample. If False (default), the full sample is sliced into sub-samples of size n_sub so that all samples are used once.
n_sub (int, optional) – Size of the sub-samples. Default is 50.
n_res (int, optional) – The number of random resamples used to estimate the bandwidth. Only has an effect if randomize == True. Default is 25.
return_median (bool, optional) – If True (default), the estimator uses the median of the scaling factors of all sub-samples to estimate the bandwidth of the full sample. If False, the estimator uses the mean.
return_only_bw (bool, optional) – If True, the estimator uses the bandwidth of each sub-sample and not the scaling factor. This is not theoretically justified and should be used only for experimenting.
n_jobs (int, optional) – The number of jobs to use for parallel estimation with joblib.Parallel. Default is -1, which joblib interprets as using all available CPU cores (n_cores - 1 corresponds to -2). See the joblib documentation for more details.
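
As a rough illustration of the sub-sampling options described above, the following NumPy sketch builds the sub-samples both ways and combines per-sub-sample scaling factors with the median or the mean. It is not the statsmodels implementation: make_subsamples, rule_of_thumb_bw and efficient_bw are hypothetical helpers, the rule-of-thumb bandwidth is only a stand-in estimator, and keeping a trailing block shorter than n_sub is an assumption of this sketch.

```python
import numpy as np


def make_subsamples(x, n_sub=50, randomize=False, n_res=25, seed=0):
    """Return a list of sub-samples of a 1-D array (illustrative only)."""
    rng = np.random.default_rng(seed)
    if randomize:
        # n_res random resamples of size n_sub, drawn with replacement
        return [rng.choice(x, size=n_sub, replace=True) for _ in range(n_res)]
    # Consecutive, non-overlapping slices so every observation is used once.
    # A trailing remainder shorter than n_sub is kept as its own block
    # (an assumption of this sketch).
    return [x[i:i + n_sub] for i in range(0, len(x), n_sub)]


def rule_of_thumb_bw(x):
    """Hypothetical stand-in for the per-sub-sample bandwidth estimator."""
    return 1.06 * x.std(ddof=1) * len(x) ** (-1 / 5)


def efficient_bw(x, n_sub=50, randomize=False, n_res=25, return_median=True):
    """Estimate a scaling factor per sub-sample, then rescale to the full sample."""
    scales = []
    for sub in make_subsamples(x, n_sub, randomize, n_res):
        # Scaling factor: sub-sample bandwidth with spread and size divided out
        scales.append(rule_of_thumb_bw(sub) / (sub.std(ddof=1) * len(sub) ** (-1 / 5)))
    combine = np.median if return_median else np.mean
    return combine(scales) * x.std(ddof=1) * len(x) ** (-1 / 5)


x = np.random.default_rng(1).normal(size=230)
blocks = make_subsamples(x, n_sub=50)
print([len(b) for b in blocks])  # [50, 50, 50, 50, 30]
print(len(make_subsamples(x, n_sub=50, randomize=True, n_res=25)))  # 25
print(efficient_bw(x), rule_of_thumb_bw(x))
```

Because the stand-in estimator yields the same scaling factor (1.06) on every sub-sample, efficient_bw(x) agrees with rule_of_thumb_bw(x) up to rounding. With a data-driven estimator the factors vary between sub-samples, which is where the choice between median and mean matters.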

Examples

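>>> from statsmodels.nonparametric.kernel_density import EstimatorSettings, KDEMultivariate
>>> # data and var_type are assumed to be defined already
>>> settings = EstimatorSettings(randomize=True, n_jobs=3)
>>> k_dens = KDEMultivariate(data, var_type, defaults=settings)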
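
The snippet above leaves data and var_type undefined. A self-contained variant might look like the sketch below. The synthetic data, the choice of bw="cv_ml" and the specific setting values are assumptions made for illustration. The sketch also sets efficient=True, on the understanding that the sub-sampling options (randomize, n_res, n_sub) only come into play when efficient estimation is switched on.

```python
import numpy as np
from statsmodels.nonparametric.kernel_density import (
    EstimatorSettings,
    KDEMultivariate,
)

rng = np.random.default_rng(0)
# Synthetic data (an assumption for illustration): 200 observations
# of two continuous variables.
data = rng.normal(size=(200, 2))
var_type = "cc"  # one 'c' per continuous variable

# efficient=True switches on sub-sampling; n_jobs=1 keeps the run serial.
settings = EstimatorSettings(
    efficient=True, randomize=True, n_res=5, n_sub=50, n_jobs=1
)
k_dens = KDEMultivariate(data, var_type, bw="cv_ml", defaults=settings)

print(k_dens.bw)  # one estimated bandwidth per variable
```

Smaller n_res and n_sub values keep the example quick; for real data the defaults (25 and 50) are a more sensible starting point.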

© 2009–2012 Statsmodels Developers
© 2006–2008 Scipy Developers
© 2006 Jonathan E. Taylor
Licensed under the 3-clause BSD License.
http://www.statsmodels.org/stable/generated/statsmodels.nonparametric.kernel_density.EstimatorSettings.html