class sklearn.linear_model.LogisticRegression(penalty='l2', dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1, class_weight=None, random_state=None, solver='warn', max_iter=100, multi_class='warn', verbose=0, warm_start=False, n_jobs=None)
Logistic Regression (aka logit, MaxEnt) classifier.
In the multiclass case, the training algorithm uses the one-vs-rest (OvR) scheme if the 'multi_class' option is set to 'ovr', and uses the cross-entropy loss if the 'multi_class' option is set to 'multinomial'. (Currently the 'multinomial' option is supported only by the 'lbfgs', 'sag' and 'newton-cg' solvers.)
This class implements regularized logistic regression using the 'liblinear' library, 'newton-cg', 'sag' and 'lbfgs' solvers. It can handle both dense and sparse input. Use C-ordered arrays or CSR matrices containing 64-bit floats for optimal performance; any other input format will be converted (and copied).
The 'newton-cg', 'sag', and 'lbfgs' solvers support only L2 regularization with primal formulation. The 'liblinear' solver supports both L1 and L2 regularization, with a dual formulation only for the L2 penalty.
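As a minimal sketch of these solver/penalty pairings (the dataset and hyperparameter values below are arbitrary illustrative choices, not recommendations):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=20, random_state=0)

# 'liblinear' is the only solver named above that accepts penalty='l1'.
l1_clf = LogisticRegression(penalty='l1', solver='liblinear').fit(X, y)

# 'newton-cg', 'sag' and 'lbfgs' accept only penalty='l2'.
l2_clf = LogisticRegression(penalty='l2', solver='lbfgs').fit(X, y)

print((l1_clf.coef_ == 0).sum(), "coefficients zeroed by the L1 penalty")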
Read more in the User Guide.
See also

SGDClassifier : incrementally trained logistic regression (when given the parameter loss="log").
LogisticRegressionCV : logistic regression with built-in cross-validation.
Notes

The underlying C implementation uses a random number generator to select features when fitting the model. It is thus not uncommon to have slightly different results for the same input data. If that happens, try with a smaller tol parameter.
Predict output may not match that of standalone liblinear in certain cases. See differences from liblinear in the narrative documentation.
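A brief sketch of the note above: pin random_state so liblinear's internal RNG is reproducible, and tighten tol if repeated fits on the same data still differ slightly (the tol value here is an arbitrary illustrative choice):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(random_state=0)
# Fixed seed for liblinear's feature selection; stricter stopping tolerance.
clf = LogisticRegression(solver='liblinear', random_state=0, tol=1e-6).fit(X, y)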
Examples

>>> from sklearn.datasets import load_iris
>>> from sklearn.linear_model import LogisticRegression
>>> X, y = load_iris(return_X_y=True)
>>> clf = LogisticRegression(random_state=0, solver='lbfgs',
...                          multi_class='multinomial').fit(X, y)
>>> clf.predict(X[:2, :])
array([0, 0])
>>> clf.predict_proba(X[:2, :])
array([[9.8...e-01, 1.8...e-02, 1.4...e-08],
       [9.7...e-01, 2.8...e-02, ...e-08]])
>>> clf.score(X, y)
0.97...
Methods

decision_function(X) | Predict confidence scores for samples.
densify() | Convert coefficient matrix to dense array format.
fit(X, y[, sample_weight]) | Fit the model according to the given training data.
get_params([deep]) | Get parameters for this estimator.
predict(X) | Predict class labels for samples in X.
predict_log_proba(X) | Log of probability estimates.
predict_proba(X) | Probability estimates.
score(X, y[, sample_weight]) | Returns the mean accuracy on the given test data and labels.
set_params(**params) | Set the parameters of this estimator.
sparsify() | Convert coefficient matrix to sparse format.
__init__(penalty='l2', dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1, class_weight=None, random_state=None, solver='warn', max_iter=100, multi_class='warn', verbose=0, warm_start=False, n_jobs=None)
decision_function(X)
Predict confidence scores for samples.
The confidence score for a sample is the signed distance of that sample to the hyperplane.
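A small sketch of the relationship described above, for a fitted binary model (the dataset is an arbitrary illustrative choice): decision_function(X) equals the raw affine score, which is proportional to the signed distance to the hyperplane (divide by the norm of coef_ for the actual distance).

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100, random_state=0)
clf = LogisticRegression(solver='lbfgs').fit(X, y)

# The confidence scores are the affine scores X @ w + b.
scores = clf.decision_function(X)
manual = (X @ clf.coef_.T + clf.intercept_).ravel()
assert np.allclose(scores, manual)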
densify()
Convert coefficient matrix to dense array format.
Converts the coef_ member (back) to a numpy.ndarray. This is the default format of coef_ and is required for fitting, so calling this method is only required on models that have previously been sparsified; otherwise, it is a no-op.
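A minimal sketch of the sparsify()/densify() round trip described above (the dataset is an arbitrary illustrative choice):

import numpy as np
from scipy import sparse
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(random_state=0)
clf = LogisticRegression(solver='liblinear').fit(X, y)

clf.sparsify()                    # coef_ becomes a scipy.sparse matrix
assert sparse.issparse(clf.coef_)
clf.densify()                     # back to the default ndarray format
assert isinstance(clf.coef_, np.ndarray)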
fit(X, y, sample_weight=None)
Fit the model according to the given training data.
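A brief sketch of fit() with the optional sample_weight argument; the weighting scheme below is an arbitrary illustrative choice:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100, random_state=0)

# Up-weight the second half of the samples (purely for illustration).
weights = np.ones(len(y))
weights[50:] = 2.0

clf = LogisticRegression(solver='lbfgs').fit(X, y, sample_weight=weights)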
get_params(deep=True)
Get parameters for this estimator.
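A one-line sketch: get_params() returns the constructor arguments as a dict, which is what tools such as GridSearchCV rely on.

from sklearn.linear_model import LogisticRegression

clf = LogisticRegression(C=0.5, solver='liblinear')
params = clf.get_params()
print(params['C'], params['solver'])   # 0.5 liblinear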
predict(X)
Predict class labels for samples in X.
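A small sketch of how predict() relates to the probability estimates: it returns the entry of classes_ with the highest estimated probability (iris is an arbitrary illustrative dataset):

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(solver='lbfgs', multi_class='multinomial',
                         random_state=0).fit(X, y)

# predict() is classes_[argmax] over the per-class estimates.
labels = clf.predict(X)
assert np.array_equal(labels,
                      clf.classes_[clf.predict_proba(X).argmax(axis=1)])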
predict_log_proba(X)
Log of probability estimates.
The returned estimates for all classes are ordered by class label.
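A one-assertion sketch of the relationship to predict_proba (iris is an arbitrary illustrative dataset):

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(solver='lbfgs', multi_class='multinomial').fit(X, y)

# predict_log_proba is the elementwise natural log of predict_proba.
assert np.allclose(clf.predict_log_proba(X), np.log(clf.predict_proba(X)))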
predict_proba(X)
Probability estimates.
The returned estimates for all classes are ordered by class label.
For a multi-class problem, if multi_class is set to "multinomial" the softmax function is used to find the predicted probability of each class. Otherwise a one-vs-rest approach is used, i.e. the probability of each class is calculated with the logistic function under the assumption that the class is positive, and these values are then normalized across all classes.
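A sketch of the multinomial case described above: the probabilities are the softmax of the decision scores (iris is an arbitrary illustrative dataset; subtracting the row maximum stabilizes the exponentials without changing the result):

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(solver='lbfgs', multi_class='multinomial').fit(X, y)

scores = clf.decision_function(X)                   # shape (n_samples, n_classes)
z = scores - scores.max(axis=1, keepdims=True)      # numerical stabilization
softmax = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
assert np.allclose(clf.predict_proba(X), softmax)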
score(X, y, sample_weight=None)
Returns the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy, which is a harsh metric since it requires that each label set be correctly predicted for each sample.
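A one-assertion sketch: score() is plain accuracy, i.e. the fraction of correctly predicted samples (iris is an arbitrary illustrative dataset):

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(solver='lbfgs', multi_class='multinomial').fit(X, y)

# Mean accuracy, equivalent to accuracy_score on the predictions.
assert clf.score(X, y) == accuracy_score(y, clf.predict(X))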
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
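A short sketch of the nested <component>__<parameter> syntax described above, using a Pipeline (the step names 'scale' and 'clf' are arbitrary choices for this illustration):

from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([('scale', StandardScaler()), ('clf', LogisticRegression())])
pipe.set_params(clf__C=10.0)           # reaches into the nested estimator
print(pipe.named_steps['clf'].C)       # 10.0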
sparsify()
Convert coefficient matrix to sparse format.
Converts the coef_ member to a scipy.sparse matrix, which for L1-regularized models can be much more memory- and storage-efficient than the usual numpy.ndarray representation. The intercept_ member is not converted.
Notes

For non-sparse models, i.e. when there are not many zeros in coef_, this may actually increase memory usage, so use this method with care. A rule of thumb is that the number of zero elements, which can be computed with (coef_ == 0).sum(), must be more than 50% for this to provide significant benefits.
After calling this method, further fitting with the partial_fit method (if any) will not work until you call densify.
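A sketch of the rule of thumb above: check the fraction of zero coefficients before deciding to sparsify (the dataset and C value are arbitrary illustrative choices; a strong L1 penalty zeroes out many coefficients):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=50, random_state=0)
clf = LogisticRegression(penalty='l1', solver='liblinear', C=0.1).fit(X, y)

zero_fraction = (clf.coef_ == 0).mean()
if zero_fraction > 0.5:       # the >50% rule of thumb from the notes above
    clf.sparsify()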