
DEP loss_ attribute in gradient boosting #23079

Merged
4 commits merged into scikit-learn:main from lorentzenchr:dep_loss_attribute_gb on Apr 12, 2022

Conversation

lorentzenchr
Member

Reference Issues/PRs

None.

What does this implement/fix? Explain your changes.

This PR deprecates the attribute loss_ of GradientBoostingClassifier and GradientBoostingRegressor.

Any other comments?

This will greatly simplify using the common losses under sklearn._loss in (old) gradient boosting.
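
For context, the usual scikit-learn pattern for deprecating a public attribute looks roughly like the sketch below. The class name, message text, and `loss_` plumbing here are illustrative, not the exact diff of this PR:

```python
from sklearn.utils import deprecated


class GBDTSketch:
    """Illustrative stand-in for the gradient boosting estimators."""

    def __init__(self):
        # The loss object lives on under a private attribute.
        self._loss = object()

    # Reading the public attribute keeps working during the deprecation
    # window, but emits a FutureWarning that states the removal timeline.
    @deprecated(
        "Attribute `loss_` was deprecated in version 1.1 and will be "
        "removed in 1.3."
    )
    @property
    def loss_(self):
        return self._loss


GBDTSketch().loss_  # emits a FutureWarning
```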

@lorentzenchr lorentzenchr added this to the 1.1 milestone Apr 8, 2022
Member
@thomasjpfan thomasjpfan left a comment


Deprecating loss_ was mentioned once here: #15139 (comment). I am happy with deprecating it as well.

LGTM

doc/whats_new/v1.1.rst (review comment, resolved)
@lorentzenchr
Member Author

This will greatly simplify using the common losses under sklearn._loss in (old) gradient boosting.

Meaning, this improvement will have to wait until 1.3 (the PR will be large enough without also having to keep backwards compatibility for loss_). 😬

Member
@ogrisel ogrisel left a comment


LGTM.

@ogrisel ogrisel merged commit 0d669dc into scikit-learn:main Apr 12, 2022
@lorentzenchr lorentzenchr deleted the dep_loss_attribute_gb branch April 12, 2022 20:44
jjerphan pushed a commit to jjerphan/scikit-learn that referenced this pull request Apr 29, 2022
@sam-s
Contributor
sam-s commented May 31, 2022

@lorentzenchr
Member Author

For most of them, you can use the equivalent function under sklearn.metrics. It also depends on your use case. What do you want to achieve?
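
For the regressor with the default squared error loss, that replacement might look like this (a minimal sketch; the dataset and variable names are illustrative):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_regression(random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

gbr = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)

# Instead of calling the (now deprecated) gbr.loss_ object directly,
# use the public metric matching loss="squared_error"
# (possibly up to a constant factor, which is irrelevant for comparisons):
mse = mean_squared_error(y_test, gbr.predict(X_test))
```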

@sam-s
Contributor
sam-s commented Jun 13, 2022

For most of them, you can use the equivalent function under sklearn.metrics.

  1. It would be nice if this were clearly documented; e.g., figuring out the need for the sigmoid is not obvious.
  2. Which metrics correspond to deviance and exponential? (See the sketch after this comment.)

It also depends on your use case. What do you want to achieve?

Please see the link in the original comment for a code sample.
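
On the second question: for binary classification, the binomial deviance should correspond, up to a factor of 2, to sklearn.metrics.log_loss applied to sigmoid-transformed raw scores; there appears to be no dedicated sklearn.metrics function for the exponential loss. A sketch with placeholder data:

```python
import numpy as np
from scipy.special import expit  # the sigmoid / inverse logit
from sklearn.metrics import log_loss

# Placeholders standing in for clf.decision_function(X_test) output
# and the matching 0/1 labels.
rng = np.random.default_rng(0)
raw = rng.normal(size=100)
y = rng.integers(0, 2, size=100)

proba = expit(raw)  # the sigmoid turns raw scores into probabilities
# Binomial deviance equals 2 * log_loss on these probabilities;
# the constant factor does not matter when comparing models or stages.
deviance = 2.0 * log_loss(y, proba)
```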

@lorentzenchr
Member Author

I meant: What is your use case? In words, not in code.
And sure, improving the docs would be great. Any volunteers?

Note that HistGradientBoostingRegressor and HistGradientBoostingClassifier might be better suited than the old GBDTs for large sample sizes. There, the (inverse) link functions are documented.

@sam-s
Contributor
sam-s commented Jun 14, 2022

The code makes it possible to see the loss as a function of boosting stage, which lets me figure out the optimal number of estimators.
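
A sketch of that approach on synthetic data (assuming the binary log-loss case discussed above, so expit converts raw scores to probabilities):

```python
import numpy as np
from scipy.special import expit
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = GradientBoostingClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)

# Held-out loss after each boosting stage; staged_decision_function
# yields one (n_samples, 1) array per stage in the binary case.
test_loss = [
    log_loss(y_test, expit(raw.ravel()))
    for raw in clf.staged_decision_function(X_test)
]
best_n_estimators = int(np.argmin(test_loss)) + 1  # stages are 1-based
```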

@sam-s
Contributor
sam-s commented Jun 14, 2022

Also, I see nothing like _loss in HistGradientBoostingClassifier, except that since the only supported loss there is log_loss, my code would presumably work there too.

@lorentzenchr
Member Author

The code makes it possible to see the loss as a function of boosting stage, which lets me figure out the optimal number of estimators.

Note that this would only give the training error. A better strategy is to look at the test/validation error, preferably with cross-validation, e.g. with GridSearchCV.
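
For reference, a sketch of that cross-validated alternative (the grid values are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=500, random_state=0)

search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100, 200, 400]},
    scoring="neg_log_loss",
    cv=5,
)
search.fit(X, y)
best_n_estimators = search.best_params_["n_estimators"]
```

Note that each grid point refits the model from scratch, which is the cost objected to below; the staged evaluation sketched above reuses a single fitted model.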

@sam-s
Contributor
sam-s commented Jun 14, 2022

The code makes it possible to see the loss as a function of boosting stage, which lets me figure out the optimal number of estimators.

Note that this would only give the training error. A better strategy is to look at the test/validation error, preferably with cross-validation, e.g. with GridSearchCV.

I am passing test_X to staged_decision_function, not train_X.

@sam-s
Contributor
sam-s commented Jun 16, 2022

preferably with cross validation, e.g. with GridSearchCV.

GridSearchCV is way too expensive compared to my method.

sebp added a commit to sebp/scikit-survival that referenced this pull request Aug 2, 2022
sebp added a commit to sebp/scikit-survival that referenced this pull request Aug 7, 2022
sebp added a commit to sebp/scikit-survival that referenced this pull request Aug 13, 2022
sebp added a commit to sebp/scikit-survival that referenced this pull request Aug 13, 2022