[go: nahoru, domu]

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MNT deprecate attributes in Partial Least Squares module #18768

Merged

Conversation

marenwestermann
Copy link
Member

Reference Issues/PRs

Towards #14312
Closes #18764

What does this implement/fix? Explain your changes.

Deprecation of the attributes x_mean_, y_mean_, x_std_, y_std_ in module _pls.py

Any other comments?

ping @adrinjalali

@adrinjalali
Copy link
Member

we're already deprecating a few attributes in 0.24 in these classes, and this PR deprecates more (which I'm in favor of), but there are a bunch of other attributes which can be made private as well, like, all of these maybe:

    x_weights_ : ndarray of shape (n_features, n_components)
        The left singular vectors of the cross-covariance matrices of each
        iteration.

    y_weights_ : ndarray of shape (n_targets, n_components)
        The right singular vectors of the cross-covariance matrices of each
        iteration.

    x_loadings_ : ndarray of shape (n_features, n_components)
        The loadings of `X`.

    y_loadings_ : ndarray of shape (n_targets, n_components)
        The loadings of `Y`.

    x_scores_ : ndarray of shape (n_samples, n_components)
        The transformed training samples.

    y_scores_ : ndarray of shape (n_samples, n_components)
        The transformed training targets.

    x_rotations_ : ndarray of shape (n_features, n_components)
        The projection matrix used to transform `X`.

    y_rotations_ : ndarray of shape (n_features, n_components)
        The projection matrix used to transform `Y`.

should we deprecate them and include them in the release? Kinda looks cleaner if they're all deprecated in the same release, WDYT @ogrisel @NicolasHug

@NicolasHug
Copy link
Member
NicolasHug commented Nov 5, 2020

apart from the scores (already deprecated), the rest of the attributes are relevant and shouldn't be deprecated

@adrinjalali
Copy link
Member

what do you mean by relevant? I mean, what's the difference between x_mean_ and and x_rotations_ for instance in your opinion?

@NicolasHug
Copy link
Member

Copy link
Member
@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @marenwestermann , looks good.

we'll also need a quick test making sure these are deprecated. You can parametrize with all estimators and with all attributes.

Copy link
Member
@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @marenwestermann , LGTM if green!

sklearn/cross_decomposition/tests/test_pls.py Outdated Show resolved Hide resolved
@NicolasHug
Copy link
Member
NicolasHug commented Nov 6, 2020

@adrinjalali to expand on my (probably too short) reply above re the attributes: I think they're more relevant than the means and stds because they can be used for some specific purposes and they can also be used for inspection. The rotations_ matrices are used to project the input into the transformed space, so they're basically the equivalent of the components_ attributes for PCA. The weights are also related to projections, and the loadings are related to the regression (these estimators can do both transform and predict.)

@adrinjalali
Copy link
Member

@adrinjalali to expand on my (probably too short) reply above re the attributes: I think they're more relevant than the means and stds because they can be used for some specific purposes and they can also be used for inspection. The rotations_ matrices are used to project the input into the transformed space, so they're basically the equivalent of the components_ attributes for PCA. The weights are also related to projections, and the loadings are related to the regression (these estimators can do both transform and predict.)

fair point, that's convincing :)

@marenwestermann
Copy link
Member Author

@NicolasHug I added the TODO note and removed the classes from the IGNORED set in sklearn/tests/test_docstring_parameters.py as suggested by @glemaitre in #18764 (I hope it's correct). Do I also need to leave a note in doc/whats_new/v0.24.rst saying that the attributes here are deprecated?

@@ -488,6 +488,18 @@ def test_norm_y_weights_deprecation(Est):
est.norm_y_weights


# TODO: Remove test in 0.26
@pytest.mark.parametrize('Est', (PLSRegression, PLSCanonical, CCA, PLSSVD))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@pytest.mark.parametrize('Est', (PLSRegression, PLSCanonical, CCA, PLSSVD))
@pytest.mark.parametrize('Estimator', (PLSRegression, PLSCanonical, CCA, PLSSVD))

Copy link
Member
@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need an entry in the what's new to announce the deprecation of the parameters since they could be used.

Otherwise, LGTM

@@ -488,6 +488,18 @@ def test_norm_y_weights_deprecation(Est):
est.norm_y_weights


# TODO: Remove test in 0.26
@pytest.mark.parametrize('Est', (PLSRegression, PLSCanonical, CCA, PLSSVD))
@pytest.mark.parametrize('attr', ("x_mean_", "y_mean_", "x_std_", "y_std_"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@pytest.mark.parametrize('attr', ("x_mean_", "y_mean_", "x_std_", "y_std_"))
@pytest.mark.parametrize('attribute', ("x_mean_", "y_mean_", "x_std_", "y_std_"))

Y = rng.randn(10, 3)
est = Est().fit(X, Y)
with pytest.warns(FutureWarning, match=f"{attr} was deprecated"):
getattr(est, attr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
getattr(est, attr)
getattr(estimato, attribute)

X = rng.randn(10, 5)
Y = rng.randn(10, 3)
est = Est().fit(X, Y)
with pytest.warns(FutureWarning, match=f"{attr} was deprecated"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
with pytest.warns(FutureWarning, match=f"{attr} was deprecated"):
with pytest.warns(FutureWarning, match=f"{attribute} was deprecated"):

rng = np.random.RandomState(0)
X = rng.randn(10, 5)
Y = rng.randn(10, 3)
est = Est().fit(X, Y)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
est = Est().fit(X, Y)
estimator = Estimator().fit(X, Y)

@marenwestermann
Copy link
Member Author

I implemented the suggested changes and made an entry in the what's new section.

Copy link
Member
@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adrinjalali
Copy link
Member

ping @ogrisel for the release

@adrinjalali adrinjalali merged commit 0e6d415 into scikit-learn:master Nov 11, 2020
@adrinjalali adrinjalali added this to the 0.24 milestone Nov 11, 2020
@marenwestermann marenwestermann deleted the pls-attribute-deprecation branch November 11, 2020 11:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants