-
-
Notifications
You must be signed in to change notification settings - Fork 25.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update _mds.py #18094
Update _mds.py #18094
Conversation
Implemented the normalized stress value from Borchmann's stalled PR: scikit-learn#10168 With these changes, passing normalize=True returns a meaningful stress value between 0-1. The current returned stress value is basically useless. normalize is set to False by default.
Hi @rotheconrad! Thanks for your contribution. I see you've mentioned #10168, but I wanted to add that there are two other open PRs (#12285 and #13042) that would also address this issue. I would recommend checking with the author of #13042 to verify the status of that PR, as there was some in-progress discussion regarding implementation details that could be relevant here. |
I am not really familiar with this technique. I read quickly the original reference: https://link.springer.com/content/pdf/10.1007/BF02289565.pdf. In the previous PR, @jnothman raised the question if it was necessary to perform the normalization at each step of the SMACOF algorithm (I think that's why @amueller tag "benchmark needed"). From what I can read in p. 8-9, the raw stress does not have scale invariance property. Therefore, I think that one would need to play with the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will need a test.
I think that the test in https://github.com/scikit-learn/scikit-learn/pull/10168/files#diff-4170fd64d149a193329356d7f55e78d1R64 was a good start
@@ -54,6 +55,10 @@ def _smacof_single(dissimilarities, metric=True, n_components=2, init=None, | |||
Pass an int for reproducible results across multiple function calls. | |||
See :term: `Glossary <random_state>`. | |||
|
|||
normalize : boolean, optional, default: False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
normalize : boolean, optional, default: False | |
normalize : bool, default=False |
stress = np.sqrt(stress / | ||
((disparities.ravel() ** 2).sum() / 2)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stress = np.sqrt(stress / | |
((disparities.ravel() ** 2).sum() / 2)) | |
stress = np.sqrt( | |
stress / ((disparities.ravel() ** 2).sum() / 2) | |
) |
@@ -204,6 +217,10 @@ def smacof(dissimilarities, *, metric=True, n_components=2, init=None, | |||
return_n_iter : bool, default=False | |||
Whether or not to return the number of iterations. | |||
|
|||
normalize : boolean, optional, default: False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
normalize : boolean, optional, default: False | |
normalize : bool, default=False |
@@ -326,6 +347,10 @@ class MDS(BaseEstimator): | |||
Pre-computed dissimilarities are passed directly to ``fit`` and | |||
``fit_transform``. | |||
|
|||
normalize : boolean, optional, default: False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
normalize : boolean, optional, default: False | |
normalize : bool, default=False |
Closing as superseded by #22562. |
Implemented the normalized stress value from Borchmann's stalled PR: #10168
With these changes, passing normalize=True returns a meaningful stress value between 0-1. The current returned stress value is basically useless. normalize is set to False by default.
Reference Issues/PRs
What does this implement/fix? Explain your changes.
Any other comments?