Update _mds.py #18094

rotheconrad · 2020-08-04T22:37:27Z

Implemented the normalized stress value from Borchmann's stalled PR: #10168.
With these changes, passing normalize=True returns a normalized stress value between 0 and 1 that is easier to interpret. The raw stress value currently returned depends on the scale of the input and is hard to interpret on its own. normalize is set to False by default.


joshuacwnewton · 2020-08-07T23:52:32Z

Please note: I'm not a core maintainer. I'm leaving my review to help move things along, but two core maintainers will need to approve this PR for it to be merged. Thanks for your patience!

Hi @rotheconrad! Thanks for your contribution. I see you've mentioned #10168, but I wanted to add that there are two other open PRs (#12285 and #13042) that would also address this issue. I would recommend checking with the author of #13042 to verify the status of that PR, as there was some in-progress discussion regarding implementation details that could be relevant here.

glemaitre · 2020-08-20T13:24:29Z

I am not really familiar with this technique. I quickly read the original reference: https://link.springer.com/content/pdf/10.1007/BF02289565.pdf.

In the previous PR, @jnothman asked whether it was necessary to perform the normalization at each step of the SMACOF algorithm (I think that's why @amueller added the "benchmark needed" tag).

From what I can read on pp. 8-9, the raw stress is not scale invariant. Therefore, I think one would need to tune the eps parameter depending on the input dataset. If stress-1 has this property, and if I understand correctly, the stopping criterion will have the same meaning across different datasets.
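The scale argument above can be checked numerically. The following is a minimal sketch, not the code from this PR: `pairwise_distances` and `raw_and_normalized_stress` are hypothetical helper names, and the sketch assumes the metric case, where the disparities equal the input dissimilarities. It rescales both the dissimilarities and the embedding by the same factor and compares the two stress values.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix between the rows of X (hypothetical helper)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def raw_and_normalized_stress(dissimilarities, X):
    """Raw stress and the normalized value, using the expression from the diff.

    Assumes metric MDS, so the disparities are the dissimilarities themselves.
    """
    dis = pairwise_distances(X)
    raw = ((dis.ravel() - dissimilarities.ravel()) ** 2).sum() / 2
    normalized = np.sqrt(raw / ((dissimilarities.ravel() ** 2).sum() / 2))
    return raw, normalized


rng = np.random.RandomState(0)
dissimilarities = pairwise_distances(rng.rand(6, 3))  # toy input data
X = rng.rand(6, 2)  # arbitrary 2-D configuration, not an optimized embedding

raw, normalized = raw_and_normalized_stress(dissimilarities, X)

# Rescale the whole problem, e.g. a change of measurement units.
scale = 10.0
raw_scaled, normalized_scaled = raw_and_normalized_stress(
    scale * dissimilarities, scale * X
)

print(raw, raw_scaled)                # raw stress grows with scale ** 2
print(normalized, normalized_scaled)  # normalized value is unchanged
```

Under these assumptions the raw stress is multiplied by `scale ** 2` while the normalized value stays the same, which is why a fixed eps would mean different things for differently scaled inputs when the raw stress is used.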

glemaitre

We will need a test.
I think that the test in https://github.com/scikit-learn/scikit-learn/pull/10168/files#diff-4170fd64d149a193329356d7f55e78d1R64 was a good start.
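As an illustration of what such a test could check, here is a hedged sketch. It is not the test from #10168; `normalized_stress` is a hypothetical standalone helper that mirrors the normalization expression in this PR's diff for the metric case, rather than a call into scikit-learn. It pins down two reference configurations whose values follow directly from the formula.

```python
import numpy as np


def normalized_stress(dissimilarities, X):
    """Hypothetical helper mirroring the normalization in this PR's diff.

    Assumes metric MDS, so the disparities equal the dissimilarities.
    """
    dis = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    raw = ((dis.ravel() - dissimilarities.ravel()) ** 2).sum() / 2
    return np.sqrt(raw / ((dissimilarities.ravel() ** 2).sum() / 2))


def test_normalized_stress_reference_values():
    # Three points forming a 3-4-5 right triangle.
    X = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
    dissimilarities = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    # A configuration that reproduces the dissimilarities exactly has no stress.
    assert np.isclose(normalized_stress(dissimilarities, X), 0.0)

    # Collapsing all points onto one location makes every embedded distance
    # zero, so the numerator equals the denominator and the value is 1.
    collapsed = np.zeros_like(X)
    assert np.isclose(normalized_stress(dissimilarities, collapsed), 1.0)


test_normalized_stress_reference_values()
```

A test against the actual estimator would additionally need to fit an embedding and compare the returned stress with a value computed independently, as above.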

glemaitre · 2020-08-20T13:24:48Z

sklearn/manifold/_mds.py

@@ -54,6 +55,10 @@ def _smacof_single(dissimilarities, metric=True, n_components=2, init=None,
        Pass an int for reproducible results across multiple function calls.
        See :term: `Glossary <random_state>`.

+    normalize : boolean, optional, default: False


Suggested change

-    normalize : boolean, optional, default: False
+    normalize : bool, default=False

glemaitre · 2020-08-20T13:26:21Z

sklearn/manifold/_mds.py

+            stress = np.sqrt(stress /
+                             ((disparities.ravel() ** 2).sum() / 2))


Suggested change

-            stress = np.sqrt(stress /
-                             ((disparities.ravel() ** 2).sum() / 2))
+            stress = np.sqrt(
+                stress / ((disparities.ravel() ** 2).sum() / 2)
+            )
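For readers following the arithmetic, the expression in this hunk can be written out as below. This is a sketch based only on the lines shown in the diff: it assumes `stress` holds the raw stress (half the sum of squared differences over the full symmetric matrices) at this point, and the symbols $d_{ij}(X)$ for embedded distances and $\hat{d}_{ij}$ for disparities are notation introduced here, not taken from the PR.

```latex
\sigma_{\text{raw}} = \frac{1}{2} \sum_{i,j} \bigl( d_{ij}(X) - \hat{d}_{ij} \bigr)^2,
\qquad
\sigma_{\text{norm}}
  = \sqrt{ \frac{\sigma_{\text{raw}}}{\tfrac{1}{2} \sum_{i,j} \hat{d}_{ij}^{\,2}} }
  = \sqrt{ \frac{\sum_{i<j} \bigl( d_{ij}(X) - \hat{d}_{ij} \bigr)^2}{\sum_{i<j} \hat{d}_{ij}^{\,2}} }
```

The two factors of 1/2 cancel, so the result does not depend on whether each pair is counted once or twice.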

glemaitre · 2020-08-20T13:26:35Z

sklearn/manifold/_mds.py

@@ -204,6 +217,10 @@ def smacof(dissimilarities, *, metric=True, n_components=2, init=None,
    return_n_iter : bool, default=False
        Whether or not to return the number of iterations.

+    normalize : boolean, optional, default: False


Suggested change

-    normalize : boolean, optional, default: False
+    normalize : bool, default=False

glemaitre · 2020-08-20T13:26:52Z

sklearn/manifold/_mds.py

@@ -326,6 +347,10 @@ class MDS(BaseEstimator):
            Pre-computed dissimilarities are passed directly to ``fit`` and
            ``fit_transform``.

+    normalize : boolean, optional, default: False


Suggested change

-    normalize : boolean, optional, default: False
+    normalize : bool, default=False

cmarmo · 2022-05-02T19:13:00Z

Closing as superseded by #22562.

Update _mds.py

b34bf99


github-actions bot added the module:manifold label Aug 4, 2020

glemaitre reviewed Aug 20, 2020

glemaitre added this to REVIEWED AND WAITING FOR CHANGES in Guillaume's pet Aug 20, 2020

Base automatically changed from master to main January 22, 2021 10:52

cmarmo added the help wanted and Stalled labels Mar 25, 2021

Micky774 mentioned this pull request Feb 20, 2022

ENH Calculate normed stress (Stress-1) in manifold.MDS #22562

Merged

cmarmo added the Superseded label (PR has been replaced by a newer PR) and removed the Stalled and help wanted labels Apr 22, 2022

thomasjpfan mentioned this pull request Apr 27, 2022

DOC Detail superseded workflow for PRs #23220

Merged

cmarmo closed this May 2, 2022
