Handle NaN gracefully in sampler.MarkovChain #69

dvandyk · 2020-06-27T12:11:39Z

Dear Fred,

I sometimes encounter NaNs in the log(target) in corners of the parameter space. It would be very helpful if this modification, or something similar to it, could end up in a released version of pypmc.

I am also happy to change the name of the keyword argument if you disagree. I had considered e.g. ignore_NaN. Let me know what you think!

Cheers
Danny

dvandyk · 2020-09-20T09:32:52Z

@fredRos Bump :-)

fredRos

For me, it was always useful to bail out on NaN because our target density shouldn't return that. You can always wrap your target function to catch NaN, so I am not sure this change is really needed. What is your use case in which NaN is an OK value for the target density?

fredRos · 2020-09-26T21:07:55Z

pypmc/sampler/markov_chain.py

+                else:
+                    this_run[i_N] = self.current_point
+                    #do not need to update self.current
+                    #self.current = self.current


remove commented code

fredRos · 2020-09-26T21:12:46Z

pypmc/sampler/markov_chain.py

-                #self.current = self.current
+            else:
+                # accept if rho = 1
+                if log_rho >=0:


How could I allow this code duplication? All of this should disappear and we can just check for log_rho >= _np.log(self.rng.rand())

I would prefer if we deduplicated code in a separate commit, or not at all. It is very readable in this way. But it's your decision in the end!

Also, it comes to mind that your proposed condition changes the number of evaluations of rng. That would make old computations non-repeatable. This is not the case for my changes, since previous computations would simply have failed.

Ok, I vaguely remember we used to do something different in the two branches but we don't anymore. Yep, it's a different issue. So let's leave as is. And good point about rng evaluation. Easily solved with short-circuit semantics

if log_rho >=0 or log_rho >= _np.log(self.rng.rand()):
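The short-circuit argument can be checked with a small sketch. It assumes a hypothetical `CountingRNG` stand-in for `self.rng` (not part of pypmc) and uses the standard library instead of NumPy so that it runs on its own; the `accept` helper only mirrors the condition quoted above.

```python
import math
import random


class CountingRNG:
    """Hypothetical stand-in for self.rng that counts calls to rand()."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self.calls = 0

    def rand(self):
        self.calls += 1
        # 1 - random() lies in (0, 1], so math.log never receives 0
        return 1.0 - self._rng.random()


def accept(log_rho, rng):
    # `or` short-circuits: rng.rand() is evaluated only when log_rho < 0
    return log_rho >= 0 or log_rho >= math.log(rng.rand())


rng = CountingRNG()
accept(0.5, rng)                  # accepted outright, no random number drawn
calls_after_uphill = rng.calls
accept(-0.5, rng)                 # needs one uniform draw to decide
calls_after_downhill = rng.calls
```

With this form, the sequence of random numbers consumed is the same as in the two-branch version, so runs with a fixed seed should stay repeatable.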

dvandyk · 2020-09-28T08:09:47Z

> For me, it was always useful to bail out on NaN because our target density shouldn't return that.

In my case I run into edges of the parameter space in which the target density is not defined. I cannot easily cut these regions out, since the boundaries of the parameter space cannot be determined analytically.

> You can always wrap your target function to catch NaN, so I am not sure this change is really needed.

Yes, but then I need to assign a regular value for the target density instead, with a non-zero probability that the Markov chain accepts it nevertheless. With this change, I can reject the invalid point entirely and deterministically.

> What is your use case that NaN is an OK value for the target density?

It's a light-cone sum rule calculation in EOS. The likelihood contains observables that are ill defined in a pocket of the (30dim) parameter space.

I'd also like to add that the PMC sampler continues on NaN in the target density. Changing this here makes both samplers work for the same set of target densities.

jPhy · 2020-09-28T17:35:00Z

On 28.09.20 10:10, Danny van Dyk wrote:

>> For me, it was always useful to bail out on NaN because our target density shouldn't return that.

> In my case I run into edges of the parameter space in which the target density is not defined. I cannot easily cut these regions out, since the boundaries of the parameter space cannot be determined analytically.

>> You can always wrap your target function to catch NaN, so I am not sure this change is really needed.

> Yes, but then I need to assign a regular value for the target density instead, with a non-zero probability that the Markov chain accepts it nevertheless. In this way, I can reject the invalid point entirely and deterministically.

Not really: I always had my density return '-numpy.inf' in that case since Limit[x-->0, x>0] log(x) = -infinity. Note that minus infinity is (at least was when I was actively working on pypmc) handled correctly and this is also how the builtin indicator functions are implemented.
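A minimal sketch of that approach, assuming a hypothetical wrapper `nan_to_minus_inf` and a toy density (neither is part of pypmc). It uses `math.inf` from the standard library, which is the same floating-point value as `numpy.inf`.

```python
import math


def nan_to_minus_inf(log_target):
    """Wrap a log-density so that NaN is reported as log(0) = -inf."""
    def wrapped(x):
        value = log_target(x)
        return -math.inf if math.isnan(value) else value
    return wrapped


def toy_log_target(x):
    # hypothetical density that is undefined (NaN) for x < 0
    return math.nan if x < 0 else -0.5 * x * x


safe_log_target = nan_to_minus_inf(toy_log_target)
```

Here `safe_log_target(-1.0)` evaluates to `-inf`, so a sampler that handles minus infinity correctly would always reject such a point.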

fredRos · 2020-09-28T22:09:33Z

Yes, I agree with @jPhy that -inf is fine to set as a valid log probability such that the proposed point is always rejected.

> I'd also like to add that the PMC sampler continues on NaN in the target density. Changing this here makes both samplers work for the same set of target densities.

Good point. We should create a separate issue for that. I don't know if the update equations are correct with NaN values.

> It's a light-cone sum rule calculation in EOS. The likelihood contains observables that are ill defined in a pocket of the (30dim) parameter space.

I see. In the end, it boils down to a question of convenience: what should happen, and who has control? You might just want to continue, and could just return -inf instead of NaN. Another user might want to count the NaNs, log a message, or raise a different kind of exception. So I think the best solution is to leave this in the hands of the user.

The other question is: is it a good idea to raise an exception by default? I think yes, because at some point I didn't have this check in there, and then the output of the chain was not from the target density because some checks failed, as NaN comparisons come out False. Now I know that subtle problem and can make my code robust.
Of course I could output a warning instead of an exception, but this might swamp the logs. I will sleep on it for a night :)
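The comparison pitfall mentioned above is easy to reproduce in isolation. This is only an illustration with made-up values, not the check used in pypmc.

```python
import math

log_rho = math.nan
log_u = math.log(0.5)

# every ordered comparison involving NaN evaluates to False
accepted = log_rho >= log_u
rejected = log_rho < log_u

# a sampler written as "accept unless rejected" would take this point,
# while one written as "accept only if accepted" would silently drop it
```

Both `accepted` and `rejected` come out `False`, so the outcome depends on how the branch happens to be written, which is why an explicit NaN check is needed.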

fredRos · 2020-10-03T11:35:46Z

@dvandyk I thought about it. In terms of customer orientation, I'm happy to make that little change if it makes life easier for you. If you or anyone else wants more control over handling NaNs, that should be done in the client code; i.e. outside pypmc
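As a sketch of what such client-side handling could look like, assuming a hypothetical `NaNCountingTarget` wrapper and a toy density (both invented for illustration, not part of pypmc):

```python
import math


class NaNCountingTarget:
    """Hypothetical client-side wrapper: count NaNs, then report -inf."""

    def __init__(self, log_target):
        self.log_target = log_target
        self.nan_count = 0

    def __call__(self, x):
        value = self.log_target(x)
        if math.isnan(value):
            self.nan_count += 1
            return -math.inf
        return value


def toy_log_target(x):
    # hypothetical density that is ill defined for x < 0
    return math.nan if x < 0 else -x


target = NaNCountingTarget(toy_log_target)
values = [target(x) for x in (-1.0, 0.5, -2.0)]
```

The same pattern could log a message or raise a custom exception instead of counting; the sampler only ever sees a finite value or `-inf`.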

fredRos

nearly done, only cosmetics left

fredRos · 2020-10-03T11:50:12Z

pypmc/sampler/markov_chain.py

-                #self.current = self.current
+            else:
+                # accept if rho = 1
+                if log_rho >=0:


Ok, I vaguely remember we used to do something different in the two branches but we don't anymore. Yep, it's a different issue. So let's leave as is. And good point about rng evaluation. Easily solved with short-circuit semantics

if log_rho >=0 or log_rho >= _np.log(self.rng.rand()):

fredRos · 2020-10-03T11:53:39Z

pypmc/sampler/markov_chain.py

@@ -107,6 +107,11 @@ def run(self, N=1):

            An int which defines the number of steps to run the chain.

+        :param continue_on_NaN:
+
+            A boolean flag defining the behaviour when encountering an NaN.


Suggested change

A boolean flag defining the behaviour when encountering an NaN.

A boolean flag defining the behavior when encountering an NaN in the user-supplied target density for a proposed point.

fredRos · 2020-10-03T11:55:05Z

pypmc/sampler/markov_chain.py

+        :param continue_on_NaN:
+
+            A boolean flag defining the behaviour when encountering an NaN.
+            Default: False (-> raise ValueError)


Suggested change

Default: False (-> raise ValueError)

Default: ``False`` (-> raise ``ValueError``). If ``True``, reject the proposed point and continue.
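The documented behavior can be sketched with a stripped-down Metropolis step. This is not the code in `pypmc/sampler/markov_chain.py`; the function `metropolis_step`, its signature, and the toy proposal are assumptions made for illustration, and only the `continue_on_NaN` name and the `ValueError` default come from the docstring above.

```python
import math
import random


def metropolis_step(log_target, current, current_log, propose, rng,
                    continue_on_NaN=False):
    """One Metropolis step with a symmetric proposal.

    Returns the new (point, log_value) pair.
    """
    proposed = propose(current, rng)
    proposed_log = log_target(proposed)
    log_rho = proposed_log - current_log

    if math.isnan(log_rho):
        if not continue_on_NaN:
            raise ValueError("encountered NaN in log(target)")
        # reject the proposed point and stay at the current one
        return current, current_log

    # short-circuit: a uniform number is drawn only when log_rho < 0
    if log_rho >= 0 or log_rho >= math.log(1.0 - rng.random()):
        return proposed, proposed_log
    return current, current_log


def bad_target(x):
    # hypothetical density that is NaN everywhere
    return math.nan


def gaussian_step(x, rng):
    return x + rng.gauss(0.0, 1.0)


rng = random.Random(1)
point, value = metropolis_step(bad_target, 0.0, -1.0, gaussian_step, rng,
                               continue_on_NaN=True)
```

With `continue_on_NaN=True` the chain stays at `(0.0, -1.0)`; with the default, the same call raises `ValueError`.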

fredRos · 2020-10-03T11:55:54Z

pypmc/sampler/markov_chain.py

-            # reject if not accepted
-            else:
+            if _np.isnan(log_rho):
+                # raise error if so desired


useless comment. Please remove

fredRos · 2021-02-14T22:15:36Z

@dvandyk There were only small things left on this PR, let's take it over the finish line.

fredRos · 2021-02-20T22:06:38Z

@dvandyk Friendly ping

dvandyk · 2021-02-21T11:24:43Z

Sorry for the delay, here it is!

fredRos · 2021-02-21T20:54:16Z

passed tests https://travis-ci.org/github/fredRos/pypmc/builds/759851896, well done!

fredRos requested changes Sep 26, 2020

dvandyk force-pushed the master branch from 3457d87 to ae4c3d9 Compare September 28, 2020 08:11

fredRos requested changes Oct 3, 2020

fredRos mentioned this pull request Oct 3, 2020

Remove code duplication #70

Closed

[sampler] Add code to gracefully handle NaN in the log(target) when sampling with Markov chains

Reviewed-by: Frederik Beaujean <beaujean@mpp.mpg.de>

4c0a7aa

dvandyk force-pushed the master branch from ae4c3d9 to 4c0a7aa Compare February 21, 2021 11:24

fredRos merged commit 6b4f9ee into pypmc:master Feb 21, 2021
