
Migrate MCMC notebook to gpflow2.0 #1100

Merged: 24 commits into develop, Nov 13, 2019

Conversation

cdmatters (Contributor)

Migrate the MCMC notebook to GPflow 2 using TensorFlow Probability.
This required a few changes across the rest of the codebase, outside of the notebook:

  • small change to the leaf printing function (fixes a bug that was preventing printing of composite kernels)
  • `parameters` and `trainable_parameters` now return tuples, not generators, mirroring the tf implementation of `variables` and `trainable_variables`
  • tfp.distributions now work with different dtypes (by wrapping their parameters), so they now play nicely with gpflow
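For orientation, here is a minimal, self-contained sketch of the tfp.mcmc API the migrated notebook builds on (a toy one-dimensional target rather than the notebook's GP model; all names and settings here are illustrative, not the notebook's code):

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

def target_log_prob_fn(x):
    # Toy target density: a standard normal in one dimension.
    return tfd.Normal(loc=0.0, scale=1.0).log_prob(x)

hmc = tfp.mcmc.HamiltonianMonteCarlo(
    target_log_prob_fn=target_log_prob_fn,
    step_size=0.1,
    num_leapfrog_steps=5,
)

samples, is_accepted = tfp.mcmc.sample_chain(
    num_results=1000,
    num_burnin_steps=500,
    current_state=tf.constant(0.0),
    kernel=hmc,
    trace_fn=lambda _, pkr: pkr.is_accepted,
)
```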

awav and others added 3 commits October 7, 2019 09:37
codecov bot commented Oct 14, 2019

Codecov Report

Merging #1100 into develop will decrease coverage by <.01%.
The diff coverage is 95.12%.


@@             Coverage Diff             @@
##           develop    #1100      +/-   ##
===========================================
- Coverage    95.49%   95.48%   -0.01%     
===========================================
  Files           67       68       +1     
  Lines         3084     3125      +41     
===========================================
+ Hits          2945     2984      +39     
- Misses         139      141       +2
Impacted Files                    Coverage Δ
gpflow/optimizers/__init__.py     100% <100%> (ø) ⬆️
gpflow/optimizers/mcmc.py         95% <95%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update ff339f8...0db2ea0.

cdmatters requested a review from awav, October 14, 2019 14:39
Review thread on gpflow/utilities/utilities.py (outdated, resolved)
Fpred, Ypred = [], []
num_samples = len(samples[0])
for i in range(burn, num_samples, thin):
    hparams = [hp[i] for hp in samples]
Member:
what does hp stand for?

num_samples = len(samples[0])
for i in range(burn, num_samples, thin):
    hparams = [hp[i] for hp in samples]
    [var.assign(hp) for var, hp in zip(m.trainable_variables, hparams)]
Member:

How does this compare in speed to the TF1 version with a feed_dict?

Contributor (Author):

No clue. There's clearly iteration overhead to consider, as well as the unknown speed of assign. I don't think we have a choice either way, though it's worth investigating.

Contributor (Author):

Also, in general, I think this list comprehension is quite ugly; there should be a cleaner way to set all the variables of the model.

Member:

Comparing them would be wrong as they are different things

Member:

@condnsdmatters, there is a utility function for multiple assign.
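For reference, a rough sketch of how that utility might replace the explicit loop; this assumes the function referred to is `gpflow.utilities.multiple_assign`, which takes a dict mapping parameter paths to new values (the exact name and signature are an assumption here):

```python
from gpflow.utilities import multiple_assign, read_values

# read_values(m) returns a dict like {".kernel.lengthscales": 1.0, ...};
# multiple_assign writes such a dict back onto the model in one call.
values = read_values(m)
multiple_assign(m, {path: v * 2.0 for path, v in values.items()})
```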

@@ -22,7 +23,8 @@ class RobustMax(tf.Module):
    def __init__(self, num_classes, epsilon=1e-3, **kwargs):
        super().__init__(**kwargs)
        transform = tfp.bijectors.Sigmoid()
        prior = tfp.distributions.Beta(0.2, 5.)
        fdtype = lambda x: np.array(x, dtype=default_float())
Member:

What does the f stand for?
If this is something needed throughout the codebase, would it be worthwhile putting it into gpflow.utilities or so?

Contributor (Author):

f here stands for float. I think this is a reasonable thing to move into utilities, perhaps under a name like `to_default_float(x)`; a sketch is below.
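A sketch of what such a helper could look like, based on the `fdtype` lambda in the diff above (the name `to_default_float` is the suggestion, not existing API at the time of this comment):

```python
import numpy as np
from gpflow.config import default_float

def to_default_float(x):
    """Cast a scalar or array-like to GPflow's configured default float dtype."""
    return np.array(x, dtype=default_float())
```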

Review thread on gpflow/optimizers/mcmc.py (outdated, resolved)
    sample = param.transform.forward(values)
else:
    sample = values
samples.append(sample.numpy())
Member:

Is the casting to numpy needed? Would this slow down the execution?

Review thread on gpflow/optimizers/mcmc.py (outdated, resolved)
parameters: List of `Variable`s or gpflow `Parameter`s used as the state of the Markov chain.
"""

target_log_prob_fn: Callable[[ModelParameters], tf.Tensor]
Member:
Actually, this is used as if it doesn't take any arguments, so maybe the type hint is wrong?
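To illustrate the point (this is not code from the PR): if the helper only ever calls the stored function with no arguments, the matching annotation would be the zero-argument form:

```python
from typing import Callable
import tensorflow as tf

def run_chain_step(target_log_prob_fn: Callable[[], tf.Tensor]) -> tf.Tensor:
    # The closure is invoked without arguments; it closes over the
    # model's parameters instead of receiving them as inputs.
    return target_log_prob_fn()
```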

Two further review threads on gpflow/optimizers/mcmc.py (resolved)
awav (Member) commented Oct 14, 2019

@condnsdmatters, thanks for updating the notebook.

Could we not mix things up in a single PR?

small change to leaf printing function (fix bug, which was preventing printing of composite kernels)

That should be a separate PR with testing et cetera. I have a suspicion that it has been resolved already.

parameters and trainable_parameters now return tuples, not generators, like tf implementation of variables, trainable_variables.

I don't think that `variables` returns a tuple; it is a list. Why is this change important here?

tfp.distributions now work with different dtypes (by wrapping of parameters), so now play nicely with gpflow.

I don't understand the problem with it. Could you explain what you are trying to do with it?

In general, when I wrote SamplingHelper, I tailored it to a specific project's needs. It requires much more thought about how the interface should look to a user. Let's discuss it offline.

Thanks!

cdmatters (Contributor, Author) commented Oct 15, 2019

In reply to @awav: I think I misunderstood the helper you wrote. My understanding was that I should use it as is, rather than thinking about interfaces, etc. I'm very happy to discuss and change it; I agree it seems to be a bit of an odd class at the moment.

Vis-à-vis the other changes:

  • The printing does indeed seem to be fixed.
  • The `variables` and `trainable_variables` properties on `tf.Module` technically return tuples, not lists: see here. However, the key problem is that at the moment our module wrapper returns neither a list nor a tuple, but rather a generator from `self._flatten`. If you don't wrap the generator in an iterable, then when you pass it to the frozenset in SamplingHelper the generator exhausts after its first use, e.g.:
    params = m.trainable_parameters
    y = [p for p in params]  # List[Params]
    z = [p for p in params]  # [] Empty! 
  • The problem with tfp.distributions is that there is no way to choose the dtype explicitly for the Normal and Gamma distributions, and there is no way to change the default dtype in tensorflow (as per the issue you opened). The tfp implementation of these distributions instead infers the dtype from the dtype of the parameters to the distribution (e.g. here), and then passes it explicitly to the base Distribution class. The end result is that if you want to choose the dtype for the distributions (as we need to when operating in 64-bit mode in gpflow), the only way to do it is by wrapping the parameters before passing them to the constructor (see the sketch below). Most unsatisfactory, I know.
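A small illustration of that dtype behaviour (illustrative only, not the PR's code):

```python
import numpy as np
import tensorflow_probability as tfp
from gpflow.config import default_float  # float64 by default in GPflow

# tfp infers the distribution dtype from its parameters:
p32 = tfp.distributions.Gamma(concentration=0.2, rate=5.0)
print(p32.dtype)  # float32, inferred from the Python-float parameters

# so float64 has to be requested by wrapping the parameters themselves:
wrap = lambda x: np.array(x, dtype=default_float())
p64 = tfp.distributions.Gamma(concentration=wrap(0.2), rate=wrap(5.0))
print(p64.dtype)  # float64
```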

I am more than happy to put these into separate commits and PRs, as I do agree that's better from a git commit history point of view. I can restructure these now.

@st-- I'll also address all the comments in this PR before doing that.

@awav awav closed this Nov 3, 2019
awav (Member) commented Nov 3, 2019

Hello @condnsdmatters, I apologize for closing your PR. It was closed automatically when I moved develop-2.0 to develop. Could you re-open the PR? Thanks!

awav reopened this Nov 3, 2019
awav changed the base branch from develop-2.0 to develop, November 3, 2019 23:30
Further review threads on gpflow/optimizers/mcmc.py (resolved)

        return _target_log_prob_fn_closure

    def convert_samples_to_parameter_values(self, hmc_samples):
Member:
Suggested change:
-    def convert_samples_to_parameter_values(self, hmc_samples):
+    def convert_to_constrained_values(self, hmc_samples):

Eric Hambro and others added 3 commits November 12, 2019 16:09
cdmatters (Contributor, Author):
@awav I fundamentally can't change the grad function to remove `None`s, because of the requirement that we differentiate w.r.t. the parameters held on the model. As the tf.custom_gradient documentation says:

If f uses Variables (that are not part of the inputs), i.e. through get_variable, then grad_fn should have signature g(*grad_ys, variables=None), where variables is a list of the Variables, and return a 2-tuple (grad_xs, grad_vars), where grad_xs is the same as above, and grad_vars is a list with the derivatives of Tensors in y with respect to the variables.
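A toy illustration of that signature (not the helper's actual implementation):

```python
import tensorflow as tf

v = tf.Variable(3.0)

@tf.custom_gradient
def f(x):
    y = x * v  # uses a Variable that is not one of the inputs

    def grad(dy, variables=None):
        grad_xs = dy * v                          # gradient w.r.t. the input x
        grad_vars = [dy * x for _ in variables]   # gradients w.r.t. the captured Variables
        return grad_xs, grad_vars

    return y, grad

x = tf.constant(2.0)
with tf.GradientTape() as tape:
    tape.watch(x)
    y = f(x)
print(tape.gradient(y, [x, v]))  # [3.0, 2.0]
```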

Review thread on gpflow/optimizers/mcmc.py (outdated, resolved)
awav merged commit 24b5733 into develop, Nov 13, 2019
awav deleted the eric/develop-2.0/hmc-helper branch, November 13, 2019 12:27