
Different cross-entropy loss between the result from calling the model as a function and from the predict function #56555

Closed
Xinshuai-Lyu opened this issue Jun 23, 2022 · 4 comments
Labels: comp:keras Keras related issues, stat:awaiting response Status - Awaiting response from author, TF 2.8, type:bug Bug

Comments

Xinshuai-Lyu commented Jun 23, 2022

Issue Type: Bug
Source: source
Tensorflow Version: 2.8.2
Custom Code: Yes
OS Platform and Distribution: No response
Mobile device: No response
Python version: No response
Bazel version: No response
GCC/Compiler version: No response
CUDA/cuDNN version: No response
GPU model and memory: No response

Current Behaviour?

First, you get a prediction by calling the model as a function; call this prediction A. Then you use the model's predict function and get a seemingly identical prediction; call it B.

If you use CategoricalCrossentropy to compute the loss against the same label, the losses for A and B are different.
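
In other words, the comparison is between the following two prediction paths (a minimal sketch; the model, input shape, and label below are placeholders, not the exact values from the reproduction that follows):

import tensorflow as tf

# Hypothetical toy model, used only to illustrate the two prediction paths.
model = tf.keras.Sequential([
    tf.keras.layers.GRU(8, input_shape=(2, 10)),
    tf.keras.layers.Dense(3, activation="sigmoid"),
])
x = tf.random.uniform((1, 2, 10))
label = tf.constant([[1., 0., 0.]])

pred_a = model(x)           # prediction A: calling the model as a function (eager tensor)
pred_b = model.predict(x)   # prediction B: using model.predict (returns a NumPy array)

cce = tf.keras.losses.CategoricalCrossentropy()
print(cce(label, pred_a))   # loss computed on the tensor output
print(cce(label, pred_b))   # loss computed on the NumPy output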

Standalone code to reproduce the issue

import tensorflow as tf
from tensorflow import keras

class NLPModel(keras.Model):
  def __init__(self, **kwargs):
    super().__init__(**kwargs)
    self.hidden_layer1 = tf.keras.layers.GRU(400)
    self.outputs = tf.keras.layers.Dense(3, activation="sigmoid")
  def call(self, inputs):
    outputs = self.hidden_layer1(inputs)
    outputs = self.outputs(outputs)
    return outputs

model = NLPModel()
label = tf.constant([[1., 0., 0.]])
# Prediction A: call the model as a function; the result is an eager tensor.
prediction = model(tf.constant([[[0.4, 0.5, 0.54, 0.4, 0.5, 0.54, 0.4, 0.5, 0.54, 0.9],
                                 [0.4, 0.5, 0.54, 0.4, 0.5, 0.54, 0.4, 0.5, 0.54, 0.9]]]))
cce = tf.keras.losses.CategoricalCrossentropy()
loss = cce(label, prediction)

# Manual reference computation of a cross-entropy-like value from the raw prediction.
def custom_cce(label, prediction):
  return tf.reduce_mean(tf.math.log(tf.matmul(label, tf.transpose(prediction)) / tf.reduce_sum(prediction)))

print("before numpy")
print(prediction)
print(loss)
print("before numpy")
print("after custom_cce")
print(custom_cce(label, prediction))
print("after custom_cce")

# Prediction B: the same values as a NumPy array; the reported loss comes out different.
prediction = prediction.numpy()
loss = cce(label, prediction)
print("after numpy")
print(prediction)
print(loss)
print("after numpy")

Relevant log output

before numpy
tf.Tensor([[0.5021686  0.48505145 0.5026401 ]], shape=(1, 3), dtype=float32)
tf.Tensor(1.0769439, shape=(), dtype=float32)
before numpy
after custom_cce
tf.Tensor(-1.0875016, shape=(), dtype=float32)
after custom_cce
after numpy
[[0.5021686  0.48505145 0.5026401 ]]
tf.Tensor(1.0875016, shape=(), dtype=float32)
after numpy
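
The printed prediction values are the same before and after the .numpy() conversion, yet the reported losses differ (1.0769439 vs 1.0875016). One way to narrow this down is the following hedged diagnostic sketch, which reuses the model, label, and cce objects from the reproduction above:

import numpy as np

x = tf.constant([[[0.4, 0.5, 0.54, 0.4, 0.5, 0.54, 0.4, 0.5, 0.54, 0.9],
                  [0.4, 0.5, 0.54, 0.4, 0.5, 0.54, 0.4, 0.5, 0.54, 0.9]]])
pred_tensor = model(x)            # eager tensor straight from the model
pred_numpy = pred_tensor.numpy()  # the same values copied into a NumPy array

# The raw values are identical element for element ...
print(np.array_equal(pred_tensor.numpy(), pred_numpy))

# ... so if these two losses still differ, the discrepancy would come from how
# the loss handles the two input types rather than from the predicted values.
print(cce(label, pred_tensor))
print(cce(label, pred_numpy))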
@google-ml-butler google-ml-butler bot added the type:bug Bug label Jun 23, 2022
@sushreebarsa sushreebarsa added TF 2.8 comp:apis Highlevel API related issues labels Jun 24, 2022
sushreebarsa (Contributor) commented

@chunduriv I was able to replicate the issue on Colab; please find the gist here for reference. Thank you!

@chunduriv chunduriv added comp:keras Keras related issues and removed comp:apis Highlevel API related issues labels Jun 28, 2022
chunduriv (Contributor) commented

@Xinshuai-Lyu,

Thank you for opening this issue.

Development of Keras has moved to a separate repository: https://github.com/keras-team/keras/issues.

Please post this issue in the keras-team/keras repo. For more details, please refer here.

@chunduriv chunduriv added the stat:awaiting response Status - Awaiting response from author label Jun 28, 2022
google-ml-butler bot commented

Are you satisfied with the resolution of your issue?

Xinshuai-Lyu (Author) commented

keras-team/tf-keras#537
