
Different cross-entropy loss between the result from calling the model as a function and from the predict function #56555

Closed
Xinshuai-Lyu opened this issue Jun 23, 2022 · 4 comments
Labels: comp:keras Keras related issues, stat:awaiting response Status - Awaiting response from author, TF 2.8, type:bug Bug

Comments

Xinshuai-Lyu commented Jun 23, 2022

Issue Type: Bug
Source: source
Tensorflow Version: 2.8.2
Custom Code: Yes
OS Platform and Distribution: No response
Mobile device: No response
Python version: No response
Bazel version: No response
GCC/Compiler version: No response
CUDA/cuDNN version: No response
GPU model and memory: No response

Current Behaviour?

First, you get a prediction by calling the model as a function; call this prediction A. Then you use the model's predict function and get a seemingly identical prediction; call it B.

If you use CategoricalCrossentropy to compute the loss against the same label, the losses for A and B are different.
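
In other words, the comparison is between the following two prediction paths (a minimal sketch; the model, input shape, and label below are placeholders, not the exact values from the reproduction that follows):

import tensorflow as tf

# Hypothetical toy model, used only to illustrate the two prediction paths.
model = tf.keras.Sequential([
    tf.keras.layers.GRU(8, input_shape=(2, 10)),
    tf.keras.layers.Dense(3, activation="sigmoid"),
])
x = tf.random.uniform((1, 2, 10))
label = tf.constant([[1., 0., 0.]])

pred_a = model(x)           # prediction A: calling the model as a function (eager tensor)
pred_b = model.predict(x)   # prediction B: using model.predict (returns a NumPy array)

cce = tf.keras.losses.CategoricalCrossentropy()
print(cce(label, pred_a))   # loss computed on the tensor output
print(cce(label, pred_b))   # loss computed on the NumPy output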

Standalone code to reproduce the issue

import tensorflow as tf
from tensorflow import keras

class NLPModel(keras.Model):
  def __init__(self, **kwargs):
    super().__init__(**kwargs)
    self.hidden_layer1 = tf.keras.layers.GRU(400)
    self.outputs = tf.keras.layers.Dense(3, activation="sigmoid")
  def call(self, inputs):
    outputs = self.hidden_layer1(inputs)
    outputs = self.outputs(outputs)
    return outputs

model = NLPModel()
label = tf.constant([[1., 0., 0.]])
# Prediction A: call the model as a function; the result is an eager tensor.
prediction = model(tf.constant([[[0.4, 0.5, 0.54, 0.4, 0.5, 0.54, 0.4, 0.5, 0.54, 0.9],
                                 [0.4, 0.5, 0.54, 0.4, 0.5, 0.54, 0.4, 0.5, 0.54, 0.9]]]))
cce = tf.keras.losses.CategoricalCrossentropy()
loss = cce(label, prediction)

# Manual reference computation of a cross-entropy-like value from the raw prediction.
def custom_cce(label, prediction):
  return tf.reduce_mean(tf.math.log(tf.matmul(label, tf.transpose(prediction)) / tf.reduce_sum(prediction)))

print("before numpy")
print(prediction)
print(loss)
print("before numpy")
print("after custom_cce")
print(custom_cce(label, prediction))
print("after custom_cce")

# Prediction B: the same values as a NumPy array; the reported loss comes out different.
prediction = prediction.numpy()
loss = cce(label, prediction)
print("after numpy")
print(prediction)
print(loss)
print("after numpy")

Relevant log output

before numpy
tf.Tensor([[0.5021686  0.48505145 0.5026401 ]], shape=(1, 3), dtype=float32)
tf.Tensor(1.0769439, shape=(), dtype=float32)
before numpy
after custom_cce
tf.Tensor(-1.0875016, shape=(), dtype=float32)
after custom_cce
after numpy
[[0.5021686  0.48505145 0.5026401 ]]
tf.Tensor(1.0875016, shape=(), dtype=float32)
after numpy
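
The printed prediction values are the same before and after the .numpy() conversion, yet the reported losses differ (1.0769439 vs 1.0875016). One way to narrow this down is the following hedged diagnostic sketch, which reuses the model, label, and cce objects from the reproduction above:

import numpy as np

x = tf.constant([[[0.4, 0.5, 0.54, 0.4, 0.5, 0.54, 0.4, 0.5, 0.54, 0.9],
                  [0.4, 0.5, 0.54, 0.4, 0.5, 0.54, 0.4, 0.5, 0.54, 0.9]]])
pred_tensor = model(x)            # eager tensor straight from the model
pred_numpy = pred_tensor.numpy()  # the same values copied into a NumPy array

# The raw values are identical element for element ...
print(np.array_equal(pred_tensor.numpy(), pred_numpy))

# ... so if these two losses still differ, the discrepancy would come from how
# the loss handles the two input types rather than from the predicted values.
print(cce(label, pred_tensor))
print(cce(label, pred_numpy))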
@google-ml-butler google-ml-butler bot added the type:bug Bug label Jun 23, 2022
@sushreebarsa sushreebarsa added TF 2.8 comp:apis Highlevel API related issues labels Jun 24, 2022
sushreebarsa (Contributor) commented

@chunduriv I was able to replicate the issue on Colab; please find the gist here for reference. Thank you!

@chunduriv chunduriv added comp:keras Keras related issues and removed comp:apis Highlevel API related issues labels Jun 28, 2022
chunduriv (Contributor) commented

@Xinshuai-Lyu,

Thank you for opening this issue.

Development of Keras has moved to a separate repository: https://github.com/keras-team/keras/issues.

Please post this issue in the keras-team/keras repo. For more details, please refer here.

@chunduriv chunduriv added the stat:awaiting response Status - Awaiting response from author label Jun 28, 2022
google-ml-butler bot commented

Are you satisfied with the resolution of your issue?

Xinshuai-Lyu (Author) commented

keras-team/tf-keras#537
