
Issue with the loss_weights parameter of model.compile() when the model returns multiple outputs #67405

Open
samrajput1143 opened this issue May 12, 2024 · 4 comments

samrajput1143 commented May 12, 2024

Issue type

Bug

Have you reproduced the bug with TensorFlow Nightly?

No

Source

binary

TensorFlow version

2.11.0

Custom code

Yes

OS platform and distribution

Ubuntu 18.04

Mobile device

No response

Python version

3.7.5

Bazel version

No response

GCC/compiler version

7.5.0

CUDA/cuDNN version

cuda_11.8/cuDNN 9.1.0.70

GPU model and memory

Tesla P40/ 24 gb

Current behavior?

Even though I have provided the loss_weights parameter, TensorFlow 2.11.0 still computes the loss for every output in the outputs list against the single set of true values.
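
For reference, a minimal sketch of the usual workaround in TF 2.x (assuming the intent is to train only on the decoder predictions): compile() also accepts a per-output list of losses, and an output mapped to None is skipped entirely rather than computed and then scaled by zero:

# Sketch: Huber on the first output only; None disables loss computation
# for the eight state outputs instead of weighting the result by 0.0
model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss=[tf.keras.losses.Huber()] + [None] * 8,
)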

Standalone code to reproduce the issue

import numpy as np
import tensorflow as tf
# Import regularizers from tensorflow.keras rather than bare keras;
# mixing the two packages can cause subtle breakage in TF 2.11
from tensorflow.keras import regularizers
from tensorflow.keras.initializers import glorot_uniform
from tensorflow.keras.layers import Dense, Input, LSTM, RepeatVector, TimeDistributed
from tensorflow.keras.models import Model

def create_encoder_decoder_model(n_a, n_s, Tx, Ty, xFeatures, yFeatures):
    # Encoder
    encoder_inputs = Input(shape=(Tx, xFeatures))
    s1 = Input(shape=(n_a,), name='s1')  # hidden state
    c1 = Input(shape=(n_a,), name='c1')  # cell state
    s2 = Input(shape=(n_a,), name='s2')
    c2 = Input(shape=(n_a,), name='c2')
    s3 = Input(shape=(n_a,), name='s3')
    c3 = Input(shape=(n_a,), name='c3')
    s4 = Input(shape=(n_a,), name='s4')
    c4 = Input(shape=(n_a,), name='c4')

    encoder_lstm1, hiddenState1, cellState1 = LSTM(n_a, return_sequences=True, return_state=True)(encoder_inputs, initial_state=[s1, c1])
    encoder_lstm2, hiddenState2, cellState2 = LSTM(n_a, return_state=True)(encoder_lstm1, initial_state=[s2, c2])
    # Repeat the encoder summary vector Ty times to feed the decoder
    repeat_vector = RepeatVector(Ty)(encoder_lstm2)
    # Decoder
    decoder_lstm1, hiddenState3, cellState3 = LSTM(n_s, return_sequences=True, return_state=True)(repeat_vector, initial_state=[s3, c3])
    decoder_lstm2, hiddenState4, cellState4 = LSTM(n_s, return_sequences=True, return_state=True)(decoder_lstm1, initial_state=[s4, c4])
    decoder_outputs = TimeDistributed(Dense(yFeatures, activation='relu',
                                            kernel_initializer=glorot_uniform(seed=0),
                                            kernel_regularizer=regularizers.l2(0.01)))(decoder_lstm2)

    # The model exposes the decoder predictions plus every LSTM state as an output
    model = Model(inputs=[encoder_inputs, s1, c1, s2, c2, s3, c3, s4, c4],
                  outputs=[decoder_outputs, hiddenState1, cellState1, hiddenState2, cellState2,
                           hiddenState3, cellState3, hiddenState4, cellState4])
    return model

epochs=10
n_a=64
n_s=64
n_past=40
n_future=20
xFeatures=11
yFeatures=6
batch_size=32
X_lstm_train=np.random.random((1000,40,11))
X_lstm_val=np.random.random((100,40,11))
X_lstm_test=np.random.random((100,40,11))
Y_train = np.random.random((1000,20,6))
Y_test = np.random.random((100,20,6))
Y_val = np.random.random((100,20,6))  # must match the 100 validation samples in X_lstm_val

s0 = np.zeros((X_lstm_train.shape[0], n_s))
c0 = np.zeros((X_lstm_train.shape[0], n_s))
s_val = np.zeros((X_lstm_val.shape[0], n_s))
c_val = np.zeros((X_lstm_val.shape[0], n_s))
s_test = np.zeros((X_lstm_test.shape[0], n_s))
c_test = np.zeros((X_lstm_test.shape[0], n_s))

model = create_encoder_decoder_model(n_a, n_s, n_past, n_future, xFeatures,yFeatures)
model.summary()  # summary() prints directly; wrapping it in print() also emits "None"

reduce_lr = tf.keras.callbacks.LearningRateScheduler(
    lambda x: 1e-3 * 0.90 ** x)
early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3, verbose=1)
#model_checkpoint = tf.keras.callbacks.ModelCheckpoint(filepath='model_checkpoint.h5', monitor='val_accuracy', save_best_only=True)

# model.compile(optimizer=tf.keras.optimizers.Adam(), loss=tf.keras.losses.Huber())
model.compile(optimizer=tf.keras.optimizers.Adam(), loss=tf.keras.losses.Huber(),loss_weights=[1.0]+[0.0]*8)
history = model.fit([X_lstm_train, s0, c0, s0, c0, s0, c0, s0, c0],
                    Y_train,  # target data: decoder outputs for training
                    epochs=epochs,
                    batch_size=batch_size,
                    validation_data=([X_lstm_val, s_val, c_val, s_val, c_val,
                                      s_val, c_val, s_val, c_val], Y_val),
                    shuffle=False,  # no shuffling
                    callbacks=[early_stopping, reduce_lr])

prediction = np.array(model.predict([X_lstm_test, s_test, c_test, s_test, c_test, s_test, c_test, s_test, c_test])[0])  # [0] selects decoder_outputs
print(prediction.shape)

Relevant log output

Epoch 1/10
Traceback (most recent call last):
  File "test2.py", line 86, in <module>
    callbacks=[early_stopping,reduce_lr]
  File "/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/tmp/__autograph_generated_filek1437efx.py", line 15, in tf__train_function
    retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
ValueError: in user code:

    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1249, in train_function  *
        return step_function(self, iterator)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1233, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1222, in run_step  **
        outputs = model.train_step(data)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1024, in train_step
        loss = self.compute_loss(x, y, y_pred, sample_weight)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1083, in compute_loss
        y, y_pred, sample_weight, regularization_losses=self.losses
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/compile_utils.py", line 265, in __call__
        loss_value = loss_obj(y_t, y_p, sample_weight=sw)
    File "/usr/local/lib/python3.7/dist-packages/keras/losses.py", line 152, in __call__
        losses = call_fn(y_true, y_pred)
    File "/usr/local/lib/python3.7/dist-packages/keras/losses.py", line 284, in call  **
        return ag_fn(y_true, y_pred, **self._fn_kwargs)
    File "/usr/local/lib/python3.7/dist-packages/keras/losses.py", line 1893, in huber
        error = tf.subtract(y_pred, y_true)

    ValueError: Dimensions must be equal, but are 64 and 6 for '{{node huber_loss_1/Sub}} = Sub[T=DT_FLOAT](model/lstm/PartitionedCall:2, IteratorGetNext:9)' with input shapes: [?,64], [?,20,6].
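
Reading the trace: huber_loss_1 is the Huber loss applied to the model's second output, hiddenState1 with shape [?, 64], against the lone target array Y_train with shape [?, 20, 6]. With a single loss object and a single target, Keras pairs them with every one of the nine outputs before loss_weights is ever applied. One workaround, sketched below (the name train_model is illustrative), is to train a view of the model that exposes only the decoder predictions:

# Sketch: reuse the already-built graph, but train on decoder_outputs alone
train_model = Model(inputs=model.inputs, outputs=model.outputs[0])
train_model.compile(optimizer=tf.keras.optimizers.Adam(),
                    loss=tf.keras.losses.Huber())
# train_model.fit(...) then takes Y_train as its only target; the original
# model can still be used when the LSTM states are needed at inference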
google-ml-butler bot added the type:bug Bug label May 12, 2024
SuryanarayanaY added the TF 2.11 Issues related to TF 2.11 and comp:keras Keras related issues labels May 14, 2024
SuryanarayanaY (Collaborator) commented:

Hi @samrajput1143,

Could you please test with tf-nightly and let us know whether this is reproducible with the nightly version as well?
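
For reference, the nightly build is published on PyPI as tf-nightly and can typically be installed in a fresh virtual environment with:

pip install tf-nightly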

SuryanarayanaY added the stat:awaiting response Status - Awaiting response from author label May 14, 2024
samrajput1143 (Author) commented:

I have Python version 3.7.5; can I install tf-nightly?

google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label May 14, 2024
tilakrayal (Contributor) commented:

@samrajput1143,
This does not look like an issue with TensorFlow itself. Also, TensorFlow v2.11 is a fairly old version; could you please try executing the code with the latest TensorFlow v2.16, which supports Python v3.9-3.12? Kindly open a thread on the TensorFlow discussion forum (https://discuss.tensorflow.org/) for this, as it is not a bug or feature request.
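
For reference, assuming a Python 3.9-3.12 environment is available, the suggested release can typically be installed with (the exact version pin is illustrative):

pip install --upgrade "tensorflow==2.16.*"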

tilakrayal added the stat:awaiting response Status - Awaiting response from author and type:support Support issues labels and removed the type:bug Bug label Jun 14, 2024
github-actions bot commented:

This issue is stale because it has been open for 7 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions bot added the stale This label marks the issue/pr stale - to be closed automatically if no activity label Jun 22, 2024