LSTM model has a concerning loss graph and returns a constant value on predictions

My dataset has 597,515 rows and 31 feature columns, each measuring the same variable at a different time step, so the input shape per sample is (31, 1).
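To make that concrete, the reshape I apply below turns each row of 31 readings into a (31, 1) sequence. A toy sketch with made-up data:

```python
import numpy as np

# Toy stand-in for the feature matrix: 4 samples, 31 readings of one variable
features_2d = np.arange(4 * 31, dtype=np.float32).reshape(4, 31)

# Keras LSTMs expect (samples, timesteps, features), so the 31 columns
# become 31 time steps of a single feature
features_3d = features_2d.reshape(features_2d.shape[0], 31, 1)

print(features_3d.shape)  # (4, 31, 1)
```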

import pandas as pd
import tensorflow as tf

# Let TensorFlow grow GPU memory as needed instead of claiming it all up front
gpu_devices = tf.config.experimental.list_physical_devices("GPU")
for device in gpu_devices:
    tf.config.experimental.set_memory_growth(device, True)

# Drop the index column left by a previous to_csv, then do a random 80/20 split
raw_dataset = pd.read_csv('data.csv').drop(columns=['Unnamed: 0'])
train_set = raw_dataset.sample(frac=0.8, random_state=42)
test_set = raw_dataset.drop(train_set.index)

import numpy as np

train_features = train_set.copy()
test_features = test_set.copy()

# 'T+1' is the target: the variable's value one step ahead
train_labels = train_features.pop('T+1')
test_labels = test_features.pop('T+1')

# Reshape (samples, 31) -> (samples, timesteps=31, features=1) for the LSTM
train_features = train_features.to_numpy().reshape(train_features.shape[0], 31, 1)
test_features = test_features.to_numpy().reshape(test_features.shape[0], 31, 1)

Here’s my model. The hyperparameters were chosen via Optuna, though I found that different configurations produce the same phenomenon, just with different values.

model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(11, return_sequences=True))
model.add(tf.keras.layers.LSTM(4, return_sequences=True))
model.add(tf.keras.layers.LSTM(13, return_sequences=False))
model.add(tf.keras.layers.Dense(1))
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0940840143887362),
              loss=tf.keras.losses.MeanAbsoluteError(),
              metrics=['mae', 'mse'])
history = model.fit(train_features,
                    train_labels,
                    validation_split=0.2,
                    batch_size=65536,
                    epochs=100,
                    verbose=True
                    )

Training produces the following loss graph:

(image: training/validation loss graph)

Looking at the predictions, the model returns the same value for every input:

array([[-9.407474],
       [-9.407474],
       [-9.407474],
       ...,
       [-9.407474],
       [-9.407474],
       [-9.407474]], dtype=float32)
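For what it's worth, since the loss here is MAE: when a network collapses to a constant output, the constant that minimizes MAE is the median of the training labels, so -9.407474 is worth sanity-checking against the label median. A toy sketch (the labels below are made up, not my real data):

```python
import numpy as np

# Toy labels standing in for train_labels (the real ones are not shown here)
labels = np.array([-12.0, -9.5, -9.4, -9.3, -2.0, 4.0, 11.0])

# Evaluate the MAE of every constant predictor on a fine grid
candidates = np.linspace(labels.min(), labels.max(), 2001)
mae = np.abs(labels[:, None] - candidates[None, :]).mean(axis=0)
best_constant = candidates[np.argmin(mae)]

# The MAE-optimal constant lands at (approximately) the label median
print(best_constant, np.median(labels))
```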

And this is what it looks like plotted against the first 50 test label values:

(image: constant predictions plotted against the first 50 test labels)

I’m pretty much at a loss, and any help would be greatly appreciated.

  • Where is the time dimension in your problem? I feel you are using the LSTM incorrectly.


  • @user2586955 I’m not sure I understand your question. Each value corresponds to the variable at a specific time, and I’m trying to identify dependencies between time steps. There are 31 columns, hence 31 time steps, so an LSTM seems appropriate here. What am I missing?


  • Try scaling your series; I think that would be the first step toward improving performance.

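Following the scaling suggestion, a minimal sketch of what I understand is meant, using plain NumPy standardization with statistics from the training set only (toy data, not my real features):

```python
import numpy as np

# Toy stand-ins for the real train/test feature matrices (shape (samples, 31))
rng = np.random.default_rng(42)
train_2d = rng.normal(loc=100.0, scale=15.0, size=(8, 31))
test_2d = rng.normal(loc=100.0, scale=15.0, size=(4, 31))

# Standardize with statistics from the training set ONLY, to avoid leakage
mean = train_2d.mean(axis=0)
std = train_2d.std(axis=0)
train_scaled = (train_2d - mean) / std
test_scaled = (test_2d - mean) / std

# Then reshape to (samples, timesteps=31, features=1) as before
train_scaled = train_scaled.reshape(-1, 31, 1)
test_scaled = test_scaled.reshape(-1, 31, 1)
```

The labels would typically be scaled the same way (with predictions inverse-transformed afterward) so the network isn't chasing raw magnitudes.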
