Deep learning (LSTM) with Keras and variable input size
I am trying to implement an LSTM model with Keras. The problem is that my data has varying shapes. It looks like this:
col1       col2       col3         col4        col5
[1,2,3]    [2,3,4]    [3,4,5]      [5,6,7]     [4,5,9]
[0,2]      [1,5]      [1,24]       [11,7]      [-1,4]
[0,2,4,5]  [1,5,7,8]  [1,24,-7,6]  [11,7,4,5]  [-1,4,1,2]
My code is:
import numpy as np
import pandas as pd
import h5py
from sklearn.model_selection import train_test_split
from keras.layers import Dense
from keras.layers import Input, LSTM
from keras.models import Model
X_train, X_test, y_train, y_test = train_test_split(X, y_target, test_size=0.2, random_state=1)
batch_size = 32
timesteps = 300
output_size = 1
epochs=120
inputs = Input(batch_shape=(batch_size, timesteps, output_size))
lay1 = LSTM(10, stateful=True, return_sequences=True)(inputs)
lay2 = LSTM(10, stateful=True, return_sequences=True)(lay1)
output = Dense(units = output_size)(lay2)
regressor = Model(inputs=inputs, outputs = output)
regressor.compile(optimizer='adam', loss = 'mae')
regressor.summary()
for i in range(epochs):
    print("Epoch: " + str(i))
    regressor.fit(X_train, y_train, shuffle=False, epochs = 1, batch_size = batch_size)
    regressor.reset_states()
The error I get when I run the code is:
ValueError: Error when checking input: expected input_5 to have 3 dimensions, but got array with shape (11200, 5) #11200 lines, 5 columns
Thanks.
A multidimensional NumPy array needs a well-defined shape, so putting arrays of different lengths inside the same NumPy array results in a NumPy array of objects instead of the desired multidimensional array.
So basically it's not possible to feed your data to Keras in one go.
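To illustrate the point about object arrays, here is a minimal sketch using three sample rows in the style of one column from the question. The variable names `rows`, `ragged`, and `regular` are made up for this example. Note that recent NumPy versions refuse to build a ragged array unless `dtype=object` is given explicitly.

```python
import numpy as np

# Three samples of different lengths, like one column of the question's data.
rows = [[1, 2, 3], [0, 2], [0, 2, 4, 5]]

# dtype=object is passed explicitly; recent NumPy versions raise an error
# for ragged input without it.
ragged = np.array(rows, dtype=object)
print(ragged.shape)  # (3,)  -> only one axis; each element is a Python list
print(ragged.dtype)  # object

# Rows of equal length, by contrast, form a true 2-D numeric array.
regular = np.array([[1, 2, 3], [2, 3, 4]])
print(regular.shape)  # (2, 3)
```

An array like `ragged` has no timestep or feature axis. That is probably why Keras reports only two dimensions for the (11200, 5) input in the question.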
However, there are several possible solutions. Most of them require the timestep dimension of your Keras input shape to be None:
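- Use padding, so that your data always has the same shape
- Train with batch_size = 1
- Sort your data into batches in such a way that, within each batch, every sample has the same shape.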
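The padding option can be sketched with plain NumPy. The helper `pad_to_same_length` and the pad value `0.0` are assumptions made for this illustration, not part of the original answer. Keras also ships its own `pad_sequences` utility, which may be more convenient in practice.

```python
import numpy as np

def pad_to_same_length(sequences, pad_value=0.0):
    """Right-pad 1-D sequences so they stack into (n_samples, max_len, 1)."""
    max_len = max(len(seq) for seq in sequences)
    # Start from an array filled with the pad value, then copy each sequence in.
    padded = np.full((len(sequences), max_len, 1), pad_value, dtype="float32")
    for i, seq in enumerate(sequences):
        padded[i, :len(seq), 0] = seq
    return padded

# One column of the question's data: three samples of lengths 3, 2 and 4.
col1 = [[1, 2, 3], [0, 2], [0, 2, 4, 5]]
X = pad_to_same_length(col1)
print(X.shape)  # (3, 4, 1)
```

Because the sample data itself contains zeros, a pad value of 0 cannot be told apart from real observations. If the padded steps are to be masked out, a sentinel value that never occurs in the data would be a safer choice.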
The second and third options (training with batch_size = 1, or batching samples of equal shape) require the use of fit_generator, because you have to feed the data step by step.
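As a rough sketch of the same-shape batching idea, the generator below groups samples by length and yields one batch per group. The function name `same_length_batches`, the toy `sequences`, and the `targets` values are all hypothetical. Setting `batch_size=1` gives the one-sample-per-batch variant.

```python
import numpy as np
from collections import defaultdict

def same_length_batches(sequences, targets, batch_size):
    """Yield (X, y) batches in which every sequence has the same length."""
    # Group sample indices by sequence length.
    buckets = defaultdict(list)
    for idx, seq in enumerate(sequences):
        buckets[len(seq)].append(idx)
    # Emit batches bucket by bucket, so shapes are uniform within a batch.
    for length in sorted(buckets):
        indices = buckets[length]
        for start in range(0, len(indices), batch_size):
            chunk = indices[start:start + batch_size]
            X = np.array([sequences[i] for i in chunk], dtype="float32")
            y = np.array([targets[i] for i in chunk], dtype="float32")
            # Add a trailing feature axis: (batch, timesteps, 1).
            yield X[..., np.newaxis], y

sequences = [[1, 2, 3], [0, 2], [0, 2, 4, 5], [4, 5, 6], [7, 8]]
targets = [0.5, 1.0, 1.5, 2.0, 2.5]  # made-up target values
batches = list(same_length_batches(sequences, targets, batch_size=2))
for X, y in batches:
    print(X.shape, y.shape)
```

This generator makes a single pass over the data. fit_generator expects a generator that loops indefinitely, so in practice it would be wrapped in a `while True` loop and combined with `steps_per_epoch`. Batches also differ in size here, which does not fit a fixed `batch_shape` like the one in the question.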