Keras LSTM layer takes too long to train

Every time I try LSTM models in Keras, the model seems impossible to train because training takes so long.

For example, a model like this one takes 80 seconds per step to train:

    def create_model(self):
        inputs = {}
        inputs['input'] = []
        lstm = []
        placeholder = {}
        # one Input + LSTM branch per timeframe
        for tf, v in self.env.timeframes.items():
            inputs[tf] = Input(shape=v['shape'], name=tf)
            lstm.append(LSTM(8)(inputs[tf]))
            inputs['input'].append(inputs[tf])
        account = Input(shape=(3,), name='account')
        account_ = Dense(8, activation='relu')(account)  # note: account_ is never used below
        dt = Input(shape=(7,), name='dt')
        dt_ = Dense(16, activation='relu')(dt)           # note: dt_ is never used below
        inputs['input'].extend([account, dt])
        data = Concatenate(axis=1)(lstm)
        data = Dense(128, activation='relu')(data)
        y = Concatenate(axis=1)([data, account, dt])
        y = Dense(256, activation='relu')(y)
        y = Dense(64, activation='relu')(y)
        y = Dense(16, activation='relu')(y)
        output = Dense(3, activation='linear')(y)
        model = Model(inputs=inputs['input'], outputs=output)
        model.compile(loss='mse', optimizer='adam', metrics=['mae'])
        return model

Whereas the model that replaces the LSTM with Flatten + Dense, like this:

    def create_model(self):
        inputs = {}
        inputs['input'] = []
        lstm = []
        placeholder = {}
        for tf, v in self.env.timeframes.items():
            inputs[tf] = Input(shape=v['shape'], name=tf)
            # lstm.append(LSTM(8)(inputs[tf]))
            placeholder[tf] = Flatten()(inputs[tf])
            lstm.append(Dense(32, activation='relu')(placeholder[tf]))
            inputs['input'].append(inputs[tf])
        account = Input(shape=(3,), name='account')
        account_ = Dense(8, activation='relu')(account)
        dt = Input(shape=(7,), name='dt')
        dt_ = Dense(16, activation='relu')(dt)
        inputs['input'].extend([account, dt])
        data = Concatenate(axis=1)(lstm)
        data = Dense(128, activation='relu')(data)
        y = Concatenate(axis=1)([data, account, dt])
        y = Dense(256, activation='relu')(y)
        y = Dense(64, activation='relu')(y)
        y = Dense(16, activation='relu')(y)
        output = Dense(3, activation='linear')(y)
        model = Model(inputs=inputs['input'], outputs=output)
        model.compile(loss='mse', optimizer='adam', metrics=['mae'])
        return model

takes only 45-50 ms per step to train.

Is there something wrong with the model that is causing this? Or is this as fast as this model will run?

– self.env.timeframes looks like this (a dictionary with 9 items):

    timeframes = {
        's1':  {'lookback': 86400, 'word': '1 s',    'unit': 1,     'offset': 12},
        's5':  {'lookback': 200,   'word': '5 s',    'unit': 5,     'offset': 2},
        'm1':  {'lookback': 100,   'word': '1 min',  'unit': 60,    'offset': 0},
        'm5':  {'lookback': 100,   'word': '5 min',  'unit': 300,   'offset': 0},
        'm30': {'lookback': 100,   'word': '30 min', 'unit': 1800,  'offset': 0},
        'h1':  {'lookback': 200,   'word': '1 h',    'unit': 3600,  'offset': 0},
        'h4':  {'lookback': 200,   'word': '4 h',    'unit': 14400, 'offset': 0},
        'h12': {'lookback': 100,   'word': '12 h',   'unit': 43200, 'offset': 0},
        'd1':  {'lookback': 200,   'word': '1 d',    'unit': 86400, 'offset': 0}
    }
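As an aside on the numbers involved: assuming `lookback` is the sequence length fed to each `Input`, the `s1` branch alone pushes 86,400 timesteps through its LSTM, and an RNN has to process those sequentially. A quick tally of the lookback values above:

```python
# Sequence lengths implied by the timeframes dict (assuming 'lookback'
# is the number of timesteps each Input receives per sample).
lookbacks = {
    's1': 86400, 's5': 200, 'm1': 100, 'm5': 100, 'm30': 100,
    'h1': 200, 'h4': 200, 'h12': 100, 'd1': 200,
}
total = sum(lookbacks.values())
print(total)                    # 87600 timesteps per sample across all branches
print(lookbacks['s1'] / total)  # the s1 branch alone is ~98.6% of them
```

So whatever else is tuned, the `s1` branch dominates the sequential work the LSTM layers must do.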

GPU information from the console:

    2018-06-30 07:35:16.204320: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
    2018-06-30 07:35:16.495832: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1356] Found device 0 with properties:
    name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.86
    pciBusID: 0000:01:00.0
    totalMemory: 8.00GiB freeMemory: 6.59GiB
    2018-06-30 07:35:16.495981: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1435] Adding visible gpu devices: 0
    2018-06-30 07:35:16.956743: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
    2018-06-30 07:35:16.956827: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:929]      0
    2018-06-30 07:35:16.957540: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 0:   N
    2018-06-30 07:35:16.957865: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6370 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)

If you are using a GPU, replace all LSTM layers with CuDNNLSTM layers. You can import it from keras.layers:

    from keras.layers import CuDNNLSTM

    def create_model(self):
        inputs = {}
        inputs['input'] = []
        lstm = []
        placeholder = {}
        for tf, v in self.env.timeframes.items():
            inputs[tf] = Input(shape=v['shape'], name=tf)
            lstm.append(CuDNNLSTM(8)(inputs[tf]))
            inputs['input'].append(inputs[tf])
        account = Input(shape=(3,), name='account')
        account_ = Dense(8, activation='relu')(account)
        dt = Input(shape=(7,), name='dt')
        dt_ = Dense(16, activation='relu')(dt)
        inputs['input'].extend([account, dt])
        data = Concatenate(axis=1)(lstm)
        data = Dense(128, activation='relu')(data)
        y = Concatenate(axis=1)([data, account, dt])
        y = Dense(256, activation='relu')(y)
        y = Dense(64, activation='relu')(y)
        y = Dense(16, activation='relu')(y)
        output = Dense(3, activation='linear')(y)
        model = Model(inputs=inputs['input'], outputs=output)
        model.compile(loss='mse', optimizer='adam', metrics=['mae'])
        return model

More information here: https://keras.io/layers/recurrent/#cudnnlstm

This will speed up the model significantly =)
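(Side note for newer setups: in TensorFlow 2.x the standalone CuDNNLSTM layer was removed; `tf.keras.layers.LSTM` dispatches to the fused cuDNN kernel on GPU automatically, but only while its defaults are left untouched, e.g. `activation='tanh'`, `recurrent_activation='sigmoid'`, `recurrent_dropout=0`, `unroll=False`. A minimal sketch of checking those defaults:)

```python
import tensorflow as tf

# In TF 2.x the plain LSTM layer uses the fused cuDNN kernel on GPU
# whenever these defaults are kept; changing any of them silently
# falls back to the much slower generic implementation.
layer = tf.keras.layers.LSTM(8)
print(layer.activation.__name__)            # 'tanh'
print(layer.recurrent_activation.__name__)  # 'sigmoid'
```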