Restoring a TensorFlow model

I am trying to restore a TensorFlow model. I followed this example: http://nasdag.github.io/blog/2016/01/19/classifying-bees-with-google-tensorflow/

At the end of the code in the example I added these lines:

    saver = tf.train.Saver()
    save_path = saver.save(sess, "model.ckpt")
    print("Model saved in file: %s" % save_path)

Two files were created: checkpoint and model.ckpt.

In a new Python file (tomas_bees_predict.py), I have this code:

    import tensorflow as tf

    saver = tf.train.Saver()

    with tf.Session() as sess:
        # Restore variables from disk.
        saver.restore(sess, "model.ckpt")
        print("Model restored.")

However, when I run the code, I get this error:

    Traceback (most recent call last):
      File "tomas_bees_predict.py", line 3, in <module>
        saver = tf.train.Saver()
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 705, in __init__
        raise ValueError("No variables to save")
    ValueError: No variables to save

Is there a way to read the model.ckpt file and see which variables are saved? Or maybe someone can help with saving the model and restoring it, based on the example described above?

EDIT 1:

I think I tried running the same code to recreate the model structure and I was still getting the error. I think it could be related to the fact that the code described here does not use named variables: http://nasdag.github.io/blog/2016/01/19/classifying-bees-with-google-tensorflow/

    def weight_variable(shape):
        initial = tf.truncated_normal(shape, stddev=0.1)
        return tf.Variable(initial)

    def bias_variable(shape):
        initial = tf.constant(0.1, shape=shape)
        return tf.Variable(initial)

So I did this experiment. I wrote two versions of the code that saves the model (with and without named variables), plus the code that restores the model.

tensor_save_named_vars.py:

    import tensorflow as tf

    # Create some variables.
    v1 = tf.Variable(1, name="v1")
    v2 = tf.Variable(2, name="v2")

    # Add an op to initialize the variables.
    init_op = tf.initialize_all_variables()

    # Add ops to save and restore all the variables.
    saver = tf.train.Saver()

    # Later, launch the model, initialize the variables, do some work, save the
    # variables to disk.
    with tf.Session() as sess:
        sess.run(init_op)
        print "v1 = ", v1.eval()
        print "v2 = ", v2.eval()
        # Save the variables to disk.
        save_path = saver.save(sess, "/tmp/model.ckpt")
        print "Model saved in file: ", save_path

tensor_save_not_named_vars.py:

    import tensorflow as tf

    # Create some variables.
    v1 = tf.Variable(1)
    v2 = tf.Variable(2)

    # Add an op to initialize the variables.
    init_op = tf.initialize_all_variables()

    # Add ops to save and restore all the variables.
    saver = tf.train.Saver()

    # Later, launch the model, initialize the variables, do some work, save the
    # variables to disk.
    with tf.Session() as sess:
        sess.run(init_op)
        print "v1 = ", v1.eval()
        print "v2 = ", v2.eval()
        # Save the variables to disk.
        save_path = saver.save(sess, "/tmp/model.ckpt")
        print "Model saved in file: ", save_path

tensor_restore.py:

    import tensorflow as tf

    # Create some variables.
    v1 = tf.Variable(0, name="v1")
    v2 = tf.Variable(0, name="v2")

    # Add ops to save and restore all the variables.
    saver = tf.train.Saver()

    # Later, launch the model, use the saver to restore variables from disk, and
    # do some work with the model.
    with tf.Session() as sess:
        # Restore variables from disk.
        saver.restore(sess, "/tmp/model.ckpt")
        print "Model restored."
        print "v1 = ", v1.eval()
        print "v2 = ", v2.eval()

This is what I get when I run this code:

    $ python tensor_save_named_vars.py
    I tensorflow/core/common_runtime/local_device.cc:40] Local device intra op parallelism threads: 4
    I tensorflow/core/common_runtime/direct_session.cc:58] Direct session inter op parallelism threads: 4
    v1 = 1
    v2 = 2
    Model saved in file: /tmp/model.ckpt

    $ python tensor_restore.py
    I tensorflow/core/common_runtime/local_device.cc:40] Local device intra op parallelism threads: 4
    I tensorflow/core/common_runtime/direct_session.cc:58] Direct session inter op parallelism threads: 4
    Model restored.
    v1 = 1
    v2 = 2

    $ python tensor_save_not_named_vars.py
    I tensorflow/core/common_runtime/local_device.cc:40] Local device intra op parallelism threads: 4
    I tensorflow/core/common_runtime/direct_session.cc:58] Direct session inter op parallelism threads: 4
    v1 = 1
    v2 = 2
    Model saved in file: /tmp/model.ckpt

    $ python tensor_restore.py
    I tensorflow/core/common_runtime/local_device.cc:40] Local device intra op parallelism threads: 4
    I tensorflow/core/common_runtime/direct_session.cc:58] Direct session inter op parallelism threads: 4
    W tensorflow/core/common_runtime/executor.cc:1076] 0x7ff953881e40 Compute status: Not found: Tensor name "v2" not found in checkpoint files /tmp/model.ckpt
         [[Node: save/restore_slice_1 = RestoreSlice[dt=DT_INT32, preferred_shard=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/restore_slice_1/tensor_name, save/restore_slice_1/shape_and_slice)]]
    W tensorflow/core/common_runtime/executor.cc:1076] 0x7ff953881e40 Compute status: Not found: Tensor name "v1" not found in checkpoint files /tmp/model.ckpt
         [[Node: save/restore_slice = RestoreSlice[dt=DT_INT32, preferred_shard=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/restore_slice/tensor_name, save/restore_slice/shape_and_slice)]]
    Traceback (most recent call last):
      File "tensor_restore.py", line 14, in <module>
        saver.restore(sess, "/tmp/model.ckpt")
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 891, in restore
        sess.run([self._restore_op_name], {self._filename_tensor_name: save_path})
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 368, in run
        results = self._do_run(target_list, unique_fetch_targets, feed_dict_string)
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 444, in _do_run
        e.code)
    tensorflow.python.framework.errors.NotFoundError: Tensor name "v2" not found in checkpoint files /tmp/model.ckpt
         [[Node: save/restore_slice_1 = RestoreSlice[dt=DT_INT32, preferred_shard=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/restore_slice_1/tensor_name, save/restore_slice_1/shape_and_slice)]]
    Caused by op u'save/restore_slice_1', defined at:
      File "tensor_restore.py", line 8, in <module>
        saver = tf.train.Saver()
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 713, in __init__
        restore_sequentially=restore_sequentially)
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 432, in build
        filename_tensor, vars_to_save, restore_sequentially, reshape)
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 191, in _AddRestoreOps
        values = self.restore_op(filename_tensor, vs, preferred_shard)
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 106, in restore_op
        preferred_shard=preferred_shard)
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/io_ops.py", line 189, in _restore_slice
        preferred_shard, name=name)
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 271, in _restore_slice
        preferred_shard=preferred_shard, name=name)
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/op_def_library.py", line 664, in apply_op
        op_def=op_def)
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1834, in create_op
        original_op=self._default_original_op, op_def=op_def)
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1043, in __init__
        self._traceback = _extract_stack()

So maybe the original code (see the external link above) could be modified to something like this:

    def weight_variable(shape):
        initial = tf.truncated_normal(shape, stddev=0.1)
        weight_var = tf.Variable(initial, name="weight_var")
        return weight_var

    def bias_variable(shape):
        initial = tf.constant(0.1, shape=shape)
        bias_var = tf.Variable(initial, name="bias_var")
        return bias_var

But then the question I have is: is restoring the weight_var and bias_var variables enough to implement prediction? I did the training on a powerful machine with a GPU, and I would like to copy the model to a less powerful computer without a GPU to run predictions.
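On the first part of the question (seeing which variables a checkpoint holds), one option that should work is `tf.train.list_variables`, which returns the `(name, shape)` pairs stored in a checkpoint. The sketch below is not from the original post: it assumes a TensorFlow 2.x install where the 1.x API lives under `tf.compat.v1`, and the helper names `save_demo_checkpoint` and `checkpoint_variable_names` are made up for illustration.

```python
import os
import tempfile

import tensorflow as tf

tf1 = tf.compat.v1  # assumption: TensorFlow 2.x exposing the 1.x API here


def save_demo_checkpoint(directory):
    """Save one named and one unnamed variable; return the checkpoint prefix."""
    graph = tf.Graph()
    with graph.as_default():
        tf1.Variable(1, name="v1")
        tf1.Variable(2)  # no name given, so TensorFlow picks a default one
        saver = tf1.train.Saver()
        with tf1.Session(graph=graph) as sess:
            sess.run(tf1.global_variables_initializer())
            return saver.save(sess, os.path.join(directory, "model.ckpt"))


def checkpoint_variable_names(ckpt_prefix):
    """Return the sorted variable names stored in the checkpoint."""
    return sorted(name for name, _shape in tf.train.list_variables(ckpt_prefix))


if __name__ == "__main__":
    ckpt = save_demo_checkpoint(tempfile.mkdtemp())
    print(checkpoint_variable_names(ckpt))
```

The unnamed variable is stored under an automatically generated name rather than under the Python identifier, which would be consistent with the `Tensor name "v2" not found` error in the experiment above.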

There is a similar question here: Tensorflow: how to save/restore a model? TL;DR: you need to recreate the model structure using the same sequence of TensorFlow API calls before using the Saver object to restore the weights.

This is suboptimal; follow GitHub issue #696 for progress on making this easier.
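A minimal sketch of that recipe, under stated assumptions: the tiny linear model, the variable names `weights` and `bias`, and the helper functions are all hypothetical stand-ins for the bee classifier, and the code uses `tf.compat.v1` so it runs on TensorFlow 2.x. The point is only that the same `build_model()` is called before both `saver.save` and `saver.restore`.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # assumption: TensorFlow 2.x exposing the 1.x API here


def build_model():
    """Hypothetical stand-in model: y = x * W + b, with explicitly named variables."""
    x = tf1.placeholder(tf.float32, shape=[None, 2], name="x")
    w = tf1.Variable(tf1.truncated_normal([2, 1], stddev=0.1), name="weights")
    b = tf1.Variable(tf.constant(0.1, shape=[1]), name="bias")
    y = tf.matmul(x, w) + b
    return x, y, w


def train_and_save(directory):
    """Build the graph, initialize it (training omitted) and save a checkpoint."""
    graph = tf.Graph()
    with graph.as_default():
        _x, _y, w = build_model()
        saver = tf1.train.Saver()  # created after the variables exist
        with tf1.Session(graph=graph) as sess:
            sess.run(tf1.global_variables_initializer())
            prefix = saver.save(sess, os.path.join(directory, "model.ckpt"))
            return prefix, sess.run(w)


def restore_and_predict(prefix, inputs):
    """Rebuild the same structure in a fresh graph, then restore and predict."""
    graph = tf.Graph()
    with graph.as_default():
        x, y, w = build_model()
        saver = tf1.train.Saver()
        with tf1.Session(graph=graph) as sess:
            saver.restore(sess, prefix)  # no initializer needed after restore
            return sess.run(y, feed_dict={x: inputs}), sess.run(w)


if __name__ == "__main__":
    ckpt, saved_w = train_and_save(tempfile.mkdtemp())
    preds, restored_w = restore_and_predict(ckpt, np.ones((3, 2), dtype=np.float32))
    print(np.allclose(saved_w, restored_w), preds.shape)
```

Keeping the graph construction in one function that both the training script and the prediction script import is one way to make sure the two scripts create variables under the same names.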

This problem is probably caused by name scope variations that arise when the network is created twice.

Put the command:

    tf.reset_default_graph()

before creating the network.
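To illustrate the naming effect, here is a small sketch (not from the original answer) that creates a variable called `v1` twice in the default graph, with and without a reset in between. It assumes TensorFlow 2.x with eager execution disabled so the default graph behaves as in 1.x; `build_twice` is a hypothetical helper.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # assumption: TensorFlow 2.x exposing the 1.x API here
tf1.disable_eager_execution()  # use a global default graph, as in TF 1.x


def build_twice(reset_between):
    """Create a variable named "v1" twice and return both resulting names."""
    tf1.reset_default_graph()  # start from a clean graph for the demo
    first = tf1.Variable(0, name="v1")
    if reset_between:
        tf1.reset_default_graph()
    second = tf1.Variable(0, name="v1")
    return first.name, second.name


if __name__ == "__main__":
    print(build_twice(reset_between=False))  # second name gets a suffix
    print(build_twice(reset_between=True))   # both are plain "v1:0"
```

Without the reset, the second variable is renamed to keep names unique, so a Saver built on that graph would look for a name that the checkpoint does not contain.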

If a problem like this occurs, try restarting your kernel: the variables created in the current run conflict with the ones left over from previous runs, which is why NotFoundError and similar problems show up.

I ran into the same kind of problem and restarting the kernel worked for me. (Caution: try to avoid running the code several times in the same kernel, since recreating the variables can overwrite the existing ones and ruin your model file, so you end up changing the original values.)

Make sure the tf.train.Saver() statement is inside the `with tf.Session() as sess` block.
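As far as I can tell, what matters for the original `ValueError: No variables to save` is that the graph already contains variables when `tf.train.Saver()` is constructed; tomas_bees_predict.py creates the Saver right after the import, before any variable exists. The sketch below is an illustration under that assumption, again using `tf.compat.v1` on TensorFlow 2.x, with hypothetical helper names.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # assumption: TensorFlow 2.x exposing the 1.x API here


def saver_on_empty_graph():
    """Try to build a Saver before any variable exists; return the error text."""
    with tf.Graph().as_default():
        try:
            tf1.train.Saver()
        except ValueError as err:
            return str(err)
    return None


def saver_after_variables():
    """Build the variables first, then the Saver; return the variable names."""
    with tf.Graph().as_default():
        tf1.Variable(0, name="v1")
        tf1.Variable(0, name="v2")
        tf1.train.Saver()  # succeeds because the graph now has variables
        return sorted(v.op.name for v in tf1.global_variables())


if __name__ == "__main__":
    print(saver_on_empty_graph())
    print(saver_after_variables())
```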