How do I store and rebuild a dictionary of weights in TensorFlow?

During training, I keep my weights in a dictionary of TensorFlow variables. I pass the weights dictionary to a 'model' function, together with some data, to obtain the desired output.
After training, I would like to store that dictionary in a file in such a way that I can recreate it. That way I can apply the learned weights simply by passing the weights dictionary, together with new data, to the same model function.
According to the documentation, simply passing the weights dictionary to a Saver should save those weights under the correct names. I should then be able to create the same dictionary in the apply function and restore the saved values. However, when I do this, I get an 'uninitialized value' error. Can anyone help me find what I am doing wrong?

Minimal self-contained code example and the corresponding error:

    import tensorflow as tf
    import numpy as np

    # first train a linear model on random vectors of length 5 and store the trained parameters.
    # Then load those parameters and try to apply them to a new vector.

    def run():
        train_model()
        apply_model()

    def train_model():
        # create random training data: 100 vectors of length 5 for both input and output.
        train_data = np.random.random((100,5))
        train_labels = np.random.random((100,5))

        train_data_node = tf.placeholder(tf.float32, shape=(5), name="train_data_node")
        train_labels_node = tf.placeholder(tf.float32, shape=(5), name="train_labels_node")

        weights = defineWeights()

        prediction = model(train_data_node, weights)
        loss = tf.norm(prediction - train_labels_node)
        train_op = tf.train.AdagradOptimizer(learning_rate=1).minimize(loss)

        saver = tf.train.Saver(weights)
        sess = tf.Session()
        sess.run(tf.global_variables_initializer())

        # train for 50 epochs on all 100 training examples, with a batchsize of 1.
        for _ in range(50):
            for i in range(100):
                batch_data = train_data[i,:]
                batch_labels = train_labels[i,:]
                feed_dict = {train_data_node: batch_data, train_labels_node: batch_labels}
                sess.run([train_op, loss, weights], feed_dict=feed_dict)
        saver.save(sess, '/results/weights')

    def apply_model():
        sess = tf.Session()
        weights = defineWeights()
        new_saver = tf.train.import_meta_graph('/results/weights.meta')
        new_saver.restore(sess, tf.train.latest_checkpoint('/results'))
        print(model(np.random.random(5).astype(np.float32), weights).eval(session=sess))

    def model(data, weights):
        # multiply the matrix weights['a'] with the vector data
        l1 = tf.matmul(tf.expand_dims(data,0), weights['a'])
        l1 = l1 + weights['b']
        return l1

    def defineWeights():
        weights = {
            'a': tf.Variable(tf.random_normal([5, 5], stddev=0.01, dtype = tf.float32), name = 'a'),
            'b': tf.Variable(tf.random_normal([5]), name = 'b'),
        }
        return weights

Calling the function 'run()' in the code above gives the following error:

    Traceback (most recent call last):
      File "", line 1, in
      File "/usr/local/lib/python2.7/dist-packages/myFolder/example.py", line 8, in run
        apply_model()
      File "/usr/local/lib/python2.7/dist-packages/myFolder/example.py", line 50, in apply_model
        print(model(np.random.random(5).astype(np.float32), weights).eval(session=sess))
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 569, in eval
        return _eval_using_default_session(self, feed_dict, self.graph, session)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3741, in _eval_using_default_session
        return session.run(tensors, feed_dict)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 786, in run
        run_metadata_ptr)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 994, in _run
        feed_dict_string, options, run_metadata)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1044, in _do_run
        target_list, options, run_metadata)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1064, in _do_call
        raise type(e)(node_def, op, message)
    tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value a_1
      [[Node: a_1/read = Identity[T=DT_FLOAT, _class=["loc:@a_1"], _device="/job:localhost/replica:0/task:0/gpu:0"](a_1)]]
      [[Node: add_2/_5 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_7_add_2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

    Caused by op u'a_1/read', defined at:
      File "", line 1, in
      File "/usr/local/lib/python2.7/dist-packages/myFolder/example.py", line 8, in run
        apply_model()
      File "/usr/local/lib/python2.7/dist-packages/myFolder/example.py", line 45, in apply_model
        weights = defineWeights()
      File "/usr/local/lib/python2.7/dist-packages/myFolder/example.py", line 63, in defineWeights
        name = 'a'),
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variables.py", line 197, in __init__
        expected_shape=expected_shape)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variables.py", line 316, in _init_from_args
        self._snapshot = array_ops.identity(self._variable, name="read")
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 1338, in identity
        result = _op_def_lib.apply_op("Identity", input=input, name=name)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 768, in apply_op
        op_def=op_def)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2336, in create_op
        original_op=self._default_original_op, op_def=op_def)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1228, in __init__
        self._traceback = _extract_stack()

    FailedPreconditionError (see above for traceback): Attempting to use uninitialized value a_1
      [[Node: a_1/read = Identity[T=DT_FLOAT, _class=["loc:@a_1"], _device="/job:localhost/replica:0/task:0/gpu:0"](a_1)]]
      [[Node: add_2/_5 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_7_add_2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

I edited your code to make it work. This is one possible way to do it; take a look.

    import tensorflow as tf
    import numpy as np

    # first train a linear model on random vectors of length 5 and store the trained parameters.
    # Then load those parameters and try to apply them to a new vector.

    def run():
        train_model()
        apply_model()

    def train_model():
        # create random training data: 100 vectors of length 5 for both input and output.
        train_data = np.random.random((100,5))
        train_labels = np.random.random((100,5))

        train_data_node = tf.placeholder(tf.float32, shape=(5), name="train_data_node")
        train_labels_node = tf.placeholder(tf.float32, shape=(5), name="train_labels_node")

        weights = defineWeights()

        prediction = model(train_data_node, weights)
        # give the output tensor a name so it can be looked up after restoring
        prediction = tf.identity(prediction, name="prediction")

        loss = tf.norm(prediction - train_labels_node)
        train_op = tf.train.AdagradOptimizer(learning_rate=1).minimize(loss)

        saver = tf.train.Saver()
        sess = tf.Session()
        sess.run(tf.global_variables_initializer())

        # train for 50 epochs on all 100 training examples, with a batchsize of 1.
        for _ in range(50):
            for i in range(100):
                batch_data = train_data[i,:]
                batch_labels = train_labels[i,:]
                feed_dict = {train_data_node: batch_data, train_labels_node: batch_labels}
                sess.run([train_op, loss, weights], feed_dict=feed_dict)
        saver.save(sess, 'results/model')

        print("Trained Weights")
        print(sess.run(weights))

    def apply_model():
        sess = tf.Session()
        # rebuild the saved graph from the .meta file, then load the variable values
        new_saver = tf.train.import_meta_graph('results/model.meta')
        new_saver.restore(sess, tf.train.latest_checkpoint('results'))

        print("Loaded Weights")
        print(sess.run(['a:0','b:0']))

        # look up the named tensors in the restored graph
        prediction = tf.get_default_graph().get_tensor_by_name("prediction:0")
        train_data_node = tf.get_default_graph().get_tensor_by_name("train_data_node:0")

        test_data = np.random.random(5).astype(np.float32)
        pred = sess.run([prediction],feed_dict={train_data_node:test_data})
        print("Prediction")
        print(pred)

    def model(data, weights):
        # multiply the matrix weights['a'] with the vector data
        l1 = tf.matmul(tf.expand_dims(data,0), weights['a'])
        l1 = l1 + weights['b']
        return l1

    def defineWeights():
        weights = {
            'a': tf.Variable(tf.random_normal([5, 5], stddev=0.01, dtype = tf.float32), name = 'a'),
            'b': tf.Variable(tf.random_normal([5]), name = 'b'),
        }
        return weights

    def main(_):
        run()

    if __name__ == '__main__':
        tf.app.run(main=main)

Output:

    Trained Weights
    {'a': array([[ 0.01243415, -0.42879951,  0.0174435 , -0.24622701,  0.35309449],
           [ 0.03154161, -0.08194152,  0.09223857, -0.15719411, -0.06323836],
           [-0.03263358,  0.05096304,  0.1769278 , -0.17564282,  0.04325204],
           [-0.17412457, -0.00338688,  0.08468977, -0.06877152, -0.02180972],
           [ 0.25160244, -0.19224152,  0.14535131, -0.20594895, -0.03813718]], dtype=float32),
     'b': array([ 0.33825615,  0.79861975,  0.30609566,  0.91897982,  0.20577262], dtype=float32)}
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:01:00.0)
    Loaded Weights
    [array([[ 0.01243415, -0.42879951,  0.0174435 , -0.24622701,  0.35309449],
           [ 0.03154161, -0.08194152,  0.09223857, -0.15719411, -0.06323836],
           [-0.03263358,  0.05096304,  0.1769278 , -0.17564282,  0.04325204],
           [-0.17412457, -0.00338688,  0.08468977, -0.06877152, -0.02180972],
           [ 0.25160244, -0.19224152,  0.14535131, -0.20594895, -0.03813718]], dtype=float32),
     array([ 0.33825615,  0.79861975,  0.30609566,  0.91897982,  0.20577262], dtype=float32)]
    Prediction
    [array([[ 0.3465074 ,  0.42139536,  0.71310139,  0.30854774,  0.32671657]], dtype=float32)]

Explanation:

  1. Name the tensors you want to access after restoring.
  2. Restore the graph and its variables, then look up the tensors you named, as shown in apply_model().
  3. Feed the new test_data into the placeholder using feed_dict.
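If you would rather keep the weights dictionary and skip import_meta_graph, the following is a minimal sketch of that alternative, not the method used above. It relies on the fact that a Saver built from a dictionary uses the dictionary keys as the names in the checkpoint, so rebuilding the same dictionary and restoring into it should work. Assumptions: it imports tensorflow.compat.v1 so that it also runs on newer TensorFlow releases (on TF 1.0 a plain `import tensorflow as tf` without the disable_v2_behavior() call would be the equivalent), and save_weights, load_weights and the temporary checkpoint path are hypothetical names chosen for illustration.

```python
import os
import tempfile

import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TF >= 1.13 or 2.x

tf.disable_v2_behavior()  # use graph mode, as in the TF 1.x code above


def define_weights():
    # Same dictionary layout as defineWeights() in the question.
    return {
        'a': tf.Variable(tf.random_normal([5, 5], stddev=0.01), name='a'),
        'b': tf.Variable(tf.random_normal([5]), name='b'),
    }


def save_weights(path):
    # Hypothetical helper: stands in for the end of train_model().
    with tf.Graph().as_default():
        weights = define_weights()
        saver = tf.train.Saver(weights)  # dict keys become checkpoint names
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            saver.save(sess, path)
            return sess.run(weights)


def load_weights(path):
    # Hypothetical helper: rebuild the dict in a fresh graph and restore into it.
    with tf.Graph().as_default():
        weights = define_weights()
        saver = tf.train.Saver(weights)
        with tf.Session() as sess:
            # No initializer here: restore() assigns every variable in the dict.
            saver.restore(sess, path)
            return sess.run(weights)


checkpoint_path = os.path.join(tempfile.mkdtemp(), 'weights')
saved = save_weights(checkpoint_path)
loaded = load_weights(checkpoint_path)
print(np.allclose(saved['a'], loaded['a']), np.allclose(saved['b'], loaded['b']))
```

Building each phase in its own tf.Graph() keeps the second call to define_weights() from creating renamed copies such as a_1 in the default graph, which is the variable the original traceback reports as uninitialized.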

Issues:

  1. I tried using sess.run(tf.global_variables_initializer()), but that reinitializes the variables to new random values. (Using TF 1.0)
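The effect of the initializer can be seen in a small sketch: running it before restore is harmless because restore overwrites the values, while running it after restore replaces the loaded values with fresh random ones. This is an illustration under assumptions, not part of the code above: it uses tensorflow.compat.v1 so it runs on newer TensorFlow releases, and build_bias and the temporary path are hypothetical names.

```python
import os
import tempfile

import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TF >= 1.13 or 2.x

tf.disable_v2_behavior()


def build_bias():
    # Hypothetical stand-in for one of the model's variables.
    return tf.Variable(tf.random_normal([5]), name='b')


path = os.path.join(tempfile.mkdtemp(), 'model')

# "Training" graph: initialize once and save.
with tf.Graph().as_default():
    b = build_bias()
    saver = tf.train.Saver()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        trained = sess.run(b)
        saver.save(sess, path)

# "Apply" graph: the order of initializer and restore matters.
with tf.Graph().as_default():
    b = build_bias()
    saver = tf.train.Saver()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())  # random values
        saver.restore(sess, path)                    # overwritten by the checkpoint
        restored = sess.run(b)
        sess.run(tf.global_variables_initializer())  # discards the restored values
        reinitialized = sess.run(b)

print(np.allclose(trained, restored))       # restored values match the saved ones
print(np.allclose(trained, reinitialized))  # reinitialized values are new random draws
```

If some variables are not in the checkpoint, initializing first and restoring afterwards is one way to avoid both the uninitialized error and the overwrite.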

Hope this helps!