How to prune and optimize an existing saved model and its variables in TensorFlow Python?

I have an existing TensorFlow SavedModel with the directory structure below. That is basically all I have. I could get hold of the checkpoint files if necessary, but it would be great if nothing from the training procedure were needed.

saved_model.pb
variables/
  variables.data-00000-of-00001
  variables.index

Now I would like to run tensorflow.python.tools.optimize_for_inference to optimize it for later inference in TF Serving.

However, even though I can successfully optimize its graph_def, I cannot prepare appropriate variables for the new graph. The problem lies in the Save and Restore ops, which exist in the old graph but are pruned from the optimized graph_def.

Because the variables file still refers to the save/restore ops, restoring gives errors like

Save/xxx does not exist in graph.

I tried to leverage the graph_util.convert_variables_to_constants function, but I got

Attempting to use uninitialized node xxx/yyy

How can I prune the graph structure and still make successful predictions? Thanks.

Hi @GeinNiThayMay, welcome to the TensorFlow forum,

Optimizing a TensorFlow SavedModel for inference can be challenging, especially when dealing with variable operations and pruning the graph correctly. Here’s a step-by-step approach to help you optimize the model and ensure that it works correctly for inference without the Save and Restore operations.

Load the SavedModel: First, load the existing SavedModel into a TensorFlow session. (The snippets below use the TF 1.x-style API; on TensorFlow 2.x the same calls are available under tf.compat.v1.)

import tensorflow as tf

saved_model_dir = 'path/to/your/saved_model'
with tf.Session(graph=tf.Graph()) as sess:
    # Load the graph and variables tagged for serving into this session.
    tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING], saved_model_dir)
    graph_def = sess.graph.as_graph_def()

Freeze the Graph: Use the convert_variables_to_constants function to convert the variables to constants. This will embed the variable values directly into the graph, removing the need for variable initialization and save/restore operations.

from tensorflow.python.framework import graph_util

with tf.Session(graph=tf.Graph()) as sess:
    tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING], saved_model_dir)
    output_node_names = ['your_output_node_name']  # Replace with your model's output node name(s)
    # Replace each Variable with a Const holding its current value and prune
    # every node (including Save/Restore ops) not needed to compute the outputs.
    frozen_graph_def = graph_util.convert_variables_to_constants(
        sess,
        sess.graph_def,
        output_node_names
    )

To find the output node names, you can inspect the graph:

for op in sess.graph.get_operations():
    print(op.name)
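
Alternatively, if the model was exported with a serving signature, you can read the input and output tensor names directly from the SignatureDef instead of scanning every operation. A minimal sketch, assuming the default 'serving_default' signature key (your model may use a different one):

import tensorflow as tf

saved_model_dir = 'path/to/your/saved_model'
with tf.Session(graph=tf.Graph()) as sess:
    # loader.load returns the MetaGraphDef; its SignatureDef maps logical
    # input/output names to the actual tensor names in the graph.
    meta_graph_def = tf.saved_model.loader.load(
        sess, [tf.saved_model.tag_constants.SERVING], saved_model_dir)
    sig = meta_graph_def.signature_def['serving_default']  # assumed signature key
    print({key: info.name for key, info in sig.inputs.items()})
    print({key: info.name for key, info in sig.outputs.items()})

Note that the printed names are tensor names such as 'input:0'; strip the ':0' suffix when passing them as node names to convert_variables_to_constants and optimize_for_inference.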

Optimize the Graph for Inference: Use the optimize_for_inference tool provided by TensorFlow to optimize the frozen graph.

from tensorflow.python.tools import optimize_for_inference_lib

optimized_graph_def = optimize_for_inference_lib.optimize_for_inference(
    frozen_graph_def,
    input_node_names=['your_input_node_name'],  # Replace with your model's input node name(s)
    output_node_names=['your_output_node_name'],  # Replace with your model's output node name(s)
    placeholder_type_enum=tf.float32.as_datatype_enum  # Adjust the type according to your model
)

Save the Optimized Graph: Save the optimized graph to a file.

with tf.gfile.GFile('optimized_model.pb', 'wb') as f:
    f.write(optimized_graph_def.SerializeToString())

Load and Use the Optimized Graph: When you want to use the optimized graph for inference, load it as follows:

with tf.Graph().as_default():
    with tf.gfile.GFile('optimized_model.pb', 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        tf.import_graph_def(graph_def, name='')

    with tf.Session() as sess:
        input_tensor = sess.graph.get_tensor_by_name('your_input_node_name:0')
        output_tensor = sess.graph.get_tensor_by_name('your_output_node_name:0')

        # Make predictions (input_data should match the input placeholder's shape and dtype)
        predictions = sess.run(output_tensor, feed_dict={input_tensor: input_data})

  • Replace 'your_input_node_name' and 'your_output_node_name' with the actual node names in your graph. These can be found by inspecting the operations in the graph as shown earlier.
  • If you encounter any issues related to uninitialized nodes, double-check that you are correctly specifying the output nodes and that all necessary nodes are included in the optimization process.

By following these steps, you should be able to prune the graph structure and successfully prepare it for inference, removing the dependency on Save and Restore operations.
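
Since the original goal is to serve the model with TensorFlow Serving, keep in mind that Serving expects a SavedModel directory rather than a bare .pb file. Below is a minimal sketch of wrapping the optimized graph back into a SavedModel; the export path and the 'input'/'output' signature keys are just placeholders:

import tensorflow as tf

export_dir = 'optimized_saved_model'  # assumed export path

with tf.Graph().as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile('optimized_model.pb', 'rb') as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

    with tf.Session() as sess:
        input_tensor = sess.graph.get_tensor_by_name('your_input_node_name:0')
        output_tensor = sess.graph.get_tensor_by_name('your_output_node_name:0')
        # simple_save writes a SavedModel with a 'serving_default' signature,
        # which TensorFlow Serving can load directly.
        tf.saved_model.simple_save(
            sess, export_dir,
            inputs={'input': input_tensor},
            outputs={'output': output_tensor})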

Thank You !

Hello,
To optimize your TensorFlow saved model for inference, load the model using tf.compat.v1.saved_model.loader.load, then convert the variables to constants using tf.compat.v1.graph_util.convert_variables_to_constants to eliminate the dependency on Save and Restore operations. Next, optimize the graph with optimize_for_inference_lib.optimize_for_inference, and finally save the optimized graph to a file. Be sure to specify your actual input and output node names throughout. This will prune the graph structure and allow for successful predictions.
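
If you are on TensorFlow 2.x, the whole flow can be run through the tf.compat.v1 APIs mentioned above. A condensed sketch, with placeholder node names that you should replace with your own:

import tensorflow as tf
from tensorflow.python.tools import optimize_for_inference_lib

saved_model_dir = 'path/to/your/saved_model'
with tf.compat.v1.Session(graph=tf.Graph()) as sess:
    tf.compat.v1.saved_model.loader.load(sess, ['serve'], saved_model_dir)
    # Freeze: bake current variable values into the graph as constants.
    frozen_graph_def = tf.compat.v1.graph_util.convert_variables_to_constants(
        sess, sess.graph.as_graph_def(), ['your_output_node_name'])

optimized_graph_def = optimize_for_inference_lib.optimize_for_inference(
    frozen_graph_def,
    ['your_input_node_name'],
    ['your_output_node_name'],
    tf.float32.as_datatype_enum)

with tf.io.gfile.GFile('optimized_model.pb', 'wb') as f:
    f.write(optimized_graph_def.SerializeToString())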
