The moving average maintains a shadow variable for each target variable. It does not interfere with the updates to the original variable; instead, the shadow variable is used in place of the original variable during testing or actual prediction (i.e., when not training).
1. Moving-average object initialization
`ema = tf.train.ExponentialMovingAverage(decay, num_updates)`
Parameter `decay`: each update moves the shadow variable towards the original variable:
`shadow_variable = decay * shadow_variable + (1 - decay) * variable`
Parameter `num_updates` (optional): when supplied, the effective decay used in the early training steps becomes:
`min(decay, (1 + num_updates) / (10 + num_updates))`
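As a quick illustration of how `num_updates` damps the decay early in training, here is a small plain-Python sketch of the formula above (the helper name and chosen step values are just for illustration, not part of the TensorFlow API):

```python
# Plain-Python sketch of the dynamic decay; effective_decay is an
# illustrative helper, not a TensorFlow function.
def effective_decay(decay, num_updates):
    return min(decay, (1 + num_updates) / (10 + num_updates))

for step in (0, 10, 100, 10000):
    print(step, effective_decay(0.99, step))
# 0 0.1
# 10 0.55
# 100 0.9181818181818182
# 10000 0.99
```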
2. Add/update variables
Add the target variables for which shadow variables should be maintained.

Note that maintenance is not automatic: the op returned by apply() must be run in every training step. It is therefore common to use tf.control_dependencies to bind it to train_op, so that every run of train_op also updates the shadow variables (see the sketch below).

`ema.apply([var0, var1])`
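A minimal sketch of that binding, assuming a toy one-variable loss and a plain gradient-descent optimizer (both made up for illustration; the official example in section 5 shows the same pattern):

```python
import tensorflow as tf

# Hypothetical one-variable "model", just to have something to optimize.
w = tf.Variable(1.0, name="w")
loss = tf.square(w - 3.0)

global_step = tf.Variable(0, trainable=False)
opt_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss, global_step=global_step)

ema = tf.train.ExponentialMovingAverage(0.99, num_updates=global_step)
# Bind the shadow-variable update to the optimizer step: every run of train_op
# first applies the gradients, then refreshes the shadow variables.
with tf.control_dependencies([opt_op]):
    train_op = ema.apply(tf.trainable_variables())
```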
3. Get the shadow variable value
This step is not required when building the graph; it simply fetches the value of a target variable from the set of shadow variables:
`sess.run([ema.average(var0), ema.average(var1)])`
4. Save & Load Shadow Variables
We know that in TensorFlow the moving averages of variables are maintained by shadow variables, so to get the moving average of a variable you need to read the shadow variable, not the variable itself.
Save shadow variables
Once the ExponentialMovingAverage object has been created and applied, a normal Saver.save() also stores the shadow variables; the naming convention is "v/ExponentialMovingAverage" for a variable named "v".
```python
import tensorflow as tf

if __name__ == "__main__":
    v = tf.Variable(0., name="v")
    # Set the decay coefficient of the moving-average model
    ema = tf.train.ExponentialMovingAverage(0.99)
    # Have variable v use the moving-average model; tf.all_variables() would cover all variables
    op = ema.apply([v])
    # Get the name of the variable v
    print(v.name)  # v:0
    # Create the object that saves the model
    save = tf.train.Saver()
    sess = tf.Session()
    # Initialize all variables
    init = tf.initialize_all_variables()
    sess.run(init)
    # Reassign the value of the variable v
    sess.run(tf.assign(v, 10))
    # Apply the moving-average update
    sess.run(op)
    # Save the model file
    save.save(sess, "./")
    # Output the value of the variable v before and after applying the moving average
    print(sess.run([v, ema.average(v)]))  # [10.0, 0.099999905]
```
Load shadow variables and map to variables
When loading the model, use Saver's variable-name mapping; this approach works for virtually any variable (see: Summary of TensorFlow model loading methods).
```python
v = tf.Variable(1., name="v")
# Define the Saver that loads the model, mapping the shadow-variable name to v
saver = tf.train.Saver({"v/ExponentialMovingAverage": v})
sess = tf.Session()
saver.restore(sess, "./")
print(sess.run(v))  # 0.0999999
```
One thing to note here is that the mapping passed to the Saver is {"v/ExponentialMovingAverage": v} rather than {"v": v}. With the latter you would get 10 instead of 0.0999999, because it restores the variable itself rather than its shadow variable.
The drawback of reading the model file this way is that you have to spell out a long list of variable names.
Use of the variables_to_restore function
```python
v = tf.Variable(1., name="v")
# The decay parameter of the moving-average model does not affect the restored value of v
ema = tf.train.ExponentialMovingAverage(0.99)
print(ema.variables_to_restore())
# {'v/ExponentialMovingAverage': <tf.Variable 'v:0' shape=() dtype=float32_ref>}
sess = tf.Session()
saver = tf.train.Saver(ema.variables_to_restore())
saver.restore(sess, "./")
print(sess.run(v))  # 0.0999999
```
variables_to_restore recognizes variables in the network and automatically generates shadow variable names.
By using the variables_to_restore function, the shadow variables are mapped directly onto the variables themselves when the model is loaded, so to obtain the moving average of a variable we only need to read the variable itself, not its shadow variable.
5. Official Documentation Example
In the official documentation example, every run of the update op returned by apply also trains the model once, because apply is bound after the optimizer op; in fact, this relationship can also be reversed. Examples can be found in "tf real google", p. 128.
Example usage when creating a training model:

```python
# Create variables.
var0 = tf.Variable(...)
var1 = tf.Variable(...)
# ... use the variables to build a training model...
...
# Create an op that applies the optimizer. This is what we usually
# would use as a training op.
opt_op = opt.minimize(my_loss, [var0, var1])

# Create an ExponentialMovingAverage object
ema = tf.train.ExponentialMovingAverage(decay=0.9999)

with tf.control_dependencies([opt_op]):
    # Create the shadow variables, and add ops to maintain moving averages
    # of var0 and var1. This also creates an op that will update the moving
    # averages after each training step. This is what we will use in place
    # of the usual training op.
    training_op = ema.apply([var0, var1])

...train the model by running training_op...
```
6. batch_norm example
Unlike the examples above, inside batch_norm it is not convenient to bind the moving-average update to train_op (which lives outside the function body). Instead, the update op is bound via tf.control_dependencies to the node that outputs the two statistics, so the shadow variables are refreshed whenever the batch statistics are computed during training.
```python
def batch_norm(x, beta, gamma, phase_train, scope='bn', decay=0.9, eps=1e-5):
    with tf.variable_scope(scope):
        # beta = tf.get_variable(name='beta', shape=[n_out],
        #                        initializer=tf.constant_initializer(0.0), trainable=True)
        # gamma = tf.get_variable(name='gamma', shape=[n_out],
        #                         initializer=tf.random_normal_initializer(1.0, stddev), trainable=True)
        batch_mean, batch_var = tf.nn.moments(x, [0, 1, 2], name='moments')
        ema = tf.train.ExponentialMovingAverage(decay=decay)

        def mean_var_with_update():
            ema_apply_op = ema.apply([batch_mean, batch_var])
            with tf.control_dependencies([ema_apply_op]):
                # tf.identity converts the value to a Tensor and incorporates it into the graph;
                # otherwise, since a Variable is session-independent, it would not be constrained
                # by the graph's control_dependencies
                return tf.identity(batch_mean), tf.identity(batch_var)

        mean, var = tf.cond(phase_train,
                            mean_var_with_update,
                            lambda: (ema.average(batch_mean), ema.average(batch_var)))
        normed = tf.nn.batch_normalization(x, mean, var, beta, gamma, eps)
        return normed
```
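A usage sketch of the function above; the input shape, channel count, and placeholder names are assumptions made for illustration:

```python
# Assumed NHWC input with 64 channels; phase_train switches between batch
# statistics (training) and their moving averages (inference).
x = tf.placeholder(tf.float32, [None, 32, 32, 64])
phase_train = tf.placeholder(tf.bool, name='phase_train')

beta = tf.Variable(tf.zeros([64]), name='beta')
gamma = tf.Variable(tf.ones([64]), name='gamma')

y = batch_norm(x, beta, gamma, phase_train)
# Feed {phase_train: True} during training so the moving averages are updated,
# and {phase_train: False} at test time so the stored averages are used.
```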
This is the whole content of this article.