make_template() in TensorFlow
I am trying to use make_template() to avoid passing the reuse flag throughout my model. But it seems that make_template() doesn't work correctly when it is used inside a Python class. I pasted my model code and the error I am getting below. It is a simple MLP to train on the MNIST dataset.
Since the code is kinda long, the main part here is the _weights() function. I try to wrap it using make_template() and then use get_variable() inside it to create and reuse weights throughout my model. _weights() is used by _create_dense_layer(), and that in turn is used by _create_model() to create the graph. The train() function accepts tensors that I get from a data reader.
Model
class MLP(object):
    def __init__(self, hidden=[], biases=False, activation=tf.nn.relu):
        self.graph = tf.get_default_graph()
        self.hidden = hidden
        self.activation = activation
        self.biases = biases
        self.n_features = 784
        self.n_classes = 10
        self.bsize = 100
        self.l2 = 0.1

    def _real_weights(self, shape):
        initializer=tf.truncated_normal_initializer(stddev=0.1)
        weights = tf.get_variable('weights', shape, initializer=initializer)
        return weights
    # use make_template to make variable reuse transparent
    _weights = tf.make_template('_weights', _real_weights)

    def _real_biases(self, shape):
        initializer=tf.constant_initializer(0.0)
        return tf.get_variable('biases', shape, initializer=initializer)
    # use make_template to make variable reuse transparent
    _biases = tf.make_template('_biases', _real_biases)

    def _create_dense_layer(self, name, inputs, n_in, n_out, activation=True):
        with tf.variable_scope(name):
            weights = self._weights([n_in, n_out])
            layer = tf.matmul(inputs, weights)
            if self.biases:
                biases = self._biases([n_out])
                layer = layer + biases
            if activation:
                layer = self.activation(layer)
            return layer

    def _create_model(self, inputs):
        n_in = self.n_features
        for i in range(len(self.hidden)):
            n_out = self.hidden[i]
            name = 'hidden%d' % (i)
            inputs = self._create_dense_layer(name, inputs, n_in, n_out)
            n_in = n_out
        output = self._create_dense_layer('output', inputs, n_in, self.n_classes, activation=False)
        return output

    def _create_loss_op(self, logits, labels):
        cent = tf.nn.softmax_cross_entropy_with_logits(logits, labels)
        weights = self.graph.get_collection('weights')
        l2 = (self.l2 / self.bsize) * tf.reduce_sum([tf.reduce_sum(tf.square(w)) for w in weights])
        return tf.reduce_mean(cent, name='loss') + l2

    def _create_train_op(self, loss):
        optimizer = tf.train.AdamOptimizer()
        return optimizer.minimize(loss)

    def _create_accuracy_op(self, logits, labels):
        predictions = tf.nn.softmax(logits)
        errors = tf.equal(tf.argmax(predictions, 1), tf.argmax(labels, 1))
        return tf.reduce_mean(tf.cast(errors, tf.float32))

    def train(self, images, labels):
        logits = model._create_model(images)
        loss = model._create_loss_op(logits, labels)
        return model._create_train_op(loss)

    def accuracy(self, images, labels):
        logits = model._create_model(images)
        return model._create_accuracy_op(logits, labels)

    def predict(self, images):
        return model._create_model(images)
The error:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
 in ()
     25 model = MLP(hidden=[128])
     26 # define ops
---> 27 train = model.train(images, labels)
     28 accuracy = model.accuracy(eval_images, eval_labels)
     29 # load test data and create a prediction op

 in train(self, images, labels)
     60
     61     def train(self, images, labels):
---> 62         logits = model._create_model(images)
     63         loss = model._create_loss_op(logits, labels)
     64         return model._create_train_op(loss)

 in _create_model(self, inputs)
     39             n_out = self.hidden[i]
     40             name = 'hidden%d' % (i)
---> 41             inputs = self._create_dense_layer(name, inputs, n_in, n_out)
     42             n_in = n_out
     43         output = self._create_dense_layer('output', inputs, n_in, self.n_classes, activation=False)

 in _create_dense_layer(self, name, inputs, n_in, n_out, activation)
     25     def _create_dense_layer(self, name, inputs, n_in, n_out, activation=True):
     26         with tf.variable_scope(name):
---> 27             weights = self._weights([n_in, n_out])
     28             layer = tf.matmul(inputs, weights)
     29             if self.biases:

/usr/local/lib/python3.5/site-packages/tensorflow/python/ops/template.py in __call__(self, *args, **kwargs)
    265           self._unique_name, self._name) as vs:
    266         self._var_scope = vs
--> 267         return self._call_func(args, kwargs, check_for_new_variables=False)
    268
    269   @property

/usr/local/lib/python3.5/site-packages/tensorflow/python/ops/template.py in _call_func(self, args, kwargs, check_for_new_variables)
    206           ops.get_collection(ops.GraphKeys.TRAINABLE_VARIABLES))
    207
--> 208     result = self._func(*args, **kwargs)
    209     if check_for_new_variables:
    210       trainable_variables = ops.get_collection(

TypeError: _real_weights() missing 1 required positional argument: 'shape'

originally defined at:
  File "", line 1, in
    class MLP(object):
  File "", line 17, in MLP
    _weights = tf.make_template('_weights', _real_weights)
There are multiple problems with this code as posted, e.g. the train, accuracy and predict methods refer to a global model variable where they presumably mean self. I assume this is due to cutting the code from its natural habitat.
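To illustrate the model reference issue mentioned above, here is a minimal sketch without TensorFlow. TinyModel and its methods are hypothetical stand-ins for the posted MLP class. The posted code only runs because a global variable named model happens to exist at call time; with any other variable name the lookup fails, whereas going through self always works.

```python
class TinyModel:
    """Hypothetical stand-in for the MLP class; no TensorFlow needed."""

    def _create_model(self, images):
        return [2 * x for x in images]

    def predict_via_global(self, images):
        # Mirrors the posted code: `model` is looked up as a global name,
        # not as the instance the method was called on.
        return model._create_model(images)

    def predict_via_self(self, images):
        # Uses the instance itself, independent of any global variable.
        return self._create_model(images)


net = TinyModel()  # deliberately not named `model`

try:
    net.predict_via_global([1, 2, 3])
    global_lookup_failed = False
except NameError:
    global_lookup_failed = True

result = net.predict_via_self([1, 2, 3])
print(global_lookup_failed, result)
```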
The TypeError you mention,
TypeError: _real_weights() missing 1 required positional argument: 'shape'
most likely comes from the fact that _real_weights is defined as an instance method of the MLP class, not as a regular function or static method. As such, the first parameter of the function is always the self reference pointing to the instance of the class at the time of the call (an explicit version of the this pointer in C-like languages), as can be seen in the function declaration:
def _real_weights(self, shape):
    initializer=tf.truncated_normal_initializer(stddev=0.1)
    weights = tf.get_variable('weights', shape, initializer=initializer)
    return weights
Note that even though you don't use the self argument, it is still required in this case. Thus when creating a template of the function using
_weights = tf.make_template('_weights', _real_weights)
you basically state that the _weights template should take two positional arguments, self and shape (as does the _real_weights method). The template is a callable object stored on the class rather than a function, so Python does not bind the instance to it the way it does for a regular method. Consequently, when you call the template as
weights = self._weights([n_in, n_out])
the list ends up in the self parameter, leaving the (required) shape argument unspecified.
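The argument mix-up can be reproduced without TensorFlow. The following is a minimal sketch in which FakeTemplate is a hypothetical stand-in for the object returned by tf.make_template. It assumes, as the traceback's self._func(*args, **kwargs) line suggests, that the template simply forwards its arguments to the wrapped function.

```python
class FakeTemplate:
    """Hypothetical stand-in for the object returned by tf.make_template.

    It is a plain callable object, not a function, so Python does not
    bind the instance when it is looked up through `self`.
    """

    def __init__(self, name, func):
        self._name = name
        self._func = func

    def __call__(self, *args, **kwargs):
        # Forward the arguments unchanged to the wrapped function.
        return self._func(*args, **kwargs)


class MLP:
    def _real_weights(self, shape):
        return ("weights", tuple(shape))

    # Inside the class body _real_weights is still a plain function
    # with the two parameters (self, shape).
    _weights = FakeTemplate("_weights", _real_weights)


mlp = MLP()
try:
    # The list is bound to `self`; nothing is left for `shape`.
    mlp._weights([784, 128])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

Calling mlp._real_weights([784, 128]) directly works, because a function found on the class is bound to the instance on attribute lookup; the wrapper object is not.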
From what it looks like, you have two options here. You could make _real_weights a regular function outside of the MLP class:
def _real_weights(shape):
    initializer=tf.truncated_normal_initializer(stddev=0.1)
    weights = tf.get_variable('weights', shape, initializer=initializer)
    return weights

class MLP():
    # etc.
This is probably not what you want, given that you already created a class for the model. Alternatively, you could explicitly make it a static method of the MLP class:
class MLP():
    @staticmethod
    def _real_weights(shape):
        initializer=tf.truncated_normal_initializer(stddev=0.1)
        weights = tf.get_variable('weights', shape, initializer=initializer)
        return weights
Since static methods by definition do not operate on a class instance, you can (and have to) omit the self reference.
You would then create the template as
tf.make_template('_weights', _real_weights)
in the first case and
tf.make_template('_weights', MLP._real_weights)
in the second case, explicitly qualifying the static method with the class name MLP. Note that the name MLP only exists once the class statement has finished, so the second call has to happen after the class definition (or inside a method such as __init__) rather than in the class body. Either way, the _real_weights function/method and the _weights template now both take only one argument, the shape of the variable to create.
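Both options can be sketched without TensorFlow. FakeTemplate is again a hypothetical stand-in for what tf.make_template returns, assumed to forward its arguments unchanged. The class names MLPWithFunction and MLPWithStatic are made up for illustration.

```python
class FakeTemplate:
    """Hypothetical stand-in for the object returned by tf.make_template."""

    def __init__(self, name, func):
        self._name = name
        self._func = func

    def __call__(self, *args, **kwargs):
        # Forward the arguments unchanged to the wrapped function.
        return self._func(*args, **kwargs)


# Option 1: a regular module-level function that takes only `shape`.
def real_weights(shape):
    return ("weights", tuple(shape))


class MLPWithFunction:
    _weights = FakeTemplate("_weights", real_weights)


# Option 2: a static method, wrapped once the class exists.
class MLPWithStatic:
    @staticmethod
    def _real_weights(shape):
        return ("weights", tuple(shape))


# The class name is only available after the class statement has run,
# so the template is attached here rather than in the class body.
MLPWithStatic._weights = FakeTemplate("_weights", MLPWithStatic._real_weights)

from_function = MLPWithFunction()._weights([784, 128])
from_static = MLPWithStatic()._weights([128, 10])
print(from_function, from_static)
```

In both variants the wrapped callable expects exactly one argument, so the list passed to self._weights(...) lands in shape as intended.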