Tensorflow: The priority of value assigning operations

I am trying to understand more deeply how the TensorFlow computation graph operates. Assume we have the following code:

import tensorflow as tf

A = tf.truncated_normal(shape=(1,), stddev=0.1)
B = tf.Variable([0.3], dtype=tf.float32)
C = A * B
grads = tf.gradients(C, [A, B])

init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

for i in range(1000):
    results = sess.run([C, grads], {A: [2], B: [5]})

I get 10 as the result and gradients 5 for A and 2 for B, as expected. What I want to be sure of is that, when we feed values to tensors as we did for A and B, their default value-generation mechanisms defined in the computation graph are overridden. Is that right?

For example, here, no normally distributed random value is generated for A; it is overridden by 2, and 0.3 is replaced with 5 for B, whenever we run the sess.run line in the for loop. How exactly does the computation graph behave in such cases?

For the general case, is my following understanding correct? Every time we call sess.run, the nodes required to compute the values in the fetch list are determined by topological ordering, and all tensors provided in the feed_dict parameter are overwritten with those values, breaking their dependence on the rest of the computation graph. (For example, if tensor A waits for B's value to be evaluated and we inject a value for A in the feed_dict, A's dependence on B is broken, and I believe this is somehow reflected in the computation graph as well.) Then, according to the final form of the computation graph, the forward and backward calculations are executed.
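
As a quick sanity check of this override behaviour, one can run C once without a feed and once with one. This is only a minimal sketch, assuming the same TF 1.x graph as in my snippet above:

import tensorflow as tf

A = tf.truncated_normal(shape=(1,), stddev=0.1)
B = tf.Variable([0.3], dtype=tf.float32)
C = A * B

sess = tf.Session()
sess.run(tf.global_variables_initializer())

# Without a feed: A is freshly sampled and B is read from the variable,
# so C is a small random number around 0.
print(sess.run(C))

# With a feed: the sampled A and the stored B are never used; the fed
# values take priority, so C is exactly [10.].
print(sess.run(C, {A: [2.], B: [5.]}))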


I believe there are just two small corrections needed:

  • Instead of doing two passes (first determining the smallest graph to execute and then "breaking" it), this can be done in a single pass: one looks for the smallest graph needed to execute the sess.run ops given what is in the feed_dict. In other words, every time you discover a new node (while going backwards through the dependencies of your op), you check whether it is provided in the feed_dict, and if it is, you treat it as a given, i.e. a leaf node (see the first sketch after this list).

  • There is no such thing as a "backward calculation" in TF; everything is a forward calculation. tf.gradients (or minimize) calls simply construct a forward graph which is functionally equivalent to what would happen in many other libraries during the backward pass. There is no strict forward/backward separation in TF, though, and you are free to hack, mix and do whatever you want with the graph; in the end these are just nodes depending on each other, with one-direction data flow (see the second sketch below).
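
To make the first point concrete, here is a minimal sketch (the names v, doubled and result are made up for illustration). Feeding an intermediate tensor turns it into a leaf, so the backwards walk through its dependencies stops there and the deliberately uninitialized variable behind it is never touched:

import tensorflow as tf

v = tf.Variable([1.0])        # deliberately never initialized
doubled = v * 2.0
result = doubled + 1.0

sess = tf.Session()

# 'doubled' is in the feed_dict, so it is treated as a leaf: the graph walk
# never reaches the uninitialized 'v' and this call succeeds with [11.].
print(sess.run(result, {doubled: [10.0]}))

# Without the feed, the walk backwards does reach 'v' and fails with
# a FailedPreconditionError (attempting to use an uninitialized value).
try:
    sess.run(result)
except tf.errors.FailedPreconditionError:
    print("reached the uninitialized variable")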

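And for the second point, a small sketch showing that tf.gradients only appends ordinary forward ops to the graph; the gradient tensors it returns behave like any other tensor (they can be fetched, fed, or differentiated again):

import tensorflow as tf

x = tf.placeholder(tf.float32, shape=())
y = x * x * x                      # y = x^3

n_before = len(tf.get_default_graph().get_operations())
dy_dx, = tf.gradients(y, [x])      # dy/dx = 3x^2, built as ordinary forward ops
n_after = len(tf.get_default_graph().get_operations())
print("ops added by tf.gradients:", n_after - n_before)   # > 0

# The gradient is just another tensor in the forward graph, so it can be
# differentiated again, exactly like a hand-written computation.
d2y_dx2, = tf.gradients(dy_dx, [x])   # d2y/dx2 = 6x

sess = tf.Session()
print(sess.run([y, dy_dx, d2y_dx2], {x: 2.0}))   # [8.0, 12.0, 12.0]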