How to monitor tensor values in Theano/Keras?

2018-06-11 20:28:05

I know this question has been asked in various forms, but I can't find an answer I can understand and use. Forgive me if this is a basic question; I'm new to these tools (Theano/Keras).

Problem to Solve

Monitor variables in neural networks (e.g. input/forget/output gate values in an LSTM)

What I'm currently getting

No matter at which stage I try to read those values, I get something like:

Elemwise{mul,no_inplace}.0
Elemwise{mul,no_inplace}.0
[for{cpu,scan_fn}.2, Subtensor{int64::}.0, Subtensor{int64::}.0]
[for{cpu,scan_fn}.2, Subtensor{int64::}.0, Subtensor{int64::}.0]
Subtensor{int64}.0
Subtensor{int64}.0

Is there any way I can monitor them (e.g. print to stdout, write to a file, etc.)?

Possible Solution

It seems like callbacks in Keras can do the job, but they don't work for me either; I get the same output as above.

My Guess

It seems like I'm making a very simple mistake.

Thank you very much in advance, everyone.

ADDED

Specifically, I'm trying to monitor the input/forget/output gate values in an LSTM. I found that LSTM.step() computes those values:

def step(self, x, states):
    h_tm1 = states[0]   # hidden state of the previous time step
    c_tm1 = states[1]   # cell state from the previous time step
    B_U = states[2]     # dropout matrices for recurrent units?
    B_W = states[3]     # dropout matrices for input units?

    if self.consume_less == 'cpu':                              # just cut x into 4 pieces in columns
        x_i = x[:, :self.output_dim]
        x_f = x[:, self.output_dim: 2 * self.output_dim]
        x_c = x[:, 2 * self.output_dim: 3 * self.output_dim]
        x_o = x[:, 3 * self.output_dim:]
    else:
        x_i = K.dot(x * B_W[0], self.W_i) + self.b_i
        x_f = K.dot(x * B_W[1], self.W_f) + self.b_f
        x_c = K.dot(x * B_W[2], self.W_c) + self.b_c
        x_o = K.dot(x * B_W[3], self.W_o) + self.b_o

    i = self.inner_activation(x_i + K.dot(h_tm1 * B_U[0], self.U_i))
    f = self.inner_activation(x_f + K.dot(h_tm1 * B_U[1], self.U_f))
    c = f * c_tm1 + i * self.activation(x_c + K.dot(h_tm1 * B_U[2], self.U_c))
    o = self.inner_activation(x_o + K.dot(h_tm1 * B_U[3], self.U_o))

    with open("test_visualization.txt", "a") as myfile:
        myfile.write(str(i) + "\n")

    h = o * self.activation(c)
    return h, [h, c]

As the code above shows, I tried to write the value of i to a file, but it only gave me output like:

Elemwise{mul,no_inplace}.0
[for{cpu,scan_fn}.2, Subtensor{int64::}.0, Subtensor{int64::}.0]
Subtensor{int64}.0

So I tried i.eval() and i.get_value(), but both failed to give me values.

.eval() gave me this:

theano.gof.fg.MissingInputError: An input of the graph, used to compute Subtensor{::, :int64:}(<TensorType(float32, matrix)>, Constant{10}), was not provided and not given a value.Use the Theano flag exception_verbosity='high',for more information on this error.

and .get_value() gave me this:

AttributeError: 'TensorVariable' object has no attribute 'get_value'

So I traced the call chain (which line calls which function) and tried to get values at every step I found, but in vain.

It feels like I've fallen into some basic pitfall.

I use the solution described in the Keras FAQ:

http://keras.io/getting-started/faq/#how-can-i-visualize-the-output-of-an-intermediate-layer

In detail:

from keras import backend as K

intermediate_tensor_function = K.function([model.layers[0].input],[model.layers[layer_of_interest].output])
intermediate_tensor = intermediate_tensor_function([thisInput])[0]

yields:

array([[ 3.,  17.]], dtype=float32)

However, I'd like to use the functional API, but I can't seem to get the actual tensor, only the symbolic representation. For example:

model.layers[1].output

yields:

<tf.Tensor 'add:0' shape=(?, 2) dtype=float32>

I'm missing something about the interaction between Keras and TensorFlow here, but I'm not sure what. Any insight is much appreciated.

One solution is to create a version of your network that is truncated at the LSTM layer whose gate values you want to monitor, and then replace the original layer with a custom layer in which the step function is modified to return not only the hidden layer values, but also the gate values.

For instance, say you want to access the gate values of a GRU. Create a custom layer GRU2 that inherits everything from the GRU class, but adapt the step function so that it returns a concatenation of the states you want to monitor, and then uses only the part containing the previous hidden layer activations when computing the next activations. I.e.:

def step(self, x, states):

    # get prev hidden layer from input that is concatenation of
    # prev hidden layer + reset gate + update gate
    x = x[:self.output_dim, :]


    ###############################################
    # This is the original code from the GRU layer
    #

    h_tm1 = states[0]  # previous memory
    B_U = states[1]  # dropout matrices for recurrent units
    B_W = states[2]

    if self.consume_less == 'gpu':

        matrix_x = K.dot(x * B_W[0], self.W) + self.b
        matrix_inner = K.dot(h_tm1 * B_U[0], self.U[:, :2 * self.output_dim])

        x_z = matrix_x[:, :self.output_dim]
        x_r = matrix_x[:, self.output_dim: 2 * self.output_dim]
        inner_z = matrix_inner[:, :self.output_dim]
        inner_r = matrix_inner[:, self.output_dim: 2 * self.output_dim]

        z = self.inner_activation(x_z + inner_z)
        r = self.inner_activation(x_r + inner_r)

        x_h = matrix_x[:, 2 * self.output_dim:]
        inner_h = K.dot(r * h_tm1 * B_U[0], self.U[:, 2 * self.output_dim:])
        hh = self.activation(x_h + inner_h)
    else:
        if self.consume_less == 'cpu':
            x_z = x[:, :self.output_dim]
            x_r = x[:, self.output_dim: 2 * self.output_dim]
            x_h = x[:, 2 * self.output_dim:]
        elif self.consume_less == 'mem':
            x_z = K.dot(x * B_W[0], self.W_z) + self.b_z
            x_r = K.dot(x * B_W[1], self.W_r) + self.b_r
            x_h = K.dot(x * B_W[2], self.W_h) + self.b_h
        else:
            raise Exception('Unknown `consume_less` mode.')
        z = self.inner_activation(x_z + K.dot(h_tm1 * B_U[0], self.U_z))
        r = self.inner_activation(x_r + K.dot(h_tm1 * B_U[1], self.U_r))

        hh = self.activation(x_h + K.dot(r * h_tm1 * B_U[2], self.U_h))
    h = z * h_tm1 + (1 - z) * hh

    #
    # End of original code
    ###########################################################


    # concatenate states you want to monitor, in this case the
    # hidden layer activations and gates z and r
    all_states = K.concatenate([h, z, r])  # avoids shadowing the builtin all()

    # return everything
    return all_states, [h]

(Note that the only lines I added are at the beginning and end of the function).

If you then run your network with GRU2 as the last layer instead of GRU (with return_sequences=True for the GRU2 layer), you can just call predict on your network; this will give you all hidden layer and gate values.

The same thing should work for LSTM, although you might have to puzzle a bit to figure out how to store all the outputs you want in one vector and retrieve them again afterwards.
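For the "retrieve them again afterwards" part, here is a minimal NumPy sketch. It assumes the array returned by predict has shape (batch, timesteps, 3 * output_dim) with h, z and r concatenated along the last axis, in that order. The sizes and the random stand-in array are made up for illustration and do not come from a real model.

```python
import numpy as np

# Assumed sizes, for illustration only.
batch, timesteps, output_dim = 2, 5, 3

# Stand-in for the array model.predict(...) would return for a network
# ending in GRU2 with return_sequences=True: the last axis is assumed
# to hold [h, z, r] concatenated.
predictions = np.random.rand(batch, timesteps, 3 * output_dim)

# Split the last axis back into three equal blocks, in the same order
# in which they were passed to K.concatenate inside step().
h, z, r = np.split(predictions, 3, axis=-1)

print(h.shape, z.shape, r.shape)
```

If the concatenated pieces do not all have the same width, np.split also accepts a list of split indices in place of the count.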

Hope that helps!

You can use Theano's printing module to print during execution (rather than during graph definition, which is what you're doing; that is why you get the symbolic definitions instead of values).

Print

Just use the Print function. Don't forget to use the output of Print to continue your graph; otherwise the Print node will be disconnected from the output and will most likely be removed during optimisation, and you will see nothing.

from keras import backend as K
from theano.printing import Print

def someLossFunction(x, ref):
  loss = K.square(x - ref)
  loss = Print('Loss tensor (before sum)')(loss)
  loss = K.sum(loss)
  loss = Print('Loss scalar (after sum)')(loss)
  return loss
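To make the definition-versus-execution point concrete, here is a toy sketch in plain Python. It is not Theano's implementation or API: the Node class and toy_print helper are hypothetical and exist only to show why str() on a symbolic node yields a label, and why a print node that is not on the path to the output never fires.

```python
class Node:
    """Toy symbolic node: records an operation, computes nothing yet."""

    def __init__(self, name, fn, parents=()):
        self.name, self.fn, self.parents = name, fn, parents

    def __str__(self):
        # Like str() on a symbolic tensor: a label, not a value.
        return self.name

    def eval(self, feed):
        # Values exist only here, once the inputs are supplied.
        if not self.parents:
            return feed[self.name]
        return self.fn(*(p.eval(feed) for p in self.parents))


log = []


def toy_print(message, node):
    """Toy counterpart of Print: pass the value through and log it."""
    def passthrough(value):
        log.append((message, value))
        return value
    return Node('Print{%s}' % message, passthrough, (node,))


x = Node('x', None)
squared = Node('square.0', lambda v: v * v, (x,))

unused = toy_print('disconnected', squared)  # result is never used below
squared = toy_print('connected', squared)    # result continues the graph
out = Node('add.0', lambda v: v + 1, (squared,))

print(str(out))               # only a label
print(out.eval({'x': 3.0}))   # an actual number
print(log)                    # only the connected print was reached
```

In this toy the disconnected node is skipped simply because nothing evaluates it; the effect is the same as the one described above, where an unused Print contributes nothing to the compiled function.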

Plot

A little bonus you might enjoy.

The Print class has a global_fn parameter that overrides the default printing callback. You can provide your own function and access the data directly, to build a plot for instance.

from keras import backend as K
from theano.printing import Print
import matplotlib.pyplot as plt

curve = []

# the callback function
def myPlottingFn(printObj, data):
    global curve
    # Store scalar data
    curve.append(data)

    # Plot it
    fig, ax = plt.subplots()
    ax.plot(curve, label=printObj.message)
    ax.legend(loc='best')
    plt.show()

def someLossFunction(x, ref):
  loss = K.sum(K.square(x - ref))
  # The callback is passed on the line below
  loss = Print('Loss scalar (after sum)', global_fn=myPlottingFn)(loss)
  return loss

BTW, the string you pass to Print('...') is stored on the print object under the attribute name message (see the function myPlottingFn). This is useful for building multi-curve plots automatically.
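As a rough sketch of the multi-curve idea, the callback can keep one list per message instead of a single global list. The (print object, data) signature is the one used by myPlottingFn above; the SimpleNamespace objects are hypothetical stand-ins for Print ops, used here only to exercise the callback without building a graph, and the values are made up. It also assumes the data passed in is a scalar.

```python
from collections import defaultdict
from types import SimpleNamespace

# One list of values per Print message.
curves = defaultdict(list)


def collect(print_obj, data):
    """Callback with the (print object, data) signature used by global_fn."""
    curves[print_obj.message].append(float(data))


# Hypothetical stand-ins for two Print ops; in a real graph Theano
# would call the callback itself each time the op is executed.
loss_op = SimpleNamespace(message='loss')
acc_op = SimpleNamespace(message='accuracy')
for step in range(3):
    collect(loss_op, 1.0 / (step + 1))
    collect(acc_op, 0.5 + 0.1 * step)

print(dict(curves))
```

Each entry in curves can then be passed to ax.plot with its key as the label, which gives one curve per Print message.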
