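Initial state of a Char-RNN
I am trying to build a Char-RNN to generate text, following Andrej Karpathy's blog post. However, I have some doubts about my TensorFlow implementation. (I have copied the important parts of my code below, with explanations.)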
Model: It is basically an RNN that can have multiple layers. The output of each layer is used as the input to the next layer, which has its own cell.
...
self.input_x = tf.placeholder(tf.int32, [batch_size, sequence_length], name="input_x")
self.input_y = tf.placeholder(tf.int32, [batch_size, sequence_length], name="input_y")
self.sequence_length = tf.placeholder(tf.int32, [batch_size], name="sequence_length")
output = self.input_x
for layer in range(num_layers):
    with tf.name_scope("recurrent-%s" % (layer+1)):
        cell = tf.contrib.rnn.BasicLSTMCell(num_hidden, state_is_tuple=True)
        self.initial_state = cell.zero_state(batch_size, tf.float32)
        output, self.state = tf.nn.dynamic_rnn(cell, output,
                                               initial_state=self.initial_state,
                                               sequence_length=self.sequence_length,
                                               dtype=tf.float32, scope="rnn-%s" % (layer+1))
...
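One thing worth noting about the loop above is that it assigns self.initial_state and self.state once per layer. A minimal plain-Python sketch (no TensorFlow; the class name StackedModel, the list initial_states, and the string stand-ins for the state tensors are all hypothetical, for illustration only) shows what the attribute refers to once such a loop has finished:

```python
class StackedModel:
    """Toy stand-in for the model; strings replace the real state tensors."""

    def __init__(self, num_layers):
        # Hypothetical extra list that keeps one entry per layer.
        self.initial_states = []
        for layer in range(num_layers):
            # Same pattern as in the model code: the attribute is rebound
            # on every pass through the loop.
            self.initial_state = "zero-state-of-layer-%d" % (layer + 1)
            self.initial_states.append(self.initial_state)


model = StackedModel(num_layers=3)

# After the loop, the single attribute only refers to the last layer's value,
# while the list still holds one value per layer.
print(model.initial_state)
print(model.initial_states)
```

This only illustrates Python's name rebinding; it says nothing about how TensorFlow itself handles the states inside each dynamic_rnn call.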
Training: Here I use zero_state to provide the initial state for training.
...
feed_dict = {model.input_x: x, model.input_y: y,
             model.sequence_length: seq,
             model.initial_state: sess.run(model.initial_state)}
step, loss = sess.run([global_step, model.loss], feed_dict)
...
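Testing: I feed the state from the previous time step back in, one character at a time, to obtain the final state of the sequence, and then generate text starting from a predefined seed text.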
...
start_text = "the meaning of life is"
state = sess.run(model.initial_state)
text = start_text
for word in start_text:
    x = np.array(list(vocab_processor.transform([word])))[0][0]
    feed_dict = {model.input_x: x,
                 model.sequence_length: 1,
                 model.initial_state: state}
    state = sess.run(model.state, feed_dict)
for i in range(max_document_length):
    feed_dict = {model.input_x: x,
                 model.sequence_length: 1,
                 model.initial_state: state}
    state, x = sess.run([model.state, model.predictions], feed_dict)
...
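The idea behind the testing loop (carry the state from one step into the next) can be sketched without TensorFlow. The following is a toy scalar RNN in plain Python, not my actual model: rnn_step, run_sequence, predict, the weights w_x and w_h, and the numeric seed are all made up for illustration. It checks that feeding the seed one element at a time, while passing the returned state back in, ends in the same state as processing the whole seed in one call, and then continues generating from that state.

```python
import math


def rnn_step(x, h, w_x=0.5, w_h=0.8):
    """One step of a toy scalar RNN: new_state = tanh(w_x * x + w_h * h)."""
    return math.tanh(w_x * x + w_h * h)


def run_sequence(xs, h0=0.0):
    """Run the toy RNN over a whole sequence and return the final state."""
    h = h0
    for x in xs:
        h = rnn_step(x, h)
    return h


def predict(h):
    """Hypothetical 'prediction': threshold the state to get the next input."""
    return 1.0 if h > 0.5 else 0.0


seed = [1.0, 0.0, 1.0, 1.0]  # stand-in for the encoded seed text

# Whole seed in one call, starting from the zero state.
final_full = run_sequence(seed)

# Same seed, one element per call, feeding the previous state back in.
state = 0.0
for x in seed:
    state = run_sequence([x], h0=state)

print(abs(final_full - state) < 1e-12)

# Generation: predict the next input from the state, feed it back, repeat.
generated = []
x = predict(state)
for _ in range(5):
    generated.append(x)
    state = rnn_step(x, state)
    x = predict(state)
print(generated)
```

In this sketch the zero state is used only once, at the very start; every later call starts from the state returned by the previous call.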
My main questions are about the usage of self.state and self.initial_state: