Pretrained embedding type error

I am building a computational graph in TensorFlow and want to use pretrained word vectors. I have a function that preloads the vectors for all the words in my dataset into a matrix.

    import numpy as np

    def preload_vectors(word2vec_path, word2id, vocab_size, emb_dim):
        # Random uniform init, so words missing from the file still get a vector.
        scale = np.sqrt(3.0 / emb_dim)
        init_W = np.random.uniform(-scale, scale, [vocab_size, emb_dim])
        if word2vec_path:
            print('Load word2vec_norm file {}'.format(word2vec_path))
            with open(word2vec_path, 'r') as f:
                print(vocab_size, emb_dim)
                for line in f:
                    parts = line.split()
                    if not parts:
                        continue
                    # Each line holds a word followed by its vector components.
                    word = parts[0]
                    if word in word2id:
                        init_W[word2id[word]] = np.array(parts[1:], dtype=np.float32)
        return init_W

    init_W = preload_vectors("data/GoogleNews-vectors-negative300.txt", word2id, word_vocab_size, FLAGS.word_embedding_dim)


    Load word2vec_norm file data/GoogleNews-vectors-negative300.txt
    2556 300

In the computational graph, I have this:

    W = tf.Variable(tf.constant(0.0, shape = [word_vocab_size,FLAGS.word_embedding_dim]),trainable = False, name='word_embeddings')
    embedding_placeholder = tf.placeholder(tf.float32, shape = [word_vocab_size, FLAGS.word_embedding_dim])
    embedding_init = W.assign(embedding_placeholder)

And finally, in the session I feed the init_W to embedding_placeholder:

    # `sess` is the active tf.Session
    _, train_cost, _ = sess.run([train_op, cost, prediction], feed_dict={
        # other model inputs here
        embedding_placeholder: init_W
    })

But I get this error:

    TypeError                                 Traceback (most recent call last)
    <ipython-input-18-732a79dc5ebd> in <module>()
         68                             labels: next_batch_input.relatedness_scores,
         69                             dropout_f: config.keep_prob,
    ---> 70                             embedding_placeholder: init_W
         71                         })
         72                     avg_cost+=train_cost

    /Users/kurt/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in run(self, fetches, feed_dict, options, run_metadata)
        764     try:
        765       result = self._run(None, fetches, feed_dict, options_ptr,
    --> 766                          run_metadata_ptr)
        767       if run_metadata:
        768         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

    /Users/kurt/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in _run(self, handle, fetches, feed_dict, options, run_metadata)
        935                 ' to a larger type (e.g. int64).')
    --> 937           np_val = np.asarray(subfeed_val, dtype=subfeed_dtype)
        939           if not subfeed_t.get_shape().is_compatible_with(np_val.shape):

    /Users/kurt/anaconda2/envs/tensorflow/lib/python2.7/site-packages/numpy/core/numeric.pyc in asarray(a, dtype, order)
        481     """
    --> 482     return array(a, dtype, copy=False, order=order)
        484 def asanyarray(a, dtype=None, order=None):

    TypeError: float() argument must be a string or a number

I checked the values of the init_W array and they are float:


This worked for me until recently, so I must be missing something. Any help would be appreciated. Thanks!

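It turned out that the problem was not with embedding_placeholder but with another input in feed_dict. The error message did not make this clear: the traceback arrow lands on the embedding_placeholder line even though a different entry failed the conversion.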
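Since the traceback does not name the offending entry, one way to narrow it down is to attempt the same NumPy conversion on each feed value separately before calling the session. The snippet below is only a sketch: `find_bad_feeds` and its `dtype_of` parameter are hypothetical names, and the sample feed uses string keys and made-up values as stand-ins for the real placeholders and inputs.

```python
import numpy as np

def find_bad_feeds(feed_dict, dtype_of=lambda key: np.float32):
    """Return (key, error message) pairs for values NumPy cannot convert.

    Hypothetical helper: `dtype_of` maps a feed key to the NumPy dtype it
    should be converted to; by default every entry is treated as float32.
    """
    bad = []
    for key, value in feed_dict.items():
        try:
            # Try the conversion one entry at a time, so a failure can be
            # attributed to a specific key.
            np.asarray(value, dtype=dtype_of(key))
        except (TypeError, ValueError) as exc:
            bad.append((key, str(exc)))
    return bad

# Stand-in feed: string keys instead of placeholders, made-up values.
feed = {
    "embedding_placeholder": np.zeros((4, 3)),
    "dropout_f": 0.5,
    "labels": [0.5, object(), 1.0],  # deliberately not convertible to float
}

bad = find_bad_feeds(feed)
for key, message in bad:
    print(key, "->", message)
```

With real TensorFlow 1.x placeholders as keys, passing something like `dtype_of=lambda ph: ph.dtype.as_numpy_dtype` should check each value against the dtype its placeholder declares.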

