Using join on a numpy array composed of strings

I'm trying to use join on a numpy array made up only of strings (each one holding the binary representation of a float) to get a single string that I can pass to numpy.fromstring, but join doesn't seem to work properly.

Any idea why? Which alternative function can I use to do that?

Here is a standalone example to show my problem:

import numpy as np

nb_el = 10

table = np.arange(nb_el, dtype='float64')
print table

binary = table.tostring()

binary_list = map(''.join, zip(*[iter(binary)] * table.dtype.itemsize))
print 'len binary list :', len(binary_list)
# len binary list : 10

join_binary_list = ''.join(binary_list)
print np.fromstring(join_binary_list, dtype='float64')
# [ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9.]

binary_split_array = np.array(binary_list)
print 'nb el :', binary_split_array.shape
# nb el : (10,)
print 'nb_el * size :', binary_split_array.shape[0] * binary_split_array.dtype.itemsize
# nb_el * size : 80

join_binary_split_array = ''.join(binary_split_array)
print 'len binary array :', len(join_binary_split_array)
# len binary array : 72

table_fromstring = np.fromstring(join_binary_split_array, dtype='float64')
print table_fromstring
# [ 1.  2.  3.  4.  5.  6.  7.  8.  9.]

As you can see, join works properly on the list (binary_list), but not on the equivalent numpy array (binary_split_array): the string it returns is only 72 characters long instead of 80.


The first element of your binary_split_array is an empty string:

print(repr(binary_split_array[0]))    
''

The first element in your list is:

'\x00\x00\x00\x00\x00\x00\x00\x00'

An empty string has a length of 0:

print([len("".join(a)) for a in binary_split_array])
print([len("".join(a)) for a in binary_list])
[0, 8, 8, 8, 8, 8, 8, 8, 8, 8]
[8, 8, 8, 8, 8, 8, 8, 8, 8, 8]

The string of eight null bytes has a length of 8:

print(len('\x00\x00\x00\x00\x00\x00\x00\x00'))
8

Calling tobytes on the array gives the same total length as joining the list:

print(len(binary_split_array.tobytes()))
80

table_fromstring = np.fromstring(binary_split_array.tobytes(), dtype='float64')

print table_fromstring
[ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9.]
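
For readers on Python 3 with a recent NumPy, the same round trip can be sketched as below. This is a minimal sketch, not the original poster's code: it assumes np.frombuffer as the replacement for the binary mode of np.fromstring, slices the buffer instead of using the zip idiom, and pins the dtypes to '<f8' and 'S8' so the result does not depend on the platform's byte order.

```python
import numpy as np

nb_el = 10
table = np.arange(nb_el, dtype='<f8')   # little-endian float64, pinned for reproducibility
binary = table.tobytes()                 # 80 raw bytes
size = table.dtype.itemsize              # 8 bytes per float

# Split the buffer into one 8-byte chunk per float.
binary_list = [binary[i:i + size] for i in range(0, len(binary), size)]

# Fixed-width byte-string array; 'S8' is stated explicitly (an assumption of this sketch).
binary_split_array = np.array(binary_list, dtype='S8')

# Joining the elements loses the all-null first chunk (the float 0.0).
joined = b''.join(binary_split_array)

# tobytes() returns the array's underlying buffer, nulls included.
restored = np.frombuffer(binary_split_array.tobytes(), dtype='<f8')

print(len(joined), len(binary_split_array.tobytes()))
print(restored)
```

The key choice here is to go through tobytes() rather than through the individual elements, since only the former keeps the padding bytes.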

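NumPy's fixed-width string dtype handles null bytes differently from Python: null bytes serve as padding, so trailing null bytes are stripped whenever an element is read back from the array. The first element consists only of null bytes, so it comes back as an empty string and join loses 8 characters.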
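
To illustrate the point, here is a small Python 3 sketch (the sample values are made up for the illustration) suggesting that only trailing null bytes are dropped when elements are read back, while embedded nulls and the array's underlying buffer are left intact.

```python
import numpy as np

# Three 4-byte values: trailing nulls, all nulls, and an embedded null.
samples = [b'ab\x00\x00', b'\x00\x00\x00\x00', b'a\x00b\x00']
arr = np.array(samples, dtype='S4')

# Reading elements back strips trailing null bytes only.
items = [bytes(x) for x in arr]
print(items)

# The raw buffer still holds all 12 bytes, padding included.
raw = arr.tobytes()
print(len(raw))
```

So any approach that iterates over the elements (join, list(), indexing) sees the stripped values, whereas tobytes() sees the full fixed-width storage.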
