How to read csv into record array in numpy?

I wonder if there is a direct way to import the contents of a csv file into a record array, much in the way that R's read.table() , read.delim() , and read.csv() family imports data to R's data frame?

Or is the best way to use csv.reader() and then apply something like numpy.core.records.fromrecords() ?


You can use Numpy's genfromtxt() method to do so, by setting the delimiter kwarg to a comma.

from numpy import genfromtxt
my_data = genfromtxt('my_file.csv', delimiter=',')

More information on the function can be found at its respective documentation.


I would recommend the read_csv function from the pandas library:

import pandas as pd
df=pd.read_csv('myfile.csv', sep=',',header=None)
df.values
array([[ 1. ,  2. ,  3. ],
       [ 4. ,  5.5,  6. ]])

This gives a pandas DataFrame - allowing many useful data manipulation functions which are not directly available with numpy record arrays.

DataFrame is a 2-dimensional labeled data structure with columns of potentially different types. You can think of it like a spreadsheet or SQL table...


I would also recommend genfromtxt . However, since the question asks for a record array, as opposed to a normal array, the dtype=None parameter needs to be added to the genfromtxt call:

Given an input file, myfile.csv :

1.0, 2, 3
4, 5.5, 6

import numpy as np
np.genfromtxt('myfile.csv',delimiter=',')

gives an array:

array([[ 1. ,  2. ,  3. ],
       [ 4. ,  5.5,  6. ]])

and

np.genfromtxt('myfile.csv',delimiter=',',dtype=None)

gives a record array:

array([(1.0, 2.0, 3), (4.0, 5.5, 6)], 
      dtype=[('f0', '<f8'), ('f1', '<f8'), ('f2', '<i4')])

This has the advantage that file with multiple data types (including strings) can be easily imported.


您也可以尝试recfromcsv() ,它可以猜测数据类型并返回格式正确的记录数组。

链接地址: http://www.djcxy.com/p/50982.html

上一篇: 如何访问NumPy多维数组的第i列?

下一篇: 如何将csv读入numpy的记录数组?