A Pythonic way to read CSV with row and column headers

2018-06-23 13:24:41

Consider a CSV table that has both row and column headers, for example:

, "Car", "Bike", "Boat", "Plane", "Shuttle"
"Red", 1, 7, 3, 0, 0
"Green", 5, 0, 0, 0, 0
"Blue", 1, 1, 4, 0, 1

I would like to extract the row and column headers, i.e.:

col_headers = ["Car", "Bike", "Boat", "Plane", "Shuttle"]
row_headers = ["Red", "Green", "Blue"]
data = [[1, 7, 3, 0, 0],
        [5, 0, 0, 0, 0],
        [1, 1, 4, 0, 1]]

Of course, I could do something like

import csv
with open("path/to/file.csv", "r") as f:
    csvraw = list(csv.reader(f))
col_headers = csvraw[0][1:]  # the column headers are in the first row
row_headers = [row[0] for row in csvraw[1:]]
data = [row[1:] for row in csvraw[1:]]

...but it doesn't look Pythonic enough.

Is there a neater way to do such a natural operation?

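Take a look at csv.DictReader.

If the fieldnames parameter is omitted, the values in the first row of the csvfile will be used as the field names.

You can then read reader.fieldnames. This of course only gives you the column headers; you still have to parse the row headers manually.

That said, I think your original solution is perfectly fine.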
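As a rough sketch of the DictReader route described above: the sample table is inlined through io.StringIO so the snippet runs on its own, and skipinitialspace=True is my own addition, on the assumption that the file really has a space after each comma as shown in the question (without it the quotes would stay in the values). The names CSV_TEXT and corner are only illustrative.

```python
import csv
import io

# Sample table from the question, inlined so the sketch is self-contained.
CSV_TEXT = ''', "Car", "Bike", "Boat", "Plane", "Shuttle"
"Red", 1, 7, 3, 0, 0
"Green", 5, 0, 0, 0, 0
"Blue", 1, 1, 4, 0, 1
'''

# skipinitialspace=True drops the blank after each comma, so the quoted
# fields are recognised as quoted (assumption about the file layout).
reader = csv.DictReader(io.StringIO(CSV_TEXT), skipinitialspace=True)

# The first field name is the empty top-left cell; it keys the row header.
corner, *col_headers = reader.fieldnames

row_headers = []
data = []
for row in reader:
    row_headers.append(row[corner])
    # csv yields strings, so the cells are converted to int explicitly.
    data.append([int(row[col]) for col in col_headers])
```

The row headers still have to be pulled out by hand, which is why this is not much shorter than the version in the question.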

I now see that what I want is done most simply (and most robustly) with Pandas.

import pandas as pd
df = pd.read_csv('foo.csv', index_col=0)

If I need them, the headers are easy to extract:

col_headers = list(df.columns)
row_headers = list(df.index)
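For completeness, here is a minimal end-to-end sketch that also recovers the data list from the question. It assumes pandas is installed. The table is inlined via io.StringIO instead of 'foo.csv'. The skipinitialspace=True argument and the df.values.tolist() call are my additions, not part of the answer above. The former is there because the sample has a space after every comma.

```python
import io

import pandas as pd

# Sample table from the question, inlined so the sketch is self-contained.
CSV_TEXT = ''', "Car", "Bike", "Boat", "Plane", "Shuttle"
"Red", 1, 7, 3, 0, 0
"Green", 5, 0, 0, 0, 0
"Blue", 1, 1, 4, 0, 1
'''

# index_col=0 turns the first column into the row index;
# skipinitialspace=True strips the blank that follows each comma.
df = pd.read_csv(io.StringIO(CSV_TEXT), index_col=0, skipinitialspace=True)

col_headers = list(df.columns)
row_headers = list(df.index)
data = df.values.tolist()  # nested list of plain Python ints
```

In practice you would probably keep working with df directly and only convert to lists if some other code requires them.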

Otherwise, in "raw" Python, the approach I wrote in the question seems "good enough".
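One possible tidy-up of that raw approach, offered only as a sketch: it reads the header row with next() and uses starred unpacking to split each row into its label and cells. The inlined CSV_TEXT, the skipinitialspace=True flag and the int conversion are assumptions on my part, based on the sample shown in the question.

```python
import csv
import io

# Sample table from the question, inlined so the sketch is self-contained.
CSV_TEXT = ''', "Car", "Bike", "Boat", "Plane", "Shuttle"
"Red", 1, 7, 3, 0, 0
"Green", 5, 0, 0, 0, 0
"Blue", 1, 1, 4, 0, 1
'''

reader = csv.reader(io.StringIO(CSV_TEXT), skipinitialspace=True)

# Header row: discard the empty corner cell, keep the column headers.
_, *col_headers = next(reader)

row_headers = []
data = []
# Each remaining row is its label followed by the numeric cells.
for label, *cells in reader:
    row_headers.append(label)
    data.append([int(cell) for cell in cells])
```

This streams the file row by row instead of building the full csvraw list first, which may matter for large files.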

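I know this solution gives you a different output format from the one requested, but it is very convenient. It reads each CSV row into a dictionary: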

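import csv

# parameters_file and dialect are assumed to be defined elsewhere.
with open(parameters_file, newline="") as f:
    reader = csv.reader(f, dialect)
    keys = [key.lower() for key in next(reader)]  # header row, lowercased
    for line in reader:
        parameter = dict(zip(keys, line))  # one dict per CSV row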
