Why is pandas '==' different than '.eq()'

2018-06-13 05:49:04

Consider the series s:

s = pd.Series([(1, 2), (3, 4), (5, 6)])

This works as expected:

s == (3, 4)

0    False
1     True
2    False
dtype: bool

This does not:

s.eq((3, 4))

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)

ValueError: Lengths must be equal

I had assumed they were the same. What is the difference between them?

What does the documentation say?

Equivalent to series == other, but with support to substitute a fill_value for missing data in one of the inputs.

This seems to imply that they should behave the same way, hence the confusion.
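For reference, here is a minimal sketch of the one difference the documentation does describe, the fill_value argument. The two small Series are made up for illustration and are not part of the question.

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption, not from the question).
left = pd.Series([1.0, np.nan])
right = pd.Series([1.0, 0.0])

# Plain == treats the missing value as unequal to everything.
plain = (left == right).tolist()

# .eq can substitute fill_value for the missing entry before comparing.
filled = left.eq(right, fill_value=0).tolist()

print(plain)   # [True, False]
print(filled)  # [True, True]
```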

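What you are running into is actually a special case that makes it easier to compare a pandas.Series or numpy.ndarray with ordinary Python constructs. The source code reads: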

def flex_wrapper(self, other, level=None, fill_value=None, axis=0):
    # validate axis
    if axis is not None:
        self._get_axis_number(axis)
    if isinstance(other, ABCSeries):
        return self._binop(other, op, level=level, fill_value=fill_value)
    elif isinstance(other, (np.ndarray, list, tuple)):
        if len(other) != len(self):
            # ---------------------------------------
            # you never reach the `==` path because you get into this.
            # ---------------------------------------
            raise ValueError('Lengths must be equal')  
        return self._binop(self._constructor(other, self.index), op,
                           level=level, fill_value=fill_value)
    else:
        if fill_value is not None:
            self = self.fillna(fill_value)

        return self._constructor(op(self, other),
                                 self.index).__finalize__(self)

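You hit the ValueError because pandas assumes that a value passed to .eq should be converted to a numpy.ndarray or pandas.Series (whenever you give it an array, list, or tuple) rather than actually compared with the tuple itself. For example, if you have: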

s = pd.Series([1,2,3])
s.eq([1,2,3])

you would not want it to compare each element with the whole list [1,2,3].
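A short sketch of both sides of that branch, assuming a pandas version whose .eq still contains the length check quoted above. The variable names are illustrative only.

```python
import pandas as pd

numbers = pd.Series([1, 2, 3])

# A list of matching length is paired with the Series element by element.
elementwise = numbers.eq([1, 2, 3])
print(elementwise.tolist())  # [True, True, True]

pairs = pd.Series([(1, 2), (3, 4), (5, 6)])

# A tuple goes through the same list-like branch, so its length (2) is
# checked against the Series length (3) before any comparison happens.
try:
    pairs.eq((3, 4))
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```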

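The issue is that object arrays (as with dtype=uint) often slip through the cracks or are neglected on purpose. A simple if self.dtype != 'object' branch inside that method could resolve it. But maybe the developers had good reasons to make this case different. I would advise asking for clarification by posting on their bug tracker.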

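You did not ask how to make it work correctly, but for completeness I will include one possibility (judging from the source code, you probably need to wrap the value in a pandas.Series yourself):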

>>> s.eq(pd.Series([(1, 2)]))
0     True
1    False
2    False
dtype: bool
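Note that a one-element Series only lines up with index 0, so the other rows come out False. To test every row against a single tuple such as (3, 4), one option is to repeat the tuple over the same index. A plain Python comparison per element also works. The sketch below is my own illustration and the names target, repeated, via_eq and via_apply are hypothetical.

```python
import pandas as pd

s = pd.Series([(1, 2), (3, 4), (5, 6)])
target = (3, 4)

# Option 1: wrap the tuple in a Series of the same length and index,
# so .eq takes the Series branch and compares row by row.
repeated = pd.Series([target] * len(s), index=s.index)
via_eq = s.eq(repeated).tolist()

# Option 2: compare each element with ordinary Python equality.
via_apply = s.apply(lambda item: item == target).tolist()

print(via_eq)     # [False, True, False]
print(via_apply)  # [False, True, False]
```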

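== is an element-wise comparison that yields a vector of truth values, whereas .eq treats a list-like argument as a second sequence to be paired with the Series element by element, which requires both to have the same length. Ayhan points out one exception: when you compare a pandas vector type using .eq(scalar value), the scalar is simply broadcast to a vector of the same size for the comparison.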
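The scalar case can be checked with a small sketch. The data here is illustrative and not taken from the question.

```python
import pandas as pd

numbers = pd.Series([1, 2, 3])

# A true scalar is broadcast against every element, so both forms agree.
via_method = numbers.eq(2).tolist()
via_operator = (numbers == 2).tolist()

print(via_method)    # [False, True, False]
print(via_operator)  # [False, True, False]
```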
