python sqlalchemy distinct column values

2018-07-02 13:46:20

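I have 6 tables in my SQLite database, each with 6 columns (Date, user, NormalA, specialA, contact, remarks) and more than 1000 rows.

How can I use sqlalchemy to sort through the Date column, find duplicate dates, and delete those rows?

Assuming this is your model: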

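from sqlalchemy import Column, DateTime, Integer, String, and_, func
from sqlalchemy.orm import declarative_base  # SQLAlchemy 1.4+

Base = declarative_base()

class MyTable(Base):
    __tablename__ = 'my_table'
    id = Column(Integer, primary_key=True)
    date = Column(DateTime)
    user = Column(String)
    # we do not really care about columns other than `id` and `date`
    # what matters here is that `id` is a PK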

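Here are two ways to delete the data:

1. Find the duplicates, mark them for deletion, and commit the transaction.

2. Create a single SQL query that performs the deletion directly on the database.

Both of them use the same helper subquery: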

# helper subquery: find first row (by primary key) for each unique date
subq = (
    session.query(MyTable.date, func.min(MyTable.id).label("min_id"))
    .group_by(MyTable.date)
    .subquery("date_min_id")
)
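To see what the helper subquery yields, here is a minimal self-contained sketch against an in-memory SQLite database. The sample rows, the explicit `id` values, and the `engine` setup are assumptions made for illustration, and the sketch assumes SQLAlchemy 1.4 or newer.

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class MyTable(Base):
    __tablename__ = "my_table"
    id = Column(Integer, primary_key=True)
    date = Column(DateTime)
    user = Column(String)


engine = create_engine("sqlite://")  # in-memory database, illustration only
Base.metadata.create_all(engine)

d1, d2 = datetime(2018, 7, 1), datetime(2018, 7, 2)
with Session(engine) as session:
    session.add_all([
        MyTable(id=1, date=d1, user="a"),
        MyTable(id=2, date=d1, user="b"),  # same date as id=1
        MyTable(id=3, date=d2, user="c"),
    ])
    session.commit()

    # Same shape as the helper subquery, executed directly to inspect it.
    rows = (
        session.query(MyTable.date, func.min(MyTable.id).label("min_id"))
        .group_by(MyTable.date)
        .order_by(MyTable.date)
        .all()
    )
    first_ids = [row.min_id for row in rows]

print(first_ids)
```

Each date appears once, paired with the smallest `id` that carries it; every other row with that date counts as a duplicate.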

Option-1: Find the duplicates, mark them for deletion, and commit the transaction

# query to find all duplicates
q_duplicates = (
    session
    .query(MyTable)
    .join(subq, and_(
        MyTable.date == subq.c.date,
        MyTable.id != subq.c.min_id)
    )
)

for x in q_duplicates:
    print("Will delete %s" % x)
    session.delete(x)
session.commit()
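Option 1 can be tried end to end with the sketch below. It assumes SQLAlchemy 1.4 or newer; the in-memory database, the sample rows, and the names `deleted_ids` and `remaining_ids` are made up for the demonstration.

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, Integer, String, and_, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class MyTable(Base):
    __tablename__ = "my_table"
    id = Column(Integer, primary_key=True)
    date = Column(DateTime)
    user = Column(String)


engine = create_engine("sqlite://")  # in-memory database, illustration only
Base.metadata.create_all(engine)

d1, d2 = datetime(2018, 7, 1), datetime(2018, 7, 2)
with Session(engine) as session:
    session.add_all([
        MyTable(id=1, date=d1, user="a"),
        MyTable(id=2, date=d1, user="b"),  # duplicate of id=1
        MyTable(id=3, date=d2, user="c"),
        MyTable(id=4, date=d1, user="d"),  # duplicate of id=1
    ])
    session.commit()

    # First row (lowest id) per date.
    subq = (
        session.query(MyTable.date, func.min(MyTable.id).label("min_id"))
        .group_by(MyTable.date)
        .subquery("date_min_id")
    )
    # Rows that share a date with a "first" row but are not that row.
    duplicates = (
        session.query(MyTable)
        .join(subq, and_(MyTable.date == subq.c.date, MyTable.id != subq.c.min_id))
        .all()
    )
    deleted_ids = sorted(row.id for row in duplicates)
    for row in duplicates:
        session.delete(row)
    session.commit()

    remaining_ids = [r.id for r in session.query(MyTable).order_by(MyTable.id)]

print(deleted_ids, remaining_ids)
```

Because every duplicate is loaded as an object before deletion, this route is convenient for logging or reviewing rows, at the cost of one DELETE per row.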

Option-2: Create a single SQL query that performs the deletion directly on the database

sq = (
    session
    .query(MyTable.id)
    .join(subq, and_(
        MyTable.date == subq.c.date,
        MyTable.id != subq.c.min_id)
    )
).subquery("subq")

dq = (
    session
    .query(MyTable)
    .filter(MyTable.id.in_(sq))
).delete(synchronize_session=False)
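For comparison, here is a sketch of the same single-statement idea in the `select()` / `delete()` style of SQLAlchemy 1.4 and later. Note that it is a variant, not the query above: instead of joining, it deletes every row whose `id` is NOT IN the set of first ids per date. The sample data and the names `keep_ids` and `remaining_ids` are assumptions.

```python
from datetime import datetime

from sqlalchemy import (
    Column, DateTime, Integer, String, create_engine, delete, func, select,
)
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class MyTable(Base):
    __tablename__ = "my_table"
    id = Column(Integer, primary_key=True)
    date = Column(DateTime)
    user = Column(String)


engine = create_engine("sqlite://")  # in-memory database, illustration only
Base.metadata.create_all(engine)

d1, d2 = datetime(2018, 7, 1), datetime(2018, 7, 2)
with Session(engine) as session:
    session.add_all([
        MyTable(id=1, date=d1, user="a"),
        MyTable(id=2, date=d1, user="b"),  # duplicate of id=1
        MyTable(id=3, date=d2, user="c"),
        MyTable(id=4, date=d1, user="d"),  # duplicate of id=1
    ])
    session.commit()

    # Ids to keep: the lowest id for each distinct date.
    keep_ids = select(func.min(MyTable.id)).group_by(MyTable.date)

    # One DELETE statement; the session is not asked to track what was removed.
    stmt = (
        delete(MyTable)
        .where(MyTable.id.not_in(keep_ids))
        .execution_options(synchronize_session=False)
    )
    session.execute(stmt)
    session.commit()

    remaining_ids = [row.id for row in session.query(MyTable.id).order_by(MyTable.id)]

print(remaining_ids)
```

With `synchronize_session=False` the session is not updated about the deleted rows, so objects loaded earlier may be stale until they are refreshed or the session is closed.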

Inspired by "Finding duplicate values in a SQL table", this might help you select the duplicate dates:

# one row per date that occurs more than once
query = (
    session.query(MyTable)
    .group_by(MyTable.date)
    .having(func.count(MyTable.date) > 1)
    .all()
)

If you only want to show the unique dates, distinct on is probably what you need.
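As a rough illustration of listing only the unique dates: `DISTINCT ON` is a PostgreSQL feature, so for SQLite this sketch uses a plain `DISTINCT` through `Query.distinct()`. The model mirrors the one above; the sample rows and the name `unique_dates` are assumptions, as is SQLAlchemy 1.4 or newer.

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class MyTable(Base):
    __tablename__ = "my_table"
    id = Column(Integer, primary_key=True)
    date = Column(DateTime)
    user = Column(String)


engine = create_engine("sqlite://")  # in-memory database, illustration only
Base.metadata.create_all(engine)

d1, d2 = datetime(2018, 7, 1), datetime(2018, 7, 2)
with Session(engine) as session:
    session.add_all([
        MyTable(id=1, date=d1, user="a"),
        MyTable(id=2, date=d1, user="b"),  # same date as id=1
        MyTable(id=3, date=d2, user="c"),
    ])
    session.commit()

    # SELECT DISTINCT date FROM my_table ORDER BY date
    unique_dates = [
        d for (d,) in session.query(MyTable.date).distinct().order_by(MyTable.date)
    ]

print(unique_dates)
```

This only reads the distinct values; it does not remove anything from the table.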

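While I like the object-oriented approach of SQLAlchemy, I sometimes find it easier to use plain SQL directly. Since the records have no key, we need the row number ( _ROWID_ ) to delete the targeted records, and I don't think the API provides it.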
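To make the `_ROWID_` idea concrete, the sketch below uses Python's standard `sqlite3` module rather than SQLAlchemy. It creates a table with no declared key and shows that SQLite still exposes an implicit row id. The two-column `TableA` layout and the sample values are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway database, illustration only

# No primary key is declared, yet SQLite keeps an implicit integer rowid.
conn.execute("CREATE TABLE TableA (Date TEXT, user TEXT)")
conn.executemany(
    "INSERT INTO TableA (Date, user) VALUES (?, ?)",
    [("2018-07-01", "a"), ("2018-07-01", "b"), ("2018-07-02", "c")],
)

# _ROWID_ (also spelled rowid or oid) must be selected explicitly;
# SELECT * does not include it.
rows = conn.execute(
    "SELECT _ROWID_, Date, user FROM TableA ORDER BY _ROWID_"
).fetchall()

for row in rows:
    print(row)
```

The row id grows with insertion order here, which is what lets `MIN(_ROWID_)` and `MAX(_ROWID_)` stand in for "first inserted" and "last inserted".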

So first we connect to the database:

from sqlalchemy import create_engine
db = create_engine(r'sqlite:///C:\temp\example.db')
eng = db.engine

Then list all the records:

# Engine.execute() is the legacy 1.x API (removed in SQLAlchemy 2.0)
for row in eng.execute("SELECT * FROM TableA;"):
    print(row)

And display all the duplicated records that share the same date:

for row in eng.execute("""
  SELECT * FROM {table}
  WHERE {field} IN (SELECT {field} FROM {table} GROUP BY {field} HAVING COUNT(*) > 1)
  ORDER BY {field};
  """.format(table="TableA", field="Date")):
    print(row)
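On SQLAlchemy 1.4 and later the same duplicate listing would usually go through a connection and `text()`. The following is a sketch under that assumption; the in-memory database, the two-column `TableA` layout, the sample rows, and the bind names `:d` and `:u` are all illustrative.

```python
from sqlalchemy import create_engine, text

engine = create_engine("sqlite://")  # in-memory database, illustration only

with engine.begin() as conn:  # begin() commits when the block exits cleanly
    conn.execute(text("CREATE TABLE TableA (Date TEXT, user TEXT)"))
    conn.execute(
        text("INSERT INTO TableA (Date, user) VALUES (:d, :u)"),
        [
            {"d": "2018-07-01", "u": "a"},
            {"d": "2018-07-01", "u": "b"},  # same date as the first row
            {"d": "2018-07-02", "u": "c"},
        ],
    )

    # All rows whose Date occurs more than once.
    dup_rows = conn.execute(text("""
        SELECT Date, user FROM TableA
        WHERE Date IN (SELECT Date FROM TableA GROUP BY Date HAVING COUNT(*) > 1)
        ORDER BY Date, user
    """)).fetchall()

duplicates = [tuple(row) for row in dup_rows]
print(duplicates)
```

Values are passed as bound parameters; table and column names cannot be bound, which is why the statements above build them in with `str.format`.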

Now that we have identified all the duplicates, they may need to be fixed if the other fields differ:

eng.execute("UPDATE TableA SET NormalA=18, specialA=20 WHERE Date = '2016-18-12';")
eng.execute("UPDATE TableA SET NormalA=4,  specialA=8  WHERE Date = '2015-18-12';")

Finally, keep the first inserted record and delete the more recent duplicates:

print(eng.execute("""
  DELETE FROM {table}
  WHERE _ROWID_ NOT IN (SELECT MIN(_ROWID_) FROM {table} GROUP BY {field});
  """.format(table="TableA", field="Date")).rowcount)

Or keep the last inserted record and delete the other duplicates:

print(eng.execute("""
  DELETE FROM {table}
  WHERE _ROWID_ NOT IN (SELECT MAX(_ROWID_) FROM {table} GROUP BY {field});
  """.format(table="TableA", field="Date")).rowcount)
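The two DELETE variants differ only in `MIN` versus `MAX`, so they can be checked side by side. This sketch uses the standard `sqlite3` module so it runs without a database file. The helpers `dedupe` and `make_db`, the `keep` parameter, and the sample rows are hypothetical names and data introduced for the example.

```python
import sqlite3


def dedupe(conn, table, field, keep="first"):
    """Delete rows sharing `field`, keeping the first or last inserted one."""
    agg = {"first": "MIN", "last": "MAX"}[keep]
    # Identifiers cannot be bound as parameters, so they are formatted in;
    # only pass trusted table and column names here.
    sql = (
        "DELETE FROM {table} WHERE _ROWID_ NOT IN "
        "(SELECT {agg}(_ROWID_) FROM {table} GROUP BY {field})"
    ).format(table=table, field=field, agg=agg)
    return conn.execute(sql).rowcount


def make_db():
    """Build a small throwaway table with one duplicated date."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE TableA (Date TEXT, user TEXT)")
    conn.executemany(
        "INSERT INTO TableA (Date, user) VALUES (?, ?)",
        [("2018-07-01", "a"), ("2018-07-01", "b"), ("2018-07-02", "c")],
    )
    return conn


conn_first = make_db()
removed_first = dedupe(conn_first, "TableA", "Date", keep="first")
kept_first = [u for (u,) in conn_first.execute("SELECT user FROM TableA ORDER BY _ROWID_")]

conn_last = make_db()
removed_last = dedupe(conn_last, "TableA", "Date", keep="last")
kept_last = [u for (u,) in conn_last.execute("SELECT user FROM TableA ORDER BY _ROWID_")]

print(removed_first, kept_first)
print(removed_last, kept_last)
```

Running both variants on a copy of the data first is a cheap way to confirm which record survives before touching the real tables.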
