How can I write my own aggregate functions with SQLAlchemy?

Date: 2022-11-22 22:30:20

How can I write my own aggregate functions with SQLAlchemy? As an easy example I would like to use numpy to calculate the variance. With sqlite it would look like this:

import sqlite3 as sqlite
import numpy as np

class self_written_SQLvar(object):
    def __init__(self):
        # collects every value passed to the aggregate
        self.values = []
    def step(self, value):
        # called once per row
        self.values.append(value)
    def finalize(self):
        # called once at the end; returns the population variance
        return np.array(self.values).var()

cxn = sqlite.connect(':memory:')
cur = cxn.cursor()
cxn.create_aggregate("self_written_SQLvar", 1, self_written_SQLvar)
# Now - how to use it:
cur.execute("CREATE TABLE 'mytable' ('numbers' INTEGER)")
cur.execute("INSERT INTO 'mytable' VALUES (1)")
cur.execute("INSERT INTO 'mytable' VALUES (2)")
cur.execute("INSERT INTO 'mytable' VALUES (3)")
cur.execute("INSERT INTO 'mytable' VALUES (4)")
a = cur.execute("SELECT avg(numbers), self_written_SQLvar(numbers) FROM mytable")
print(a.fetchall())
# prints [(2.5, 1.25)]
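
One caveat with an aggregate class like the one above: SQLite passes NULL to step() as None, which the numpy call would probably not handle. The following is only a sketch of one way around that, assuming NULLs should be skipped the way the built-in avg() skips them. The name null_safe_var is made up for illustration, and statistics.pvariance from the standard library stands in for numpy's var() (both compute the population variance).

```python
import sqlite3
import statistics


class NullSafeVar:
    """Population variance that skips NULL inputs (hypothetical helper)."""

    def __init__(self):
        self.values = []

    def step(self, value):
        # SQLite hands NULL to Python as None; skip it like avg() does
        if value is not None:
            self.values.append(value)

    def finalize(self):
        # Returning None makes SQLite report NULL when no values were seen
        if not self.values:
            return None
        return statistics.pvariance(self.values)


cxn = sqlite3.connect(":memory:")
cxn.create_aggregate("null_safe_var", 1, NullSafeVar)
cxn.execute("CREATE TABLE mytable (numbers INTEGER)")
cxn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [(1,), (2,), (None,), (3,), (4,)],
)
result = cxn.execute(
    "SELECT avg(numbers), null_safe_var(numbers) FROM mytable"
).fetchone()
print(result)  # (2.5, 1.25)
```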

2 Answers

#1


The creation of new aggregate functions is backend-dependent and must be done directly with the API of the underlying connection. SQLAlchemy offers no facility for creating them.

However, once they have been created, you can use them in SQLAlchemy like any other function.

Example:

# Written for Python 3 and SQLAlchemy 1.4 or later
from sqlalchemy import Column, Table, create_engine, MetaData, Integer
from sqlalchemy import func, select
from sqlalchemy.pool import StaticPool
from random import randrange
import numpy
import sqlite3

class NumpyVarAggregate(object):
    def __init__(self):
        self.values = []
    def step(self, value):
        self.values.append(value)
    def finalize(self):
        return numpy.array(self.values).var()

def sqlite_memory_engine_creator():
    # register the aggregate on the raw sqlite3 connection
    con = sqlite3.connect(':memory:')
    con.create_aggregate("np_var", 1, NumpyVarAggregate)
    return con

e = create_engine('sqlite://', echo=True, poolclass=StaticPool,
                  creator=sqlite_memory_engine_creator)
m = MetaData()
t = Table('mytable', m,
          Column('id', Integer, primary_key=True),
          Column('number', Integer))
m.create_all(e)

Now for the testing:

# insert 30 random-valued rows, then query with the custom aggregate
with e.begin() as conn:
    conn.execute(t.insert(), [{'number': randrange(100)} for _ in range(30)])
    for row in conn.execute(select(func.avg(t.c.number), func.np_var(t.c.number))):
        print('RESULT ROW: ', row)

That prints output along these lines (with SQLAlchemy statement echo turned on; the exact log format varies by version):

2009-06-15 14:55:34,171 INFO sqlalchemy.engine.base.Engine.0x...d20c PRAGMA 
table_info("mytable")
2009-06-15 14:55:34,174 INFO sqlalchemy.engine.base.Engine.0x...d20c ()
2009-06-15 14:55:34,175 INFO sqlalchemy.engine.base.Engine.0x...d20c 
CREATE TABLE mytable (
    id INTEGER NOT NULL, 
    number INTEGER, 
    PRIMARY KEY (id)
)
2009-06-15 14:55:34,175 INFO sqlalchemy.engine.base.Engine.0x...d20c ()
2009-06-15 14:55:34,176 INFO sqlalchemy.engine.base.Engine.0x...d20c COMMIT
2009-06-15 14:55:34,177 INFO sqlalchemy.engine.base.Engine.0x...d20c INSERT
INTO mytable (number) VALUES (?)
2009-06-15 14:55:34,177 INFO sqlalchemy.engine.base.Engine.0x...d20c [[98], 
[94], [7], [1], [79], [77], [51], [28], [85], [26], [34], [68], [15], [43], 
[52], [97], [64], [82], [11], [71], [27], [75], [60], [85], [42], [40], 
[76], [12], [81], [69]]
2009-06-15 14:55:34,178 INFO sqlalchemy.engine.base.Engine.0x...d20c COMMIT
2009-06-15 14:55:34,180 INFO sqlalchemy.engine.base.Engine.0x...d20c SELECT
avg(mytable.number) AS avg_1, np_var(mytable.number) AS np_var_1 FROM mytable
2009-06-15 14:55:34,180 INFO sqlalchemy.engine.base.Engine.0x...d20c []
RESULT ROW: (55.0, 831.0)

Note that I didn't use SQLAlchemy's ORM (only the SQL expression part of SQLAlchemy was used), but you could use the ORM just as well.
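
To illustrate the ORM remark, here is a minimal sketch, assuming SQLAlchemy 1.4 or later. It registers the aggregate through the engine's "connect" event instead of a creator function, which is one common way to reach the underlying sqlite3 connection. The mapped class MyRow is hypothetical, and statistics.pvariance from the standard library stands in for numpy's var() so that the sketch has no extra dependency.

```python
import statistics

from sqlalchemy import Column, Integer, create_engine, event, func, select
from sqlalchemy.orm import Session, declarative_base


class VarAggregate:
    """Population variance, the same quantity numpy's var() returns by default."""

    def __init__(self):
        self.values = []

    def step(self, value):
        self.values.append(value)

    def finalize(self):
        return statistics.pvariance(self.values)


Base = declarative_base()


class MyRow(Base):  # hypothetical mapped class for "mytable"
    __tablename__ = "mytable"
    id = Column(Integer, primary_key=True)
    number = Column(Integer)


engine = create_engine("sqlite://")  # in-memory SQLite database


@event.listens_for(engine, "connect")
def register_aggregates(dbapi_connection, connection_record):
    # Runs for each new DBAPI connection; here that is a sqlite3 connection
    dbapi_connection.create_aggregate("np_var", 1, VarAggregate)


Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([MyRow(number=n) for n in (1, 2, 3, 4)])
    session.commit()
    # func.np_var renders as a call to the aggregate registered above
    avg, var = session.execute(
        select(func.avg(MyRow.number), func.np_var(MyRow.number))
    ).one()

print(avg, var)  # 2.5 1.25
```

The event listener has the advantage that the aggregate is registered on every connection the pool opens, not only on the first one.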

#2


First, import func from sqlalchemy.

Then you can write

func.avg('fieldname')

or func.avg('fieldname').label('user_defined')

For more information, see

http://www.sqlalchemy.org/docs/05/ormtutorial.html#using-subqueries
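
As a rough sketch of the func usage above, assuming SQLAlchemy 1.4 or later: func builds a SQL function call from whatever attribute name is accessed, so a custom aggregate such as np_var (the name registered in the first answer) is written the same way as avg. One point worth flagging is that a plain string argument is sent as a bound literal value rather than treated as a column name, so this sketch passes a column object instead. The column name number is only an example.

```python
from sqlalchemy import column, func, select

number = column("number")  # example column, not tied to a real table

# A built-in aggregate with an explicit label
labeled = func.avg(number).label("user_defined")

# Any other attribute name becomes a call to a SQL function of that name
custom = func.np_var(number)

stmt = select(labeled, custom)
print(stmt)

# With a string argument, the rendered SQL contains a bind placeholder
# and the text 'fieldname' does not appear as a column
print(func.avg("fieldname"))
```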
