= SQLAlchemy Aggregator = [[PageOutline(2-3,,inline)]] == News == * as of 20.06.2008 - supporting project by me (Paul Colomiets) is suspended, there is more recent version at [https://dbcook.svn.sourceforge.net/svnroot/dbcook/trunk/dbcook/misc/aggregator] * 18.11.2007 - We now have this trac installation, and svn (see below) * 27.08.2007 - Version 0.1.4. Code somewhat refactored, small bugfix changes by Svilen. Experimental Average aggregator support ''(0.1.3 was aimed to fix bugs in inherited mappers but actually fixed nothing)'' * 09.08.2007 - Version 0.1.2. Bugs fixed (pointed out in [http://groups.google.com/group/sqlalchemy/browse_thread/thread/e3bde9ad57f63533/0b3d3f51e9e55e89?#0b3d3f51e9e55e89 SQLAlchemy mailing list]) * 08.08.2007 - Version 0.1.1, `UPDATE` support, `Sum` aggregator fixed (was missing unittests, and not worked) * 07.08.2007 - Initial version == Installation == Installation is as easy as typing: {{{ easy_install SQLAlchemyAggregator }}} == Basic usage == Assuming we have some abstract `blocks` table with children `lines` table, and we want to store count of lines in `block.lines` and id of last line in `block.lastline` Your tables should look like: {{{ #!python blocks = Table('blocks', metadata, Column('id', Integer, primary_key=True, autoincrement=True), .... Column('lines', Integer, default=0), Column('lastline', Integer)) lines = Table('lines', metadata, Column('id', Integer, primary_key=True, autoincrement=True), Column('block', Integer, ForeingKey(blocks.c.id)), ....) }}} Few things you should know: * You must have a foreign key to parent table * If there are several first is chosen * If you'd not specified default for `lines` you have `block.lines = None` until you add first row (well, all thease restrictions should go in future releases) Then we need to create mappers: {{{ #!python >>> mapper(Block, blocks) >>> mapper(Line, lines, ... extension=aggregator.Quick( ... aggregator.Max(blocks.c.lastline, lines.c.id), aggregator.Count(blocks.c.lines), ... )) }}} Example is mostly self-explanatory, some details will be later. Then you can just create, and delete lines, and blocks' counter will be updated appropriately: {{{ #!python >>> block = Block('something') >>> session.save(block) >>> session.flush() # this is needed to get block.id >>> line = Line() >>> line.block = block.id >>> session.save(line) >>> session.flush() >>> session.refresh(block) >>> block.lines 1 }}} More things you should know: * You need to save and flush children and reload parent object before value will be updated * You shouldn't change foreign key field `line.block` in this case, as well as primary key of parent object (`block.id`) to get accurate results * Aggregator uses atomic updates so if other two restrictions are kept value will be accurate even in highly concurrent environment You could also do like this: {{{ #!python >>> mapper(Line, lines, ... extension=aggregator.Quick( ... aggregator.Max(blocks.c.lastline, lines.c.id), aggregator.Count(blocks.c.lines), ... ), properties = { ... '_block': blocks.c.block, ... 'block': relation(Block), ... }) >>> block = Block('something') >>> line = Line() >>> line.block = block >>> session.save(block) >>> session.save(line) >>> session.flush() >>> session.refresh(block) >>> block.lines 1 }}} With help of relation you don't need to flush and get `id` before storing children object. == Reference == === Aggregators === There are two basic aggregator clases: * `Accurate` - this class uses subquery with aggregation function (e.g. `UPDATE blocks SET line=(SELECT COUNT(*) FROM lines WHERE id = blocks.id) WHERE id = ?`) to update field after each change of relevant data * `Quick` - this one uses atomic updates to update values where possible (e.g. `UPDATE blocks SET line=line+1 WHERE id = ?`), but fallbacks to behavior of `Accurate` in corner cases (e.g. determining the max value when row was deleted) You can use both for distinct columns using list syntax: {{{ #!python import aggregator as a mapper(Line, lines, extension=[a.Quick(a.Count(blocks.c.lines)), a.Accurate(a.Max(blocks.c.lastline, lines.c.id))], )) }}} === Aggregations === Curenntly only four functions available: * `Count` - count of child rows. Arguments: * `target` - field in which number will be stored (in parent table) * `Max`, `Min` - maximum and minimum of specified fields respectively * `target` - field in which number will be stored (in parent table) * `source` - field which max or min will be calculated (in children table) * `Sum` - sum of the specified field of all children * `target` - field in which number will be stored (in parent table) * `source` - field which sum will be calculated (in children table) == To Do == * Other aggregate functions * Throw warnings when value can be "out of sync" * Document interface for adding own aggregation types (it's easy, today you can look at the source) * Remove foreign key requirement, allow multiple conditions * Test library on databases other than `MySQL` and `SQLite` For latter, just execute: {{{ python tests/simpletest.py scheme://login@host/dbname?everything=you_need }}} from distribution directory and leave feed back in comments here. By default it uses SQLite's memory database. == Download == * Quite stable version [http://www.mr-pc.kiev.ua/files/projects/SQLAlchemyAggregator/SQLAlchemyAggregator-0.1.4.dev-r849.tar.gz SQLAlchemyAggregator-0.1.4] * For the latest development version use [svn://svn.gafol.net/aggregator/trunk/] * There is also a fork at [https://dbcook.svn.sourceforge.net/svnroot/dbcook/trunk/dbcook/misc/aggregator] maintained by Svilen Dobrev * We are also in [http://cheeseshop.python.org/pypi/SQLAlchemyAggregator PyPI]