SQLAlchemy: What's the Difference Between flush() and commit()

SQLAlchemy: What's the difference between flush() and commit()?

A Session object is basically an ongoing transaction of changes to a database (update, insert, delete). These operations aren't persisted to the database until they are committed; if your program aborts mid-transaction, any uncommitted changes are lost.

The session object registers pending operations with session.add(), but doesn't communicate them to the database until session.flush() is called.

session.flush() communicates a series of operations to the database (insert, update, delete). The database maintains them as pending operations in a transaction. The changes aren't persisted permanently to disk, or visible to other transactions until the database receives a COMMIT for the current transaction (which is what session.commit() does).

session.commit() commits (persists) those changes to the database.

flush() is always called as part of a call to commit().
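The distinction can be sketched with two sessions on separate connections. This is a minimal illustration, assuming SQLAlchemy 1.4 or later and a throwaway file-backed SQLite database; the Foo model, the writer/reader names, and the file name are made up for the example and are not part of the original answer.

```python
import os
import tempfile

import sqlalchemy as sa
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Foo(Base):  # hypothetical model, for illustration only
    __tablename__ = "foo"
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String)


# File-backed SQLite, so that each session gets its own connection.
db_path = os.path.join(tempfile.mkdtemp(), "flush_demo.db")
engine = sa.create_engine(f"sqlite:///{db_path}")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

writer = Session()
reader = Session()

writer.add(Foo(name="A"))
writer.flush()  # the INSERT is sent, but the transaction stays open

seen_by_writer = writer.query(Foo).count()  # own transaction sees the row
seen_by_reader_before = reader.query(Foo).count()  # other session does not
reader.rollback()  # end the reader's transaction so it holds no lock

writer.commit()  # COMMIT makes the row permanent and visible to others

seen_by_reader_after = reader.query(Foo).count()
print(seen_by_writer, seen_by_reader_before, seen_by_reader_after)

writer.close()
reader.close()
engine.dispose()
```

The reader should only see the row once the writer has committed; the flushed row exists solely inside the writer's open transaction until then.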

When you use a Session object to query the database, the query will return results both from the database and from the flushed parts of the uncommitted transaction it holds. By default, Session objects autoflush their operations, but this can be disabled.

Hopefully this example will make this clearer:

#---
# Assumes Foo is a mapped class and Session is a sessionmaker bound to an engine.
s = Session()

s.add(Foo('A')) # The Foo('A') object has been added to the session.
                # It has not been committed to the database yet,
                #   but is returned as part of a query (autoflush
                #   sends the INSERT before the SELECT runs).
print(1, s.query(Foo).all())
s.commit()

#---
s2 = Session()
s2.autoflush = False

s2.add(Foo('B'))
print(2, s2.query(Foo).all()) # The Foo('B') object is *not* returned
                              #   as part of this query because it hasn't
                              #   been flushed yet.
s2.flush()                    # Now, Foo('B') is in the same state as
                              #   Foo('A') was above.
print(3, s2.query(Foo).all())
s2.rollback()                 # Foo('B') has not been committed, and rolling
                              #   back the session's transaction removes it
                              #   from the session.
print(4, s2.query(Foo).all())

#---
Output:
1 [<Foo('A')>]
2 [<Foo('A')>]
3 [<Foo('A')>, <Foo('B')>]
4 [<Foo('A')>]

What is the difference between session.commit() and session.flush()?

Here are some relevant quotes from the documentation.

flush:

When the Session is used with its default configuration, the flush
step is nearly always done transparently. Specifically, the flush
occurs before any individual Query is issued, as well as within the
commit() call before the transaction is committed.

commit:

commit() is used to commit the current transaction. It always issues
flush() beforehand to flush any remaining state to the database; this
is independent of the “autoflush” setting. If no transaction is
present, it raises an error. Note that the default behavior of the
Session is that a “transaction” is always present; this behavior can
be disabled by setting autocommit=True. In autocommit mode, a
transaction can be initiated by calling the begin() method.
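The claim that commit() flushes regardless of the autoflush setting can be checked with a session event. The following is only a sketch, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database; the Foo model and the flush_log list are hypothetical names introduced for the example.

```python
import sqlalchemy as sa
from sqlalchemy import event
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Foo(Base):  # hypothetical model, for illustration only
    __tablename__ = "foo"
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String)


engine = sa.create_engine("sqlite://")
Base.metadata.create_all(engine)

# autoflush is disabled, so queries will not trigger a flush.
session = sessionmaker(bind=engine, autoflush=False)()

flush_log = []  # one entry is appended per completed flush


@event.listens_for(session, "after_flush")
def record_flush(sess, flush_context):
    flush_log.append("flush")


session.add(Foo(name="B"))
rows_before_commit = session.query(Foo).count()  # no autoflush happens here
flushes_before_commit = len(flush_log)

session.commit()  # flushes pending state first, then commits
flushes_after_commit = len(flush_log)
rows_after_commit = session.query(Foo).count()

print(flushes_before_commit, flushes_after_commit)
print(rows_before_commit, rows_after_commit)
session.close()
```

With autoflush off, the query before commit() should not see the pending object, while commit() itself should still produce exactly one flush.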

SQLAlchemy: How to delete and flush instead of commit?

Explicit deletion of an object through the session, such as db.session.delete(account.receipt) does not result in the disassociation of the child object from its parent until commit() is called on the session. This means that until the commit occurs, expressions such as if parent.child: ... will still evaluate truthy after flush and before commit.

Instead of relying on truthiness, we can check the object state in our logic, as once flush() has been called, the state of the deleted object changes from persistent to deleted (Quickie Intro to Object States):

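from sqlalchemy import inspect

# After flush(): the object itself is in the "deleted" state.
if not inspect(parent.child).deleted:
    ...

# Before flush(): objects marked for deletion are listed in session.deleted.
if parent.child not in session.deleted:
    ...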

Where it makes no sense for a child object to exist independently of its parent, it might instead be better to set the cascade on the parent relationship attribute to include the 'delete-orphan' directive. The child object is then automatically deleted from the session once it is disassociated from the parent, allowing for immediate truthiness testing of the parent attribute, and the same semantics upon rollback (i.e., child object restored). A child relationship on the parent that includes the 'delete-orphan' directive might look like this:

child = relationship("Child", uselist=False, cascade="all, delete-orphan")

and a delete sequence with truthy test, and no db commit looks like this:

child = Child()
parent.child = child

s.commit()

parent.child = None
s.flush()

if parent.child:  # obviously not executed, we just set to None!
    print("not executed")
print(f"{inspect(child).deleted = }")  # inspect(child).deleted = True

s.rollback()  # child object restored

if parent.child:
    print("executed")

Here's a somewhat longish, but fully self-contained (py3.8+) script that demonstrates the different states an object passes through, and the truthiness of the parent attribute, using both the explicit session deletion method and the implicit deletion through nulling the parent relationship and setting the 'delete-orphan' cascade:

import sqlalchemy as sa
# declarative_base lives in sqlalchemy.orm since SQLAlchemy 1.4
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

engine = sa.create_engine("sqlite:///", echo=False)

Session = sessionmaker(bind=engine)

Base = declarative_base()

class Parent(Base):
    __tablename__ = "parents"
    id = sa.Column(sa.Integer, primary_key=True, autoincrement=True)
    child = relationship("Child", uselist=False, cascade="all, delete-orphan")

class Child(Base):
    __tablename__ = "children"
    id = sa.Column(sa.Integer, primary_key=True, autoincrement=True)
    parent_id = sa.Column(sa.Integer, sa.ForeignKey("parents.id"), nullable=True)

def truthy_test(parent: Parent) -> None:
    if parent.child:
        print("parent.child tested Truthy")
    else:
        print("parent.child tested Falsy")

Base.metadata.create_all(engine)

parent = Parent()
child = Child()
parent.child = child

insp_child = sa.inspect(child)

print("***Example 1: explicit session delete***")

print("\nInstantiated Child")
print(f"{insp_child.transient = }")  # insp_child.transient = True

s = Session()
s.add(parent)

print("\nChild added to session.")
print(f"{insp_child.transient = }")  # insp_child.transient = False
print(f"{insp_child.pending = }")  # insp_child.pending = True

s.commit()

print("\nAfter commit")
print(f"{insp_child.pending = }")  # insp_child.pending = False
print(f"{insp_child.persistent = }")  # insp_child.persistent = True
truthy_test(parent)

s.delete(parent.child)
s.flush()

print("\nAfter Child deleted and flush")
print(f"{insp_child.persistent = }")  # insp_child.persistent = False
print(f"{insp_child.deleted = }")  # insp_child.deleted = True
truthy_test(parent)

s.rollback()

print("\nAfter Child deleted and rollback")
print(f"{insp_child.persistent = }")  # insp_child.persistent = True
print(f"{insp_child.deleted = }")  # insp_child.deleted = False
truthy_test(parent)

s.delete(parent.child)
s.commit()

print("\nAfter Child deleted and commit")
print(f"{insp_child.deleted = }")  # insp_child.deleted = False
print(f"{insp_child.detached = }")  # insp_child.detached = True
print(f"{insp_child.was_deleted = }")  # insp_child.was_deleted = True
truthy_test(parent)

print("\n***Example 2: implicit session delete through parent disassociation***")

child2 = Child()
parent.child = child2

s.commit()

parent.child = None  # type:ignore
s.flush()
print("\nParent.child set to None, after flush")
print(f"{sa.inspect(child2).deleted = }, if 'delete-orphan' not set, this is False")
truthy_test(parent)

s.rollback()

print("\nParent.child set to None, after flush, and rollback")
print(f"{sa.inspect(child2).deleted = }, if 'delete-orphan' not set, this is False")
truthy_test(parent)

parent.child = None  # type:ignore
s.commit()
print("\nParent.child set to None, after commit")
print(f"{sa.inspect(child2).detached = }, if 'delete-orphan' not set, this is False")
truthy_test(parent)

When should I be calling flush() on SQLAlchemy?

The ZopeTransactionExtension on the DBSession, together with pyramid_tm being active in your project, will handle all commits for you. The situations where you need to flush are:

You want to create a new object and get back the primary key.

DBSession.add(obj)
DBSession.flush()
log.info('look, my new object got primary key %d', obj.id)

You want to try to execute some SQL in a savepoint and rollback if it fails without invalidating the entire transaction.

sp = transaction.savepoint()
try:
    foo = Foo()
    foo.id = 5
    DBSession.add(foo)
    DBSession.flush()
except IntegrityError:
    log.error('something already has id 5!!')
    sp.rollback()

In all other cases involving the ORM, the transaction will be aborted for you upon exception, or committed upon success automatically by pyramid_tm. If you execute raw SQL, you will need to execute transaction.commit() yourself or mark the session as dirty via zope.sqlalchemy.mark_changed(DBSession); otherwise there is no way for the ZTE to know the session has changed.

Also you should leave expire_on_commit at the default of True unless you have a really good reason.

SQL Alchemy session.commit and flushing behavior

SQLAlchemy expires all objects in a session when the session is committed. That is to say, all the column-value attributes of a model instance are removed from its __dict__ and are reloaded from the database the next time they are accessed.

This can be prevented by passing expire_on_commit=False when creating the session; be aware that the data in instances that are not expired may be stale.
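The effect on __dict__ can be observed directly. The sketch below assumes SQLAlchemy 1.4 or later with an in-memory SQLite database; the Foo model and all variable names are hypothetical and exist only for the illustration.

```python
import sqlalchemy as sa
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Foo(Base):  # hypothetical model, for illustration only
    __tablename__ = "foo"
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String)


engine = sa.create_engine("sqlite://")
Base.metadata.create_all(engine)

# Default behaviour: attributes are expired on commit.
session = sessionmaker(bind=engine)()
a = Foo(name="A")
session.add(a)
session.commit()
name_loaded_after_commit = "name" in a.__dict__  # expired, so not present
reloaded_name = a.name  # attribute access triggers a reload from the database
name_loaded_after_access = "name" in a.__dict__
session.close()

# With expire_on_commit=False the loaded values stay on the instance.
keep_session = sessionmaker(bind=engine, expire_on_commit=False)()
b = Foo(name="B")
keep_session.add(b)
keep_session.commit()
name_kept_after_commit = "name" in b.__dict__
keep_session.close()

print(name_loaded_after_commit, name_loaded_after_access, name_kept_after_commit)
```

The reload on attribute access is what keeps expired instances current; skipping expiry trades that freshness for fewer SELECT statements.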

sqlalchemy flush() and get inserted id?

Your sample code should have worked as it is. SQLAlchemy should be providing a value for f.id, assuming it's an autogenerating primary-key column. Primary-key attributes are populated immediately within the flush() process as they are generated, and no call to commit() should be required. So the answer here lies in one or more of the following:

The details of your mapping
If there are any odd quirks of the backend in use (such as, SQLite doesn't generate integer values for a composite primary key)
What the emitted SQL says when you turn on echo
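A quick way to check the expected behaviour in isolation is a sketch like the following, assuming SQLAlchemy 1.4 or later, an in-memory SQLite database, and a simple single-column integer primary key. The Foo model here is a hypothetical stand-in for the asker's mapping; set echo=True to see the emitted SQL.

```python
import sqlalchemy as sa
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Foo(Base):  # hypothetical stand-in for the asker's mapped class
    __tablename__ = "foo"
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String)


engine = sa.create_engine("sqlite://", echo=False)  # echo=True logs the SQL
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

f = Foo(name="example")
session.add(f)
id_before_flush = f.id  # still None: nothing has been sent to the database

session.flush()  # emits the INSERT and fetches the generated key
id_after_flush = f.id

print(id_before_flush, id_after_flush)

session.rollback()  # no commit() was needed to obtain the key
session.close()
```

If the key is still None after flush() in your own code, the mapping details or the backend quirks listed above are the likely places to look.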

SqlAlchemy Insert Rollback without logged error

You are using the context manager wrong, or using the wrong context manager.

Engine.connect() requires a manual commit.

with engine.connect() as con:
    con.execute(...)
    con.commit()

Engine.begin() will commit for you on successful exit.

with engine.begin() as con:
    con.execute(...)

See the tutorial: Working with Transactions and the DBAPI
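The difference between the two context managers can be sketched as follows, assuming SQLAlchemy 1.4 or later in 2.0-style mode (hence future=True) and an in-memory SQLite database. The items table and the variable names are made up for the example.

```python
import sqlalchemy as sa

# future=True selects 2.0-style behaviour on SQLAlchemy 1.4; it is the only
# mode available in 2.0.
engine = sa.create_engine("sqlite://", future=True)

with engine.begin() as con:
    con.execute(sa.text("CREATE TABLE items (name TEXT)"))

# Engine.connect(): nothing is committed unless con.commit() is called.
with engine.connect() as con:
    con.execute(sa.text("INSERT INTO items (name) VALUES ('lost')"))
    # no con.commit() here, so the INSERT is rolled back on exit

with engine.connect() as con:
    count_without_commit = con.execute(
        sa.text("SELECT count(*) FROM items")
    ).scalar()

# Engine.begin(): commits automatically if the block exits without error.
with engine.begin() as con:
    con.execute(sa.text("INSERT INTO items (name) VALUES ('kept')"))

with engine.connect() as con:
    count_after_begin = con.execute(
        sa.text("SELECT count(*) FROM items")
    ).scalar()

print(count_without_commit, count_after_begin)
```

The silent rollback on leaving the connect() block is why a missing commit produces no logged error.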
