Why SQLAlchemy should no longer be your ORM of choice for Python projects

Eshwaran Venkat
10 min readJan 20, 2023

--

Photo by Jan Antonin Kolar on Unsplash

SQLAlchemy is a powerful tool to interface with a backend database, irrespective of whether the goal is to build a frontend web application, administrate a database “as-code”, or something similar. SQLAlchemy has both an ORM (Object Relational Mapper) and a core library and is used heavily for python backends to I/O a SQL database. As an ORM, it is also database agnostic meaning that a database can be migrated (for example) from a mySQL backend to a PostgreSQL one without changing the SQLAlchemy models on the app. On the flip side, it provides database-specific dialects to use the full power of a specific SQL database (such as JSONBin Postgres).

ORMs have been around for a while and they’re great for building basic CRUD applications without writing raw SQL. They’re also the number one choice to avoid SQL injection vulnerabilities. Most applications will not require complex SQL at all — as you’ll mostly be fetching some user data, session data, or the like which can be accomplished with some simple select and filter statements which most ORMs almost always come pre-packaged with, and work pretty seamlessly.

Popular Python ORMs include:

  • SQLAlchemy: The oldest and most proven ORM for Python. Note that it is a database toolkit and the ORM is a feature.
  • SQLModel: Built on top of SQLAlchemy & Pydantic. Functionality is largely overlapping with SQLAlchemy and its main purpose is to support the FastAPI project as an ORM.
  • Django ORM: ORM for Django projects, but can be extended for generic Python projects as well.

ORMs are considered both necessary by some developers and as an anti-pattern by others. A quote from this StackOverflow post goes:

In many people’s eyes, SQL is ugly, complex, and confusing. But trying to make an object-oriented interface to perform the same functionality is always uglier, more complex, and has a steeper learning curve.

Of course, there are productivity gains but you don’t see a third dimension of the “learning curve” mentioned in the quote above to the figure below.

A 2D Coordinate System with Control & Productivity from a SQL database as X and Y axes. ORMs are right in the middle of the graph, and SQL is to the bottom right.
Source: Prisma

That is not to say that there aren’t upsides to using ORMs either, which is also something discernible from this figure.

  • Engineers deal with a lot of data models on a day-to-day basis, and the shape of the data is prone to change; sometimes often — especially for projects in development or projects that scale up quickly. Managing these in SQL is not ideal, to say the least, and using a GUI SQL Client (like pgAdmin, mySQLWorkbench, DBeaver, etc) makes the process highly manual.
  • DDL (Data Definition Language) of SQL is great but unproductive. Having the data schema “as-code” is a lot easier to track and administer changes instead of “as-SQL”.

The Case Against SQLAlchemy

No native support for database change management. The syntax is complex compared to modern ORMs. Nitpicks: A very ancient-looking official documentation, and not the best support for introspection.

Database Change Management

A database migration (which may sound like you’re migrating the whole database or moving or backing up your data, but instead) means changing the schema of your data models. For example:

  • Adding or removing a column from a table
  • Renaming a column or changing its type definition
  • Adding or removing a table, and so on.
Source: Bytebase

This is topically called Database Change Management (DCM) and Bytebase has a great article detailing the idea. It is honestly a rookie mistake to think that a database schema for any kind of application will remain unchanged over time, and happens more frequently than you may initially assume. This is because a data model is subject to morphing based on where the perceived value of the contained data truly lies. This means that a data model using an ORM is not as straightforward as declaring the models on file in the app and calling it a day.

SQLAlchemy does not support DCM out of the box. If you declare models and push them to your database to reflect the schema, the database is no longer prone to changes to existing models and only detects new tables. If you want to add a column to the SQLAlchemy models file and expect those changes to take place in the database again, not going to happen. Admins will pretty much have to use raw SQL ALTER statements, manually plugging changes for all edits to those model tables in the future. Take the following snippet for example:

sample models.py of an app using sqlalchemy for a user table. Image by Author.

If the model has been reflected in the database already, and now you want to add an attribute called phone_number to the User model, then adding phone_number to this class in the models.py file will not reflect that column in the database.

This is where you, the reader may say “Hey, there’s always the Alembic tool to handle migrations”, and you would be correct. Alembic is a tool created by the Author of SQLAlchemy that overcomes this limitation and allows SQLAlchemy migrations. Problem solved? Not quite. In my opinion, Alembic is probably not the best way that DCM can be solved at least in terms of developer productivity.

Here’s an example of an alembic script to create a migration to add phone_number to the user table:

Alembic script. Image by Author

To use this script, you will need to have Alembic and SQLAlchemy installed. You will also need to have a .ini configuration file and a env.py script, which define the connection to the database and provide necessary context.

To run the script, you can use the alembic upgrade head command, which will apply the changes defined in the upgrade function. To revert the changes, you can use the alembic downgrade base command, which will run the downgrade function.

Seems a bit much, but gets the job done. One cool thing about Alembic is the time-travel aspect like git where you can rollback as well. Adding these scripts each time a change is made just seems tedious, especially for multiple alterations. However, I do acknowledge that alembic can be exactly what you may be looking for.

Migrations in the Django ORM are similar, with some precise code definitions to get you on your way. The alembic migrations can be made slightly better by utilizing Flask-Migrate , which can deploy migrations easier but requires a flask declaration of the app (even if the app doesn’t need to use flask).

Syntax & Docs

Okay, let’s take a look at this file again. You may be indifferent to the declarations here citing that “it doesn’t get simpler than this”, or you feel like there are parts of this file syntactically that don’t need to exist explicitly.

Image by Author

The SQLModel project does a good job of reducing the explicit syntax by using type annotations with pydantic’s BaseModel as follows: (excluding the __repr__)

Sqlmodel equivalent of user table. Image by Author

And this might honestly be good enough for most folks. A combination of SQLModel and Alembic can get your DataOps and Backend development off the ground. SQLModel has a great user guide (though no API documentation since it overlaps with pydantic and SQLAlchemy), and looks better than SQLAlchemy’s docs.

Introspection

The final feature we’re looking at is introspection. Introspection is like “crawling” a database. The program examines the schemas and generates a file, or set of files to reflect the data modeling as objects. Most GUI SQL clients use introspection features. This is especially useful for ORM use cases where a schema already exists. SQLAlchemy has either some good in-built introspection features or some add-on libraries that can be configured to deliver decent end-to-end introspection.

  • SQLAlchemy has Automap functionality which tries to map relational objects to object-oriented models. Scripts still have to be developed around the building blocks SQLAlchemy provides, which is not too bad.
  • sqlacodegen is an open-source project that can introspect a database and create SQLAlchemy Table models. It also has support for sqlmodel but this feature is not an official stable release and has to be used from the master branch (as of Jan 2023).

Conclusion?

Perhaps this whole article is nit-picky but the ORM selection for python consistently seems to be in a state of “almost but not quite”, and most of the places where these apparent holes exist are really in the developer productivity area. In 2023, an age where tools like ChatGPT exist — it just seems unacceptable to me that dealing with SQL databases needs to involve even close to the iota of effort that any of these tools seem to warrant. Of course, this may seem like very much a “first world” or “entitlement” problem and that might very well be true.

Speaking purely from a database administration point of view, tools like ByteBase and DBT seem to do a good job. Google cloud also has a document detailing DCM for production databases by using CI/CD which reduces downtime. Documentation of these schemas also seems to be made quite easy with tools like dbdocs, which certainly seems crucial for managing tables and relations as they scale for SQL Databases (I’m not sponsored or associated with these companies). But speaking from managing a database for small projects or for big projects handled by small teams when using the open-source options: things get quite tedious indeed.

Prisma & Prisma Python Client

In my hunt for ORMs and Dataops tools, the tools highlighted in the article thus far always appeared to be the recommendations. Even online articles and publications favour the de-facto standard that is apparently being introduced by these libraries in the Python ecosystem.

So what surprised me was when a couple of recommendations came through to try Prisma, from the folks over in the Typescript / Javascript ecosystem. I’m not sponsored or associated with Prisma, but they do seem to get a solution closest to the ideal ORM in my opinion. I do want to highlight that the following features address the “almost but not quite” problem in the previous section. This means that some of these features can feel like marginal improvements, but that’s the idea.

  • The data models are syntactically simple but convey necessary information concisely in a near-JSON-like format.
  • Prisma documentation is fantastic and has both user guides as well as API references. It also has a neat set of DIYs in terms of ORMs.
  • Database migrations and DCM are easy as changing the data models in-file and running a migrate CLI command.
  • Introspection is straightforward and has standard syntax instead of separating Table from Class like SQLAlchemy.
  • An easy-to-use query builder for the ORM.
  • Support for NoSQL document databases like MongoDB, which is more of a nice-to-have from a SQL PoV.
Prisma’s data model for the User table. Image by Author.

One thing that stands out here is that Prisma does not use object models (like classes of the ORM) and instead uses a standard schema. This lends to Prisma being less affected by the anti-patterns introduced in using object mappings on relations that the Python ORMs suffer from.

Adding this model to the database as simple as:

prisma migrate dev --name init

Adding a new phone number field would involve adding it to the prisma file and then running the following command:

prisma migrate dev --name added_phone_number

Introspecting an existing schema on the database would be:

prisma db pull

Which would load the entire database schema into a file as prisma data models.

But what about the fact that this is a TypeScript ORM I hear you say? Well, that’s where the Prisma Python Client comes in. Simply pip install prisma and include the following in the Prisma schema file:

You can now query the tables using the Prisma ORM which is similar to the query generators on SQLAlchemy or SQLModel.

Prisma ORM Python file. Derived from example in Prisma Python Client. Image by Author.

Naturally, you can learn about a lot more you can do with Prisma in their docs.

Complex SQL Queries

A point raised about ORMs is how they’re not the tools for complex SQL queries. This is 100% correct, as a complex SQL query is never far away, even as a developer. The best way to manage this issue while still making use of an ORM would probably be to create a VIEW or MATERIALIZED VIEW which would serve as a table that does the hard SQL work from where an ORM can query the data easily. Think of this view as reshaping the data or normalizing it using specific SQL operations, and then using an ORM to utilize simple select or filter queries to extract data for the application.

Conclusion.

I’ve been using Prisma for a while and it really overcomes the shortcomings of ORMs in the standard python ecosystem. The idea is naturally to use it in projects of different shapes and sizes to see how it pans out there too. The simplicity in developer UX appears to be worth it, for now at least.

--

--