We migrated to SQL. Our biggest learning? Don’t use Prisma
Last week, we completed a migration that switched our underlying database from MongoDB to Postgres. We faced a lot of interesting challenges, and the biggest one of them was a two-time rewrite of the codebase.
This article is for developers considering using Prisma for their production workloads – don’t.
We wrote our backend in Prisma first
One of the biggest mistakes we did was trusting Prisma marketing blindly. Look at this beautiful landing page:
One of the good things about Prisma (which we still use) is the ability to create the schema in containing all relations, defaults, and data types. The syntax is very clear.
But that’s pretty much where the benefits end. We rewrote our first version of the backend converting the full codebase from Mongoose ORM to Prisma. It looked great!
Problems with deployment
At codedamn, we use AWS lambdas for hosting our GraphQL-powered backend. The first problem we saw with Prisma, which nobody noticed earlier is that Prisma is shipping multiple 10-20MB “engines”.
It was a shock finding out that Prisma needs almost a “db” engine layer of its own. Read more about it here: https://www.prisma.io/docs/concepts/components/prisma-engines
Our AWS lambda deployments were failing because of exceeded 50MB size limit. We use ESBuild to build the backend TypeScript, and had to write multiple patches in the build pipeline to remove extra engines, extra prisma files and node_modules and bring the deployment size under 50MB.
Still, in the end, we had to ship a 12-13MB query engine that Prisma absolutely needed.
Built for low performance
Our second mistake was that we turned on the “emulated foreign key relation” mode in Prisma. Initially, we wanted to use PlanetScale for our database, but it turns out that it wasn’t the best choice for us either (but that’s for another blog post).
Prisma, generally, and emulating foreign key relations in Prisma is a performance nightmare. Here are few things we observed when we tried test migration of our data from staging MongoDB to staging PlanetScale via adapters written in Prisma:
- Every new insert via Prisma opened a database-level transaction (?). An extremely weird design choice by Prisma. We often would exhaust our transaction pool on Vitess on PlanetScale, and this is the error we will end up 10-20 seconds in migration: https://vitess.io/docs/16.0/user-guides/configuration-basic/troubleshooting/#transaction-connection-limit-errors
- We were not dumping insanely in the SQL database. We would hit the above error doing maybe 50-100 parallel insert calls in DB. Some of our collections had records well over 20M+. We couldn’t wait for days to migrate since downtime was involved here.
- There is no concept of SQL-level joins in Prisma. This was one of the most shocking revelations to us. In some queries we inspected that supposedly should have used SQL Joins or subqueries, we discovered that at a low level, Prisma was fetching data from both tables and then combining the result in its “Rust” engine. This was a path for an absolute trash performance.
- Once we figured this out, we were extremely doubtful about how Prisma actually “emulates” the foreign key constraints. Does it perform cascading select operations on all associated tables on all CUD operations? Seems like the only way. And since we saw that every insert uses a transaction that has a limited pool size, we were sure it would be a non-negotiable performance penalty for us on production workloads.
- On every insert, Prisma opens a transaction where it inserts and then returns the inserted record too, even if we do not want it.
SQL result “patching” inside its own engine was a deal breaker for us. Maybe it somehow optimized it further too, but there is no way you are outperforming the database in its own game, even if you ship a 15MB Rust engine to do so.
Further inspecting, we discovered that your code never makes the actual DB call. Your Prisma code performs a GraphQL network request (?) to the Prisma Rust query engine, which then translates your request into actual DB calls.
Our PlanetScale database was able to pass our stress test of inserting/reading multiple records. It, however failed when we used Prisma because of the limited transaction pool inside Vitess which was getting exhausted very quickly and after that all further calls to PlanetScale DB will fail for a few minutes.
However, passing on PlanetScale was because of two reasons:
- No foreign key support: This is something we overlooked initially, but as we worked more and more on the migration and coming from “relational” data in MongoDB, we figured out that it is a huge safety net of not making data corrupt. We had a lot of data corruption in MongoDB just because we did not respect relations among collections.
- Weird pricing: PlanetScale prices you on row-reads and row-writes per month. What is weird is that row-reads is something that nobody controls. It is the SQL query planner which determines the query plan, which results in how many rows you’re reading internally. Remember that row-reads are not row returns. You can write a wrong query that returns 0 rows but can still do a full table scan of 1M rows and PlanetScale will charge you for it.
- No VPC Peering: MongoDB Atlas supported VPC peering on dedicated clusters, which avoided data flow over the internet. PlanetScale supported it only on Enterprise plans.
Prisma + PlanetScale: Built for each other, but not for production workloads
PlanetScale is fine when used independently. But Prisma might lure you in with its DX, additional features like Accelerate, and tooling.
If you love performance even one bit on heavy production systems, avoid using Prisma.
Prisma’s website has a page called “Why Prisma“. It says 3 things:
- Raw SQL: Full control, low productivity
- SQL query builders: High control, medium productivity
- ORMs: Less control, better productivity
Here’s the actual and truth version of it:
- Raw SQL: Full control, low productivity, insane performance
- SQL query builders: High control, medium productivity, insane performance
- (Prisma) ORMs: Less control, slightly better productivity, and syntax, bonkers performance
Our current stack
After evaluating multiple options, here is what we ended up with:
- Schema is written in Prisma
- Using Prisma migration tooling for applying migrations on DB
- AWS Aurora Serverless v2 Postgres
- AWS lambda linked to AWS Aurora under private VPC (no internet traffic flow)
- Kysely for SQL query builder
Kysely is not feature-complete yet. But it has escape hatches that still allow us to keep code typesafe even if we want to write raw SQL at times.
Sharing is caring
Did you like what Mehul Mohan wrote? Thank them for their work by sharing it on social media.
No comments so far
Leave a question/feedback and someone will get back to you