⛁Databases

Schemas that hold. Under millions of rows.

PostgreSQL, MongoDB, Redis, ClickHouse, Pinecone. Rigorous modeling, smart indexing, query optimisation, replication, sharding when needed. From initial schema to operational maintenance.

Start a data project → See our method

PostgreSQL · MongoDB · Redis · ClickHouse
Pinecone · Qdrant · pgvector (RAG)
Indexing · query tuning · sharding
p99 < 50ms · automated backups

The context

A badly designed DB is paid in years of debt.

In 2026, SQL is more alive than ever. PostgreSQL 17 supports native vector (pgvector), automatic partitioning, mature logical replication. But 80% of the startups we audit have the same problems: non-normalised schemas, missing indexes on columns filtered 10M times/day, broken foreign keys, migrations without rollback, N+1 queries eating 60% of CPU.

The trap: starting with « we'll see later ». At 100 rows, everything works. At 1M rows, a query takes 8 seconds. At 100M, the service crashes under load. Cost of re-modeling at this stage: 3-6 months of product freeze, €50-200k of effort, and production migration risks.

Our conviction: a well-designed schema from the start is worth two years of avoided refactoring. We model based on business invariants (not screens), index based on actual read patterns, choose between normalised and denormalised knowing the trade-off. And we prepare evolution: sharding, partitioning, read replicas — known from day 1, activated when relevant.

< 50ms

p99 latency

On critical queries after tuning and indexing

100×

Read speed

Typical gain after adding appropriate indexes

Broken migration

All our migrations pass rollback test before deployment

1M+

Sustained QPS

Tested architecture — read replicas + cache + partitioning

What we build

Six data project families.

From classic relational DB to vector search for RAG, we pick the right engine for the right problem — never the opposite.

PostgreSQL 17

PostgreSQL modeling

Core of 80% of our projects.

Rigorous relational schema, 3NF normalization adjusted to needs, FK + check constraints, B-tree + partial + GIN indexing based on patterns. Versioned migrations (Prisma, Drizzle, Knex, Alembic). Partitioning when relevant.

MongoDB 8

MongoDB document store

When the schema is flexible.

Apps with rich and hierarchical objects (commerce products, content management). MongoDB 8 with time series collections, change streams, lookup. Compound + multikey indexing strategy, sharding planned from the start.

Redis 7+

Redis cache & queues

Performance and async.

Redis for application cache (session objects, expensive queries), real-time pub/sub, BullMQ for queued jobs. Eviction strategies (LRU, LFU), fine TTLs, replica with sentinel for HA.

ClickHouse

ClickHouse analytics

When 10M rows aren't enough.

For analytical dashboards, event tracking, structured logs. Columnar ClickHouse + materialized views + ReplacingMergeTree. p99 queries < 100ms on billions of rows. 10× cheaper storage than PG on this case.

Pinecone · Qdrant · pgvector

Vector DB for RAG

AI needs to retrieve.

Vector indexing for conversational assistants and semantic search. Pinecone (managed), Qdrant (self-hosted), pgvector (PG-integrated). Embeddings, hybrid search (semantic + BM25), reranking for final precision.

All engines

Audit & query optimisation

Diagnosis of a suffering DB.

Full audit: EXPLAIN ANALYZE on slow queries, missing index identification, N+1 detection, dead tuple accumulation. Actionable report: top 20 wins by gain/effort ratio, implementation if you want.

Our approach

Four steps, from business model to monitoring.

We start by understanding your domain, not drawing tables. Structure follows need — not the opposite.

01 PK Domain modeling (1 wk)

Product workshop to understand business invariants. Identification of entities (users, resources, events), relationships (1-1, 1-N, N-N), consistency rules. Relational vs document choice based on data nature.

⟹ ERD + business glossary + Architecture Decision Records

02 PK Schema design (1-2 wks)

Final DB schema with constraints, strict types, FK, checks. Argued indexing strategy based on anticipated read patterns. Versioned migrations with tested rollback. Evolution strategy (sharding, partitioning, read replicas) documented.

⟹ SQL/JSON Schema + migrations + evolution strategy

03 PK Implementation & seeding (2-4 wks)

Schema implementation on PostgreSQL/MongoDB. Realistic data generation (Faker) for tests. Repository pattern with ORM (Prisma, Drizzle, SQLAlchemy) or native SQL if critical. Integration tests on data layer.

⟹ Populated DB + repository code + tests + ops doc

04 PK Backup & monitoring (1 wk)

Backup strategy (full + WAL streaming for PG, daily snapshot). Restoration tests on copy. Monitoring: pg_stat_statements, slow query log, Grafana dashboards (latency, throughput, locks). Slack alerting on critical thresholds.

⟹ Tested backups + dashboards + alerting + ops runbook

Tech stack

The engines we actually use.

No fashion. Each engine has its sweet spot. Here's how we choose based on real problem.

Relational

PostgreSQL 17 · MySQL 8 · CockroachDB · TiDB

PostgreSQL by default (mature, JSON, vector, extensions). MySQL if already in place. CockroachDB for multi-region serializable. TiDB for HTAP.

Document & NoSQL

MongoDB · Firestore · DynamoDB

MongoDB when schema is flexible and evolving. Firestore for real-time mobile/web. DynamoDB when fully AWS and accepting constraints.

Cache & queues

Redis · Memcached · BullMQ · KeyDB

Redis for cache + queue + pub/sub. Memcached for pure distributed cache. BullMQ for Node job orchestration. KeyDB for Redis drop-in with multi-thread.

Columnar analytics

ClickHouse · DuckDB · BigQuery · Snowflake

ClickHouse for self-hosted real-time analytics. DuckDB embedded in app. BigQuery when on GCP. Snowflake for massive enterprise needs.

Vector DB (RAG)

Pinecone · Qdrant · Weaviate · pgvector · Turbopuffer

pgvector when already PG (native integration). Pinecone (managed) to start fast. Qdrant self-hosted for sovereignty. Turbopuffer for massive cost.

Migration & ORM

Prisma · Drizzle · Knex · TypeORM · Alembic · SQLAlchemy

Prisma or Drizzle for TS (excellent DX). Alembic + SQLAlchemy for Python. Knex in pure SQL builder. TypeORM only on legacy projects.

Measurable guarantees

Four contractual commitments.

p99 < 50ms

Query latency

On critical queries. Measured after tuning and indexing, validated under real load.

Broken migration

Each migration tested with rollback before deployment. No Sunday-night surprises.

RPO < 1h

Automated backup

Recovery Point Objective: you never lose more than 1h of data. Backups tested monthly.

100%

Documented indexes

Each index documented with its use case. No « just-in-case » indexes slowing writes.

التسعير

كل مشروع فريد. والعرض كذلك.

بدلاً من باقات مجردة، نقوم بالتأطير وفقاً لسياقك: النطاق، التعقيد، المواعيد النهائية، القيود. اكتب لنا في 3 جمل ما تريد القيام به — نعود إليك بعرض ثابت خلال 48 ساعة عمل.

الرد خلال 48 ساعة عمل طلب عرض سعر →