Reference

Glossary

Plain-English definitions of the system-design terms used throughout these docs and inside Skeema. Skim it once; come back when a term trips you up.

Performance & reliability

Latency: The time to handle one request, end to end. Measured in milliseconds.
Throughput: How many requests are handled per second — RPS (requests/sec) or QPS (queries/sec).
Percentile (P50 / P90 / P95 / P99): A point in the latency distribution. P99 = 1.7s means 99% of requests finish under 1.7s; the slowest 1% (the “tail”) take longer. Teams set targets on P95/P99 because averages hide slow users.
Tail latency: The slow end of the distribution (P95–P99+). At scale, the tail is what users complain about.
SLA / SLO / SLI: An SLI is a measured signal (e.g. P99 latency); an SLO is your internal target for it; an SLA is the externally-promised guarantee, usually with penalties.
Bottleneck: The component that saturates first as load rises and limits the whole system. The “weakest link.”
Critical path: The chain of synchronous calls whose latencies add up to the user-facing response time.
Fan-out: When one request triggers many downstream calls. High fan-out amplifies tail latency.
SPOF (single point of failure): A component with no redundancy whose failure takes down the system. Removed by adding redundancy, typically replication plus load balancing or failover.
Idempotency: The property of an operation that can be retried safely, because repeating it produces the same result as running it once. Essential for reliable async and payment systems.
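
To make the percentile and fan-out entries concrete, here is a minimal sketch. The sample latencies are invented for illustration, `percentile` uses the simple nearest-rank method (monitoring tools may interpolate differently), and `chance_of_slow_request` assumes downstream calls are independent, which real systems only approximate.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

def chance_of_slow_request(fan_out, p=0.99):
    """Probability that at least one of `fan_out` independent calls is slower than its own P(p)."""
    return 1 - p ** fan_out

# Hypothetical latencies in milliseconds: 98 fast requests and 2 slow ones.
latencies_ms = [100] * 98 + [1700] * 2

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)

print(f"mean={mean_ms} ms, P50={p50} ms, P99={p99} ms")
print(f"chance a 100-way fan-out hits a P99-slow call: {chance_of_slow_request(100):.2f}")
```

In this made-up data the mean stays close to the fast requests while P99 exposes the 1.7s tail, which is why targets are set on percentiles. The second figure shows how a request that waits on many downstream calls is far more likely to hit at least one slow one.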

Vertical scaling (scale up): Bigger machine, with more CPU/RAM. Simple, but capped by the largest machine available, and still a single box.
Horizontal scaling (scale out): More machines behind a load balancer. Requires stateless services.
Stateless service: Any instance can serve any request because no per-user state is stored locally. The prerequisite for horizontal scaling.
Load balancer: Distributes requests across healthy instances of a service.
Cache: A fast store (e.g. Redis) holding results of expensive work so repeat reads are near-instant. Cache-aside: the app reads the cache first, falls back to the DB on a miss, then writes the result back to the cache. TTL (time to live): entries expire after a set time to limit staleness.
Read replica: A read-only copy of a database that serves reads, offloading the primary. Best when reads ≫ writes; replicas can lag slightly behind the primary.
Partitioning: Splitting one table by a key (e.g. date) so queries scan less data.
Sharding: Splitting data across multiple databases by a shard key (e.g. user_id). Powerful but complex; cross-shard queries are hard.
CDN: A content delivery network caches assets at edge locations near users, cutting latency and origin egress.
Egress: Data transferred out of the cloud to the internet — a frequently underestimated cost driver.
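
The cache-aside pattern with a TTL can be sketched as follows. This is an illustration only: `TTLCache`, `UserStore`, and `get_user` are hypothetical names, the dictionary is an in-memory stand-in for a real cache such as Redis, and the injectable clock exists just to make expiry easy to demonstrate.

```python
import time

class TTLCache:
    """In-memory stand-in for a cache; each entry expires ttl_seconds after it is set."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (value, self._clock() + self._ttl)

class UserStore:
    """Hypothetical database wrapper that counts how often it is actually queried."""

    def __init__(self):
        self.reads = 0

    def load(self, user_id):
        self.reads += 1
        return {"id": user_id, "name": f"user-{user_id}"}

def get_user(cache, store, user_id):
    """Cache-aside read: try the cache, fall back to the store on a miss, then populate the cache."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    user = store.load(user_id)
    cache.set(key, user)
    return user

cache, store = TTLCache(ttl_seconds=60), UserStore()
get_user(cache, store, 7)  # miss: reads the store
get_user(cache, store, 7)  # hit: served from the cache
print(f"store reads after two lookups: {store.reads}")
```

A short TTL keeps data fresher at the cost of more misses; a long TTL does the opposite. The right value depends on how stale a read the feature can tolerate.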

Synchronous call: The caller blocks until it gets a response; its latency adds to the request path.
Asynchronous (async): The caller publishes work and continues without waiting — decoupling services and smoothing spikes.
Queue / message broker: Infrastructure (Kafka, SQS, RabbitMQ) that buffers async messages between producers and consumers.
Event-driven: Services react to events published on a bus rather than calling each other directly.
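
The async and queue entries above, combined with idempotency, can be sketched as a producer and a consumer. This is a simplified illustration: Python's in-process `queue.Queue` stands in for a real broker, and `publish`, `drain`, and the message IDs are hypothetical. Many brokers can deliver a message more than once, so the consumer here skips IDs it has already applied.

```python
import queue

def publish(broker, message_id, payload):
    """Producer: enqueue the work and return immediately, without waiting for a consumer."""
    broker.put({"id": message_id, "payload": payload})

def drain(broker, balances, processed_ids):
    """Consumer: apply each queued message once, skipping IDs that were already handled."""
    while True:
        try:
            message = broker.get_nowait()
        except queue.Empty:
            return
        if message["id"] in processed_ids:
            continue  # duplicate delivery: applying it again would double-count
        account, amount = message["payload"]
        balances[account] = balances.get(account, 0) + amount
        processed_ids.add(message["id"])

broker = queue.Queue()
publish(broker, "msg-1", ("alice", 50))
publish(broker, "msg-1", ("alice", 50))  # simulated redelivery of the same message
publish(broker, "msg-2", ("alice", 25))

balances, processed_ids = {}, set()
drain(broker, balances, processed_ids)
print(balances)
```

In a real system the set of processed IDs would need to live in durable storage shared by all consumers; keeping it in memory is only for the sake of the example.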

Entity: A thing you store — becomes a table. Its attributes become columns.
Primary key (PK): The column that uniquely identifies each row.
Foreign key (FK): A column referencing another table’s primary key, forming a relationship.
Cardinality: The kind of relationship: one-to-one, one-to-many, or many-to-many.
Junction table: A join table holding two foreign keys to represent a many-to-many relationship.
Normalization: Organizing columns so each fact lives in one place (1NF→2NF→3NF), preventing update anomalies.
Index: A lookup structure that turns a slow full-table scan into a fast lookup. Index your FKs and filter columns, keeping in mind that each index adds some write and storage overhead.
Enum: A column constrained to a fixed set of values (e.g. status: PENDING/ACTIVE/CLOSED).
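
Several of the data-modeling terms above can be seen together in one small schema. The sketch below uses Python's built-in `sqlite3` module; the `student`, `course`, and `enrollment` tables are hypothetical examples, and the enum is approximated with a `CHECK` constraint because SQLite has no native enum type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when this is enabled

conn.executescript("""
CREATE TABLE student (
    id     INTEGER PRIMARY KEY,              -- primary key
    name   TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'PENDING'   -- enum-like column
           CHECK (status IN ('PENDING', 'ACTIVE', 'CLOSED'))
);
CREATE TABLE course (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
-- Junction table: two foreign keys model the many-to-many relationship.
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL REFERENCES student(id),
    course_id  INTEGER NOT NULL REFERENCES course(id),
    PRIMARY KEY (student_id, course_id)
);
-- The composite PK already covers lookups by student_id; index the other FK too.
CREATE INDEX idx_enrollment_course ON enrollment(course_id);
""")

conn.execute("INSERT INTO student (id, name, status) VALUES (1, 'Ada', 'ACTIVE')")
conn.executemany("INSERT INTO course (id, title) VALUES (?, ?)",
                 [(10, "Databases"), (11, "Networks")])
conn.executemany("INSERT INTO enrollment (student_id, course_id) VALUES (?, ?)",
                 [(1, 10), (1, 11)])

# Follow the relationship through the junction table.
titles = [row[0] for row in conn.execute(
    """SELECT c.title FROM course c
       JOIN enrollment e ON e.course_id = c.id
       WHERE e.student_id = ? ORDER BY c.title""", (1,))]
print(titles)
```

Each fact lives in one place here: a course title is stored once in `course`, and enrollments refer to it by key, so renaming a course is a single-row update.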

Derivation: A board generated from a node — an ER schema, sequence flow, or code — that stays linked to its source.
Lineage: The recorded link between a source node and its derived boards, used to detect drift.
Project: A root architecture plus all its derived views and flows, documented together.
Dagre: The directed-graph layout library Skeema uses to position nodes automatically (two-pass for grouped diagrams).