Elasticsearch vs Time-Series DB: Key Differences Explained
Should you store logs in Elasticsearch or InfluxDB? We compare Search Engines vs. Time-Series Databases.
TLDR: Elasticsearch is built for search — full-text log queries, fuzzy matching, and relevance ranking via an inverted index. InfluxDB and Prometheus are built for metrics — numeric time series with aggressive compression. Picking the wrong one wastes 10× the storage or makes queries 100× slower.
📖 Logs vs Metrics: Two Different Storage Problems
A log is a sentence: 2024-01-15 ERROR: failed to connect to database host=db1.
A metric is a number at a timestamp: cpu.usage{host=web1} = 87.3 @ 1705312800.
These look similar (both are time-ordered data) but demand fundamentally different storage strategies:
| Property | Log data | Metric data |
| --- | --- | --- |
| Structure | Semi-structured text | Strictly typed numbers |
| Query pattern | Full-text search, grep, aggregation | Range queries, rate calculations, aggregation |
| Cardinality | Unbounded keys | Bounded label/tag sets |
| Update frequency | Write-once streams | Regular intervals (every 15s) |
| Retention | Days to months (expensive) | Months to years (cheap with downsampling) |
🔢 Elasticsearch: The Inverted Index for Full-Text Search
Elasticsearch is built on Apache Lucene. Its core data structure is the inverted index: a map from every term (word) to the list of documents that contain it.
```
"failed"     → [doc_3, doc_7, doc_12]
"database"   → [doc_3, doc_9]
"connection" → [doc_7, doc_12, doc_20]
```
This lets Elasticsearch answer "find all logs containing 'database' AND 'connection'" in milliseconds, even across billions of log lines.
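The posting lists above can be sketched in a few lines of Python. This is a toy model of what Lucene does at scale — the log lines are hypothetical, invented to reproduce the posting lists shown above:

```python
from collections import defaultdict

# Hypothetical log lines standing in for indexed documents.
docs = {
    3: "failed to reach database",
    7: "failed connection reset",
    9: "database backup complete",
    12: "failed connection timeout",
    20: "connection established",
}

# Build the inverted index: term -> set of doc IDs containing that term.
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

# An AND query is just an intersection of posting lists.
hits = index["failed"] & index["connection"]
print(sorted(hits))  # → [7, 12]
```

Real Lucene adds tokenization, stemming, and compressed on-disk postings, but the query-time operation is the same set intersection.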
Strengths:
- Full-text search with stemming, fuzzy matching, synonyms
- Relevance ranking (BM25)
- Aggregation pipelines (histograms, top-N, date histograms)
- Schema flexibility (dynamic mappings)
Weaknesses:
- High storage overhead — inverted index per field duplicates data
- Poor at range math on numeric series (no delta encoding)
- High cardinality is expensive: each unique label value adds index memory
⚙️ Time-Series DBs: Delta Encoding and Columnar Compression
TSDBs (InfluxDB, Prometheus, TimescaleDB, VictoriaMetrics) are optimized for the fact that metric values change slowly.
Delta encoding example:

```
Raw:     100, 101, 102, 103
Encoded: 100,  +1,  +1,  +1
```
Storing deltas instead of absolute values shrinks each value dramatically: a repeated sample costs a single bit under Gorilla's bit-packing, and a slowly changing one needs only a handful of bits instead of a full 64. With additional compression (Gorilla encoding, Snappy), modern TSDBs achieve 1–2 bytes per data point versus Elasticsearch's 50–100 bytes per log document.
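The encode/decode round trip can be sketched in a few lines of Python:

```python
def delta_encode(values):
    # Keep the first value absolute; store successive differences.
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def delta_decode(deltas):
    # Rebuild the series by running-summing the deltas.
    out = [deltas[0]]
    for d in deltas[1:]:
        out.append(out[-1] + d)
    return out

raw = [100, 101, 102, 103]
enc = delta_encode(raw)
print(enc)  # → [100, 1, 1, 1]
assert delta_decode(enc) == raw
```

Production TSDBs go one step further with double-delta (delta-of-delta) encoding, which turns a perfectly regular 15-second scrape interval into a stream of zeros.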
```mermaid
flowchart LR
    Sensor[Sensor\n87.3 87.4 87.5] --> Delta[Delta Encoding\n87.3 +0.1 +0.1]
    Delta --> Compress[Gorilla XOR\nCompression]
    Compress --> TSDB[(TSDB Block\n1-2 bytes/point)]
```
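The XOR step in the pipeline above exploits the bit patterns of consecutive float samples: identical values XOR to zero, and near-identical values leave only a short run of meaningful bits. A minimal Python sketch of that observation (illustrative only, not the full Gorilla bit-packing):

```python
import struct

def float_bits(x):
    # Reinterpret a float64 as its raw 64-bit integer pattern.
    return struct.unpack(">Q", struct.pack(">d", x))[0]

values = [87.3, 87.3, 87.3, 87.4]
prev = float_bits(values[0])
for v in values[1:]:
    cur = float_bits(v)
    xor = prev ^ cur
    # A repeated sample XORs to 0 (Gorilla stores that as one bit);
    # a small change leaves mostly zero bits around a short payload.
    print(f"{v}: xor={xor:#x}")
    prev = cur
```

Gorilla then encodes the position and length of the non-zero bit run, which is how monitoring data compresses to 1–2 bytes per point.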
Strengths:
- Efficient storage (10–50× smaller than Elastic for pure metrics)
- Fast range queries and time aggregations (SUM, AVG, RATE)
- Built-in downsampling and retention policies
- Cardinality-efficient label model (Prometheus label sets)
Weaknesses:
- Poor at full-text search (no inverted index)
- Limited schema flexibility (labels must be pre-planned for cardinality control)
🌍 Which One to Use and When
| Situation | Use |
| --- | --- |
| "Find all error logs containing 'timeout'" | Elasticsearch |
| "What was the p99 latency over the last 6 hours?" | Prometheus / InfluxDB |
| "Show me all logs where user_id=12345 performed a payment" | Elasticsearch |
| "Alert when CPU > 90% for 5 minutes" | Prometheus |
| "Audit trail: who changed what and when" | Elasticsearch |
| "How many requests per second to /api/v1/order over 30 days?" | TimescaleDB / InfluxDB |
In practice: Production observability stacks often use both. The ELK stack (Elasticsearch + Logstash + Kibana) handles logs; Prometheus + Grafana handles metrics.
⚖️ Cardinality: The TSDB Killer
The biggest operational risk in TSDBs is high-cardinality labels.
Prometheus memory usage scales with the number of unique time series, which is the number of distinct label combinations: each combination is tracked as its own series. A common trap: using user_id or session_id as a Prometheus label. One million users = one million separate time series = OOM crash.
Rule: TSDBs track populations (per-service, per-host, per-endpoint). Elasticsearch searches individuals (this log, this request, this user).
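A quick back-of-envelope check makes the risk concrete. The label counts below are hypothetical, but the multiplication is exactly how series counts explode:

```python
# Each unique label combination becomes its own time series.
hosts, endpoints, statuses = 50, 30, 5
bounded = hosts * endpoints * statuses
print(bounded)  # → 7500 series: easily handled

# Adding a user_id label multiplies the series count by the user base.
user_ids = 1_000_000
exploded = bounded * user_ids
print(exploded)  # → 7500000000 series: guaranteed OOM
```

The fix is to keep individual identifiers out of labels and query them from the log store instead.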
📌 Key Takeaways
- Elasticsearch is for text search; TSDBs are for numeric time series.
- Elasticsearch uses an inverted index — fast for full-text, expensive for pure numbers.
- TSDBs use delta encoding + compression — 10–50× smaller for regular numeric streams.
- Use both in production: ELK for logs, Prometheus/Grafana for metrics.
- Watch out for high-cardinality labels in Prometheus — they cause OOM crashes.
🧩 Test Your Understanding
- Why does delta encoding work so well for CPU metric data?
- A team wants to search server logs for the phrase "failed payment." Elasticsearch or InfluxDB?
- Why is using user_id as a Prometheus label dangerous?
- What is the Gorilla encoding algorithm optimizing for?
Written by
Abstract Algorithms
@abstractalgorithms