(Practical MySQL Tuning Guide for Real-World Operations)
zabbix database performance optimization :
When running Zabbix at any real scale, the first bottleneck you’ll hit is the database.
Long before CPU or memory become an issue, InnoDB I/O and Zabbix’s internal caching layers start choking the system.
Especially in environments collecting hundreds or thousands of metrics every second, database tuning directly determines overall stability.
This post summarizes the most effective database optimization techniques tested in production on a single Zabbix Server + MySQL setup.
For more information on this topic, especially regarding partitioned tables, please refer to the URL below, which provides a detailed explanation along with the full installation guide.
1. Zabbix Database Structure Overview
Zabbix data is divided into two main table groups:
| Type | Tables | Characteristics |
|---|---|---|
| History | history, history_uint, history_str, history_log | Stores raw metrics per second, heavy random I/O load |
| Trends | trends, trends_uint | Aggregated hourly averages/max/min values, lighter I/O |
Almost all database load originates from the history tables.
The key is how efficiently you manage and periodically clean up these tables.
2. Separating History and Trends Storage
Even on a single server, splitting disks logically makes a big difference.
Because Zabbix workloads involve constant random writes, SSDs are strongly recommended for history data.
This moves random write I/O from history tables to SSDs,
while trend read/write operations stay on another disk — reducing contention dramatically.
3. Partitioned Tables
When the housekeeper deletes millions of rows, the database comes under such heavy load that it’s effectively stalled. It’s worth considering a partitioned table design.
4. Housekeeper Tuning
The Housekeeper is responsible for deleting old data.
In large setups, it often causes more harm than good.
It is recommended to operate using partitioned tables, and as of Zabbix 7.4, this configuration must be adjusted via the web UI.
Go to: Administration → Housekeeping
Set as follows:
Trends → ☐ (disable)
Events and alerts → ☐ (disable)
Services → ☐ (disable)
User sessions → ☑ (keep)
History → ☐ (disable)
5. Cache and Process Tuning
Zabbix Server doesn’t write directly to the database — it writes to caches first.
If those caches are too small, data immediately flushes to MySQL, causing heavy I/O spikes.
Check these parameters in /etc/zabbix/zabbix_server.conf:
| Option | Description | Recommended |
|---|---|---|
CacheSize | Configuration cache (hosts, templates, items) | 256M–512M |
HistoryCacheSize | Real-time metric cache | 512M–1G |
TrendCacheSize | Cache for trend calculations | ≥256M |
ValueCacheSize | Recent value cache (for trigger evaluation) | ≥512M |
StartDBSyncers | Parallel DB write processes | 8–16 |
StartPollers | Poller processes | about (number of hosts / 50) |
These settings directly reduce DB pressure.
6. Core MySQL InnoDB Tuning Parameters
Zabbix relies heavily on InnoDB.
Default MySQL settings will not survive real-world workloads.
Below are the must-tune parameters for production.
[mysqld]
innodb_buffer_pool_size = 4G
innodb_log_file_size = 1G
innodb_flush_log_at_trx_commit = 2
innodb_doublewrite = 1
innodb_flush_method = O_DIRECT
innodb_io_capacity = 2000
innodb_read_io_threads = 8
innodb_write_io_threads = 8
innodb_buffer_pool_instances = 4
🔹 innodb_buffer_pool_size
Caches both data and indexes — the single most critical variable.
Allocate 60–70% of total system memory.
Too small, and you’ll see a massive rise in disk I/O.
🔹 innodb_flush_log_at_trx_commit
Controls how often InnoDB flushes transaction logs to disk.
| Value | Behavior | Performance | Durability |
|---|---|---|---|
| 0 | Flush every second | Fast | Low |
| 1 | Flush on every commit | Slow | Safest |
| 2 | Flush to OS cache on commit | Balanced | Safe enough |
For Zabbix — with hundreds of transactions per second — 2 is optimal.
At worst, you may lose one second of data during a power failure,
which is operationally negligible.
🔹 innodb_doublewrite
There is an increased risk of page corruption during power failure or kernel panic.
Even in an NVMe + O_DIRECT environment, it is safer to keep the value at 1 for production databases.
Keep it 1 unless you are hitting a performance bottleneck and are willing to accept the associated risk.
🔹 innodb_flush_method
Bypassing filesystem caching with O_DIRECT eliminates double buffering and reduces latency.
🔹 innodb_io_capacity
Defines the maximum IOPS the InnoDB background threads will try to use.
On SSDs, set it between 2000–4000.
The default (200) is far too conservative and delays page flushing.
7. RAID Chunk Size / I/O Optimization
Zabbix databases perform mostly small, random writes.
With RAID, smaller chunk sizes (16–32 KB) match InnoDB’s 16 KB page size best.
| Purpose | Recommended Chunk Size | Reason |
|---|---|---|
| History (random write) | 16–32 KB | Reduces head movement |
| Trends (sequential write) | ≥64 KB | Improves sequential throughput |
In RAID10, stripe width = chunk size × (n – 1).
Align stripe size carefully with InnoDB page boundaries to avoid wasted I/O.
8. Monitoring for Bottlenecks
Always confirm where the bottleneck lies before tuning.
Regularly check these metrics:
| Metric | Target | Meaning |
|---|---|---|
zabbix[queue] | < 100 | Polling queue healthy |
zabbix[wcache,history,pfree] | > 25% | Cache has free space |
MySQL Threads_running | < 50 | DB not overloaded |
iostat %util | < 70% | Disk utilization healthy |
innodb_row_lock_time_avg | < 10 ms | Minimal lock contention |
If all are within range, your tuning is solid.
9. Example: Optimal Single-Server Setup
Zabbix Server
├─ MySQL (history → SSD, trends → HDD)
├─ InnoDB buffer pool: 4 GB
├─ Disable Housekeepers
├─ innodb_flush_log_at_trx_commit=2
├─ CacheSize=512 MB
├─ HistoryCacheSize=1 GB
└─ TrendCacheSize=256 MB
- Store history tables on SSD
- Keep trends on HDD for long-term data
- Disable the housekeeper
- Run daily cron jobs to drop old partitions
- Use InnoDB flush mode
2(flush to OS cache)
With this setup, even environments ingesting 10 K–20 K metrics per second show almost no queue delay.
10. Conclusion
The core principle of Zabbix DB performance is simple:
Reduce InnoDB I/O and rely on caching as much as possible.
- Separate history/trends storage
- Use partition-based cleanup
- Disable Housekeepers
- Use
innodb_flush_log_at_trx_commit=2 - Maximize
innodb_buffer_pool_size - Expand
CacheSizeandValueCacheSize
Get these six right, and your Zabbix server will stay stable even under several times its normal load.
Understand why the database fails first, and you’ll keep your monitoring platform alive much longer
ⓒ 2025 엉뚱한 녀석의 블로그 [quirky guy's Blog]. All rights reserved. Unauthorized copying or redistribution of the text and images is prohibited. When sharing, please include the original source link.
🛠 마지막 수정일: 2025.10.28
답글 남기기
댓글을 달기 위해서는 로그인해야합니다.