[Deep Dive] Zero-Downtime Large-Scale Data Deletion Strategies in MySQL: PK Walking vs pt-archiver

Deleting data in a production database—especially tables containing tens of millions of rows—incurs significantly higher system costs than INSERT operations. Executing a massive DELETE in a single transaction carries a very high risk of service degradation due to row-level lock contention, replication lag, and reduced buffer pool efficiency.

While the previous troubleshooting log addressed solutions from an incident-response perspective, this article takes an architectural approach. It provides an in-depth analysis of two deletion strategies that control these issues at the system level while minimizing load: PK Walking Iterator and pt-archiver.


1. Why Large-Scale Deletes Are Dangerous in RDBMS

In the InnoDB engine, DELETE is not a simple data removal operation. Internally, it involves several high-cost processes.

Undo Log Explosion
To support transaction rollback and MVCC (Multi-Version Concurrency Control), InnoDB records all pre-delete data in the Undo Log. This generates massive disk I/O.

Increased Lock Contention
Exclusive locks are applied to the target rows. While normal consistent reads are unaffected, operations such as SELECT ... FOR UPDATE, foreign key checks, and DDL can trigger cascading lock waits.

Buffer Pool Efficiency Degradation
To locate rows for deletion, data pages are loaded from disk into memory (Buffer Pool) and marked dirty. This process can evict frequently accessed working-set data from memory, directly increasing flush pressure and reducing the buffer pool hit ratio.

For these reasons, large-scale deletions must not be executed “all at once”, but rather “in small batches” with continuous load monitoring and throttling.
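
If you need a concrete signal to watch while a batched purge runs, the undo history list length is a useful proxy for purge pressure. A minimal check, assuming MySQL 5.7 or later where the information_schema.INNODB_METRICS table is available:

-- A steadily rising value means purge cannot keep up with undo generation
SELECT name, count
FROM information_schema.innodb_metrics
WHERE name = 'trx_rseg_history_len';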


2. Anti-Pattern: LIMIT & OFFSET

One of the most common mistakes is using LIMIT with OFFSET.

-- Anti-pattern: delete by paging with OFFSET.
-- (MySQL's DELETE does not even accept OFFSET directly, so the offset is
--  usually pushed into a derived-table subquery; the cost is the same.)
DELETE FROM target_table
WHERE id IN (
  SELECT id FROM (
    SELECT id
    FROM target_table
    WHERE created_at < '2025-01-01'
    ORDER BY id
    LIMIT 1000 OFFSET 100000
  ) AS page
);

While this may appear fast at first, performance deteriorates sharply as the offset grows: the database must scan and discard the first 100,000 rows just to locate the target range, and it repeats that wasted work on every batch. On large tables, this approach is a major source of wasted system resources.


3. Strategy A: PK Walking Iterator (Stored Procedure)

The PK Walking technique eliminates OFFSET inefficiency by traversing the index (PK) and explicitly narrowing the deletion range. The core idea is to identify the next range based on the last processed ID, rather than skipping rows.

[Code] Safe PK Walking Implementation (Recommended)

This approach is most effective when external tool installation is restricted or when deletion logic involves complex business rules.

⚠️ Index Performance Caution
The effectiveness of this logic depends heavily on the presence of composite indexes such as (id, created_at) or (created_at, id), as well as the execution plan (EXPLAIN). At a minimum, an index on created_at is required, and index usage must be verified before executing large-scale deletions.
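
One quick way to perform that verification, reusing the table name and predicate from the template below (substitute your own names), is to EXPLAIN the batch-selection query:

-- The plan should show a range scan on the PK or the created_at index,
-- not a full table scan
EXPLAIN
SELECT id
FROM target_table
WHERE id > -1
  AND created_at < '2025-01-01'
ORDER BY id ASC
LIMIT 1000;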

The template below preserves the algorithm while allowing variable names to be adapted to your environment.

DELIMITER $$

CREATE PROCEDURE safe_purge_data(IN p_batch_size INT)
BEGIN
  -- Initialize to -1 to cover both 0 and positive IDs
  DECLARE v_current_id BIGINT DEFAULT -1;
  DECLARE v_end_id BIGINT;
  DECLARE v_affected_rows INT DEFAULT 0;
  DECLARE v_done TINYINT DEFAULT 0;

  -- Loop on a dedicated flag rather than on ROW_COUNT(), so a concurrent
  -- session deleting an entire batch cannot cause a premature exit
  WHILE v_done = 0 DO

    -- 1. Determine the last ID to delete in this batch (subquery optimized):
    --    instead of OFFSET, fetch up to p_batch_size IDs greater than
    --    v_current_id and take the MAX as the range's upper bound
    SELECT MAX(id) INTO v_end_id
    FROM (
      SELECT id
      FROM target_table
      WHERE id > v_current_id
        AND created_at < '2025-01-01'
      ORDER BY id ASC
      LIMIT p_batch_size
    ) AS sub;

    -- Exit the loop if no more rows qualify
    IF v_end_id IS NULL THEN
      SET v_done = 1;
    ELSE
      START TRANSACTION;

      -- 2. Range-based delete: each batch covers one contiguous ID range
      DELETE FROM target_table
      WHERE id > v_current_id
        AND id <= v_end_id
        AND created_at < '2025-01-01';

      SET v_affected_rows = ROW_COUNT();

      COMMIT;

      -- 3. Move the pointer past the processed range
      SET v_current_id = v_end_id;

      -- 4. Throttling to distribute system load:
      --    sleep only if deletion actually occurred
      IF v_affected_rows > 0 THEN
        DO SLEEP(0.1);
      END IF;
    END IF;

  END WHILE;
END$$

DELIMITER ;
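
A minimal invocation follows; the batch size of 1,000 is only a starting point and should be tuned against your own load and lock metrics:

-- Run the purge, then remove the procedure once the job is done
CALL safe_purge_data(1000);
DROP PROCEDURE IF EXISTS safe_purge_data;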

[Tip] Idempotency
Even if this procedure fails mid-execution and is rerun, it is safe to repeat: rows that were already deleted simply no longer match the DELETE predicate, so the walk skips them naturally and converges on the remaining rows.
(In replication environments, prefer binlog_format set to ROW or MIXED; statement-based logging of LIMIT-style statements is non-deterministic and can cause replica divergence.)


4. Strategy B: pt-archiver — The Operational Automation Standard

pt-archiver from Percona Toolkit is the de facto standard for this job among DBAs and SREs. Its core value lies in controlling the impact on live services.

Key Feature: Replication Lag Detection

The biggest differentiator from custom scripts is replication lag monitoring. During large deletes, the master may appear healthy while replicas fall behind, causing stale reads and elevated read latency for anything served from them. pt-archiver detects this in real time and automatically pauses execution.

Production-Ready Command

The following command set is optimized for simple data purging (--purge).

pt-archiver \
  --source h=127.0.0.1,u=admin,p=pass,D=mydb,t=target_table \
  --where "created_at < '2025-01-01'" \
  --limit 1000 \
  --txn-size 1000 \
  --check-slave-lag h=192.168.0.50 \
  --max-lag 1 \
  --sleep 1 \
  --purge \
  --primary-key-only

Option Breakdown

  • --primary-key-only
    Fetches only the primary-key columns, so the generated DELETE statements filter on the PK alone. This is recommended with --purge, since no other columns need to be read. (If you archive rows to another table instead of purging, omit it so full rows are fetched.)
  • --check-slave-lag h=... --max-lag 1
    Immediately pauses execution if replication lag exceeds 1 second on the specified replica. This is the most critical option for operational stability. (Note: Depending on your environment, connection options such as u=user,p=pass may be required.)
  • --limit / --txn-size
    Controls batch size and transaction size. These are typically set to the same value to process each batch in a single transaction.
  • --sleep 1
    Forces a 1-second pause between batches, yielding CPU resources to service traffic.

pt-archiver options may vary by Percona Toolkit version. Always verify availability using pt-archiver --help before execution.
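
One low-risk way to perform that verification is a dry run: the --dry-run flag prints the statements pt-archiver would execute and exits without modifying any data. For example:

pt-archiver \
  --source h=127.0.0.1,u=admin,p=pass,D=mydb,t=target_table \
  --where "created_at < '2025-01-01'" \
  --limit 1000 \
  --purge \
  --primary-key-only \
  --dry-run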


5. Conclusion: Choosing the Right Strategy

The appropriate tool depends on the operational context.

pt-archiver Recommended When:

  • Shell access is available in large-scale production environments
  • Services are sensitive to replication lag (safest option)
  • Deleting standardized data such as time-based expiration records

PK Walking Recommended When:

  • External tool installation is not permitted (e.g., fully managed PaaS constraints)
  • Deletion logic includes complex business requirements

Copy & Swap Recommended (When Deleting ≥ 50% of Data):

  • Creating a new table, inserting only the rows to keep (INSERT ... SELECT), and swapping tables (RENAME TABLE) is far faster than DELETE; a sketch follows the safeguards below.

[Mandatory Safeguards]

  • Set the session transaction isolation level to READ COMMITTED during the INSERT ... SELECT to avoid shared locks on source rows that would block concurrent writers
  • Before executing RENAME, verify no long-running transactions via SHOW PROCESSLIST to prevent metadata lock waits
  • Prefer execution during low-traffic maintenance windows
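
A minimal sketch of the copy & swap flow under these safeguards, reusing target_table and the cutoff date from earlier (target_table_new and target_table_old are placeholder names; adapt them to your schema):

-- 1. Clone the table structure, including indexes and the PK, without data
CREATE TABLE target_table_new LIKE target_table;

-- 2. Copy only the rows to keep. READ COMMITTED avoids shared locks on the
--    source rows (with binary logging enabled, this requires binlog_format = ROW)
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
INSERT INTO target_table_new
SELECT * FROM target_table
WHERE created_at >= '2025-01-01';

-- 3. Atomic swap. Rows written to the original table after step 2 started
--    are NOT carried over, hence the maintenance-window advice above
RENAME TABLE target_table TO target_table_old,
             target_table_new TO target_table;

-- 4. Drop the old table only after verification
-- DROP TABLE target_table_old;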

Managing the data lifecycle is just as critical as ingesting data. Unplanned DELETE operations are a direct path to outages. Engineering discipline lies in keeping system load within controllable bounds using the strategies outlined above.

🛠 Last updated: 2025.12.31

ⓒ 2026 엉뚱한 녀석의 블로그 [quirky guy's Blog]. All rights reserved. Unauthorized copying or redistribution of the text and images is prohibited. When sharing, please include the original source link.

💡 Need Professional Support?
If you need deployment, optimization, or troubleshooting support for Zabbix, Kubernetes, or any other open-source infrastructure in your production environment, or if you are interested in sponsorships, ads, or technical collaboration, feel free to contact me anytime.

📧 Email: jikimy75@gmail.com
💼 Services: Deployment Support | Performance Tuning | Incident Analysis Consulting

📖 PDF eBook (Gumroad): Zabbix Enterprise Optimization Handbook
A single, production-ready PDF that compiles my in-depth Zabbix and Kubernetes monitoring guides.