How Ingress Controller Traffic Works on On-Prem K8s with MetalLB — Why externalTrafficPolicy: Cluster Can Blackhole Traffic

1) Default Behavior (Cluster mode) — Kubernetes

  • Service.type: LoadBalancer + externalTrafficPolicy: Cluster → traffic to the VIP can land on any node in the cluster.
  • The receiving node’s kube-proxy forwards the traffic to a backend Pod, which may run on a different node (this extra hop typically involves SNAT).
  • Because of that SNAT, the original client IP is lost, but availability is maintained regardless of where Pods run.
  • You cannot predict which node will receive a given incoming connection.

Check your current setting

  • Default install (names may vary for custom installs):
kubectl get svc ingress-nginx-controller -n ingress-nginx -o yaml | grep externalTrafficPolicy

2) VIP Binding in MetalLB L2

  • MetalLB (L2 mode) binds a VIP to a single node’s MAC.
    (Upstream ARP table on the switch/gateway: VIP → Node-A MAC)
  • All external traffic therefore enters the cluster through that one node (Node-A).
  • Whether or not the backend Pod runs on Node-A, kube-proxy on Node-A forwards the traffic to it inside the cluster.
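
To see this binding on a live cluster, you can compare the node MetalLB says it is announcing from with the MAC that a neighbor table actually holds for the VIP. The sketch below is only a pair of text-parsing helpers. It assumes the MetalLB speaker records a Service event whose message contains announcing from node "<name>" (the exact wording may differ between versions) and that you have a Linux host on the same L2 segment. The function names and the VIP 192.168.1.240 are placeholders of mine, not part of MetalLB.

```shell
# Extract the announcing node name from `kubectl describe svc` output on stdin.
# Assumption: the event message contains: announcing from node "<name>"
announcing_node() {
  sed -n 's/.*announcing from node "\([^"]*\)".*/\1/p' | tail -n 1
}

# Extract the MAC recorded for a given IP from `ip neigh show` output on stdin.
# Typical line: 192.168.1.240 dev eth0 lladdr aa:bb:cc:dd:ee:01 REACHABLE
neigh_mac() {
  awk -v ip="$1" '$1 == ip {
    for (i = 2; i < NF; i++) if ($i == "lladdr") { print $(i + 1); exit }
  }'
}

# Usage (run where kubectl / ip are available; names and VIP are examples):
#   kubectl describe svc ingress-nginx-controller -n ingress-nginx | announcing_node
#   ip neigh show | neigh_mac 192.168.1.240
```

If the node printed by the first command does not own the MAC printed by the second, the neighbor table is out of step with MetalLB.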

3) Failure Scenario

  1. VIP initially bound to Node-A (VIP → Node-A MAC).
  2. The Ingress Controller Pod is rescheduled to Node-B.
    • With externalTrafficPolicy: Cluster, Node-A → Node-B redirection should still work.
  3. MetalLB changes the VIP owner to Node-B and sends gratuitous ARP (GARP) to update neighbors.
  4. Some network devices (switch/router) ignore GARP or keep the old MAC cached.
  5. As a result, external traffic is still delivered to Node-A.
  6. Once VIP ownership has moved, Node-A no longer accepts traffic for the VIP → traffic is blackholed.

4) Why a Blackhole Even in Cluster Mode?

  • In theory, Cluster mode lets any node accept and forward traffic.
  • In MetalLB L2, only the current VIP owner answers ARP requests for the VIP, using its own MAC. When ownership flips, the former owner stops responding.
  • If the upstream ARP table still points to the old MAC (Node-A), packets keep arriving at Node-A, which now drops them → packet loss.
  • The root cause is almost always a failed ARP table refresh on an upstream device in L2 mode.
    • Even if you switch the ingress traffic policy to Local and ensure the ingress Pod runs on the “intended” node, this does not resolve the underlying ARP-staleness problem.
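
The staleness condition described above boils down to one comparison: the MAC the upstream device has cached for the VIP versus the MAC of the node that currently owns it. Here is a minimal sketch of that check. How you obtain the two MACs depends on your environment (for example the gateway's ARP table and `ip link show` on the owner node), and the function names are hypothetical helpers, not an existing tool.

```shell
# Normalise a MAC to lowercase with ':' separators so that
# "AA-BB-CC-DD-EE-01" and "aa:bb:cc:dd:ee:01" compare equal.
normalize_mac() {
  printf '%s' "$1" | tr 'A-F-' 'a-f:'
}

# check_vip_arp <cached_mac> <owner_mac>
# Prints "ok" when they match; otherwise prints a "stale" message and
# returns non-zero, so the result can gate an alert or a flush.
check_vip_arp() {
  cached=$(normalize_mac "$1")
  owner=$(normalize_mac "$2")
  if [ "$cached" = "$owner" ]; then
    echo "ok"
  else
    echo "stale: upstream has $cached, current owner is $owner"
    return 1
  fi
}

# Usage (MAC values are examples):
#   check_vip_arp "aa:bb:cc:dd:ee:01" "aa:bb:cc:dd:ee:02" || echo "flush ARP upstream"
```

A non-zero result right after a VIP move is the signature of the blackhole described in this post.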

5) Practical Remediation

  • Make network devices honor GARP and be ready to clear/flush ARP on switches/routers when VIP ownership changes.
  • Consider BGP mode: multiple nodes advertise the VIP, removing the ARP single-owner dependency.
    • This adds network handling requirements and operational complexity—fine for greenfield, but migrating a running cluster can introduce many variables and overhead.
    • (I’ll cover BGP details in a separate post.)
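
For the "clear/flush ARP" step, the exact command is vendor-specific on real switches and routers. As a rough sketch for the case where the upstream hop is a Linux box, the helper below drops the cached entry for the VIP and re-resolves it with arping. It only prints the commands unless you set DRY_RUN=0. The function names, the interface eth0, and the VIP are assumptions for illustration, and running it for real needs root.

```shell
# Print the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# refresh_vip_neighbor <vip> <interface>
refresh_vip_neighbor() {
  vip=$1
  dev=$2
  run ip neigh flush to "$vip" dev "$dev"   # drop the cached VIP -> MAC entry
  run arping -c 3 -I "$dev" "$vip"          # re-resolve; replies show which MAC answers now
}

# Usage (values are examples):
#   refresh_vip_neighbor 192.168.1.240 eth0              # dry run, prints the commands
#   DRY_RUN=0 refresh_vip_neighbor 192.168.1.240 eth0    # actually runs them (root)
```

Keeping the default as a dry run makes it safer to paste into a runbook and review before anyone executes it during an incident.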

✅ Summary

  • externalTrafficPolicy: Cluster is the default and should be robust.
  • With MetalLB L2, VIP owner changes can fail if ARP tables don’t update.
  • The resulting blackhole is caused by L2/ARP behavior, not by the application stack.

🛠 Last updated: 2025.09.18

ⓒ 2025 엉뚱한 녀석의 블로그 [quirky guy's Blog]. All rights reserved. Unauthorized copying or redistribution of the text and images is prohibited. When sharing, please include the original source link.
