In Kubernetes environments, it’s common to see intermittent timeouts when using NodePort services, calling external APIs, or communicating between internal services.
Pods appear healthy, nodes have available resources, and logs look clean—yet specific traffic paths intermittently drop packets for no obvious reason.
When this happens repeatedly on Ubuntu-based Kubernetes nodes, the first thing to check is:
Whether the node’s nf_conntrack table has reached its limit.
Ubuntu provides reasonable default values for conntrack, but in Kubernetes—where NAT traffic is concentrated—these defaults are often insufficient.
This guide explains, from a practical standpoint on Ubuntu:
- what conntrack does,
- why it becomes exhausted,
- and how to tune it for stable operation.
1. Why NAT Becomes Excessive in Ubuntu + Kubernetes
Kubernetes ultimately runs on the Linux kernel, and whenever NAT occurs, a session is stored in the conntrack table.
This mechanism is the same regardless of whether the OS is Ubuntu, CentOS, or anything else.
The most common NAT-heavy scenarios on Ubuntu Kubernetes nodes are the following.
1) Pod → External API (SNAT)
If a service performs many outbound calls to external SaaS endpoints, external DBs, or REST APIs,
the NAT session count rises rapidly on the node handling those pods.
2) NodePort
NodePort inherently performs DNAT → SNAT in sequence.
When traffic is heavy, conntrack fills up quickly.
3) ClusterIP (kube-proxy iptables mode)
Kubernetes clusters built with kubeadm on Ubuntu use iptables-based kube-proxy by default.
Depending on the traffic path, iptables-based load balancing may result in one or even two NAT operations.
4) readinessProbe / livenessProbe Overload
Probes themselves are TCP/HTTP connections.
If many probes target a single node, its conntrack count rises significantly.
2. Typical Symptoms of conntrack Exhaustion on Ubuntu Kubernetes Nodes
When the conntrack table becomes full,
the kernel logs a warning and simply drops packets:
nf_conntrack: table full, dropping packet.
On Ubuntu nodes, this results in:
- Intermittent timeouts via NodePort
- Partial failures in outbound API calls
- Ingress 503 errors (often node-specific)
- Sporadic Pod → Pod communication drops
- Increased HTTP 502/503 responses
- One node behaving unstably while others appear fine
The pattern appears random but consistently affects specific nodes.
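A quick way to confirm that conntrack is the culprit is to look for the kernel message in the ring buffer. In the sketch below, a sample log line stands in for live output so the filter itself is clear; on a real node, pipe `dmesg -T` (or `journalctl -k`) through the same grep:

```shell
# Sample line standing in for real kernel-log output (assumption: the
# stock message format; on a node, use `dmesg -T` instead of echo).
sample_log='[Mon Nov 17 10:21:03 2025] nf_conntrack: nf_conntrack: table full, dropping packet'

# The same filter works against live output:
#   dmesg -T | grep -c 'table full, dropping packet'
echo "$sample_log" | grep -q 'table full, dropping packet' && echo "conntrack exhaustion detected"
```

If the grep matches even once, the node has been dropping packets at the conntrack layer, regardless of how healthy the pods look.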
3. CNI vs conntrack — The Key Point Is Not “Calico or Not”
A common misconception is that conntrack behavior depends on the CNI.
It does not.
CNI handles Pod-to-Pod routing,
while conntrack handles state tracking for NAT.
Therefore, if NAT occurs in Ubuntu + Kubernetes,
conntrack will be used regardless of CNI.
conntrack usage by CNI
| CNI | Uses conntrack? | Reason |
|---|---|---|
| Calico | Yes | kube-proxy NAT stays unchanged |
| Flannel | Yes | SNAT/DNAT behavior identical |
| Weave | Yes | Same kube-proxy NAT behavior |
| Cilium (kube-proxy ON) | Yes | NAT path remains |
| Cilium (kube-proxy replacement / eBPF) | Partial reduction | Internal LB via eBPF, but external SNAT still uses Ubuntu kernel |
Conclusion:
Any Kubernetes architecture that performs NAT will consume conntrack,
and the CNI has little to do with it.
4. Understanding Ubuntu’s conntrack Default Values
This is an important point.
On Ubuntu 20.04, 22.04, and 24.04, it is incorrect to say that conntrack defaults to a specific fixed value.
Ubuntu determines conntrack settings based on factors such as:
- kernel version
- system memory
- automatic calculation of nf_conntrack_buckets
- distro-provided sysctl files in /usr/lib/sysctl.d/
- prebaked sysctl values in cloud images (EKS, GKE, AKS, etc.)
- kubeadm configuration
Therefore:
Saying “Ubuntu default = 65536” is incorrect.
However, in many real-world Ubuntu nodes, especially those not tuned manually,
values in the 65,536 ~ 131,072 range are frequently observed.
This is why such numbers often appear in troubleshooting discussions.
In practice, the correct approach is:
Check the actual values on the node:
cat /proc/sys/net/netfilter/nf_conntrack_max
cat /proc/sys/net/netfilter/nf_conntrack_count
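Those two values can be combined into a small usage check. A minimal sketch (the function takes the two numbers as arguments purely so the arithmetic can be demonstrated with hypothetical values; on a node, feed in the real /proc readings as shown in the comment):

```shell
# Print conntrack table usage as a percentage and warn above 80%.
conntrack_usage() {
    max=$1
    count=$2
    pct=$((count * 100 / max))
    echo "conntrack usage: $count / $max (${pct}%)"
    if [ "$pct" -ge 80 ]; then
        echo "WARNING: conntrack table above 80% - packet drops are imminent"
    fi
}

# On a live node, feed in the real values:
#   conntrack_usage "$(cat /proc/sys/net/netfilter/nf_conntrack_max)" \
#                   "$(cat /proc/sys/net/netfilter/nf_conntrack_count)"

# Demonstration with hypothetical numbers (90% full):
conntrack_usage 262144 235930
```

Anything persistently above roughly 80% is a signal to raise nf_conntrack_max before the "table full" drops begin.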
5. Recommended conntrack Tuning Values for Ubuntu + Kubernetes
In Kubernetes workloads with moderate or high traffic,
Ubuntu’s default conntrack settings are often insufficient.
Recommended practical values:
For mid-sized clusters
nf_conntrack_max = 262144
nf_conntrack_buckets = 65536
For high-throughput workloads
nf_conntrack_max = 524288 ~ 1048576
nf_conntrack_buckets = nf_conntrack_max / 4
On nodes with at least 16GB of RAM,
these settings have negligible memory impact.
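To sanity-check the "negligible memory" claim, a rough worst-case estimate can be computed. The per-entry and per-bucket sizes below are assumptions (on the order of ~320 bytes per tracked connection and 8 bytes per hash bucket pointer; exact sizes vary by kernel version), not exact Ubuntu figures:

```shell
# Worst-case memory for a completely full table at the largest
# recommended setting. Assumed sizes: ~320 bytes/entry, 8 bytes/bucket
# (these vary by kernel version and build).
max=1048576
buckets=$((max / 4))
total_mib=$(( (max * 320 + buckets * 8) / 1024 / 1024 ))
echo "worst-case conntrack memory: ~${total_mib} MiB"
```

Under these assumptions the absolute worst case lands around 320 MiB, roughly 2% of a 16 GB node, which is why the settings above are considered low-risk.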
6. How to Tune conntrack Properly on Ubuntu
Create the file:
/etc/sysctl.d/99-nf-conntrack.conf
net.netfilter.nf_conntrack_max=262144
net.netfilter.nf_conntrack_buckets=65536
Apply the changes:
sysctl --system
Verify:
cat /proc/sys/net/netfilter/nf_conntrack_max
cat /sys/module/nf_conntrack/parameters/hashsize
If hashsize does not change, it is because the nf_conntrack module was already loaded with its previous size; a module reload or a reboot may be required.
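A reboot can often be avoided. On stock Ubuntu kernels nf_conntrack is built as a module, so the hash size can be pinned with a modprobe option for the next load, and on most modern kernels root can also write the parameter directly at runtime. Both are shown below as a sketch; the paths are the standard ones, but verify them on your kernel before relying on this:

```shell
# Persist the hash table size for future module loads
# (assumption: nf_conntrack is a module, as on stock Ubuntu kernels):
echo 'options nf_conntrack hashsize=65536' | sudo tee /etc/modprobe.d/99-nf-conntrack.conf

# Attempt a live change (the parameter is root-writable on most modern kernels):
echo 65536 | sudo tee /sys/module/nf_conntrack/parameters/hashsize
```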
7. Why conntrack Usage Decreases in Cilium eBPF Mode (Ubuntu-Based)
When Cilium enables kube-proxy replacement,
ClusterIP and NodePort load balancing is performed via eBPF instead of Ubuntu’s iptables.
This reduces NAT operations inside the cluster
and therefore lowers conntrack usage.
However, NAT is still used in these cases:
- Pod → external internet (forced SNAT)
- ExternalTrafficPolicy=Local
- HostNetwork pods
- Services with externalTrafficPolicy=Cluster (partial NAT)
So:
NAT doesn’t disappear—it is simply bypassed in some internal paths via eBPF.
8. Real-World Cases Where conntrack Exhaustion Occurs on Ubuntu Kubernetes Nodes
Examples frequently seen in production:
- Heavy traffic hitting a NodePort API
- All outbound traffic concentrated on a single node
- readinessProbe/livenessProbe firing multiple times per second
- sidecar patterns multiplying connection count
- StatefulSet pods repeatedly calling the same external DB
- Ingress traffic skewed heavily toward certain nodes
If an Ubuntu node’s conntrack limit is in the 65k–130k range,
these patterns almost guarantee exhaustion.
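When one of these patterns is suspected, the userspace `conntrack` tool (package `conntrack` on Ubuntu) shows what is actually filling the table. The sketch below groups entries by destination address; a sample line with hypothetical IPs stands in for real `sudo conntrack -L` output, and on a node you would replace the echo with the live command. Note that it counts both the original and the reply tuple of each flow, which is imprecise but good enough for spotting hotspots:

```shell
# Sample entry standing in for `sudo conntrack -L` output (hypothetical IPs):
sample='tcp 6 117 TIME_WAIT src=10.0.0.5 dst=203.0.113.10 sport=51514 dport=443 src=203.0.113.10 dst=10.0.0.5 sport=443 dport=51514 [ASSURED] use=1'

# Top destinations by number of tracked flows; on a live node:
#   sudo conntrack -L 2>/dev/null | grep -o 'dst=[0-9.]*' | sort | uniq -c | sort -rn | head -10
echo "$sample" | grep -o 'dst=[0-9.]*' | sort | uniq -c | sort -rn | head -10
```

A single external API or DB address dominating this list usually points straight at the SNAT-heavy workload responsible.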
9. Conclusion
In Ubuntu-based Kubernetes nodes with heavy NAT traffic,
issues such as intermittent timeouts, NodePort 503 errors, and sporadic Pod-to-Pod drops
tend to appear repeatedly and are often difficult to diagnose.
The core reason behind these symptoms is usually the same:
- The nf_conntrack table on the Ubuntu node is configured with a default size that is too small.
- The default value is not fixed and varies depending on the environment.
- CNI has almost no direct relationship with conntrack usage.
- As long as kube-proxy is running in iptables mode, NAT operations are handled by the Ubuntu kernel.
- Pod → external traffic, NodePort, and ClusterIP load balancing all consume conntrack entries.
Therefore:
- Increasing nf_conntrack_max is essential in Ubuntu-based Kubernetes environments.
- For Ubuntu nodes running Kubernetes, nf_conntrack_max = 262144 or higher is effectively the minimum operational baseline, and systems with heavier traffic should set it even higher.
🛠 Last modified: 2025.11.19
ⓒ 2025 엉뚱한 녀석의 블로그 [quirky guy's Blog]. All rights reserved. Unauthorized copying or redistribution of the text and images is prohibited. When sharing, please include the original source link.
💡 Need Professional Support?
If you need deployment, optimization, or troubleshooting support for Zabbix, Kubernetes, or any other open-source infrastructure in your production environment, feel free to contact me anytime.
📧 Email: jikimy75@gmail.com
💼 Services: Deployment Support | Performance Tuning | Incident Analysis Consulting