Service Failure When Internal App Servers Call an L4 VIP — Why You Need a Proxy IP

In many internal service environments, application servers send traffic to a private L4 VIP.
This is common in architectures where an internal load-balancing layer mediates traffic between multiple app servers.
As a result, an app server calling its own VIP is a perfectly normal scenario.

Yet under certain conditions, this setup can suddenly start dropping traffic.
This typically occurs after:

  • L2 switch replacement
  • Uplink port changes
  • VLAN rearrangements
  • Missing port configurations
  • L4 downstream switch re-cabling or replacement

On the surface, the L4 load balancer looks healthy, the app servers look healthy, and even the L4 health checks are fine.
But internal calls to the L4 VIP fail—while everything else works.

The root cause is deceptively simple:

Return traffic for a connection that entered through an L4 device must pass back through the same L4 device.
If an internal app server's response bypasses the L4, the reverse NAT never happens and the client discards the packet.

This article explains:

  • why asymmetric routing occurs,
  • why a proxy IP is mandatory,
  • what architectures reproduce this issue, and
  • how real incidents manifest in the field.

1. Failure scenario — Only internal calls to the VIP break

External flows work normally:

External user → L4 VIP → App server

The problem appears only when internal servers call the internal VIP:

Example flow:

App1 → L4 VIP (private VIP) → L4 → App2
App2 → App1 (response goes direct) → packet drop

Because App2 responds directly to App1, bypassing the L4, the reply is never translated back: it arrives with App2's real IP as its source instead of the VIP that App1 called.

Thus, only internal-to-internal VIP flows fail.

This happens due to the combination of:

  • how the L4 device handles NAT-based sessions
  • how an L2 switch forwards packets between servers on the same VLAN

2. Basic architecture — L4 device, L2 switch, and app servers

A very common setup looks like this:

  • A single L2 switch sits underneath the L4 load balancer
  • Multiple app servers exist in the same VLAN
  • The L4 VIP load-balances across these servers

The key point:

Servers on the same L2 network can resolve each other directly via ARP.

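In NAT mode without a proxy IP, the L4 rewrites only the destination address, so App2 sees App1's real IP as the source of the request.
App1 is on the same subnet, so App2 resolves App1's MAC via ARP and has no reason to send the reply back through the L4.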

This is exactly where things go wrong.
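
App2's decision to deliver directly is just the ordinary on-link check every IP host performs. The following is a minimal Python sketch of that check. All addresses (App1 10.0.0.11, App2 10.0.0.12/24, gateway 10.0.0.1, external client 203.0.113.7) are hypothetical and not taken from the original environment.

```python
import ipaddress

def next_hop(sender_iface: str, dst_ip: str, gateway: str) -> str:
    """Return the IP the sender will ARP for.

    On-link destinations are reached directly; everything else goes
    to the default gateway.
    """
    iface = ipaddress.ip_interface(sender_iface)
    if ipaddress.ip_address(dst_ip) in iface.network:
        return dst_ip
    return gateway

# Hypothetical addressing for illustration only.
APP2_IFACE = "10.0.0.12/24"
GATEWAY = "10.0.0.1"

print(next_hop(APP2_IFACE, "10.0.0.11", GATEWAY))    # App1: same subnet, delivered directly
print(next_hop(APP2_IFACE, "203.0.113.7", GATEWAY))  # external client: sent to the gateway
```

Assuming the gateway is the L4 itself or routes through it, this also suggests why external flows keep working: replies to off-subnet clients leave via the gateway, while replies to App1 never do.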


3. How the traffic actually breaks

Let’s walk through a typical flow:

  1. App1 → L4 VIP:80 (request goes to the L4 normally)
  2. L4 load-balances → App2 (also normal)
  3. App2 → App1 (response)
    • App2 knows App1’s MAC (same L2, same VLAN)
    • Sends the response directly, bypassing the L4
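  4. The L4 session/NAT table is still waiting for the return traffic, so nothing rewrites the reply's source back to the VIP
  5. App1 receives a reply from App2's real IP, which matches no connection it opened (it called the VIP)
  6. App1 discards the packet, and the half-open session on the L4 eventually times out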

In summary:

  • Request path: App1 → L4 → App2
  • Response path: App2 → App1 (direct)

This mismatch (asymmetric routing) causes the NAT session to collapse.

This issue appears in NAT-mode L4 topologies whenever the calling server and the selected real server share the same L2 segment and no proxy IP (source NAT) is applied.
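
The address rewriting behind this flow can be traced with a small model. This is a sketch under stated assumptions: the addresses are hypothetical, ports are omitted, and `l4_forward` models a NAT-mode device that rewrites only the destination (no proxy IP).

```python
from typing import NamedTuple

class Packet(NamedTuple):
    src: str
    dst: str

# Hypothetical addresses for illustration only.
VIP, APP1, APP2 = "10.0.0.100", "10.0.0.11", "10.0.0.12"

def l4_forward(pkt: Packet, real_server: str) -> Packet:
    """NAT mode without proxy IP: rewrite the destination, keep the source."""
    return pkt._replace(dst=real_server)

def reply_to(pkt: Packet) -> Packet:
    """A server answers whoever the packet claims to come from."""
    return Packet(src=pkt.dst, dst=pkt.src)

request = Packet(src=APP1, dst=VIP)
delivered = l4_forward(request, APP2)
reply = reply_to(delivered)

print("App1 sent:    ", request)
print("App2 received:", delivered)
print("App2 replied: ", reply)
print("Reply comes from the VIP:", reply.src == VIP)  # False
```

The reply is addressed to App1, which is on-link for App2, so it never reaches the L4 that would have restored the VIP as its source.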


4. Why the L4 drops the direct response

In NAT mode, the L4 expects every VIP connection to follow a complete session path:

Request comes into the VIP →
Forwarded to a real server →
Response must return through the VIP

If App2 bypasses the L4 and sends its reply directly:

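  • The L4 never sees the response, so it cannot rewrite the source back to the VIP
  • Its NAT/session entry stays half-open and eventually ages out
  • To App1, the packet appears “unexpected”: it comes from App2's real IP, not from the VIP
  • App1 drops the traffic (typically answering with a TCP RST)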

In other words:

When the response doesn’t return through the L4, the entire NAT session becomes invalid.
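
From App1's side, the drop follows from how a TCP stack matches incoming segments to connections. The sketch below models that lookup with a plain dictionary. The addresses and the port 40000 are hypothetical, and a real stack tracks far more state than this.

```python
# App1's connection table, keyed by
# (local_ip, local_port, remote_ip, remote_port).
# App1 opened its connection to the VIP, so that is the only entry.
connections = {
    ("10.0.0.11", 40000, "10.0.0.100", 80): "SYN_SENT",
}

def demux(dst_ip: str, dst_port: int, src_ip: str, src_port: int):
    """Return the state of the matching connection, or None if none matches."""
    return connections.get((dst_ip, dst_port, src_ip, src_port))

# Reply sent directly by App2 (source is App2's real IP).
direct = demux("10.0.0.11", 40000, "10.0.0.12", 80)

# Reply that passed back through the L4 (source restored to the VIP).
via_l4 = demux("10.0.0.11", 40000, "10.0.0.100", 80)

print("direct reply matches:", direct)
print("reply via L4 matches:", via_l4)
```

A segment that matches no connection is typically answered with a reset, which is why the failure often shows up as connection timeouts or resets on App1 rather than as errors on the L4.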


5. The solution — Apply a Proxy IP

The reliable fix in this topology is to enable a proxy IP on the L4. A proxy IP is a source-NAT (SNAT) address owned by the L4: when the L4 forwards a request to a real server, it replaces the client's source IP with the proxy IP.

The purpose of the proxy IP is:

Force all intra-VLAN server-to-server responses to pass through the L4, even if the servers are on the same L2.

After enabling proxy IP:

  • The L4 forwards App1's request to App2 with the proxy IP as the source address,
  • App2 sees the proxy IP as the client and addresses its reply to the proxy IP, which the L4 owns,
  • The L4 reverses both translations and forwards the response to App1 with the VIP as the source,
  • The NAT/session flow becomes symmetric again.

Resulting flow:

App1 → VIP (request)
VIP → App2 (LB)
App2 → Proxy IP (forced response)
L4 → App1 (normal NAT return path)

All traffic now follows a perfect symmetric path.
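
The symmetric flow can be modeled the same way. This sketch assumes the proxy IP behaves as source NAT, which is how NAT-mode load balancers commonly implement it. The class `L4`, its methods, and all addresses (including the proxy IP 10.0.0.200) are hypothetical, and ports are omitted for brevity.

```python
from typing import NamedTuple

class Packet(NamedTuple):
    src: str
    dst: str

# Hypothetical addresses for illustration only.
VIP, PROXY_IP = "10.0.0.100", "10.0.0.200"
APP1, APP2 = "10.0.0.11", "10.0.0.12"

class L4:
    """Toy NAT-mode load balancer with proxy IP (source NAT) enabled."""

    def __init__(self) -> None:
        # (proxy_ip, real_server) -> (client, vip)
        self.sessions = {}

    def forward(self, pkt: Packet, real_server: str) -> Packet:
        """Rewrite both addresses and remember the original pair."""
        self.sessions[(PROXY_IP, real_server)] = (pkt.src, pkt.dst)
        return Packet(src=PROXY_IP, dst=real_server)

    def reverse(self, pkt: Packet) -> Packet:
        """Undo both translations for a reply addressed to the proxy IP."""
        client, vip = self.sessions[(pkt.dst, pkt.src)]
        return Packet(src=vip, dst=client)

l4 = L4()
request = Packet(src=APP1, dst=VIP)
delivered = l4.forward(request, APP2)
reply = Packet(src=delivered.dst, dst=delivered.src)  # App2 answers the proxy IP
returned = l4.reverse(reply)

print("App2 received:", delivered)
print("App2 replied: ", reply)
print("App1 received:", returned)
```

Because the reply is addressed to the proxy IP rather than to App1, it has to reach the L4. One consequence of this design is that App2 sees the proxy IP, not App1's address, as the client.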


6. Why issues appear after uplink changes or switch replacement

After uplink modifications, the following often happens:

  • Proxy IP settings do not carry over to the new port
  • VLAN / ARP tables relearn neighbors
  • Direct L2 paths between servers become active again
  • The L4 is bypassed
  • NAT sessions break → traffic drops

This is why failures often occur immediately after:

  • Switch upgrades
  • Port patching
  • VLAN migration
  • Moving L4 to a different downstream switch

If proxy IP becomes inactive, even briefly, new internal calls to the VIP from servers on the same VLAN start failing immediately.
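
After such a change, a quick TCP probe run from one of the app servers can confirm whether internal calls to the VIP still complete. The following is a minimal sketch using only the Python standard library. The function name `vip_reachable` is hypothetical, and the VIP address and port must be replaced with your own.

```python
import socket

def vip_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port completes in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, resets, and refused connections alike.
        return False
```

For example, calling `vip_reachable("10.0.0.100", 80)` (a hypothetical VIP) from an app server on the same VLAN as the real servers exercises exactly the internal path that breaks. A probe from an external host does not, because external flows are unaffected.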


7. Critical rules every operator must remember

✔ Rule 1

Responses to traffic that entered through an L4 VIP must return through the same L4.
If they do not, NAT-mode load balancing fails for that flow every time.

✔ Rule 2

Servers on the same L2 will respond directly unless forced not to.
Proxy IP enforces the return path through the L4.

✔ Rule 3

Any uplink or port migration requires verifying proxy IP settings.
This is the most commonly overlooked operational detail.

✔ Rule 4

If internal servers call an internal VIP whose real servers sit on the same VLAN, proxy IP is mandatory.
There is no exception in NAT-mode L4 environments with this layout.


8. Conclusion

The service failure that occurs only when internal servers call a private L4 VIP seems complicated at first glance.
But the mechanism is very simple:

  • The request path uses the L4
  • The response path bypasses it
  • Asymmetric routing destroys the NAT session
  • Traffic drops

Enabling the proxy IP forces responses back through the L4, restoring symmetry and completely eliminating the issue.

After any switch replacement, uplink change, or VLAN update, always confirm that proxy IP is enabled and active on the correct ports.


9. Traffic Flow Summary (Before vs. After Proxy IP)

✅ Before Proxy IP — Asymmetric Routing (Failure Path)

  1. App1 → L4 VIP (request OK)
  2. L4 → App2 (LB OK)
  3. App2 → App1 direct response (bypasses L4)
  4. The reply is never translated back → it arrives from App2's real IP, not the VIP
  5. App1 finds no matching connection → drop

Issue:
App2 bypasses the L4, causing an asymmetric path that breaks NAT.


✅ After Proxy IP — Symmetric Routing (Healthy Path)

  1. App1 → L4 VIP (request OK)
  2. L4 → App2 (LB OK)
  3. App2 → Proxy IP (forced through L4)
  4. L4 → App1 (NAT session complete)
  5. Flow succeeds

Benefit:
Proxy IP ensures the response always returns through the L4, keeping the NAT session intact.

🛠 Last modified: 2025.11.17

ⓒ 2025 엉뚱한 녀석의 블로그 [quirky guy's Blog]. All rights reserved. Unauthorized copying or redistribution of the text and images is prohibited. When sharing, please include the original source link.
