Greater visibility into your load balancer traffic pays off. Web applications continually pass information back and forth, yet some of this important data is often hard to observe in transit. And while the perceived “black box” nature of networking can seem overwhelming, what if you could peek behind the curtain to better understand your traffic?
This starts with preserving crucial data such as the client’s source IP address. Clients often use the PROXY protocol to transmit the original client IP address, but also to embed additional information in the request. In fact, many applications embed additional type-length-value (TLV) fields like PP2_TYPE_ALPN, among others. We developed this protocol in-house at HAProxy, and it's now widely supported by modern web infrastructure. However, there’s some minor work involved in extracting and interpreting this header information. That's where popular external tools like TShark can help.
In this guide, we’ll first explain why connection data matters. Next, you’ll learn how the PROXY protocol works, how to use the TShark analyzer to capture and inspect packets, and how to view the extraction results.
What makes traffic data so valuable?
Theoretically, each request flowing through a load balancer like HAProxy can contain a client IP address, port information, a destination IP address, and even a virtual private cloud (VPC) subnet ID. These bits of data enable connection validation. For example, parsing the header can help us identify good hosts with legitimate identities. Equally, HAProxy can use that information to flag bad hosts and halt traffic accordingly.
At the highest level, these parameters provide important context behind requests and boost the transparency of client-server communication pathways. It’s easier to understand how traffic is flowing and possibly identify configuration errors.
Finally, data packaged via the PROXY protocol lets us chain multiple layers of NAT or TCP proxies together while keeping the original IP address. The PROXY protocol header, for instance, enables traffic to successfully pass through subsequent firewalls and proxies.
What is the Proxy Protocol?
The PROXY protocol provides a convenient way to transport connection information safely between client and server. Without a load balancer in the middle, the server could retrieve this data directly from the connection. In HAProxy, this includes the following:
An address family, such as AF_INET for IPv4, AF_INET6 for IPv6, and AF_UNIX for Unix sockets
Socket protocols for TCP and UDP
Layer 3 source and destination addresses
Any Layer 4 source and destination ports
We designed the PROXY protocol to be quickly parseable. While Version 1 focused on human readability, Version 2 adds binary encoding support to accelerate this process even further. The PROXY protocol is a protocol-agnostic successor to application-specific mechanisms such as the XCLIENT extension for SMTP.
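To make the human-readable Version 1 format concrete, here is a minimal Python sketch that splits a single header line into its fields. The names parse_proxy_v1 and ProxyV1 are hypothetical and chosen for illustration. The sketch assumes a TCP4 or TCP6 line terminated by CRLF and does not handle other forms of the header.

```python
import ipaddress
from typing import NamedTuple


class ProxyV1(NamedTuple):
    """Fields carried by a Version 1 header line (hypothetical container)."""
    family: str      # "TCP4" or "TCP6"
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


def parse_proxy_v1(line: bytes) -> ProxyV1:
    """Parse one Version 1 header line ending in CRLF."""
    if not line.endswith(b"\r\n"):
        raise ValueError("header must end with CRLF")
    parts = line[:-2].decode("ascii").split(" ")
    if len(parts) != 6 or parts[0] != "PROXY" or parts[1] not in ("TCP4", "TCP6"):
        raise ValueError("not a TCP4/TCP6 PROXY v1 header")
    _, family, src_ip, dst_ip, src_port, dst_port = parts
    # Check that both addresses parse and match the declared family.
    expected = 4 if family == "TCP4" else 6
    for addr in (src_ip, dst_ip):
        if ipaddress.ip_address(addr).version != expected:
            raise ValueError("address does not match family " + family)
    return ProxyV1(family, src_ip, dst_ip, int(src_port), int(dst_port))


parsed = parse_proxy_v1(b"PROXY TCP4 127.0.0.1 127.0.0.1 53606 80\r\n")
print(parsed.src_ip, parsed.src_port)
```

Because the whole header is one space-separated text line, a split and a few checks are enough to read it, which is the readability Version 1 aims for.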
How the Proxy Protocol works
Primarily, we know that clients use the PROXY protocol to embed connection information. But how does everything work under the hood?
We've designed the protocol to reduce information processing overhead without requiring users to make sweeping changes to their backend components. Plus, the protocol prevents the typical connection parameter losses that occur when relaying TCP connections through a proxy.
This sidesteps what we call the “dumb proxy” problem, where a load balancer processes protocol-agnostic data without knowing which protocol is transported atop the connection. HAProxy can run in pure TCP mode and fall into this category.
This isn’t a negative aspect in and of itself. However, technical challenges can arise when using the keep-alive directive. Only the first request on a new connection will pass the X-Forwarded-For header or the Forwarded extension, which is a problem on long-lived connections that handle multiple requests. Since the client information within those headers often remains unchanged, sending a header with each request is unnecessary.
Packaging information via the PROXY protocol solves this problem. HAProxy can prepend each connection with a header that reports the connection characteristics to the other side. This is easy to implement without protocol-specific knowledge. Plus, we can eliminate any dangers and limitations of caching.
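The idea of one header per connection can be sketched in a few lines of Python. This is an illustration only, not HAProxy's implementation. The helpers build_proxy_v1 and connection_stream are hypothetical names, and the sketch assumes the Version 1 text format and TCP connections.

```python
from typing import Iterable


def build_proxy_v1(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Build a Version 1 header line for a TCP connection."""
    # Pick the family keyword from the address notation (IPv6 contains ':').
    family = "TCP6" if ":" in src_ip else "TCP4"
    line = f"PROXY {family} {src_ip} {dst_ip} {src_port} {dst_port}\r\n"
    return line.encode("ascii")


def connection_stream(header: bytes, requests: Iterable[bytes]) -> bytes:
    """Bytes sent on one connection: the header once, then every request."""
    return header + b"".join(requests)


header = build_proxy_v1("127.0.0.1", "127.0.0.1", 53606, 80)
request = b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n"
# Two keep-alive requests share the same connection and the same header.
stream = connection_stream(header, [request, request])
print(stream.decode("ascii"))
```

The header appears exactly once at the start of the byte stream, no matter how many requests follow, and building it requires no knowledge of the protocol carried on top.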
So, we now have this conveniently prepackaged data. How do users unpack it into something usable?
Introducing the TShark packet analyzer
The TShark tool, part of Wireshark’s open source packet-analysis software, lets us quickly and easily capture, read, and print PROXY protocol packets. It writes the PROXY protocol information it gathers to either terminal output or a destination file.
TShark also offers the following advantages:
CLI-based Wireshark filter integration
Direct packet capture
Packet capture (PCAP) analysis from a tcpdump capture file
In fact, TShark’s default configuration closely emulates tcpdump’s functionality. The tcpdump tool also ships with most Linux-based operating systems, so many users may already be familiar with it. While HAProxy can interpret and understand the contents of these packets for routing purposes, it can't natively capture packets the way tcpdump does.
TShark’s manual states that it grabs data “from the first available network interface and displays a summary line on the standard output for each received packet.” Luckily, TShark is a user-friendly tool. Let’s dive into a technical example that uses TShark to unearth data from PROXY protocol packets.
Set up TShark and test your packet capture
Before combing through your packets, you’ll have to install TShark using the CLI. This is necessary if your Linux distribution doesn’t automatically come with TShark:
For Red Hat Enterprise Linux (RHEL) and RHEL clones, enter the following command:
sudo yum install wireshark
For Ubuntu and Debian, enter the following:
sudo apt-get install tshark
TShark should now be available! However, we recommend testing basic packet capture to ensure that everything is working correctly. Use the following command to test capture for five seconds everywhere except at port 22, to avoid grabbing your own SSH traffic:
tshark -f "tcp port not 22" -a "duration:5"
If you want to analyze your output later, you can also specify a destination file using the -w FILENAME argument. Simply enter the tshark -r FILENAME command to read its contents.
Targeting only PROXY protocol packets works a little differently. Here’s how to do it.
Finding and unpacking Proxy Protocol packets
Capturing, identifying, and unpacking PROXY protocol packets is easy using Wireshark's display filters. For example, you can filter for only the relevant packets, and there are multiple proxy traffic filters to choose from.
In most cases, a PROXY protocol packet has an embedded source IP address. You can find it with the following command:
tshark -r FILENAME -Y proxy.src.ipv4
This will only list packets that contain PROXY protocol information. So, what if you want to view the contents of those packets? Simply add the -V flag, which gives you the following CLI command:
tshark -r FILENAME -Y proxy.src.ipv4 -V
As a result, your Terminal will display something similar to what you see below:
ubuntu@bk:/etc/hapee-2.6# tshark -Y proxy.src.ipv4 -V
...
PROXY Protocol
Magic: 0d0a0d0a000d0a515549540a
0010 .... = Version: 2
.... 0001 = Command: 1
[Version: 2]
Address Family Protocol: TCP over IPv4 (0x11)
0001 .... = Address Family: IPv4 (0x1)
.... 0001 = Protocol: 0x1
Length: 12
Source Address: 192.168.64.1
Destination Address: 192.168.64.120
Source Port: 49344
Destination Port: 8081
That’s it! The entire contents of your PROXY protocol packet are now viewable and ready for inspection. You can repeat this same process for other PROXY protocol packets.
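For readers who want to see how those fields map onto the raw bytes, here is a minimal Python sketch that decodes a Version 2 header for TCP over IPv4. The function name parse_proxy_v2_tcp4 is hypothetical, and the sample bytes are built by hand from the values in the output above rather than taken from a real capture. The sketch ignores other address families and any data after the addresses.

```python
import ipaddress
import struct

# 12-byte Version 2 signature, shown as "Magic" in the TShark output.
SIGNATURE = bytes.fromhex("0d0a0d0a000d0a515549540a")


def parse_proxy_v2_tcp4(data: bytes) -> dict:
    """Decode the fixed part of a v2 header plus its IPv4 address block."""
    if len(data) < 16 or data[:12] != SIGNATURE:
        raise ValueError("missing PROXY protocol v2 signature")
    # Byte 13: version (high nibble) and command (low nibble).
    # Byte 14: address family (high nibble) and protocol (low nibble).
    # Bytes 15-16: length of the data that follows, in network byte order.
    ver_cmd, fam_proto, length = struct.unpack("!BBH", data[12:16])
    if ver_cmd >> 4 != 2:
        raise ValueError("unsupported version")
    if fam_proto != 0x11 or length < 12 or len(data) < 28:
        raise ValueError("this sketch only decodes TCP over IPv4")
    src, dst, src_port, dst_port = struct.unpack("!4s4sHH", data[16:28])
    return {
        "version": ver_cmd >> 4,
        "command": ver_cmd & 0x0F,
        "length": length,
        "src": str(ipaddress.IPv4Address(src)),
        "dst": str(ipaddress.IPv4Address(dst)),
        "src_port": src_port,
        "dst_port": dst_port,
    }


# Hand-built sample mirroring the values in the TShark output.
sample = (
    SIGNATURE
    + bytes([0x21, 0x11])
    + struct.pack("!H", 12)
    + ipaddress.IPv4Address("192.168.64.1").packed
    + ipaddress.IPv4Address("192.168.64.120").packed
    + struct.pack("!HH", 49344, 8081)
)
info = parse_proxy_v2_tcp4(sample)
print(info)
```

The 12-byte address block (two 4-byte addresses and two 2-byte ports) matches the "Length: 12" line that TShark reports.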
Inspecting traffic while using TLS
In most cases, your traffic will use TLS. However, these TLS packets confuse tools like Wireshark, preventing them from correctly identifying PROXY protocol packets. This is especially true for PROXY protocol v2. Version 2 adds a non-parseable binary signature that's designed to cause immediate failures on SSL/TLS (among other protocols) to enforce its use under certain connections.
Luckily, you can use the following command to easily disable TLS protocol detection (for TShark testing purposes) during packet capture and restore PROXY protocol packet detection:
tshark --disable-protocol tls -Y proxy.src.ipv4 -V
This produces a similar output to our original command.
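If you run these inspections repeatedly, a small wrapper can assemble the TShark arguments for you. The following Python sketch is one possible approach, not an official tool. The helper names tshark_proxy_command and run_tshark are hypothetical, capture.pcap is a placeholder file name, and the sketch assumes tshark is installed and on your PATH.

```python
import subprocess
from typing import List, Optional


def tshark_proxy_command(
    capture_file: Optional[str] = None,
    verbose: bool = True,
    disable_tls: bool = False,
) -> List[str]:
    """Assemble a tshark argument list that filters for PROXY protocol packets."""
    cmd = ["tshark"]
    if capture_file is not None:
        cmd += ["-r", capture_file]            # read from a saved capture
    if disable_tls:
        cmd += ["--disable-protocol", "tls"]   # skip TLS dissection
    cmd += ["-Y", "proxy.src.ipv4"]            # display filter used in this guide
    if verbose:
        cmd.append("-V")                       # print full packet details
    return cmd


def run_tshark(cmd: List[str]) -> str:
    """Run tshark and return its standard output (requires tshark installed)."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


# Build the argument list without running it; "capture.pcap" is a placeholder.
command = tshark_proxy_command("capture.pcap", disable_tls=True)
print(" ".join(command))
```

Passing the arguments as a list avoids shell quoting issues, and keeping the command builder separate from run_tshark makes it easy to check the arguments before anything is executed.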
Testing new Proxy Protocol requests
While you can now grab important data, you'll sometimes need to generate PROXY protocol requests to test your data capture. You can send these requests with the following curl command:
curl --haproxy-protocol http://<your load balancer>/...
You'll need to add accept-proxy to the bind line of your configuration (for example, bind :80 accept-proxy) for this to work properly. This enables HAProxy to accept the PROXY protocol (both Version 1 and Version 2).
Running this command will verify that your connection is working, and that nothing is preventing PROXY protocol requests from flowing through HAProxy. Here's an example output using the curl --haproxy-protocol -v http://localhost command:
* Trying ::1:80...
* connect to ::1 port 80 failed: Connection refused
* Trying 127.0.0.1:80...
* Connected to localhost (127.0.0.1) port 80 (#0)
> PROXY TCP4 127.0.0.1 127.0.0.1 53606 80
> GET / HTTP/1.1
> Host: localhost
> User-Agent: curl/7.74.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 302 Found
< content-length: 0
< location: /hapee-stats
< cache-control: no-cache
<
* Connection #0 to host localhost left intact
Important header information at your fingertips
The PROXY protocol is an incredibly useful mechanism for transporting important connection information through HAProxy. This data is valuable to users who want more insight into their traffic, without having to make major functional concessions.
And while some steps are required to unpack that information, tools like TShark make the process pretty painless. The next time you want to inspect the contents of your PROXY protocol packets, you can do so without jumping through hoops or possessing niche technical knowledge.