Traffic mirroring makes it possible to stream production traffic to a test or staging environment. Use the HAProxy Traffic Shadowing agent to enable mirroring.
The HAProxy Stream Processing Offload Engine (SPOE) lets you stream data to an external agent in real time where it can be processed by a programming language of your choice, including C, .NET Core, Go, Lua and Python. This opens the door to extending HAProxy in many ways. We described the architecture of the SPOE in our blog post Extending HAProxy with the Stream Processing Offload Engine.
As part of the HAProxy 2.0 release, a new agent was introduced that uses the SPOE to mirror traffic to another environment. The Traffic Shadowing agent captures traffic, or a percentage of it, and sends a copy to another URL. You’d use it to send production traffic to a QA testing environment in order to validate a new version of a feature before it’s made public. That way, you reduce the risk of discovering a bug only after the feature is released.
In this blog post, we’ll demonstrate how to use mirroring to send samples of production traffic to your QA environment.
Real-World Traffic Without the Real-World Impact
Imitating real users is hard. Real users stress an application in ways that are difficult to reproduce artificially. For example, two users may perform unrelated tasks in different parts of the application simultaneously, which then triggers an unknown bug caused by the confluence of their actions. Race conditions, deadlocks, and other threading problems often surface under a realistic load.
Traffic mirroring, or traffic shadowing, is a technique in which live, production traffic is copied and sent to two places: the original production servers and a staging or test environment. That test environment may be segregated into a separate network that is not publicly accessible. As long as the requested URLs and parameters match the new version of the feature being tested, then it’s easy to validate that the new version is as close to bug-free as possible.
The value of traffic mirroring is that you can do this without impacting your users. Mirroring traffic with the Traffic Shadowing daemon is fire and forget: when requests are copied and sent to the test environment, there is almost no impact on the time needed to process them, because the client does not wait for a response from the test server. You can also mirror only a portion of the traffic, so your test environment doesn’t need the infrastructure to handle production-level request volumes.
Setting up Traffic Mirroring
Clone or download the source code repository and follow the instructions for building it. For example, on Ubuntu Bionic, you would build and install it like this:
$ sudo apt update
$ sudo apt install -y autoconf automake build-essential git libcurl4-openssl-dev libev-dev libpthread-stubs0-dev pkg-config
$ git clone https://github.com/haproxytech/spoa-mirror
$ cd spoa-mirror
$ ./scripts/bootstrap
$ ./configure
$ make all
$ sudo cp ./src/spoa-mirror /usr/local/bin/
Add the --enable-debug flag when you call configure if you want to see more verbose output from the agent, like this:
$ ./configure --enable-debug
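Before running the build, it can help to confirm that the command-line tools it relies on are actually on your PATH. The following is a minimal sketch, not part of the spoa-mirror project: the missing_tools function name is hypothetical, and the tool list is only an assumption based on the packages installed above.

```shell
#!/bin/sh
# missing_tools TOOL...: print each named tool that is not found on PATH.
# (Hypothetical helper for illustration; not shipped with spoa-mirror.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Tools assumed to be needed by the build steps (names may differ per distro).
missing=$(missing_tools git autoconf automake make pkg-config)
if [ -n "$missing" ]; then
    echo "missing build tools:" $missing
else
    echo "all build tools found"
fi
```

If anything is reported missing, install the corresponding package before calling ./scripts/bootstrap.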
After you’ve followed these steps, the spoa-mirror program will be available on your PATH. Start it by passing the --runtime argument, which sets how long it should run before exiting (e.g. --runtime 1h for one hour or 0 for unlimited time), and the --mirror-url argument, which sets the URL where you want to send the mirrored traffic.
$ spoa-mirror --runtime 0 --mirror-url http://test.local --logfile /var/log/haproxy-mirror.log
The agent listens on all IP addresses at port 12345 by default. You can change these settings with the --address and --port arguments. You can also pass it the --daemonize argument to run the program in the background.
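If you start the agent from scripts, a small wrapper can keep the flags in one place. The sketch below only assembles and prints a command line using the flags described above. The mirror_cmd function and its argument order are hypothetical, chosen for illustration.

```shell
#!/bin/sh
# mirror_cmd URL [ADDRESS] [PORT]: print a spoa-mirror command line.
# Only flags mentioned in this post are used; the helper itself is illustrative.
mirror_cmd() {
    url="$1"            # where mirrored requests are sent
    address="${2:-}"    # optional listen address (agent default: all addresses)
    port="${3:-}"       # optional listen port (agent default: 12345)

    cmd="spoa-mirror --runtime 0 --mirror-url $url"
    if [ -n "$address" ]; then cmd="$cmd --address $address"; fi
    if [ -n "$port" ]; then cmd="$cmd --port $port"; fi
    echo "$cmd"
}

# Print the command instead of running it, so it can be reviewed first.
mirror_cmd http://test.local 127.0.0.1 12345
```

Printing rather than executing makes it easy to paste the result into a terminal or into a command directive.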
Starting with HAProxy version 2.0, you can have HAProxy manage the lifetime of the agent. Use the HAProxy Process Manager to control starting the daemon when HAProxy starts. Add a program section that contains a command directive to your HAProxy configuration, as shown:
program mirror
    command spoa-mirror --runtime 0 --mirror-url http://test.local
When you start HAProxy, you’ll see that the spoa-mirror agent is running alongside it:
$ sudo systemctl status haproxy
haproxy.service - HAProxy Load Balancer
  Main PID: 1177 (haproxy)
  Tasks: 14 (limit: 1152)
  CGroup: /system.slice/haproxy.service
          ├─1177 /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -S /run/haproxy-master.sock -sf 1209
          ├─2081 spoa-mirror --runtime 0 --mirror-url http://localhost:81 --address 127.0.0.1
          └─2082 /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -S /run/haproxy-master.sock -sf 1209
Now the daemon is set up to receive requests and forward them to the test server.
If you’ve enabled debugging, you can send the agent’s log messages to a file by adding the --logfile flag to the spoa-mirror command. Prefix the filename with either w: to overwrite the file if it exists or a: to append to the file, such as --logfile a:/var/log/spoa-mirror.log.
The next step is to configure HAProxy to send traffic to the agent. Add a filter spoe directive to your frontend that references a file named mirror.conf, as shown:
# Production frontend
frontend fe_main
    mode http
    bind :80
    option http-buffer-request
    filter spoe engine mirror config /etc/haproxy/mirror.conf
    default_backend be_servers
We will cover what goes into mirror.conf in the next section.
Next, in addition to the backend that holds your production servers, add a backend that contains the address of the spoa-mirror agent. Here’s an example:
# Production servers
backend be_servers
    mode http
    server s1 prodserver:80

# Mirror agents
backend mirroragents
    mode tcp
    balance roundrobin
    timeout connect 5s
    timeout server 5s
    server agent1 localhost:12345
In this example, production traffic is received at port 80. It is then sent to the be_servers backend like normal, but it’s also mirrored to the mirroragents backend, which relays it to the agent listening at localhost:12345. For this to work, you have to set up mirror.conf, which you’ll see in the next section.
The mirror.conf File
The filter spoe directive in your frontend lists a config parameter that points to an SPOE configuration file to use. Create that file now at /etc/haproxy/mirror.conf. An engine parameter sets a label that must match a section in mirror.conf. We’ve arbitrarily named it mirror in this case. It’s only important that they match. Add the following to the file:
[mirror]
spoe-agent mirror
    log global
    messages mirror
    use-backend mirroragents
    timeout hello 500ms
    timeout idle 5s
    timeout processing 5s

spoe-message mirror
    args arg_method=method arg_path=url arg_ver=req.ver arg_hdrs=req.hdrs_bin arg_body=req.body
    event on-frontend-http-request
With this file, you configure how HAProxy will communicate with the agent(s). The file begins with an engine name, mirror, in square brackets. As mentioned, this must match the engine value set on the filter line in the HAProxy configuration.
The log global line means that events, such as when HAProxy sends data, will be logged to the same output defined by the log statement in the global section of the HAProxy configuration. The messages line is a space-delimited list of labels that match up with spoe-message sections. The use-backend line specifies which backend in the HAProxy configuration holds the mirror agents.
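Because several names have to line up across the two files, a quick text check can catch a mismatch before you restart HAProxy. The sketch below is a rough illustration, not an HAProxy tool: the check_mirror_names function is hypothetical, and it assumes the simple one-directive-per-line layout used in this post.

```shell
#!/bin/sh
# check_mirror_names HAPROXY_CFG SPOE_CFG
# Returns 0 when the engine and backend names line up, 1 otherwise.
# (Hypothetical helper; it only inspects text and does not validate syntax.)
check_mirror_names() {
    cfg="$1"; spoe="$2"

    # Engine name from: filter spoe engine <name> config <file>
    engine=$(awk '$1 == "filter" && $2 == "spoe" {
        for (i = 3; i < NF; i++) if ($i == "engine") { print $(i + 1); exit }
    }' "$cfg")

    # Backend name from: use-backend <name>
    backend=$(awk '$1 == "use-backend" { print $2; exit }' "$spoe")

    # The SPOE file must contain a [<engine>] section.
    awk -v e="[$engine]" '$1 == e { found = 1 } END { exit !found }' "$spoe" \
        || { echo "no [$engine] section in $spoe"; return 1; }

    # The HAProxy configuration must define the backend named by use-backend.
    awk -v b="$backend" '$1 == "backend" && $2 == b { found = 1 } END { exit !found }' "$cfg" \
        || { echo "backend '$backend' not defined in $cfg"; return 1; }

    echo "ok: engine=$engine backend=$backend"
}
```

You might call it as check_mirror_names /etc/haproxy/haproxy.cfg /etc/haproxy/mirror.conf. It complements, rather than replaces, HAProxy's own configuration check.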
You can also set timeouts for various parts of the HAProxy-to-agent communication. The timeout hello setting limits how long HAProxy will wait for an agent to acknowledge a connection. The timeout idle setting limits how long HAProxy will wait for an agent to close an idle connection. The timeout processing setting limits how long an agent is allowed to process an event.
A spoe-message section defines which HAProxy fetch methods will be used to capture data to send to the agents. The label here, mirror, is expected by this particular agent. For traffic mirroring, we capture the following:
the HTTP method
the URL path
the version of HTTP
all HTTP headers
the request body (note that this requires option http-buffer-request in the HAProxy configuration)
Data is sent every time that the on-frontend-http-request event fires, which is before the evaluation of http-request rules on the frontend side. Once you have this file in place, restart HAProxy for it to take effect. You should see requests to the Traffic Shadowing daemon appear in the log at /var/log/haproxy.log:
SPOE: [mirror] <EVENT:on-frontend-http-request> sid=0 st=0 0/1/0/0/1 1/1 0/0 0/1
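To keep an eye on these entries, you could pull the leading fields out of each line with sed. This is a minimal sketch based only on the sample line above. The parse_spoe_log function is hypothetical, and reading st as a status code (with 0 meaning no error) is an assumption you should confirm against the SPOE documentation.

```shell
#!/bin/sh
# parse_spoe_log LINE: print the agent, event, stream ID, and st value
# from an SPOE log line. Prints nothing if the line does not match.
# (Hypothetical helper; the trailing timer and counter fields are ignored.)
parse_spoe_log() {
    printf '%s\n' "$1" | sed -n \
        's/.*SPOE: \[\([^]]*\)] <EVENT:\([^>]*\)> sid=\([0-9]*\) st=\([0-9]*\).*/agent=\1 event=\2 sid=\3 st=\4/p'
}

parse_spoe_log 'SPOE: [mirror] <EVENT:on-frontend-http-request> sid=0 st=0 0/1/0/0/1 1/1 0/0 0/1'
# prints: agent=mirror event=on-frontend-http-request sid=0 st=0
```

Piping the log file through the same sed expression and filtering on the st value is one way to spot requests that were not mirrored cleanly.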
Tuning the Mirrored Traffic
There are a few ways to tune the traffic that gets mirrored. For one thing, you can add an ACL that limits the requests that get captured. For instance, if you only wanted to mirror traffic for requests to the /search feature on your site, you would ignore all requests except those that have a URL path beginning with /search, as shown:
spoe-message mirror
    args arg_method=method arg_path=url arg_ver=req.ver arg_hdrs=req.hdrs_bin arg_body=req.body
    event on-frontend-http-request if { path_beg /search }
You can also define named ACLs that do the same thing:
spoe-message mirror
    args arg_method=method arg_path=url arg_ver=req.ver arg_hdrs=req.hdrs_bin arg_body=req.body
    acl is_search path_beg /search
    event on-frontend-http-request if is_search
Or suppose you didn’t want to capture all traffic, but rather only a portion of it. You would add an ACL that collects a random sample of requests. In the next example, rand(100) generates a random integer from 0 to 99, and the request is mirrored only if that number is less than 10, which samples roughly 10% of requests:
spoe-message mirror
    args arg_method=method arg_path=url arg_ver=req.ver arg_hdrs=req.hdrs_bin arg_body=req.body
    event on-frontend-http-request if { rand(100) lt 10 }
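As a sanity check on the sampling rate, the sketch below counts how many of the possible values pass a threshold of 10 with each comparison operator. It assumes, per the HAProxy documentation for rand, that rand(100) yields integers from 0 through 99. The count_sampled function is hypothetical and only does the arithmetic; it does not talk to HAProxy.

```shell
#!/bin/sh
# count_sampled OP: count how many values in 0..99 satisfy "<value> OP 10",
# where OP is "lt" or "le" as written in the ACL.
# (Hypothetical helper for checking the arithmetic.)
count_sampled() {
    op="$1"
    n=0
    count=0
    while [ "$n" -lt 100 ]; do
        if [ "$n" "-$op" 10 ]; then
            count=$((count + 1))
        fi
        n=$((n + 1))
    done
    echo "$count"
}

count_sampled lt   # prints 10, i.e. 10 of 100 values
count_sampled le   # prints 11, i.e. 11 of 100 values
```

Under that assumption, lt 10 gives a 10% sample, while le 10 would mirror slightly more than intended.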
Your ACL statements can also check values from map files. For example, you can switch mirroring on or off by using a map file that contains a key-value pair like mirroring on. Then, check the map file from your mirror.conf file like this:
spoe-message mirror
    args arg_method=method arg_path=url arg_ver=req.ver arg_hdrs=req.hdrs_bin arg_body=req.body
    acl mirroring_on str(mirroring),map(/etc/haproxy/mirroring.map) -m str on
    event on-frontend-http-request if mirroring_on
Use the HAProxy Runtime API to change the value in the map file to off.
# Change mirroring to off
$ echo "set map /etc/haproxy/mirroring.map mirroring off" | nc 127.0.0.1 9999
# Show current value
$ echo "show map /etc/haproxy/mirroring.map mirroring" | nc 127.0.0.1 9999
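The map file itself is plain text with one key and value per line. Here is a small sketch for writing and reading it on disk. The write_mirroring_map and map_value helpers are hypothetical, and the demo uses a temporary file rather than /etc/haproxy/mirroring.map.

```shell
#!/bin/sh
# write_mirroring_map FILE STATE: write a one-entry map file
# containing "mirroring on" or "mirroring off".
write_mirroring_map() {
    case "$2" in
        on|off) printf 'mirroring %s\n' "$2" > "$1" ;;
        *) echo "state must be on or off" >&2; return 1 ;;
    esac
}

# map_value FILE KEY: print the value stored for KEY, or nothing if absent.
map_value() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# Demo against a temporary file.
demo_map=$(mktemp)
write_mirroring_map "$demo_map" on
map_value "$demo_map" mirroring    # prints: on
rm -f "$demo_map"
```

Keep in mind that Runtime API changes are held in memory and are not written back to the file, so update the file as well if the setting should survive a reload.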
You can also use the Data Plane API to add or remove filter spoe lines from the HAProxy configuration file dynamically. In the following example, we show the existing filters, then add a new one, and then remove it:
# Show existing filters
$ curl -X GET --user admin:mypassword "http://localhost:5555/v1/services/haproxy/configuration/filters?parent_name=fe_main&parent_type=frontend"
# Add a filter line
$ curl -X POST --user admin:mypassword "http://localhost:5555/v1/services/haproxy/configuration/filters?parent_name=fe_main&parent_type=frontend&version=1" -H "Content-Type: application/json" -d '{"id": 0, "spoe_config":"/etc/haproxy/spoa.conf", "spoe_engine":"mirror", "type": "spoe"}'
{"id":0,"spoe_config":"/etc/haproxy/spoa.conf","spoe_engine":"mirror","type":"spoe"}
# Remove a filter line
$ curl -X DELETE --user admin:mypassword "http://localhost:5555/v1/services/haproxy/configuration/filters/0?parent_name=fe_main&parent_type=frontend&version=2" -H "Content-Type: application/json"
Use the Data Plane API to fully configure your load balancer using REST API commands.
Tips for Making the Most of Traffic Mirroring
I’ll leave you with a few ways to get the most out of traffic mirroring:
Set up monitoring and compare the errors you get from your production servers with those you get from the new version to which you’re mirroring traffic. Having a monitoring strategy in place will be key to validating a release.
Make use of HAProxy’s built-in metrics, which you can consume via the HAProxy Stats page or the Prometheus module, to see whether the new version of your feature performs better or worse.
Make sure that the feature you’re testing has URL paths and parameters that match the existing feature so that it is forward compatible with mirrored traffic. Forward compatibility may be a valuable test in and of itself.
Conclusion
In this blog post, you got a tour of the new Traffic Shadowing daemon, which uses the Stream Processing Offload Engine to capture live traffic and mirror it to a secondary URL. This is especially useful for vetting new versions of features before they’re released to the public. The best thing about it is your production traffic won’t be impacted by the mirroring. It’s a fire-and-forget process where the downstream client doesn’t need to wait for the mirroring agent to respond.