In this presentation, Steven Le Roux of OVH describes how the HAProxy Stream Processing Offload Engine (SPOE) lets you build your own sophisticated solutions, such as custom tracing frameworks similar to OpenTracing. He describes the meaning and value of logs and metrics. He then explains how they can be captured as time series data. OVH has implemented a multi-stage time series database that aggregates data for various levels of retention targeting different use cases.
Transcript
Hello everyone. We’re going to speak about observability. Who is okay with the term observability? Not that many? Okay. So, what’s observability?
If I talk about logs and metrics, maybe now it speaks to you. Who among you is using a time-series database? Quite a bunch! Let’s take a vote. Who is using InfluxDB? Yeah. Prometheus? Yeah. OpenTSDB? One, okay, two. BigGraphite? Graphite? Not that many today. Yeah, okay. Something else, maybe, that I missed? Warp 10? Oh, okay, a few guys. Great. Who is operating HAProxy? Quite a bunch. So, there is actually a convergence between the time-series database and the HAProxy instance.
Observability with Logs and Metrics
The two pillars of observability are metrics and logs, but we will see that actually both are time series with different indexing strategies. The time-series database is really important at this stage. Let’s see the difference. HAProxy is really useful for logging.
This image comes from the HAProxy blog and shows the different things that you can get from the log. For example, you have counters, you have established connections, statuses. You have the queues, the connections. You have many insightful things in the log and actually it’s marvellous. When you operate a solution, having this kind of information is really a good thing.
However, indexing logs can be quite costly. I’m not saying, after the previous speakers, that Elasticsearch is not a good solution, but it can’t work for every workload.
On the opposite side of logs, you have metrics. Metrics are simple, measured data evolving over time. Speaking in terms of HAProxy, what are metrics? Client session rates, counters, response times, queues, etc. These kinds of metrics are exposed with the socket API or the Prometheus exporter.
So, there are metrics, but we saw on the previous slide that there are many metrics inside the logs too. What matters is how we handle them: we extract metrics from the log so that we store only metrics. Why? As you see in the slide, storing a log is actually not an issue because you have good storage strategies, but if you want to query it and use Kibana, for example, you have to index it.
Indexing logs has a cost, and this cost can be mitigated if you extract the values inside the logs, aggregate them in real time, for example, and just flush the corresponding metrics. Here is the difference. Above, you have the full log, and if you want to build an index on it, you have to index the different fields, and so on. But if you just extract the values, for example HTTP statuses, count them every 10 seconds, and flush the result, then you know that you had five 200 codes, and so on for the other statuses. What you have to store at this layer is only the timestamp and the metric. You are saving a lot of volume, a lot of bandwidth, a lot of things. It’s quite useful.
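To make that extraction concrete, here is a minimal Python sketch of the idea. The regex and the arrival-time bucketing are simplifying assumptions, not the actual OVH pipeline; it just counts status codes per 10-second window and keeps only the timestamp and the counts.

```python
import re
import time
from collections import Counter

# Illustrative pattern: grab the HTTP status code from an HAProxy HTTP log line.
# Real log formats vary, so adapt the regex to the log-format you actually use.
STATUS_RE = re.compile(r"\s([1-5]\d{2})\s\d+\s")  # status code followed by bytes_read

def aggregate_statuses(log_lines, window=10):
    """Bucket HTTP status codes into `window`-second intervals and yield
    (timestamp, status, count) tuples: the only data that gets stored."""
    buckets = {}
    for line in log_lines:
        match = STATUS_RE.search(line)
        if not match:
            continue
        # For simplicity we bucket by arrival time instead of parsing the log timestamp.
        bucket = int(time.time()) // window * window
        buckets.setdefault(bucket, Counter())[match.group(1)] += 1
    for ts, counts in sorted(buckets.items()):
        for status, count in counts.items():
            yield ts, status, count

# Instead of indexing the full log line, we only flush points such as
# (1560000000, "200", 5) and (1560000000, "503", 1) every 10 seconds.
```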
Let’s Observe (BPE)
Let’s observe how we did it at OVH. It’s important to say that we did it BPE: Before the Prometheus Era. There wasn’t a Prometheus exporter, so we actually wrote one. It’s not exactly a Prometheus exporter, because it exports an HTTP format that is similar to Prometheus, but we don’t use Prometheus, so it’s not quite the same. There was actually an existing HAProxy exporter, but we didn’t use it because it could not sustain our workload; we didn’t succeed in collecting metrics quickly enough. So, we had to rework one with performance in mind.
When you collect metrics… Willy said in the keynote that there are many, many insightful stats, but you see them and maybe you don’t get the meaning behind them. Now there is documentation for each one of those stats, which should be useful. But here, with this exporter, you can choose which of the metrics you want to get; you can select them at a fine grain.
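As an illustration of that kind of selection (not OVH’s exporter, just the general idea), a small Python sketch can query the stats socket with the standard show stat command and keep only a whitelist of columns:

```python
import csv
import io
import socket

# Keep only the columns you care about; the names below are standard
# fields from HAProxy's `show stat` CSV output.
WANTED = {"scur", "rate", "qcur", "hrsp_2xx", "hrsp_5xx"}

def scrape_stats(socket_path="/var/run/haproxy.sock"):
    """Query the HAProxy stats socket and yield a filtered dict of metrics
    per (proxy, server) pair."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(b"show stat\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    text = b"".join(chunks).decode()
    # The CSV header line is prefixed with '# '; strip it so DictReader parses cleanly.
    reader = csv.DictReader(io.StringIO(text.lstrip("# ")))
    for row in reader:
        yield (row["pxname"], row["svname"]), {
            name: value for name, value in row.items() if name in WANTED and value
        }
```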
Once we get those metrics, we export them with a small daemon on each load balancer that collects metrics with a DFO strategy. DFO is Disk FailOver: we flush the metrics to disk because if we lose the network or the service and we cannot push metrics, we will still have them stored, and when the network comes back we will flush them out again. This component is open source; it’s named Beamium and lives on the OVH GitHub.
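The Disk FailOver idea itself is simple; here is a hedged Python sketch of it. The spool directory, the push endpoint, and the use of the requests library are assumptions for illustration and say nothing about how Beamium is actually implemented:

```python
import glob
import os
import time

import requests  # assumed HTTP client for pushing to the metrics platform

SPOOL_DIR = "/var/spool/metrics"                         # hypothetical spool directory
PUSH_URL = "https://metrics.example.com/api/v0/update"   # hypothetical push endpoint

def push_or_spool(payload: str):
    """Try to push a batch of metrics; on failure, write it to disk instead."""
    try:
        requests.post(PUSH_URL, data=payload, timeout=5).raise_for_status()
    except requests.RequestException:
        os.makedirs(SPOOL_DIR, exist_ok=True)
        with open(os.path.join(SPOOL_DIR, f"{time.time_ns()}.metrics"), "w") as f:
            f.write(payload)

def flush_spool():
    """Once the network is back, replay spooled batches in order and delete them."""
    for path in sorted(glob.glob(os.path.join(SPOOL_DIR, "*.metrics"))):
        with open(path) as f:
            requests.post(PUSH_URL, data=f.read(), timeout=5).raise_for_status()
        os.remove(path)
```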
We push to a multi-stage time-series infrastructure. Why? Because at the first stage we have a live instance, which is in memory, and we use it for fine-grained operations, for example scaling, monitoring, etc. We also use it for aggregating data, because when you have a lot of data spread across different clusters you want to aggregate it. That’s actually why we have multiple stages.
At the first stage, we use a really short retention strategy, mostly for monitoring purposes. We first aggregate per frontend, backend, etc. on a customer scope. Then, we push these aggregated metrics to the second stage, which provides global insights into the platform. But still, at this stage, we don’t have long retention; we have a day’s retention because it’s still in memory. But we are global, so we have the total view for customers of how the global load balancing experience is behaving.
Then, we aggregate per customer with different metrics and push them to the cloud infrastructure, where we can have years of retention if we want.
So, we have really different strategies for time series, and it’s quite important because there is a huge reduction factor in terms of unique time series. The first layer is the fine-grained, raw data, with tens of millions of unique time series. Even if it’s not an issue to have them on the cloud, because we have many hundreds of millions of time series on the cloud, when you operate it’s better to manage your time series carefully. From, let’s say, a hundred million time series at the first stage, the second stage keeps only ten million, and we keep only a hundred thousand at the end for the customers. You can see that if we had to keep a hundred million time series with long retention, the cost wouldn’t be the same.
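One way to picture the cardinality reduction between stages is a roll-up that drops labels before forwarding. This is a hypothetical Python sketch, not the actual aggregation code:

```python
from collections import defaultdict

def rollup(points, keep_labels=("customer",)):
    """Sum raw datapoints into a coarser series keyed only by `keep_labels`,
    e.g. collapsing per-frontend or per-server series into one per-customer
    series before shipping them to the next stage."""
    out = defaultdict(float)
    for labels, ts, value in points:
        key = (tuple((k, labels[k]) for k in keep_labels), ts)
        out[key] += value
    return dict(out)

# Three per-server request counters collapse into a single per-customer counter.
raw = [
    ({"customer": "acme", "server": "srv1"}, 1560000000, 12.0),
    ({"customer": "acme", "server": "srv2"}, 1560000000, 7.0),
    ({"customer": "acme", "server": "srv3"}, 1560000000, 3.0),
]
print(rollup(raw))  # {((('customer', 'acme'),), 1560000000): 22.0}
```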
Why to Collect Metrics
Why collect metrics? Actually, we do a lot of things with metrics and you can predict the future. Yeah. How do we do it?
Here is an example based on memory, but it could be HTTP requests, for example. You see there is a trend in the graph. It’s rising, rising, rising, rising.
Oh, not good because since it’s memory, there is a limit, a hardware limit. Here we materialize it with the green line. We are going to hit the wall.
What we can do is extract the trend of the series and then forecast it.
If we forecast it, the point where the lines cross is when we hit the wall.
Since we know when we will hit the wall, we can anticipate and take action. We can say, “Oh, we have to move before the incident.” The trend is one way to forecast, but we can have other strategies. For example, we can forecast the global signal, not just the trend. We see that the crossing time is not exactly the same. Given the workload, the use case, the forecasting strategy, it can be different.
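A minimal sketch of the trend-based version, assuming a plain linear fit with NumPy rather than whatever forecasting runs in production:

```python
import numpy as np

def time_to_limit(timestamps, values, limit):
    """Fit a linear trend and estimate the timestamp at which it crosses `limit`.
    Returns None when the trend is flat or decreasing (no wall ahead)."""
    slope, intercept = np.polyfit(timestamps, values, 1)
    if slope <= 0:
        return None
    return (limit - intercept) / slope

# Memory usage sampled hourly, rising ~0.5 GB per hour, hardware limit at 64 GB.
ts = np.arange(0, 24 * 3600, 3600)
mem_gb = 40 + ts / 3600 * 0.5
print(time_to_limit(ts, mem_gb, 64))  # seconds from t=0 until we hit the wall (~172800)
```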
What do we do now? Well, we can alert. We can annotate a dashboard. We can autoscale a service. For example, if it had been HTTP requests: okay, we are going to hit the wall; we need more instances, so we can actually scale the service. This is all achieved in a single request to our time series database. It’s not analytics; it’s not big data; it’s just one query for a given load balancing experience for a customer that will say, “Okay, now you need to act.” We could act through the time series database, but we don’t do it.
There is another approach to this. Here, I said that it’s a single query inside the time-series database, but we also have a different infrastructure for AutoML, Automated Machine Learning. We train models so that we can go further than this, because this example is really simple: it’s a global trend, you fit a line, you forecast it, and the job is done. But what if you have seasons in your signal? What if your signal has a weekly seasonality? Or a monthly seasonality? If I need to forecast but I don’t have the global picture, my forecast doesn’t reflect what will come. This is where a trained model will help us to anticipate.
If you want to try this, it’s free. It’s a free service from OVH; you can try it on labs.ovh.com. It’s called Prescience and there is a time series forecasting algorithm based on SARIMA. SARIMA is a seasonal autoregressive integrated moving-average model, and it’s quite interesting in terms of time series forecasting.
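Purely as an illustration, a seasonal forecast can be produced with the SARIMAX implementation from statsmodels on a synthetic signal; the model orders below are arbitrary and not what Prescience uses:

```python
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Synthetic hourly signal: a rising trend plus a daily (24-point) season.
n = 24 * 14
y = 100 + 0.05 * np.arange(n) + 10 * np.sin(2 * np.pi * np.arange(n) / 24)

# The (p,d,q)(P,D,Q,s) orders here are purely illustrative; in practice they
# would be selected per series, for example by an AIC search.
model = SARIMAX(y, order=(1, 1, 1), seasonal_order=(1, 1, 1, 24))
result = model.fit(disp=False)

forecast = result.forecast(steps=48)  # the next two days, seasonality included
print(forecast[:5])
```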
We can also detect anomalies. Along with those algorithms, we can run ESD or z-score tests to measure the deviation. Once you have this, you will see on the graph that the outliers are your most uncommon values. The spikes here, where I have annotated them, represent points where I could take action. If you see latency spikes or weird events, you can catch them. You get this from your time series. So, it’s really insightful and you can do a lot of things with a proper time series database.
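The z-score test itself is only a few lines. This sketch flags the indices whose deviation from the mean exceeds a chosen threshold, which is how a latency spike stands out:

```python
import numpy as np

def zscore_outliers(values, threshold=2.0):
    """Return the indices of points whose deviation from the mean exceeds
    `threshold` standard deviations: the most uncommon values."""
    values = np.asarray(values, dtype=float)
    z = (values - values.mean()) / values.std()
    return np.where(np.abs(z) > threshold)[0]

latencies_ms = [12, 11, 13, 12, 14, 250, 12, 13, 11, 12]
print(zscore_outliers(latencies_ms))  # -> [5], the 250 ms spike
```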
New in this Era of Observability
All this is pretty classic, right? It’s just operating a service, collecting metrics, etc. Now, what is new in this era of observability? Well, it’s SPOE, actually. I was planning to explain what SPOE is, but Pierre did it before me, so I will shorten the presentation a bit.
We all tend to speak about HAProxy as a Swiss Army knife, you know. Actually, SPOE is like adding a bazooka as a tool inside the Swiss Army knife, because this kind of thing turns HAProxy into a framework that you can extend, and you can do a lot of things with it.
Here is a simple example. The idea was to explain how we could do OpenTracing, for example, based on SPOE. What is SPOE? Basically, it’s based on the filter engine and it triggers on events. For example, you trigger on-frontend-http-request, and when you hit these kinds of events, you can act. Here, I could get a trace-id and span-id from a request. Okay?
On the response I can get the same, and if I close the time window between the request and the response, I can flush the span to the tracing framework so that I get visibility, observability, and full tracing between my clients, load balancer, servers, etc.
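To illustrate only the idea, here is the kind of span bookkeeping an SPOE agent could do. The SPOP frame handling is omitted, and the argument names, the status field, and the emit_span backend are placeholders, not a real agent implementation:

```python
import time

# Assume an agent framework that hands us each SPOE message's arguments as a
# dict. Argument names like "trace_id" stand for whatever your spoe-message
# actually declares.

open_spans = {}  # (trace_id, span_id) -> request start time

def on_frontend_http_request(args):
    """Handler for the message triggered on-frontend-http-request."""
    open_spans[(args["trace_id"], args["span_id"])] = time.time()

def on_http_response(args):
    """Handler for the response event: close the time window and flush the span."""
    key = (args["trace_id"], args["span_id"])
    start = open_spans.pop(key, None)
    if start is None:
        return
    emit_span({
        "trace_id": key[0],
        "span_id": key[1],
        "duration_ms": (time.time() - start) * 1000,
        "status": args.get("status"),
    })

def emit_span(span):
    print(span)  # placeholder: send to your tracing framework's collector instead
```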
I understand that HAProxy is implementing OpenTracing, right? Or something like that? There is something happening along those lines, but you could have your own tracing solution and maybe OpenTracing wouldn’t be compatible with it. This lets you implement your own strategy for tracing or authentication, as we saw previously.
Apparently, the HAProxy team has a message for you, maybe.
So, that was a quick story about observability. Thank you!