
Oops, I Wrote a Compiler (While Trying to Cut Logging Costs)

Anyone here using Datadog?

It’s a wonderful tool that can quickly become expensive. My client asked me to reduce their logging costs by cutting unnecessary sources; the company was feeding logs into Datadog from almost every source imaginable.

The costs were horrible. Consider 5 TB of logs. No, not monthly. Daily. That adds up to roughly 150 TB a month, or 1.8 PB a year. Petabytes.

For those who like mathematical notation it’s:

1.8 PB = 1.8 × 10^3 TB = 1.8 × 10^6 GB = 1.8 × 10^9 MB

So, to store a year of it, we’d need roughly:

- 1.25 billion floppy disks (remember those?). Laid side by side, they’d form a road about 3,000 km long, roughly the distance from Poland to Barcelona and back.
- 8,000 standard 256 GiB SSDs. Every year.
- 600 million iPhone photos, or one photo per second for 19 years straight.
- 383,000 DVD discs. Stack them on top of each other and you’d get a tower almost 460 meters high: about 1.5× the Eiffel Tower, and slightly taller than the Empire State Building. Not quite Burj Khalifa yet, but… give them two years.
- 900,000 VHS tapes, or five fully packed TIR trucks.
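The back-of-the-envelope arithmetic behind those comparisons is easy to reproduce. A quick sketch, where the capacities are assumed nominal values (1.44 MB floppies, 4.7 GB single-layer DVDs, ~3 MB per photo), not exact measurements:

```python
# Rough storage equivalents for 1.8 PB of logs per year (decimal units).
BYTES_TOTAL = 1.8 * 10**15

FLOPPY = 1.44 * 10**6          # 1.44 MB per 3.5" floppy disk
DVD = 4.7 * 10**9              # 4.7 GB single-layer DVD
PHOTO = 3 * 10**6              # ~3 MB per photo (assumption)
DVD_THICKNESS_M = 0.0012       # 1.2 mm per disc

floppies = BYTES_TOTAL / FLOPPY                 # ~1.25 billion
dvds = BYTES_TOTAL / DVD                        # ~383,000
stack_m = dvds * DVD_THICKNESS_M                # ~460 m
photos = BYTES_TOTAL / PHOTO                    # ~600 million
photo_years = photos / (365 * 24 * 3600)        # at one photo per second

print(f"{floppies:.2e} floppies, {dvds:,.0f} DVDs ({stack_m:.0f} m stack), "
      f"{photos:.1e} photos ({photo_years:.0f} years at 1/s)")
```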

Tuning cluster autoscaler

Background

During one of my assignments, we were evaluating Karpenter. While it’s a rather cool piece of software, it caused us some pain. The general idea was to deploy Grafana Loki and use an autoscaling tool to maintain node pools automatically.

Karpenter seemed like the perfect tool for this use case. However, it had some undesired side effects (possibly due to my misconfiguration): it was quite aggressive in adding and removing nodes, which caused disruptions.
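For context, Karpenter exposes knobs for exactly this kind of aggressiveness. A minimal sketch of the sort of NodePool disruption settings one might tune, assuming Karpenter's v1 API; the name and values here are illustrative, not our actual configuration:

```yaml
# Illustrative NodePool sketch: slow down consolidation and cap how many
# nodes Karpenter may disrupt at once. Values are assumptions.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: loki            # hypothetical pool for the Loki workload
spec:
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 10m   # wait before consolidating underutilized nodes
    budgets:
      - nodes: "10%"        # disrupt at most 10% of nodes at a time
```

Raising `consolidateAfter` and tightening the disruption budget trades some cost efficiency for stability, which is often the right call for stateful workloads like Loki ingesters.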