$ whoami

David Pigna

Senior DevOps / SRE

I operate critical high-availability platforms, with experience in large-scale streaming, reliability, automation and observability.

Large-scale streaming · Millions of users · Production Kubernetes · Multi-cloud · Incident response

LinkedIn · GitHub

$ ls ./services

What I can do for you

I help teams stabilize, automate and scale production platforms, focusing on reliability, observability and real-world operations.

Platform Engineering

Your platform grows but operations become unpredictable. I design and operate Kubernetes clusters that scale without surprises: safe deployments, stable workloads and zero downtime on every release.

Kubernetes · Helm · Argo CD · Argo Rollouts · Karpenter

Infrastructure as Code

Manually built infrastructure that nobody knows how to reproduce. I codify and automate it with Terraform, Terragrunt and Ansible so every environment is reproducible, auditable and free from manual work.

Terraform · Terragrunt · Ansible · AWS · GCP · OCI

Observability & SRE

You find out about problems only after users have reported them. I implement full observability stacks with metrics, logs and SLO-oriented alerts so you see problems before they impact your users.

Prometheus · Grafana · Loki · OpenTelemetry · SLOs · Alerting

CI/CD & Automation

Manual, slow or fragile deployments that block your team. I design continuous delivery pipelines that shorten feedback cycles, eliminate manual steps and make getting to production boring, in the best way.

GitHub Actions · GitLab CI/CD · Jenkins · Concourse

Troubleshooting & On-call

Incidents that repeat, postmortems that lead nowhere. I solve production problems with real root cause analysis and structural remediation so the same incident doesn't wake anyone up at 3 a.m. again.

Incident Management · Root Cause Analysis · Postmortem

Networking & Databases

Unexplained latency, databases that become everyone's bottleneck. I operate and optimize cloud networks and both relational and NoSQL databases in production, with a focus on performance, availability and reliable recovery.

Networking · PostgreSQL · Redis · MongoDB

$ grep -r "on-call" ./incidents/

When teams reach out

I usually get called when...

Kubernetes is growing but nobody wants to touch it
Incidents are reported by users before alerts fire
Deployments are manual, slow, or fragile
Nobody knows how to reproduce the infrastructure
The team needs a solid reliability foundation before the next stage of growth

$ cat tech-stack.yaml

Technical stack

Orchestration & Containers

Kubernetes
Helm
Docker
Containerd
Argo Rollouts
Karpenter

IaC & Automation

Terraform
Terragrunt
Ansible

Cloud

AWS
GCP
Azure
OCI

CI/CD

GitHub Actions
GitLab CI/CD
Jenkins
Concourse CI
Argo CD

Observability

Prometheus
Grafana
Loki
Alertmanager
OpenTelemetry

Databases

PostgreSQL
MySQL
Redis
Valkey
MongoDB
Elasticsearch

Networking & Security

VPC
VPN
Nginx
Istio
cert-manager
Vault

Scripting & Languages

Bash
Python
Go for tooling

$ cat about.md

About me

I'm a Senior DevOps / SRE with experience in the reliability and operation of large-scale streaming platforms, focused on availability, automation and troubleshooting of distributed systems serving millions of users.

Reliability, automation and observability come first in my work. I believe well-built infrastructure is the kind that doesn't need constant attention, because it was designed, automated and monitored from day one.

I work independently with clients who take operating their systems seriously: scaling teams, startups that need a solid foundation, or companies looking to improve their reliability posture.

Outside the terminal

TTRPGs

Player and enthusiast of tabletop role-playing games. Founder of dadomanija.com, a Spanish-language news and community site for TTRPGs. I also developed the website for the publisher MitoRol.

dadomanija.com · mitorol.com

Music & Production

Active musician currently studying Music Production. The same attention to detail I bring to systems, I bring to sound.

$ ping david

Let's work together

Reach out if you're facing reliability, automation, or operations challenges, whether it's a specific project, ongoing support, or an initial assessment.

How I usually engage

Reliability assessment · K8s / Platform review · Observability & SLOs · Incident remediation · Ongoing SRE

davidpigna@gmail.com · LinkedIn

response.json

{
  "availability": "open to discuss",
  "mode": "remote",
  "timezone": "UTC-3 / Argentina",
  "languages": ["es", "en"]
}