Magnet.me  -  The smart network where students and professionals find their internship or job.

Senior / Principal Site Reliability Engineer

Posted 2 Oct 2025
Work experience: 7 to 20 years
Full-time / part-time: Full-time
Job function
Degree level
Required language: English (Fluent)

Our Infrastructure Team

We’re taking the Horizon franchise online and building the platform for a real-time multiplayer game! The Online Infrastructure team works closely with designers and game developers to deliver a platform that is scalable, reliable, and flexible for fast iteration. We care deeply about sound engineering, operational excellence, and pragmatic automation.

By joining us, you’ll shape the backbone of our online experience and collaborate daily with artists, designers, developers, and game programmers in an ambitious, creative environment. Our aim is to deliver infrastructure and systems that developers love to build on and players never have to think about.

Our Technology Stack

Our infrastructure is centered on CNCF technologies and Kubernetes.

  • Game servers run on our own engine and communicate over a custom UDP-based protocol. We use Agones to orchestrate them in Kubernetes.
  • Backend and game-facing services use established web tech (e.g., Spring). The game communicates via REST APIs and maintains a persistent HTTPS connection for notifications.
  • We run at scale on a hybrid setup across AWS EKS and on-premises data centers, relying on managed services including S3, SQS, Elastic Load Balancers, and RDS.
  • Infrastructure is managed as code with Terraform, Terragrunt, and Crossplane in Git, with automation via Jenkins and the Argo project family, following GitOps principles.
  • Observability includes Thanos, Loki, and Grafana, alongside additional logging/monitoring to ensure performance, reliability, and security.

We are designing for global reach, low latency, and sustained growth.

What you will do

  • Design, implement, and operate the infrastructure that powers our online game platform, with a focus on availability, performance, security, and cost.
  • Drive automation across build, deploy, and operations; reduce toil, codify runbooks, and harden golden paths.
  • Define and instrument SLOs/SLIs; build reliable alerting; lead data-driven capacity planning and readiness for major launches and events.
  • Partner with game and backend teams to improve service reliability, simplify deployments, and optimize runtime performance.
  • Troubleshoot complex production issues across networking, Kubernetes, Linux, and application layers. Perform deep post-incident analysis and implement durable fixes.
  • Champion GitOps practices and leverage the Argo ecosystem to enable safe, frequent, and reversible changes.
  • Mentor teammates, provide technical guidance on design reviews, and act as a steady point of coordination during incidents or cross-team initiatives.
  • Participate in, and help maintain, a healthy and sustainable on-call rotation.

Who you are

We would like to hear from you if you are an experienced Site Reliability Engineer and you:

  • Have a proven track record of operating and scaling large distributed systems in production (e.g., high concurrency, high throughput, or latency-sensitive workloads), with senior-level depth typically built over 7+ years.
  • Are fluent in Kubernetes and the surrounding ecosystem, and comfortable running mission-critical workloads on AWS EKS.
  • Use Infrastructure as Code daily (Terraform, Crossplane, Terragrunt) and treat configuration, policy, and runbooks as software.
  • Deeply understand SRE practices: SLOs/SLIs, error budgets, incident response, blameless postmortems, chaos/DR drills, and capacity management.
  • Communicate crisply with engineers and non-engineers alike, take ownership in ambiguous situations, and naturally become a go-to person when decisions and coordination matter most.
  • Bonus: experience in gaming or other real-time domains (streaming, voice/video) and with platform security at scale (secrets management, isolation, DDoS considerations).

If you think you’re up for the challenge, we’d love to hear from you! Be sure to submit your CV and a Cover/Motivation Letter; we like learning a bit about your background and your reasons for applying at Guerrilla.

Please note: This position is based in our studio in the heart of Amsterdam. Guerrilla offers relocation and immigration support.

Guerrilla is one of Europe's leading game development companies and a wholly owned subsidiary of Sony Interactive Entertainment Europe. We started in 2000 and have pushed the boundaries of technical and artistic excellence in our games ever since. Today, we employ more than 270 professionals of 25 different nationalities. Our studio is located in the cultural and historical center of Amsterdam, The Netherlands, a great place to work and play.

Entertainment
Amsterdam
350 employees