Overview
At Blizzard Entertainment, our Site Reliability Engineers (SREs) combine systems expertise with software engineering patterns to help define, create, and support the architecture, build systems, orchestration, and operations of services across the business. The role is made up of dedicated engineers who focus on evangelizing reliability-as-a-feature through monitoring, service-level objectives, automation, everything-as-code, and testing.
Blizzard's games and platforms reach a global audience of passionate gamers. The scale is massive and the challenges are very real, but the wise application of technology is what keeps it all running reliably with minimal guidance. Our Site Reliability Engineers are at the heart of this work, working directly with the engineering teams from idea to launch to deliver the most epic (and reliable!) experiences... ever.
Responsibilities
- Being part of an on-call rotation to help find a resolution during incidents
- Hosting blameless postmortems to share findings, discover gaps, embrace transparency, and improve reliability across our services
- Building positive and collaborative relationships across the company
- Employing your systems knowledge to triage problems and tune resource usage
- Championing automation to reduce toil and increase development velocity
- Helping define and instrument Service-Level Objectives to ensure epic player experiences
- Applying Configuration Management to build and maintain consistency across services
- Building Terraform configs to manage infrastructure in public and private clouds
- Supporting and improving build pipelines with Jenkins, Argo, and/or Spinnaker
- Adopting Containers and Kubernetes for new and existing services
- Applying everything-as-code methodologies across configuration, infrastructure, orchestration, and elsewhere
You May Succeed If You
- Love to solve novel and exciting problems
- Dislike solving the same problems over and over, so you automate or eliminate them
- Are inspired to make everyone's job easier by improving workflows
- Are comfortable digging through metrics, logs, and whatever else is available to triage and fix an incident at any time
- Strive to be better, smarter, and faster tomorrow than you are today
- Enjoy trying new technologies to improve what we're doing today
- Are okay using older technologies that may not be perfect, but are good enough and low maintenance
- Naturally spread the philosophies and practices of DevOps to others
- Like to collaborate with others to solve problems, share knowledge, and provide feedback
- Can self-assess the needs of a system or team, and make a case to prioritize that work
- Relish working with software, network, cloud, and systems engineers to solve problems across all tiers of the stack
- Help your peers succeed as much as you can
Types of projects you may work on
- Managing services and infrastructure supporting Blizzard's incredible platforms and games
- Defining the future of running services for our platforms and games with Kubernetes
- Supporting our massive global data platforms across multiple clouds
- Performing and improving service migrations from one cloud / data center to another
- Working closely with our incubation teams to help define how future products should operate
- Integrating monitoring and logging with systems to improve observability and enable Service-Level Objectives
- Designing and completing stress tests to validate scale expectations vs reality
Areas of Expertise for an SRE at Blizzard
SREs at Blizzard are expected to become experts in the technologies used by the teams they work with. Below is a non-exhaustive list of technologies SREs may be exposed to:
- Service-Level Objectives (SLI, SLO, SLA, Error Budget, Burn Rate)
- Distributed Systems (architectures, hybrid environments, high-availability)
- Configuration Management (Puppet, Hiera, Terraform, Ansible)
- Container Computing (Docker, Kubernetes)
- Cloud Services and Architecture (AWS, GCP, OpenStack)
- Distributed Message Bus (RabbitMQ, Kafka)
- Proxies and Load Balancing (Nginx, HAProxy, ELB, ALB)
- Monitoring (Prometheus, Kibana, Grafana, Elasticsearch)
- Logging (Splunk, SysLog, ELK Stack, Linux Journal)
- Source Control (GitHub Enterprise, Perforce)
- CI / CD (Jenkins, Argo)
- Linux (bash, debugging, performance tuning)
- Networking (triaging, packet loss, routing)
- Programming (Python, Go, C++, Shell)
Minimum qualifications for a Senior SRE at Blizzard
- General knowledge of all areas of expertise and deep knowledge of four areas of expertise
- Follows technology trends and industry standards passionately
- Capable of presenting ideas and technology to a broad audience in a clear and effective way
- Builds strong relationships with their partner teams and other SREs
- Eager to help others achieve their goals
- Co-owns operations and reliability with the partner team
- Demonstrates deep understanding of the services they support and their goals
- Explores new technologies with demos / experiments / labs
- Breaks down complex work into small units of work for themselves or others