Overview
AWS is set to introduce the inaugural European Sovereign Cloud (ESC), a significant development in utility computing (UC). To spearhead this initiative, we are seeking experienced systems engineers with a strong background in automation and operations. As part of the AWS Managed Operations team, you will play a pivotal role in building and leading operations and development teams dedicated to delivering high-availability AWS services, including EC2, S3, DynamoDB, Lambda, and Bedrock, exclusively for EU customers. For more information on the ESC, please check out our blog.
Responsibilities
Oversee the launch of the ESC in 2025, working closely with global AWS teams and influencing the evolution of AWS services and technology.
Collaborate with technology leaders to enhance day-to-day operations and improve availability, reliability, latency, performance, and efficiency of the ESC.
Occasionally participate in on-call rotations to resolve incidents occurring out-of-hours.
Lead the delivery of scalable services and ensure a high-availability experience for EU customers.
Contribute to the development and testing of scripts to improve workflows and operational efficiency.
Educate service teams about the complexities of the European Sovereign Cloud and foster knowledge sharing.
Review operational health of services, identify anomalies, and craft actionable bug reports to improve system performance.
Provide feedback on change management documents and help address the operational backlog as a collaborative team effort.
About the team and context
Utility Computing (UC) – European Sovereign Cloud (ESC) is part of AWS Utility Computing. UC provides innovations across Compute, Database, Storage, IoT, Platform, and Productivity apps, supporting the development and management of services in AWS. Managed Operations engineers engage with AWS customers who require specialized security solutions for cloud services.
Day-to-day life
Each week brings meaningful contributions to the operation of significant software systems. Review operational health, identify anomalies, and produce actionable bug reports to enhance efficiency and performance. Contribute feedback on change management and work to reduce the operational backlog. Develop and test scripts to improve workflows. Share insights on the ESC with service teams to foster mutual understanding and continuous learning. Maintain 24x7 on-call responsibility as part of a team to root-cause issues and ensure resilience and fault tolerance.
Qualifications
Basic Qualifications
This role requires you to be a national of an EU member state.
Demonstrated expertise in systems engineering across hardware, software, networking, and operating systems.
Experience scripting processes in Bash, Python, or Ruby.
Strong understanding of Linux and networking; able to troubleshoot and anticipate problems affecting performance, reliability, or availability.
Ability to create, revise, and improve standard operating procedures (SOPs) and drive operational best practices.
Preferred Qualifications
Experience with monitoring frameworks (e.g., CloudWatch, Datadog, Grafana, Elastic, or similar).
Experience mentoring junior engineers and leading cross-organizational efforts requiring collaboration across multiple teams.
Experience supporting services in AWS or other cloud environments.
Experience operating 24x7 high-availability, distributed applications and optimizing fleet utilization.
Experience with Infrastructure as Code (e.g., CDK, CloudFormation, Puppet, Chef, Ansible, or similar).
Amazon is an equal opportunities employer. We believe that a diverse workforce is central to our success. We make recruiting decisions based on your experience and skills. We value your passion to discover, invent, simplify, and build. Please consult our Privacy Notice to understand how we collect, use, and
System Engineer • Zaragoza, Aragon, España