Senior Site Reliability Engineer

Roche, Madrid, Community of Madrid, Spain

Posted 4 days ago

Job description

At Roche you can show up as yourself, embraced for the unique qualities you bring. Our culture encourages personal expression, open dialogue, and genuine connections, where you are valued, accepted and respected for who you are, allowing you to thrive both personally and professionally. This is how we aim to prevent, stop and cure diseases and ensure everyone has access to healthcare today and for generations to come. Join Roche, where every voice matters.

The Position

The role requires the candidate to be available for on-call duty, responding promptly to urgent issues and emergencies outside regular working hours so that critical situations are addressed in a timely and effective manner.

Who We Are

At Roche, we are passionate about transforming patients' lives, and we are bold in both decision and action - we believe that good business means a better world. That is why we come to work every single day. We commit ourselves to scientific rigor, unassailable ethics, and access to medical innovations for all. We do this today to build a better tomorrow.

Roche is strongly committed to a diverse and inclusive workplace. We strive to build teams that represent a range of backgrounds, perspectives, and skills. Embracing diversity enables us to create a great place to work and to innovate for patients.

Roche is building a global site reliability engineering (SRE) team that will support commercial and internal solutions. This team will have the mindset of building and creating engineering solutions to solve a broad spectrum of problems.

Step into the Future of IT Infrastructure with Roche

As a seasoned Site Reliability Engineer (SRE) at Roche, you'll leverage your deep software engineering expertise to propel our IT infrastructure to new heights of robustness, scalability, and reliability. This isn't just a role; it's an invitation to shape the backbone of critical infrastructures and drive our technological innovations forward.

Your Mission

Design and maintain cutting-edge tools, scripts, and frameworks that automate repetitive tasks, streamline software deployment, and manage expansive systems with unparalleled efficiency.

Partner closely with forward-thinking development teams to architect and implement high-performance solutions that elevate system efficiency, optimize resource utilization, and enhance deployment processes for superior uptime and user satisfaction.

Your Impact

Lead the charge in incident management and response. Detect system anomalies, troubleshoot swiftly, and conduct thorough root cause analyses to prevent recurring issues.

Champion continuous improvement by refining monitoring and alerting mechanisms, conducting insightful post-incident reviews, and embedding best practices in software lifecycle management. Your strategic foresight and meticulous planning will ensure our systems are not only reliable but also highly performant.

By joining our elite team, you will play a pivotal role in delivering seamless experiences to our end-users, exceeding business and customer demands, and solidifying Roche's reputation as a leader in IT innovation.

Your Core Responsibilities

Reliability Mastery: Proactively monitor and maintain system reliability using advanced tools like DataDog, VictorOps, ELK, Grafana, and Prometheus. Become a key player in ensuring system stability and performance.
Uptime Guardian: Ensure optimal uptime and performance by swiftly identifying issues and responding to alerts with precision.
Technical Troubleshooter: Apply a working understanding of architecture and design to dive deep into complex technical issues, then troubleshoot, investigate, and resolve them. Collaborate seamlessly with engineering teams to enable timely and effective resolutions.
Service Excellence: Consistently meet or exceed the defined SLAs, SLIs, and SLOs.
Automation Innovator: Develop and deploy automation scripts (using Python or other scripting languages) to streamline operations, enhance system efficiencies, and reduce manual tasks.
Cloud Steward: Manage and maintain robust infrastructure across AWS and Azure environments, implementing best practices to ensure peak performance and reliability of cloud-based applications. Drive cost optimization through best practice implementation and continuous vigilance.
Cross-functional Collaborator: Work closely with engineering, DevOps, security, and operations teams to drive continuous improvement and foster a culture of reliability and inclusion.
Incident Responder: Handle requests and incidents through JIRA and ServiceNow, documenting troubleshooting procedures, solutions, and lessons learned to fuel ongoing improvements.
Flexible Scheduling: Work on-call outside normal working hours and on weekends, as scheduled, to ensure continuous support.
Team Builder: Actively contribute to the growth and development of the SRE team's capabilities, nurturing a stronger, more inclusive, and resilient team.

Who You Are

Educational Background: Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent professional experience. An MBA or PhD is a plus, but not required.

Certifications: Relevant industry certifications (AWS / Azure) to showcase your expertise.

Experience: Approximately 5 years of experience in site reliability engineering, IT operations, DevOps, or related fields, or equivalent skills and experience.

Cloud Expertise: Solid experience with AWS and / or Azure, including setting up, monitoring, and maintaining cloud resources (including knowledge of Kubernetes, EKS, AKS, GKE, etc.). A basic understanding of Infrastructure as Code tools, such as Terraform, is also expected.

Tool Proficiency: Proficiency with monitoring and logging tools such as DataDog, Splunk On-Call, the ELK stack, Grafana, and Prometheus. Knowledge of Loki, Mimir, and Tempo is a plus.

Hands-On Skills: Hands-on experience with JIRA and ServiceNow for tracking incidents, requests, and documentation.

Scripting Knowledge: Proficiency in Python or similar scripting languages for automation purposes.

Incident Response: Understanding of core SRE principles, together with an in-depth understanding of incident prioritization, escalation processes, and service level management (SLA / SLO / SLI).

Troubleshooting: Proficient troubleshooting skills, especially in cloud and distributed system environments.

Communication and Teamwork: Excellent communication, teamwork, and documentation skills, with a proactive and self-motivated approach to improving system reliability and operational efficiencies.

Diversity and Inclusion: We value and encourage candidates from diverse backgrounds and experiences, believing that diverse perspectives drive innovation and success.

Language Requirements: Excellent spoken and written English.

Why Join Us?

By joining our team, you will be part of a dynamic environment where your contributions will directly impact the resilience and reliability of our services. You will have opportunities for professional growth and the ability to collaborate with industry leaders. Let's drive the future of IT stability together, ensuring an exceptional experience for our customers.

Ready to make a difference? Apply now to be our next SRE Incident Manager and help us build a more reliable future.

Who we are

A healthier future drives us to innovate. Together, more than 100,000 employees across the globe are dedicated to advancing science, ensuring everyone has access to healthcare today and for generations to come. Our efforts result in more than 26 million people treated with our medicines and over 30 billion tests conducted using our Diagnostics products. We empower each other to explore new possibilities, foster creativity, and keep our ambitions high, so we can deliver life-changing healthcare solutions that make a global impact.

Let's build a healthier future, together.

Roche is an Equal Opportunity Employer.
