Senior Site Reliability Engineer

Nextiva - Madrid, Comunidad de Madrid, Spain

Posted more than 30 days ago

Job Description

Redefine the future of customer experiences. One conversation at a time.

We're changing the game with a first-of-its-kind, conversation-centric platform that unifies team collaboration and customer experience in one place. Powered by AI, built by amazing humans.

Our culture is forward-thinking, customer-obsessed, and built on an unwavering belief that connection fuels business and life; connections to our customers with our signature Amazing Service, our products and services, and most importantly, each other. Sincecompanies and 1M users rely on Nextiva for customer and team communication.

If you're ready to collaborate and create with amazing people, let your personality shine, and be on the frontlines of helping businesses deliver amazing experiences, you're in the right place.

We are looking for a Senior Site Reliability Engineer (SRE) to join our Middleware Engineering team based in Bangalore. In this highly dynamic environment, you'll be responsible for supporting and scaling our Kafka and Elasticsearch infrastructure - core systems that power our SaaS platform.

We're looking for someone who thrives on automation, embraces AI-driven observability, and is eager to learn and adopt new technologies quickly. You'll not only respond to production issues but proactively build intelligent, resilient systems to prevent them.

If you enjoy owning systems end to end, writing clean automation, and working in a fast-moving team that values innovation, this role is for you.

Key Responsibilities

Triage, troubleshoot, and resolve complex production issues involving Kafka and Elasticsearch
Design and build automated monitoring, alerting, and logging systems - leveraging AI / ML techniques where possible
Write tools and infrastructure software to support self-healing, auto-scaling, and incident prevention
Automate system administration tasks - from patching and upgrades to config and deployment workflows
Use and manage GitHub extensively for infrastructure-as-code, release management, and collaboration
Partner with development, QA, and performance teams to ensure middleware systems are production-ready
Participate in the on-call rotation and continuously improve incident response and resolution playbooks
Mentor junior engineers and contribute to a culture of automation, learning, and accountability
Lead large-scale reliability and observability projects in collaboration with global teams

Qualifications

Bachelor's degree in Computer Science, Engineering, or equivalent practical experience

Fluent English communication skills (spoken and written)

Core Competencies

6 years of experience in software development, automation, or infrastructure engineering

Deep experience with Kafka and / or Elasticsearch in production environments

Strong Linux systems expertise and 6 years managing Linux-based environments

Hands-on experience with cloud platforms - GCP and / or AWS required

Proficient in scripting languages like Python, Bash, etc.

Automation-first mindset - deep experience with Ansible, Terraform, Jenkins

Expert-level understanding of Git and GitHub workflows for CI / CD and infrastructure-as-code

Proficient with container tools (Docker) and orchestrators (Kubernetes)

Strong understanding of SRE principles - SLAs / SLOs, alerting, observability, and incident management

Experience with SQL, caching systems (e.g. Redis), and troubleshooting distributed systems

Quick learner with a strong curiosity for new tools, frameworks, and AI / ML use cases in operations

Nice to Have

Observability Tools : Datadog, Splunk, Kibana, Opsgenie

Experience with AI / ML-based anomaly detection, AIOps platforms, and LLM integrations for infrastructure

Azure cloud experience

Why Join Us

Shape the future of middleware reliability using AI and intelligent automation

Work with a global team that values initiative, innovation, and ownership

Grow in a fast-paced environment where learning and experimentation are part of the culture

Drive technical leadership, mentor others, and make a meaningful platform-wide impact

How to Apply

If you're passionate about automation, AIOps, MLOps, and scalable middleware infrastructure, and you're ready to move fast, learn constantly, and own critical systems - we'd love to connect with you.

Nextiva DNA (Core Competencies)

Nextiva's most successful team members share common traits and behaviors :

Drives Results : Action-oriented with a passion for solving problems. They bring clarity and simplicity to ambiguous situations, challenge the status quo, and ask what can be done differently. They lead and drive change, celebrating success to build more success.

Critical Thinker : Understands the why and identifies key drivers, learning from the past. They are fact-based and data-driven, forward-thinking, and see problems a few steps ahead. They provide options, recommendations, and actions, understanding risks and dependencies.

Right Attitude : They are team-oriented, collaborative, competitive, and hate losing. They are resilient, able to bounce back from setbacks, zoom in and out, and get in the trenches to help solve important problems. They cultivate a culture of service, learning, support, and respect, caring for customers and teams.

Total Rewards

Our Total Rewards offerings are designed to allow Nexties to take care of themselves and their families so they can do their best.

Our compensation packages are tailored to each role and candidate's qualifications. We consider a wide range of factors, including skills, experience, training, and certifications, when determining compensation. We aim to offer competitive salaries or wages that reflect the value you bring to our team. Depending on the position, compensation may include base salary, incentives, or bonuses.

Insurance - Life insurance covering life and disability

Work-Life Balance - PTO and paid sick time as per CBA, paid parental leave

Financial Security - Private pension plan available

Wellness - Employee Assistance Program and comprehensive wellness initiatives

Growth - Access to ongoing learning and development opportunities and career advancement

At Nextiva, we're committed to supporting our employees' health, well-being, and professional growth. Join us and build a rewarding career!

Required Experience :

Senior IC

Key Skills

Kubernetes, FMEA, Continuous Improvement, Elasticsearch, Go, Root Cause Analysis, Maximo, CMMS, Maintenance, Mechanical Engineering, Manufacturing, Troubleshooting

Employment Type : Full Time

Experience : years

Vacancy : 1
