Senior DevOps Engineer
About Us
At EY wavespace Madrid - AI Data Hub, we are a diverse, multicultural team at the forefront of technological innovation, working with cutting-edge technologies such as Gen AI, data analytics, and robotics. Our center is dedicated to exploring the future of AI and Data.
Overview
We're looking for a Senior DevOps Engineer to build and run cloud and AI infrastructure at scale. You'll own IaC with Terraform, CI/CD, Kubernetes, and Linux. You'll also help run LLM workloads both in Azure and locally (Ollama, vLLM, llama.cpp). Your work will enable fast, secure, repeatable delivery.
Key responsibilities
- Build and maintain Azure infrastructure with Terraform (modules, workspaces, pipelines, policies).
- Design and operate CI/CD with GitHub Actions and/or Azure DevOps (multi-stage, approvals, environments).
- Run containers and Kubernetes/AKS (Helm, ingress, autoscaling, node pools, storage).
- Manage AI/LLM runtime: local model runners (Ollama, vLLM, llama.cpp), GPU/CPU configs.
- Support RAG: embeddings pipelines, vector DBs (Azure AI Search / Cognitive Search, pgvector, Milvus), data sync, retention.
- Automate platform tasks with Python (tooling, CLI utilities, API glue, ops scripts).
- Implement observability (Azure Monitor, Prometheus/Grafana, logs/traces/metrics, alerts, runbooks, SLOs).
- Apply Zero Trust security: enforce least privilege, role-based access control (RBAC), and identity-based segmentation (Azure AD, Conditional Access, MFA).
- Implement policy-as-code (OPA, Azure Policy) for compliance.
- Rotate secrets and certificates via Key Vault; integrate with pipelines.
- Add continuous security scanning (SAST/DAST, container image scanning).
- Handle reliability: rollout strategies, health probes, incident response, postmortems.
- Optimize costs: right-sizing, autoscaling, budgets, tags, reporting.
Key requirements
- 4+ years in DevOps / SRE / Platform Engineering.
- Strong Linux (shell, systemd, networking, performance troubleshooting).
- Terraform at scale (modules, state backends, CI/CD integration).
- Deep Azure experience (AKS, VNets, Key Vault, Storage, Monitor, Identity, Networking).
- CI/CD expertise (GitHub Actions and/or Azure DevOps).
- Containers and Kubernetes in production.
- Python or scripting for automation (solid scripting and tooling, not full-time app dev).
- Hands-on with LLM setups (local runners or Azure OpenAI), embeddings, vector indexes, and RAG basics.
Nice to have
- Multi-cloud exposure (AWS/GCP).
- Azure AI services (Azure OpenAI, Cognitive Search).
- GitOps (Argo CD / Flux), Helm packaging, OCI registries.
- Eventing/queues (Event Grid, Service Bus, Kafka).
- Security/compliance in cloud (CIS, NIST, Microsoft CAF).
- Certifications: AZ-104, AZ-204, AZ-400, AI-900, HashiCorp Terraform Associate, CKA/CKAD.
- Experience with GPU nodes, drivers, CUDA/ROCm, or CPU-only optimizations for LLMs.
How we work
- Everything as code. PRs, reviews, and tests.
- Small batches. Trunk-based or short-lived branches.
- Clear runbooks and on-call rotation where needed.
- Measure, alert, fix, and improve.
Our commitment to diversity and inclusion
We are genuinely passionate about inclusion and support individuals of all groups. We do not discriminate on the basis of race, religion, gender, sexual orientation, or disability status.
Terraform, CI/CD, Kubernetes, Linux, Azure, Python