Role Overview
We are looking for an MLOps Engineer with solid cloud experience to help build, deploy and maintain Machine Learning solutions in production. The right person combines strong engineering foundations with a passion for automation, cloud architectures and operational excellence.
You will collaborate closely with Data Scientists, Data Engineers, Architects and Platform teams to ensure that AI models are delivered in a scalable, reproducible and secure way.
Key Responsibilities
- Design, build and maintain end-to-end ML pipelines for training, validation and deployment
- Implement CI/CD practices for the entire model lifecycle
- Automate workflows using tools such as Airflow, Kubeflow, MLflow or Vertex Pipelines
- Deploy models on cloud platforms like AWS SageMaker, Azure ML or GCP Vertex AI
- Implement monitoring, observability and alerting for models in production
- Work with Data Scientists to optimise performance, datasets and reproducibility
- Manage infrastructure as code with Terraform or CloudFormation
- Ensure compliance with data governance, security and model versioning policies
- Optimise reliability, performance and cloud cost of ML platforms
- Document architectures, processes and best practices
Requirements
Technical Skills
- 3 to 6 years of experience in Data Engineering, MLOps or similar roles
- Strong Python experience applied to ML or pipeline development
- Hands-on experience with at least one major cloud provider:
  - AWS (SageMaker, ECR, Lambda, Step Functions)
  - Azure (Azure ML, Databricks, AKS)
  - GCP (Vertex AI, Cloud Run, BigQuery)
- Experience with Docker and Kubernetes
- Experience with CI/CD systems like GitHub Actions, GitLab CI or Azure DevOps
- Knowledge of MLflow, DVC, Feast, Metaflow or similar tracking tools
- Experience with Infrastructure as Code using Terraform or Pulumi
- Understanding of ML models, pipelines and reproducibility principles
Soft Skills
- Strong engineering mindset and focus on automation
- Ability to collaborate with multidisciplinary teams
- Clear communication of technical concepts
- Proactive, reliable and quality focused
Nice to Have
- Experience with RAG or GenAI workflows
- Experience with Spark or Databricks
- Experience with event-driven systems such as Kafka or Pub/Sub
- Background in security and data governance