Job Opportunity :
We are seeking a seasoned Data Science Specialist to join our cutting-edge team in Catalonia. As a Data Science Specialist focused on Product Development, you will be responsible for conceptualizing and implementing innovative AI models from inception to deployment.
Main Responsibilities :
- Data Pipeline Development : Design and build comprehensive data pipelines that encompass data sourcing, augmentation, validation, training, fine-tuning, evaluation, and model shipping.
- Evaluation Frameworks : Develop robust evaluation frameworks that include statistical testing and reliability checks to verify task competence and alignment.
- Scaled Model Deployment : Scale training and inference by leveraging distributed computing, optimizing throughput / latency, and identifying opportunities for algorithmic or systems-level speedups.
- Model Improvement Techniques : Apply Supervised Fine-Tuning (SFT) and preference-based or reinforcement learning to enhance model helpfulness, safety, and reasoning.
- Model Specialization : Optimize and specialize models using compression techniques to meet performance and footprint targets.
- Cross-Functional Collaboration : Partner closely with ML engineers, researchers, and software engineers on data curation, evaluation design, training runs, model serving, and observability.
- Code Excellence : Contribute to our shared codebase by writing clean, well-tested Python; documenting decisions; and upholding high engineering standards.
Necessary Skills and Qualifications :
- A Bachelor's degree in Computer Science, Math, Physics, Data Science, Operations Research, or a related field.
- Strong programming skills in Python and fluency with the modern ML stack (e.g., PyTorch), data tooling (NumPy / Pandas), and basic software practices (git, unit tests, CI).
- Solid grounding in language modelling concepts around training, evaluation, model architecture, and data.
- Comfort working with datasets at scale : collection, cleaning, filtering, labelling / annotation strategies, and quality controls.
- Experience using GPU resources and familiarity with containerized workflows (Docker) and job schedulers or cloud orchestration.
- The ability to read research papers, prototype ideas quickly, and translate them into reproducible, production-ready code.
- Clear, pragmatic communication skills and a highly collaborative mindset.