Let’s breathe life into great tech ideas! With 3,000 people globally, Intellias is a company where benchmark technological solutions are born. Join in and take your part in digitalizing the world.
We are exploring cutting-edge OCR and metadata extraction from PDF documents. OCR and document intelligence are rapidly evolving fields, with open-source models like DeepSeek OCR and LightOn OCR pushing the boundaries.
We are seeking an experienced engineer to help us build high-precision solutions for PDF-to-Markdown and PDF-to-HTML conversion, particularly for complex documents with diverse layouts.
Key Responsibilities:
- Research, evaluate, and fine-tune open-source OCR and document intelligence models for text and layout extraction from complex PDFs.
- Develop end-to-end solutions for PDF-to-Markdown / PDF-to-HTML conversion, preserving text structure, formatting, and layout accuracy.
- Build tools for data preprocessing, annotation, and quality evaluation of OCR outputs.
- Implement post-processing techniques, text alignment, and metadata extraction to improve model precision.
- Collaborate closely with research and engineering teams to integrate OCR pipelines into production-ready systems.
- Stay current with advancements in document AI, multimodal learning, and OCR research.
Required Skills & Experience:
- 5+ years of experience in Machine Learning, with at least 2 years focused on OCR, Document AI, or vision-language models.
- Strong expertise in Python, PyTorch, and Hugging Face Transformers (training, fine-tuning, inference).
- Solid understanding of computer vision and its implementation.
- Hands-on experience deploying LLMs / VLMs on vLLM or similar high-performance inference frameworks.
- Deep understanding of OCR pipelines, layout parsing, and document structure recognition (PDFs, scanned docs, tables, mixed content).
- Familiarity with cloud infrastructure and GPU-based inference pipelines.
- Research-oriented mindset with the ability to experiment, analyze results, and iterate quickly.
- Excellent communication and documentation skills.
At Intellias, where technology takes center stage, people always come before processes. We're dedicated to cultivating a tech-savvy environment that empowers individuals to unlock their true potential and achieve extraordinary results. Our customized benefits not only prioritize your well-being but also charge your professional growth, making this opportunity an ideal match for tech enthusiasts like you.