Intellias

Senior ML Engineer/Researcher

Poland · Full-time · Mid-Senior

We are actively experimenting with OCR and metadata extraction from PDF documents. OCR is a very active area right now, with open models such as DeepSeek OCR and LightOn OCR competing for the top spots.

We are looking for someone with experience running OSS models on vLLM, with a focus on document intelligence: computer vision that produces PDF -> Markdown or PDF -> HTML conversion with high precision for complex


Tech Stack:

  • Python
  • vLLM
  • Hugging Face (inference)
  • Computer Vision
  • PyTorch


Requirements:

  • 5+ years of experience in Machine Learning, with at least 2 years focused on OCR, Document AI, or vision-language models.
  • Strong hands-on expertise with Python, PyTorch, and Hugging Face Transformers (training, fine-tuning, inference).
  • Practical experience deploying LLMs / VLMs on vLLM or equivalent high-performance inference frameworks.
  • Solid understanding of OCR pipelines, layout parsing, and document structure recognition (PDFs, scanned docs, tables, mixed content).
  • Understanding of cloud infrastructure and GPU-based inference pipelines.
  • Research mindset with the ability to experiment, analyze, and iterate quickly.
  • Strong communication and documentation skills; ability to clearly present findings and proposed improvements.


Responsibilities:

  • Research, evaluate, and fine-tune open-source OCR and document intelligence models for text and layout extraction from complex PDFs.
  • Develop end-to-end solutions for PDF-to-Markdown / PDF-to-HTML conversion with high accuracy in text structure, formatting, and layout retention.
  • Build tools for data preprocessing, annotation, and quality evaluation of OCR outputs.
  • Implement techniques for post-processing, text alignment, and metadata extraction to enhance model precision.
  • Collaborate with research and engineering teams to integrate OCR pipelines into production-grade systems.
  • Stay up to date with the latest developments in document AI, multimodal learning, and OCR research.



  • LLMs, vLLM
  • OCR
  • Computer Vision

Key Skills

Ranked by relevance

AI · Machine Learning · Computer Vision · PyTorch · Python · Cloud
Posted: Nov 07, 2025
Type: Full-time
Level: Mid-Senior
Location: Poland
Company: Intellias

Industries

IT Services IT Consulting

Categories

Engineering
