Background
Within Scania, massive amounts of unstructured data are continuously generated, such as documents, images, audio, videos, and tabular files. Accessing relevant and usable information from these sources remains a major challenge. Recent advancements in multimodal agents offer new possibilities: these agents dynamically orchestrate specialized tools for each data modality (e.g., text extraction, image processing, audio transcription), combine intermediate results, and reason over them to produce coherent, explainable responses.
This thesis will focus on the design of a scalable, explainable information retrieval system based on multimodal agents. The system will extract, represent, and make information accessible and explainable across multiple data types. Students will have access to cloud platforms such as AWS and Snowflake to build scalable, reproducible solutions.
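As a rough illustration of the orchestration idea described above, the following Python sketch routes each file to a tool for its modality and returns the combined results together with their sources. It is only a minimal sketch: the extractor functions, the `TOOLS` table, and the `Evidence` type are hypothetical placeholders invented for this example, not part of any Scania system or existing library, and a real agent would call actual OCR, speech-to-text, or vision models and reason over the evidence.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


@dataclass
class Evidence:
    source: str    # file the content came from
    modality: str  # which kind of tool produced it
    text: str      # extracted content


# Hypothetical stand-ins for real extractors (OCR, speech-to-text, captioning).
def extract_text(path: str) -> str:
    return f"text extracted from {path}"


def transcribe_audio(path: str) -> str:
    return f"transcript of {path}"


def describe_image(path: str) -> str:
    return f"caption for {path}"


# Assumed mapping from file suffix to (modality name, tool).
TOOLS: dict[str, tuple[str, Callable[[str], str]]] = {
    ".pdf": ("document", extract_text),
    ".txt": ("document", extract_text),
    ".wav": ("audio", transcribe_audio),
    ".png": ("image", describe_image),
}


def gather_evidence(paths: list[str]) -> list[Evidence]:
    """Run the matching tool on each file and record where each result came from."""
    evidence = []
    for path in paths:
        suffix = Path(path).suffix.lower()
        if suffix not in TOOLS:
            continue  # unsupported modality: skip rather than guess
        modality, tool = TOOLS[suffix]
        evidence.append(Evidence(path, modality, tool(path)))
    return evidence


def answer_with_sources(question: str, paths: list[str]) -> dict:
    """Combine intermediate results and keep them traceable to their sources."""
    evidence = gather_evidence(paths)
    return {
        "question": question,
        "evidence": [e.text for e in evidence],
        "sources": [f"{e.source} ({e.modality})" for e in evidence],
    }
```

Calling `answer_with_sources("What changed?", ["report.pdf", "call.wav"])` returns the extracted snippets alongside a `sources` list, which is the kind of traceability an explainable system would need to preserve end to end.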
Assignment
The main goal of this thesis is to design and evaluate an information retrieval system for multimodal data. The system should be developed, deployed, and tested in a cloud environment, with a focus on scalability, reproducibility, and explainability.
The challenges include:
Extraction: Implement methods for extracting information from documents, images, audio, video, and tabular data.
Representation: Build structured knowledge representations (e.g., knowledge graphs, relational or vector databases) that support efficient retrieval.
Accessibility: Modularize and expose the represented knowledge via APIs or MCP servers to enable seamless integration with other systems.
Explainability: Ensure responses are transparent and traceable, clearly referencing their sources and reasoning steps.
Evaluation: Evaluate the system across multiple layers (extraction accuracy, representation quality, accessibility, and explainability).
Even if you don’t have experience with everything mentioned above, we still encourage applications from students of all backgrounds and perspectives. Participants will gain hands-on experience and receive regular mentorship and collaboration opportunities throughout the project.
Education and time plan
Education: Master’s program in Computer Science, Data Science, Artificial Intelligence, Machine Learning, or Industrial Analytics
Number of students: 1-2
Start date: January 2026
Estimated time needed: 20 weeks
Topics: Artificial Intelligence, Agents, Explainability, Cloud
Contact persons and supervisors:
Swathi Rao and Joris Rombouts will be the supervisors and can answer questions about the project.
Email: [email protected] and [email protected]
Application:
Your application must include a CV, a personal letter, and a transcript of grades.
A background check might be conducted for this position. We are conducting interviews continuously and may close the recruitment earlier than the date specified.

