As a Data Scientist at VTTI, you will take end-to-end ownership of building scalable analytics and AI capabilities across the organization. This role sits at the intersection of data engineering and applied machine learning, with responsibility for designing analytics-ready data pipelines, modernizing the data platform, and delivering high-impact ML/AI use cases.

You will move beyond ad hoc analysis and legacy workflows to establish robust, production-grade data foundations that support advanced analytics, predictive modeling, and AI-driven decision-making across the energy value chain. This includes designing reliable data transformation layers, enabling experimentation frameworks, defining model evaluation standards, and ensuring analytics workloads are built on scalable, cost-efficient architectures.

This is a hands-on role requiring strong technical depth, architectural thinking, and the ability to independently lead initiatives from problem definition through production deployment. You will collaborate closely with business stakeholders while helping raise technical standards in analytics and machine learning practices across the team.
Key Responsibilities:
1. Data Platform Modernization
- Lead end-to-end migration of data pipelines from the legacy data platform to Databricks.
- Assess current architecture, identify inefficiencies, and redesign pipelines for scalability, cost efficiency, and reliability.
- Refactor ETL/ELT workflows, optimize distributed processing, and ensure reproducibility across environments.
- Define migration strategy, rollback plans, validation frameworks, and performance benchmarks.
- Ensure minimal business disruption during transition.
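Validation frameworks for a migration like this often start with simple reconciliation checks. As an illustrative sketch only (not VTTI's tooling; the table rows and column names are hypothetical), an order-independent fingerprint comparing the legacy and migrated copies of a table might look like:

```python
import hashlib

def table_fingerprint(rows, columns):
    """Order-independent fingerprint of a table: row count plus an XOR of
    per-row hashes over the chosen columns. Equal fingerprints are strong
    (not absolute) evidence that a migration preserved the data.
    Note: identical duplicate rows cancel under XOR, so this sketch is
    best suited to tables with a unique key."""
    acc = 0
    for row in rows:
        payload = "|".join(str(row[c]) for c in columns)
        digest = hashlib.sha256(payload.encode()).digest()
        acc ^= int.from_bytes(digest[:8], "big")
    return len(rows), acc

# Hypothetical sample data: same rows, different physical order.
legacy = [{"id": 1, "volume": 10}, {"id": 2, "volume": 20}]
migrated = [{"id": 2, "volume": 20}, {"id": 1, "volume": 10}]
```

In practice such a check would run per table after each migration batch, with mismatches blocking cutover.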
2. AI & Advanced Analytics Enablement
- Prepare and structure data assets to support AI/ML use cases, including predictive analytics, optimization models, and generative AI initiatives.
- Translate business problems into measurable analytical objectives, define appropriate modeling approaches, and establish rigorous evaluation metrics and validation strategies.
- Design and implement feature pipelines and transformation layers optimized for scalable, production-grade ML workloads.
- Contribute to experimentation frameworks and model evaluation processes.
- Support MLOps practices including model monitoring, retraining strategies, and version control.
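As one illustration of the drift monitoring mentioned above, here is a minimal Population Stability Index (PSI) sketch in plain Python; the 0.2 threshold named in the docstring is a common rule of thumb, not a fixed standard, and the binning scheme is a simplifying assumption:

```python
import math

def psi(baseline, current, n_bins=10):
    """Population Stability Index between a baseline and a current sample.
    Uses equal-width bins over the baseline range; a common rule of thumb
    treats PSI > 0.2 as meaningful drift (thresholds are illustrative)."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant baseline

    def bin_fractions(sample):
        counts = [0] * n_bins
        for x in sample:
            idx = min(int((x - lo) / width), n_bins - 1)
            counts[max(idx, 0)] += 1  # clamp values outside baseline range
        n = len(sample)
        # small floor avoids log(0) for empty bins
        return [max(c / n, 1e-6) for c in counts]

    b, c = bin_fractions(baseline), bin_fractions(current)
    return sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))

baseline = [i / 100 for i in range(100)]
drifted = [x + 0.5 for x in baseline]  # simulated distribution shift
```

A production version would typically run per feature on a schedule and feed alerts into the retraining strategy.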
3. Data Architecture & Performance Optimization
- Design scalable data models and storage patterns to support structured, semi-structured, and geospatial datasets.
- Optimize query performance and compute usage to reduce platform costs.
- Implement data quality checks, observability, and automated validation mechanisms.
- Ensure pipelines are deterministic, fault-tolerant, and auditable.
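A minimal sketch of the kind of automated data-quality gate described above, in plain Python; the column names and rules are purely illustrative, not an actual schema:

```python
def validate_rows(rows):
    """Apply simple quality checks and return (valid_rows, errors).
    Checks shown: non-null unique id, non-negative numeric volume."""
    valid, errors = [], []
    seen_ids = set()
    for i, row in enumerate(rows):
        problems = []
        if row.get("id") is None:
            problems.append("missing id")
        elif row["id"] in seen_ids:
            problems.append("duplicate id")
        if not isinstance(row.get("volume"), (int, float)) or row["volume"] < 0:
            problems.append("volume must be a non-negative number")
        if problems:
            errors.append((i, problems))
        else:
            seen_ids.add(row["id"])
            valid.append(row)
    return valid, errors

rows = [
    {"id": 1, "volume": 120.5},
    {"id": 1, "volume": 80.0},   # duplicate id
    {"id": 2, "volume": -3.0},   # negative volume
]
valid, errors = validate_rows(rows)
```

In a pipeline, the rejected rows and reasons would be routed to an observability sink rather than silently dropped.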
4. Governance, Security & Reliability
- Implement role-based access controls and data governance standards.
- Ensure compliance with enterprise security policies and best practices.
- Develop monitoring frameworks to track pipeline performance, data drift, and model degradation.
- Document architecture decisions and technical standards.
5. Cross-Functional Collaboration
- Partner with IT and Business stakeholders to translate requirements into scalable data and AI solutions.
- Provide technical leadership on platform strategy and AI enablement.
- Communicate trade-offs clearly — especially around cost, scalability, and long-term maintainability.
- Mentor team members on best practices in data science and pipeline design.
Required Qualifications
- Degree in Computer Science, Engineering, Mathematics, Statistics, or related technical discipline.
- Strong expertise in Python and SQL.
- Experience designing and implementing analytics-grade data pipelines.
- Experience with distributed data processing frameworks (e.g., Spark or equivalent).
- Solid grounding in machine learning and statistical learning principles, including model selection, evaluation metrics, cross-validation strategies, and trade-offs between bias, variance, and generalization.
- Hands-on experience with at least one major cloud platform (AWS, Azure, or GCP).
- Experience building and supporting ML workflows, including feature engineering, model evaluation, and production integration.
- Strong understanding of data modeling principles and analytics architectures (including separation of transactional and analytical workloads).
- Experience with version control (Git), CI/CD practices, and automated testing.
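The evaluation fundamentals listed above, fold construction and held-out error, can be sketched in plain Python; the mean-predictor "model" is a deliberately trivial placeholder, not a recommended approach:

```python
from statistics import mean

def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_val_mse(ys, k=5):
    """Average held-out MSE across k folds for a trivial model that
    always predicts the mean of its training targets."""
    scores = []
    for fold in k_fold_indices(len(ys), k):
        fold_set = set(fold)
        train_y = [y for i, y in enumerate(ys) if i not in fold_set]
        pred = mean(train_y)  # "model": predict the training mean
        scores.append(mean((ys[i] - pred) ** 2 for i in fold))
    return mean(scores)
```

Swapping the mean predictor for a real estimator turns this into ordinary k-fold cross-validation, the basis of the model-selection and generalization trade-offs mentioned above.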
Preferred / Nice to Have
- 5+ years of relevant industry experience in data science, applied machine learning, or data engineering roles.
- Experience migrating to Spark-based pipelines.
- Experience implementing MLOps practices (model monitoring, retraining, data drift detection).
- Experience handling large-scale structured, semi-structured, or geospatial datasets.
- Experience optimizing data platform performance and cost structures.
- Familiarity with AI applications in energy, trading, or asset optimization domains.
- Experience operating in very large-scale environments.
Application Process
Please apply via the Apply button. For GDPR compliance reasons, we cannot accept applications by email or personal message.
Acquisition in response to this vacancy is not appreciated, and unsolicited submissions from recruitment agencies will not be considered.
Ready to apply? Join VTTI and take your career to the next level!
The application takes less than 5 minutes.

