Scope:
- Design and Optimize LLM Evaluation Pipelines: Create and implement efficient, scalable pipelines for assessing the performance of large language models. Use your expertise in Ray to optimize the distribution of evaluation tasks.
- Develop Evaluation Metrics and Benchmarks: Build and integrate a comprehensive set of metrics and benchmarks to evaluate LLM performance across various tasks and domains.
- Analyze and Interpret Results: Conduct thorough analysis of evaluation outcomes to identify model strengths, weaknesses, and areas for improvement. Effectively communicate insights to stakeholders.
- Collaborate with Research and Engineering Teams: Partner with research scientists and engineers to understand LLM requirements and develop effective evaluation strategies.
- Stay Current with LLM Advancements: Keep up to date with emerging trends, tools, and best practices in LLM evaluation.
Requirements:
- Expertise in LLM Evaluation: In-depth knowledge of LLM evaluation methods, metrics, and benchmarks, with experience evaluating models for tasks such as text generation, question answering, and summarization.
- Proficiency in Distributed Computing: Hands-on experience with distributed computing frameworks like Ray or similar technologies to scale LLM evaluations.
- Strong Python and ML Skills: Solid programming skills in Python, with experience in machine learning frameworks and libraries.
- Kubernetes and Cloud Infrastructure: Familiarity with deploying and managing applications on Kubernetes, as well as experience with cloud platforms like AWS.
- Effective Communication and Collaboration: Excellent communication skills and the ability to work effectively in cross-functional teams.
About your Application:
- Apply to this job posting, and email your CV with the job title as the subject line to: [email protected]
Key Skills (ranked by relevance):
- Python
- Cloud
- Distributed computing
- Machine learning
- Kubernetes
- Posted: Jan 31, 2025
- Type: Contract
- Level: Mid-Senior
- Location: Singapore
- Company: Talentvis
- Industries: Technology, Information, Media
- Categories: Information Technology