AI Engineer

Critical Manufacturing, Portugal, 7 hours ago

Full-time, Engineering

Critical Manufacturing is dedicated to empowering high-performance operations to make Industry 4.0 a reality with the most innovative, comprehensive, and modular MES software. We have a global presence, but our headquarters, and the main technical center, are in Porto (Maia), Portugal, where we develop a state-of-the-art solution for Semiconductor, Electronics, Medical Devices, and other Discrete industries.

Recognized for the third consecutive year as a Leader by Gartner, we are part of ASMPT, the world's largest supplier of best-in-class equipment, and technological process partner for the electronics and semiconductor industries.

The role:

You will join an existing AI engineering team focused on building reliable AI infrastructure for manufacturing systems. This is hands-on work: developing MCP servers and creating tooling for model observability, telemetry, and retraining pipelines. No leadership is required, just solid execution within a collaborative team.

This role is based at our headquarters in Porto, Portugal, where collaboration, experimentation, and rigorous engineering standards are essential. You're expected to stay closely connected—actively participating in technical design reviews, architecture discussions, and engaging with teams across Product, Data, and Platform Engineering. This is a role for someone who cares about building AI systems that are not just smart, but observable, debuggable, and continuously improving.

What you'll do:

Develop MCP Servers

Implement and maintain Model Context Protocol (MCP) servers that connect language models to manufacturing domain tools and data sources
Optimize server performance and define clear interfaces for tool integration, ensuring models have safe, reliable access to business logic
Collaborate with team leads to map complex manufacturing workflows into structured tools and prompts

Build Model Observability and Telemetry Infrastructure

Design and implement comprehensive telemetry systems to track model behavior, token usage, latency, and cost in production
Create dashboards and alerting systems that give real-time visibility into model performance and anomalies
Instrument models to capture structured traces: prompts/system context, tool invocations, inputs/outputs, intermediate artifacts, and decision metadata
Contribute to standards for logging, tracing, and distributed observability across all AI systems

Develop Retraining and Continuous Improvement Pipelines

Build data collection pipelines that capture production interactions, model failures, and edge cases for retraining
Implement automated systems for evaluating model improvements and managing safe rollouts
Contribute to feedback loops that allow the platform to learn from real-world usage without manual intervention

Support Team Deliverables

Write clean, testable code and contribute to team codebases, documentation, and CI/CD processes
Participate in code reviews, technical design reviews, and troubleshooting production issues
Experiment with new tools and techniques under team guidance to improve AI system reliability
Promote the adoption of agentic coding across teams to accelerate delivery and increase throughput while maintaining quality and security standards
Design repositories, CI, and developer tooling that make agent-driven changes safe (linting, typed APIs, contract tests, golden tests, eval gates)

Ensure Production Reliability

Implement robust error handling, fallback strategies, and graceful degradation for AI systems
Monitor and tune AI systems for performance, uptime, and safety in manufacturing environments
Gather feedback from operations and product teams to refine tooling and server implementations

What Success Looks Like

Within your first year, you will have:

Deployed production MCP servers handling real manufacturing workloads
Built and iterated on observability tools used daily by engineering and ops teams
Contributed to retraining pipelines that reduce model staleness and improve prediction accuracy
Established clear patterns and best practices that help the team scale AI systems reliably
Delivered robust tooling for debugging, monitoring, and managing AI systems in manufacturing environments

Why Join Us

Work on AI that powers real factories, solving problems with immediate industrial impact
Join a tight-knit engineering team building the backbone of trustworthy AI infrastructure for manufacturing
Contribute to systems that manufacturers depend on daily, with full observability and reliability
Enjoy the freedom to code, collaborate, and grow technically in a rigorous engineering environment

Requirements

What You Will Bring

At least 1 year of hands-on machine learning experience, including training and testing models, and a practical understanding of overfitting, generalization, and bias; plus a solid grasp of common model families (e.g., k-nearest neighbors, decision trees/random forests, support vector machines, linear/logistic regression, and basic neural networks)
At least 1 year of hands-on experience with LLMs in production or applied settings, including inference, prompt engineering, and evaluation; with a working understanding of how LLMs are configured and behave (e.g., temperature, top-p, max tokens, context windows, and tool/function calling)
Experience with agentic coding workflows or LLM-based code assistance, using tools that accelerate implementation, refactoring, and test generation while maintaining strong engineering rigor (reviews, testing, documentation, and CI discipline)
Familiarity with server development, APIs, and containerization (Docker/Kubernetes)
Strong problem-solving skills and comfort writing production code, including tests and documentation
Excellent software engineering fundamentals: version control, testing, code review, documentation
Ability to collaborate effectively in a team and work well under technical leadership
Excellent spoken and written English communication skills

What we consider a plus (not mandatory):

Experience with manufacturing operations, MES systems, or Industry 4.0 concepts
Familiarity with MLOps tools, model monitoring platforms, or ML infrastructure
Basic knowledge of observability tools (Prometheus, Grafana, or similar) and data pipelines
Proficiency in Python and experience with AI frameworks like PyTorch, TensorFlow, or LangChain

Diversity, Equity and Inclusion are a source of commitment and innovation

At Critical Manufacturing, we welcome and encourage applications from individuals of all backgrounds, regardless of disabilities, diverse abilities, identities, or experiences. Our commitment is to create an inclusive environment where everyone has equal opportunities to succeed and thrive.

If you need accommodation during the recruitment process, please let us know—we're happy to support you.
