Shama AI
Senior AI Deployment/MLOps Engineer
Austria · 2 days ago
Full-time · Remote Friendly · Information Technology

This is a full-time hybrid role for a Senior AI Deployment/MLOps Engineer. The role involves deploying and optimizing AI models and DSP algorithms for audio processing, with a focus on low-level systems programming targeting deployment in local environments and on resource-constrained devices while maintaining real-time performance.


Work will primarily be based in Austria.


Key Responsibilities: 

  • Software Development:

Compile AI models from high-level languages down to low-level code, and write and maintain software for various architectures such as x86 and ARM. Deploy and integrate AI models directly into device-running applications, enabling real-time solutions for voice processing tasks. Build inference systems and handle containerization.

  • Hardware Integration: 

Deploy AI models on a variety of devices, from microcontrollers to gateways, and ensure they work efficiently with specific hardware accelerators (e.g., CPUs, DSPs, GPUs, NPUs). 

  • Model Optimization and Productization:

Compress, quantize, and prune AI models so they run efficiently in local environments and on hardware with limited resources; implement CPU/GPU acceleration.

  • Performance Analysis and Validation:

Profile and debug models and systems at the system level to analyze performance metrics such as CPU/GPU utilization and latency.

  • Deployment and Maintenance:

Manage MLOps workflows for deploying and updating AI models in the field, including the use of CI/CD pipelines.

  • Scaling & Reliability:

Auto-scaling inference servers, load balancing, latency and error monitoring, logging and observability.



Necessary skills

Bachelor’s/Master’s/Ph.D. degree in Computer Science, AI, Applied Mathematics, Physics, or a related field, with specialization in audio, acoustics, and speech signal processing.


  • Industry experience:

5+ years of relevant industry experience.

  • Tools and Platforms: 

Experience with MLOps tools and platforms designed for local and edge deployment, such as NVIDIA Triton Inference Server, Edge Impulse, or ONNX Runtime.

  • Programming languages: 

Proficiency in programming languages such as C/C++ and Python is essential.

Deep understanding of Python internals and interoperability (Cython, the CPython API, pybind11).

Ability to port high-level Python logic (e.g., signal processing, ML inference) into low-level, efficient C code for real-time performance.

  • Software engineering: 

Application of general software engineering best practices such as version control, testing, and documentation; strong software development skills, proficiency in the programming languages relevant to the target device, and an understanding of system architecture.

  • Speech AI Models: 

Strong experience deploying speech-related AI models, such as STT, TTS, ASR, and others.


  • Problem-solving: 

The ability to approach problems from a user-centric perspective to create solutions that add real value. 

  • Teamwork:

The ability to work effectively in a team environment.


Preferred skills

  • Machine learning: 

Strong knowledge of machine learning concepts, model development, and popular frameworks (e.g., TensorFlow, PyTorch); familiarity with model optimization techniques and the ability to work with model conversion tools.

  • Audio Signal Processing:

Experience in digital audio signal processing and with FFmpeg or similar audio transcoding tools is a plus.

  • Hardware familiarity: 

An understanding of electronics and hardware interfaces helps ensure successful integration.

