MA CAPITAL U.S. LLC
Site Reliability Engineer
MA CAPITAL U.S. LLC, Switzerland
Full-time; Engineering, Information Technology

Who We Are

MA Capital US LLC is a proprietary trading firm specializing in systematic and high-performing discretionary strategies across multiple asset classes. We leverage advanced technology, quantitative research, and sophisticated models to capitalize on opportunities in global markets. Our culture is built on innovation, efficiency, and transparency, providing our professionals with the tools and flexibility to succeed.


Position Overview

As a Site Reliability Engineer (SRE), you will own the intersection of performance-focused Linux engineering and modern DevOps automation. We are looking for a "builder" rather than a "maintainer": someone who is obsessed with microsecond-level performance but refuses to achieve it through manual effort. Your goal is to ensure our systematic trading strategies have a rock-solid foundation that scales through code, not headcount.

You will work with developers and traders to ensure production systems meet strict performance and reliability expectations, while also driving the next phase of operational maturity.


The Evolution: What You Will Drive

  • Automated Performance: You won’t just tune a kernel; you’ll write the code (Ansible/Terraform) that ensures every server is "born" optimized with CPU isolation, NUMA alignment, and PTP time-sync.
  • Proactive Observability: You’ll move us past simple "up/down" monitoring to deep-system profiling (eBPF, Prometheus) to catch micro-bursts before they impact a trade.
  • Engineering Reliability: You will treat the entire stack, from BIOS settings to the application layer, as version-controlled code.


Key Responsibilities

Reliability & Production Ownership

  • Own the availability and performance of our Linux-based trading fleet (Red Hat, Rocky, Ubuntu).
  • Lead incident response and "Blameless Post-Mortems," focusing your energy on writing the code or process change that ensures a failure never happens twice.
  • Participate in on-call rotations, incident response, and post-incident reviews
  • Develop and maintain runbooks, documentation, and operational standards


Linux & Performance Engineering

  • Perform OS and system-level tuning (CPU topology, IRQ affinity, memory, networking) to ensure microsecond-level predictability.
  • Diagnose performance issues using advanced tools: perf, ftrace, tcpdump, and eBPF.
  • Partner with developers and traders to understand real-world trading workloads and optimize system behaviour accordingly


DevOps, Automation & Infrastructure as Code

  • Treat infrastructure as a version-controlled, reproducible system
  • Automate provisioning, configuration, and lifecycle management using Infrastructure as Code (Ansible, Terraform, etc.)
  • Ensure systems are “born optimized” with performance and reliability built in
  • Design and improve CI/CD pipelines that include automated performance benchmarking.
  • Reduce operational effort through scripting, tooling, and standardization
  • Support both containerized and non-containerized workloads where appropriate
  • Manage "plumbing" (DNS, NFS, LDAP, Multicast) through automation rather than manual config changes.


Monitoring, Observability & Self-Healing

  • Build and evolve monitoring, alerting, and logging for trading-critical systems
  • Implement SLIs and SLOs aligned with trading impact
  • Improve alert quality to reduce noise and improve response time
  • Implement automated remediation for known failure scenarios


Process & Maturity Advancement

  • Identify gaps in existing operational practices and propose better approaches
  • Introduce scalable processes that balance speed, safety, and performance
  • Help move the organization from reactive support toward proactive reliability engineering
  • Produce clear documentation, post-mortems, and operational procedures
  • Influence how reliability, performance, and operational readiness are considered during system design


Required Qualifications

  • 4–8+ years in Linux Engineering, SRE, DevOps, or infrastructure engineering roles
  • Hands-on experience with highly available, performance-sensitive Linux systems in production environments with real uptime and performance constraints
  • Deep understanding of Linux internals (scheduling, memory management, interrupts), filesystems, and storage behaviour
  • Solid understanding of networking concepts: TCP/IP, UDP, and multicast
  • Familiarity with observability stacks such as Prometheus, Grafana, or ELK
  • Experience with containers or orchestration platforms (Docker, Kubernetes)
  • Familiarity with Git-based workflows, CI/CD concepts, and deployment pipelines, including benchmarking, testing, and containerization under structured change management
  • Proficiency with automation and configuration management (Python, Ansible, Terraform, etc.)
  • Experience supporting or troubleshooting production databases
  • Experience integrating system and application logs with centralized logging and SIEM platforms
  • Understanding of incident management and on-call best practices


Why Join Us?

• Foundational Impact: Shape the firm’s core data platform and influence the long-term research and trading technology strategy.

• Location: Work from the company's headquarters in Cham, Switzerland.

• Startup Environment: Agile, entrepreneurial culture encouraging ownership and rapid decision-making.

• Efficient Infrastructure: Proprietary low-latency platform supporting systematic & discretionary trading.

• Benefits and pension contributions in line with Swiss labor regulations.
