Grid Dynamics
Senior Software Engineer (ML Infrastructure)
Grid Dynamics | Argentina | 3 days ago
Full-time | Remote Friendly | Engineering, Information Technology
This position is part of a project for a world-renowned high-tech company, focused on building and improving the infrastructure behind large-scale machine learning frameworks used across the industry. Engineers on this project ensure that complex C++ and Python codebases, including core ML frameworks and supporting libraries, remain stable, efficient, and well integrated as they evolve. The role combines software development with build engineering and CI/CD automation: continuously improving build and integration workflows and contributing to the evolution of these frameworks and their ecosystem.

Responsibilities

  • Develop, refine, and enhance large-scale build configurations for C++ and Python projects
  • Build and operate CI/CD pipelines to automate validation, testing, releases, and rollout of updates
  • Investigate and resolve complex build, dependency, and integration issues across multiple repositories
  • Implement code changes, fixes, and small features in ML frameworks and related libraries
  • Ensure reproducible and hermetic builds using modern toolchains, caching, and distributed testing
  • Manage and optimize containerized build and test environments (Docker)
  • Collaborate with infrastructure, release, and ML engineering teams to ensure consistent integration and delivery

Requirements

  • Strong proficiency in C++ and Python
  • Experience with modern build systems at scale, such as Bazel (preferred); experience with other large build systems (Buck, Pants, CMake, or similar) is also valuable
  • Hands-on experience with CI/CD automation (GitHub Actions preferred; Jenkins, Buildkite, or similar)
  • Proficiency with Git or Mercurial, including complex rebases, cherry-picks, and patch workflows
  • Strong Bash or shell scripting for automation and environment setup
  • Familiarity with Docker or similar container technologies for build and test automation
  • Detail-oriented, systematic approach to problem solving, with a focus on reliability and scalability

Nice to have

  • Experience working with large open source ML frameworks such as TensorFlow, PyTorch, or JAX
  • Familiarity with GPU build and testing workflows or multi-architecture builds
  • Exposure to distributed or hermetic build environments and remote execution
  • Understanding of dependency graph analysis and build tooling such as Bazel query or cquery

We offer

  • Flexible working hours (full-time).
  • One "Flex Day" off per month – eligible after six months with the company.
  • 10 business days of vacation.
  • Swiss Medical health coverage.
  • Permanent contract with salary review every four months (in ARS).
  • Access to Udemy and Platzi for professional training.
  • Employee Assistance Program (financial, nutritional, psychological support, etc.).
  • Fully covered English classes during working hours.
  • Discounts through Club de Beneficios and on Samsung products.
  • Birthday day off.

About Us

Mobile Computing is joining Grid Dynamics (NASDAQ: GDYN), a leading provider of technology consulting, platform and product engineering, AI, and advanced analytics services. Fusing technical vision with business acumen, we solve the most pressing technical challenges and enable positive business outcomes for enterprise companies undergoing business transformation. A key differentiator for Grid Dynamics is our 8 years of experience and leadership in enterprise AI, supported by profound expertise and ongoing investment in data, analytics, cloud & DevOps, application modernization and customer experience. Founded in 2006, Grid Dynamics is headquartered in Silicon Valley with offices across the Americas, Europe, and India.
