Playtika

Site Reliability Engineer

Ukraine · Full-time · Mid-Senior

Responsibilities:

  • Maintain and improve existing monitoring configurations (alerts, dashboards, service discovery, scrape configs, etc.)
  • Implement and enhance alerting logic, including threshold tuning and dynamic alert conditions
  • Troubleshoot monitoring and metrics-related issues (e.g., missing data, false alerts, broken dashboards)
  • Support and improve self-developed metrics collectors and Python-based monitoring services
  • Assist NOC and SRE teams with alert deduplication, escalation rules, and alert quality improvements
  • Participate in design and implementation of observability improvements for new services and infrastructure components
  • Review, modify, and extend existing scripts and plugins (primarily Python and Bash)
  • Provide monitoring-related guidance to development, infrastructure, and operations teams
  • Ensure monitoring tools and services operate reliably within Kubernetes clusters and Linux systems
  • Maintain monitoring configuration in Git and follow internal version control best practices
  • Participate in cross-team initiatives to improve the overall monitoring and incident response ecosystem

Requirements:

  • Strong hands-on experience with Linux systems (primarily Ubuntu)
  • Practical knowledge of the Prometheus ecosystem, VictoriaMetrics, Grafana, and Zabbix
  • Experience supporting monitoring systems in Kubernetes-based infrastructure
  • Solid scripting skills (Bash)
  • Familiarity with Git and common version control workflows
  • Good understanding of networking and infrastructure concepts (ports, protocols, DNS, etc.)
  • Ability to troubleshoot metric collection, alert firing, and data visualization issues
  • Basic knowledge of SQL (e.g., for querying time-series or metadata stores)
  • Strong communication skills for cross-functional collaboration

Nice to have:

  • Understanding of high-availability and failover patterns in observability systems
  • Experience working with SLO/SLA-based alerting or anomaly detection mechanisms
  • Exposure to automation and CI/CD pipelines for monitoring infrastructure

Key Skills

Ranked by relevance

Kubernetes, Python, Linux, Git, data visualization, incident response, Prometheus, Grafana, CI/CD, SQL, DNS
Posted: Jul 30, 2025
Type: Full-time
Level: Mid-Senior
Location: Vinnytsya
Company: Playtika
Industries: Computer Games
Categories: Engineering, Information Technology
