Site Reliability Engineer

HIGOGAME

Singapore · Full-time · Mid-Senior

Company Description

Higogame is a trailblazer in the mobile gaming and entertainment industry. Since our inception in late 2020, we have been dedicated to transforming the gaming landscape in Southeast Asia and beyond, delivering innovative and immersive experiences that engage millions of players around the globe.

Our revenue has seen remarkable growth year after year, with operations extending across multiple regions worldwide.
In just three years, we have risen to rank among the top two in our game category in the local market.
We serve around 2 million daily active users and 5 million monthly active users worldwide.
Our team consists of over 200 talented employees, including a robust R&D division of more than 100 experts.
We offer exceptional career development opportunities and foster a multinational culture that empowers everyone to reach their full potential.

Join us as we continue to push the boundaries of mobile gaming!

Job Responsibilities:

Responsible for the full lifecycle management of the company’s global, multi-region infrastructure. Lead the setup of the Singapore physical data center and the in-depth operation of the Google Cloud (GCP) platform. Drive automation and intelligent operations systems to ensure high availability, low cost, and scalable business operations. The role requires both traditional data center operations experience and cloud-native technical vision, serving as the key technical link between physical resources and cloud capabilities.

I. Core Responsibilities

1. Physical Data Center Planning & Implementation

Lead end-to-end management of self-built/hosted data centers: requirements analysis, architecture design (network/power/cooling/cabling), equipment selection (servers/switches/UPS), construction acceptance, and post-operations optimization.
Design multi–data center disaster recovery architectures (e.g., active-active sites, or two sites with three data centers), including cross-site synchronization and failover strategies to ensure business continuity.
Manage internal resource backup/disaster recovery, including art assets, code, and other data assets.

2. Google Cloud (GCP) Deep Operations & Optimization

Design and manage GCP architecture (Compute Engine, VPC, Cloud Storage, GKE, BigQuery, etc.), supporting cloud migration and hybrid cloud deployment of core business systems.
Lead full lifecycle management of cloud resources, including cost optimization (committed use discounts, GCP's counterpart to reserved instances; autoscaling; idle resource reclamation), performance tuning (network latency, storage IOPS, compute utilization), and security hardening (IAM governance, encryption policies, vulnerability scanning).
Build cloud-native ops systems using Cloud Monitoring/Logging for real-time alerting and fault detection.

3. Automation & Intelligent Operations Systems

Lead development and integration of operations toolchains (e.g., Ansible/Puppet automation, Prometheus+Grafana monitoring, ELK logging) to shift operations from manual to platform-based and intelligent.
Integrate CI/CD pipelines with cloud platforms, optimizing deployment efficiency and stability of containerized (K8s) and serverless (Cloud Functions) workloads.
Lead root cause analysis (RCA) and postmortems of major incidents, deliver improvement plans, and strengthen contingency planning and drills (e.g., data center power outage, cloud region failure).

4. Cross-Team Collaboration & Technical Enablement

Collaborate with R&D, QA, and Product teams to provide infrastructure support for rapid business delivery.
Develop operations standards and technical documentation, drive team knowledge sharing, and mentor junior engineers.

II. Requirements

Basic Qualifications

Bachelor’s degree or higher in Computer Science, Network Engineering, Cloud Computing, or related fields.
5+ years in IT operations, including 3+ years in physical data center build/ops, and 2+ years of hands-on GCP experience (must provide project examples).
Experience in large-scale distributed systems, with solid knowledge of Linux, network protocols (TCP/IP, SDN), and high availability database architectures (MySQL/Redis).
Must be able to converse in Mandarin, as the role involves travel to China and communication with Chinese-speaking stakeholders.
Must be able to travel (up to 50% of the time).

Technical Skills

Data Center

Familiar with data center infrastructure (power/cooling/fire safety/cabling) and optimization metrics such as PUE/CUE.
Experience in IDC hosting, custom data center builds, or third-party acceptance audits. Knowledge of industry standards (e.g., GB50174 Data Center Design Standard).

Google Cloud (GCP)

Proficient in GCP core services: GCE, VPC, Cloud SQL/Spanner, GKE.
Skilled in GCP cost management (Budgets & Alerts, preemptible VMs, storage tiers).
Strong in GCP security: IAM, VPC Service Controls, Cloud Firewall, KMS.

Automation & Toolchains

Skilled with Terraform/Ansible for IaC, and with scripting in Shell/Python/Go for ops tooling.
Experienced in Prometheus+Grafana monitoring, ELK/OpenTelemetry for logging & tracing.
Hands-on Kubernetes operations (scaling, node management, Helm) and CI/CD pipeline integration (Jenkins/GitLab CI).

Soft Skills

Strong troubleshooting skills and resilience, able to quickly resolve complex incidents (e.g., data center outage, regional cloud failure).
Excellent cross-team communication and project leadership skills.
Fast learner who stays current with cloud-native (CNCF), AIOps, and industry trends.

III. Nice-to-Haves

GCP certifications (e.g., Professional Cloud Architect, Associate Cloud Engineer) or ITIL/ISO20000.
Led/participated in large-scale data center builds (multi-million) or GCP ops projects with million+ annual cloud spend.
Experience in hybrid cloud (GCP + on-premises) or edge computing ops.
Published blogs, open-source contributions, or active participation in tech communities (GitHub, CNCF events).

IV. What We Offer

Competitive salary
Global platform: Participate in building multi-region intelligent operations infrastructure.
Growth: Internal tech sharing, external conferences, certification & training support.
Work environment: Flat management, flexible hours, free snacks, and comprehensive medical, hospital, and dental coverage.

Key Skills

Cloud, GCP, storage, high availability, CI/CD, Kubernetes, serverless, firewall, Linux, ELK, SDN

Posted: Nov 12, 2025
Type: Full-time
Level: Mid-Senior
Location: Singapore
Company: HIGOGAME

Industries

Information Services Computer Games
