Site Reliability Engineer

We are looking for a motivated Site Reliability Engineer (SRE) with strong Linux expertise to join our Engineering team. You will be responsible for ensuring the reliability, scalability, and performance of our systems and applications across global environments. This role blends system administration, DevOps, and application support, with a heavy focus on automation, observability, and continuous improvement.

Primary Duties:

Manage and maintain Linux-based production environments, ensuring uptime, performance, and security.
Support and monitor application services, identifying bottlenecks and proactively resolving issues.
Automate system operations, deployments, and scaling using modern tools and scripting.
Build and maintain observability stacks (logging, metrics, tracing) with tools such as Prometheus, Grafana, and ELK/EFK.
Collaborate with developers, QA, and product teams to design reliable and scalable systems.
Implement and manage CI/CD pipelines for application deployments.
Perform incident response and root cause analysis, and drive postmortems with clear action items.
Manage capacity planning, disaster recovery strategies, and performance tuning.

Required Skills:

3+ years of experience in Linux system administration and/or SRE roles.
Strong knowledge of Linux internals, networking, and system troubleshooting.
Hands-on experience with cloud platforms (AWS, GCP, or Azure).
Proficiency in scripting/automation (Bash, Python, or Go).
Experience with containerization and orchestration (Docker, Kubernetes).
Familiarity with monitoring/observability tools (Prometheus, Grafana, ELK stack).
Understanding of CI/CD workflows and tools (Jenkins, GitLab CI/CD, ArgoCD).
English at an intermediate level or higher for technical communication and documentation.

Nice-to-Have Skills:

Experience with infrastructure-as-code (Terraform, Ansible, Helm).
Exposure to service meshes (Istio, Linkerd) or API gateways.
Knowledge of databases (MySQL, PostgreSQL, Redis, MongoDB).
Familiarity with security practices for Linux and cloud workloads.
Strong understanding of high availability, scaling strategies, and performance optimization.

We Offer:

Employment in a stable, well-recognized international company.
Competitive salary and benefits package.
Supportive and professional team environment.
Flexible working hours and modern office space.
Comprehensive medical insurance.
Training programs and excellent travel opportunities.
Recognition and career growth opportunities.
