POSITION SUMMARY
As a Site Reliability Engineer (SRE), your responsibilities will include building scalable infrastructure on which we deliver our software. You will help ensure the reliability, availability, and performance of our production and development infrastructure.
You will collaborate with cross-functional teams to drive reliability automation, optimize deployment strategies, and enhance infrastructure monitoring.
PRIMARY RESPONSIBILITIES
- Develop and maintain automation and processes to improve system reliability and enable teams to build and deploy secure and scalable applications in AWS using technologies such as Kubernetes and Terraform.
- Establish and maintain infrastructure and application monitoring systems.
- Define and monitor SLIs, SLOs, and SLAs to ensure operational excellence.
- Analyze usage trends to forecast infrastructure needs and ensure scalability.
- Conduct load testing to validate system capacity and optimize performance.
- Participate in the full incident lifecycle: preparation, detection, response, analysis, and post-incident learning. Join the on-call rotation and respond promptly to team- or business-critical incidents.
- Work closely with development teams in all phases of the SDLC to identify bottlenecks and areas for improvement.
- Guide and encourage teams to follow SRE best practices.
- Participate in operations efforts and be the point person for infrastructure activities.
- Participate in architectural decisions to help improve the quality of our software and infrastructure.
QUALIFICATIONS
- BS in Computer Science, Engineering, or a related field, or equivalent practical experience.
- 3 years of experience as an SRE or in a similar role.
KNOWLEDGE, SKILLS, AND ABILITIES
- Strong problem-solving and analytical skills, including the ability to troubleshoot complex issues ranging from system resourcing and network problems to application stack traces.
- Experience with monitoring tools (e.g., Prometheus, Grafana, Datadog).
- Strong proficiency in programming or scripting languages (e.g., Python or Bash).
- Hands-on experience with Kubernetes, Docker, and infrastructure-as-code tools (e.g., Terraform).
- Proven expertise in managing AWS Cloud Infrastructure.
- Experience in Linux/Unix administration.
- Ability to read and understand Java and Python code.
- Excellent communication and collaboration abilities, including the ability to justify and advocate for the right solution.
- Ability to work effectively in a cross-functional, fast-paced environment.
- Nice to have:
  - Knowledge of database operations and performance optimization.
  - Experience with GitLab.
  - Experience with Atlassian services.
  - Experience programming in Java or other OOP languages.
EEO STATEMENT
Orion is an equal opportunity employer, and all qualified applicants will receive consideration for employment without regard to race, color, creed, religion, sex, sexual orientation, gender identity or expression, pregnancy, age, national origin, citizenship status, disability status, genetic information, protected veteran status, or any other characteristic protected by law.
- Posted: Feb 06, 2025
- Type: Full-time
- Level: Mid-Senior
- Location: Istanbul
- Company: Orion Innovation Turkey