My client is an innovative and rapidly growing SaaS company that delivers cutting-edge solutions and is seeking a talented and driven Site Reliability Engineer (SRE) to join our growing engineering team. This is an exciting opportunity to be part of a high-impact team focused on ensuring the availability, scalability, and performance of our platform in a fast-paced and dynamic environment.
Key Responsibilities:
- System Reliability: Ensure the reliability, availability, and performance of our SaaS platform by developing and maintaining automated monitoring, alerting, and incident response systems.
- Automation & Tooling: Automate manual processes and optimize operational workflows to reduce overhead and improve efficiency. Build tools to manage infrastructure at scale.
- Capacity Planning & Scaling: Plan and execute scaling strategies, ensuring that infrastructure can handle growth and demand spikes without impacting user experience.
- Incident Management: Lead the response to incidents, perform root cause analysis (RCA), and put in place preventive measures to reduce recurring issues.
- Collaboration: Work closely with Development, QA, and Operations teams to build processes and solutions that optimize the balance between development velocity and system reliability.
- Continuous Improvement: Help drive the adoption of best practices across engineering teams, improve our deployment pipelines, and ensure systems are secure, highly available, and well-documented.
- Performance Optimization: Monitor and optimize system performance, identify bottlenecks, and implement effective solutions.
Requirements:
- 3+ years of experience in a Site Reliability Engineering, DevOps, or similar role, in a SaaS environment or working with large-scale distributed systems.
- Proficiency in cloud platforms such as AWS, Azure, or GCP.
- Strong experience with containerization technologies (Docker, Kubernetes).
- Proficiency in scripting and automation (e.g., Python, Bash, Go).
- Familiarity with CI/CD pipelines and related tools (e.g., Jenkins, GitLab, CircleCI).
- Experience with monitoring and observability tools (e.g., Prometheus, Grafana, Datadog, ELK stack).
- Knowledge of infrastructure-as-code tools (e.g., Terraform, CloudFormation).
- Familiarity with incident response, postmortems, and continuous improvement processes.
- Strong troubleshooting and problem-solving skills in distributed systems.
- Excellent communication skills with the ability to work cross-functionally with product, engineering, and operations teams.
- A degree in Computer Science, Engineering, or a related field is preferred, though relevant experience is valued.
Nice to Have:
- Experience with service mesh technologies (e.g., Istio).
- Background in microservices architecture and its challenges.
- Familiarity with security best practices in cloud-based systems.
- Posted: Feb 04, 2025
- Type: Full-time
- Level: Mid-Senior
- Location: Dublin
- Company: Solas IT Recruitment