SRE DevOps
Dublin/Hybrid
Permanent
The Role
- Plan, manage, and oversee all aspects of a Production Environment
- Define strategies for application performance monitoring and optimization in the production environment.
- Respond to incidents, improve the platform based on feedback, and measure the reduction of incidents over time.
- Support deployment of code into multiple lower environments, and support current processes with an emphasis on automating everything as soon as possible.
- Design, develop, and standardize monitoring and alerting mechanisms for the supported applications.
- Take a holistic approach to problem solving by connecting the dots during a production event across the technology stacks that make up the platform, to optimize mean time to recover.
- Engage in and improve the whole lifecycle of services—from inception and design, through deployment, operation and refinement.
- Analyze ITSM activities of the platform and provide a feedback loop to development teams on operational gaps or resiliency concerns.
- Support services before they go live through activities such as system design consulting, capacity planning and launch reviews.
- Support the application CI/CD pipeline for promoting software into higher environments through validation and operational gating, and lead in DevOps automation and best practices.
- Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
- Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
- Work with a global team spread across tech hubs in multiple geographies and time zones.
- Share knowledge and explain processes and procedures to others.
- Mentor junior team members.
- Perform on-call duties on a rotational basis.
- Work occasional off-hours as required.
Requirements
Key Skills: Must Have
Jenkins
Chef
Bash
Splunk
Dynatrace
Linux
Bitbucket
Problem Management
ITIL
Remedy
Good to Have
Python
AWS (migrating to AWS)
Key Responsibilities
What You’ll Do:
• Demonstrate and innovate SRE practices by collaborating with stakeholders to implement important SRE principles and objectives and create new practices where applicable.
• Partner with product and platform teams to define and track service level objectives (SLOs) and indicators (SLIs).
• Monitor and manage system reliability performance, ensuring systems meet SLOs.
• Communicate reliability concerns and their potential impact with key stakeholders.
• Promote the prioritization of reliability throughout the software development life cycle.
• Design, code, test, and deliver solutions to automate manual operations.
• Participate in on-call rotations, provide support for SRE systems, and lead or participate in post-mortem incident analysis.
• Engage in system design, capacity planning, and architecture discussions to ensure operational requirements are met.
• Share lessons learned and best practices regarding reliability and performance with stakeholders and team members.
• Assist in training and mentoring fellow junior SREs to ensure best practices are followed and scaled within the organization.
• Pursue continuous improvement opportunities to stay up to date on SRE methods and trends and participate in organizational learning initiatives.
• Support governance and ensure compliance with policies by collaborating with security, compliance, and other teams.
• Respond promptly to requests for assistance from technical customers, providing engineering support and best-practice guidance.
• Adhere to standard operating procedures and suggest improvements to them, and advocate for automation and workflow optimization.
Team Specific Skills
It is not expected that any single candidate would have expertise across all these areas, but a Biz Ops engineer will spend time throughout their career with various aspects of the role:
Operational Resiliency Architect:
• Support application health, performance, and capacity.
• Assist in system design consulting, capacity planning, and launch reviews.
• Collaborate with development and product teams to establish monitoring and alerting strategies.
DevOps/Automation:
• Engage in development, automation, and business process improvement.
• Support CI/CD pipelines and promote software into higher environments.
• Increase automation and tooling to reduce manual intervention.
ITSM Practices:
• Analyze ITSM activities and provide feedback to development teams on operational gaps or resiliency concerns.
• Perform root cause analysis of incidents and work with development teams to resolve issues.
Join Fulcrum Digital Inc and take your career to the next level!

