Location: Toronto, ON (Hybrid, 4 days onsite per week)
Employment Type: Contract
Interview Type: Face-to-face (onsite interview only)
Job Description
What is the Opportunity?
We are seeking a Senior Site Reliability Engineer for our Application Maintenance and Transformation, Data Services and Integration team. As a Senior Site Reliability Engineer, you will bring an engineering mindset of bold ambition, curiosity, and outcome focus to ensuring the performance and reliability of our systems. This role calls for a dynamic individual who excels in a collaborative environment and works with cross-functional teams to establish best practices for observability, monitoring, logging, alerting, and automation.
What will you do?
- Set the vision for the SRE product base (monitoring, alerting, self-healing, reliability testing).
- Lead cross-functional collaborations to define and implement best practices for monitoring, logging, and incident response, driving a proactive stance on system health.
- Function as the portfolio Subject Matter Expert (SME): understand and document the common components, core functionality, and infrastructure of supported applications.
- Actively participate in deploying software applications, automation tools, and IT infrastructure.
- Work closely with development teams to understand code changes and their impact on the production environment, ensuring that new releases meet our reliability standards.
- Drive transformation by continuously looking for ways to automate existing SRE processes and increase operational efficiency.
- Guide the technical direction for future deployments, advocating for reliability and performance improvements based on industry trends and company objectives.
- Lead incident management and problem management for in-scope applications, and own the fulfillment of root cause analysis (RCA) action items.
- Debug production issues across services and levels of the stack and provide primary operational support.
- Perform occasional off-hours support.
What do you need?
- Bachelor’s degree in Computer Science, Electrical or Electronics Engineering, or a related field, or equivalent experience.
- 3+ years of IT experience in software development or maintenance, SRE, or DevOps engineering.
- 1+ years of experience building Java Spring Boot applications and developing REST APIs.
- Experience working with relational databases (MS SQL Server, MySQL, MariaDB, or SingleStore) or in-memory distributed databases.
- Experience working with containerization platforms such as Docker and container orchestration tools such as Kubernetes (Azure Kubernetes or OpenShift Kubernetes Service preferred).
- Solid Git skills and experience with popular CI tools such as Jenkins or UCD.
- Experience working with Windows- and Linux-based infrastructure.
- 1+ years of experience developing cloud-native applications using Java or Python.
- Experience writing SQL queries, including query tuning and optimization.
- Experience using centralized logging solutions (Splunk, ELK (preferred), etc.) and active monitoring systems (Dynatrace, etc.).
- Experience deploying and operating cloud-native applications in a private cloud (OpenShift) or a public cloud (Azure/AWS preferred).
- Strong, proactive communication skills for reporting the status of projects and production issues.
- Must be a motivated, resourceful self-starter, driven to work with cross-functional teams in large enterprises with complex org structures to meet business delivery timelines.
- Financial Services domain knowledge, preferably in Capital Markets and Wealth Management.
- Experience implementing dashboards to help teams visualize logs, instrumentation, and other data to ensure optimal performance of the platform services, infra, and deployed applications (Grafana preferred).
- Exposure to data warehouse platforms such as Informatica, Snowflake, or Databricks, and to business intelligence tools such as SAP BO or similar.
- Experience creating runbooks, processes, and test plans around reliability, performance, etc. of infrastructure and applications.
- Exposure to PagerDuty, Postman, ServiceNow, SonarQube, NexusIQ and vault tools.
- Exposure to event brokers such as Kafka or IBM MQ, and to mainframe tools and environments.
- Exposure to industry disaster recovery test exercises.
Join Aarorn Technologies Inc and take your career to the next level!