Aarorn Technologies Inc
Site Reliability Engineer
Aarorn Technologies Inc | Canada | 9 hours ago
Contract | Remote Friendly | Engineering, Information Technology
Job Title: Site Reliability Engineer

Location: Toronto, ON (Hybrid, 4 days onsite per week)

Employment Type: Contract Opportunity

Interview Type: Face-to-face (onsite interview only)

Job Description

What is the Opportunity?

We are seeking a Senior Site Reliability Engineer for our Application Maintenance and Transformation, Data Services and Integration team. As a Senior Site Reliability Engineer, you will bring an engineering mindset of bold ambition, curiosity, and outcome focus to ensuring the performance and reliability of our systems. This role calls for a dynamic individual who excels in a collaborative environment and works with cross-functional teams to establish best practices for observability, monitoring, logging, alerting, and automation.

What will you do?

  • Set the vision for the SRE product base (monitoring, alerting, self-healing, reliability testing).
  • Lead cross-functional collaborations to define and implement best practices for monitoring, logging, and incident response, driving a proactive stance on system health.
  • Function as the portfolio SME (Subject Matter Expert): understand and document the common components, core functionality, and infrastructure of supported applications.
  • Actively participate in deploying software applications, automation tools, and IT infrastructure.
  • Work closely with development teams to understand code changes and their impact on the production environment, ensuring that new releases meet our reliability standards.
  • Drive transformation by continuously looking for ways to automate existing SRE processes and increase operational efficiency.
  • Guide the technical direction for future deployments, advocating for reliability and performance improvements based on industry trends and company objectives.
  • Lead incident management and problem management for in-scope applications, and own the fulfillment of RCA (root cause analysis) action items.
  • Debug production issues across services and levels of the stack and provide primary operational support.
  • Perform occasional off-hours support.

Must-have

  • Bachelor’s degree in Computer Science, Electrical or Electronics Engineering, or a related field, or equivalent experience.
  • 3+ years of IT experience in software development and/or maintenance, SRE, or DevOps engineering.
  • 1+ years of experience building Java Spring Boot applications and developing REST APIs.
  • Experience working with relational databases (MS SQL Server, MySQL, MariaDB, or SingleStore) or in-memory distributed databases.
  • Experience working with containerization platforms such as Docker and container orchestration tools such as Kubernetes (Azure Kubernetes Service or OpenShift preferred).
  • Solid Git skills and experience with popular CI tools such as Jenkins or UCD.
  • Experience working on Windows and Linux based infrastructure.
  • 1+ years developing cloud-native applications using Java or Python.
  • Experience writing SQL queries, with query tuning and optimization skills.
  • Experience using centralized logging solutions (Splunk, ELK (preferred), etc.) and active monitoring systems (Dynatrace, etc.).
  • Experience deploying and operating cloud-native applications in a private cloud (OpenShift) or a public cloud (Azure/AWS preferred).
  • Strong, proactive communication skills regarding the status of projects and production issues.
  • Must be a motivated, resourceful self-starter, driven to work with cross-functional teams in large enterprises with complex org structures to meet business delivery timelines.
  • Financial services domain knowledge, preferably in capital markets and wealth management.

Nice-to-have

  • Experience implementing dashboards to help teams visualize logs, instrumentation, and other data to ensure optimal performance of the platform services, infra, and deployed applications (Grafana preferred).
  • Exposure to data warehousing tools such as Informatica, Snowflake, or Databricks, and business intelligence tools such as SAP BO or similar.
  • Experience creating runbooks, processes, and test plans around reliability, performance, etc. of infrastructure and applications.
  • Exposure to PagerDuty, Postman, ServiceNow, SonarQube, NexusIQ and vault tools.
  • Exposure to event brokers such as Kafka or IBM MQ, and to mainframe tools and environments.
  • Exposure to industry disaster recovery test exercises.
