HCLTech
Site Reliability Engineer (SRE)
HCLTech, Singapore (posted 21 hours ago)
Full-time, Remote Friendly, Information Technology

Job Purpose


The Site Reliability Engineer (SRE) combines software development and systems engineering to build and run distributed solutions in a secured, multi-tier, heterogeneous environment, in order to safeguard, provide and continuously improve the software and systems behind the organization’s cloud platform solutions.


The Job

  • Work closely with the Cloud Portfolio Director and the Cloud Engagement Lead to derive and implement a technical roadmap that is aligned with the IT architecture and cloud strategy.
  • Work closely with a team of engineers from different domains to implement new and enhanced services on the cloud platform.
  • Be the solution engineer for the cloud platform, which includes delivering new services via IaC on IaaS across hybrid cloud.
  • Use software as the primary tool to optimize systems, build infrastructure and remove mundane work through automation. Scale sustainably via automation and evolve services/solutions, leveraging IaaS across hybrid cloud to drive changes that improve reliability and velocity.
  • Engage in and improve the full lifecycle of cloud platform solutions, from design and deployment through operation and refinement, with accuracy and in compliance with organization policies and security requirements.
  • Automate repetitive tasks and optimize cloud operations.
  • Support services before go-live through activities such as system design consulting, developing software platforms and launch reviews. Maintain post-live cloud operations by measuring and monitoring availability, latency and overall system health, taking prompt remedial action where needed.
  • Support cloud adoption, including migration of new and legacy systems from traditional infrastructure to the GEL Cloud platform.
  • Deploy product updates as required and implement integrations as they arise. Specify, document and develop new product features, and write automation scripts.

Our Requirements

  • Strong hands-on experience using and designing VMware Cloud Foundation solutions, including NSX-T, vRealize Suite and vSphere/vCenter, is mandatory.
  • Ansible Automation Platform or Ansible skills are an added advantage.
  • Strong hands-on experience using and designing public cloud (AWS, GCP or Azure) landing zones and infrastructure layers.
  • Strong working experience in patch management for operating systems and middleware (e.g. Windows, Red Hat, WebSphere, WebLogic, MSSQL) is mandatory.
  • Hands-on experience creating and maintaining VMware server templates/blueprints, such as Red Hat and Windows Server templates.
  • Hands-on experience with infrastructure-as-code, orchestration, configuration management and provisioning tools is mandatory.
  • Systematic problem-solving approach, coupled with effective communication skills and a sense of ownership and drive.
  • Strong experience in a Continuous Integration/Continuous Delivery (CI/CD) environment, with a strong appreciation of change/version control processes and methodologies.
  • Experience with DevOps and automation tools (e.g. Selenium, SoapUI, Bamboo, Jenkins, Ansible, Marvin, GitHub, Bitbucket, Nexus, Jira, Confluence).
  • Must be able to write, debug and optimize code and automate mundane tasks.
  • Experience with scripting languages such as Bash, Batch and PowerShell, and with YAML.
  • Experience implementing distributed solutions in a secured multi-tier heterogeneous environment is mandatory.
