Main responsibilities:
1. Ensure the stability of the application system and bring it online in new regions.
2. Maintain day-to-day system stability across multiple cloud vendors.
3. Build out CI/CD and DevOps practices.
4. Assist in the deployment and upgrade of application systems to ensure a smooth transition.
5. Maintain operations documentation, including system change records and operation manuals.
6. Provide technical support to solve technical problems raised by customers and internal teams.
7. Collaborate with the development team to support the development and testing environments of application systems.
Job requirements:
Educational background:
1. College degree or above in computer science, information technology, software engineering, or a related major (or equivalent overseas education).
Hands-on background:
1. At least 3 years of SRE operation and maintenance experience.
2. Experience in SRE operations and maintenance for large enterprises or multinational corporations is preferred.
Certification requirements:
1. Familiar with ITIL (Information Technology Infrastructure Library) framework and best practices for application operation and maintenance.
2. Holders of relevant technical certifications (such as RHCE, MCSE, AWS, Azure, CKA, SRE Foundation/Professional, etc.) are preferred.
Skill requirements:
1. Proficient in Kubernetes (k8s) container orchestration, containerization technology (such as Docker), and deployment, operation, monitoring, and troubleshooting of microservice architecture.
2. Proficient in relevant CI/CD toolchains (such as Jenkins, GitLab CI, Argo CD, etc.) and DevOps practices.
3. Proficient in Linux/Unix system management. Familiarity with common application operation and maintenance monitoring tools (such as Prometheus, Grafana, ELK/Loki, etc.) and automated configuration management tools (such as Ansible, Terraform, etc.) is preferred.
4. Basic programming skills in at least one general-purpose programming or scripting language (such as Python, Go, Shell, etc.).
5. Familiar with the core services of cloud platforms such as AWS, GCP, Azure, etc., with experience in cloud environment operation and maintenance.
6. Experience in managing databases such as MySQL, PostgreSQL, Redis, etc. is preferred.
7. Good communication and coordination skills and a strong team spirit, with the ability to collaborate efficiently with development teams, testing teams, and other technical departments to jointly ensure system stability and promote continuous improvement.
Language requirements:
1. Fluent in English, capable of professional oral and written communication.
2. Proficiency in Chinese or another foreign language (such as Spanish, Malay, German, Arabic, etc.) is preferred.
Posted: Apr 07, 2026
Type: Full-time
Level: Mid-Senior
Location: Bucharest
Company: ThunderSoft