Lenovo is a US$69 billion revenue global technology powerhouse, ranked #196 in the Fortune Global 500, and serving millions of customers every day in 180 markets. Focused on a bold vision to deliver Smarter Technology for All, Lenovo has built on its success as the world’s largest PC company with a full-stack portfolio of AI-enabled, AI-ready, and AI-optimized devices (PCs, workstations, smartphones, tablets), infrastructure (server, storage, edge, high-performance computing, and software-defined infrastructure), software, solutions, and services. Lenovo’s continued investment in world-changing innovation is building a more equitable, trustworthy, and smarter future for everyone, everywhere. Lenovo is listed on the Hong Kong Stock Exchange under Lenovo Group Limited (HKSE: 992) (ADR: LNVGY).
To find out more, visit www.lenovo.com and read about the latest news via our StoryHub.
We are seeking a skilled Golang Engineer to design and implement comprehensive monitoring and observability solutions for our cloud infrastructure. This role is responsible for building scalable monitoring systems that provide real-time visibility into system health and performance across Linux, OpenStack, and Kubernetes environments.
Key Responsibilities
- Design and develop core components of Kubernetes-based container platforms using Golang, focusing on control plane extensions, operators, and cloud-native service meshes.
- Implement and optimize Kubernetes networking (CNI plugins like Calico/Cilium) and storage solutions (CSI drivers, Rook/Ceph integration), addressing challenges in multi-tenant isolation and high-throughput data paths.
- Troubleshoot deep-level Kubernetes issues (e.g., etcd corruption, kube-scheduler deadlocks, CNI policy conflicts) using Golang debugging tools (pprof, delve) and log analysis.
- Build automation frameworks for cluster lifecycle management, security hardening, and observability using Golang (primary) and Python (secondary for scripting).
- Collaborate with infrastructure teams to align platform capabilities with AI workload requirements, optimizing resource scheduling for GPU/accelerator workloads.
Qualifications
Technical Expertise:
- Mastery of Golang: 3+ years building production-grade systems with goroutines, interfaces, and the standard library and ecosystem packages (e.g., net/http, k8s.io/client-go).
- Kubernetes Internals: Deep understanding of control plane components (API server, scheduler, controller manager) and ability to extend via CRDs/Operators.
- Network/Storage Proficiency: Hands-on experience selecting and implementing CNI (VXLAN/BGP modes) and CSI solutions (RBD, iSCSI), with performance benchmarking skills.
- Linux/Container Expertise: Proficient in cgroups, namespaces, and container runtimes (containerd, CRI-O) for debugging resource leaks or security flaws.
- 3+ years developing cloud infrastructure with Golang as primary language, including at least one major Kubernetes platform project (e.g., cluster autoscaler, custom scheduler).
- Demonstrated ability to resolve critical production issues (e.g., etcd leader election failures, network policy drops) in large-scale clusters (1k nodes).
- Rigorous analytical approach to system design and failure root-cause analysis.
- Ability to document complex technical concepts for cross-team alignment.
- Kubernetes SIG contributions (e.g., networking, storage, or scheduling working groups).
- Experience with eBPF-based tools (Cilium, Pixie) for advanced network observability.
- Proficiency in Python for infrastructure scripting (Ansible/Terraform integrations) or Java for enterprise service interoperability.
- Familiarity with service meshes (Linkerd, Istio) and GitOps pipelines (Argo CD, Flux).
- Knowledge of cloud-native security (OPA/Gatekeeper, Kyverno) and AI/ML workload optimization.