Network and Systems Operations Engineer (Family Networking)

Coherent Solutions is a digital product engineering company focused on empowering business success. Our global team of 1800+ talented professionals collaborates seamlessly to deliver innovative solutions that drive measurable business impact. Headquartered in Minneapolis, USA, the company’s core competencies across 10 locations worldwide include product software development, IT consulting, data and analytics, machine learning, mobile app development, DevOps, Salesforce, and more.


We grow a team of advisors, not just order takers, and strive to make the company a place for career growth and opportunities. If you want to grow your core competencies, share your passion, and be sure that every contribution is evaluated, we are on the same page.


Company Background

Our client is a publicly traded technology company focused on family safety and connectivity, serving millions of users across 140 countries. Their platform provides real-time location sharing, crash detection, roadside assistance, and other safety features. The company operates in a Remote First environment, fostering inclusivity, innovation, and collaboration.


Project Description

The Network and Systems Operations (NSO) Team is part of Cloud Operations, supporting over 325 engineers. The team's mission is twofold:

  • Providing world-class observability infrastructure and tooling for system monitoring and reporting;
  • L1 service support and incident management, ensuring high availability and reliability of services.

The role involves monitoring, responding to alerts, and executing runbooks to resolve service issues. The system comprises dozens of microservices, all requiring tracking, reporting, and optimization. The position requires strong troubleshooting skills, familiarity with observability tools, and a proactive approach to automation.


Technologies

  • Prometheus
  • Grafana
  • Datadog
  • Java
  • Python
  • Shell
  • Ruby
  • Docker
  • Kubernetes
  • AWS
  • Terraform
  • CloudFormation
  • Chef
  • Ansible


What You'll Do

  • Use tools such as Prometheus, Grafana, and Datadog to create and maintain observability infrastructure and tooling, including creating alerts, production reporting, and writing documentation;
  • Serve as a member of L1 support, working alone or with teammates to answer pages for all onboarded services and resolve or escalate issues in a timely manner;
  • Utilize anomaly detection and alerting, respond to alerts in PagerDuty, drive incidents to their conclusion, and lead the effort to strengthen the system based on post-mortem action items;
  • Coordinate cross-team and cross-functional efforts with processes, documentation, and tooling to ensure operational excellence.


Job Requirements

  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience;
  • 5+ years of experience writing, reading, and debugging code in one or more languages, such as Java, Python, Shell, or Ruby;
  • 5+ years of experience working with large-scale distributed systems and managing Linux-based systems in a cloud environment such as AWS;
  • In-depth experience with large-scale observability and reporting systems (New Relic, Datadog, Elastic, Prometheus, etc.);
  • 3+ years of experience with solutions such as Docker, Kubernetes, system virtualization, cloud monitoring, and logging;
  • 3+ years of experience with IaC and configuration management tools such as Terraform, CloudFormation, Chef, or Ansible;
  • Experience working as part of a team, applying analytical and problem-solving skills;
  • Excellent troubleshooting skills and attention to detail;
  • Ability to quickly learn new technologies and follow industry trends;
  • Ability to analyze and optimize high-traffic internet applications.


What We Offer

The global benefits package includes:

  • Technical and non-technical training for professional and personal growth;
  • Internal conferences and meetups to learn from industry experts;
  • Support and mentorship from an experienced employee to help you grow and develop professionally;
  • Internal startup incubator;
  • Health insurance;
  • English courses;
  • Sports activities to promote a healthy lifestyle;
  • Flexible work options, including remote and hybrid opportunities;
  • Referral program for bringing in new talent;
  • Work anniversary program and additional vacation days.


Please take your time to review the Coherent Solutions Privacy Policy for Job Applicants for details on how we process your personal data: https://www.coherentsolutions.com/privacy-policy-for-job-applicants

Post Date
2025-05-08
Job Type
REMOTE
Employment type
Full-time
Category
Engineering, Information Technology
Level
Mid-Senior
Country
Romania
Industry
Software Development
Coherent Solutions Romania