Intellias
DevOps Engineer (Kubernetes & New Relic)
Intellias · Portugal · 6 days ago
Full-time · Remote Friendly · Information Technology, Engineering

We are looking for an experienced Kubernetes & Observability Engineer to join our team and help us build robust, scalable, and observable infrastructure. In this role, you will be responsible for designing and maintaining Kubernetes clusters, implementing best practices for monitoring and alerting using New Relic, and ensuring high availability and performance of our systems. If you're passionate about cloud-native technologies, automation, and delivering reliable solutions, we’d love to hear from you.


Project Overview:

Our customer is a multinational corporation with more than a century of history and offices in over 180 countries. Their most ambitious current goal is to introduce a range of Reduced-Risk Products (RRPs). The target audience is more than 1 billion consumers around the globe. The IT platform hosts 700+ applications.

Intellias' mission is to help the client engineer a comprehensive software ecosystem for a game-changing IoT product at the intersection of innovative consumer experience and cutting-edge technology. Our teams are involved in the engineering of core platform components for best-in-class eCommerce, Digital Marketing, and IoT solutions. As a Cloud Engineer, you will become part of the Core Architecture Team and be responsible for the architecture and implementation of best practices in our Digital Engineering Enterprise Platform.

The Platform is a set of services and internet applications that accelerate the development and delivery of software applications by taking care of common SDLC challenges. It gives engineering teams access to a set of services, technologies, and practices for developing and operating their applications, and it ensures adherence to compliance requirements and best practices.

The project has been in production for 2+ years and is supported by multiple teams.


Our technical domains are:

  • AWS cloud, partially Azure;
  • SSO, Organizations, Service control policies, access models;
  • IaC: Terraform Enterprise, Terratest, Chalice;
  • Serverless: Lambda, Step Functions, a wide range of miscellaneous automations, Fargate;
  • System, Application, Network and security architectures;
  • Orchestration: Kubernetes (EKS);
  • SRE activities (logging, tracing, monitoring), OpsGenie, Splunk;
  • HashiCorp Vault;
  • Hybrid Networking.


Requirements:

Technical Skills:

- Strong experience with Kubernetes:

  • Deployment, scaling, and management of containerized applications;
  • Helm charts, namespaces, RBAC, network policies;
  • Troubleshooting pods, services, ingress controllers;

- Observability & Monitoring:

  • Hands-on experience with New Relic:
    ◦ Setting up APM, infrastructure monitoring, dashboards;
    ◦ Custom instrumentation and alerting;
    ◦ Integration with Kubernetes clusters;
  • Understanding of metrics, logs, traces (OpenTelemetry is a plus);

- Infrastructure as Code:

  • Experience with Terraform or similar tools;

- CI/CD pipelines:

  • Integration of observability tools into deployment workflows;

- Cloud platforms:

  • Experience with AWS / GCP / Azure (at least one);

Soft Skills & Collaboration:

  • Ability to work closely with DevOps, SRE, and development teams;
  • Strong analytical and problem-solving skills;
  • Clear communication of technical issues and solutions;
  • Proactive in identifying performance bottlenecks and reliability risks;

Nice to Have:

  • Experience with other observability tools (Grafana, Prometheus, Datadog);
  • Familiarity with service meshes (Istio, Linkerd);
  • Scripting (Python, Bash, Go).


Responsibilities:

  • Design, build, and maintain software delivery pipelines and infrastructure that support continuous integration, delivery, and deployment.
  • Collaborate with development and operations teams to ensure that software is delivered with high quality, speed, and reliability.
  • Automate manual processes, such as testing, deployment, and monitoring, to improve efficiency and reduce errors.
  • Develop and maintain monitoring and alerting systems to proactively identify and address issues in production environments.
  • Troubleshoot production issues, conducting root cause analysis and implementing remediation plans.
  • Manage and scale infrastructure resources, such as servers, databases, and cloud services, to ensure optimal performance and cost-effectiveness.
  • Implement security best practices and ensure compliance with industry standards and regulations.
  • Continuously learn and keep up to date with new technologies and industry trends to improve system performance, security, and efficiency.


Why this position:

At Intellias, where technology takes center stage, people always come before processes. We're dedicated to cultivating a tech-savvy environment that empowers individuals to unlock their true potential and achieve extraordinary results. Our customized benefits not only prioritize your well-being but also fuel your professional growth, making this opportunity an ideal match for tech enthusiasts like you.
