Smartbrain.io
Monitoring and Observability Engineer
Smartbrain.io | Norway | 2 days ago
Full-time | Remote Friendly | Engineering, Information Technology
Remotely

Grafana, Python

This role involves designing, implementing, and managing comprehensive monitoring solutions using Prometheus, Grafana, SNMP-Exporter, Streaming Telemetry, OpenTelemetry, and other related technologies.

Responsibilities

  • Design, implement, and manage Prometheus-based monitoring solutions, including configurations and alert rules.
  • Develop and maintain interactive and visually appealing Grafana dashboards.
  • Configure SNMP modules and jobs to scrape SNMP metrics efficiently across different network technologies.
  • Use Git proficiently: clone working branches, develop, and commit to the main branch, or follow another workflow that demonstrates a strong command of Git.
  • Identify and onboard new metrics from various systems and applications, developing data pipelines for metrics collection and storage.
  • Optimize and scale monitoring environments to handle large volumes of metrics and ensure comprehensive monitoring coverage.
  • Implement and manage Streaming Telemetry solutions for real-time data collection and monitoring.
  • Integrate and manage OpenTelemetry for comprehensive tracing and observability across services.
  • Troubleshoot and resolve issues related to data collection, monitoring configurations, and dashboard performance.
  • Work with DevOps, development, and operations teams to ensure applications and infrastructure are properly instrumented.
  • Document configurations and procedures, and provide training to team members and stakeholders.

Skills

  • Familiarity with network monitoring tools and practices.
  • Extensive experience with Prometheus and related technologies (Alertmanager, Pushgateway, etc.).
  • Strong knowledge of time-series databases and monitoring concepts.
  • Proficiency in writing Prometheus queries (PromQL).
  • Strong experience with Grafana and its ecosystem.
  • Proficiency in creating and managing Grafana dashboards and panels.
  • Knowledge of data visualization principles and best practices.
  • Familiarity with monitoring and observability tools and practices.
  • Strong knowledge of SNMP protocols and network device management.
  • Experience with SNMP-Exporter and its integration with Prometheus.
  • Strong skills in creating SNMP modules and scrape configs for various network technologies.
  • Strong Git experience.
  • Strong understanding of metrics and monitoring concepts.
  • Experience with metrics collection tools (Prometheus, Telegraf, Collectd, etc.).
  • Experience with Streaming Telemetry solutions for real-time monitoring.
  • Experience with OpenTelemetry for tracing and observability.
  • Familiarity with Linux/Unix systems and scripting languages (Bash, Python).
  • Experience with containerization and orchestration tools (Docker, Kubernetes).

Qualifications

  • Bachelor’s degree in Computer Science, Engineering, or a related field.
  • 5+ years of experience in monitoring and observability roles.
  • Proficiency in tools like Prometheus, Grafana, PromQL, Alertmanager, Alert Framework, GitHub, SNMP-Exporter, Streaming Telemetry, and OpenTelemetry (OTel).
  • Strong coding and scripting skills.
  • Excellent problem-solving abilities and attention to detail.
  • Strong communication and teamwork skills.
