GRAI
Python Engineer
GRAI | Poland | Posted 2 days ago
Full-time | Engineering, Information Technology
We are looking for a mid-level Python Engineer to strengthen our team, with a primary focus on developing and maintaining robust data scraping solutions. You will work on extracting complex datasets from a wide variety of online sources, ensuring scalability, reliability, and high-quality data output. The role requires strong practical experience and a proactive attitude toward solving non-trivial scraping challenges.

You will:

  • Design and implement efficient, scalable web scraping pipelines using Python.
  • Analyze and reverse-engineer the structure of various online resources to extract structured and semi-structured data.
  • Develop and maintain crawlers and parsers for diverse content types.
  • Ensure the reliability and stability of scraping solutions, including handling anti-bot protections and working with proxies and headless browsers.
  • Collaborate with data engineers and product teams to deliver clean, normalized, and production-ready datasets.
  • Maintain best practices around legal and ethical scraping (robots.txt, rate limiting, terms of service compliance).

We'd like to see:

  • 3+ years of Python development experience, including solid experience with web scraping libraries and frameworks (requests, BeautifulSoup, Scrapy, Selenium, or alternatives).
  • Strong understanding of web technologies: HTML, CSS, JavaScript, and browser DOM behavior.
  • Practical knowledge of parsing complex web pages and dynamic content.
  • Experience working with APIs and designing scraping logic for different data formats (JSON, XML, etc.).
  • Familiarity with common scraping challenges (e.g., CAPTCHAs, user-agent spoofing, proxy rotation) and their solutions.
  • Experience storing and processing scraped data efficiently (SQL/NoSQL databases).
  • Good communication skills and ability to work autonomously on mid-sized projects.

⭐ Nice to Have:

  • Experience deploying scraping workloads to the cloud (AWS, GCP, Azure).
  • Knowledge of distributed scraping architectures.
  • Familiarity with Docker and orchestration tools like Airflow.
  • Understanding of how to scrape and handle large-scale media or binary data.
