Detailed description of work tasks to be carried out
* Working with Spark & Python to define and maintain data ingestion and transformation
* Building distributed, highly parallelized big data processing pipelines that process massive amounts of data (both structured and unstructured) in near real time
* Leveraging Spark to enrich and transform corporate data to enable searching, data visualization, and advanced analytics
* Working closely with analysts and business stakeholders to develop analytics models
* Collaborating with Data Scientists and Machine Learning experts
* Providing detailed information about progress during PI Demo sessions
Description of knowledge and experience
MUST have:
* Experience in Spark & Python - minimum 4 years
* Strong Spark SQL or Hive SQL skills - minimum 4 years
* Experience with the Hadoop/Hive ecosystem and/or other Big Data technologies - minimum 3 years
* Previous experience in creating data flows (ETLs, ELTs, etc.)
* Experience with Bitbucket and Git, code versioning, and branching strategies
* Familiarity with Agile/SAFe frameworks
Big PLUS to have:
* Cloud experience (AWS, GCP, Azure)
* Basic knowledge of machine learning models (how to build, validate, and maintain regression models)
NICE to have:
* Scala/MLOps experience
- Posted: Dec 26, 2024
- Type: Contract
- Level: Mid-Senior
- Location: Poland
- Company: Infosys