KINESSO Poland
Junior Data Scientist
KINESSO Poland, Poland, 3 days ago
Full-time, Project Management

At KINESSO, we offer a unique perspective on the marketing landscape. We're building the future of performance marketing, fueled by a dynamic data ecosystem. This includes the breadth and depth of consumer data – encompassing demographics, lifestyle, purchase behavior, and more – combined with first-party client data and the rich, real-time signals from social media platforms and DSPs. This data fusion enables the development and deployment of sophisticated AI and machine learning models, unlocking predictive capabilities and driving highly targeted, effective campaigns. The scale and complexity of this data present a compelling challenge for those seeking to push the boundaries of what's possible in performance marketing. The company has more than 6,000 employees operating in more than 60 countries. Learn more at www.KINESSO.com.



Our Audience Console product empowers marketing agencies to target chosen audiences using high-quality, integrated data from multiple sources. We're committed to seamless data integration and analytics, delivering actionable insights across global markets. The position sits within the Global Data Integration and Fusion Data Science team.



We are seeking a motivated Junior Data Scientist to join our dynamic team.

You will be responsible for running and tuning data fusion processes, executing data pipelines, performing data exploration, and assessing datasets. This role offers an excellent opportunity to develop your skills in a global environment and contribute to our innovative audience solutions.


What is Data Fusion?

Data fusion is an approach that matches respondent-level data across different datasets. The process links individuals from two or more datasets, effectively combining their content. The matching is performed using common characteristics found in the datasets, typically including demographic information, geographical data, and other relevant attributes. This technique allows us to create more comprehensive and insightful datasets, enabling deeper analysis and more accurate predictions.
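
To make the idea concrete, here is a minimal sketch of respondent-level matching on common characteristics. It is an illustration only, not a description of the team's actual methodology: the field names (age_band, region, gender), the sample records, and the simple "count of shared attributes" score are all assumptions made for this example.

```python
# Hypothetical common variables present in both datasets (assumed for illustration).
COMMON_KEYS = ("age_band", "region", "gender")


def match_score(a: dict, b: dict) -> int:
    """Count how many common characteristics two respondents share."""
    return sum(1 for key in COMMON_KEYS if a.get(key) == b.get(key))


def fuse(recipients: list[dict], donors: list[dict]) -> list[dict]:
    """Link each recipient to its best-matching donor and combine their fields.

    Ties are resolved in favor of the donor that appears first in the list.
    """
    fused = []
    for recipient in recipients:
        best = max(donors, key=lambda donor: match_score(recipient, donor))
        # Recipient values win on overlapping keys; donor-only fields are added.
        fused.append({**best, **recipient, "match_score": match_score(recipient, best)})
    return fused


# Made-up sample data: a survey and a panel that share only the common variables.
survey = [
    {"id": "s1", "age_band": "25-34", "region": "north", "gender": "f", "owns_car": True},
    {"id": "s2", "age_band": "45-54", "region": "south", "gender": "m", "owns_car": False},
]
panel = [
    {"panel_id": "p1", "age_band": "45-54", "region": "south", "gender": "m", "weekly_tv_hours": 12},
    {"panel_id": "p2", "age_band": "25-34", "region": "north", "gender": "f", "weekly_tv_hours": 4},
]

fused = fuse(survey, panel)
for row in fused:
    print(row["id"], row["panel_id"], row["weekly_tv_hours"], row["match_score"])
```

Each fused record carries the survey fields plus the panel-only field weekly_tv_hours. Real fusion processes typically use weighted distances and constraints on how often a donor may be reused, which is the kind of tuning this role involves.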


Responsibilities:


Data Fusion:

  • Execute and tune data fusion processes.
  • Analyze fusion methodologies and provide recommendations.
  • Perform data quality assurance on fusion outputs.

Data Pipelines:

  • Run and maintain ETL data pipelines.
  • Perform data preprocessing, cleaning, and transformation to ensure data is in a unified and consistent format suitable for analysis and modeling. This includes handling missing values, correcting inconsistencies, and standardizing data types.
  • Identify and resolve data quality issues.
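
As a rough illustration of the preprocessing tasks listed above, the sketch below cleans a few made-up records. The field names (country, age), the "UNKNOWN" placeholder, and the rule of mapping unparseable ages to None are assumptions chosen for the example, not a description of the actual pipelines.

```python
def clean_record(raw: dict) -> dict:
    """Return a copy of a raw record in a unified, consistent format."""
    # Correct inconsistencies: trim whitespace and normalize casing of the country code.
    country = (raw.get("country") or "").strip().upper()
    # Handle missing values: fall back to an explicit placeholder.
    if not country:
        country = "UNKNOWN"
    # Standardize data types: age may arrive as "34", 34, "" or None.
    try:
        age = int(str(raw.get("age")).strip())
    except ValueError:
        age = None
    return {"country": country, "age": age}


# Made-up input rows with typical quality problems.
rows = [
    {"country": " pl ", "age": "34"},
    {"country": None, "age": ""},
    {"country": "De", "age": 51},
]

cleaned = [clean_record(row) for row in rows]
for row in cleaned:
    print(row)
```

Keeping the cleaning rules in one small function makes them easy to test and to reuse across pipeline steps.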

Data Exploration and Assessment:

  • Conduct exploratory data analysis on diverse datasets.
  • Assess data quality and identify potential issues.
  • Perform data wrangling to conform to expected schemas.

Taxonomy Analysis:

  • Perform taxonomy analysis, which involves the classification and organization of data into a structured system. This includes evaluating the quality and business value of existing taxonomies, as well as developing new taxonomies to improve data discoverability and usability.

Documentation and Support:

  • Create and update code documentation and user manuals.


Qualifications:


  • A degree with a focus on statistics, mathematics, computer science, data science, economics, or a comparable qualification.
  • Knowledge of Python and SQL.
  • Experience in data exploration and visualization using Jupyter Notebooks.
  • Basic knowledge of machine learning algorithms.
  • Very good written and spoken English skills, as most of our communication is international.
  • Familiarity with cloud computing environments and tools (experience with Snowflake is a plus).
  • Hands-on mentality, strong communication skills, and a willingness to learn.
  • Ability to work independently and in a self-organized manner, with high quality standards for your own work.

Nice to Have:

  • Experience with version control systems, particularly Git.
  • Knowledge of Large Language Models (LLMs) and their applications in data processing.
