Join Our Data Science Life Sciences Team
Are you interested in working with data and analytics to solve problems?
Are you interested in bringing your GenAI, ML and NLP expertise to projects?
About our Team
Data Science Life Sciences is a diverse team focusing on GenAI, ML, and NLP. We mainly develop best-in-class enrichment pipelines for Elsevier’s life science products such as Reaxys, Embase, and PharmaPendium.
About the Role
As a Senior Data Scientist, you will play a pivotal role in the development and deployment of cutting-edge GenAI models and solutions. You will be responsible for building, testing, and maintaining our GenAI, RAG, and NLP solutions. You will work across the whole life cycle of data science projects: design, implementation, production, and beyond. You will deliver efficient, production-ready Python code. You will collaborate closely with developers to deploy and productionize our data science pipelines, and with subject matter experts in biology and chemistry to validate the output. This role requires a strong foundation in Natural Language Processing (NLP), Machine Learning, Transformer models, and Generative AI, as well as proficiency in Python.
Responsibilities
- Data collection, data analysis, model development, defining quality metrics, quality assessment of models, and regular presentations to stakeholders.
- Creating production-ready Python packages for each component of data science pipelines (such as pre-processing and model inference) and their deployment together with the software engineering team.
- Optimizing and customizing Retrieval Augmented Generation (RAG) pipelines to meet specific project requirements that involve content ingestion, machine translation, and contextualized information retrieval.
- Ingesting, preprocessing, and transforming large-scale multilingual data to ensure high-quality inputs for downstream models.
- Building agentic AI models integrated with RAG pipelines.
- Conducting rigorous testing and evaluation of AI models to ensure high performance and reliability.
- Integrating data science components and performing end-to-end quality assessments.
- Maintaining robustness of data science pipelines against model drift and ensuring consistent output quality.
- Establishing reporting processes for pipeline performance and developing automated re-training strategies for existing pipelines.
- Collaborating with cross-functional teams to integrate AI solutions into existing products and services.
- Leading and managing projects with a team of data scientists and independently executing entire small-scale projects.
- Mentoring junior data scientists and fostering a knowledge-sharing culture within the team.
- Staying up to date with the latest advancements in AI, machine learning, and NLP technologies.
Requirements
- Master’s or Ph.D. in Computer Science, Data Science, Artificial Intelligence, or a related field.
- 5+ years of relevant applied experience in data science, with a focus on Generative AI, NLP, and machine learning.
- Proficiency in Python for data analysis, model development, and deployment.
- Strong experience with transformer models.
- Proficiency in Generative AI technologies, including utilizing LLMs via API access, LLM evaluation tools, and prompt engineering.
- Knowledge of various RAG pipelines and their practical implementation.
- Experience building Agentic RAG systems is a strong requirement.
- Experience with AI agent management frameworks such as LangChain, or similar tools.
- Experience with advanced algorithms in deep learning, neural networks, reinforcement learning, and transfer learning.
- Familiarity with traditional machine learning algorithms such as random forests, SVM, logistic regression, and Bayesian modelling for model building, validation, and testing.
- Familiarity with cloud platforms and services (e.g., AWS, Amazon Bedrock, Azure) for model deployment and the creation of production-ready pipelines.
- Proficiency in data visualization tools and techniques.
- Experience with version control systems (e.g., GitLab or GitHub), Jira, and working in an Agile environment.
- Proficiency in OpenSearch and Databricks.
- Excellent problem-solving and analytical skills, with strong attention to detail.
- Strong communication skills and the ability to work effectively in a team-oriented environment.
Work in a way that works for you
We promote a healthy work/life balance across the organisation and offer our people an appealing working environment. With numerous wellbeing initiatives, shared parental leave, study assistance, and sabbaticals, we will help you meet your immediate responsibilities and your long-term goals.
- Working flexible hours - flexing the times when you work in the day to help you fit everything in and work when you are the most productive
Working for you
We know that your well-being and happiness are key to a long and successful career. These are some of the benefits we are delighted to offer:
- Dutch Share Purchase Plan
- Annual Profit Share Bonus
- Comprehensive Pension Plan
- Home, office or commuting allowance
- Generous vacation entitlement and option for sabbatical leave
- Maternity, Paternity, Adoption and Family Care leave
- Flexible working hours
- Personal Choice budget
- Variety of online training courses and career roadshows
- Wellbeing programs and gym facility in the office
- Internal communities and networks
- Various employee discounts
- Recruitment introduction reward
- Work from anywhere
- Employee Assistance Program (global)
- Annual Event
About Data Science at Elsevier
Elsevier is one of the world’s leading publishers of trusted scientific, technology, and medical content. Building on that foundation of content, we leverage data science and AI to deliver knowledge and analytics products that advance science and improve healthcare outcomes. An efficient, streamlined, and optimized data science development process is therefore key to sustaining innovation and business growth, especially given the opportunities ahead with GenAI. Elsevier is committed to adhering to the RELX Responsible AI Principles in the development and deployment of all our AI solutions.