Senior Data Engineer
Job description
We are looking for a Senior Data Engineer to optimize, secure, and industrialize the data pipelines powering our AI products. Your role is critical: you ensure the quality, availability, and performance of the data used daily by the AI, Product, and DevOps teams.

- Design, develop, and maintain robust, automated, and scalable data pipelines.
- Ensure data quality, security, and reliability throughout the entire data lifecycle.
- Define and maintain infrastructure as code for data-related services.
- Build and maintain dashboards, monitoring tools, and reports for internal teams.
- Work closely with Data Science, DevOps, and Product teams to ensure data consistency and value.
- Monitor and optimize performance using observability tools (Datadog, Grafana, Prometheus).
Requirements
- Master's degree (or equivalent) in computer science, data engineering, or AI.
- 5+ years of experience in Data Engineering, ideally in cloud and AI-driven environments.
- Excellent command of Python and software engineering best practices (testing, versioning, packaging).
- Strong knowledge of SQL and NoSQL databases (PostgreSQL, DynamoDB).
- Solid experience with workflow automation (Airflow, GitHub Actions, GitLab CI/CD).
- Strong understanding of MLOps concepts, data integration into ML workflows, monitoring, and deployment.
- Cloud experience on AWS or GCP (S3, Lambda, RDS, Step Functions).
- Knowledge of Docker and containerized environments.
Soft skills
- Strong technical rigor and constant focus on quality.
- High level of autonomy and ability to own a broad scope.
- Clear, structured communication with a collaborative mindset.
- Ability to work with cross-functional teams.
- Analytical mindset and attention to detail.
What Will Make the Difference
- Proven experience running critical production data pipelines.
- Advanced practice of data observability (logs, metrics, alerting).
- Open-source contributions in the data or ML ecosystem.
- Proactive approach to continuous improvement of data workflows and environments.
- Sensitivity to the environmental or societal impact of technology.
Tech Stack
- Languages: Python
- Databases: PostgreSQL, DynamoDB
- Pipelines: GitHub Actions
- Cloud: AWS (S3, Lambda, RDS, Step Functions), GCP
- Containerization: Docker
- Observability: Datadog, Grafana, Prometheus
- MLOps: MLflow, SageMaker
Benefits
Hybrid work
"Contrat cadre" and RTT (between 8-12 per year depending on the number of public holidays in the current year)
A Mac or PC, depending on your preference