Data Scientist
Responsibilities
* Design, develop, and maintain scalable data pipelines using PySpark to efficiently process large volumes of data.
* Develop, test, and optimize Machine Learning models for predictive analytics, including regression, classification, and clustering techniques.
* Manage and deploy Machine Learning environments and workflows using AWS SageMaker, while leveraging Athena for data querying and analysis.
* Integrate Machine Learning models into production pipelines, ensuring proper monitoring, maintenance, and continuous improvement through MLOps best practices.
* Collaborate closely with cross-functional teams to understand business challenges and translate them into data-driven solutions.
* Conduct advanced data analysis using Python and SQL to deliver actionable insights.
* Work collaboratively with multidisciplinary teams using Git/GitHub and agile development methodologies.

Qualifications
* 2-3 years of experience in Data Engineering, Machine Learning, or similar roles.
* Advanced proficiency in Python, including libraries such as Pandas, NumPy, and Scikit-learn.
* Strong SQL skills for data querying, transformation, and analysis.
* Hands-on experience with PySpark for large-scale data processing.
* Solid understanding of Machine Learning algorithms, including regression, classification, clustering, and hyperparameter optimization techniques.
* Experience working with AWS services, especially SageMaker and Athena.
* Knowledge of MLOps practices and experience integrating and monitoring ML models in production environments.
* Familiarity with Git and collaborative development workflows using GitHub.
* Strong analytical thinking and problem-solving skills.
* Curiosity, autonomy, attention to detail, and a collaborative mindset.

Benefits
* Flexible hours
* Special timetable: Fridays and summer 7h
* Individual budget for attending forums and training
* English classes
* Health insurance
* Every three years, 6 extra days off
* Day off