Sr. Developer

Life Technologies
Sudbury, United States of America
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$186K

Job location

Remote
Sudbury, United States of America

Tech stack

Java
Artificial Intelligence
Airflow
Amazon Web Services (AWS)
Artificial Neural Networks
User Authentication
Bash
Cloud Database
Cloud Storage
Computer Programming
Computer Engineering
Information Engineering
Data Governance
Data Integration
Relational Databases
Query Languages
Linux
Distributed Data Store
GitHub
Graph Database
Identity and Access Management
Python
Network Security
PostgreSQL
Linear Regression
Logistic Regression
Machine Learning
Microsoft SQL Server
MySQL
Role-Based Access Control
Power BI
Azure
Salesforce
SAP Applications
Shell Script
Software Engineering
SQL Databases
Systems Integration
Supervised Learning
Data Processing
Prophet
Large Language Models
Deep Learning
Generative AI
Matplotlib
Build Management
Data Lake
PySpark
Kubernetes
Information Technology
Plotly
Text Analysis
REST
Terraform
Document Classification
Data Pipelines
K-Means
Recurrent Neural Networks
Unsupervised Learning
Jenkins
Redshift
Databricks

Job description

DUTIES:

  • Design and implement scalable, distributed data pipelines using AWS services such as S3, Redshift, Glue, Lambda, EMR, Athena, and Kinesis, with transformation logic developed in PySpark, Python, and SQL.
  • Architect and lead the implementation of a comprehensive security framework for the Databricks platform, including identity and access management (IAM), data governance, network security, encryption, and audit controls.
  • Define and enforce enterprise-grade security standards across Databricks workspaces, Unity Catalog, and associated data pipelines, ensuring alignment with organizational policies and industry best practices.
  • Implement and manage user access provisioning and authentication through Microsoft Entra ID (formerly Azure AD), including SCIM-based group provisioning, SSO integration, RBAC policies, and conditional access for Databricks.
  • Apply deep domain expertise in molecule-level data to uncover strategic insights and identify business opportunities across the drug development lifecycle. This includes interpreting and linking molecular entities, manufacturers, regulatory events, and clinical stage indicators to support asset evaluation and portfolio optimization.
  • Lead the design, development, and deployment of generative AI use cases, taking them from ideation through production implementation, ensuring long-term scalability and maintainability.
  • Develop and fine-tune large language model (LLM) applications, including prompt engineering strategies, reusable prompt templates, and context augmentation techniques for improved response accuracy and relevance.
  • Integrate generative AI systems with enterprise data ecosystems using REST APIs, vector databases, knowledge graphs, orchestration frameworks, and other scalable backend components.
  • Establish robust LLM evaluation and monitoring frameworks, defining key metrics for measuring model accuracy, relevance, safety, and overall production performance.
  • Collaborate cross-functionally with engineering, data science, product, and business stakeholders to prioritize and deliver impactful, responsible AI solutions aligned with business goals.
  • Conduct architecture reviews and optimize end-to-end data pipelines and GenAI workflows for cost efficiency, runtime performance, and scalability in multi-cloud environments.
  • Implement CI/CD pipelines using GitHub and GitHub Actions, enabling modular, version-controlled deployment of infrastructure, data products, and AI applications.
  • Develop intelligent knowledge workflows using Flowise, retrieval-augmented generation (RAG), function calling, SQL orchestration, and webhook integrations to support dynamic use cases.
  • Design and build process mining dashboards using Celonis EMS, including KPI definitions, root cause analysis using Process Query Language (PQL), and operational insights.
  • Automate enterprise workflows through Celonis Action Flows, integrating seamlessly with systems like SAP, Salesforce, and other business platforms to enable process optimization.
  • Model enterprise process data within the Celonis Data Model (CDM) and configure scalable data pipelines using Celonis Data Integration for high-performance analytics.

Can work remotely or telecommute.

Requirements

REQUIREMENTS:

MINIMUM Education Requirement: Bachelor's degree in Computer Science, Computer Engineering, or related field of study.

MINIMUM Experience Requirement: 7 years of Software Engineering, Data Engineering, or related experience.

Alternative Education and Experience Requirement: Master's degree in Computer Science, Computer Engineering, or related field of study plus 5 years of Software Engineering, Data Engineering, or related experience.

Required knowledge or experience with:

  • Proficiency in programming languages: Python, R, Scala, and Java.
  • Design and implement scalable, distributed data pipelines using AWS services such as S3, Redshift, Glue, Lambda, EMR, Athena, and Kinesis, with transformation logic developed in PySpark, Python, and SQL.
  • Proficient in Linux/Unix environments with experience in shell scripting (Bash) for automation and system operations.
  • Experienced in working with relational databases such as MySQL, PostgreSQL, and SQL Server, as well as cloud data warehouses like Amazon Redshift.
  • Architect and lead the implementation of a comprehensive security framework for the Databricks platform, including identity and access management (IAM), data governance, network security, encryption, and audit controls.
  • Define and enforce enterprise-grade security standards across Databricks workspaces, Unity Catalog, and associated data pipelines, ensuring alignment with organizational policies and industry best practices.
  • Implement and manage user access provisioning and authentication through Microsoft Entra ID (formerly Azure AD), including SCIM-based group provisioning, SSO integration, RBAC policies, and conditional access for Databricks.
  • Apply deep domain expertise in molecule-level data to uncover strategic insights and identify business opportunities across the drug development lifecycle, including linking molecular entities, manufacturers, regulatory events, and clinical stage indicators to support asset evaluation and portfolio optimization.
  • Experience in managing Databricks Unity Catalog using Terraform, including configuration of external locations, catalogs, schemas, and access controls.
  • Proficient in automating data governance and access management through Terraform modules to provision Unity Catalog resources and integrate securely with cloud storage.
  • Implement automated CI/CD pipelines with GitHub, GitHub Actions, Jenkins, and Airflow, enabling modular, version-controlled deployment of infrastructure.
  • Develop and deploy machine learning models using supervised learning (linear regression, logistic regression, decision trees, random forests), unsupervised learning (k-means clustering, PCA), and deep learning (neural networks, CNNs, RNNs) to generate actionable insights and improve metrics.
  • Apply time series forecasting models such as ARIMA, Prophet, and LSTM for predictive analytics on temporal datasets.
  • Apply NLP and text analytics techniques, including text preprocessing, TF-IDF, Word2Vec embeddings, and transformer-based models (BERT) for text classification and entity recognition.
  • Create interactive visualizations using Power BI on top of Databricks Delta tables for real-time analytics and develop in-depth exploratory visualizations using Matplotlib, Seaborn, and Plotly.
  • Develop and maintain interactive dashboards and visualizations in Amazon QuickSight, leveraging data processed and stored in Delta Lake.

Benefits & conditions

Salary: $178,131 to $186,000 per year

Compensation and Benefits

The estimated salary range for this Sr. Developer position, based in Massachusetts, is $178,131.00 to $186,000.00.

This position may also be eligible to receive a variable annual bonus based on company, team, and/or individual performance results in accordance with company policy. We offer a comprehensive Total Rewards package that our U.S. colleagues and their families can count on, which includes:

  • A choice of national medical and dental plans, and a national vision plan, including health incentive programs
  • Employee assistance and family support programs, including commuter benefits and tuition reimbursement
  • At least 120 hours paid time off (PTO), 10 paid holidays annually, paid parental leave (3 weeks for bonding and 8 weeks for caregiver leave), accident and life insurance, and short- and long-term disability in accordance with company policy
  • Retirement and savings programs, such as our competitive 401(k) U.S. retirement savings plan
  • Employees' Stock Purchase Plan (ESPP) offers eligible colleagues the opportunity to purchase company stock at a discount
