Machine Learning Engineer
phData, Inc.
4 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Job location: Remote
Tech stack
Java
API
Airflow
Amazon Web Services (AWS)
Azure
Cloud Computing
Computer Programming
Data Integration
ETL
Data Transformation
Data Systems
Relational Databases
Software Debugging
Distributed Data Store
Elasticsearch
Hadoop
Hadoop Distributed File System
Python
NoSQL
Cloud Services
Cloudera
Software Engineering
Solr
SQL Databases
Data Streaming
Google Cloud Platform
Delivery Pipeline
Snowflake
Spark
Core Data
Information Technology
Luigi
Cassandra
Kafka
Apache Nifi
Spark Streaming
Data Pipelines
Databricks
Requirements
- 8+ years as a hands-on Solutions Architect and/or Data Engineer designing and implementing data solutions
- Experience as a team lead and/or mentoring other engineers
- Ability to take end-to-end technical solutions into production and to help ensure performance, security, scalability, and robust data integration
- Programming expertise in Java, Python and/or Scala
- Core cloud data platforms including Snowflake, AWS, Azure, Databricks and GCP
- SQL and the ability to write, debug, and optimize SQL queries
- Client-facing written and verbal communication skills and experience
- Create and deliver detailed presentations
- Detailed solution documentation (e.g. POCs and roadmaps, sequence diagrams, class hierarchies, logical system views, etc.)
- 4-year Bachelor's degree in Computer Science or a related field
Prefer any of the following:
- Production experience in core data platforms: Snowflake, AWS, Azure, GCP, Hadoop, Databricks
- Cloud and Distributed Data Storage: S3, ADLS, HDFS, GCS, Kudu, Elasticsearch/Solr, Cassandra or other NoSQL storage systems
- Data integration technologies: Spark, Kafka, event/streaming, StreamSets, Matillion, Fivetran, NiFi, AWS Database Migration Service, Azure Data Factory, Informatica Intelligent Cloud Services (IICS), Google Dataproc or other data integration technologies
- Multiple data sources (e.g. queues, relational databases, files, search, API)
- Complete software development lifecycle experience including design, documentation, implementation, testing, and deployment
- Automated data transformation and data curation: dbt, Spark, Spark streaming, automated pipelines
- Workflow Management and Orchestration: Airflow, AWS Managed Airflow, Luigi, NiFi
Benefits & conditions
- Remote-First Work Environment
- Casual, award-winning small-business work environment
- Collaborative, high-performance culture that prizes autonomy, creativity, and transparency
- Competitive compensation, excellent benefits, generous PTO plus 10 holidays (and other cool perks)
- Accelerated learning and professional development through advanced training and certifications
About the company
Join phData, a dynamic and innovative leader in the modern data stack. We partner with major cloud data platforms like Snowflake, AWS, Azure, GCP, Fivetran, Pinecone, Glean, and dbt to deliver cutting-edge services and solutions. We're committed to helping global enterprises overcome their toughest data challenges.
phData is a remote-first global company with employees based in the United States, Latin America, and India. We celebrate the culture of each of our team members and foster a community of technological curiosity, ownership, and trust. Even though we're growing extremely fast, we maintain a casual, exciting work environment. We hire top performers and allow you the autonomy to deliver results.
* 6x Snowflake Partner of the Year (2020, 2021, 2022, 2023, 2024, 2025)
* Fivetran, dbt, Alation, and AWS Partner of the Year
* #1 Partner in Snowflake Advanced Certifications
* 600+ Expert Cloud Certifications (Sigma, AWS, Azure, Dataiku, etc.)
Recognized as an award-winning workplace in the US, India, and LATAM