Remote Senior Data Engineer - Platform Engineering
Job description
As a Senior Data Engineer in our new Central Data Engineering team in the Platform Mission, you will lead work that is mission-wide in scope. The team will inform data quality and access standards, create templatised solutions, assist in managing mission-level datasets and dashboards, and prepare our mission's data for the AI future that is expected to impact every squad and product area. As the senior engineer on the team, you will bring structure to ambiguity and act as the bridge between business goals and technical deliverables.
- Build large-scale data pipelines using tools like Google Cloud Platform and Apache Beam.
- Work on projects powering new generative AI experiences and helping to build models.
- Learn and contribute to the team's best practices and techniques for building data pipelines for large-scale models, including cleaning, filtering, classifying, and labeling.
- Collaborate with other engineers, researchers, product managers, and stakeholders, taking on learning and leadership opportunities that arise.
- Deliver scalable, testable, maintainable, and high-quality code.
- Share knowledge, promote standard methodologies, and improve the team through mentorship and constructive accountability.
Requirements
- You have data engineering experience and know how to work with high-volume, heterogeneous data, preferably with distributed systems and platforms such as Hadoop, Bigtable, Cassandra, GCP, or AWS.
- You have experience building clean, high-quality datasets for training large-scale models.
- You have experience with one or more higher-level Python or Java-based data processing frameworks such as Beam, Dataflow, Crunch, Scalding, Storm, or Spark.
- You have strong Python programming abilities. You might have worked with Docker as well as Luigi, Airflow, or similar tools.
- You care about quality and know what it means to ship high-quality code.
- You have experience managing data retention policies.
- You care about agile software processes, data-driven development, reliability, and responsible experimentation.
- You understand the value of collaboration and partnership within teams.