Sr. Data Architect - Aviation
Job description
We are seeking a Senior Data Architect to lead the design and evolution of enterprise-level data ecosystems. You will be responsible for architecting scalable, secure, and high-performance data infrastructures that support mission-critical aviation sustainment. This is a "player-coach" role that requires high-level strategic planning alongside hands-on engineering execution.
Architecture & Design: Design conceptual, logical, and physical data models for complex federal environments. Lead the transition from legacy on-premises systems to modern, cloud-native (AWS/GCP) data platforms.
Pipeline Development: Architect and oversee the build of automated ETL/ELT pipelines using Python, SQL, and PySpark to ingest and transform unstructured and structured data.
Cloud Data Warehousing: Implement and optimize enterprise data warehouses using tools like AWS Redshift, Google BigQuery, AWS Glue, and Databricks.
Governance & Compliance: Establish data governance frameworks, metadata management, and data lineage in alignment with federal standards (HIPAA, FHIR, NIST).
Performance Optimization: Design indexes and partitions, tune queries, and implement sharding strategies to ensure high availability and scalability for real-time analytics.
AI/ML Support: Design data architectures that facilitate AI/ML initiatives, including model training pipelines and real-time inference in production environments.
Leadership: Mentor a team of data engineers, enforce software engineering best practices (CI/CD, unit testing, documentation), and serve as a technical bridge between stakeholders and delivery teams.
Requirements
- Must be a U.S. citizen.
- Master's degree or above in Systems Engineering, Computer Science, or a related field.
- An active security clearance or the ability to obtain one is required.
- A minimum of 6 years of experience, to include:
- Experience in data management using Python and advanced analytics tools and platforms.
- Experience with Data Warehousing consulting/engineering or related technologies (Redshift, Databricks, BigQuery, OADW, Apache Hive, Apache Lucene).
- Experience in scripting, tooling, and automating large-scale computing environments.
- Extensive experience with major tools such as Python, Pandas, PySpark, NumPy, SciPy, SQL, and Git; minor experience with TensorFlow, PyTorch, and Scikit-learn.
- Compliance: Deep understanding of data security and federal compliance requirements.
- Data Architecture and Design
  - Skills:
    - Data modeling (conceptual, logical, and physical)
    - Database schema design
    - Understanding of different database paradigms (relational, NoSQL, graph databases, etc.)
    - ETL (Extract, Transform, Load) processes and tools
    - Experience with modern data warehousing solutions (e.g., Redshift, Snowflake, BigQuery)
    - Understanding of dimensional modeling (star/snowflake schemas) and data vault techniques
    - Experience designing for both OLTP and OLAP workloads
    - Familiarity with metadata-driven design and schema evolution in data systems
    - Experience defining data SLAs and lifecycle management policies
  - Project Experience: Designing and implementing scalable data architectures that support business intelligence, analytics, and machine learning workflows.
- Data Pipeline Development
  - Skills:
    - Proficiency in tools like Apache Kafka, Airflow, Spark, Flink, or NiFi
    - Experience with cloud-based data services (AWS Glue, Google Cloud Dataflow, Azure Data Factory)
    - Real-time and batch data processing
    - Automation and monitoring of data pipelines
    - Strong understanding of incremental processing, idempotency, and backfill strategies
    - Knowledge of workflow dependency management, retries, and alerting
    - Experience writing modular, testable, and reusable Python-based ETL code
  - Project Experience: Leading the development of highly available, fault-tolerant, and scalable data pipelines, integrating multiple data sources, and ensuring data quality.
- Cloud Platforms and Services
  - Skills:
    - Expertise in cloud environments (AWS, GCP, Azure)
    - Understanding of cloud-based storage (S3, Blob Storage), databases (RDS, DynamoDB), and compute resources
    - Implementing cloud-native data solutions (Data Lake, Data Warehouse, Data Mesh)
    - Experience with cost monitoring and optimization for data workloads
    - Familiarity with hybrid and multi-cloud architectures
    - Understanding of serverless data patterns (e.g., Lambda + S3 + Athena, Cloud Functions + BigQuery)
  - Project Experience: Migrating legacy data infrastructure to the cloud or developing new data platforms using cloud services, with a focus on cost efficiency and scalability.
- Big Data Technologies
  - Skills:
    - Experience with big data ecosystems (Hadoop, HDFS, Hive, Spark)
    - Distributed computing, parallel processing, and handling petabyte-scale data
    - Tools for querying large datasets (Presto, Athena)
    - Understanding of lakehouse frameworks (Delta Lake, Iceberg, Hudi)
    - Familiarity with data compaction, schema evolution, and ACID guarantees in distributed storage
  - Project Experience: Building and managing big data platforms to enable large-scale analytics, often incorporating structured and unstructured data.
- Database Administration and Optimization
  - Skills:
    - Expertise in database technologies (SQL, NoSQL, GraphDBs)
    - Query optimization, indexing, and partitioning strategies
    - Backup, replication, and disaster recovery planning
    - Understanding of query execution plans, cost-based optimization, and caching strategies
    - Experience performing index and partition design based on query patterns
    - Familiarity with data versioning and temporal tables
    - Experience profiling and optimizing application code interacting with databases
  - Project Experience: Performance tuning for complex queries, implementing database replication and sharding strategies to support high availability and scalability.
- Data Governance and Security
  - Skills:
    - Data privacy, encryption, and compliance with regulations (GDPR, CCPA)
    - Implementing data governance frameworks (data lineage, cataloging, metadata management)
    - Role-based access control and user management for sensitive data
    - Experience with automated policy enforcement and data lineage visualization tools (e.g., DataHub, Collibra, Alation)
    - Knowledge of data quality frameworks integrated into CI/CD pipelines
    - Familiarity with data contract testing between producer and consumer teams
  - Project Experience: Developing and implementing data governance policies and security controls across the organization's data assets, ensuring compliance with industry standards.
- Programming and Scripting Languages
  - Skills:
    - Proficiency in Python and SQL
    - Experience with version control (Git) and CI/CD for data engineering (GitLab, Jenkins, CircleCI)
    - API design and integration (Postman)
    - Strong understanding of object-oriented programming (OOP) principles and design patterns in Python
    - Familiarity with software engineering best practices (modularity, testing, documentation, linting)
    - Understanding of algorithmic complexity (Big O notation) and ability to optimize code for scale
    - Experience with parallel and distributed computation frameworks (Spark, Dask, Ray)
    - Ability to profile and debug performance bottlenecks in data workflows
    - Use of type hinting, logging frameworks, and automated testing frameworks (pytest, unittest)
- AI/ML Pipeline Support and Analytics
  - Skills:
    - Experience in supporting data scientists with feature engineering, data wrangling, and model deployment
    - Knowledge of ML orchestration tools (MLflow, Kubeflow)
    - Hands-on experience with analytics tools (e.g., Tableau, Power BI)
    - Familiarity with feature store design and model feature lineage tracking
    - Understanding of data versioning and reproducibility for ML workflows
    - Experience supporting real-time model inference pipelines
  - Project Experience: Designing architectures that support AI/ML initiatives, enabling scalable data pipelines for training models, and supporting experimentation in the production environment.
- Leadership and Mentorship
  - Skills:
    - Leading data engineering teams, cross-functional collaboration with data scientists, analysts, and business units
    - Project management (Agile, Scrum, Kanban) and stakeholder communication
    - Experience with mentorship and growing junior data engineers
    - Experience establishing data architecture standards and best practices
    - Ability to review and approve technical designs for consistency and scalability
    - Proven success in mentoring engineers in code quality, modeling, and system design
  - Project Experience: Leading the technical direction for large-scale data initiatives, such as enterprise data lake implementations or the creation of a unified data platform.
Benefits & conditions
- Health insurance
- Dental insurance
- Vision insurance
- Life insurance
- 401(k) retirement plan with matching
- Paid time off
- Paid federal holidays