Data Engineer

SGA Inc.
Tysons, United States of America
19 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English

Job location

Tysons, United States of America

Tech stack

API
Artificial Intelligence
Amazon Web Services (AWS)
Business Analytics Applications
Confluence
Build Automation
Automation of Tests
Big Data
Cloud Computing
Configuration Management
Code Generation
Information Systems
Databases
Continuous Integration
Information Engineering
ETL
Database Queries
Software Debugging
Memory Management
Fault Tolerance
GitHub
Monitoring of Systems
Hive
Python
Octopus Deploy
Object-Oriented Software Development
Pair Programming
Performance Tuning
Cloud Services
Prometheus
Standard SQL
Simple Data Format
Software Engineering
SQL Databases
Data Logging
Freeform SQL
Google Cloud Platform
GitHub Copilot
Large Language Models
Grafana
Multi-Agent Systems
Concurrency
Prompt Engineering
Spark
CloudFormation
Containerization
Data Lake
AI Platforms
PySpark
GitLab CI
Kubernetes
Information Technology
Code Testing
Wikis
Presto
Virtual Agents
Functional Programming
Dataiku
CloudWatch
Terraform
Code Restructuring
GPT
Data Pipelines
Docker
ELK
Jenkins

Job description

Software Guidance & Assistance, Inc. (SGA) is searching for a Data Engineer for a CONTRACT assignment with one of our premier regulatory clients in Rockville, MD or Tysons, VA.

The Data Engineer works with moderate supervision across two equally weighted domains: (1) large-scale data pipeline development processing market events in a cloud environment, and (2) design and development of agentic AI systems including LLM-powered regulatory data assistants, MCP servers, and agent harness architectures. This position contributes to overall product quality throughout the software development lifecycle.

  • Build and maintain ETL/ELT pipelines using Apache Spark, Hive, and Trino across S3-based data lake environments
  • Develop and optimize SQL for large-scale surveillance datasets including window functions, multi-table joins, and complex aggregations
  • Build and engineer big data systems (EMR-on-EC2, EMR-on-EKS) and develop solutions on analytical platforms (SageMaker, Domino, Dataiku)
  • Participate in data quality monitoring, anomaly detection, and production incident investigation
  • Develop AI agent systems using AWS Bedrock and agent frameworks (Strands Agents SDK, LangChain/LangGraph, or equivalent)
  • Build agent harness architectures combining LLM reasoning with deterministic execution, such as skill/RAG-based SQL generation and structured output validation
  • Implement agent memory, context management, and tool integration (MCP servers, API connectors, data catalog lookups) across the data lake
  • Build evaluation frameworks for agent accuracy - paraphrase robustness, routing precision, and structural consistency
  • Stay informed of advances in LLM frameworks (LangGraph, Google ADK, AWS Strands) and emerging AI capabilities
  • Write clean, well-tested code; contribute to CI/CD Jenkins pipelines and infrastructure-as-code on AWS
  • Ensure secure handling of RCI and sensitive regulatory data across both data pipelines and agent outputs, with auditable execution traces
  • Adhere to FINRA and team standards for secure development practices and technology policies
  • Partner across teams, communicate technical information at the appropriate level, and maintain documentation on Confluence/Wiki
  • Actively learn from senior team members; contribute to process improvement in line with FINRA's values of collaboration, expertise, innovation, and responsibility

Requirements

Data Engineering & Big Data Technologies

  • Experience building data pipelines using Apache Spark (PySpark preferred) and SQL
  • Experience with SQL query engines (Hive, Trino/Presto, or similar) and cloud data platforms (AWS S3, EMR, Lambda)
  • Understanding of common issues such as data skew and how to mitigate it, working with large data volumes, and troubleshooting job failures caused by resource limitations, bad data, and scalability challenges
  • Real-world experience with debugging and mitigation strategies

Generative AI & Agentic Systems

  • Practical experience building LLM-powered agent systems that use tools and produce structured outputs (not just chatbot interfaces)
  • Hands-on experience with at least one agent framework: LangChain, LangGraph, AWS Strands, or equivalent
  • Working knowledge of prompt engineering, RAG architectures, and context/memory management
  • Experience with foundation model APIs (Anthropic Claude, Amazon Nova, OpenAI, or similar)
  • Memory Architecture: Understanding of agent memory tiers - working memory, episodic memory, semantic memory - and strategies for context persistence, pruning, and retrieval across sessions
  • Agent Harness Design: Familiarity with harness patterns that wrap LLM reasoning with deterministic guardrails, tool routing, verification loops, and graceful degradation

AI Tool Proficiency

  • Hands-on experience with AI development tools (GitHub Copilot, Q Developer, ChatGPT, Claude, etc.)
  • Experience with spec-driven development - using structured specifications to guide AI code generation, review, and validation
  • Ability to leverage AI pair programming for code suggestions, debugging, refactoring, and automated test generation

Cloud Technologies

  • Experience with AWS services like S3, EMR, EMR on EKS, Lambda, Bedrock, Step Functions, etc.
  • Hands-on experience using S3 with Spark (e.g., dealing with file formats, consistency issues)
  • Familiarity with AWS Bedrock for foundation model invocation, knowledge bases, guardrails, and agent orchestration
  • Exposure to Google Cloud Vertex AI (model garden, grounding, agent builder) or equivalent managed AI platforms
  • Familiarity with AWS monitoring and logging tools (CloudWatch, CloudTrail) for production workloads

Programming - Python

  • Proficiency in Python for data engineering and automation
  • Ability to write clean, modular, and performant code
  • Experience with functional programming concepts (e.g., immutability, higher-order functions)
  • Strong understanding of collections, concurrency, and memory management

SQL Skills (Window Functions, Joins, Complex Queries)

  • Proficiency with SQL window functions, multi-table joins, and aggregations
  • Ability to write and optimize complex SQL queries
  • Experience handling edge cases like NULLs, duplicates, and ordering

Good to Have

  • AWS Bedrock AgentCore (memory, identity, tool gateway)
  • Model Context Protocol (MCP) server development and integration
  • Agent evaluation harnesses and agentic patterns (draft-verification, compile-style generation)
  • Fine-tuning foundation models for domain-specific tasks (LoRA, PEFT, or managed fine-tuning via Bedrock/Vertex AI)
  • Local model execution with Ollama, vLLM, or similar for development and experimentation
  • Vector databases (FAISS, Pinecone, OpenSearch)
  • Docker, Kubernetes, and Amazon EKS for containerized workloads
  • Infrastructure as Code (Terraform, CloudFormation)
  • Experience with CI/CD pipelines (Jenkins, GitLab CI, GitHub Actions, ArgoCD)
  • Experience with monitoring and observability tools (Prometheus, Grafana, ELK stack)
  • AWS certifications (AI Practitioner, Solutions Architect) or Kubernetes certifications such as CKA/CKAD

  • Bachelor's degree in Computer Science, Data Science, Information Systems, or related discipline with at least two (2) years of related experience; or equivalent training and/or work experience; past Financial Services industry experience preferred
  • Demonstrated technical expertise in object-oriented and database technologies/concepts that resulted in deployment of enterprise-quality solutions
  • Extensive knowledge of industry-leading software engineering approaches including Test Automation, Build Automation, and Configuration Management frameworks
  • Strong written and verbal technical communication skills
  • Demonstrated ability to develop effective working relationships that improved the quality of work products
  • Ability to maintain focus and develop proficiency in new skills rapidly
  • Ability to work in a fast-paced environment

About the company

SGA is a technology and resource solutions provider driven to stand out. We are a women-owned business. Our mission: to solve big IT problems with a more personal, boutique approach. Each year, we match consultants like you to more than 1,000 engagements. When we say let's work better together, we mean it. You'll join a diverse team built on these core values: customer service, employee development, and quality and integrity in everything we do. Be yourself, love what you do and find your passion at work. Please find us at .
