Reliability Engineer - Automation Engineer

BlackRock, Inc.
Wilmington, United States of America
3 days ago

Role details

Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Shift work
Languages: English
Compensation: $110K

Job location

Remote
Wilmington, United States of America

Tech stack

Artificial Intelligence
Amazon Web Services (AWS)
Azure
Continuous Integration
Linux
Programming Tools
Disaster Recovery
Fault Tolerance
Monitoring of Systems
Systems Analysis
Python
Key Management
Machine Learning
Performance Tuning
Ansible
Runbook
Cloud Platform System
System Availability
Grafana
Reliability of Systems
Generative AI
Containerization
Kubernetes
Information Technology
Data Analytics
Virtual Agents
Terraform
Server Operating Systems & Platforms

Job description

Company Overview: We are looking for a highly skilled and dynamic individual to join our Production Engineering team within Aladdin Engineering. This role is perfect for someone passionate about technical troubleshooting, optimizing system performance, and developing innovative automation solutions. In addition to strong expertise in secrets management and automation, this role increasingly focuses on applied AI, including AI-assisted operations and Retrieval-Augmented Generation (RAG)-based assistants, to improve reliability, operability, and developer productivity.

Role Overview: As a Reliability Engineer, you will be responsible for ensuring the performance, scalability, and stability of our platforms through automation, intelligent tooling, and AI-assisted workflows. You will build and operate automation and AI-driven solutions that enhance system reliability, optimize operational efficiency, and improve how engineers interact with platforms, particularly around Vault infrastructure and secrets management. This includes using Python, AI frameworks, and data-driven approaches to analyze system behavior, automate diagnostics, and enable self-service capabilities via AI assistants.

  • Design, build, and operate reliable, scalable platforms using automation, Infrastructure as Code (Terraform, Ansible), CI/CD, and GitOps practices across cloud and containerized environments.

  • Automate operational workflows using Python and scripting, including secure configuration and the enterprise secrets management lifecycle (provisioning, rotation, access, and recovery).

  • Develop and integrate AI-assisted operational tooling, including RAG-based assistants, to support incident response, troubleshooting, diagnostics, and engineer self-service.

  • Build and maintain knowledge ingestion and retrieval pipelines (logs, metrics, runbooks, configuration data) to power AI assistants and intelligent automation.

  • Monitor and analyze system health, performance, and capacity using observability tools (e.g., logs, metrics, dashboards), and perform root cause analysis for platform incidents.

  • Participate in on-call rotations, supporting production systems and driving continuous improvement through automation and AI-driven reduction of operational toil.

  • Support disaster recovery planning and execution, application onboarding, upgrades, and change management for platforms relying on centralized secrets and secure configuration services.

  • Ensure all automation and AI solutions are secure, explainable, and production-ready, meeting enterprise and regulated-environment requirements.

Requirements

  • Bachelor's degree (or equivalent) in Computer Science, Engineering, Mathematics, or a related field.

  • Strong Python development skills for automation, systems analysis, and integration with AI tooling.

  • Experience with cloud platforms such as AWS and Azure.

  • Hands-on experience with Ansible, Terraform, and configuration-as-code practices.

  • Experience using monitoring and observability tools to track and optimize resource utilization.

  • Foundational understanding of AI/ML concepts, particularly as applied to automation, observability, or developer tooling.

Preferred Skills:

  • Experience with Linux-based server environments.

  • Familiarity with Kubernetes and containerized platforms.

  • Exposure to RAG architectures, vector databases, or AI assistant frameworks in production or platform contexts.

  • Prior experience in financial services or large-scale technology environments.

  • Experience designing systems for high availability, scalability, and fault tolerance.

Good to have:

  • Azure certification.

  • Certification or hands-on experience with enterprise secrets management platforms.

  • Experience building or operating AI-powered internal developer platforms or assistants.

Ideal Candidate: You are a proactive, analytical engineer with a strong foundation in secrets management, automation, and platform reliability, and a growing passion for AI-driven operations. You enjoy applying AI pragmatically: building assistants, automations, and intelligent workflows that reduce toil, improve reliability, and empower engineers at scale. If you are excited about combining automation, AI, and secure platform services to shape the future of production engineering, we would love to speak with you.

Benefits & conditions

For Wilmington, DE only, the salary range for this position is USD$110,000.00 - USD$138,000.00. Additionally, employees are eligible for an annual discretionary bonus, and benefits including healthcare, leave benefits, and retirement benefits. BlackRock operates a pay-for-performance compensation philosophy and your total compensation may vary based on role, location, and firm, department and individual performance.

Our benefits

To help you stay energized, engaged and inspired, we offer a wide range of benefits including a strong retirement plan, tuition reimbursement, comprehensive healthcare, support for working parents and Flexible Time Off (FTO) so you can relax, recharge and be there for the people you care about.

About the company

BlackRock's hybrid work model is designed to enable a culture of collaboration and apprenticeship that enriches the experience of our employees, while supporting flexibility for all. Employees are currently required to work at least 4 days in the office per week, with the flexibility to work from home 1 day a week. Some business groups may require more time in the office due to their roles and responsibilities. We remain focused on increasing the impactful moments that arise when we work together in person - aligned with our commitment to performance and innovation. As a new joiner, you can count on this hybrid model to accelerate your learning and onboarding experience here at BlackRock.

About BlackRock

At BlackRock, we are all connected by one mission: to help more and more people experience financial well-being. Our clients, and the people they serve, are saving for retirement, paying for their children's educations, buying homes and starting businesses. Their investments also help to strengthen the global economy: support businesses small and large; finance infrastructure projects that connect and power cities; and facilitate innovations that drive progress. This mission would not be possible without our smartest investment - the one we make in our employees. It's why we're dedicated to creating an environment where our colleagues feel welcomed, valued and supported with networks, benefits and development opportunities to help them thrive.

Our purpose is to help more and more people experience financial well-being, with clients ranging from governments, foundations and other large institutions, to those investing on behalf of individuals - like firefighters, nurses, teachers and factory workers - saving for retirement. Clients turn to us - as both an asset manager and leading provider of financial technology - for the innovative solutions they need when planning for their most important goals.
