Reliability Engineer - Automation Engineer
Job description
Company Overview: We are looking for a highly skilled and dynamic individual to join our Production Engineering team within Aladdin Engineering. This role is perfect for someone passionate about technical troubleshooting, optimizing system performance, and developing innovative automation solutions. In addition to strong expertise in secrets management and automation, this role increasingly focuses on applied AI, including AI-assisted operations and Retrieval-Augmented Generation (RAG)-based assistants, to improve reliability, operability, and developer productivity.
Role Overview: As a Reliability Engineer, you will be responsible for ensuring the performance, scalability, and stability of our platforms through automation, intelligent tooling, and AI-assisted workflows. You will build and operate automation and AI-driven solutions that enhance system reliability, optimize operational efficiency, and improve how engineers interact with platforms, particularly around Vault infrastructure and secrets management. This includes leveraging Python, AI frameworks, and data-driven approaches to analyze system behavior, automate diagnostics, and enable self-service capabilities via AI assistants.
- Design, build, and operate reliable, scalable platforms using automation, Infrastructure as Code (Terraform, Ansible), CI/CD, and GitOps practices across cloud and containerized environments.
- Automate operational workflows using Python and scripting, including secure configuration and the enterprise secrets management lifecycle (provisioning, rotation, access, and recovery).
- Develop and integrate AI-assisted operational tooling, including RAG-based assistants, to support incident response, troubleshooting, diagnostics, and engineer self-service.
- Build and maintain knowledge ingestion and retrieval pipelines (logs, metrics, runbooks, configuration data) to power AI assistants and intelligent automation.
- Monitor and analyze system health, performance, and capacity using observability tools (e.g., logs, metrics, dashboards), and perform root cause analysis for platform incidents.
- Participate in on-call rotations, supporting production systems and driving continuous improvement through automation and AI-driven reduction of operational toil.
- Support disaster recovery planning and execution, application onboarding, upgrades, and change management for platforms relying on centralized secrets and secure configuration services.
- Ensure all automation and AI solutions are secure, explainable, and production-ready, meeting enterprise and regulated-environment requirements.
Requirements
- Bachelor's degree (or equivalent) in Computer Science, Engineering, Mathematics, or a related field.
- Strong Python development skills for automation, systems analysis, and integration with AI tooling.
- Experience with cloud platforms such as AWS and Azure.
- Hands-on experience with Ansible, Terraform, and configuration-as-code practices.
- Experience using monitoring and observability tools to track and optimize resource utilization.
- Foundational understanding of AI/ML concepts, particularly as applied to automation, observability, or developer tooling.
Preferred Skills:
- Experience with Linux-based server environments.
- Familiarity with Kubernetes and containerized platforms.
- Exposure to RAG architectures, vector databases, or AI assistant frameworks in production or platform contexts.
- Prior experience in financial services or large-scale technology environments.
- Experience designing systems for high availability, scalability, and fault tolerance.
Good to have:
- Azure certification.
- Certification or hands-on experience with enterprise secrets management platforms.
- Experience building or operating AI-powered internal developer platforms or assistants.
Ideal Candidate: You are a proactive, analytical engineer with a strong foundation in secrets management, automation, and platform reliability, and a growing passion for AI-driven operations. You enjoy applying AI pragmatically by building assistants, automations, and intelligent workflows that reduce toil, improve reliability, and empower engineers at scale. If you are excited about combining automation, AI, and secure platform services to shape the future of production engineering, we would love to speak with you.
Benefits & conditions
For Wilmington, DE only, the salary range for this position is USD$110,000.00 - USD$138,000.00. Additionally, employees are eligible for an annual discretionary bonus, and benefits including healthcare, leave benefits, and retirement benefits. BlackRock operates a pay-for-performance compensation philosophy, and your total compensation may vary based on role, location, and firm, department, and individual performance.
Our benefits
To help you stay energized, engaged and inspired, we offer a wide range of benefits including a strong retirement plan, tuition reimbursement, comprehensive healthcare, support for working parents and Flexible Time Off (FTO) so you can relax, recharge and be there for the people you care about.