Cloud Engineer - Sr

AKAASA TECHNOLOGIES INC
Richmond, United States of America
1 month ago

Role details

Contract type
Temporary to permanent
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Richmond, United States of America

Tech stack

Java
API
Amazon Web Services (AWS)
Cloud Engineering
Continuous Integration
Data Files
Python
New Relic
SQL Databases
Systems Architecture
Systems Integration
Datadog
Backend
Drilldown
GraphQL
Splunk
New Relic (SaaS)
Docker
Jenkins
Databricks

Job description

SRE background required, with AWS and Python/Java. Expertise in observability tools like Splunk, New Relic, and Observe is a must-have. The role involves journey mapping on DFS intent. Focus: Full-Stack Observability, System Traceability, and Executive Health Scoring.

Requirements

We are seeking a hands-on Observability Specialist to accelerate the adoption of our Observe-based platform. The ideal candidate has an SRE mindset: the ability to explore how complex systems interact and identify the exact data sets needed to provide a 360-degree view of the environment. You will bridge the gap between disparate Lines of Business (LOBs) to build end-to-end (E2E) traceability and unified "Health Indices" that reduce mean-time-to-detect (MTTD) from hours to minutes.

Technical Skill Requirements

  1. Core Observability & Tooling
  • Platform Expertise: Deep experience with modern observability platforms. While we use Observe, proficiency in New Relic, Splunk, or Databricks is required for rapid ramp-up.
  • Query & Data Fluency: Expert-level ability to write complex queries (SQL-based or in proprietary languages such as NRQL or SPL) to aggregate API success rates, latency, and crash-free session data.
  • Dashboard Architecture: Proven track record of building "Drill-Down" architectures, moving from high-level user journeys (e.g., Login) directly into microservice-level logs and traces.
  2. The Modern Tech Stack
  • Infrastructure: Hands-on experience with AWS (ECS/Fargate/Lambda) and Docker.
  • Languages: Ability to navigate and instrument code in Python or Java.
  • Integrations: Familiarity with GraphQL for data fetching and Jenkins for CI/CD pipeline monitoring.
  • Instrumentation: Hands-on experience with OpenTelemetry (OTel), and familiarity with New Relic APM or Datadog APM.
  3. SRE & Systems Architecture Mindset
  • Cross-Domain Traceability: Experience monitoring digital customer engagement across disparate system boundaries (e.g., Comms, Phone, and Backend APIs) to expose "silent failures."
  • Telemetry Mapping: Ability to map technical metrics to business outcomes, specifically creating Unified Health Indices for Senior Leadership (SLT).
  • Root Cause Analysis (RCA): Skill in configuring alerts and correlations that enable instant pinpointing of failures within complex user flows.
