RL Deep Learning Engineer (Remote)
Job description
We're seeking an engineering generalist to build the first RL environments and benchmarks purpose-built for long-horizon legal reasoning: tasks where AI agents must search, read, analyze, and draft across real case filings, the same work that still takes teams of lawyers days to weeks. Frontier labs will use these environments to make future models more legally capable, and we need an engineer to own the infrastructure that makes it all work.
You'll design and scale the systems that turn millions of real court filings into verifiable evaluation environments and RL training tasks. You'll work directly with our attorneys, our data pipeline, and our partners at frontier AI labs.
What you'll do
- Build and maintain the evaluation harness and RL environment infrastructure: task runners, sandboxed environments, and scoring logic that can scale to thousands of parallel agents
- Own the data pipeline that turns freshly collected court filings into benchmark and RL tasks before they reach any model's training set
- Integrate with partner harnesses and model APIs to run contamination-free evaluations
- Collaborate with attorneys to translate legal workflows like cite checks, motion drafting, and precedent research into structured, scorable task formats using the Harbor spec
Requirements
- Strong generalist software engineering fundamentals. You've built, scaled, and maintained diverse systems in production
- You've built entire systems yourself, don't require detailed specs or product managers, and take full ownership of your projects
- Deep experience with Python, bonus for TypeScript. Most importantly, you can work on hard engineering problems
- You should be kind, self-managing, and a clear communicator
- You make effective use of Cursor/Claude Code/Codex and are capable of writing good code without them
Bonuses but not requirements
- Familiarity with LLM evaluation. You get what makes a good rubric and why benchmarks leak
- Comfort working with messy, real-world document data (legal filings, PDFs, long-form text)
Benefits & conditions
- Location: In-office or remote (hiring remotely in the United States)
- Salary: 125K-180K annually
- Level: Mid level
About Midpage
Midpage is the search engine for legal data used by AI labs. We cover all US court data - 20M records. Over 300 law firms use our platform directly, 200k+ visitors read cases on our site every month, and five multibillion-dollar companies including Perplexity trust us as their legal data supplier. We're a team of 7 in Bowery, lower Manhattan. Our ARR has grown from $400k to $2M in the last 4 months.