Principal Machine Learning Engineer (Internet Security LLMs, NLP, Deep Learning)

Palo Alto Networks

yesterday

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Compensation

$ 254K

Tech stack

Java

Amazon Web Services (AWS)

Data analysis

Architectural Patterns

Data Stores

Software Debugging

Linux

Internet Security

Python

Machine Learning

MongoDB

MySQL

Natural Language Processing

NoSQL

Performance Tuning

TensorFlow

Shell Script

Software Deployment

Data Logging

Google Cloud Platform

PyTorch

Delivery Pipeline

Large Language Models

Multi-Agent Systems

Deep Learning

Backend

Keras

Containerization

Scikit Learn

Kubernetes

Information Technology

Deployment Automation

Machine Learning Operations

Virtual Agents

GPT

Data Pipelines

Docker

Job description

Design, build, and operate production machine learning systems that balance model quality, cost, latency, and reliability in a security-sensitive environment.
Own the end-to-end lifecycle of ML and LLM components, from problem formulation and model development to production deployment, monitoring, and iterative improvement.
Integrate ML and LLM-based services with backend systems and data pipelines, ensuring scalability, observability, and safe operation in production.
Develop and maintain automated training, evaluation, and retraining pipelines, and build data analysis tools to continuously improve model performance as data and threats evolve.
Partner closely with Product Managers and domain experts to translate product and security requirements into robust ML solutions with clear success metrics.
Collaborate with software engineers and SREs on release planning, deployment strategies, monitoring, and incident response to ensure reliable and predictable production behavior.

Requirements

MS or Ph.D. in Computer Science or a related field, with a focus on Machine Learning, and 8+ years of industry experience delivering ML systems in production environments.
Strong problem solver and collaborative team player with clear communication skills, able to work effectively across engineering, product, and SRE teams.
Solid foundation in Machine Learning, Deep Learning, and NLP, with hands-on experience using modern architectures such as transformer-based models and representation learning techniques.
Practical experience applying Large Language Models (LLMs) to real-world problems, including text understanding, classification, extraction, summarization, or reasoning over large-scale and noisy data.
Experience designing, implementing, and operating LLM-powered components in production, including prompt design, model adaptation or fine-tuning, evaluation, and cost/performance optimization.
Experience with MLOps / AIOps practices for operating ML and LLM systems in production, including model lifecycle management, monitoring, logging, alerting, retraining workflows, and debugging production issues.
Understanding of model quality, robustness, and safety considerations, including evaluation methodologies, failure modes, and guardrails required for production ML systems in security-sensitive environments.
Strong experience with ML frameworks, libraries, and tooling (e.g., PyTorch, TensorFlow, Keras, scikit-learn, Kubeflow), and solid software engineering fundamentals.
Ability to independently own ML features end-to-end, from problem formulation and system design to implementation, deployment, and iterative improvement in production.
Proficient in Python, with working knowledge of Java, Linux, and shell scripting.
Experience building and operating services on cloud platforms (Google Cloud Platform and/or AWS) and in containerized environments (Docker, Kubernetes).
Familiarity with AI agent-based approaches, such as multi-step inference pipelines, tool-augmented LLM workflows, or systems that combine models, heuristics, and external signals to drive reliable decisions.
Experience with website content understanding, website classifications, security, or large-scale internet data is a strong plus.
Familiarity with relational and NoSQL data stores such as MySQL, MongoDB, or similar systems.
Experience applying LLMs and agentic systems in security-sensitive or high-precision domains is a strong plus.

Benefits & conditions

The compensation offered for this position will depend on qualifications, experience, and work location. For candidates who receive an offer at the posted level, the starting base salary (for non-sales roles) or base salary + commission target (for sales/commissioned roles) is expected to be within the annual range listed below. The offered compensation may also include restricted stock units and a bonus. A description of our employee benefits may be found here.

$157,200.00 - $254,100.00/yr

Our Commitment

We're trailblazers that dream big, take risks, and challenge cybersecurity's status quo. It's simple: we can't accomplish our mission without diverse teams innovating, together.
