Principal Machine Learning Engineer (Internet Security LLMs, NLP, Deep Learning)

Palo Alto Networks
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$254K

Job location

Tech stack

Java
Amazon Web Services (AWS)
Data analysis
Architectural Patterns
Data Stores
Software Debugging
Linux
Internet Security
Python
Machine Learning
MongoDB
MySQL
Natural Language Processing
NoSQL
Performance Tuning
TensorFlow
Shell Script
Software Deployment
Data Logging
Google Cloud Platform
PyTorch
Delivery Pipeline
Large Language Models
Multi-Agent Systems
Deep Learning
Backend
Keras
Containerization
Scikit Learn
Kubernetes
Information Technology
Deployment Automation
Machine Learning Operations
Virtual Agents
GPT
Data Pipelines
Docker

Job description

  • Design, build, and operate production machine learning systems that balance model quality, cost, latency, and reliability in a security-sensitive environment.
  • Own the end-to-end lifecycle of ML and LLM components, from problem formulation and model development to production deployment, monitoring, and iterative improvement.
  • Integrate ML and LLM-based services with backend systems and data pipelines, ensuring scalability, observability, and safe operation in production.
  • Develop and maintain automated training, evaluation, and retraining pipelines, and build data analysis tools to continuously improve model performance as data and threats evolve.
  • Partner closely with Product Managers and domain experts to translate product and security requirements into robust ML solutions with clear success metrics.
  • Collaborate with software engineers and SREs on release planning, deployment strategies, monitoring, and incident response to ensure reliable and predictable production behavior.

Requirements

  • MS or Ph.D. in Computer Science or a related field, with a focus on Machine Learning, and 8+ years of industry experience delivering ML systems in production environments.
  • Strong problem solver and collaborative team player with clear communication skills, able to work effectively across engineering, product, and SRE teams.
  • Solid foundation in Machine Learning, Deep Learning, and NLP, with hands-on experience using modern architectures such as transformer-based models and representation learning techniques.
  • Practical experience applying Large Language Models (LLMs) to real-world problems, including text understanding, classification, extraction, summarization, or reasoning over large-scale and noisy data.
  • Experience designing, implementing, and operating LLM-powered components in production, including prompt design, model adaptation or fine-tuning, evaluation, and cost/performance optimization.
  • Experience with MLOps / AIOps practices for operating ML and LLM systems in production, including model lifecycle management, monitoring, logging, alerting, retraining workflows, and debugging production issues.
  • Understanding of model quality, robustness, and safety considerations, including evaluation methodologies, failure modes, and guardrails required for production ML systems in security-sensitive environments.
  • Strong experience with ML frameworks, libraries, and tooling (e.g., PyTorch, TensorFlow, Keras, scikit-learn, Kubeflow), and solid software engineering fundamentals.
  • Ability to independently own ML features end-to-end, from problem formulation and system design to implementation, deployment, and iterative improvement in production.
  • Proficient in Python, with working knowledge of Java, Linux, and shell scripting.
  • Experience building and operating services on cloud platforms (Google Cloud Platform and/or AWS) and in containerized environments (Docker, Kubernetes).
  • Familiarity with AI agent-based approaches, such as multi-step inference pipelines, tool-augmented LLM workflows, or systems that combine models, heuristics, and external signals to drive reliable decisions.
  • Experience with website content understanding, website classification, security, or large-scale internet data is a strong plus.
  • Familiarity with relational and NoSQL data stores such as MySQL, MongoDB, or similar systems.
  • Experience applying LLMs and agentic systems in security-sensitive or high-precision domains is a strong plus.

Benefits & conditions

The compensation offered for this position will depend on qualifications, experience, and work location. For candidates who receive an offer at the posted level, the starting base salary (for non-sales roles) or base salary + commission target (for sales/commissioned roles) is expected to be the annual range listed below. The offered compensation may also include restricted stock units and a bonus. A description of our employee benefits may be found here.

$157,200.00 - $254,100.00/yr

Our Commitment

We're trailblazers that dream big, take risks, and challenge cybersecurity's status quo. It's simple: we can't accomplish our mission without diverse teams innovating, together.
