
Machine Learning Engineering Manager – LLM Serving, Infrastructure

Remote, USA Full-time Posted 2025-11-24
• Lead a high-performing engineering team to develop, build, and deploy a high-scale, low-latency LLM Serving Infrastructure.
• Drive the implementation of a unified serving layer to support multiple LLMs and inference types (batch, offline evaluation flows, and real-time/streaming).
• Lead all aspects of the development of the Model Registry for deploying, versioning, and running LLMs across production environments.
• Ensure successful integration with the core Personalization and Recommendation systems to deliver LLM-powered features.
• Define and champion standardized technical interfaces and protocols for efficient model deployment and scaling.
• Establish and monitor the serving infrastructure's performance, cost, and reliability, including load balancing, autoscaling, and failure recovery.
• Collaborate closely with data science, machine learning research, and feature teams (Autoplay, Home, Search, etc.) to drive active adoption of the serving infrastructure.
• Scale up the serving architecture to handle hundreds of millions of users and high-volume inference requests for internal domain-specific LLMs.
• Drive Latency and Cost Optimization: partner with SRE and ML teams to implement techniques like quantization, pruning, and efficient batching to minimize serving latency and cloud compute costs.
• Develop Observability and Monitoring: build dashboards and alerting for service health, tracing, A/B test traffic, and latency trends to ensure adherence to defined SLAs.
• Contribute to Core LPM Serving: focus on the technical strategy for deploying and maintaining the core Large Personalization Model (LPM).
