[Remote] Senior AI/ML Specialist Solutions Architect (AI Infra & Cloud)

Remote, USA Full-time Posted 2025-11-24
Note: This is a remote position open to candidates in the USA.

Lavendo is a publicly traded company at the forefront of the AI revolution, offering an AI-centric cloud platform. They are seeking a Senior AI/ML Specialist Solutions Architect to design and implement scalable AI solutions for AI-focused customers, working with state-of-the-art technologies and contributing to one of the most powerful commercially available supercomputers.

Responsibilities
• Architect and optimize distributed training and inference systems for large-scale AI models
• Design and deliver customer-focused solutions that maximize performance and business value
• Lead the transition of ML pipelines from POC to scalable production systems
• Build long-term customer relationships, ensuring satisfaction and alignment with strategic goals
• Create whitepapers, deliver technical presentations, and host webinars to share insights and best practices
• Provide technical leadership and mentor teams on AI infrastructure and deployment strategies
• Collaborate with engineering and product teams to prioritize customer feedback and influence product roadmaps

Skills
• 5+ years of experience with cloud technologies and infrastructure, ideally in senior MLOps or Solutions Architect roles
• Proven expertise in scaling and optimizing AI workloads across multi-node and multi-GPU environments
• Demonstrated success delivering ML products, scaling from POC to production
• Deep knowledge of ML frameworks like PyTorch and JAX
• Strong background in the NVIDIA HPC ecosystem (CUDA, NCCL, InfiniBand)
• Exceptional communication skills to engage both technical teams and business stakeholders
• Legal authorization to work in the United States on a full-time basis without sponsorship
• Programming Languages: Python, Go, Java, C++
• Infrastructure as Code (IaC): Terraform, Ansible
• Orchestration: Kubernetes (K8s), Slurm
• DevOps Tools: Git, Docker, Helm
• Big Data Frameworks: Spark, Kafka, Hadoop
• Databases: SQL, NoSQL, and vector databases
• ML Frameworks: PyTorch, TensorFlow, JAX, HuggingFace, Scikit-learn

Benefits
• Full medical benefits: 100% company-paid medical, dental, and vision coverage for employees and families
• 401(k) plan with a 4% match program
• Stock options plan
• Flexible remote work environment
• Company-paid short-term disability, long-term disability, and life insurance coverage
• 20 weeks paid parental leave for primary caregivers, 12 weeks for secondary caregivers
• Up to $85/month for mobile and internet

Company Overview
• Sales recruiting for startups in the United States
• Founded in 2021 and headquartered in San Francisco, California, USA, with a workforce of 2-10 employees
• Website: https://www.lavendo.io/
