[Remote] Senior AI/ML Specialist Solutions Architect (AI Infra & Cloud)
Note: This is a remote position open to candidates in the USA. Lavendo is a publicly traded company at the forefront of the AI revolution, offering an AI-centric cloud platform. They are seeking a Senior AI/ML Specialist Solutions Architect to design and implement scalable AI solutions for AI-focused customers, working with state-of-the-art technologies and contributing to one of the most powerful commercially available supercomputers.
Responsibilities
• Architect and optimize distributed training and inference systems for large-scale AI models
• Design and deliver customer-focused solutions that maximize performance and business value
• Lead the transition of ML pipelines from POC to scalable production systems
• Build long-term customer relationships, ensuring satisfaction and alignment with strategic goals
• Create whitepapers, deliver technical presentations, and host webinars to share insights and best practices
• Provide technical leadership and mentor teams on AI infrastructure and deployment strategies
• Collaborate with engineering and product teams to prioritize customer feedback and influence product roadmaps
Skills
• 5+ years of experience with cloud technologies and infrastructure, ideally in senior MLOps or Solutions Architect roles
• Proven expertise in scaling and optimizing AI workloads across multi-node and multi-GPU environments
• Demonstrated success delivering ML products, scaling from POC to production
• Deep knowledge of ML frameworks like PyTorch and JAX
• Strong background in the NVIDIA HPC ecosystem (CUDA, NCCL, InfiniBand)
• Exceptional communication skills to engage both technical teams and business stakeholders
• Legal authorization to work in the United States on a full-time basis without sponsorship
• Programming Languages: Python, Go, Java, C++
• Infrastructure as Code (IaC): Terraform, Ansible
• Orchestration: Kubernetes (K8s), Slurm
• DevOps Tools: Git, Docker, Helm
• Big Data Frameworks: Spark, Kafka, Hadoop
• Databases: SQL, NoSQL, and vector databases
• ML Frameworks: PyTorch, TensorFlow, JAX, Hugging Face, scikit-learn
Benefits
• Full medical benefits: 100% company-paid medical, dental, and vision coverage for employees and families
• 401(k) plan with a 4% match program
• Stock options plan
• Flexible remote work environment
• Company-paid short-term disability, long-term disability, and life insurance coverage
• 20 weeks of paid parental leave for primary caregivers, 12 weeks for secondary caregivers
• Up to $85/month for mobile and internet
Company Overview
• Lavendo provides sales recruiting for startups in the United States. It was founded in 2021 and is headquartered in San Francisco, California, USA, with a workforce of 2-10 employees. Its website is https://www.lavendo.io/.