Senior Data Scientist III – AI Evaluation, Prompt Engineering
Job Description:
• Evaluate and tune LLM-powered features, including prompts, retrieval-augmented generation (RAG) systems, and semantic search
• Design and execute experiments to measure model quality, reliability, and user impact—translating technical findings into product recommendations
• Develop and maintain data pipelines for evaluating, tracking, and improving system performance (e.g., accuracy, latency, cost, and relevance metrics)
• Analyze structured and unstructured datasets (e.g., product usage logs, document metadata, LLM outputs) to identify patterns, insights, and areas for optimization
• Collaborate with product managers to translate product goals into measurable data science questions, propose next steps, and inform roadmap priorities
• Provide technical guidance to data engineers who build and maintain analytics and model evaluation infrastructure
• Communicate results clearly—through written reports, dashboards, and presentations—to technical and non-technical stakeholders
• Stay current on emerging practices in applied NLP, LLM evaluation, and data-driven product development, and thoughtfully adapt them to our environment
Requirements:
• 3–6 years of experience in data science, applied NLP, or AI product analytics, preferably within a SaaS or research-heavy product environment
• Strong proficiency in Python and data analysis libraries such as Pandas; solid working knowledge of SQL
• Ability to design and evaluate LLM-based systems (e.g., RAG pipelines, prompt evaluations, output scoring), even if not specialized in deep learning
• Experience with data exploration, experimentation, and reporting—from defining metrics to visualizing and interpreting results
• Comfort working with document-based datasets (e.g., text corpora, metadata, embeddings) and an understanding of information retrieval and semantic search concepts
• Excellent written and verbal communication skills—able to present complex ideas simply and persuasively across distributed teams
• Proven ability to self-direct, learn new tools and concepts quickly, and apply them pragmatically
• Strong sense of curiosity, patience, and collaboration—especially in working across different disciplines and cultures
Benefits:
• Flexible remote-first work environment, with the option to work from our New York office
• Comprehensive health coverage, including medical, dental, and vision plans
• Retirement plan with inclusive risk benefits (disability, critical illness, life cover, and funeral cover)
• Modern family benefits, including adoption, surrogacy, and parental leave
• Paid study leave and professional development support
• Well-being initiatives and opportunities for sabbaticals and personal growth
• A culture that values work/life balance, clear communication, and continuous learning