Senior Python Developer – LlamaIndex / RAG Pipeline Engineer
We are seeking a Senior Python Developer with expertise in LlamaIndex and RAG pipeline engineering. You will design and implement efficient data processing pipelines, optimize workflows, and integrate LlamaIndex-based solutions. Strong problem-solving skills and the ability to collaborate in a fast-paced environment are essential. If you're passionate about using Python to build scalable solutions, we want to hear from you!
Required Skills & Experience:
3+ years of Python development, including production deployments
Hands-on experience with LlamaIndex (not just LangChain)
Vector database implementation (Qdrant, Milvus, or pgvector)
Document parsing pipelines (Apache Tika, Docling, PyMuPDF, or Unstructured)
Local LLM inference with llama.cpp or vLLM
Experience with open-source models (Llama, Mistral, Gemma, or similar)
GGUF model formats and quantization (Q4, Q8)
Sentence-transformers or HuggingFace embedding models
Multi-tenant application architecture
Async Python (asyncio)
Strong Plus:
Gemma model family experience
LoRA/QLoRA fine-tuning
Custom LlamaIndex NodeParser or BaseReader implementations
HNSW/IVF index tuning
Chunking strategy optimization (semantic chunking experience)
Docker/Kubernetes deployment
Not Required:
OpenAI API experience (we run fully self-hosted)
Frontend development