Development

Senior Gen AI Developer

Pune, Maharashtra
Work Type: Full Time

Role: Senior Gen AI Developer

Total Experience: 6+ years, with 2+ years working on GenAI initiatives

Employment Type: Permanent & Full-time

Working Model: Hybrid (3 days work from office)


Job Summary:

We are seeking a Senior AI Developer with proven expertise in Generative AI technologies, a solid foundation in machine learning, and a strong understanding of data governance. The ideal candidate will have hands-on experience with both cloud-based LLM platforms and on-premise, open-source LLM stacks such as Ollama, llama.cpp, and GGUF-based models. You should also have a good working knowledge of the Model Context Protocol (MCP). You will help architect and implement GenAI-powered products that are secure, scalable, and enterprise-ready.


Key Responsibilities:

  • Design, build, and deploy GenAI solutions using both cloud-hosted and on-prem LLMs.
  • Work with frameworks such as Hugging Face, LangChain, LangGraph, and LlamaIndex to enable RAG and prompt orchestration.
  • Implement private LLM deployments using tools such as Ollama, LM Studio, llama.cpp, GPT4All, and vLLM.
  • Design retrieval-augmented generation (RAG) pipelines with context-aware orchestration using MCP.
  • Implement and manage Model Context Protocol (MCP) for dynamic context injection, chaining, memory management, and secure prompt orchestration across GenAI workflows.
  • Fine-tune open-source models for specific enterprise tasks and optimize inference performance.
  • Integrate LLMs into real-world applications via REST, gRPC, or local APIs.
  • Ensure secure data flows and proper context management in RAG pipelines.
  • Collaborate across data, product, and infrastructure teams to operationalize GenAI.
  • Incorporate data governance and responsible AI practices from design through deployment.


Required Skills and Qualifications:

  • 6+ years of experience in AI/ML; 2+ years working on GenAI initiatives.
  • Experience with cloud-based LLMs (OpenAI, Claude, Gemini on AWS/GCP/Azure) and open-source LLMs such as Mistral, Llama 2/3, Falcon, and Mixtral.
  • Strong hands-on expertise with on-premise LLM frameworks (Ollama, llama.cpp, GGUF models, etc.).
  • Hands-on experience with Model Context Protocol (MCP) for structured prompt orchestration, context injection, and tool execution.
  • Proven experience in building and optimizing Retrieval-Augmented Generation (RAG) pipelines, including document chunking, embedding generation, and vector search integration.
  • Proficiency in Python and libraries such as Hugging Face Transformers, LangChain, and PyTorch.
  • Experience with embedding models and vector databases (FAISS, Pinecone, Weaviate, Qdrant, etc.).
  • Familiarity with MLOps, GPU optimization, containerization, and deployment in secure environments.
  • Good understanding of data governance—access control, lineage, auditability, privacy.


Nice to Have:

  • Exposure to multi-modal models (image, speech) and toolformer-style agents
  • Experience integrating AI into enterprise platforms (e.g., ServiceNow, Salesforce, Jira)
  • Awareness of inference acceleration tools (vLLM, DeepSpeed, TensorRT)
