Senior Data Engineer - GenAI & Unstructured Data Pipelines (RARR Job 6324)

For a Leading Technology Services and Consulting Company
7 - 10 Years
Full Time
Immediate
Up to 27 LPA
1 Position(s)
Bangalore / Bengaluru, Chennai, Gurgaon / Gurugram, Hyderabad, Kolkata, Mumbai, Noida, Pan India, Pune
Posted: Updated Today

Job Description

Job Title: Senior Data Engineer - GenAI & Unstructured Data Pipelines

Responsibilities:

  • Own the end-to-end ML lifecycle, including data ingestion, feature engineering, training, evaluation, deployment, monitoring, retraining, and rollback
  • Design, build, and operate production-grade ML pipelines using Azure-native services with solid CI/CD and automation practices
  • Use Azure Machine Learning and MLflow for experiment tracking, model registry, and governed promotion across Dev/Test/Prod environments
  • Design and deploy Generative AI solutions using Azure OpenAI, embeddings, vector search, and RAG pipelines
  • Build agentic AI workflows with multi-step reasoning, tool usage, guardrails, observability, reliability, and cost control
  • Build scalable data and feature pipelines using Azure Databricks (batch and streaming)
  • Build scalable data pipelines for text, documents, logs, and multimodal data
  • Develop RAG pipelines including chunking, embedding, and retrieval workflows
  • Design and manage vector search systems (Azure AI Search, Pinecone, etc.)
  • Build batch and real-time data ingestion pipelines using Spark and Kafka


Required Skills:

  • 6–8 years of experience in Data Engineering
  • Strong hands-on experience with Python / PySpark / SQL
  • Experience with Apache Spark, Airflow, and Kafka
  • Experience handling unstructured data (JSON, logs, documents, PDFs)
  • Experience with cloud platforms (Azure preferred)
  • Exposure to LLM / GenAI pipelines (RAG, embeddings, vector DBs)