Data Engineer (Nuc Job/ 79)

For Technology Company
5 - 5 Years
Part Time
Immediate
Up to 1.5 LPA
1 Position(s)
Remote/Work From Home (Wfh)
Posted 1 Day Ago

Job Description

Job Title: Data Engineer

Experience: 5+ Years

Location: India

About the Role: 

We are seeking a proactive Data Engineer to support our data engineering initiatives and real estate market intelligence platform. You will collaborate closely with our London-based Data team to maintain robust ETL and data extraction pipelines, audit our growing data estate, and ensure the accuracy of our tracking systems, which cover millions of datapoints across the UK and Europe.

Key Responsibilities:

· Data Quality Roadmap: Implement our data quality roadmap by building and maintaining standardised testing frameworks and auditing protocols that ensure data integrity and quality across our entire extraction and transformation layers.

· AI & LLM Data Quality Engineering: 

Build and maintain rigorous testing suites for LLM-driven workflows. You will implement automated checks for prompt injection and output hallucination, along with graceful failure mechanisms, to ensure our AI-dependent ETL pipelines remain accurate and secure.

· Data Auditing & Analysis: 

Continuously audit our real estate data, thoroughly test newly added data objects across BigQuery, Cloud Storage, SharePoint, etc., and proactively find and fix data quality issues and anomalies.

· Data Extraction & Scraping: 

Support and maintain Python-based web scrapers (using libraries such as Playwright and Scrapy) that capture daily rent and availability metrics.

· Data Transformation: Write and optimize SQL workflows to transform raw scraper logs, enforcing data integrity at the analytical layer.

Required Qualifications & Experience:

· Education: Master's degree in computer science, data science or a closely related field.

· Quality Assurance/Testing: 6+ years of experience building automation systems and unit testing frameworks, with hands-on experience auditing, quality checking, and maintaining large-scale datasets.

Hands-on experience with Pytest/Unittest and Dataplex/Great Expectations.

· Python: 6+ years of professional, hands-on experience. Must have strong proficiency in object-oriented (class-based) programming and in building ETL/ELT pipelines. Experience using LLM APIs (e.g., OpenAI API, Gemini API, LangChain), the Places API, and libraries such as Pandas is required.

· SQL & NoSQL: 6+ years of professional experience writing advanced, optimized queries for data transformation and analytics. Experience using NoSQL databases is required.

· Google Cloud Platform (GCP): 4+ years of experience using services such as BigQuery, Firestore, Cloud Storage, Dataform, Cloud Run, and Compute Engine.

· GitHub & DevOps: Experience with GitHub, Terraform and setting up CI/CD pipelines (e.g., GitHub Actions) is required.

· IDEs: Work experience with LLM-supported coding tools such as Cursor and Antigravity is required.

Technical Stack & Environment: While you may not use every tool daily, you will be operating within and auditing systems built on the following stack:

· Programming Languages & Libraries: Python, SQL, YAML, NoSQL.

· Cloud & Infrastructure (GCP): BigQuery, Cloud Storage, Compute Engine, Cloud Run Jobs, Cloud Run Functions, Cloud Build, Cloud Scheduler, Firestore, Dataform, Artifact Registry, Secret Manager & Cloud Dataflow.

· DevOps, QA & Tools: GitHub, GitHub Actions (CI/CD), Docker, Terraform (IaC), Linux, Pytest, Pydantic & Jira.