Machine Learning Engineer (RARR Job 6294)

For India's Leading Diversified Group of Manufacturing and Services

3 - 6 Years

Full Time

Immediate

Up to 18 LPA

1 Position(s)

Gurugram / Gurgaon

Posted By: RARR Technologies Pvt Ltd

Posted 7 Days Ago

Job Description

About the Role

This is the core technical role. You will own the full ML pipeline - from raw inverter and weather time-series data to production-grade models that guarantee solar generation, predict inverter failures 7-14 days in advance, and detect anomalies in real time. We rely mainly on traditional machine learning (XGBoost, LightGBM, Prophet, Isolation Forest), built cleanly and robustly around our requirements, to keep inference fast and accuracy strong. The right model for the job, shipped reliably, is what matters.

What You'll Own

End-to-end ML pipeline: data ingestion from ClickHouse → feature engineering → training → evaluation → MLflow registry → FastAPI inference API
Generation model: compute and track Performance Ratio (PR) for each site; detect underperformance against GHI-based expected yield to within ±5%
Anomaly detection: Isolation Forest (phase 1) → LSTM autoencoder (phase 2) on MPPT power ratio, inverter temperature trend, fault frequency
Predictive maintenance: Remaining Useful Life (RUL) model on inverter temperature and fault-code history, with a 7-14 day failure prediction horizon
Yield forecast: LightGBM / XGBoost model using Solcast/Open-Meteo GHI forecasts + historical PR baseline to predict weekly kWh to within ±8%
MLOps: weekly automated retraining pipeline, model versioning in MLflow, A/B model promotion logic, performance drift detection
Feature engineering from raw time-series: rolling averages, sin/cos time encoding, weather transposition (GHI → POA), lag features, string imbalance ratios
Monthly automated report generation: actual vs forecast, PR trend, maintenance log, CO₂ offset
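
As a rough illustration of two items in the list above (Performance Ratio and sin/cos time encoding), a minimal Python sketch might look like the following. It assumes the conventional definition of PR as final yield divided by reference yield, with a reference irradiance of 1 kW/m², and it assumes plane-of-array irradiation is already available. The function names and sample figures are hypothetical and are not taken from the team's codebase.

```python
import math

# Assumption: reference irradiance at standard test conditions, in kW/m^2.
G_STC_KW_PER_M2 = 1.0


def performance_ratio(energy_kwh, irradiation_kwh_per_m2, capacity_kwp):
    """Return PR = final yield / reference yield for one site and period.

    energy_kwh: AC energy delivered over the period
    irradiation_kwh_per_m2: plane-of-array irradiation over the same period
    capacity_kwp: installed DC capacity of the site
    """
    if energy_kwh < 0 or irradiation_kwh_per_m2 <= 0 or capacity_kwp <= 0:
        raise ValueError("energy must be >= 0; irradiation and capacity must be > 0")
    final_yield = energy_kwh / capacity_kwp  # kWh per kWp
    reference_yield = irradiation_kwh_per_m2 / G_STC_KW_PER_M2  # peak-sun hours
    return final_yield / reference_yield


def encode_hour(hour):
    """Map an hour of day onto the unit circle so 23:00 and 00:00 sit close together."""
    angle = 2 * math.pi * (hour % 24) / 24
    return math.sin(angle), math.cos(angle)


if __name__ == "__main__":
    # Hypothetical day: 400 kWh from a 100 kWp site under 5 kWh/m^2 of irradiation.
    print(performance_ratio(400.0, 5.0, 100.0))
    print(encode_hour(6))
```

The cyclic encoding is used instead of a raw hour value so that tree models and distance-based detectors do not treat midnight as far from 23:00.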

Bonus Skills (Strong Plus)

Solar domain knowledge - Performance Ratio, specific yield, CUF, irradiance transposition models (Perez, Hay-Davies)
Survival analysis / RUL modelling - Weibull distribution, Cox proportional hazards, degradation models
LSTM autoencoder for anomaly detection - implementation experience, not just awareness
ClickHouse - time-series query patterns, MergeTree engines, materialised views for feature computation
Kafka consumer in Python - reading from Kafka topics for online feature computation
AWS SageMaker, Vertex AI, or any managed ML platform - training job orchestration
NILM (non-intrusive load monitoring) - useful for future load disaggregation features
Elixir, Go, or Rust - ability to read code in these languages for pipeline integration

What Good Looks Like in This Role

Day 30: PR pipeline running daily for each site; baseline anomaly model live with <10% false positive rate
Day 90: MPPT imbalance detection live; inverter temp trend model with 7-day prediction horizon
Day 150: All 4 models in production; weekly retraining cron running; monthly PDF report auto-generating
Day 365: LSTM anomaly upgrade live; yield forecast MAE <8%; RUL model validated on 6+ months of fault history

You'll Thrive Here If You

Prefer simple, explainable models that work over complex models that sometimes work
Are rigorous about evaluation - you set up proper train/val/test splits on time-series data (no data leakage)
Understand that a model in a Jupyter notebook is not a model in production
Communicate uncertainty clearly - you know when your model is confident and when it isn't
Are self-directed - you can take 'we need to predict failures' and figure out the data, the model, and the metric

What You Get

Ownership of the entire ML stack - from data to production inference
Real, novel problem: generation on time-series IoT data is not a solved problem
Work directly with the founding team - your architectural decisions shape the product
