AMLEGALS
AI Sector — DPDPA Compliance

DPDPA for AI, Machine Learning & GenAI Companies

AI companies face a fundamental tension — the technology depends on large-scale data processing, but DPDPA introduces purpose limitation, consent, and data principal rights at every stage of the ML pipeline. Compliance requires rethinking data architecture from collection through deployment.

Training Data | SDF Assessment | Algorithmic Audit | GenAI Risk | Section 10 + Rule 14

DPDPA does not contain AI-specific provisions — unlike the EU AI Act. But Section 10 (Significant Data Fiduciary), Rule 14 (algorithmic assessment), and Section 11 (right to information) combine to create a de facto AI governance framework. Any AI company processing personal data at scale must navigate this framework.

Sub-Sector Analysis

DPDPA Challenges by AI Sub-Sector

Large Language Models (LLMs) & GenAI

Foundation models, chatbots, content generation
  • Training data containing personal data scraped from Indian websites — consent at scale is impractical
  • Generated content that reproduces personal data from training sets — data leakage liability
  • User prompts as personal data — conversation logs, uploaded documents, context windows
  • Cross-border transfer of training data and model weights — Section 16 compliance
  • SDF classification for GenAI platforms with millions of Indian users
  • Right to erasure — can personal data be removed from a trained model? Technical and legal ambiguity

Computer Vision & Facial Recognition

Surveillance, authentication, image processing
  • Facial data as biometric personal data — consent requirements for collection and matching
  • CCTV and surveillance deployments — purpose limitation beyond security
  • Training datasets with Indian faces — consent for dataset inclusion
  • Liveness detection and identity verification — processing duration and retention
  • Real-time vs post-facto recognition — different risk profiles under DPDPA

Predictive Analytics & Decision Systems

Credit scoring, hiring tools, risk assessment
  • Automated decisions affecting individuals — Section 10 + Rule 14 algorithmic assessment
  • Profiling for credit, insurance, or employment — heightened scrutiny under DPDPA
  • Explainability requests under Section 11 right to information
  • Bias in training data leading to discriminatory outcomes — regulatory risk even without explicit DPDPA anti-discrimination provisions
  • Third-party data enrichment for predictions — consent chain verification

NLP & Voice AI

Voice assistants, transcription, language processing
  • Voice recordings as personal data — consent for recording, transcription, and model improvement
  • Speaker identification and diarisation — biometric processing implications
  • Multilingual data processing — Indian language data across 22 scheduled languages
  • Call centre AI — processing customer service calls for training without explicit consent
  • Voice cloning and deepfake — emerging risk without specific DPDPA provision

AI-as-a-Service Platforms

Cloud AI APIs, MLOps, model marketplaces
  • Processor vs Fiduciary classification: does the AI platform determine the purpose and means of processing, or does it only process data on a customer's behalf?
  • Customer data used for model improvement — purpose creep beyond service delivery
  • Multi-tenant data isolation — one customer's data contaminating another's model
  • API logging and monitoring — retention of input/output data containing personal data
  • Sub-processor chain — cloud provider → AI platform → customer → end user

5 DPDPA Compliance Pillars for AI Companies

Training Data Governance

Audit all training datasets for personal data. Implement consent mechanisms where feasible, anonymisation where not. Document lawful basis for every dataset. Maintain data lineage from collection through model training.

Section 6, Section 8(7)
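
As an illustration of the dataset audit described above, the following is a minimal sketch of a training data register. It assumes a simple in-memory list; the record fields, the `Treatment` categories, and the dataset names are hypothetical and are not drawn from DPDPA or any particular tool.

```python
from dataclasses import dataclass, field
from enum import Enum


class Treatment(Enum):
    """How personal data in a dataset has been handled (illustrative categories)."""
    CONSENT = "consent"            # consent obtained and recorded
    ANONYMISED = "anonymised"      # personal data removed before training
    UNDOCUMENTED = "undocumented"  # nothing recorded yet


@dataclass
class DatasetRecord:
    name: str
    source: str
    raw_contains_personal_data: bool
    treatment: Treatment = Treatment.UNDOCUMENTED
    used_in_models: list[str] = field(default_factory=list)  # lineage: dataset -> models


def datasets_needing_review(records):
    """Return names of datasets that hold personal data with no documented treatment."""
    return [
        r.name
        for r in records
        if r.raw_contains_personal_data and r.treatment is Treatment.UNDOCUMENTED
    ]


# Hypothetical register entries.
register = [
    DatasetRecord("support_chats_2023", "in-app chat export", True,
                  Treatment.CONSENT, ["intent-v2"]),
    DatasetRecord("web_crawl_in", "public web crawl", True),
    DatasetRecord("synthetic_dialogues", "generated in-house", False),
]

flagged = datasets_needing_review(register)
print(flagged)
```

A register of this kind gives legal and engineering teams a shared list of datasets that still lack a documented basis, and the `used_in_models` field records which models each dataset has fed.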

Algorithmic Assessment

SDFs must assess algorithmic processing that poses risk to data principals. Build assessment frameworks that evaluate bias, accuracy, and impact. Document assessment methodology and remediation steps.

Section 10, Rule 14
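
One element of a bias evaluation can be sketched as a comparison of approval rates across groups. The example below is illustrative only: the group labels, the sample decisions, and the 0.8 review threshold are assumptions chosen for demonstration, and DPDPA does not prescribe any particular metric or threshold.

```python
from collections import defaultdict

# Illustrative threshold only; not a figure taken from DPDPA or its Rules.
REVIEW_THRESHOLD = 0.8


def selection_rates(decisions):
    """Map each group to its approval rate, given (group, approved) pairs."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        if ok:
            approved[group] += 1
    return {g: approved[g] / totals[g] for g in totals}


def disparity_ratio(rates):
    """Lowest group rate divided by the highest; 1.0 means equal rates."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody approved in any group, so no disparity in rates
    return min(rates.values()) / highest


# Hypothetical decisions from a credit-scoring model.
sample = [
    ("A", True), ("A", True), ("A", False), ("A", True),
    ("B", True), ("B", False), ("B", False), ("B", False),
]

rates = selection_rates(sample)
ratio = disparity_ratio(rates)
needs_review = ratio < REVIEW_THRESHOLD
print(rates, round(ratio, 3), needs_review)
```

A single ratio is not an assessment in itself. It is one input that the documented methodology can record alongside accuracy and impact findings.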

Inference-Time Compliance

When AI models process personal data at inference (user queries, uploaded documents, API inputs), DPDPA applies to that processing event. Implement purpose limitation, retention controls, and processing records for inference data.

Section 8, Rule 6
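
The retention controls and processing records described above can be sketched as a per-request record tied to a declared purpose. The purposes and retention periods below are hypothetical placeholders; actual periods depend on the purpose notified to the data principal and on legal advice.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical purposes and retention periods; DPDPA does not prescribe these values.
RETENTION_BY_PURPOSE = {
    "chat_completion": timedelta(days=30),
    "abuse_monitoring": timedelta(days=90),
}


@dataclass(frozen=True)
class InferenceRecord:
    request_id: str
    purpose: str
    received_at: datetime

    @property
    def delete_after(self) -> datetime:
        """Time at which the stored input/output for this request should be deleted."""
        return self.received_at + RETENTION_BY_PURPOSE[self.purpose]


def record_inference(request_id, purpose, received_at):
    """Create a processing record, rejecting any purpose that was never declared."""
    if purpose not in RETENTION_BY_PURPOSE:
        raise ValueError(f"undeclared purpose: {purpose}")
    return InferenceRecord(request_id, purpose, received_at)


def due_for_deletion(records, now):
    """Return the request IDs whose retention period has elapsed."""
    return [r.request_id for r in records if now >= r.delete_after]


t0 = datetime(2025, 1, 1, tzinfo=timezone.utc)
records = [
    record_inference("req-1", "chat_completion", t0),
    record_inference("req-2", "abuse_monitoring", t0),
]
expired = due_for_deletion(records, t0 + timedelta(days=45))
print(expired)
```

Rejecting undeclared purposes at write time is one way to make purpose limitation a property of the logging path instead of a policy checked after the fact.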

Cross-Border Data Architecture

AI training often uses global datasets and cloud infrastructure. Map every cross-border data flow — training data transfers, model hosting location, inference routing, and data backup — against Section 16.

Section 16
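
A data flow map of the kind described above can start as a simple register. The sketch below assumes flows are tagged with ISO country codes and a flag for personal data; the flow names, the categories, and the `flows_needing_review` helper are hypothetical, and the output is a list for legal review, not a determination of Section 16 compliance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFlow:
    name: str
    kind: str          # e.g. "training_transfer", "model_hosting", "inference_routing", "backup"
    origin: str        # ISO 3166-1 alpha-2 country code
    destination: str   # ISO 3166-1 alpha-2 country code
    personal_data: bool


def flows_needing_review(flows, home="IN"):
    """Return names of flows that move personal data from the home country abroad."""
    return [
        f.name
        for f in flows
        if f.personal_data and f.origin == home and f.destination != home
    ]


# Hypothetical flows for an AI platform serving users in India.
flows = [
    DataFlow("fine-tuning corpus to GPU cluster", "training_transfer", "IN", "US", True),
    DataFlow("chat inference routing", "inference_routing", "IN", "SG", True),
    DataFlow("nightly backup", "backup", "IN", "IN", True),
    DataFlow("public benchmark download", "training_transfer", "US", "IN", False),
]

to_review = flows_needing_review(flows)
print(to_review)
```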

Erasure and Model Unlearning

Section 12 grants data principals the right to erasure. For AI companies, this raises the question of machine unlearning — can personal data be removed from a trained model? Implement practical approaches: retraining, data isolation, output filtering.

Section 12, Section 8(7)
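
Two of the practical approaches named above, data isolation and output filtering, can be sketched together. The example assumes training rows carry a `principal_id` field and that the identifiers to suppress are supplied with the erasure request; the class and field names are hypothetical. It does not remove anything from already-trained model weights, which would still require retraining or another unlearning technique.

```python
import re


class ErasureRegistry:
    """Tracks erasure requests for future training snapshots and output filtering."""

    def __init__(self):
        self._suppressed = set()

    def erase(self, principal_id, identifiers, training_rows):
        """Suppress the identifiers in outputs and return the rows that remain.

        The returned list excludes every row belonging to the data principal,
        so the next training snapshot is built without their data.
        """
        self._suppressed.update(s.lower() for s in identifiers)
        return [row for row in training_rows if row["principal_id"] != principal_id]

    def filter_output(self, text):
        """Redact suppressed identifiers from generated text, longest first."""
        for term in sorted(self._suppressed, key=len, reverse=True):
            text = re.sub(re.escape(term), "[redacted]", text, flags=re.IGNORECASE)
        return text


# Hypothetical training rows and erasure request.
rows = [
    {"principal_id": "p1", "text": "Asha Rao asked about her loan."},
    {"principal_id": "p2", "text": "Order status query."},
]

registry = ErasureRegistry()
remaining = registry.erase("p1", ["Asha Rao", "asha@example.com"], rows)
filtered = registry.filter_output("Contact Asha Rao at ASHA@example.com")
print(len(remaining), filtered)
```

One design tension is that the suppression list itself holds personal identifiers, so it needs its own access controls and retention decision.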

AI-Specific DPDPA Advisory

AI compliance sits at the intersection of technology architecture and legal interpretation. AMLEGALS brings 27 years of regulatory experience to DPDPA implementation for AI companies — from training data governance and algorithmic assessment to GenAI deployment and cross-border data architecture.

Insights & Answers

What practitioners and boards are asking

How does DPDPA apply to AI and machine learning companies in India?

DPDPA applies to AI companies at every stage of the ML pipeline: data collection, training, inference, and deployment. If training data contains personal data of individuals in India, DPDPA's consent and purpose limitation requirements apply. Section 10 (Significant Data Fiduciary) combined with Rule 14 (algorithmic assessment) creates a de facto AI governance framework. The right to erasure under Section 12 raises machine unlearning questions. Cross-border training data transfers must comply with Section 16. AMLEGALS advises AI companies on training data governance, algorithmic assessment, inference-time compliance, and GenAI deployment frameworks.