AI Sector — DPDPA Compliance

DPDPA for AI, Machine Learning & GenAI Companies

Q: Does DPDPA require algorithmic transparency for AI systems?

DPDPA does not mandate algorithmic explainability directly, but Significant Data Fiduciaries must conduct algorithmic assessment of processing that "may pose a risk to data principals" under Section 10 read with Rule 13. This effectively creates a transparency obligation for high-risk AI systems operated by SDFs. Additionally, the right to information under Section 11 means data principals can request details about how their data is being processed, which includes AI-based processing. The combination of Section 10, Section 11, and Rule 13 creates a de facto algorithmic governance framework.

Q: How does DPDPA handle AI training data and web scraping?

DPDPA applies to personal data in training datasets regardless of how it was collected. Web scraping of personal data for AI training requires a lawful basis under DPDPA: either consent under Section 6 (impractical at scale) or one of the "certain legitimate uses" listed in Section 7 (narrow). Publicly available data does not automatically become free to use under DPDPA; the purpose limitation principle still applies. AI companies must audit training datasets for personal data, implement data minimisation, and consider anonymisation techniques that take the data outside DPDPA's definition of personal data.

AI companies face a fundamental tension — the technology depends on large-scale data processing, but DPDPA introduces purpose limitation, consent, and data principal rights at every stage of the ML pipeline. Compliance requires rethinking data architecture from collection through deployment.

Training Data | SDF Assessment | Algorithmic Audit | GenAI Risk | Section 10 + Rule 13

DPDPA does not contain AI-specific provisions — unlike the EU AI Act. But Section 10 (Significant Data Fiduciary), Rule 13 (algorithmic assessment), and Section 11 (right to information) combine to create a de facto AI governance framework. Any AI company processing personal data at scale must navigate this framework.

Sub-Sector Analysis

DPDPA Challenges by AI Sub-Sector

Large Language Models (LLMs) & GenAI

Foundation models, chatbots, content generation

›Training data containing personal data scraped from Indian websites — consent at scale is impractical
›Generated content that reproduces personal data from training sets — data leakage liability
›User prompts as personal data — conversation logs, uploaded documents, context windows
›Cross-border transfer of training data and model weights — Section 16 compliance
›SDF classification for GenAI platforms with millions of Indian users
›Right to erasure — can personal data be removed from a trained model? Technical and legal ambiguity

Computer Vision & Facial Recognition

Surveillance, authentication, image processing

›Facial data as biometric personal data — consent requirements for collection and matching
›CCTV and surveillance deployments — purpose limitation beyond security
›Training datasets with Indian faces — consent for dataset inclusion
›Liveness detection and identity verification — processing duration and retention
›Real-time vs post-facto recognition — different risk profiles under DPDPA

Predictive Analytics & Decision Systems

Credit scoring, hiring tools, risk assessment

›Automated decisions affecting individuals — Section 10 + Rule 13 algorithmic assessment
›Profiling for credit, insurance, or employment — heightened scrutiny under DPDPA
›Explainability requests under Section 11 right to information
›Bias in training data leading to discriminatory outcomes — regulatory risk even without explicit DPDPA anti-discrimination provisions
›Third-party data enrichment for predictions — consent chain verification

NLP & Voice AI

Voice assistants, transcription, language processing

›Voice recordings as personal data — consent for recording, transcription, and model improvement
›Speaker identification and diarisation — biometric processing implications
›Multilingual data processing — Indian language data across 22 scheduled languages
›Call centre AI — processing customer service calls for training without explicit consent
›Voice cloning and deepfake — emerging risk without specific DPDPA provision

AI-as-a-Service Platforms

Cloud AI APIs, MLOps, model marketplaces

›Processor vs Fiduciary classification: does the AI platform determine the purpose and means of processing, or does it only process data on a customer's behalf?
›Customer data used for model improvement — purpose creep beyond service delivery
›Multi-tenant data isolation — one customer's data contaminating another's model
›API logging and monitoring — retention of input/output data containing personal data
›Sub-processor chain — cloud provider → AI platform → customer → end user

5 DPDPA Compliance Pillars for AI Companies

Training Data Governance

Audit all training datasets for personal data. Implement consent mechanisms where feasible, anonymisation where not. Document lawful basis for every dataset. Maintain data lineage from collection through model training.

Section 6, Section 8(7)
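
As a rough illustration of the audit and lineage steps described above, the sketch below scans text records for two kinds of identifier and flags datasets that contain personal data but have no documented lawful basis. The pattern set, the `DatasetAudit` fields, and the `"undocumented"` label are assumptions made for this example; a real audit would need much broader detection and legal review.

```python
import re
from dataclasses import dataclass, field

# Illustrative patterns only; a real audit needs far broader coverage
# (names, addresses, government ID formats, free-text identifiers).
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "in_mobile": re.compile(r"(?<!\d)(?:\+91[\s-]?)?[6-9]\d{9}(?!\d)"),
}


@dataclass
class DatasetAudit:
    name: str          # hypothetical dataset identifier
    source: str        # where the data came from (lineage)
    lawful_basis: str  # e.g. "consent", "anonymised", "undocumented"
    hits: dict = field(default_factory=dict)


def audit_dataset(name, source, lawful_basis, records):
    """Count how many records match each personal-data pattern."""
    audit = DatasetAudit(name, source, lawful_basis)
    for kind, pattern in PATTERNS.items():
        audit.hits[kind] = sum(1 for text in records if pattern.search(text))
    return audit


def needs_review(audit):
    """Flag datasets that contain personal data but lack a documented basis."""
    return any(audit.hits.values()) and audit.lawful_basis == "undocumented"
```

Running `audit_dataset` over every dataset and storing the resulting records gives a simple inventory that can be handed to counsel alongside the lineage documentation.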

Algorithmic Assessment

SDFs must assess algorithmic processing that poses risk to data principals. Build assessment frameworks that evaluate bias, accuracy, and impact. Document assessment methodology and remediation steps.

Section 10, Rule 13
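
One ingredient of a bias evaluation can be sketched as a comparison of outcome rates across groups. The example below is a minimal sketch, assuming decisions are available as `(group, approved)` pairs; the function names are hypothetical, and neither DPDPA nor the Rules prescribe this metric or any particular threshold.

```python
from collections import defaultdict


def selection_rates(decisions):
    """decisions: iterable of (group, approved) pairs -> approval rate per group."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        approved[group] += int(ok)
    return {g: approved[g] / totals[g] for g in totals}


def disparity_ratio(rates):
    """Lowest group rate divided by the highest; 1.0 means equal rates."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest else 1.0
```

A ratio well below 1.0 is a prompt for further investigation, not a legal conclusion; the assessment record should explain which metric was chosen and why.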

Inference-Time Compliance

When AI models process personal data at inference (user queries, uploaded documents, API inputs), DPDPA applies to that processing event. Implement purpose limitation, retention controls, and processing records for inference data.

Section 8, Rule 6
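
The retention and record-keeping controls mentioned above might look something like the following sketch. The purposes, the retention periods in `RETENTION`, and the `InferenceRecord` shape are all assumptions for illustration; actual retention periods are a legal decision, not an engineering default.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical retention periods per declared purpose.
RETENTION = {
    "service_delivery": timedelta(days=30),
    "abuse_monitoring": timedelta(days=90),
}


@dataclass
class InferenceRecord:
    request_id: str
    purpose: str
    received_at: datetime
    payload: str  # the prompt or uploaded text, which may hold personal data

    @property
    def expires_at(self):
        return self.received_at + RETENTION[self.purpose]


def log_inference(store, request_id, purpose, payload, now=None):
    """Record an inference event, rejecting purposes that were never declared."""
    if purpose not in RETENTION:
        raise ValueError(f"undeclared purpose: {purpose}")
    now = now or datetime.now(timezone.utc)
    store.append(InferenceRecord(request_id, purpose, now, payload))


def purge_expired(store, now=None):
    """Drop records past their retention window; return the number removed."""
    now = now or datetime.now(timezone.utc)
    kept = [r for r in store if r.expires_at > now]
    removed = len(store) - len(kept)
    store[:] = kept
    return removed
```

Rejecting undeclared purposes at write time is one way to make purpose limitation a property of the logging path itself, so that using inference data for model training requires an explicit, reviewable change.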

Cross-Border Data Architecture

AI training often uses global datasets and cloud infrastructure. Map every cross-border data flow — training data transfers, model hosting location, inference routing, and data backup — against Section 16.

Section 16
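
A flow inventory of this kind can be kept as structured data and checked mechanically. The sketch below assumes a hypothetical `DataFlow` record and a `restricted` set of destination codes supplied by the compliance team; the set stands in for whatever restrictions are notified under Section 16 and contains no real country list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFlow:
    name: str                     # e.g. "training data transfer"
    origin: str                   # country code the data leaves from
    destination: str              # country code where it lands
    contains_personal_data: bool


def flag_flows(flows, restricted):
    """Return cross-border personal-data flows whose destination is restricted."""
    return [
        f for f in flows
        if f.contains_personal_data
        and f.origin != f.destination
        and f.destination in restricted
    ]
```

Keeping training transfers, model hosting, inference routing, and backups in the same inventory means a change to the restricted list can be re-checked against every flow at once.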

Erasure and Model Unlearning

Section 12 grants data principals the right to erasure. For AI companies, this raises the question of machine unlearning — can personal data be removed from a trained model? Implement practical approaches: retraining, data isolation, output filtering.

Section 12, Section 8(7)
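
Two of the practical approaches named above, data isolation before retraining and output filtering, can be sketched with a simple suppression registry. This is a minimal sketch under stated assumptions: the `ErasureRegistry` class is hypothetical, identifiers are assumed to be email addresses, and it does not remove anything from an already trained model.

```python
import hashlib
import re

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def fingerprint(identifier):
    """Hash a normalised identifier so the registry avoids storing raw values."""
    return hashlib.sha256(identifier.strip().lower().encode("utf-8")).hexdigest()


class ErasureRegistry:
    def __init__(self):
        self._erased = set()

    def record_request(self, identifier):
        self._erased.add(fingerprint(identifier))

    def is_erased(self, identifier):
        return fingerprint(identifier) in self._erased

    def filter_training_rows(self, rows, id_field="email"):
        """Data isolation: drop erased principals before the next training run."""
        return [r for r in rows if not self.is_erased(r[id_field])]

    def filter_output(self, text):
        """Output filtering: redact erased email addresses in generated text."""
        return EMAIL.sub(
            lambda m: "[removed]" if self.is_erased(m.group()) else m.group(),
            text,
        )
```

Output filtering only masks what the model emits; whether that satisfies an erasure request, or whether retraining is required, remains the open legal question described above.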

AI-Specific DPDPA Advisory

AI compliance sits at the intersection of technology architecture and legal interpretation. AMLEGALS brings 27 years of legal experience to DPDPA implementation for AI companies — from training data governance and algorithmic assessment to GenAI deployment and cross-border data architecture.

Insights & Answers

What practitioners and boards are asking

How does DPDPA apply to AI and machine learning companies in India?

DPDPA applies to AI companies at every stage of the ML pipeline: data collection, training, inference, and deployment. If training data contains personal data of individuals in India, DPDPA's consent and purpose limitation requirements apply. Section 10 (Significant Data Fiduciary) combined with Rule 13 (algorithmic due diligence) creates a de facto AI governance framework. The right to erasure under Section 12 raises machine unlearning questions. Cross-border training data transfers must comply with Section 16. AMLEGALS advises AI companies on training data governance, algorithmic assessment, inference-time compliance, and GenAI deployment frameworks.