Representation learning
Official definition
A machine learning technique where the system automatically discovers the representations or features needed for a task from raw data, rather than relying on manually engineered features.
Source: AIEOG AI Lexicon (Feb 2026), adapted from arXiv:1206.5538 and NIST AI 100-1
What representation learning means in plain language
Representation learning is a technique where an AI system learns to create its own internal representation of data rather than relying on humans to define what features matter. In traditional machine learning, data scientists manually select and engineer the features (variables) a model uses. In representation learning, the model discovers useful features automatically from raw data.
For example, instead of manually defining that a fraud model should look at transaction amount, time of day, and merchant category, a representation learning system would analyze raw transaction data and discover which patterns and combinations are most useful for detecting fraud — potentially identifying patterns that humans would not have thought to look for.
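The idea can be sketched with a toy linear autoencoder: the model is forced to compress raw data through a small bottleneck, and the bottleneck activations become the learned features. Everything below (the data, dimensions, and learning rate) is invented for illustration and is not a production fraud model.

```python
import numpy as np

# Illustrative only: a one-hidden-layer linear autoencoder that compresses
# 8-dimensional "raw transactions" into a 3-dimensional learned representation.
# No feature is hand-picked; the bottleneck is discovered by gradient descent.
rng = np.random.default_rng(0)

# Synthetic raw data: 200 samples whose 8 columns are driven by 3 latent factors.
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 8))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 8))

W_enc = rng.normal(scale=0.1, size=(8, 3))   # encoder: raw data -> representation
W_dec = rng.normal(scale=0.1, size=(3, 8))   # decoder: representation -> raw data

lr = 0.01
for _ in range(2000):
    Z = X @ W_enc            # learned representation (the "features")
    X_hat = Z @ W_dec        # reconstruction from the representation
    err = X_hat - X
    # Gradients of the mean squared reconstruction error
    grad_dec = Z.T @ err / len(X)
    grad_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

mse = float(np.mean((X @ W_enc @ W_dec - X) ** 2))
print(f"reconstruction MSE: {mse:.4f}")
```

A low reconstruction error shows that three learned features capture most of what is in the eight raw columns, even though no human named those features.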
Deep learning is the most prominent form of representation learning: neural networks with multiple layers learn increasingly abstract representations at each layer, from simple patterns in early layers to complex concepts in deeper ones. Word embeddings (such as Word2Vec vectors or BERT representations) are another example: the model learns meaningful numerical representations of words from the contexts in which they appear.
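The word-embedding idea can be sketched without a neural network at all, using the classic count-based route: build a co-occurrence matrix from text and factor it with SVD. The corpus below is invented for illustration; Word2Vec and BERT learn richer representations, but the core idea is the same: the numbers come from context, not from a human assigning them.

```python
import numpy as np

# Illustrative sketch (not Word2Vec itself): word vectors from a
# co-occurrence matrix factored with SVD.
corpus = [
    "the card payment was flagged as fraud",
    "the card payment was approved",
    "the wire transfer was flagged as fraud",
    "the wire transfer was approved",
]
tokens = [sentence.split() for sentence in corpus]
vocab = sorted({w for sent in tokens for w in sent})
index = {w: i for i, w in enumerate(vocab)}

# Count co-occurrences within a +/- 2-word window.
C = np.zeros((len(vocab), len(vocab)))
for sent in tokens:
    for i, w in enumerate(sent):
        for j in range(max(0, i - 2), min(len(sent), i + 3)):
            if j != i:
                C[index[w], index[sent[j]]] += 1

# Keep the top 3 singular directions as 3-dimensional word vectors.
U, S, _ = np.linalg.svd(C)
vectors = U[:, :3] * S[:3]

def similarity(a: str, b: str) -> float:
    """Cosine similarity between two learned word vectors."""
    va, vb = vectors[index[a]], vectors[index[b]]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

# Words used in similar contexts tend to end up with similar vectors.
print("card ~ wire:", round(similarity("card", "wire"), 2))
print("card ~ fraud:", round(similarity("card", "fraud"), 2))
```

Here "card" and "wire" appear in near-identical contexts, so their learned vectors end up close together, while "card" and "fraud" do not.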
Why it matters in financial services
Representation learning powers many of the most capable AI systems in financial services, but it creates a governance tension: the features the model learns are often not directly interpretable by humans.
This opacity creates challenges for explainability (how do you explain a decision based on learned features with no intuitive meaning?), bias detection (how do you identify bias in features you cannot see or name?), and validation (how do you assess whether learned representations are appropriate and stable?).
For compliance teams, the key question is whether the model’s learned representations align with regulatory expectations for transparency and fairness.
Key considerations for compliance teams
- Assess explainability impact. Determine whether representation learning creates explainability gaps for your specific use case.
- Use interpretability tools. Apply feature visualization, probing classifiers, and other tools to understand what learned representations capture.
- Test for proxy discrimination. Learned representations can encode protected characteristics indirectly. Test for disparate impact.
- Validate representation stability. Monitor whether learned representations change over time or with new data.
- Document the approach. Record why representation learning was chosen, what data was used, and what interpretability limitations exist.
- Include in model risk assessment. The use of representation learning should be a factor in the model’s risk classification.
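The probing and proxy-discrimination checks above can be combined in one simple test: train a classifier to recover a protected attribute from the model's learned representation. If it succeeds well above chance, the representation encodes that attribute even though no column names it. The representation and protected flag below are synthetic and purely illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical probing check: can a simple classifier recover a protected
# attribute from a learned representation? Accuracy well above chance means
# the representation encodes the attribute (a proxy-discrimination risk).
rng = np.random.default_rng(1)
n = 1000
protected = rng.integers(0, 2, size=n)   # synthetic protected-group flag

# Synthetic 16-dimensional "learned representation": mostly noise, but two
# dimensions leak the protected attribute.
Z = rng.normal(size=(n, 16))
Z[:, 0] += 1.5 * protected
Z[:, 1] -= 1.0 * protected

# Fit the probe on 800 samples, evaluate on the held-out 200.
probe = LogisticRegression().fit(Z[:800], protected[:800])
acc = probe.score(Z[800:], protected[800:])
print(f"probe accuracy: {acc:.2f} (chance = 0.50)")
```

A probe accuracy near 0.5 suggests the attribute is not linearly recoverable; an accuracy like the one above is a flag for follow-up disparate-impact testing, not proof of discrimination on its own.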
