Expression Measurement

Measure nuanced human expression across face, voice, and language with Hume's multimodal models.

Hume’s Expression Measurement API captures hundreds of dimensions of human expression from audio, video, images, and text. Built on over a decade of research in computational emotion science, these models go beyond basic sentiment to measure subtle expressions like admiration, awe, empathic pain, and dozens more.

Expressions are complex and multifaceted. They should not be treated as direct inferences of emotional experience. To learn more about the science behind expression measurement, visit the About the science page.

Quickstart

Get up and running with Expression Measurement. Each guide walks you through both the Batch API and the Streaming API, from setup to your first predictions.

Models

Expression Measurement provides a suite of models, each designed for a different modality. You can run multiple models simultaneously on the same input.

Each model produces its own set of predictions independently. When you include multiple models in a single job, the response contains separate results for each model.

For example, submitting an audio file with the prosody, vocal burst, and language models enabled returns three distinct sets of predictions, one per model, each with scores for every expression that model measures.
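
As a rough illustration of the example above, the sketch below builds a request body that enables three models on one audio file. The field names `urls` and `models`, the model keys `prosody`, `burst`, and `language`, and the helper `build_job_payload` are assumptions made for this sketch, not confirmed details of the API; check the individual model guides for the actual configuration schema.

```python
import json


def build_job_payload(urls, models):
    """Build a request body that enables each named model with default settings.

    The "urls" and "models" field names are assumptions for illustration.
    """
    if not urls:
        raise ValueError("at least one URL is required")
    if not models:
        raise ValueError("at least one model is required")
    return {
        "urls": list(urls),
        # An empty dict stands for "use this model's default configuration".
        "models": {name: {} for name in models},
    }


payload = build_job_payload(
    ["https://example.com/interview.wav"],  # placeholder URL
    ["prosody", "burst", "language"],       # assumed model keys
)
print(json.dumps(payload, indent=2))
```

Under these assumptions, the response for such a job would carry one set of predictions per key in `models`.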

All expression models share a common output format: each expression is assigned a score indicating the degree to which a human rater would identify that expression in the given sample.
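
Because every model scores expressions in the same way, a single helper can rank the output of any of them. The sketch below assumes each prediction exposes its scores as a list of `name`/`score` pairs; that shape, the helper `top_expressions`, and the sample values are illustrative assumptions, not real model output.

```python
def top_expressions(scores, k=3):
    """Return the k highest-scoring expressions, strongest first."""
    return sorted(scores, key=lambda item: item["score"], reverse=True)[:k]


# Made-up scores for illustration only; not produced by any model.
sample = [
    {"name": "Admiration", "score": 0.12},
    {"name": "Awe", "score": 0.47},
    {"name": "Empathic Pain", "score": 0.05},
]

for item in top_expressions(sample, k=2):
    print(f'{item["name"]}: {item["score"]:.2f}')
```

A high score here means a human rater would likely identify that expression in the sample; it is not a direct inference of what the person feels.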

See the individual model guides for job configuration options, output details, and the full list of measured expressions.

Batch API vs Streaming API

Expression Measurement is available through two APIs, each designed for different workflows.

| API | Protocol | Best for | How it works |
| --- | --- | --- | --- |
| Batch | REST | Processing files at scale, such as research datasets, media libraries, and recorded content. | Submit a job with URLs or files, then retrieve predictions when processing completes. |
| Streaming | WebSocket | Real-time analysis, such as live webcam feeds, microphone input, and interactive applications. | Open a persistent connection and send data continuously for immediate predictions. |

Both APIs provide access to the same set of models and return predictions in the same format. Some job configuration options differ between the two APIs; see the individual model guides for details.
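
The Batch workflow (submit a job, then retrieve predictions once it completes) usually comes down to a polling loop. The sketch below keeps the HTTP call out of the picture: `get_status` is a caller-supplied function, and the status names `QUEUED`, `IN_PROGRESS`, `COMPLETED`, and `FAILED` are assumptions to confirm against the Batch API reference.

```python
import time

# Assumed terminal status names; confirm against the Batch API reference.
TERMINAL_STATUSES = {"COMPLETED", "FAILED"}


def wait_for_job(get_status, poll_interval=5.0, timeout=3600.0, sleep=time.sleep):
    """Call get_status() until it returns a terminal status or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout} seconds")
        sleep(poll_interval)


# Usage with a stand-in status source instead of a real HTTP request.
fake_statuses = iter(["QUEUED", "IN_PROGRESS", "COMPLETED"])
final_status = wait_for_job(lambda: next(fake_statuses), poll_interval=0.0)
print(final_status)
```

A five-second interval means about 12 status checks per minute for each job, which leaves room under the Batch request rate limit when only a few jobs are polled at once.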

Glossary

| Term | Definition |
| --- | --- |
| Job | A batch processing request. You submit a job with files or URLs, and retrieve predictions once the job completes. |
| Prediction | The output of a model for a given input. Contains scores for each detected expression. |
| Expression score | A value indicating the degree to which a human rater would identify a particular expression in a sample. Higher scores indicate stronger presence of the expression. |
| Granularity | The level at which predictions are generated. The Batch API supports word, sentence, utterance, and conversational_turn. The Streaming API supports word, sentence, utterance, and passage. Applies to the language and prosody models. |
| FACS | Facial Action Coding System. An optional output of the face model that provides action unit measurements (e.g., “Inner brow raise”, “Lip corner puller”). |
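
Since the two APIs accept slightly different granularity values, a small client-side check can catch a mismatch before a request is sent. The values below come from the Granularity entry above; the helper `check_granularity` and the API labels `"batch"` and `"streaming"` are hypothetical names used only in this sketch.

```python
SUPPORTED_GRANULARITIES = {
    "batch": {"word", "sentence", "utterance", "conversational_turn"},
    "streaming": {"word", "sentence", "utterance", "passage"},
}


def check_granularity(api, granularity):
    """Return the granularity unchanged, or raise ValueError if the API lacks it."""
    if api not in SUPPORTED_GRANULARITIES:
        raise ValueError(f"unknown API {api!r}; expected 'batch' or 'streaming'")
    allowed = SUPPORTED_GRANULARITIES[api]
    if granularity not in allowed:
        raise ValueError(
            f"{granularity!r} is not supported by the {api} API; "
            f"choose one of {sorted(allowed)}"
        )
    return granularity


print(check_granularity("batch", "conversational_turn"))
print(check_granularity("streaming", "passage"))
```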

Developer tools

Hume provides a suite of developer tools for integrating Expression Measurement.

API limits

Batch API

| Limit | Value |
| --- | --- |
| Request rate limit | 50 requests per minute |
| Maximum queued jobs | 500 |
| Maximum file size (remote URL) | 1 GB |
| Maximum file size (local upload) | 100 MB |
| Maximum audio/video duration | 3 hours |
| Maximum text input size | 255 MB per string |
| Maximum items per request | 100 URLs, 100 text strings, and 100 local files (independently counted) |
| Supported archive formats | .zip, .tar.gz, .tar.bz2, .tar.xz |

When submitting files by URL, you can use signed URLs from your cloud storage provider to keep files private. See documentation for GCP, AWS, or Azure.
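
Several of the Batch limits can be checked locally before a job is submitted. The sketch below covers the per-request item counts, the local upload size, and the supported archive extensions. The helper names are hypothetical, and treating 1 MB as 1,000,000 bytes is an assumption chosen because it is the stricter reading.

```python
MAX_ITEMS_PER_KIND = 100               # URLs, text strings, and local files are counted separately
MAX_LOCAL_FILE_BYTES = 100 * 10**6     # 100 MB, assuming decimal megabytes
ARCHIVE_SUFFIXES = (".zip", ".tar.gz", ".tar.bz2", ".tar.xz")


def is_supported_archive(filename):
    """Return True if the filename ends with a supported archive extension."""
    return filename.lower().endswith(ARCHIVE_SUFFIXES)


def check_batch_request(urls=(), texts=(), file_sizes=()):
    """Return a list of limit violations; an empty list means none were found.

    file_sizes holds the size in bytes of each local file to upload.
    """
    problems = []
    kinds = (("URLs", urls), ("text strings", texts), ("local files", file_sizes))
    for label, items in kinds:
        if len(items) > MAX_ITEMS_PER_KIND:
            problems.append(f"{len(items)} {label} exceeds the limit of {MAX_ITEMS_PER_KIND}")
    for index, size in enumerate(file_sizes):
        if size > MAX_LOCAL_FILE_BYTES:
            problems.append(f"local file {index} is {size} bytes, over the 100 MB upload limit")
    return problems


print(check_batch_request(urls=["https://example.com/a.mp4"] * 101))
print(is_supported_archive("recordings.tar.gz"))
```

Limits that depend on server-side state, such as the request rate and the number of queued jobs, cannot be verified this way and still need error handling around the request itself.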

Streaming API

| Limit | Value |
| --- | --- |
| Connection timeout (inactivity) | 1 minute |
| Request rate limit (WebSocket handshake) | 50 requests per second |
| Maximum audio/video payload | 5 seconds |
| Maximum image size | 3,000 x 3,000 pixels |
| Maximum text payload | 10,000 characters |
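
Inputs longer than the Streaming payload limits have to be split on the client before they are sent. The sketch below shows one naive way to do that for text and for a sequence of raw audio samples. The helper names are hypothetical, and it assumes that Python's string length matches how the API counts characters and that the audio is available as a flat sequence of samples at a known sample rate.

```python
MAX_TEXT_CHARS = 10_000
MAX_MEDIA_SECONDS = 5


def split_text(text, limit=MAX_TEXT_CHARS):
    """Split text into consecutive pieces of at most `limit` characters."""
    return [text[i:i + limit] for i in range(0, len(text), limit)]


def split_samples(samples, sample_rate, max_seconds=MAX_MEDIA_SECONDS):
    """Split a sequence of audio samples into chunks of at most max_seconds."""
    step = sample_rate * max_seconds
    return [samples[i:i + step] for i in range(0, len(samples), step)]


text_chunks = split_text("a" * 25_000)
print([len(chunk) for chunk in text_chunks])

# Twelve seconds of silence at an assumed 16 kHz sample rate.
audio_chunks = split_samples([0] * (16_000 * 12), sample_rate=16_000)
print([len(chunk) for chunk in audio_chunks])
```

Fixed-width splitting can cut a word or an utterance in half, so a real client would likely prefer to break at sentence boundaries or pauses that fall within the limit.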