Facial expression
The facial expression model measures 48 dimensions of emotional expression from facial movements in images and video.
It detects subtle expressions often associated with emotions such as admiration, awe, and empathic pain.
Recommended input filetypes: .png, .jpeg, .mp4.
You can optionally enable two additional outputs alongside emotion scores: FACS 2.0 action units and facial descriptions. The facemesh model can also be run alongside facial expression to capture detailed face geometry.
Job configuration
The face model is configurable in both Batch API and Streaming API jobs.
Example job configuration
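A minimal batch job configuration might look like the following. The parameter names shown (fps_pred, prob_threshold, identify_faces, min_face_size, facs, descriptions) are illustrative of the face model's options; confirm the exact names and defaults against the API reference for your API version.

```json
{
  "models": {
    "face": {
      "fps_pred": 3,
      "prob_threshold": 0.99,
      "identify_faces": true,
      "min_face_size": 60,
      "facs": {},
      "descriptions": {}
    }
  },
  "urls": ["https://example.com/video.mp4"]
}
```

Passing an empty object for facs or descriptions enables those optional outputs; omitting them returns emotion scores only.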
Output
Each prediction includes:
- Bounding box: the x, y, w, h coordinates of the detected face
- Detection confidence: the probability that the detection is a face
- Face ID: a consistent identifier when identify_faces is enabled
- Emotion scores: scores for each of the 48 expressions
- FACS scores: action unit scores when facs is enabled
- Description scores: facial description scores when descriptions is enabled
When identify_faces is enabled, each group’s id is replaced with a unique face identifier that persists across
frames, allowing you to track the same individual throughout a video.
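As a sketch of working with this output, the snippet below ranks the emotion scores for a single detected face. The field names ("emotions", "name", "score", "box", "prob") mirror the structure described above but are assumptions; check the actual response payload for your job before relying on them.

```python
def top_emotions(prediction, k=3):
    """Return the k highest-scoring emotions as (name, score) pairs."""
    ranked = sorted(prediction["emotions"], key=lambda e: e["score"], reverse=True)
    return [(e["name"], e["score"]) for e in ranked[:k]]

# Hypothetical prediction for one detected face, for illustration only.
sample = {
    "box": {"x": 12.0, "y": 40.5, "w": 110.0, "h": 114.0},
    "prob": 0.998,
    "emotions": [
        {"name": "Admiration", "score": 0.07},
        {"name": "Awe", "score": 0.31},
        {"name": "Calmness", "score": 0.22},
    ],
}

print(top_emotions(sample, k=2))
```

Because every expression receives a score, sorting like this is usually more informative than thresholding a single value.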
Expressions
The facial expression model measures the following 48 expressions. Each score indicates the degree to which a human rater would identify that expression in the given sample.
FACS 2.0
The Facial Action Coding System (FACS) measures individual facial muscle movements called action units. When facs is
enabled, predictions include intensity scores for each of the following action units.
Facial descriptions
When descriptions is enabled, predictions include scores for high-level facial descriptions. These provide an
intuitive, human-readable summary of the face’s appearance.
Facemesh
The facemesh model detects 478 facial landmark points, providing detailed face geometry data. This is useful for applications that need precise face shape analysis beyond emotional expression.
Facemesh has no configurable parameters. Enable it alongside the face model by passing an empty object:
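For example, a job configuration enabling both models might look like the following (the overall shape is illustrative; confirm it against the API reference):

```json
{
  "models": {
    "face": {},
    "facemesh": {}
  }
}
```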
Each facemesh prediction includes an array of 478 3D facial landmark coordinates for each detected face.

