Emotional language

The emotional language model measures 53 dimensions of emotional expression from the meaning and tone of text. It measures 5 expressions that the other models do not: Annoyance, Disapproval, Enthusiasm, Gratitude, and Sarcasm. Recommended input file types: .txt, .mp3, .wav, .mp4.

You can optionally enable sentiment analysis and toxicity detection alongside emotion scores. The NER model can also be run alongside the emotional language model to identify named entities in text.

Job configuration

Parameter	Type	Default	Description
`granularity`	string	`word`	Level at which predictions are generated. See Granularity for available values.
`identify_speakers`	boolean	`false`	When enabled, identifies and labels different speakers in transcribed audio. Batch API only.
`sentiment`	object	—	Include this field to enable sentiment analysis. Returns a distribution over a 9-point scale.
`toxicity`	object	—	Include this field to enable toxicity detection. Returns scores for 6 categories.

Example job configuration

$ curl -X POST "https://api.hume.ai/v0/batch/jobs" \
>   -H "X-Hume-Api-Key: <YOUR_API_KEY>" \
>   -H "Content-Type: application/json" \
>   -d '{
>     "models": {
>       "language": {
>         "granularity": "sentence",
>         "sentiment": {},
>         "toxicity": {}
>       }
>     },
>     "urls": ["https://example.com/audio.mp3"]
>   }'

Output

Each prediction includes:

Text: the analyzed text segment
Position: the begin and end character indices
Emotion scores: scores for each of the 53 expressions
Sentiment: distribution over the 9-point scale (when enabled)
Toxicity: scores for each toxicity category (when enabled)

{
  "grouped_predictions": [
    {
      "id": "unknown",
      "predictions": [
        {
          "text": "I'm so happy to see you",
          "position": {
            "begin": 0,
            "end": 23
          },
          "emotions": [
            { "name": "Admiration", "score": 0.107 },
            { "name": "Joy", "score": 0.482 },
            ...
          ],
          "sentiment": [
            { "name": "1", "score": 0.01 },
            ...
            { "name": "9", "score": 0.04 }
          ],
          "toxicity": [
            { "name": "toxic", "score": 0.001 },
            ...
          ]
        }
      ]
    }
  ]
}
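
As a rough illustration of reading this output, the following Python sketch ranks the emotion scores for each prediction. The helper name `top_emotions` is hypothetical, and the sample contains only the fields shown above with the elided entries left out; a full API response may nest this object inside additional layers that are not covered here.

```python
import json

# Sample shaped like the output above; elided entries ("...") are omitted.
SAMPLE = json.loads("""
{
  "grouped_predictions": [
    {
      "id": "unknown",
      "predictions": [
        {
          "text": "I'm so happy to see you",
          "position": {"begin": 0, "end": 23},
          "emotions": [
            {"name": "Admiration", "score": 0.107},
            {"name": "Joy", "score": 0.482}
          ]
        }
      ]
    }
  ]
}
""")


def top_emotions(result, k=3):
    """Yield (text, [(name, score), ...]) with the k highest-scoring emotions.

    Hypothetical helper, not part of any SDK.
    """
    for group in result["grouped_predictions"]:
        for pred in group["predictions"]:
            ranked = sorted(pred["emotions"], key=lambda e: e["score"], reverse=True)
            yield pred["text"], [(e["name"], e["score"]) for e in ranked[:k]]


for text, emotions in top_emotions(SAMPLE, k=1):
    print(text, emotions)
```

Sorting by score is only one way to summarize 53 scores per segment; filtering by a minimum score is another option, depending on the use case.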

Granularity

The granularity parameter controls how text is segmented before predictions are generated.

Value	API	Description
`word`	Both	One prediction per word. Provides the most detailed resolution. This is the default.
`sentence`	Both	One prediction per sentence.
`utterance`	Both	One prediction per utterance: a continuous segment of text delimited by pauses or punctuation.
`conversational_turn`	Batch	One prediction per speaker turn. Requires `identify_speakers` to be enabled.
`passage`	Streaming	One prediction for the entire text of the streaming payload.
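
The constraints in this table can be checked on the client side before a job is submitted. The sketch below is a minimal example under that assumption: `GRANULARITY_APIS` and `language_config` are hypothetical names, the `"batch"` and `"streaming"` labels are illustrative, and the rules encoded are only those stated in the tables on this page.

```python
# Granularity values and the APIs that accept them, copied from the table above.
GRANULARITY_APIS = {
    "word": {"batch", "streaming"},
    "sentence": {"batch", "streaming"},
    "utterance": {"batch", "streaming"},
    "conversational_turn": {"batch"},
    "passage": {"streaming"},
}


def language_config(granularity="word", api="batch", identify_speakers=False):
    """Return a `language` config dict, or raise ValueError for an invalid combination.

    Hypothetical helper; it only mirrors the rules documented on this page.
    """
    if granularity not in GRANULARITY_APIS:
        raise ValueError(f"unknown granularity: {granularity!r}")
    if api not in GRANULARITY_APIS[granularity]:
        raise ValueError(f"{granularity!r} is not available in the {api} API")
    if granularity == "conversational_turn" and not identify_speakers:
        raise ValueError("conversational_turn requires identify_speakers")
    if identify_speakers and api != "batch":
        raise ValueError("identify_speakers is Batch API only")
    config = {"granularity": granularity}
    if identify_speakers:
        config["identify_speakers"] = True
    return config


print(language_config("sentence"))
print(language_config("conversational_turn", identify_speakers=True))
```

The returned dict is intended to be placed under `models.language` in the request body.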

Sentiment

When sentiment is enabled, each prediction includes a probability distribution over a 9-point scale, where 1 represents the most negative sentiment and 9 represents the most positive.

"sentiment": [
  { "name": "1", "score": 0.01 },
  { "name": "2", "score": 0.02 },
  { "name": "3", "score": 0.05 },
  { "name": "4", "score": 0.10 },
  { "name": "5", "score": 0.30 },
  { "name": "6", "score": 0.25 },
  { "name": "7", "score": 0.15 },
  { "name": "8", "score": 0.08 },
  { "name": "9", "score": 0.04 }
]
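
A distribution like this is often easier to work with once it is reduced to a single number. The sketch below is one possible reduction, not something the API returns: it computes the score-weighted average rating and the most likely rating. The helper name `sentiment_summary` is hypothetical, and the input is the sample distribution above.

```python
# The sample distribution from above, as Python data.
sentiment = [
    {"name": "1", "score": 0.01},
    {"name": "2", "score": 0.02},
    {"name": "3", "score": 0.05},
    {"name": "4", "score": 0.10},
    {"name": "5", "score": 0.30},
    {"name": "6", "score": 0.25},
    {"name": "7", "score": 0.15},
    {"name": "8", "score": 0.08},
    {"name": "9", "score": 0.04},
]


def sentiment_summary(distribution):
    """Return (expected_rating, most_likely_rating) for a 9-point distribution.

    Hypothetical helper. Dividing by the total guards against scores that
    do not sum to exactly 1.
    """
    total = sum(d["score"] for d in distribution)
    mean = sum(int(d["name"]) * d["score"] for d in distribution) / total
    mode = max(distribution, key=lambda d: d["score"])
    return mean, int(mode["name"])


mean, mode = sentiment_summary(sentiment)
print(f"expected rating: {mean:.2f}, most likely rating: {mode}")
# expected rating: 5.65, most likely rating: 5
```

For this sample, the weighted average is 5.65 on the 1 to 9 scale, slightly on the positive side of neutral.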

Toxicity

When toxicity is enabled, each prediction includes scores for the following categories:

Category	Description
`toxic`	General toxicity
`severe_toxic`	Severe or extreme toxicity
`obscene`	Obscene or vulgar language
`threat`	Threatening language
`insult`	Insulting language
`identity_hate`	Hate speech targeting identity groups
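
One common way to consume these scores is to flag the categories that exceed a cutoff. The sketch below assumes the toxicity array uses the same `name` and `score` fields shown in the output example. The helper name `flagged_categories`, the `0.5` threshold, and the example scores are all illustrative assumptions, not values from the API.

```python
# Category names copied from the table above.
TOXICITY_CATEGORIES = (
    "toxic",
    "severe_toxic",
    "obscene",
    "threat",
    "insult",
    "identity_hate",
)


def flagged_categories(toxicity, threshold=0.5):
    """Return category names whose score is at or above `threshold`.

    Hypothetical helper; the default threshold is an arbitrary choice and
    should be tuned for the application.
    """
    scores = {t["name"]: t["score"] for t in toxicity}
    return [name for name in TOXICITY_CATEGORIES if scores.get(name, 0.0) >= threshold]


# Made-up scores for illustration only; not real model output.
example = [
    {"name": "toxic", "score": 0.81},
    {"name": "severe_toxic", "score": 0.05},
    {"name": "obscene", "score": 0.12},
    {"name": "threat", "score": 0.01},
    {"name": "insult", "score": 0.64},
    {"name": "identity_hate", "score": 0.02},
]
print(flagged_categories(example))
```

Categories missing from the input are treated as scoring 0.0, so a partial array does not raise an error.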

Transcription

When processing audio or video with the language model, Hume transcribes speech to text before analysis. Transcription settings are configured in a top-level `transcription` object, separate from `models`.

Parameter	Type	Default	Description
`language`	string	`null`	BCP-47 language tag (e.g., `en`, `fr`, `ja`). When `null`, the language is auto-detected.
`identify_speakers`	boolean	`false`	Enable speaker diarization in the transcript.
`confidence_threshold`	number	`0.5`	Minimum confidence for including transcribed text. Range: 0.0 to 1.0.

$ curl -X POST "https://api.hume.ai/v0/batch/jobs" \
>   -H "X-Hume-Api-Key: <YOUR_API_KEY>" \
>   -H "Content-Type: application/json" \
>   -d '{
>     "models": {
>       "language": {
>         "granularity": "sentence"
>       }
>     },
>     "transcription": {
>       "language": "en",
>       "confidence_threshold": 0.5
>     },
>     "urls": ["https://example.com/audio.mp3"]
>   }'

Named Entity Recognition (NER)

The NER model identifies people, places, organizations, and other entities in text. It can be run alongside the emotional language model.

NER accepts one job configuration parameter:

Parameter	Type	Default	Description
`identify_speakers`	boolean	`false`	When enabled, identifies and labels different speakers in transcribed audio.

$ curl -X POST "https://api.hume.ai/v0/batch/jobs" \
>   -H "X-Hume-Api-Key: <YOUR_API_KEY>" \
>   -H "Content-Type: application/json" \
>   -d '{
>     "models": {
>       "language": {
>         "granularity": "sentence"
>       },
>       "ner": {
>         "identify_speakers": true
>       }
>     },
>     "urls": ["https://example.com/audio.mp3"]
>   }'

Expressions

The emotional language model measures the following 53 expressions. The 5 expressions marked with * are unique to the language model and not available in the face, prosody, or vocal burst models.


Admiration	Contempt	Enthusiasm*	Pain
Adoration	Contentment	Entrancement	Pride
Aesthetic Appreciation	Craving	Envy	Realization
Amusement	Desire	Excitement	Relief
Anger	Determination	Fear	Romance
Annoyance*	Disappointment	Gratitude*	Sadness
Anxiety	Disapproval*	Guilt	Sarcasm*
Awe	Disgust	Horror	Satisfaction
Awkwardness	Distress	Interest	Shame
Boredom	Doubt	Joy	Surprise (negative)
Calmness	Ecstasy	Love	Surprise (positive)
Concentration	Embarrassment	Nostalgia	Sympathy
Confusion	Empathic Pain		Tiredness
Contemplation			Triumph