The Psychology of Voice Note Anxiety: Why We Dread the Play Button

By Kyle R

Surveys show 30% of US adults feel annoyed by voice notes and 68% replay them to understand them. Science explains why voice messages cause stress.

Seven billion. That's how many voice messages WhatsApp users send every single day, according to the platform's own 2022 announcement. And yet, research consistently shows that most people would rather not receive them.

If you find voice notes annoying – staring at a 3-minute audio message with a sinking feeling – you're not alone. The gap between how easy voice notes are to send and how stressful they are to receive is backed by real data – and it reveals something interesting about how we communicate.


Most People Don't Actually Want Your Voice Note

A 2023 YouGov survey across 17 countries found that 66% of consumers prefer receiving text messages, while only 7% prefer audio messages. In the UK and Denmark, that preference climbs to 77%.

Americans aren't much different. A Preply survey of 1,000 US adults in May 2023 found that 30% feel annoyed or inconvenienced when receiving a voice note. Nearly half – 48% – say voice notes require more effort to process than a typed message.

The kicker? The generations that use voice notes most are also the ones most annoyed by them. Preply found that Millennials report the highest annoyance rate at 37%, followed by Gen Z at 31%. Gen X and Baby Boomers, who use voice notes less frequently, report annoyance rates of just 20% and 12% respectively.

The data suggests a pattern: the more voice notes a generation deals with, the more annoying its members find them.


The Replay Problem: Why Once Isn't Enough

Unlike text, a voice note doesn't let you scan for the important parts. You're locked into the speaker's pace, their tangents, their "um"s and "so basically"s.

This creates a measurable problem. 68% of Americans report needing to listen to a voice note more than once to fully understand and respond, according to Preply's research.

The science explains why. Marc Brysbaert's 2019 meta-analysis of 190 studies (published in the Journal of Memory and Language) estimated that the average adult silently reads non-fiction at about 238 words per minute, while normal conversational speech lands around 150 words per minute. That means reading a transcript is roughly 1.6 times faster than listening to the same content (238 ÷ 150 ≈ 1.6).
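
To make that arithmetic concrete, here is a minimal Python sketch. It uses only the two rates quoted above and assumes a steady speaking and reading pace, which real messages won't have. The function name reading_seconds is made up for illustration and does not come from any of the cited studies.

```python
# Rates quoted in the article: silent reading (Brysbaert, 2019)
# and typical conversational speech. Both are averages.
READING_WPM = 238
SPEAKING_WPM = 150


def reading_seconds(audio_seconds: float,
                    speaking_wpm: float = SPEAKING_WPM,
                    reading_wpm: float = READING_WPM) -> float:
    """Estimate how long a transcript of an audio message takes to read.

    Assumption: the speaker talks at a steady `speaking_wpm` and the
    reader reads at a steady `reading_wpm`.
    """
    words = audio_seconds / 60 * speaking_wpm   # words spoken in the message
    return words / reading_wpm * 60             # seconds needed to read them


speedup = READING_WPM / SPEAKING_WPM
three_minute_note = reading_seconds(180)

print(f"Reading is about {speedup:.1f}x faster than listening")
print(f"A 3-minute voice note takes about {three_minute_note:.0f} seconds to read")
```

Under those assumptions, a 3-minute voice note holds about 450 words, which takes a little under two minutes to read once, before counting any replays.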

But speed isn't the real issue – it's control. When you read text, your eyes naturally jump back to re-check details. Brysbaert's research found that about 10–15% of eye movements during reading go backward for exactly this purpose. With audio, there is no effortless backtracking. You have to stop, scrub, guess where the important part was, and re-listen.


Your Brain Works Harder to Process Voice Notes

Emile Foulke and Thomas Sticht's foundational research on listening comprehension (published in Psychological Bulletin, 1969) demonstrated that comprehension drops steeply once speech exceeds approximately 275 words per minute. Even at normal speaking rates, listening demands more sustained working memory than reading.

Why? Text sits still. Your brain can process it in bursts – read a sentence, pause, absorb, continue. Audio doesn't wait for you. The words keep coming at the speaker's pace, and your working memory has to hold previous sentences while processing new ones.

A 2021 study by Kuperman and colleagues (published in the Journal of Experimental Psychology: Human Perception and Performance) confirmed that reading and listening follow the same fundamental time constraints – but reading gives you control over pacing, while listening does not. That lack of control is what makes audio mentally taxing over time.

It also helps explain "listener fatigue", a phenomenon documented in audiology research. Extended speech processing tires the brain, even for people with perfect hearing.


The Sender-Receiver Gap: Easy to Send, Hard to Receive

Here's the core voice note etiquette problem: voice notes transfer effort from the sender to the receiver.

Recording a voice note is fast and effortless. You talk, you send. But the receiver has to find a quiet place to listen (or dig out headphones), give the message their full attention, possibly replay it, and then mentally extract the key points before responding.

Research by psychologists Justin Kruger and Nicholas Epley (published in the Journal of Personality and Social Psychology, 2005) found that people systematically overestimate how well their intended meaning comes across in messages. Senders predicted roughly 78% accuracy in conveying their tone – the actual rate was significantly lower.

While their landmark study focused on email, the underlying principle – egocentrism in communication – applies even more to voice notes. The sender hears their own message with full context and intent. The receiver hears it cold, often in a noisy environment, possibly distracted.

The result is a voice note etiquette gap. What takes the sender 30 seconds to record might cost the receiver 3 minutes to properly process, understand, and act on.


The Privacy Factor

Voice note etiquette isn't purely about cognitive load – it's situational. According to Preply's survey, 41% of respondents worry about others eavesdropping when they play a voice note.

This creates an invisible filter on when voice notes can actually be consumed. In meetings, on public transport, in open offices, at the dinner table – there are dozens of daily situations where hitting play isn't an option. Text has no such constraint. Beyond convenience, audio-only messages are also an accessibility barrier: roughly 1.5 billion people live with some degree of hearing loss, and for many of them hitting play is difficult or not an option at all.

The result: voice notes sit unplayed. They pile up. And the longer they sit, the more anxiety they generate.


A Generational Divide (But Not the One You'd Expect)

You might assume younger generations love voice notes while older people avoid them. The reality is more nuanced.

A 2024 survey by Uswitch and Opinium of 2,000 UK adults found that 37% of 18–24 year-olds prefer voice messages over phone calls – but just 1% of 35–54 year-olds share that preference.

Gen Z doesn't prefer voice notes over text. They prefer them over phone calls. Those are very different things. Voice notes give Gen Z the asynchronous control they want (no real-time conversation pressure) while still conveying tone and personality.

Preply's data backs this up: 84% of Gen Z send voice notes compared to 47% of Baby Boomers. But Gen Z also reports the second-highest annoyance rate (31%) when receiving them. Even the generation most comfortable with voice notes recognizes the friction on the receiving end.


Why We Send Them Anyway

If most people prefer text, why are 7 billion voice notes flying around WhatsApp daily?

Research by Amit Kumar and Nicholas Epley (published in the Journal of Experimental Psychology: General, 2021) offers a clue. Their studies found that voice-based interactions create stronger social bonds than text – but people consistently underestimate this effect and default to text expecting voice to feel awkward.

Voice notes live in an interesting middle ground. They carry the warmth and personality of voice without the real-time pressure of a phone call. For the sender, they feel intimate and expressive. For the receiver, they feel like an obligation.

This tension isn't going away. Voice notes solve a real emotional need. But the data is clear: most receivers would prefer to read that emotion rather than be forced to listen to it.


Common Questions About Voice Note Anxiety

Why are voice notes so annoying?

Voice notes shift the effort from sender to receiver. The receiver must find a private place to listen, give full attention, process at the speaker's pace, and often replay the message. In Preply's survey, 68% of Americans said they need to listen more than once to fully understand.

What percentage of people dislike receiving voice notes?

A 2023 YouGov survey found 66% of consumers prefer text over audio messages. In the US, 30% report feeling annoyed when they receive a voice note, with Millennials (37%) reporting the highest annoyance rate.

Is reading faster than listening to voice notes?

Yes. Research shows the average person reads at 238 words per minute while conversational speech is around 150 words per minute – making reading roughly 1.6x faster than listening to the same content.

Do younger people actually prefer voice notes?

It's nuanced. Gen Z prefers voice notes over phone calls, not over text. And despite 84% of Gen Z sending voice notes, they report the second-highest annoyance rate (31%) when receiving them.


Turning Audio Anxiety Into Readable Text

The research points to a simple insight: people want the warmth of voice without the friction of listening.

That's exactly what transcription does. You keep the message, lose the cognitive burden. No replaying. No finding headphones. No scrubbing through a 4-minute ramble to find the one sentence that matters.

Transcribbit converts WhatsApp voice notes into accurate, readable text in seconds. You forward the voice note, and you get a transcript back – searchable, skimmable, and quotable.

  • For the 68% who replay: One read is enough
  • For the 41% worried about eavesdropping: Read it silently, anywhere
  • For everyone processing 1.6x slower than they could read: Get that time back

Your audio is automatically deleted within 60 seconds for privacy. The text stays with you.


Sources and Research Citations

  1. WhatsApp/Meta (2022). 7 billion voice messages sent daily on WhatsApp. Announced March 30, 2022. Reported by TechCrunch.
  2. YouGov (2023). Global survey across 17 markets: 66% prefer text, 7% prefer audio. November 2023.
  3. Preply (2023). Survey of 1,000 US adults, May 2023. Voice note attitudes, annoyance rates, replay behavior.
  4. Brysbaert, M. (2019). "How many words do we read per minute? A review and meta-analysis of reading rate." Journal of Memory and Language, 109, 104047. DOI: 10.1016/j.jml.2019.104047
  5. Foulke, E., & Sticht, T. G. (1969). "Review of research on the intelligibility and comprehension of accelerated speech." Psychological Bulletin, 72, 50–62. DOI: 10.1037/h0027575
  6. Kuperman, V., et al. (2021). "A lingering question addressed: Reading rate and most efficient listening rate are highly similar." Journal of Experimental Psychology: Human Perception and Performance, 47(8), 1103–1112. DOI: 10.1037/xhp0000932
  7. Kruger, J., Epley, N., Parker, J., & Ng, Z.-W. (2005). "Egocentrism over e-mail: Can we communicate as well as we think?" Journal of Personality and Social Psychology, 89(6), 925–936. DOI: 10.1037/0022-3514.89.6.925
  8. Kumar, A., & Epley, N. (2021). "It's surprisingly nice to hear you." Journal of Experimental Psychology: General, 150(3), 595–607. DOI: 10.1037/xge0000962
  9. Uswitch / Opinium (2024). Survey of 2,000 UK adults, April 2024. Generational phone and voice message preferences.