Paul Alan Ruben

Monday, February 20, 2012

Not For Librarians Only: An Audio Book Reviewer’s Template

If we want criticism to matter…we have to treat it with
more respect. This means abandoning the notion that it’s
just hack work or service journalism or literary
bookkeeping… criticism, done well, is…a singular genre in
which a lot of great writers have done their best work.
            -Sam Anderson, “Why Criticism Matters,”
New York Times Sunday Book Review, Dec. 31, 2010

TEMPLATE (Online Merriam-Webster): a gauge, pattern, or mold …used as a guide to the form of a piece being made.

STORYTELLING: A co-creative experience involving the senses with an intentional storyteller and acknowledged listener
            -Rachel Hedman, Professional storyteller

I begin by acknowledging that most audio book reviewers are not compensated, or are paid little, for their services. Their listening commitment (a 15-hour book is 15 hours) and the time required to conscientiously assess a program are significant.
In recognition of the reviewer’s serious labor (of love, mostly), I hope the following template will add credibly to the proposition that audio book narration - whose roots burrow powerfully through a time-honored and singularly distinctive storytelling tradition that probably dates back to humanity’s nascent guttural utterances – is eminently worthy of the reviewer’s effort to unpack this performance art for the listener.
By the way, narrators and audio book professionals do read reviews and do take them seriously, especially when they’re favorable. That’s a fact!


PERFORMANCE MARKERS: Critique vocabulary that identifies specific performance issues directly related to the listener’s experience.
LISTENING LINK: Audio book segments that - in the best sense - epitomize performance markers.

EVALUATION QUESTIONS: Highlighted in boldface, salient critique questions reviewers might pose to themselves while evaluating performance.

Before identifying vocabulary (what I’m calling ‘performance markers’) that specifically addresses the efficacy of a given performance, it’s important to remember that this template separates experience of the written word from the spoken word, the implication being that there’s something special about ‘spoken’ storytelling. While automobiles and their standard cassette players may have been the progenitor of audio books, there is, nevertheless, something wholly unique about speech, something the storyteller can conjure that the writer cannot. It’s what that most incredible instrument of all - the human voice - can perpetrate on a listener that I’m drilling into when I serve up this so-called performance vocabulary.
In a word, this template is dedicated to the singularity of the narrator’s performance art: storytelling.


As you prepare to evaluate an audio book, take a metaphorical breath, as if to allow yourself time to position the critic’s lens to properly see the storyteller’s aesthetic purpose: To act as a conduit that connects the text’s emotional intent to the listener. That is the performer’s bottom line obligation.
Said differently: Whatever the main character Joe is experiencing emotionally – whether he’s evil incarnate or Mr. Lovable - the narrator (conduit) must connect Joe (his feelings) to the listener, precisely as the author intends.


The reviewer’s ears separate performance from text. Love the book or hate it, conflating text and narrator disregards the storyteller’s only obligation: emotionally connecting you to the story. A performance review focuses on the transport of the narrative’s feelings and perhaps, depending on the text, a little pity for the narrator who is at the words’ mercy.


         Axiomatically, if actors can only act emotions, shouldn’t emotions be the reviewer’s primary focus?
If a reviewer writes, “I liked Cindy’s performance,” or “the narrator’s characterizations were great,” we might fairly ask, how does ‘like’ inform us about Cindy’s emotional impact on the listener? It kind of does, but kind of not, too. “The narrator’s portrayal of tough-minded Cindy was good, straight-forward, precise,” doesn’t quite reference how we might actually feel when hearing Cindy.
Again, actors act emotion. That’s it. If I’m directing an actor and say, okay, this character, Cindy, she grew up in poverty, so what does that tell you emotionally about Cindy? Nothing.
 Poverty isn’t an emotion; it’s not actable. Therefore, mentioning Cindy’s poverty isn’t an actable clue to the storyteller. But, when I suggest that Cindy’s poverty has made her strong-willed, suspicious of wealthy people who never knew what being poor meant, now the actor gets Cindy emotionally. These are actable clues that encourage the narrator to inhabit Cindy’s feelings, then connect those feelings to the listener.
An emotionally-centric review might alert us to “…Cindy’s  forceful, yet tempered tone…The town’s arrogant mayor, whose patronizing opposition to Cindy’s implacable demands to help the poor pushes us to dislike him as though he were the object of our scorn.” This more evocative syntax suggests to the listener, here’s how you’re going to feel about these characters.


* If you’re an exception-to-the-rule person, please know that attached to the following performance markers is an imaginary asterisk. My vocabulary is not fixed. It’s meant to be mediated.
* The accompanying performance links are my opinion; yours may differ. That’s what makes awards ceremonies maddening.


Note: The following guidelines apply to fiction and non-fiction.

PERFORMANCE MARKER: Chewing the syntax
LISTENING LINK: Listen to Simon Jones.

Is there an abandoned, unloved word, let alone phrase?

Imagine an author slaving over his narrative, saying to himself, well, I’ll just heave in some sentences that really don’t matter to me.
         Not likely.
         The reviewer should rightfully imagine the storyteller - as she contemplates six grueling days in the studio plowing through a 23 finished hour marathon – silently ingratiating herself to the author: ‘Yeah, your words, I know. I know, all 216,000 of ‘em. Your babies. Well, for the next week, they’re mine!’
         Contemplating the author’s narrative, storytellers recognize that there’s no such thing as a word or phrase undeserving of their total emotional commitment.
         Readers languidly gum the language, storytellers chew it, intensely.

PERFORMANCE MARKER: The here and now

LISTENING LINK: Listen to Maggie-Meg Reed

and Barbara Caruso.

Are these storytellers in the moment, experiencing what’s occurring as if it were happening in the here and now?
Or do they sound removed, uninvolved, as if reporting an unfelt event or a time whose palpitating resonance they can no longer recall, much less replicate?

         When the storyteller inhabits this marker, it’s a performance maker. Ignore the here and now, or worse, defy it, and it’s a performance breaker.
Without the here and now there’s no emotional consequence, no storytelling. At best, you’ll get a well enunciated, bland reading. In the absence of the here and now, listeners may understand everything, but they’ll feel nothing about it.
         Storytellers evoke the here and now when they emotionally engage the text as if events were unfolding in front of them right now, even if they’re talking about the invention of the wheel. Storytellers experience the words’ feeling, existentially, in real time. Regardless of topic, location, or who is speaking, if the storyteller is immersed in the here and now, the listener will be as well.
         Narrators who disregard the here and now tend to ‘report’ experience. Reporting, as I call it, is emotionally empty, non-committal, passive, as if proffered by a marvelous body (great voice) with no soul (feelings).
Think of it this way: a reporter’s job one is the objective dissemination of information, one-eighty opposite the storyteller’s prime directive, emotional connection.
Next time you hear the TV news broadcaster’s honeyed rhythms and authoritative cadences, or the commercial voice-over talent’s clear, buttery-smooth voice and mellifluously uttered, meticulously modulated words, it’s worth noting their purpose: dispassionate reportage, or encouraging the purchase of a product or service. It is never to connect themselves to you emotionally, as if they were an integral part of the story.
Reportage is anathema to storytelling because it emotionally distances narrator and listener. The here and now actualizes the moment. It immerses narrator and listener in events as if they’re occurring in real time. If the goal is transfer of information, hire a reporter or announcer.

LISTENING LINK: Listen to Holter Graham

and Beth McDonald.

Is the story unfolding as if these narrators were the characters they’re talking about?
Or, do the narrators act merely as if they’re the characters’ spokesperson? Or maybe the narrators are speaking as themselves, for themselves, as if their finely tuned vocal instrument should be the story’s focus rather than its characters?

Storytelling cannot exist unless narrators speak from the point of view of who or what they’re speaking about.

Point of view can only be revealed when the narrator speaks, not as the character (that’s what the actual dialogue is for), not for the character (implying the performer is the narrative’s focus), but rather as if the narrator has permeated their soul for the purpose of intimately transmitting the character’s feelings to the listener. 
         Think of a story about a man who hates his wife. As a listener, would you prefer to hear an omniscient voice announce the story from his emotionally unattached vantage or be thrust by the storyteller into the marrow of this hateful husband’s bones, feeling suddenly provoked by the woman he detests? When point of view vanishes, so does emotional connection.
It’s challenging enough for narrators to consistently maintain the point of view of whomever they’re speaking about. What about inanimate objects and non-sentient beings?

LISTENING LINK: Listen to Barbara Rosenblat breathe feeling into mostly things.

PERFORMANCE MARKER: Discovery (playing the ‘wow’)

LISTENING LINK: Listen to David Ledoux.

Does his cadence reflect someone who, just before he utters the next word, suddenly discovered it? As if he has no idea of what he’s going to say until the syntax erupts from his lips?
Or are David’s words perfunctorily expelled, quickly, as if uninterrupted by preceding thought? Does he blow through punctuation, reading what sounds like a 300 page run-on sentence Gatling gun style?

Discovery is literally the conscious thought that occurs the very moment before the syntax that expresses it. Discovery is an emotional propellant that turbocharges listeners’ willing suspension of disbelief, immersing them, magically, in the here and now. To the storyteller, discovery translates as: Wow!
I’ll explain:
Discovery is a counterintuitive process because the narrator is reading what should sound extemporaneous. Because the book is in front of her, she knows what’s coming.
The narrator’s discovery challenge exemplifies one of storytelling’s ironic truths: listeners are aesthetically defaulted to want to willingly suspend their disbelief, to believe that what’s being read is actually occurring in the moment.
Because narrative fiction and non-fiction connect us with our experience of ourselves and others, isn’t sharing in the active discovery of the feelings that emotionally represent that experience a lot more exciting than being treated to a recitation by some bored reader for whom there are no emotional stakes?
Or think of it this way: in real life discourse, do we literally know what’s coming next? Don’t we routinely stop/start, stutter, pause, mumble, hesitate when we speak? So, why wouldn’t the narrator do the same?
I often tell narrators that discovery oscillates in the white spaces between punctuation (a comma, period, colon, etc.). That white space is where the author readies us for a new thought – from subtle to profound - which is always joined at the hip with attendant feeling. When the storyteller discovers this white space, it’s as if to say: Wow! Wait till you hear this!

LISTENING LINK: Listen to Oliver Wyman.

As he describes this couple’s passion, is it as if the words he’s emphasizing emerge from his discovery of their intimate relationship?
Or is he arbitrarily modulating (‘punching,’ ‘vocally highlighting,’ almost ‘singing’ words or phrases) in an effort to percolate his narration, juice it with inflection? Simply, does his emphasis sound real or fake, organic (natural) or modulated (forced)?

         Axiomatically, real (organic) connects; fake (manufactured) disconnects.
         Ironically, emphasis is as much the storyteller’s enemy as her friend. The harder the narrator works at ‘hitting’ the right words, the more boring (or annoying) she becomes. Why? Because modulation disconnects us from feeling. Emphasis for its own sake invalidates emotion. Manufactured inflection asserts emphasis over empathy. Modulation – purposefully whacking a word for variety’s sake or because it somehow begs to be EMPH-a-sized - is the moral equivalent of gorging on a lusciously gooey hot fudge sundae: tastes great but try surviving off those calories.
         When directing narrators, if I detect manufactured inflections that indicate feeling, I’ll often suggest to the narrator: Don’t sing, don’t modulate. Inhabit the feeling inside the word and emphasis will occur, naturally, organically.
         To sample what I’ll suggest is empty-calorie emphasis (again, only as it pertains to storytelling), listen to a radio or TV voice-over, or news broadcaster: their job is to interest or inform only. If you happen to ride the New York City transit, listen to Charlie Pellett, ubiquitous voice of the newest subway cars. His mother of all empty-calorie announcements - a vocal aria that regards words as manic action figures - bounces sung syntax up, down, high, low, as if each word were on its own trampoline. And why not! His job is to inform passengers, alert them about any “suspicious package,” not freak the daylights out of them, cause mass panic, or insinuate his feelings into the content of his communication.
         Modulation mocks feeling. You’ll know empty emphasis when you hear it because its emotional impact will be like you were just punched by a pleasant puff of air.


LISTENING LINK: Listen to Thérèse Plummer

and Tristan Layton.

Is each character responding directly to the other?

Or rather, to some amorphous, don’t quite know who? Are the characters’ responses indicative of their emotional state based on what they just heard as well as how they’re predisposed? In other words, as they interact, are the characters engaged?        

The essential dialogue challenge for narrators isn’t speaking, it’s listening. Think of it this way: how can a character’s response believably resonate if that person hasn’t heard what was just said?
When dialogue sounds distant, oddly wooden, metaphorically ‘out there,’ the reason can usually be traced to one of two distinct possibilities: the storyteller simply can’t affect believable characterizations (bluntly, he can’t act) or in the pejorative show biz lingua franca, she’s ‘phoning it in’: reciting unfelt feelings, as if by rote, rather than inhabiting them; indicating emotions, rather than responding organically to them.
Considering the bottom line (emotional connection between listener and narrator), as reviewers critique dialogue, it is fair to consider: did that character who just spoke hear and then react to what was previously said? Or was it simply that character’s turn to speak?
What about accents, creating unique voices, and playing the opposite gender?
Remember the narrator’s chief concerns: reflecting the point of view of whom or what she is speaking about; honoring the author’s intent; keeping the listener connected. When an implausible sounding voice or opposite gender short-circuits the listener-narrator connection, the listener’s willing suspension of disbelief wriggles distractedly, uncomfortably, aesthetically frazzled, jarred. It’s as if the headphones had been unceremoniously ripped from the listener’s ears.
We know that the narrator plays every character. We accept that the storyteller is one sex. Of course individuated voices immediately identify who is speaking, and we certainly need to know whether the character is male or female. But I would argue, all things being equal, who characters are resonates more than what they sound like.
I often suggest to talent, play the character, not his voice; inhabit the gender. Males and females speak in differing rhythms and cadences and when an actor can locate those, gender differences become obvious.
Finally, less is almost always more. Subtle tonal shifts generally identify which sex is talking. Yes, the degree of characterization must fit the genre, and commensurately address the author’s vocal description of the person. But ultimately, if the characters aren’t  responding to what they heard, great character voices become little more than vocalized bling.


LISTENING LINK: Listen to Chuck Stransky

and Yelena Shmulenson.

Think: genre and aesthetic. Are their accents consistent with both? Chuck’s narrative is clearly comedic. Do these guys sound like comedic tough guys? Yelena’s story is dramatic. Does her accent support or interfere with the narrative’s intimate mood?

          Considering accents - domestic and foreign - think propriety, necessity, and less is more, generally.
An accent’s ‘thickness’ may reduce a 3-D character to 1-D. A hint may be enough to fully believe the character is Irish. To aesthetically buy into the character, is an accent even necessary? Not if the accent isn’t directly indicated or implied.
Nothing distracts a listener from the narrative like an inauthentic, unidentifiable or plainly inaccurate accent. I always suggest to actors, if you can’t imitate the accent perfectly, don’t even go there, or just suggest a hint. Second, listeners accept that the narrator isn’t part Chinese, Italian, Mexican, British, South African, and Texan. A hint will maintain their suspension of disbelief, willingly.
The text - the primary reference for every performance question – may or may not provide clarity. If the author writes, “she said with a pronounced Kentucky twang,” the actor must comply. But just because the story takes place in Boston does not imply the characters all have to sound like JFK.
Imagine the cackling Brooklynite, driving home over the Verrazano Bridge, as he pauses the CD then glances at the wife:  “Dis here guy’s supposed to be from Canarsie? Fahhgetaboutit!” 

         There are at least two performance claims that distinguish non-fiction from fiction, and they’re interrelated:

PERFORMANCE MARKER: It’s the author’s story…

PERFORMANCE MARKER: …never the characters’ story

LISTENING LINK: Listen to Dennis Boutsikaris.

Do you have the distinct impression the author is speaking? Is Dennis reporting or inhabiting the author’s point of view?

         From a performance perspective the fiction narrator’s objective is to connect the listener to each character’s point of view. The non-fiction narrator is obliged only to connect us to the author’s.
The non-fiction author is eager to excite us about the ideas, characters or events she’s describing. Her objective is to convey her amazement, her fascination, and her belief that her story must be told, now.
When the narrator affects a ‘voice’ our focus shifts to the characterization and/or accent, to this actor’s vocal acuity. Suddenly, the author’s purpose is transgressed as we are aesthetically torn from non-fiction’s calling: to illuminate, inform, or otherwise educate us about this particular topic.
Non-fiction characters aren’t really three-dimensional. They exist to serve the author’s intellectual purpose. Portrayal of their voice promotes them undeservedly, features them instead of the author and negates non-fiction’s performance mandate: author-listener connection.
         Think of it this way: When the non-fiction narrator affects Robert E. Lee’s southern accent, voices the Arab and Israeli, or imitates Lincoln, the aesthetic blow-back turns the author’s story into the actor’s audition: hey everyone, do I sound like Barack Obama or what!


AUNT MARY: The Narrator’s Altered Ego

         Readers of this blog are familiar with the vicissitudinous kvetchings of the ever suffering (and insufferable) would-be narrator, Aunt Mary. Tenaciously insistent, long on ambition, devoid of intuition, AM emailed me after reading my last post, insisting, “…once the reviewers here (sic) my skills, well, you can believe they’ll tell publishers to hire Aunt Mary for everything! You betchya!”
         Aunt Mary – the storyteller’s nightmare - is an obtusely arrogant agglomeration of vocal affectations that are incompatible with storytelling: She flamboyantly strains declaimed syntax, her not altogether unappealing vocal instrument torturing the ear like a Stradivarius infected with irritable bowel syndrome. Her artificially placed, mildly frantic pitch, and her modulated volume nauseate the narrative, as if the author wrote seasick. And her speedy reads reduce her male/female characterizations to stereotypical sound-alikes whose point of view always appears to be, who else? Aunt Mary’s.
But AM seeks employment, persistently - a serially misinformed performer who conflates words spoken with stories told. To her, it’s all about ‘the voice.’
But what about sentience?
Sorry AM, but you don’t get the storyteller as emotion’s conduit. You don’t feel the storyteller’s commitment to dislodge the palpable humanity embedded in the author’s narrative, then connect those live feelings to the listener. Alas AM, you’re a reader. End of story.


I hope reviewers will regard this template’s performance markers – from chewing the syntax to the here and now - as malleable assessment tools whose collective bias sheds informative light on the process of critiquing the narrator’s capacity to emotionally connect to the listener.
I think of each tool as an individual performance portal, or maybe a lens, through which the critic can analyze the particular storytelling element it addresses.
Finally, I am arguing that precise critique vocabulary exists (performance markers) that can be referenced by the reviewer in order to advise listeners about what they’re going to feel when they press ‘start,’ how much or how little they’re going to feel and why or why not.


  1. Thank you for your wonderful article and examples. I was an audiobook reviewer, but apparently was not good at my craft, being asked to stop. I wrote my reviews based much on what you have stated. I am a professionally trained voice talent with a BA in Mass Communications, majoring in writing/film. But I am not bitter; in fact, I am grateful. Currently, I am under contract for my first audio book in the "non-fiction" category. Now, I pray that, should someone out there one day review my work, they first read your wonderful article, take to heart this template, and listen to these examples. Knowledge and education create better understanding, making for intellectually driven reviews.

  2. Many thanks for this lesson. I just spent time with this and feel like it was a class in acting.

    Lynn Benson

  3. Brilliant. We're sharing this with students in our Dallas Audio Book Weekend March 24-25.

  4. Thank you, this has been very helpful!

    Kat Hooper

  5. Sensational article. Thank you for sharing your insight, Paul.

  6. Very late to the party here but this was brilliant to read. I can't open the sound files though - any advice?