A person listening with earbuds

I experienced something like an epiphany at a stoplight in downtown Ft. Lauderdale. Thumbing through Feedly, I stumbled across this article on Stratechery just as the light changed and traffic collectively twitched forward.

Shit. Okay, eyes on the road.

There I was, recently neck-deep in an accessibility article I'm co-writing for code{4}lib, ugh-ing at my dumb luck to be hurtling down the road in a death machine at just that moment, when it dawned on me: hey, Siri can just read this to me.

My experience enabling VoiceOver on iOS had been an adventure of fumbling through my phone's settings, but since screen readers were on my mind I thought I'd give "Siri, enable VoiceOver" a crack. It worked.

Siri interface showing my command and its acknowledgment.

Huh. It had never occurred to me to enable VoiceOver this way, although I refer to it often in my day-to-day work. My use is always as a developer, during testing, punching through the accessibility tree at a blistering 500 words per minute, crossing Ts against one rubric or another.

We have tricks to mimic real-world use: you can switch off the screen and tab through the darkness with a keyboard, or — if you're like me — take off your glasses and squint at your UI. I care very much that these cues are useful, but I've never actually been an end user relying on them.

There is a gulf between maker and user: on one side is intimacy with the technology, and — more often than not — on the other side is the one who wrote the code.

Listen to Siri read through this content.

A long-form article in Feedly

Were you paying attention to the cues and the speed, comparing accuracy against the text in the screenshot? Or were you listening for meaning?

My epiphany, then, was that — relying on this assistive technology for the first time — testing for accessibility, however well-intentioned, is still a far cry from the experience of someone who actually cares about the content on the screen.

See, it's not just that the article has to be articulated – it has to be read well.

What's more, what happens if you lose your place because you're momentarily distracted, whether by someone walking into the room asking for your attention, a phone ringing, or the road in front of you? VoiceOver's touch interface gives you this control, but it requires a deliberate and artful Fillory of perplexing gestures. It is a slight distraction even master magicians can't avoid, and that small commitment of time creates literal dissonance as Siri jabbers on in the background while you find your place in the article.

We command screen readers to iterate through the source, pause, jump from one heading to another, but these tools aren’t responsive to the nuance of context – say, in traffic. This should be possible:

What was that? Say that again? Bounce back ten seconds.
Slow down a bit. Stop right there,
I need to focus on something else

— but it isn’t.

Even so, for all these imperfections, I found myself relieved just to have access to this content in this way. It's here we can observe that the user experience is a holistic value: VoiceOver leaves a lot to be desired, but holy shit, this tool exists – and that matters.

We talk about "assistive technology" in the same way we talk about a crutch: necessary, useful, but with the implication of something broken.

Look ahead and observe the convergence of Accessible UI and Conversational UI. I now believe we should instead bind that meaning to the rise of the virtual assistant — Alexa, Siri, Cortana, Google — and to the blossoming voice user interfaces through which we intentionally divorce ourselves from the screen.

This is impressive technology that liberates us all to interface with the web in the way we choose.

Michael Schofield is a service and user-experience designer specializing in libraries and the higher-ed web. He is a co-founding partner of the Library User Experience Co., a developer at Springshare, a librarian, and part of the leadership team for the Practical Service Design community.