Ux + AI = Cognitive Ergonomics
Summary: The addition of AI capabilities to our personal devices, applications, and even self-driving cars has caused us to take a much deeper look at what we call ‘User Experience’ (Ux). A more analytical framework identified as Cognitive Ergonomics is becoming an important field for data scientists to understand and implement.
I for one tend to be skeptical when I see a new term used in place of an existing data science term. Is this new descriptor really needed to describe some new dimension or fact that we currently describe with a comfortable, commonly used phrase? Or is this just some gimmick to try to differentiate the writer’s opinion?
That comfortable and commonly used phrase in this case is ‘User Experience’ (the ubiquitous abbreviation Ux). Ux describes the totality of the end user’s experience with our app, personal device, IoT-enabled system, wearable, or similar. So far this has meant taking a great deal of care in designing screens, icons, buttons, menus, and data displays so they are as easy and as satisfying for the end user as possible. It also means a design that, as much as possible, eliminates common user errors in interacting with our system.
Recently I’ve been seeing ‘Cognitive Ergonomics’ used where we used to see ‘Ux’ to describe this design process. Ux was such a compelling shorthand. Is there some substance in ‘Cognitive Ergonomics’ that might make us want to switch?
Gartner goes so far as to say that whether you are working on autonomous vehicles, smart vision systems, virtual customer assistants, smart (personal) agents or natural-language processing, within three years (by 2020) “organizations using cognitive ergonomics and system design in new AI projects will achieve long-term success four times more often than others”.
What’s changed is the inclusion of AI in all these devices just over the last one or two years. The 2014 prediction of Kevin Kelly, Sr. Editor at Wired, has come true:
“The business plans of the next 10,000 startups are easy to forecast: Take X and add AI.”
I personally dislike generalizations that make broad reference to “AI” so let’s be specific. The deep learning functionalities of image, video, text, and speech processing that rely on Convolutional Neural Nets (CNNs) and Recurrent Neural Nets (RNNs) are the portions of AI that are ready for prime time. AI is much broader than this but its other capabilities are still too developmental for now to be market ready.
Since CNNs and RNNs are basically the eyes, ears, and mouth of our ‘robot’, some folks like IBM have taken to calling this cognitive computing. This ties in very nicely with what Cognitive Ergonomics is trying to communicate, since the AI we’re talking about consists of the cognitive capabilities that correspond to eyes, ears, and mouth.
We no longer have just our fingers to poke at screens; we can now use voice and a variety of image recognition functionalities. These have the capability to significantly improve our Ux but it seems correct that we should focus on all these new types of input and output by using the more specific name Cognitive Ergonomics.
What Exactly Is Cognitive Ergonomics?
The International Ergonomics Association says Cognitive Ergonomics has the goals of:
- shortening the time to accomplish tasks
- reducing the number of mistakes made
- reducing learning time
- improving people’s satisfaction with a system.
So it broadly shares with Ux the goals of making the application or product ‘sticky’ and ensuring that use of the product does not produce any unintended bad result (such as accidents caused by texting while driving).
Where Cognitive Ergonomics exceeds our old concepts of Ux is that it expands our ability to interact with the system beyond our fingers to include our eyes, ears, and mouths. This multidimensional expansion of interaction now more fully involves “mental processes, such as perception, memory, reasoning, and motor response, as they affect interactions among humans and other elements of a system”.
Cognitive Ergonomics is not new but you probably thought about it more in the design of complex airplane cockpits or nuclear reactor control rooms. Projecting that onto our apps, personal devices, wearables, or even self-driving cars requires not only knowledge of the physical situation in which the device or app is used, but more particularly the end user’s unconscious strategies for performing the required cognitive tasks and the limits of human cognition.
Nothing beats a good example or two to illustrate how our ability to interact with our devices is changing.
Natural Language Processing (NLP)
Voice search and voice response are major developments in our ability to interact not only with personal devices like our phones but also with consumer and business apps.
Voice search enables commands like ‘how do I get to the nearest hospital’ but also ‘will it rain at 5:00 today’ or ‘give me the best country songs of 2010’.
In terms of improving our satisfaction with apps, one of the strongest uses is to eliminate visual, multi-level menus. Who hasn’t drilled down in their phone through too many levels of menus to try to turn a feature on or off? Being able to simply say ‘turn off X application’ eliminates mistakes, reduces frustration, and speeds results.
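To make the ‘say it instead of drilling through menus’ idea concrete, here is a minimal sketch of turning a transcribed utterance into a single action. It assumes the speech has already been converted to text, and the `PATTERNS` table and `parse_command` function are hypothetical names invented for this illustration. A real assistant would rely on a trained language-understanding model rather than a regular expression.

```python
import re

# Hypothetical command patterns, for illustration only.
PATTERNS = [
    (re.compile(r"^turn (on|off) (?:the )?(.+)$"), "toggle"),
]


def parse_command(utterance):
    """Map a transcribed utterance to (action, state, target), or None if unclear."""
    # Normalize case and drop trailing punctuation from the transcript.
    text = utterance.strip().lower().rstrip(".!?")
    for pattern, action in PATTERNS:
        match = pattern.match(text)
        if match:
            state, target = match.groups()
            return (action, state, target)
    # No pattern matched: the caller could ask the user to rephrase.
    return None
```

One spoken sentence stands in for several levels of menu navigation, and an unrecognized utterance returns `None` so the app can ask for clarification instead of guessing.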
Commands no longer need be one-way conversations. Chatbots enable a dialogue with the app if the action is unclear. This might be a device operating question but it could as easily be a customer service issue or even personal psychological counseling. (Andrew Ng recently announced his involvement with Woebot, a chatbot delivered through Facebook Messenger that gives one-on-one psychological counseling for depression and by all reports does a pretty good job.)
Perhaps the most important improvement is one impacting safety. The ability of your app to ‘read’ texts aloud or to allow you to respond to a text by voice keeps your eyes on the road while you’re driving.
Facial and Image Recognition
It might seem that text/voice AI is the most prevalent advancement but facial and image recognition is not far behind. Of course cameras can face outward to guide self-driving cars, and they can capture street or business signs in foreign languages and offer audio translations.
Importantly they can also face inward to detect many things about the user. Facial recognition cues can trigger algorithms that work for our well-being or safety. The capabilities in play include:
- Face detection and localization
- Facial expression
- Assessment of head turns and viewing direction
- Person tracking and localization
- Articulated body tracking
- Gesture recognition
- Audio-visual speech recognition
- Multi-camera environments
Other Haptic Inputs and Outputs
Not yet as common but coming shortly are a wide variety of other sensors beyond voice, text, and image. One that is widely written about in partially self-driving vehicles is the camera that scans the driver to ensure their eyes are on the road and that they are therefore ready to take over from the AV at a moment’s notice. When the scan detects that the driver is visually distracted, it can provide haptic feedback by vibrating the steering wheel.
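The logic behind that kind of alert can be sketched in a few lines. This is only an illustration, assuming some upstream vision model already reports for each camera frame whether the driver’s eyes are on the road. The `DistractionMonitor` class and its two-second threshold are hypothetical choices, not values from any production system.

```python
class DistractionMonitor:
    """Signals a haptic alert when the driver's gaze stays off the road too long."""

    def __init__(self, max_off_road_seconds=2.0):
        # Assumed threshold, chosen for illustration only.
        self.max_off_road_seconds = max_off_road_seconds
        self._off_road_since = None

    def update(self, timestamp, eyes_on_road):
        """Process one frame's result; return True if the wheel should vibrate."""
        if eyes_on_road:
            # Gaze is back on the road, so reset the timer.
            self._off_road_since = None
            return False
        if self._off_road_since is None:
            # First frame of a new off-road glance.
            self._off_road_since = timestamp
        return timestamp - self._off_road_since >= self.max_off_road_seconds
```

Timing the glance rather than reacting to a single frame keeps a quick mirror check from triggering the vibration.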
Apple recently introduced a texting feature that automatically suspends notifications and ring tones when it detects you are driving, presumably using a combination of GPS and accelerometers to measure movement.
Other types of feedback sensors are already in use in some types of games, which can measure heart rate, tightness of grip on the controller, or even galvanic skin resistance to detect sweat. Combined, these can be used as guides to how excited or frustrated you are with the game. Based on these inputs the game can then either become easier or more challenging to keep you engaged.
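As a rough sketch of how such a feedback loop might work, the snippet below blends the three signals into a single score and nudges the difficulty level up or down. Everything here is an assumption made for illustration: the function names, the equal weighting, the thresholds, and especially the simplification that very high arousal means frustration while very low arousal means boredom.

```python
def arousal_score(heart_rate, grip, skin_conductance, resting_heart_rate=70.0):
    """Combine three signals into a rough 0-1 arousal estimate.

    grip and skin_conductance are assumed to be pre-normalized to 0-1.
    """
    # Express heart rate as a fraction above resting, clamped to 0-1.
    hr = min(max((heart_rate - resting_heart_rate) / resting_heart_rate, 0.0), 1.0)
    # Equal weighting is an arbitrary choice for this sketch.
    return (hr + grip + skin_conductance) / 3.0


def adjust_difficulty(level, score, low=0.3, high=0.7, min_level=1, max_level=10):
    """Ease off when the player seems frustrated, push harder when bored."""
    if score > high:
        return max(min_level, level - 1)
    if score < low:
        return min(max_level, level + 1)
    return level
```

A real game would need to calibrate these signals per player, since the same heart rate can mean excitement for one person and frustration for another.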
Another soon-to-be-major input/output device will be the Augmented Reality headset with even greater implications for this category.
How Will This Change the Life of the Data Scientist?
Ux these days is still considered largely an art form, but given the need for multi-dimensional inputs and outputs Cognitive Ergonomics will require more analytics in design and objective testing.
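To give a feel for what ‘more analytics’ could look like, here is a minimal sketch that summarizes usability test sessions along three of the goals listed earlier: task time, mistakes, and satisfaction. The `summarize_sessions` function and the record fields (`seconds`, `errors`, `satisfaction`) are hypothetical. Learning time is left out because measuring it would require repeated sessions per user.

```python
from statistics import mean


def summarize_sessions(sessions):
    """Summarize test sessions for one design variant.

    Each session is a dict with keys: seconds (time to finish the task),
    errors (mistakes made), and satisfaction (a 1-5 rating).
    """
    return {
        "mean_task_seconds": mean(s["seconds"] for s in sessions),
        "errors_per_task": mean(s["errors"] for s in sessions),
        "mean_satisfaction": mean(s["satisfaction"] for s in sessions),
    }
```

Running the same summary over sessions recorded with, say, a menu-driven design and a voice-driven design gives an objective basis for comparing the two.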
Gartner and others already see this in an uptick in the need for data scientists to be able to work with unstructured text data. While this will often happen through easier-to-use packages or the expansion of current platforms to include this feature, working with unstructured text will be a new skill for many data scientists used to supervised modeling.
CrowdFlower recently released its survey of 170 data scientists in mid-size and large organizations. More than half already report that a significant portion of their work involves unstructured data. The majority of that unstructured data was text, with almost half also reporting working with image and video.
While not all chatbot or image recognition projects are driven by the need to improve Ux, a significant majority are, and this foreshadows the importance of Cognitive Ergonomics for data scientists.
About the author: Bill Vorhies is Editorial Director for Data Science Central and has practiced as a data scientist since 2001. He can be reached at: