Digital assistants have become a part of our everyday lives. As they become more useful, we’re beginning to get over the friction of speaking to phones, speakers, watches, and computers. With the technology rapidly improving, we’ve also seen a bunch of companies working on augmenting it with realistic digital characters in the hope of improving the overall experience. Our research and experimentation have challenged our thinking around how to do this well, if at all. Factors like appearance, proficiency, discrimination, and comfort all come into play.
My earliest exposure to artificial intelligence with some sort of visual representation came from sci-fi movies. Most were machine-like, but some introduced digital depictions that had human-like characteristics. My most memorable real-world example, although not intelligent in any capacity, was the annoying Clippy that appeared in Microsoft Office when I least wanted it. When I couldn’t be bothered turning it off, I recall defaulting to The Genius character because it was more familiar. Yet both failed because they didn’t offer any meaningful interaction or solve any problems.
For many decades, companies have grappled with how to use brand characters, ranging from refined brand characters through to illustrated humans. The focus was to create an emotional connection with the brand, transcending products. Although this has been successful for many brands, it’s not clear that it has translated well into technology, or perhaps technology has not been able to meet the expectations of the character.
In recent years, we’ve seen an acceleration that can be attributed to the gaming industry, movie studios, and machine learning. The evolution of realistic digital humans that enhance interaction is well and truly underway. The premise of striving for realism is that it is more relatable and familiar. Communication is more effective, and our acceptance is more generous, when we interact with someone we can relate to. Likewise, the use of facial features (eyes, eyebrows, and mouth) and facial expressions provides additional communication cues.
The jury is still out on whether the world is ready for such interactions. When I led a team to experiment with such technology, we were convinced that the primary goal had to be understanding the human response. The experiment consisted of using a realistic digital human to execute a booking task.
Getting Real & Relatable
The interaction with our digital human was a first for all participants, so it carried many of the same challenges as meeting another person for the first time. Creating relatability and reducing friction were important. Considerable research pointed to a female digital human offering a warmer and more trusted experience. Ethnicity required a bit more magic. We wanted to achieve a digital human that was relatable across cultures. We ended up using a mix of European, Chinese, and Pasifika ethnicities. The result was visually effective and similar to that used by Serko for booking corporate travel. The exercise quickly gave us insight into the benefits of digital humans: the alternatives would have been playing with genetics or finding this mix in a human employee. The process also highlighted the spectrum of gender and raised the question of whether a similar outcome might be achievable there in future.
The experiment produced some fascinating insights into the complexity of human communication and the expectations associated with this new technology. For example, there was a correlation between realism and the expectation of capability. In other words, the more real a digital human appears, the more intelligent it is expected to be. Some participants were so immersed in the realistic interaction that they didn’t want to break eye contact.
Interacting With The Unknown
Stepping back in the experience, one of the biggest hurdles was initial engagement. Not many of us enjoy dialogue with Siri when others are nearby. The fear of looking stupid when she doesn’t understand you or responds in error is disabling. This gets heightened when you add a life-size digital human to the mix. Improving the success of interaction was achieved by making several physical and digital changes.
Dimensions — We changed her height to below-average human height and decreased her size to slightly smaller than an adult human. The latter was done based on feedback from the user group that described her appearance as intimidating.
Demeanour — Her resting facial expressions were described as stern and not welcoming. Overnight the developers changed a few rig dials and presto she was pleasant and warm.
The results of these changes were instantaneous in reducing both anxiety and intimidation. We found that we could further enhance the interaction with facial cues. In response to the person’s dialogue, she would nod and show active listening. Whilst speaking, her facial expressions would match her dialogue. This wasn’t always perfect. Expressions like frowning were created to make her more realistic. When the right dialogue triggered the wrong expression, it was a disaster. However, with some careful UX testing and refinement, it was a powerful communication aid.
Enhancements that weren’t available in the experiment were the use of hands and posture. Our digital human was shown from the shoulders up. In human communication, body language plays an integral part in the overall interaction. The introduction of an upper torso and arms would have enhanced the communication, albeit at the cost of increased complexity of execution.
The Uncanny Reality
Our experiment left us with the understanding that beyond initial surprise, delight and fear, there was still an uncanny valley effect. Considerable work needs to go into making digital humans effective enough to match expectations. Technical artists, armed with game engines, are producing more realistic digital humans than ever before. Deep learning is driving more realistic visuals and natural language interaction. Yet combining everything into a meaningful and frictionless experience has not quite been achieved. Acceptance will remain elusive until this is resolved. Perhaps we will see it initially succeed in the safety of our homes as part of the IoT ecosystem. After all, we once frowned upon using digital assistants, yet Alexa and Siri have found their way into our lives. As I observe progress in this space and see my kids interacting with digital assistants like family, I’m convinced it’s just a matter of time before digital humans become an accepted form of engagement.