As digital assistants invade our lives, their opportunities and limitations are becoming more evident. Natural language processing and its digital manifestations, reinforced by machine learning, are developing rapidly. When the opportunity to understand and evaluate digital human technology came along, I leapt into the deep end with a team of equally enthusiastic specialists. As I write up the learnings, I thought it would be worthwhile to describe the setup for context.
The scope was to learn as much as we could about this new type of interaction. The team engaged with stakeholders and used a series of lean and design-thinking workshops to come up with assumptions to test and use cases to apply. Perhaps the most critical questions we wanted answered were: are people ready for this type of interaction? And what variables could be changed to improve the experience? For this, we would need to set up a digital human, powered by AI, that could understand and execute a task based on natural language processing.
The setup was a 32-inch screen displaying the digital human in portrait orientation, a Microsoft Kinect camera, and a microphone, all driven by a gaming machine and third-party software. The software used IBM Watson, Unreal Engine, and Google Dialogflow.
We tested screen sizes and determined that the 32-inch size was less intimidating and rendered the digital assistant at a relatable human size. Portrait orientation tested better than landscape because it worked well for an interaction 1–1.5 meters away and felt more natural. Portrait is less useful for displaying information but more suitable for a digital human, which we prioritized.
The motion-sensing camera was a critical component. It defined the active area in which the interaction would begin. When someone stepped into the area, the digital human would start with something like, “Good morning, how can I assist you today?” Later we also added a physical indicator (a coloured tile) to show people where to stand. This was important not only for starting the conversation but also for ensuring that people stayed in place. It aided the camera's other function, which was to reset the experience when people walked away. Walk out of the box, and she would cease talking and return to her resting state.
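To make the enter-and-reset behaviour concrete, here is a minimal sketch in Python of how such a trigger could work. It is not the code we ran: the names (`ActiveZone`, `PresenceController`), the zone dimensions, and the idea of a short grace period before resetting are all assumptions for illustration, and the tracked position is assumed to arrive from the camera as a simple `(x, z)` floor coordinate in meters.

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    RESTING = auto()   # nobody in the box, digital human idle
    ENGAGED = auto()   # someone is standing in the box


@dataclass
class ActiveZone:
    """Rectangular floor area in front of the screen (hypothetical values, meters)."""
    x_min: float = -0.5
    x_max: float = 0.5
    z_min: float = 1.0   # distance from the screen
    z_max: float = 1.5

    def contains(self, x: float, z: float) -> bool:
        return self.x_min <= x <= self.x_max and self.z_min <= z <= self.z_max


class PresenceController:
    """Turns per-frame positions into 'greet' and 'reset' events."""

    def __init__(self, zone: ActiveZone, exit_grace_frames: int = 3):
        self.zone = zone
        # Assumption: wait a few frames before resetting, so a brief
        # tracking dropout does not end the conversation.
        self.exit_grace_frames = exit_grace_frames
        self.state = State.RESTING
        self._frames_outside = 0

    def update(self, position):
        """position is an (x, z) tuple, or None when nobody is tracked."""
        inside = position is not None and self.zone.contains(*position)

        if self.state is State.RESTING:
            if inside:
                self.state = State.ENGAGED
                self._frames_outside = 0
                return "greet"
            return None

        # State is ENGAGED from here on.
        if inside:
            self._frames_outside = 0
            return None
        self._frames_outside += 1
        if self._frames_outside >= self.exit_grace_frames:
            self.state = State.RESTING
            self._frames_outside = 0
            return "reset"
        return None
```

Calling `update()` once per camera frame returns `"greet"` on the frame a person first steps onto the tile and `"reset"` once they have been outside it for the grace period; every other frame returns `None`.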
The directional microphone was another critical bit of hardware, and it was where we had some issues. I’m referring to GIGO (garbage in, garbage out). The gain had to be adjusted to ensure the speaker came through clearly whilst background noise was ignored. This became a daily affair due to changes in ambient noise. It led to a lot of discovery around the physical joinery setup, optimizing location, and sound dampening.
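We tuned the gain by hand, but the daily recalibration can be pictured as a simple noise gate: measure the ambient level, then only pass audio that is clearly louder. The sketch below is an illustration under my own assumptions, not what the third-party software did. The function names, the 10 dB margin, and the frame format (lists of samples scaled between -1.0 and 1.0) are all hypothetical.

```python
import math
from typing import Iterable, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of one audio frame."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def to_dbfs(level: float, floor_db: float = -120.0) -> float:
    """Convert a linear level (1.0 = full scale) to decibels, clamped at floor_db."""
    if level <= 0.0:
        return floor_db
    return max(floor_db, 20.0 * math.log10(level))


def calibrate_gate(ambient_frames: Iterable[Sequence[float]],
                   margin_db: float = 10.0) -> float:
    """Return a gate threshold in dBFS from frames recorded with nobody speaking.

    Uses the middle ambient level (upper median for even counts) plus a
    safety margin. The margin is an assumed starting point, not a measured value.
    """
    levels = sorted(to_dbfs(rms(frame)) for frame in ambient_frames)
    if not levels:
        raise ValueError("need at least one ambient frame to calibrate")
    return levels[len(levels) // 2] + margin_db


def is_speech(frame: Sequence[float], threshold_db: float) -> bool:
    """True when the frame is louder than the calibrated threshold."""
    return to_dbfs(rms(frame)) > threshold_db
```

Re-running `calibrate_gate` on a few seconds of room noise each morning would stand in for the manual adjustment we made; frames for which `is_speech` is false would simply not be sent on for speech recognition.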
The sound from the digital assistant initially came from the screen speaker, but we learnt that a directional speaker provided greater privacy and less noise pollution, both for those nearby and for the microphone.
The setup could be intimidating for anyone. Not many enjoy talking to Siri in public, and a digital human presents an even greater challenge. We found that screening on one side provided more privacy and reduced noise pollution.
Our learnings went into conceiving a design prototype that could be used in an appropriate space within a bricks-and-mortar environment.
The prototype worked to combine the perception of privacy, welcoming interaction, acoustic panelling, and guidance to achieve the optimal experience.
Lastly, assembling the right team to set up and run experiments on disruptive technology was critical. The specialists involved were:
- Business Stakeholders — to represent the business and its use cases.
- Designer — to work on the user experience.
- User Experience Researchers — to define the experiments, test and survey.
- Content Writer — to assist with the script and language.
- Developer — to refine the solution.
- Technicians — to assist with setup and configuration.
- Product Manager — to lead the team and direction of the solution.
- The Voice — one of our content writers had an amazing voice, which was synthesized for use.
I expected resistance in getting this team assembled, given their busy day jobs. However, the journey of discovery for this new user experience was too remarkable to miss. I formed the project team in days, and we very quickly began to scope the experiment with our business partners.
The learnings were too great to cover here, so I’ve written about them in a series I hope you find interesting and useful.
No digital humans were harmed in our experimentation!