Ava – an Enhanced Conversational User Experience

Within my Bachelor thesis for the degree "User Experience Design" (B.Sc., First Class Degree) I conducted independent research in the field of customer service chatbots. The results of the five-month study showed great potential for improved human-chatbot interactions and revealed new business opportunities.

  • PROJECT TYPE
    B.Sc. thesis project
  • MY ROLE
    UX Researcher, Developer
  • DURATION
    Five Months
  • TEAM SETUP
    Supervisor, co-supervisor and myself

Please note: This case study condenses five months of research and is rather detailed to provide context. There is a TL;DR section that gives a good overview of the results.

Problem

Because chatbots still fail relatively often, the goal of my bachelor thesis was to optimize the customer’s "handover experience". There is broad consensus that when communication with a chatbot fails, the dialog must be handed over to a human customer service representative. While the unresolved service request is handed over, a waiting time arises that is potentially harmful to businesses. In an unmanaged handover, customers have no information about what happens next, and the resulting uncertainty about the wait may lead to negative emotions. Customers are also left alone with their unresolved request and may perceive apathy on the part of the company while they wait. My thesis addressed the question of how handovers can be optimized to create positive experiences and increase the perceived quality of service.

Constraints

  • Human-Chatbot Interaction is still an under-researched field, and no prior research has been conducted on chatbot handovers. As a result, there is no established knowledge about user needs and the problems that arise in handover situations.
  • Potential solutions need to be assessed from a user perspective ("positive waiting experiences") and, at the same time, with reliable business metrics ("perceived quality of service") to reach a solution that accounts for both user and business goals.
Time framework

Research Approach

How the Problem was Tackled

To better understand the users’ thoughts, fears, wishes and mindset in situations that provoke a handover between the chatbot and a customer service representative, I started with a qualitative approach: semi-structured interviews with seven people. In parallel, I conducted an extensive literature review and analyzed which strategies companies apply in classical offline waiting contexts to make waiting more convenient and less harmful to their business. Furthermore, I studied the perception and psychology of waiting (times).

Backed by the qualitative insights and knowledge from prior research, I derived three chatbot prototypes intended to show what makes a positive waiting experience for customers and supports a high perceived quality of service. The prototypes were implemented in a realistic shop scenario as a custom-built Angular application that used Google Dialogflow’s natural language processing capabilities. A multi-day laboratory user study with 28 users was then conducted to evaluate the prototypes empirically. Each user completed three tasks (e.g. tracking a package) and experienced one handover prototype per task. Data was collected through questionnaires, emotion recognition via facial analysis, and measurement of the subjects’ electrodermal activity. The data was then analyzed statistically with respect to the subjective waiting experience, the resulting emotional affect, and the perceived service quality.
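A handover has to be triggered somewhere in such an application. The following TypeScript sketch shows one way that decision could look; it is not taken from Ava's code. It assumes a Dialogflow ES detect-intent response, of which only the `queryResult.intent.isFallback` and `queryResult.intentDetectionConfidence` fields are mirrored in a local interface. The function names, the confidence threshold of 0.5 and the limit of two failed turns are hypothetical choices for illustration.

```typescript
// Minimal local mirror of the Dialogflow ES response fields used here.
// Only these two fields are assumed; everything else is omitted.
interface DetectIntentResult {
  queryResult: {
    intent?: { isFallback?: boolean };
    intentDetectionConfidence?: number;
  };
}

// Hypothetical tuning values, not taken from the study.
const MIN_CONFIDENCE = 0.5;
const MAX_FAILED_TURNS = 2;

// A turn counts as failed if the fallback intent matched
// or the match confidence is below the threshold.
function isFailedTurn(result: DetectIntentResult): boolean {
  const intent = result.queryResult.intent;
  const rawConfidence = result.queryResult.intentDetectionConfidence;
  const isFallback = intent !== undefined && intent.isFallback === true;
  const confidence = rawConfidence === undefined ? 0 : rawConfidence;
  return isFallback || confidence < MIN_CONFIDENCE;
}

// Hand over once the most recent turns have all failed.
function shouldHandOver(history: DetectIntentResult[]): boolean {
  if (history.length < MAX_FAILED_TURNS) return false;
  return history.slice(-MAX_FAILED_TURNS).every(isFailedTurn);
}
```

Requiring several failed turns in a row, rather than one, is a design choice that gives the bot a chance to recover from a single misunderstood message before a human is involved.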

My research process

Qualitative Insights

Developing Empathy for the Users

To identify and understand the users’ pain points and wishes that an enhanced conversational experience ("managed handover") must address, I conducted semi-structured interviews with seven subjects from mixed backgrounds.

A scenario was read out to the subjects, and they were asked to put themselves in the situation described. The scenario covered the common chatbot use case of requesting an order’s delivery status. The chatbot gave only a partially satisfying answer ("order has not been delivered right now") and was not able to say when the shipment would take place. An open ending suggested that a real customer service representative could help with this issue, but no follow-up process (further communication channel, answer time, chatbot autonomy) was defined. This allowed the participants to come up with their own ideas and wishes for how the situation should be resolved. The interview covered different areas of interest (desired process, forwarding the dialog, etc.), and the subjects were free to share all their thoughts and impressions.

Empathy map

The qualitative insights were gathered, categorized and synthesized: The resulting Empathy Map shows the findings about what users said, thought, did, and felt when they experienced a failed customer service chatbot conversation.

Interim Findings

  • User Goal: Users want a fast solution for their service inquiry which couldn’t be entirely answered by the chatbot. This makes a handover to a customer service representative obligatory.
  • Pain Points: Users feel that they have already (mis)invested time, are uncertain about the following process, and face boredom while waiting for a service employee.

Hypotheses & Prototypes

Combining user needs & business goals

The qualitative insights show that users want a quick resolution of their service inquiry. They indicated that a waiting time of one to two minutes, and sometimes even five minutes, is considered fair. However, they insisted that the failed dialog must be handed over so that they do not have to explain their problem again. A managed chatbot handover should also address their uncertainty about the waiting time, their boredom while waiting, and their disappointment over the failed dialog.

These findings inspired me to ideate on different handover prototypes, each addressing one of the problems or desires. The following ideas emerged:

  • "The Reporter": The user receives continuous updates about the remaining waiting time and the process through an animated and playful interface. This maximum of information should give users the feeling of being cared for and reduce their uncertainty about the wait.
  • "The Gambler": Users can choose from a variety of mini-games that can be played right in the chatbot. This alternative activity should help customers bridge the waiting time.
  • "The Attentive Engager": An ongoing chatbot dialog actively engages users while they wait, which keeps the communication between the company and the user going. Users can choose from a variety of topics: current offers, general information about the company, interesting facts on how a chatbot works, etc.
  • "The Pusher": This prototype allows users to subscribe to auditory and visual push notifications that inform them when a solution to the problem is ready.
  • ("The Status Quo"): An animated progress indicator and the text "please wait…" show the user that the chat is being handed over to a customer service rep (no alternative engagement, no time information). This idea simply depicts the status quo of unmanaged chatbot handovers, which do not take any user needs into account; it is a negative example that the following empirical study should reject. It was nevertheless implemented to observe the severity of unmanaged handovers.
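To make the idea behind "The Reporter" more concrete, here is a minimal TypeScript sketch of how a status update could be composed from the queue position and the remaining waiting time. The `HandoverStatus` shape, the function names and the exact wording of the messages are assumptions for illustration, not the prototype's actual implementation.

```typescript
// Hypothetical shape of the data a queue backend would supply.
interface HandoverStatus {
  customersAhead: number;
  remainingSeconds: number;
}

// Format seconds as m:ss; negative values are clamped to zero.
function formatWait(totalSeconds: number): string {
  const clamped = Math.max(0, Math.round(totalSeconds));
  const minutes = Math.floor(clamped / 60);
  const seconds = clamped % 60;
  return minutes + ":" + (seconds < 10 ? "0" : "") + seconds;
}

// Build the update text shown to the waiting user.
function reporterMessage(status: HandoverStatus): string {
  const wait = "Estimated wait: " + formatWait(status.remainingSeconds) + " min.";
  if (status.customersAhead <= 0) {
    return "A service representative is currently taking care of your request. " + wait;
  }
  const queue =
    status.customersAhead === 1
      ? "1 customer is ahead of you."
      : status.customersAhead + " customers are ahead of you.";
  return queue + " " + wait;
}
```

Calling `reporterMessage` on every tick of a timer or on every queue change would yield the continuous updates described for this prototype.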

The Implemented Prototypes

Because of the high development effort and the statistical requirements on sample size and selectivity, I decided to implement three prototypes for the following user study. For this, I developed the chatbot Ava (artificial virtual agent) as a custom-built Angular app that uses Google’s Dialogflow for NLP. Ava was embedded in a realistic online shop environment for the fictitious company MyShop24.

"The Reporter" seemed to be the most desirable one from the perspective of transparency and care. "The Attentive Engager" allowed users to fill the waiting time with an alternative engagement that could make sense both for the client (user-centric dialog around interests) and for the business (marketing purposes). "The Status Quo" was important to show the negative effects of unmanaged chatbot handovers on the user experience and the perceived quality of service, which may cause economic damage. "The Gambler" and "The Pusher" are interesting concepts that potentially allow for a fun user experience or autonomy. However, they seemed to be more of a feature of either "The Reporter" or "The Attentive Engager" than a holistic solution.

Selected Hypotheses

The different prototypes come with a variety of hypotheses regarding the perceived quality of service, perceived waiting time, stress while waiting, users’ uncertainty, and emotional affect. A selection of these hypotheses, together with their results, is presented in the Results section (more upon request).

Multi-Day User Study

Putting the Prototypes to the Test

In a four-day laboratory study, 28 employees and students used the three prototypes in a pseudo-randomized, task-based user study. In every condition, the user had to chat with the chatbot and got only a partially satisfying answer, which provoked a handover. The study comprised a multitude of (partially adjusted) questionnaires, such as Time Perception (Hinz), Satisfaction and Service Quality (Yang et al., Parasuraman et al.), the Positive-Affect-Negative-Affect-Schedule (Watson et al.), and the Scale of Psychological Needs (Sheldon et al.). Stress was also measured through the users' electrodermal activity. Emotions were analyzed through facial recognition with the emotion-as-a-service tool Affectiva (Affdex). Short closing interviews and a ranking of the prototypes rounded out the data set.

Study Design

The graphic above shows the study design. Participants used each of the handover prototypes (HO) in connection with a specific use case (UC). The waiting experience was assessed on the basis of a 183-second waiting time in every condition.
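The exact pseudo-randomization scheme is not spelled out here, so the following TypeScript sketch only illustrates one common way to balance the order of three conditions across participants: enumerate all six possible orders, shuffle them with a seeded generator, and hand them out in rotation. The function names, the seed and the small generator are assumptions for illustration; the pairing with use cases is not modeled.

```typescript
type Prototype = "Status Quo" | "Reporter" | "Attentive Engager";

const PROTOTYPES: Prototype[] = ["Status Quo", "Reporter", "Attentive Engager"];

// All orderings of a list (three prototypes give six orderings).
function permutations<T>(items: T[]): T[][] {
  if (items.length <= 1) return [items.slice()];
  const result: T[][] = [];
  items.forEach((item, i) => {
    const rest = items.slice(0, i).concat(items.slice(i + 1));
    permutations(rest).forEach((tail) => result.push([item].concat(tail)));
  });
  return result;
}

// Small seeded generator so an assignment can be reproduced later.
function seededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Fisher-Yates shuffle on a copy of the input.
function shuffle<T>(items: T[], random: () => number): T[] {
  const copy = items.slice();
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = copy[i];
    copy[i] = copy[j];
    copy[j] = tmp;
  }
  return copy;
}

// Give each participant one ordering, cycling through the shuffled orderings
// so that every ordering is used almost equally often.
function assignOrders(participants: number, seed: number): Prototype[][] {
  const orders = shuffle(permutations(PROTOTYPES), seededRandom(seed));
  const assigned: Prototype[][] = [];
  for (let p = 0; p < participants; p++) {
    assigned.push(orders[p % orders.length]);
  }
  return assigned;
}
```

With 28 participants and six orderings, `assignOrders(28, seed)` uses four orderings five times and two orderings four times, which is as balanced as this sample size allows.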

Results

Enhanced Chatbot Waiting Experiences

Generally speaking, "The Status Quo" of unmanaged chatbot handovers, which provide no information about the waiting time or the following process, led to an unsatisfying waiting experience marked by uncertainty. It subsequently caused a potentially harmful low perceived quality of service. Providing waiting time information turned out to be obligatory in chatbot handovers ("The Reporter"). The reported process information ("One customer is in front of your…The service representative is currently taking care of your request.") likewise provided transparency and reduced the fear of being forgotten. Both the time and the process component can be considered critical factors for a high perceived quality of service. "The Attentive Engager" led to a very positive and convenient waiting experience, and the perceived length of the waiting time was reduced drastically. The alternative engagement led to increased acceptance of waits in general and ultimately resulted in a higher perceived quality of service.

Selected Hypotheses and Results

  • H1 The provision of time/process information or an alternative user engagement while waiting results in a higher perceived quality of service. – Hypothesis can be confirmed: this shows the great economic & CRM potential of managed chatbot handovers.
  • H2 A low degree of information about the waiting time and following process, as well as no alternative user engagement while waiting, causes stress. – Hypothesis cannot be confirmed; however, small tendencies were visible in the users’ electrodermal activity.
  • H4 Actively engaging users in chatbot waits significantly decreases the perceived length of a waiting time. – Hypothesis can be partially confirmed: no engagement led to the longest perceived waiting time. However, "The Reporter" and "The Attentive Engager" did not differ significantly. I assume this is because the continuous updates and the animation already provide some form of engagement.
  • H5 A communicative chatbot that actively engages users through an ongoing dialog while waiting evokes a positive emotional affect. – Hypothesis can be confirmed. Interacting with a chatbot is still a relatively novel experience and users have fun while interacting with bots.

Implications for Future Customer Service Chatbots

  • Managing chatbot handovers involves general considerations as well as practical measures. First, aspects such as forwarding the dialog and appropriate average waiting times must be considered. Second, providing time and process information, as well as an alternative engagement within the bot, has a considerable effect on user satisfaction, service quality and more.
  • A symbiosis of "The Reporter" and "The Attentive Engager" was often desired (however, for reasons of statistical selectivity, the two were not combined in one solution).
  • Surprising and most important for businesses: The users indicated great interest in the ongoing chatbot dialog that bridges the waiting time. This makes a conversation tailored to the particular user's interests (important data!) possible, which can ultimately be used to generate new revenue, for example through cross-selling and up-selling.

Learnings

What I've learned from this project

  • Qualitative insights are so powerful and there are great tools around to handle, categorize and code the massive input easily: I used NVivo for the content analysis of the interviews and can absolutely recommend it for larger qualitative data sets.
  • Although chatbots still fail relatively often, Ava, based on Google Dialogflow, did a pretty robust job and didn’t disappoint me in the user tests. To get there, I had to train the chatbot with a lot of possible inputs for each user intent.
  • Having great supervisors is a huge asset when it comes to planning a study and choosing the different approaches. Thanks again to Prof. Dr. Andreas Riener and Philipp Wintersberger for the guidance!