A MULTI-METHOD USABILITY EVALUATION: THE NEW SCHOOL ONE STOP CENTER
[USER RESEARCH]
How might we evaluate user satisfaction while interacting with the One Stop Center’s Chatbot?
How can we identify opportunities for improvement?
Attending or graduating from college involves more than just academics: students must also navigate administrative challenges. Universities offer many services to support students with these challenges. The One Stop Center serves as the primary access point for students; its goal is not to solve problems directly but to direct students toward the necessary resources within the university's infrastructure.
The New School’s One Stop Center was launched last year and is currently building its infrastructure and designing the interfaces students interact with. At The New School, the One Stop Center takes a few different forms - physical offices (where students may drop in or set up advising sessions), a phone line, and a chatbot.
This project focuses on developing a proposal to evaluate the usability and effectiveness of the One Stop Center’s chatbot. Based on analytics reports provided to us and further conversations with the director of the One Stop Center, we identified international students as our target audience.
Note: This proposal was submitted to the Director of the One Stop Center at The New School.
The proposal included a comprehensive report detailing the research process. We identified a user group of international students, and designed an experiment to understand their experience while trying to find information regarding financial aid.
Context
The One Stop Center at The New School serves as a support hub for students encountering any form of difficulty; it aims to provide immediate assistance and then guide students to the appropriate department for further support. In its online form, The New School Chatbot functions as a virtual counterpart to the phone lines and advisors, answering student questions and directing them to the appropriate departments. It is available on all New School websites. While it is designed to answer general inquiries, students can also log into their accounts to get personalized information based on their profiles.
Before we began identifying objectives, we recognized a need to narrow the scope of our inquiry. Given The New School’s diverse student base, we set out to define a smaller user group for the study in order to surface more nuanced insights into the user experience. We started with an analytics report covering eight months of the chatbot’s operation, which showed a net positive sentiment of only 1%. The top topics searched included ‘Tuition’, ‘Scholarships’ and ‘International Students’. We also noted that one of the lowest-performing responses, with 9% negative feedback, was for queries about financial aid options for international students. Using this as our starting point, we identified international students as the target user group for this study.
Evaluation Objectives
The research proposal is designed with three main objectives:
To evaluate the ease of use and effectiveness of the chatbot for international students.
To assess the chatbot’s ability to provide useful information about financial aid for international students.
To explore whether the chatbot’s tone has an effect on users’ perceptions of the interaction.
Overview of the Proposal
We proposed a multi-method approach: an experiment layered with the think-aloud protocol (participants verbalize their thoughts as they perform tasks) and surveys, so that we can collect different kinds of data during the research process. Twenty international students will be recruited to participate; participants will have the option to take part in their first language (with the help of translators, a manual will be developed to translate and code the findings).
The task given to participants will be to use the chatbot to find as much information as they can about financial aid options available to international students. The task will be deemed complete when participants feel they have enough information to take the next step.
Each participant will be briefed on the whole process, appropriate consent will be obtained, and they will be asked to fill out a pre-task assessment.
The pre-task assessment will collect demographic data and establish a baseline for what a participant may already know about the topic. Participants will then be given a brief orientation on the think-aloud protocol: the process will be explained, along with expectations about the level of thinking to be verbalized. Participants may choose to speak in their first language to reduce cognitive load. They will be asked to perform a simple practice task, such as charging a laptop, while thinking aloud.
The participants will then be randomly divided into two groups: a control group, which will interact with the chatbot as is, and an experimental group, which will interact with an alternative chatbot.
Following the experiment, participants will be asked to fill out a survey (an After-Scenario Questionnaire) with three parts. The first part assesses difficulty of and satisfaction with task completion; items will be rated on a 7-point Likert scale, ranging from ‘strongly agree’ to ‘strongly disagree’. The second part consists of semantic differential questions rating how positive the experience was on a 5-point scale, ranging from ‘very negative’ to ‘very positive’. The last section consists of open-ended questions about the next steps participants could take and the reliability of the chatbot.
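To make the survey structure concrete, here is a minimal sketch of how the three parts might be encoded for analysis; the item wordings are hypothetical placeholders, not the final instrument.

```python
# Hypothetical encoding of the three-part post-task survey.
# All item wordings below are placeholders, not the final instrument.

likert_items = [  # Part 1: rated 1 = strongly disagree ... 7 = strongly agree
    "Overall, I am satisfied with the ease of completing this task.",
    "Overall, I am satisfied with the amount of time it took to complete this task.",
]

semantic_items = [  # Part 2: rated 1 = very negative ... 5 = very positive
    "How would you rate your overall experience with the chatbot?",
]

open_ended = [  # Part 3: free-text responses, coded qualitatively
    "What next steps, if any, would you take based on the chatbot's answers?",
    "How reliable did you find the information the chatbot gave you?",
]

def mean_rating(responses: list[int], scale_max: int) -> float:
    """Average one participant's ratings on one part of the survey."""
    assert all(1 <= r <= scale_max for r in responses)
    return sum(responses) / len(responses)
```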
Note: The next part details the design of the experiment and the alternative chatbot, which was led by me.
The Experiment
The One Stop Center is the first point of contact for students who may be in distress or urgently need information. The chatbot is the Center’s first line of communication online, and it should be able to communicate with students and provide them with useful information. The goal of this experiment is to understand:
How does actionable information within the response from the bot affect the user experience?
How does a more conversational tone affect the overall impression of the experience?
The experiment will randomly divide the 20 participants into two groups: the control group and the experimental group. The control group will interact with the chatbot as it is currently programmed. The experimental group will interact with an alternative chatbot, which has a more conversational tone and provides more actionable information. The experiment will be conducted on school computers, under conditions appropriate for recording the think-aloud.
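As an illustration, the random split could be generated as in the sketch below; the participant IDs and the fixed seed are hypothetical choices, with the seed included only to make the assignment reproducible for the research record.

```python
import random

# A minimal sketch of the random group assignment for the 20 recruited
# students. Participant IDs and the seed value are hypothetical.
participants = [f"P{i:02d}" for i in range(1, 21)]
rng = random.Random(2024)  # fixed seed: keeps the split reproducible
rng.shuffle(participants)

control_group = sorted(participants[:10])       # current chatbot
experimental_group = sorted(participants[10:])  # alternative chatbot
print("Control:", control_group)
print("Experimental:", experimental_group)
```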
The Procedure
Each participant will go into a school computer lab accompanied by a trained research assistant who will give instructions.
On a school computer, participants will read and digitally sign an informed consent form, consenting to participate in this study, to being audio and video-recorded, and to having this data used for research purposes.
Then, the pre-task survey will be administered.
Participants will be given instructions for the think-aloud protocol by a trained research assistant. The research assistant will then leave the room.
The task assigned to the participants is to find actionable information about financial aid (scholarships, jobs, monetary relief) options available for international students from the chatbot.
Participants will be asked to carry out the task on the chatbot depending on their group (current or alternative chatbot).
The task will be deemed to be complete when the participant feels they have enough information on the topic to take any next steps.
Once the task is completed, and think-aloud data is collected, the post-task survey will be administered on the school computers.
Design of the Alternative Chatbot
The design of the alternative chatbot takes into consideration the key goals of this experiment and proposes a change in the responses on two fronts - a more conversational tone, and more actionable information in the responses.
A more conversational, warmer tone may help users feel comforted and supported, improving the overall user experience. More actionable information (office contacts, individuals’ email addresses, links) may lead to more productive queries in the conversation. A few examples of what the alternative chatbot will look like are elaborated below.
The Welcome Message
This is the first message the user sees when they open the chatbot. The proposed alternative has a warmer, more inviting tone; it also asks the user whether they would like to change the language (a feature of the current chatbot) and invites them to ask a question.
Standard Response to Known Queries
The chatbot has a few questions for which it has pre-designed answers. In the alternative chatbot, the response to this particular question not only has a more casual, warmer tone but also prompts the user to enter their department. This allows the chatbot to give a more customized response, where the user may email the listed contacts as their next step.
Response to Unknown Queries
Sometimes the chatbot does not have an answer to a question and asks the user to rephrase it or offers alternative suggestions. In the current design, this loop is endless and can frustrate users. The proposed alternative redirects the user to the most appropriate office after a single request to rephrase the question. This might reduce frustration while also providing users with easy next steps when the chatbot cannot help them.
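To make the proposed fallback behavior concrete, here is a minimal sketch of the "rephrase once, then redirect" logic, assuming a simple turn-based handler; the function name, messages, and contact address are hypothetical placeholders, not the chatbot's actual implementation.

```python
# Sketch of the proposed fallback for unknown queries: ask the user to
# rephrase exactly once, then redirect to a human office instead of
# looping. Messages and the contact address are placeholders.

def fallback_response(rephrase_count: int) -> tuple[str, int]:
    """Return the bot's reply and the updated rephrase counter."""
    if rephrase_count == 0:
        # First miss: one (and only one) request to rephrase.
        return ("I'm not sure I understood that. Could you rephrase "
                "your question?", 1)
    # Second miss: break the loop and hand off with an actionable next step.
    return ("I don't seem to have an answer for that. The One Stop Center "
            "team (placeholder contact: onestop@example.edu) can help you "
            "directly.", 0)

# Usage: the counter resets whenever a query is answered successfully.
reply, count = fallback_response(0)      # asks the user to rephrase
reply, count = fallback_response(count)  # redirects to the office
```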
Data Collection from the Experiment
From the experiment, we hope to gain insight into how users react to the responses, how they phrase queries, and what their thought process is during the interaction. Quantitative data will also be collected to gain further insight into the user experience.
Screen monitoring software will capture the required information, and it will be stored securely and anonymously. It will be analyzed in conjunction with the think-aloud and survey data.
Information that will be collected (a sketch of how some of these metrics might be derived from the logs follows the list):

Time
- Time to respond to each question
- Total interaction time

Note: We realize that recordings of time might not be an accurate representation of interactions (cognitive load and response times might be affected by the think-aloud protocol), but the overall data may still be compared across groups to gain insights.

Transcript of the conversation
- Total number of queries the participant makes to receive the information they need
- Number of questions participants ask on one sub-topic
- How long they keep asking questions before they give up (e.g., when the chatbot says ‘Do any of these suggestions help?’)
- How often they leave the chatbot to check the linked information
- How much time they spend on the external links
- Whether they come back to ask more questions
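As an illustration of how some of these metrics might be derived, the sketch below assumes a hypothetical event log of timestamped user and bot actions; the Event format is our own invention for this sketch, not the monitoring software's actual output.

```python
from dataclasses import dataclass

# Sketch of deriving the listed metrics from a hypothetical event log.
# The Event format is an assumption, not the monitoring software's output.

@dataclass
class Event:
    t: float     # seconds since session start
    actor: str   # "user" or "bot"
    kind: str    # "query", "response", "open_link", "return"

def interaction_metrics(events: list[Event]) -> dict:
    """Compute query counts, total time, and time spent on external links."""
    queries = [e for e in events if e.actor == "user" and e.kind == "query"]
    total_time = events[-1].t - events[0].t if events else 0.0

    # Pair each open_link with the next return to total time off the chatbot.
    link_time, opened_at = 0.0, None
    for e in events:
        if e.kind == "open_link":
            opened_at = e.t
        elif e.kind == "return" and opened_at is not None:
            link_time += e.t - opened_at
            opened_at = None

    return {
        "total_queries": len(queries),
        "total_interaction_time": total_time,
        "time_on_external_links": link_time,
        "link_visits": sum(1 for e in events if e.kind == "open_link"),
    }
```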
Conclusion
This multi-method evaluation nests three different research methods so that we can gather all of the data (from the experiment, the think-aloud, and the survey) for both versions of the chatbot. This will allow us not only to see whether the proposed experimental version of the chatbot is an improvement over the current version, but also to see in which specific ways it does or does not improve on the existing chatbot: does it make the experience more positive for users? Is it easier to use? Does it require fewer queries to provide all the necessary information?
Thus, this evaluation design will lead to concrete suggestions and solutions for One Stop Student Services about the most effective ways to improve their chatbot system. We might propose, for example, adding automatic follow-up questions that allow the chatbot to better tailor responses to individual students, making the option to switch languages immediately known and visible to users, or relying less on external link redirections.
We recognize that the structured observation setting may influence users' behaviors, but we are confident that our recruitment criteria and the combination of different evaluation methods will lead to a helpful and productive evaluation.
Team
This project was undertaken as coursework, part of the Intro to Applied Psychology and Design course at the New School for Social Research.
The team consisted of me (Aditi Gunna), Fernanda Queiroz de Vasconcellos, and Youyang Li. We were advised by our faculty member, Michael Schober.