A MULTI-METHOD USABILITY EVALUATION: THE NEW SCHOOL ONE STOP CENTER
[USER RESEARCH]
How can we identify opportunities for improvement?
The New School’s One Stop Center launched last year and is still building its infrastructure and designing the interfaces that students interact with. At The New School, the One Stop Center takes a few different forms: physical offices (where students may drop in or set up advising sessions), a phone line, and a chatbot.
This project focuses on developing a proposal to evaluate the usability and effectiveness of the One Stop Service Center Chatbot. Based on analytics reports provided to us, and further conversations with the director of the One Stop Center, we identified international students as our target audience.
Note: This proposal was submitted to the Director of the One Stop Center at The New School.
Designed at The New School for Social Research
2 Weeks, December 2023
User Research, Designing the Experiment
Before we began identifying objectives, we recognised a need to narrow the scope of our inquiry. Given The New School’s diverse student base, we chose a smaller user group for the study so that we could gather more nuanced insights into the user experience. We started with an analytics report covering eight months during which the chatbot was functional; it showed 1% net positive sentiment. The top topics searched included ‘Tuition’, ‘Scholarships’ and ‘International Students’. We also noted that one of the lowest-performing responses, with 9% negative feedback, was for queries about financial aid options for international students. Using this as our starting point, we identified international students as the target user group for this study.
Objectives
- To evaluate the ease and effectiveness of the chatbot for international students.
- To assess the chatbot’s ability to provide useful information about financial aid for international students.
- To explore whether the chatbot’s tone has an effect on users’ perceptions of the interaction.
The Proposal
The task given to participants will be to use the chatbot to find as much information as they can about financial aid options available to international students. The task will be deemed complete when participants feel they have enough information to take the next step.
Participants will be briefed on the whole process, asked for appropriate consent, and asked to fill out a pre-task assessment.
The pre-task assessment will collect demographic data and establish a baseline for the information a participant may already have about the topic. They will then be given a brief orientation about the think-aloud protocol. The process will be explained, along with the expectations of the level of thinking to be verbalised. The participants may choose to speak in their first language to reduce the cognitive load. Participants will be asked to perform a simple practice task like charging a laptop, while thinking aloud.
The user group will then be randomly divided into two: the control group and the experimental group. The control group will interact with the chatbot as is, and the experimental group will interact with an alternative chatbot. Participants can deem the task complete when they have enough information to take the next action.
Following the experiment, participants will be asked to fill out a survey (After Scenario Questionnaire) with three parts. The first part assesses the difficulty of, and satisfaction with, completing the task; items will be rated on a 7-point Likert scale, ranging from ‘strongly agree’ to ‘strongly disagree’. The second part consists of semantic differential questions rating how positive the experience was on a 5-point scale, ranging from ‘very negative’ to ‘very positive’. The last part consists of open-ended questions about the next steps participants can take and the reliability of the chatbot.
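To illustrate how the two rated parts of the survey could be summarised per participant, here is a minimal sketch. It assumes the Likert items are coded 1 (‘strongly agree’) to 7 (‘strongly disagree’) and the semantic differential items 1 (‘very negative’) to 5 (‘very positive’). The function name, the number of items and the use of a simple mean are illustrative assumptions, not part of the proposal.

```python
from statistics import mean


def score_survey(likert_items, semantic_items):
    """Summarise one participant's rated responses.

    Assumed coding (not specified in the proposal):
      likert_items:   1 = strongly agree ... 7 = strongly disagree
      semantic_items: 1 = very negative  ... 5 = very positive
    """
    if not likert_items or not semantic_items:
        raise ValueError("both parts need at least one response")
    if any(not 1 <= v <= 7 for v in likert_items):
        raise ValueError("Likert responses must be between 1 and 7")
    if any(not 1 <= v <= 5 for v in semantic_items):
        raise ValueError("semantic differential responses must be between 1 and 5")
    return {
        # Under the assumed coding, a lower mean means stronger agreement.
        "task_score": mean(likert_items),
        # A higher mean means a more positive experience.
        "experience_score": mean(semantic_items),
    }


# Hypothetical responses from a single participant.
example = score_survey([2, 3, 1], [4, 5, 4, 3])
```

The open-ended third part is not scored here; it would be analysed qualitatively alongside the think-aloud data.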
Note: The next part details the design of the experiment and the alternative chatbot, which was led by me.
- How does actionable information within the response from the bot affect the user experience?
- How does a more conversational tone affect the overall impression of the experience?
The experiment will randomly divide the 20 participants into two groups: the control group and the experimental group. The control group will interact with the chatbot as it is currently programmed. The experimental group will interact with an alternative chatbot, which has a more conversational tone and provides more actionable information. The experiment will be conducted on school computers under conditions suitable for recording the think-aloud.
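The random split could be done with a simple shuffle. The sketch below assumes two equal groups of 10 and uses hypothetical participant IDs (P01 to P20); the fixed seed is only there to make the assignment reproducible and auditable, and is not something the proposal specifies.

```python
import random


def assign_groups(participant_ids, seed=None):
    """Randomly split participants into two equal-sized groups."""
    rng = random.Random(seed)       # local generator, so the split is reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "control": sorted(shuffled[:half]),        # current chatbot
        "experimental": sorted(shuffled[half:]),   # alternative chatbot
    }


# Hypothetical IDs for the 20 participants.
participants = [f"P{n:02d}" for n in range(1, 21)]
groups = assign_groups(participants, seed=2023)
```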
The Procedure
Each participant will go into a school computer lab accompanied by a trained research assistant who will give instructions.
- On a school computer, participants will read and digitally sign an informed consent form, consenting to participate in this study, to being audio and video-recorded, and to having this data used for research purposes.
- Then, the pre-task survey will be administered.
- Participants will be given instructions for the think-aloud protocol by a trained research assistant. The research assistant will then leave the room.
- The task assigned to the participants is to find actionable information about financial aid (scholarships, jobs, monetary relief) options available for international students from the chatbot.
- Participants will be asked to carry out the task on the chatbot depending on their group (current or alternative chatbot).
- The task will be deemed to be complete when the participant feels they have enough information on the topic to take any next steps.
- Once the task is completed, and think-aloud data is collected, the post-task survey will be administered on the school computers.
Alternative Chatbot
A conversational or warmer tone may help users feel more comforted and supported in their experience, improving the overall user experience. More actionable information (contacts of offices or emails of individuals, links) may yield more productive queries in the conversation. A few examples of what the alternative chatbot will look like are elaborated below.
The Welcome Message
This is the first message the user sees when they open the chatbot. The proposed alternative has a warmer, more inviting tone and tells the user that they can change the language if they wish (a feature of the current chatbot). It also invites the user to ask a question.
Standard Response to Known Queries
The chatbot has pre-designed answers for a few questions. In the alternative chatbot, the response to this particular question not only has a more casual and warmer tone but also prompts the user to enter their department. This allows the chatbot to give a more customized response, including contacts the user may email as their next step.
Response to Unknown Queries
Sometimes the chatbot does not have an answer to a question, and it either asks the user to rephrase the question or offers alternative suggestions. In the current design, this loop is endless and can frustrate users. The proposed alternative redirects the user to the most appropriate office after one request to rephrase the question. This might reduce frustration while also giving users an easy next step when the chatbot cannot help them.
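The proposed handling of unknown queries can be sketched as a small rule: ask the user to rephrase once, then redirect. This is an illustration only. The function name, the message wording and the default office are hypothetical, and how the chatbot would pick the most appropriate office is left out.

```python
MAX_REPHRASE_REQUESTS = 1  # the alternative asks the user to rephrase only once


def respond_to_unknown(failed_attempts, office="the One Stop Center"):
    """Decide how to respond to a query the chatbot cannot answer.

    failed_attempts counts consecutive unanswered queries, including the
    current one. Message wording and the default office are placeholders.
    """
    if failed_attempts <= MAX_REPHRASE_REQUESTS:
        return ("rephrase",
                "Sorry, I didn't quite get that. Could you try asking it another way?")
    return ("redirect",
            f"I can't help with this one, but {office} can. Here is how to reach them.")


first_action, _ = respond_to_unknown(1)    # first miss: ask to rephrase
second_action, _ = respond_to_unknown(2)   # second miss: hand off to an office
```

Capping the rephrase requests is what breaks the endless loop described above.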
The Experiment
Screen monitoring software will capture the required information, and it will be stored securely and anonymously. It will be analyzed in conjunction with the think-aloud and survey data.
Information that will be collected:
Time
- Time to respond to the question
- Total Interaction Time
- Note: We realize that recordings of time might not be an accurate representation of interactions (cognitive load and response times might be affected by the think-aloud protocol), but the overall data may be compared to gain insights.
Transcript of the conversation
- Total number of queries the participant makes to receive the information they need.
- The number of questions participants ask on any one sub-topic.
- How long do they keep asking questions before they give up (when the chatbot says ‘Do any of these suggestions help?’)?
How often do they leave the chatbot to check the linked information?
- How much time do they spend on the external links?
- Do they come back to ask more questions?
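Several of the measures listed above could be derived from a timestamped event log of each session. The sketch below assumes a hypothetical log format of (seconds, event type) pairs with the event types "query", "bot_response", "link_open" and "link_close"; the actual screen monitoring software may export something quite different. It covers only a subset of the measures.

```python
def summarize_session(events):
    """Summarise one session from (timestamp_seconds, event_type) pairs.

    The event names used here are an assumed format, not the real export
    of any particular screen monitoring tool.
    """
    if not events:
        return {"total_time": 0.0, "query_count": 0, "link_visits": 0,
                "link_time": 0.0, "returned_to_ask": False}

    events = sorted(events)
    queries = [t for t, kind in events if kind == "query"]
    closes = []
    link_visits = 0
    link_time = 0.0
    opened_at = None

    for t, kind in events:
        if kind == "link_open" and opened_at is None:
            opened_at = t            # participant leaves the chatbot
            link_visits += 1
        elif kind == "link_close" and opened_at is not None:
            link_time += t - opened_at
            closes.append(t)         # participant is back on the chatbot
            opened_at = None

    return {
        "total_time": events[-1][0] - events[0][0],
        "query_count": len(queries),
        "link_visits": link_visits,
        "link_time": link_time,
        # True if any query was asked after returning from an external link.
        "returned_to_ask": bool(closes) and any(q > closes[0] for q in queries),
    }


# A hypothetical session: one query, a 45-second link visit, then a follow-up.
session = [(0.0, "query"), (4.0, "bot_response"), (10.0, "link_open"),
           (55.0, "link_close"), (60.0, "query"), (63.0, "bot_response")]
summary = summarize_session(session)
```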
Thus, this evaluation design will lead to concrete suggestions and solutions for One Stop Student Services about the most effective ways to improve their chatbot system. We might propose, for example, adding automatic follow-up questions that allow the chatbot to better tailor responses to individual students, making the option to switch languages immediately known and visible to users, or relying less on external link redirections.
We recognize that the structured observation setting may influence users' behaviors, but we are confident that our recruitment criteria and the combination of different evaluation methods will lead to a helpful and productive evaluation.
The team consisted of me (Aditi Gunna), Fernanda Queiroz de Vasconcellos and Youyang Li.
We were advised by our faculty, Michael Schober.