Roche Blood Glucose Meter

A Summative Usability Evaluation

The Challenge

Demonstrate that the Roche Cobas Pulse blood glucose meter can be used safely by the intended user groups, and note any design or usability flaws of the device.

Background

Roche is a multinational healthcare organization based in Europe. They sell medical devices used both in hospitals and home settings all over the world. Among these devices are blood glucose meters, which measure a patient's blood sugar and track insulin intake.

We were contracted by Roche to perform a summative usability test on a newly developed blood glucose meter as part of their FDA submission process.

Compared to a traditional usability test, a summative usability test for an FDA submission has a different structure, is subject to specific rules, and follows a different set of best practices. It is a required part of a company's submission to the FDA for a medical device to be sold in the U.S., and is often the penultimate or final step before a company makes this submission. In summative testing, we still note user behaviors and try to understand them, as in any user testing, but the primary goal is to show that the product can be used safely and effectively as is. Less emphasis is placed on how to change or improve the product.

One immediate difference is that the research team will conduct a training session with each participant shortly before beginning the usability test. During this training, the participant will be introduced to the device and the testing environment and given a thorough training on safely using the device. This training teaches the participant everything they need to know to use the device during the usability test, and they are allowed to take any notes they would like.

The process of moderating a summative usability evaluation is also unique. Rather than freely probing participants and asking follow-up questions based on their responses and behaviors, it is of the utmost importance to adhere to the same protocol and word choice in every session, so that the sessions stay standardized. Instead of following up in the moment, there is a dedicated section at the conclusion of the session to probe participants about unexpected behaviors, understand why they behaved in certain ways, and collect subjective feedback about the device.

My Role

For this project, I was the lead researcher and moderator, and I was assisted by a colleague who acted as a notetaker during this study.

Due to a non-disclosure agreement with Roche, I am limited in the amount of analysis and findings I can disclose. If you'd like to hear more, please contact me at Nick.weinel@gmail.com

Testing Structure and Methodology

Our goal with this summative usability test was to demonstrate that this blood glucose meter could be used safely by the intended user groups, and to note any design or usability flaws of the device. To do so, we conducted 15 test sessions per user group and created a set of task scenarios for participants to complete, which together covered all major identified risks of the blood glucose meter.

Test Structure

Our participants for this research consisted of 15 Nurses who worked in hospital environments, and 15 Point of Care Coordinators who were providing special care to diabetic patients in clinics or similar environments.

Each session lasted 90 minutes in total, and was broken up into 3 parts: a “free exploration” period, an evaluation, and a debrief.

During the evaluation portion, one member of the research team acted as a mock patient as participants navigated through 6 test scenarios. Each test scenario contained a number of subgoals, which represented specific behaviors or actions within the test scenario that participants were evaluated on. Throughout the evaluation, neither the moderator nor the notetaker was able to give any additional assistance to the participant, outside of repeating the test scenario and providing scripted intervention when called for in the testing plan.

Our testing facility, created with guidance from the manufacturer to emulate a hospital room

To keep each test session consistent, we used a moderation guide which contained information such as:

  • Scripts for the introduction portion of the evaluation, and to introduce each task scenario to the participant

  • A checklist of preparations to be made before each test scenario

  • A subgoal breakdown with a description of pass/fail conditions for each subgoal

  • Scripted interventions and anticipated participant issues

Analyzing the Results

Scoring the Results

Participants were scored on a 4-point scale for how they completed each subgoal of a test scenario. We defined each of these scores as follows:

Success

The participant completed the task without any use errors and was able to understand the product's display, controls, and instructions without difficulty.


Use Difficulty

A brief hesitation, struggle, or confusion that the participant worked through to complete the task. Indicates when a subgoal was completed without observed use errors, but inefficiently.

Close Call

The participant committed a use error but corrected it within the same test scenario and went on to perform the task successfully. The way they recovered from the error would not pose a risk to the patient or user in a real-world scenario.

Use Error

An action or lack of action that leads to a different result than intended or expected. This includes the inability of the participant to complete a task.

When a Close Call or a Use Error was observed during testing, the moderator would probe the participant on what they were thinking during the task, discuss the observed close calls or use errors, and find out why they acted (or failed to act) as they did. The moderator and notetaker rated all tasks during the sessions and compared their ratings afterward, reviewing their observation notes or video recordings if needed.
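The four ratings and the post-session comparison between moderator and notetaker can be sketched in Python. This is only an illustration under stated assumptions, not the tooling we used in the study: the `Rating` enum, the `find_disagreements` and `needs_follow_up` helpers, and the scenario/subgoal keys are all hypothetical names, and the ratings are made up.

```python
from enum import Enum


class Rating(Enum):
    """The four subgoal ratings defined above."""
    SUCCESS = "Success"
    USE_DIFFICULTY = "Use Difficulty"
    CLOSE_CALL = "Close Call"
    USE_ERROR = "Use Error"


def find_disagreements(moderator, notetaker):
    """Return the subgoals the two raters scored differently.

    Both arguments map a (scenario, subgoal) key to a Rating.
    A subgoal rated by only one person is also flagged.
    """
    flagged = {}
    for key in sorted(moderator.keys() | notetaker.keys()):
        m, n = moderator.get(key), notetaker.get(key)
        if m != n:
            flagged[key] = (m, n)
    return flagged


def needs_follow_up(rating):
    """Close Calls and Use Errors are the ratings the moderator probes."""
    return rating in (Rating.CLOSE_CALL, Rating.USE_ERROR)


# Made-up ratings for one session (hypothetical, not study data).
moderator = {(1, "a"): Rating.SUCCESS, (1, "b"): Rating.CLOSE_CALL}
notetaker = {(1, "a"): Rating.SUCCESS, (1, "b"): Rating.USE_ERROR}

disagreements = find_disagreements(moderator, notetaker)
for key, (m, n) in disagreements.items():
    print(key, "moderator:", m.value, "| notetaker:", n.value)
```

Any subgoal this flags is one where the two raters would go back to their notes or the video recording before settling on a final rating.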

Explaining root causes

For every Close Call and Use Error we observed, we sought to explain why the rating occurred through a 'Root Cause Analysis'. In this analysis, we described the circumstances of the Close Call or Use Error and then suggested any factors that may have contributed to it. We drew on both the behaviors observed by the moderator and notetaker and the explanations of their thought process that participants gave us directly during the study.

This analysis was vital to understanding the effectiveness and shortcomings of the product we tested, as it helps identify whether an observed issue indicates a flaw in the product's design and usability or is the result of external factors. It also allows us to pinpoint where there is legitimate risk to the patient and user.

*A note about identified risks: With this project, and frequently with summative testing, our client has usability engineers compile all possible safety risks associated with this product. We then use this risk analysis to create our moderation guide and task scenarios*

In this example from our root cause analysis, the action was marked as a Use Error because the participant didn't complete one of the required task subgoals. However, the analysis shows that the error was unrelated to the usability of the device itself.

Providing this analysis shows the FDA that some Use Errors should not be weighted as heavily as Use Errors caused by the device's design.

In this second example from our root cause analysis, the action was also marked as a Use Error, but here the analysis explains how the medical device may have contributed to the error, and to what extent.

System Usability Scale

Standard System Usability Scale (SUS) questions, administered during the debrief portion of the study through a Google Form

During the debrief portion of the study, participants completed a System Usability Scale (SUS) questionnaire consisting of 10 questions, each formatted as a Likert scale with options from 1 (strongly disagree) to 5 (strongly agree).

By having participants complete the System Usability Scale, we obtained a standardized, quantifiable measure of the device's usability that we could compare against both similar products and completely unrelated medical devices.

With an average System Usability Scale score of 83 across participants, this blood glucose meter scored well above the commonly cited usability benchmark of 68, which indicates that participants found the device effective, efficient, satisfying, and usable.
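For readers unfamiliar with how a score like this is derived, the standard SUS scoring rule can be sketched as follows. This assumes the standard 10-item questionnaire with its usual scoring (odd-numbered items contribute the answer minus 1, even-numbered items contribute 5 minus the answer, and the sum is multiplied by 2.5). The function names are hypothetical and the answers below are made up, not responses from the study.

```python
def sus_score(responses):
    """Score one participant's ten SUS answers (each 1-5) on a 0-100 scale."""
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("expected ten answers between 1 and 5")
    total = 0
    for item, answer in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (answer - 1) if item % 2 == 1 else (5 - answer)
    return total * 2.5


def mean_sus(all_responses):
    """Average the per-participant SUS scores."""
    scores = [sus_score(r) for r in all_responses]
    return sum(scores) / len(scores)


# Made-up answers from two hypothetical participants (not study data).
example_responses = [
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],  # best possible answers: 100.0
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],  # every item contributes 3: 75.0
]

print(mean_sus(example_responses))  # 87.5
```

Each participant is scored individually and the scores are then averaged, which is one common way to arrive at a single figure for a study.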

Results

Through our analysis, we determined that the blood glucose meter we tested is a safe, effective device that medical professionals can be expected to use without error in most cases. No pattern in the results indicated a systematic design flaw, and participants in both user groups (nurses and point of care coordinators) successfully completed the tasks in the test scenarios about 95% of the time.
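A per-group completion rate of this kind can be tallied from the subgoal ratings. The sketch below is an illustration only: it assumes that every rating other than Use Error counts as a completed subgoal, which is my simplification rather than a rule from the test plan, and the `completion_rate` helper and all of the ratings are made up.

```python
from collections import Counter


def completion_rate(ratings, failing=("Use Error",)):
    """Share of subgoal ratings that are not in the failing set.

    Assumption: only Use Errors count as failures, so Success,
    Use Difficulty, and Close Call all count as completed.
    """
    counts = Counter(ratings)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no ratings to summarize")
    failed = sum(counts[r] for r in failing)
    return (total - failed) / total


# Made-up ratings per user group (hypothetical, not study data).
ratings_by_group = {
    "Nurses": ["Success"] * 8 + ["Use Difficulty", "Use Error"],
    "Point of Care Coordinators": ["Success"] * 5 + ["Close Call"] + ["Use Error"] * 2,
}

rates = {group: completion_rate(r) for group, r in ratings_by_group.items()}
for group, rate in rates.items():
    print(f"{group}: {rate:.0%}")
```

Passing a different `failing` tuple shows how much the headline figure depends on which ratings are treated as failures.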

While the device is still awaiting approval from the FDA to be sold in the US, it is currently available on the European marketplace.


Shoutout to my research partner for this project, Ruiqi Li