EchoXP

A chatbot-based mobile application for collecting real-time experiences using conversational AI and sensor data

2025

Project type

  • Individual Project - Final year project

Role

  • Product Designer

  • Full Stack Developer

Duration

  • September 2024 - April 2025 (8 months)

Tools

  • Visual Studio Code

  • React Native

  • TypeScript

  • Expo

Deliverables

  • High-fidelity mobile app prototype

  • Chatbot system with OpenAI GPT-4

  • Sensor data collection

  • User testing

  • Comparative study results and analysis

  • Final research report

OVERVIEW

Introduction

EchoXP: AI-Enhanced Data Collection for Healthcare Research

EchoXP is a chatbot-based mobile application that reimagines how we collect subjective health data. Developed as my final year university project, it explores whether conversational AI can make data collection more engaging and effective than traditional surveys. Using chronic musculoskeletal pain as a case study, the app integrates GPT-4 powered conversations with real-time sensor tracking to capture richer and more contextual information about patient experiences.

Through a comparative study with 12 participants, EchoXP demonstrated that conversational interfaces can elicit longer and more detailed responses while reducing user frustration, thereby opening new possibilities for mobile health research and continuous patient monitoring.

The Process

Project Objectives

The following objectives address both the technical challenge of developing a chatbot-based mobile application and the empirical question of whether it improves user engagement. By focusing on chronic musculoskeletal pain as a case study, this research contributes to healthcare while developing methodologies applicable to other conditions involving subjective experiences.

  • To investigate the effects of implementing a chatbot system on the user engagement of the mobile application

  • To conduct comparative user testing that evaluates the user engagement of a chatbot mobile application against a traditional survey approach

RESEARCH

The Problem

1. Chronic musculoskeletal pain

  • 20% of adults worldwide are affected by chronic pain

  • 20.4% (50 million) of adults in the US have chronic pain

  • 61% of adults in Europe are less able or unable to work outside their homes

The complex nature of chronic musculoskeletal pain presents challenges for healthcare. Unlike acute pain, which is caused by a specific disease or injury, chronic pain can occur independently of any clear physical cause and is often more complex to manage. This prolonged pain is a global health issue, affecting a substantial portion of the population and imposing a heavy burden on individuals and society. This makes it worth exploring how technology could capture these experiences automatically and continuously.

2. Limitations in traditional data collection methods

  • Retrospective pain reports are subject to recall bias (influenced by peak intensity and recency)

  • Pain experiences are contextual and dynamic, varying throughout the day based on activities and environmental factors

  • Paper diaries show extremely poor compliance rates (11% actual compliance vs. 94% for electronic diaries)

  • Current data collection methods fail to capture the complex, fluctuating nature of chronic pain

This significant gap highlights the challenges in maintaining consistent user engagement with traditional data collection methods and calls into question the validity of data collected through such approaches.

3. Lack of user engagement in existing health applications

  • Health app retention rates drop dramatically (30% after 30 days, 10% after 90 days)

  • Pain tracking applications show an even faster decline in engagement

  • Progressive disengagement negatively impacts data quality and completeness

Despite smartphones being ideal platforms for ecological momentary assessments (EMA), current solutions fail to maintain long-term user engagement.

Review of Existing Work

Review of Existing EMA Applications

Existing EMA applications approach data collection differently.

  • ilumivu offers a highly customizable research infrastructure for longitudinal studies with adaptive sampling techniques.

  • AbleTo SelfCare+ takes a consumer-oriented approach, integrating evidence-based CBT tools with mood tracking for mental health support.

  • m-Path bridges therapy and daily life through blended care, featuring mobile sensing capabilities and gamification to enhance user engagement and motivation through customizable rewards and achievement systems.

Review of User Studies on Mobile App Design Features for Improving User Engagement

Wei et al. (2020) conducted a systematic review of 35 studies examining mHealth app engagement features across diverse user populations (ages 14-74). The research identified five key design features that enhance engagement: personalization, reinforcement, communication, navigation, and interface aesthetics. While the study didn't explicitly examine chatbots, these identified features, particularly personalization, communication, and interface design, are directly applicable to my project and crucial for maintaining consistent user engagement in health-related data collection applications.

Five key engagement features: Personalization, Reinforcement, Communication, Navigation, and Interface Aesthetics

Review of Large Language Models (LLMs) and Conversational AI

Large Language Models (LLMs) like OpenAI's GPT-4, Anthropic's Claude 3.7 Sonnet, Meta's Llama 4 Scout, and Google's Gemini 2.5 Pro have transformed chatbot capabilities through multi-stage training processes, including pre-training on vast text data and fine-tuning with techniques like Reinforcement Learning from Human Feedback (RLHF). Modern transformer-based architectures enable more human-like, contextually aware conversations compared to traditional rule-based chatbots.

OpenAI's GPT-4 was selected for this project due to its straightforward API integration, strong context-handling capabilities, precise prompt engineering support, and demonstrated high performance in producing relevant, coherent, and natural responses. These are essential qualities for healthcare applications requiring empathy and appropriate emotional support.

DESIGN & DEVELOPMENT

Requirements Analysis

  • Defines project scope:

    • Establishes what the application must do (functional) and how well it must perform (non-functional)

  • Guides development:

    • Provides clear specifications for building the chatbot, sensor integration, database, and user authentication features

  • Ensures completeness:

    • Lists all necessary features, including chatbot conversation flow, emotion analysis, IMU sensor data collection, and data storage protocols

  • Provides testing criteria:

    • Creates measurable benchmarks to verify the application meets all specified requirements before user testing

  • Documents stakeholder needs:

    • Translates research objectives into concrete technical specifications that address both user experience and data collection goals

System Architecture

The application implements a unidirectional data flow:

  • User interactions and sensor readings are captured on the client device

  • Data is processed locally for immediate feedback and temporary storage

  • Processed data is sent to the API tier for validation and enrichment

  • Validated data is stored in the Supabase PostgreSQL database

  • Confirmations and necessary responses are returned to the client

This flow ensures data integrity while maintaining a responsive user experience.
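As a concrete illustration, the validation-and-enrichment step in this flow might look roughly like the sketch below. It is hypothetical: the field names (`userId`, `message`, `timestamp`) and the enrichment logic are assumptions for illustration, not the actual API schema.

```typescript
// Hypothetical shape of a client payload; the real schema may differ.
interface ChatPayload {
  userId: string;
  message: string;
  timestamp: number; // Unix epoch, ms
}

// Validate and enrich a payload before it is written to the database.
// Returns the enriched record, or null if the payload is malformed.
function validateAndEnrich(
  raw: unknown,
): (ChatPayload & { receivedAt: number }) | null {
  if (typeof raw !== "object" || raw === null) return null;
  const p = raw as Partial<ChatPayload>;
  if (typeof p.userId !== "string" || p.userId.length === 0) return null;
  if (typeof p.message !== "string" || p.message.trim().length === 0) return null;
  if (typeof p.timestamp !== "number" || p.timestamp <= 0) return null;
  // Enrichment: stamp the server-side receipt time.
  return {
    userId: p.userId,
    message: p.message.trim(),
    timestamp: p.timestamp,
    receivedAt: Date.now(),
  };
}
```

Rejecting malformed payloads at the API tier, before they reach Supabase, is what keeps the stored data consistent while the client stays responsive.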

Chatbot Implementation

The chatbot, which I named "Ekko", was implemented using OpenAI's GPT-4 API with a text-to-text interface, selected for its balance of response quality, ease of API integration, and real-time conversation capabilities. Through prompt engineering, the system was configured to maintain a friendly personality while following a structured flow of 15 predefined questions, creating natural transitions by referencing previous user responses.

The implementation includes emotion analysis functionality that extracts and quantifies seven positive emotional states (joy, satisfaction, excitement, enthusiasm, pleasure, happiness, contentment) from user responses with intensity scores from 0-5. This hybrid approach balances structured data collection with natural conversational flow, making the experience feel less like a survey and more like a thoughtful conversation.

Code snippet of the prompt used to brief the GPT-4 model as the chatbot “Ekko”
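For illustration, such a prompt configuration might be structured roughly as follows. This is a simplified, hypothetical sketch: the question list and prompt wording here are placeholders, not the actual prompt used in the project.

```typescript
// Illustrative only: the real study used 15 predefined questions
// and different prompt wording.
const QUESTIONS: string[] = [
  "What is your favourite food?",
  "When did you last eat it?",
  // ...remaining predefined questions would follow here
];

// Build the message array for a chat-completion request. The system
// prompt pins the chatbot's personality and the structured question
// flow; `history` holds the conversation so far.
function buildMessages(
  history: { role: "user" | "assistant"; content: string }[],
  nextQuestionIndex: number,
) {
  const system = {
    role: "system" as const,
    content:
      "You are Ekko, a friendly and empathetic chatbot. " +
      "Ask the predefined questions one at a time, in order. " +
      "Reference the user's previous answer to create a natural transition. " +
      `Next question: "${QUESTIONS[nextQuestionIndex]}"`,
  };
  return [system, ...history];
}
```

Keeping the next question inside the system prompt is one way to combine a fixed 15-question flow with free-form conversational transitions.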

Sensor Data Collection

  • The application integrates accelerometer and gyroscope sensors via React Native's Expo APIs to capture linear acceleration and rotational movements across three dimensions (x, y, z)

  • Sensor data collection activates automatically upon user login and continues throughout the session, with readings processed and stored in 20-second batches to balance data quality with system performance

  • Performance optimizations include dedicated background thread processing, buffered storage to reduce I/O operations, adaptive sampling that adjusts based on device constraints, and efficient JSON formatting to minimize storage and transmission overhead

This approach ensures comprehensive movement tracking without compromising the application's responsiveness during user interactions.

UI Design - Key Features

  • EchoXP's UI follows a user-centered design approach prioritizing simplicity, consistency, and engagement through minimalist aesthetics and React Native Paper components

  • The interface features anonymous authentication using nicknames and recovery codes, intuitive tab-based navigation across Home, Chat, and Settings screens, and distinct visual designs for user versus chatbot messages with timestamps and avatars

  • Accessibility features include high-contrast text, responsive typography, clear interactive feedback, and comprehensive error handling with helpful instructions for issues like network failures or authentication errors

  • The chatbot "Ekko" is personified with a friendly robot avatar and empathetic personality to foster emotional engagement, while information is progressively disclosed to prevent cognitive overload and maintain a natural conversation flow

EchoXP: Login

EchoXP: Home

EchoXP: Chat

EchoXP: Sign out

Code snippet of the system prompt for the GPT-4 model to extract positive emotion labels and assign intensity scores

EchoXP - Demo Video

This demonstration video showcases the complete EchoXP user experience, from registration through conversational interaction with Ekko to completion of all 15 food preference questions. The walkthrough illustrates the chatbot's natural dialogue flow, sensor data collection in action, and the seamless user interface designed to maximize engagement while gathering detailed qualitative and quantitative data.
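The emotion-extraction step can be illustrated with a small parser. This is a hypothetical sketch that assumes the model returns a JSON object mapping emotion names to 0-5 scores; the actual output format is whatever the real system prompt specifies.

```typescript
// The seven positive emotional states scored by the system.
const EMOTIONS = [
  "joy", "satisfaction", "excitement", "enthusiasm",
  "pleasure", "happiness", "contentment",
] as const;

type Emotion = (typeof EMOTIONS)[number];

// Parse a model reply assumed to be a JSON object mapping emotion
// names to 0-5 intensity scores. Unknown keys are dropped, missing
// emotions default to 0, and scores are clamped into range.
function parseEmotionScores(reply: string): Record<Emotion, number> | null {
  let raw: unknown;
  try {
    raw = JSON.parse(reply);
  } catch {
    return null; // model did not return valid JSON
  }
  if (typeof raw !== "object" || raw === null) return null;
  const scores = {} as Record<Emotion, number>;
  for (const e of EMOTIONS) {
    const v = (raw as Record<string, unknown>)[e];
    const n = typeof v === "number" ? v : 0;
    scores[e] = Math.min(5, Math.max(0, n)); // clamp to 0-5
  }
  return scores;
}
```

Defensive parsing like this matters because LLM output is not guaranteed to follow the requested format on every turn.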

USER TESTING

Testing Objectives

Primary Objectives

  • To quantitatively measure and compare user engagement levels between the EchoXP chatbot application and a traditional Qualtrics survey using the validated User Engagement Scale - Short Form (UES-SF) across its four dimensions (Focused Attention, Perceived Usability, Aesthetic Appeal, and Reward)

  • To investigate the quality of the data gathered through both interfaces, specifically measuring response length and consistency

  • To evaluate the effectiveness of integrating sensor data collection with self-reported information within the chatbot interface, examining both the technical feasibility and the contextual value

Secondary Objectives

  • To identify potential usability challenges in both interfaces that might influence engagement

  • To gather qualitative feedback from participants regarding their experience with each interface to inform future refinements and development

Testing Methodology

The test compared two methods: the EchoXP chatbot application and a traditional Qualtrics survey.

Following each testing method, participants completed the Experience Evaluation Questionnaire, which is based on the User Engagement Scale - Short Form (UES-SF).

—> Participant Recruitment

  • Sample size = 12 undergraduate students

  • Each participant was randomly assigned to one of the two testing methods

  • None had prior experience with either testing method

  • Each participant was assigned a unique nickname for anonymisation

—> Scenario & Protocols

  • Scenario: Food Preferences (favourite food)

    • Universally relatable

    • Does not require sharing sensitive information

    • Avoids evoking negative emotions

  • Protocols:

    • Briefed participants about the study purpose and procedure

    • Obtained their written consent (Information & Consent Form)

    • Participants were assigned to either the chatbot group or the survey group

  • After answering all of the questions, participants immediately filled out the Experience Evaluation Questionnaire

  • Concluded each session with a brief interview about factors that influenced their engagement ratings in the Experience Evaluation Questionnaire and how they felt about their assigned testing method

—> Qualtrics Survey

  • Similar aesthetics to EchoXP

  • Presents the questions one at a time

—> User Engagement Scale - Short Form (UES-SF)

  • The Experience Evaluation Questionnaire is based on the User Engagement Scale - Short Form (UES-SF), a validated instrument for measuring engagement
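As background, UES-SF subscale scores are conventionally computed as the mean of each subscale's items. A minimal sketch, assuming the standard 12-item, 5-point form with item codes like "FA.1" (note that negatively worded items, such as the Perceived Usability items, are normally reverse-scored first; that step is omitted here):

```typescript
// UES-SF subscales: Focused Attention, Perceived Usability,
// Aesthetic Appeal, and Reward, each measured by three items.
type Subscale = "FA" | "PU" | "AE" | "RW";

// `responses` maps item codes (e.g. "FA.1") to 1-5 Likert ratings.
// Returns the mean score per subscale plus overall engagement
// (the mean of the four subscale means).
function scoreUES(responses: Record<string, number>) {
  const subscales: Subscale[] = ["FA", "PU", "AE", "RW"];
  const means = {} as Record<Subscale, number>;
  for (const s of subscales) {
    const items = [1, 2, 3].map((i) => responses[`${s}.${i}`]);
    means[s] = items.reduce((a, b) => a + b, 0) / items.length;
  }
  const overall =
    subscales.reduce((a, s) => a + means[s], 0) / subscales.length;
  return { ...means, overall };
}
```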

RESULTS & ANALYSIS

UES Analysis

Graph 1:

  • Chatbot group rated significantly higher in Perceived Usability (M=4.56) than survey group (M=4.17)

    • Users found the chatbot interface more intuitive and less frustrating

  • Focused Attention scores slightly lower for chatbot (M=2.78) than survey (M=2.94)

  • Aesthetic Appeal slightly lower for chatbot (M=3.06) than survey (M=3.28)

  • Reward scores are similar (chatbot: M=3.28, survey: M=3.22)

  • Overall Engagement is similar (chatbot: M=3.42, survey: M=3.40)

Graph 2:

  • Survey group rated higher on:

    • "I lost myself in this experience" (FA.1)

    • "The time I spent doing this activity just slipped away" (FA.2)

    • "The format of this activity was attractive" (AE.1)

    • "This activity appealed to my senses" (AE.3)

Graph 3:

  • Strong negative correlation between Focused Attention and Reward (r=-0.58)

    • Higher levels of absorption are associated with lower perceived reward

  • Strong positive correlations between Overall Engagement and:

    • Perceived Usability (r=0.68)

    • Aesthetic Appeal (r=0.68)

    • Reward (r=0.66)

  • Weak negative correlation between Focused Attention and Overall Engagement (r=-0.11)

  • Moderate positive correlation between Perceived Usability and Reward (r=0.26)

    • More usable interfaces are perceived as more rewarding
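For reference, the r values above are Pearson correlation coefficients; a minimal implementation for two equal-length samples:

```typescript
// Pearson correlation coefficient between two equal-length samples:
// covariance of x and y divided by the product of their standard
// deviations. Returns a value in [-1, 1].
function pearson(xs: number[], ys: number[]): number {
  const n = xs.length;
  const mean = (a: number[]) => a.reduce((s, v) => s + v, 0) / n;
  const mx = mean(xs);
  const my = mean(ys);
  let cov = 0, vx = 0, vy = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - mx;
    const dy = ys[i] - my;
    cov += dx * dy;
    vx += dx * dx;
    vy += dy * dy;
  }
  return cov / Math.sqrt(vx * vy);
}
```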

Comparison of Response Lengths

Message Activity & Sensor Activity

Post-Test Interview

Graph 4:

  • Chatbot elicited longer responses (median ≈ 8.5 words) than survey (median ≈ 6.5 words)

  • Wider variation in chatbot response lengths, showing greater range of expression

  • Maximum average word count for chatbot (16.5 words) exceeded survey maximum (9.5 words)

    • Conversational interface naturally encouraged more elaborate responses

  • Interesting pattern at lower end: chatbot minimum (2 words) vs. survey minimum (4.5 words)

    • Chatbot accommodated both detailed and brief responses, reflecting natural conversation patterns

  • Response length varied based on specific questions and participant engagement level

    • Suggests chatbot provides more flexible expression options than traditional surveys

Graph 5:

  • Substantial variation in response length throughout conversations

  • Some participants (Dolphin and Pudding) produced notably longer responses at specific points

    • More detailed responses occurred at intervals, likely for questions that naturally elicited elaboration

  • Sensor activity generally remained stable (0.8-1.2 magnitude range) with occasional spikes

  • Physical movement and messaging behaviour relationship varies by individual

  • Different interaction patterns among participants in the same group (chatbot):

    • Some showed longer engagement (Pudding), while others completed the questions early on (Brownbread, Trumpet)

Response Length vs. Emotional Expression

Graph 6:

  • Response lengths ranged from brief replies to approximately 40 words

  • Emotion intensity measured on a 0-5 scale across seven positive emotional states (joy, satisfaction, excitement, enthusiasm, pleasure, happiness, and contentment)

  • Moderate positive correlation (r=0.36) between word count and emotional intensity

  • Longer responses tend to contain somewhat higher levels of expressed emotion

  • Some shorter responses still conveyed high emotional intensity (e.g., Pudding's 10-word response scored above an intensity of 4)

  • Some longer responses exhibited relatively low emotional scores

  • Some individuals showed consistent emotional expression regardless of message length

  • Relationship is not strictly linear across all users, suggesting individualized patterns

  • Participants tried to “stress test” the chatbot by asking it questions in return

    • Chatbot demonstrated flexibility by responding to unexpected queries

  • Multiple participants described chatbot's tone as "overly nice" and too eager to please

    • Excessive empathetic personality perceived as unnatural

  • Consistent pleasantness felt artificial compared to human conversation

    • Human conversation typically contains more variation in tone and formality

  • Despite engagement success, future versions need more balanced conversational style

  • Need more authentic interaction patterns that better mimic natural conversation

  • Balancing structure with conversational authenticity remains a key design challenge

DISCUSSION

Discussion & Implications

  • Chatbot interface elicited longer, more detailed responses than traditional surveys

  • Conversational AI demonstrated ability to collect richer qualitative data

  • Integration of sensor data with self-reports provides a more holistic understanding of chronic pain

  • Potential for machine learning applications using combined data:

    • Predictive models for pain levels

    • Sentiment analysis capabilities

    • Activity classification from sensor data

    • Algorithms to help anticipate pain fluctuations

Limitations

  • Participants attempted to test system boundaries beyond the intended conversation flow

  • The computer science student demographic may have approached the chatbot more critically

  • Technical constraints emerged, including inconsistent sensor data collection

  • Fixed conversation flow and limited context awareness sometimes created disjointed interactions

  • Future needs: more adaptive conversations with better context awareness

Ethical Considerations

  • Future development requires balanced question design and authentic conversational capabilities

  • Health contexts demand a balance between data collection, empathy, and user sensitivity

  • Security concerns: data handling by LLM providers needs careful consideration

  • Need for stronger anonymization techniques and transparent data retention policies

  • Healthcare data regulation compliance is crucial for trust and protection

  • Success requires interdisciplinary collaboration: HCI, psychology, pain management, ML, data security

CONCLUSION

In Conclusion

  • The study successfully met primary objectives of investigating chatbot impact on user engagement

  • EchoXP demonstrated moderate success in enhancing user interactions

  • Participants produced longer and more detailed responses with chatbot interface

  • Differences between the interfaces were more nuanced than initially hypothesized

  • Comparative testing revealed promising insights about multimodal data collection

  • Integration of conversational and sensor data provided important contextual support for self-reports

  • Conversational AI showed potential for capturing more extensive health data in sensitive domains

  • Multimodal approach creates new pathways for more effective pain monitoring

  • Project serves as a stepping stone toward more effective, user-centered health solutions

  • Future solutions should continue to balance robust data collection with engaging user experience

Want to read more? → Click here to view the full paper