EchoXP

A chatbot-based mobile application for collecting real-time experiences using conversational AI and sensor data

2025

Project type

  • Individual Project - Final year project

Role

  • Product Designer

  • Full Stack Developer

Duration

  • September 2024 - April 2025 (8 months)

Tools

  • Visual Studio Code

  • React Native

  • TypeScript

  • Expo

Deliverables

  • High-fidelity mobile app prototype

  • Chatbot system with OpenAI GPT-4

  • Sensor data collection

  • User testing

  • Comparative study results and analysis

  • Final research report

OVERVIEW

Introduction

EchoXP: AI-Enhanced Data Collection for Healthcare Research

EchoXP is a chatbot-based mobile application that reimagines how we collect subjective health data. Developed as my final year university project, it explores whether conversational AI can make data collection more engaging and effective than traditional surveys. Using chronic musculoskeletal pain as a case study, the app integrates GPT-4 powered conversations with real-time sensor tracking to capture richer and more contextual information about patient experiences.

Through a comparative study with 12 participants, EchoXP demonstrated that conversational interfaces can elicit longer and more detailed responses while reducing user frustration, thereby opening new possibilities for mobile health research and continuous patient monitoring.

The Process

Project Objectives

The following objectives address both the technical challenge of developing a chatbot-based mobile application and the empirical question of whether it improves user engagement. By focusing on chronic musculoskeletal pain as a case study, this research contributes to healthcare while developing methodologies applicable to other conditions involving subjective experiences.

  • To investigate the effects of implementing a chatbot system on the user engagement of the mobile application

  • To conduct comparative user testing that evaluates the user engagement of a chatbot mobile application against a traditional survey approach

RESEARCH

The Problem

1. Chronic musculoskeletal pain

  • 20% of adults worldwide are affected by chronic pain

  • 20.4% (50 million) of adults in the US have chronic pain

  • 61% of adults in Europe are less able or unable to work outside their homes

The complex nature of chronic musculoskeletal pain presents challenges for healthcare. Unlike acute pain, which is caused by a specific disease or injury, chronic pain can occur independently of any clear physical cause and is often more complex to manage. This prolonged pain is a global health issue, affecting a substantial portion of the population and imposing a heavy burden on individuals and society. This makes it worth exploring how technology could capture these experiences automatically and continuously.

2. Limitations in traditional data collection methods

  • Retrospective pain reports are subject to recall bias (influenced by peak intensity and recency)

  • Pain experiences are contextual and dynamic, varying throughout the day based on activities and environmental factors

  • Paper diaries show extremely poor compliance rates (11% actual compliance vs. 94% for electronic diaries)

  • Current data collection methods fail to capture the complex, fluctuating nature of chronic pain

This significant gap highlights the challenges in maintaining consistent user engagement with traditional data collection methods and calls into question the validity of data collected through such approaches.

3. Lack of user engagement in existing health applications

  • Health app retention rates drop dramatically (30% after 30 days, 10% after 90 days)

  • Pain tracking applications show an even faster decline in engagement

  • Progressive disengagement negatively impacts data quality and completeness

Despite smartphones being ideal platforms for ecological momentary assessments (EMA), current solutions fail to maintain long-term user engagement.

Review of Existing Work

Review of Existing EMA Applications

Existing EMA applications approach data collection differently.

  • ilumivu offers a highly customizable research infrastructure for longitudinal studies with adaptive sampling techniques.

  • AbleTo SelfCare+ takes a consumer-oriented approach, integrating evidence-based CBT tools with mood tracking for mental health support.

  • m-Path bridges therapy and daily life through blended care, featuring mobile sensing capabilities and gamification to enhance user engagement and motivation through customizable rewards and achievement systems.

Review of User Studies on Mobile App Design Features for Improving User Engagement

Wei et al. (2020) conducted a systematic review of 35 studies examining mHealth app engagement features across diverse user populations (ages 14-74). The research identified five key design features that enhance engagement: personalization, reinforcement, communication, navigation, and interface aesthetics. While the study didn't explicitly examine chatbots, these identified features, particularly personalization, communication, and interface design, are directly applicable to my project and crucial for maintaining consistent user engagement in health-related data collection applications.

Five key engagement features: Personalization, Reinforcement, Communication, Navigation, and Interface Aesthetics

Review of Large Language Models (LLMs) and Conversational AI

Large Language Models (LLMs) like OpenAI's GPT-4, Anthropic's Claude 3.7 Sonnet, Meta's Llama 4 Scout, and Google's Gemini 2.5 Pro have transformed chatbot capabilities through multi-stage training processes, including pre-training on vast text data and fine-tuning with techniques like Reinforcement Learning from Human Feedback (RLHF). Modern transformer-based architectures enable more human-like, contextually aware conversations compared to traditional rule-based chatbots.

OpenAI's GPT-4 was selected for this project due to its straightforward API integration, strong context-handling capabilities, precise prompt engineering support, and demonstrated high performance in producing relevant, coherent, and natural responses. These are essential qualities for healthcare applications requiring empathy and appropriate emotional support.

DESIGN & DEVELOPMENT

Requirements Analysis

  • Defines project scope:

    • Establishes what the application must do (functional) and how well it must perform (non-functional)

  • Guides development:

    • Provides clear specifications for building the chatbot, sensor integration, database, and user authentication features

  • Ensures completeness:

    • Lists all necessary features, including chatbot conversation flow, emotion analysis, IMU sensor data collection, and data storage protocols

  • Provides testing criteria:

    • Creates measurable benchmarks to verify the application meets all specified requirements before user testing

  • Documents stakeholder needs:

    • Translates research objectives into concrete technical specifications that address both user experience and data collection goals

System Architecture

The application implements a unidirectional data flow:

  • User interactions and sensor readings are captured on the client device

  • Data is processed locally for immediate feedback and temporary storage

  • Processed data is sent to the API tier for validation and enrichment

  • Validated data is stored in the Supabase PostgreSQL database

  • Confirmations and necessary responses are returned to the client

This flow ensures data integrity while maintaining a responsive user experience.
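As a concrete illustration, the validation-and-enrichment step in this flow might look roughly like the sketch below. It is hypothetical: the field names (`userId`, `message`, `timestamp`) and the enrichment logic are assumptions for illustration, not the actual API schema.

```typescript
// Hypothetical shape of a client payload; the real schema may differ.
interface ChatPayload {
  userId: string;
  message: string;
  timestamp: number; // Unix epoch, ms
}

// Validate and enrich a payload before it is written to the database.
// Returns the enriched record, or null if the payload is malformed.
function validateAndEnrich(
  raw: unknown,
): (ChatPayload & { receivedAt: number }) | null {
  if (typeof raw !== "object" || raw === null) return null;
  const p = raw as Partial<ChatPayload>;
  if (typeof p.userId !== "string" || p.userId.length === 0) return null;
  if (typeof p.message !== "string" || p.message.trim().length === 0) return null;
  if (typeof p.timestamp !== "number" || p.timestamp <= 0) return null;
  // Enrichment: stamp the server-side receipt time.
  return {
    userId: p.userId,
    message: p.message.trim(),
    timestamp: p.timestamp,
    receivedAt: Date.now(),
  };
}
```

Rejecting malformed payloads at the API tier, before they reach Supabase, is what keeps the stored data consistent while the client stays responsive.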

Chatbot Implementation

The chatbot, which I named "Ekko", was implemented using OpenAI's GPT-4 API with a text-to-text interface, selected for its balance of response quality, ease of API integration, and real-time conversation capabilities. Through prompt engineering, the system was configured to maintain a friendly personality while following a structured flow of 15 predefined questions, creating natural transitions by referencing previous user responses.

The implementation includes emotion analysis functionality that extracts and quantifies seven positive emotional states (joy, satisfaction, excitement, enthusiasm, pleasure, happiness, contentment) from user responses with intensity scores from 0-5. This hybrid approach balances structured data collection with natural conversational flow, making the experience feel less like a survey and more like a thoughtful conversation.

Code snippet of the prompt used to brief the GPT-4 model as the chatbot “Ekko”
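For illustration, such a prompt configuration might be structured roughly as follows. This is a simplified, hypothetical sketch: the question list and prompt wording here are placeholders, not the actual prompt used in the project.

```typescript
// Illustrative only: the real study used 15 predefined questions
// and different prompt wording.
const QUESTIONS: string[] = [
  "What is your favourite food?",
  "When did you last eat it?",
  // ...remaining predefined questions would follow here
];

// Build the message array for a chat-completion request. The system
// prompt pins the chatbot's personality and the structured question
// flow; `history` holds the conversation so far.
function buildMessages(
  history: { role: "user" | "assistant"; content: string }[],
  nextQuestionIndex: number,
) {
  const system = {
    role: "system" as const,
    content:
      "You are Ekko, a friendly and empathetic chatbot. " +
      "Ask the predefined questions one at a time, in order. " +
      "Reference the user's previous answer to create a natural transition. " +
      `Next question: "${QUESTIONS[nextQuestionIndex]}"`,
  };
  return [system, ...history];
}
```

Keeping the next question inside the system prompt is one way to combine a fixed 15-question flow with free-form conversational transitions.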

Sensor Data Collection

  • The application integrates accelerometer and gyroscope sensors via React Native's Expo APIs to capture linear acceleration and rotational movements across three dimensions (x, y, z)

  • Sensor data collection activates automatically upon user login and continues throughout the session, with readings processed and stored in 20-second batches to balance data quality with system performance

  • Performance optimizations include dedicated background thread processing, buffered storage to reduce I/O operations, adaptive sampling that adjusts based on device constraints, and efficient JSON formatting to minimize storage and transmission overhead

This approach ensures comprehensive movement tracking without compromising the application's responsiveness during user interactions.

UI Design - Key Features

  • EchoXP's UI follows a user-centered design approach prioritizing simplicity, consistency, and engagement through minimalist aesthetics and React Native Paper components

  • The interface features anonymous authentication using nicknames and recovery codes, intuitive tab-based navigation across Home, Chat, and Settings screens, and distinct visual designs for user versus chatbot messages with timestamps and avatars

  • Accessibility features include high-contrast text, responsive typography, clear interactive feedback, and comprehensive error handling with helpful instructions for issues like network failures or authentication errors

  • The chatbot "Ekko" is personified with a friendly robot avatar and empathetic personality to foster emotional engagement, while information is progressively disclosed to prevent cognitive overload and maintain a natural conversation flow

EchoXP: Login

EchoXP: Home

EchoXP: Chat

EchoXP: Sign out

Code snippet of the system prompt for the GPT-4 model to extract positive emotion labels and assign intensity scores

EchoXP - Demo Video

This demonstration video showcases the complete EchoXP user experience, from registration through conversational interaction with Ekko to completion of all 15 food preference questions. The walkthrough illustrates the chatbot's natural dialogue flow, sensor data collection in action, and the seamless user interface designed to maximize engagement while gathering detailed qualitative and quantitative data.
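The emotion-extraction step can be illustrated with a small parser. This is a hypothetical sketch that assumes the model returns a JSON object mapping emotion names to 0-5 scores; the actual output format is whatever the real system prompt specifies.

```typescript
// The seven positive emotional states scored by the system.
const EMOTIONS = [
  "joy", "satisfaction", "excitement", "enthusiasm",
  "pleasure", "happiness", "contentment",
] as const;

type Emotion = (typeof EMOTIONS)[number];

// Parse a model reply assumed to be a JSON object mapping emotion
// names to 0-5 intensity scores. Unknown keys are dropped, missing
// emotions default to 0, and scores are clamped into range.
function parseEmotionScores(reply: string): Record<Emotion, number> | null {
  let raw: unknown;
  try {
    raw = JSON.parse(reply);
  } catch {
    return null; // model did not return valid JSON
  }
  if (typeof raw !== "object" || raw === null) return null;
  const scores = {} as Record<Emotion, number>;
  for (const e of EMOTIONS) {
    const v = (raw as Record<string, unknown>)[e];
    const n = typeof v === "number" ? v : 0;
    scores[e] = Math.min(5, Math.max(0, n)); // clamp to 0-5
  }
  return scores;
}
```

Defensive parsing like this matters because LLM output is not guaranteed to follow the requested format on every turn.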

USER TESTING

Testing Objectives

Primary Objectives

  • To quantitatively measure and compare user engagement levels between the EchoXP chatbot application and a traditional Qualtrics survey using the validated User Engagement Scale - Short Form (UES-SF) across its four dimensions (Focused Attention, Perceived Usability, Aesthetic Appeal, and Reward)

  • To investigate the quality of the data gathered through both interfaces, specifically measuring response length and consistency

  • To evaluate the effectiveness of integrating sensor data collection with self-reported information within the chatbot interface, examining both the technical feasibility and the contextual value

Secondary Objectives

  • To identify potential usability challenges in both interfaces that might influence engagement

  • To gather qualitative feedback from participants regarding their experience with each interface to inform future refinements and development

Testing Methodology

The test compared two methods: the EchoXP chatbot application and a traditional Qualtrics survey.

Following each testing method, participants completed the Experience Evaluation Questionnaire, which is based on the User Engagement Scale - Short Form (UES-SF).

—> Participant Recruitment

  • Sample size = 12 undergraduate students

  • Each participant was randomly assigned to one of the two testing methods

  • None had prior experience with either testing method

  • Each participant was assigned a unique nickname for anonymisation

—> Scenario & Protocols

  • Scenario: Food Preferences (favourite food)

    • Universally relatable

    • Does not require sharing sensitive information

    • Avoids evoking negative emotions

  • Protocols:

    • Briefed participants about the study purpose and procedure

    • Obtained their written consent (Information & Consent Form)

    • Participants were assigned to either the chatbot group or the survey group

  • After answering all of the questions, participants immediately filled out the Experience Evaluation Questionnaire

  • Concluded each session with a brief interview about factors that influenced their engagement ratings in the Experience Evaluation Questionnaire and how they felt about their assigned testing method

—> Qualtrics Survey

  • Similar aesthetics to EchoXP

  • Presents the questions one at a time

—> User Engagement Scale - Short Form (UES-SF)

  • The Experience Evaluation Questionnaire is based on the User Engagement Scale - Short Form (UES-SF), a validated instrument for measuring engagement
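As background, UES-SF subscale scores are conventionally computed as the mean of each subscale's items. A minimal sketch, assuming the standard 12-item, 5-point form with item codes like "FA.1" (note that negatively worded items, such as the Perceived Usability items, are normally reverse-scored first; that step is omitted here):

```typescript
// UES-SF subscales: Focused Attention, Perceived Usability,
// Aesthetic Appeal, and Reward, each measured by three items.
type Subscale = "FA" | "PU" | "AE" | "RW";

// `responses` maps item codes (e.g. "FA.1") to 1-5 Likert ratings.
// Returns the mean score per subscale plus overall engagement
// (the mean of the four subscale means).
function scoreUES(responses: Record<string, number>) {
  const subscales: Subscale[] = ["FA", "PU", "AE", "RW"];
  const means = {} as Record<Subscale, number>;
  for (const s of subscales) {
    const items = [1, 2, 3].map((i) => responses[`${s}.${i}`]);
    means[s] = items.reduce((a, b) => a + b, 0) / items.length;
  }
  const overall =
    subscales.reduce((a, s) => a + means[s], 0) / subscales.length;
  return { ...means, overall };
}
```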

RESULTS & ANALYSIS

UES Analysis

Graph 1:

  • Chatbot group rated significantly higher in Perceived Usability (M=4.56) than survey group (M=4.17)

    • Users found the chatbot interface more intuitive and less frustrating

  • Focused Attention scores slightly lower for chatbot (M=2.78) than survey (M=2.94)

  • Aesthetic Appeal slightly lower for chatbot (M=3.06) than survey (M=3.28)

  • Reward scores are similar (chatbot: M=3.28, survey: M=3.22)

  • Overall Engagement is similar (chatbot: M=3.42, survey: M=3.40)

Graph 2:

  • Survey group rated higher on:

    • "I lost myself in this experience" (FA.1)

    • "The time I spent doing this activity just slipped away" (FA.2)

    • "The format of this activity was attractive" (AE.1)

    • "This activity appealed to my senses" (AE.3)

Graph 3:

  • Strong negative correlation between Focused Attention and Reward (r=-0.58)

    • Higher levels of absorption are associated with lower perceived reward

  • Strong positive correlations between Overall Engagement and:

    • Perceived Usability (r=0.68)

    • Aesthetic Appeal (r=0.68)

    • Reward (r=0.66)

  • Weak negative correlation between Focused Attention and Overall Engagement (r=-0.11)

  • Moderate positive correlation between Perceived Usability and Reward (r=0.26)

    • More usable interfaces are perceived as more rewarding
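For reference, the r values above are Pearson correlation coefficients; a minimal implementation for two equal-length samples:

```typescript
// Pearson correlation coefficient between two equal-length samples:
// covariance of x and y divided by the product of their standard
// deviations. Returns a value in [-1, 1].
function pearson(xs: number[], ys: number[]): number {
  const n = xs.length;
  const mean = (a: number[]) => a.reduce((s, v) => s + v, 0) / n;
  const mx = mean(xs);
  const my = mean(ys);
  let cov = 0, vx = 0, vy = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - mx;
    const dy = ys[i] - my;
    cov += dx * dy;
    vx += dx * dx;
    vy += dy * dy;
  }
  return cov / Math.sqrt(vx * vy);
}
```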

Comparison of Response Lengths

Message Activity & Sensor Activity

Post-Test Interview

Graph 4:

  • Chatbot elicited longer responses (median ≈ 8.5 words) than survey (median ≈ 6.5 words)

  • Wider variation in chatbot response lengths, showing greater range of expression

  • Maximum average word count for chatbot (16.5 words) exceeded survey maximum (9.5 words)

    • Conversational interface naturally encouraged more elaborate responses

  • Interesting pattern at lower end: chatbot minimum (2 words) vs. survey minimum (4.5 words)

    • Chatbot accommodated both detailed and brief responses, reflecting natural conversation patterns

  • Response length varied based on specific questions and participant engagement level

    • Suggests chatbot provides more flexible expression options than traditional surveys

Graph 5:

  • Substantial variation in response length throughout conversations

  • Some participants (Dolphin and Pudding) produced notably longer responses at specific points

    • More detailed responses occurred at intervals, likely for questions that naturally elicited elaboration

  • Sensor activity generally remained stable (0.8-1.2 magnitude range) with occasional spikes

  • Physical movement and messaging behaviour relationship varies by individual

  • Different interaction patterns among participants in the same group (chatbot):

    • Some showed longer engagement (Pudding), while others completed the questions early on (Brownbread, Trumpet)

Response Length vs. Emotional Expression

Graph 6:

  • Response lengths ranged from brief replies to approximately 40 words

  • Emotion intensity measured on a 0-5 scale across seven positive emotional states (joy, satisfaction, excitement, enthusiasm, pleasure, happiness, and contentment)

  • Moderate positive correlation (r=0.36) between word count and emotional intensity

  • Longer responses tend to contain somewhat higher levels of expressed emotion

  • Some shorter responses still conveyed high emotional intensity (e.g., Pudding's 10-word response scored above an intensity of 4)

  • Some longer responses exhibited relatively low emotional scores

  • Some individuals showed consistent emotional expression regardless of message length

  • Relationship is not strictly linear across all users, suggesting individualized patterns

  • Participants tried to “stress test” the chatbot by asking it questions in return

    • Chatbot demonstrated flexibility by responding to unexpected queries

  • Multiple participants described chatbot's tone as "overly nice" and too eager to please

    • Excessive empathetic personality perceived as unnatural

  • Consistent pleasantness felt artificial compared to human conversation

    • Human conversation typically contains more variation in tone and formality

  • Despite engagement success, future versions need more balanced conversational style

  • Need more authentic interaction patterns that better mimic natural conversation

  • Balancing structure with conversational authenticity remains a key design challenge

DISCUSSION

Discussion & Implications

  • Chatbot interface elicited longer, more detailed responses than traditional surveys

  • Conversational AI demonstrated ability to collect richer qualitative data

  • Integration of sensor data with self-reports provides a more holistic understanding of chronic pain

  • Potential for machine learning applications using combined data:

    • Predictive models for pain levels

    • Sentiment analysis capabilities

    • Activity classification from sensor data

    • Algorithms to help anticipate pain fluctuations

Limitations

  • Participants attempted to test system boundaries beyond the intended conversation flow

  • The computer science student demographic may have approached the chatbot more critically

  • Technical constraints emerged, including inconsistent sensor data collection

  • Fixed conversation flow and limited context awareness sometimes created disjointed interactions

  • Future needs: more adaptive conversations with better context awareness

Ethical Considerations

  • Future development requires balanced question design and authentic conversational capabilities

  • Health contexts demand a balance between data collection, empathy, and user sensitivity

  • Security concerns: data handling by LLM providers needs careful consideration

  • Need for stronger anonymization techniques and transparent data retention policies

  • Healthcare data regulation compliance is crucial for trust and protection

  • Success requires interdisciplinary collaboration: HCI, psychology, pain management, ML, data security

CONCLUSION

In Conclusion

  • The study successfully met primary objectives of investigating chatbot impact on user engagement

  • EchoXP demonstrated moderate success in enhancing user interactions

  • Participants produced longer and more detailed responses with chatbot interface

  • Differences between the interfaces were more nuanced than initially hypothesized

  • Comparative testing revealed promising insights about multimodal data collection

  • Integration of conversational and sensor data provided important contextual support for self-reports

  • Conversational AI showed potential for capturing more extensive health data in sensitive domains

  • Multimodal approach creates new pathways for more effective pain monitoring

  • Project serves as a stepping stone toward more effective, user-centered health solutions

  • Future solutions should continue to balance robust data collection with engaging user experience

Want to read more? → Click here to view the full paper