Investigating How Large Language Model Responses Should Be Displayed to Everyday Augmented Reality Users

Supervisors: Dr Joseph O'Hagan and Dr Euan Freeman

School: Computing Science

Description:

A great deal of attention has recently focused on the advancing capabilities of artificial intelligence (AI) tools, e.g., the large language models (LLMs) that underpin virtual assistants such as ChatGPT. There is a growing need to better understand how users will interact with these tools in the future, especially given the increased interest in embedding AI tools into novel device form factors, e.g., the wearable Humane AI Pin.

Everyday augmented reality (AR) offers significant potential to enhance user interactions with AI tools and, in turn, to integrate the benefits of AI into the user’s perception of the world: it brings experiences of, and interactions with, AI into our everyday lives, and offers direct routes for auditory and visual feedback based on AI instruction and output.

While current user-LLM interactions are limited to text/speech input and output, everyday AR will enable new, direct ways of providing visual and auditory inputs and outputs that go far beyond the capabilities of existing technologies. For example, a user might clarify their query to the LLM by physically touching the object in question. Likewise, an LLM might provide visual feedback to the user (e.g. visually outlining an object of interest) rather than relying on textual/auditory feedback to relay its answer.
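
To make the output side of this concrete, the sketch below shows one possible (entirely hypothetical) structured response format in which the LLM returns both a textual answer and an optional visual directive that an AR client could map onto a highlight in the scene. The schema, field names, and rendering call are illustrative assumptions, not an existing API.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class VisualDirective:
        """Hypothetical instruction from the LLM to highlight something in the scene."""
        target_object_id: str  # identifier of the object the user touched or asked about
        style: str             # e.g. "outline", "arrow", "label"

    @dataclass
    class LLMResponse:
        """Hypothetical structured LLM output combining text and optional visual feedback."""
        text: str
        visual: Optional[VisualDirective] = None

    def present(response: LLMResponse) -> None:
        """Sketch of how an AR client might route a response to its output channels."""
        if response.visual is not None:
            # Hand the directive to the AR renderer, e.g. draw an outline around
            # the referenced object (the print stands in for the renderer call).
            print(f"[AR] {response.visual.style} object '{response.visual.target_object_id}'")
        # Complement (or fall back to) conventional text/speech output.
        print(f"[text/speech] {response.text}")

    present(LLMResponse(
        text="That plant is a peace lily; it prefers indirect light.",
        visual=VisualDirective(target_object_id="plant_03", style="outline"),
    ))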

This project will focus on the latter of these through a lab-based user study. The study will investigate how LLM responses might be presented to an everyday AR user, comparing user experience across different presentation types (e.g. a traditional window-like display box, information embedded into the user’s surrounding environment, etc.).
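
A plausible way to structure such a study is as a within-subjects comparison of a small set of presentation conditions with a counterbalanced presentation order. The sketch below assumes four illustrative conditions (only the first two are named in the description above; the others are placeholders) and generates a balanced Latin square of condition orders for successive participants.

    PRESENTATION_CONDITIONS = [
        "window_box",       # traditional window-like display box
        "world_embedded",   # information embedded into the surrounding environment
        "object_anchored",  # placeholder: label/outline anchored to the referenced object
        "audio_only",       # placeholder baseline: no visual presentation
    ]

    def balanced_latin_square(conditions: list[str]) -> list[list[str]]:
        """Return one condition order per row; for an even number of conditions,
        each condition appears in each serial position, and follows every other
        condition, equally often across rows (Williams design)."""
        n = len(conditions)
        orders = []
        for row in range(n):
            order = []
            for col in range(n):
                # Offsets follow the standard 0, n-1, 1, n-2, ... pattern.
                idx = (row + col // 2) % n if col % 2 == 0 else (row - (col + 1) // 2) % n
                order.append(conditions[idx])
            orders.append(order)
        return orders

    for participant, order in enumerate(balanced_latin_square(PRESENTATION_CONDITIONS), start=1):
        print(f"P{participant}: {order}")

For more participants than conditions, the same set of orders would simply be reused in rotation.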

In doing so, this project will explore how everyday AR user-LLM interactions might be improved by presenting LLM responses in ways that take advantage of the unique capabilities of everyday AR devices.

A timeline of the core steps of this project follows:
• (a) building a corpus of AI-driven search results for queries in different domains (e.g. education, manufacturing, everyday life), as sketched below [weeks 1-3]
• (b) designing prototype AR presentations of these results ranging from augmented displays to embedding information into the world around the user [weeks 4-6]
• (c) evaluating these prototype AR presentations in a Wizard-of-Oz lab-based user study [weeks 7-10]
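
For step (a), a minimal sketch of how the corpus might be collected is given below. It assumes access to some chat-style LLM, with ask_llm as a stand-in for whichever client (API or locally hosted model) the project ends up using, and the example queries are placeholders rather than the real query set.

    import json
    from dataclasses import asdict, dataclass

    @dataclass
    class CorpusEntry:
        domain: str
        query: str
        response: str

    # Placeholder queries per domain; the real query set would be developed in weeks 1-3.
    QUERIES = {
        "education": ["Explain photosynthesis in two sentences."],
        "manufacturing": ["What does this torque setting on the wrench mean?"],
        "everyday life": ["How do I descale a kettle?"],
    }

    def ask_llm(query: str) -> str:
        """Stand-in for a call to the chosen LLM API or locally hosted model."""
        return f"(stub response for: {query})"

    def build_corpus(path: str = "corpus.json") -> None:
        """Collect one response per query and store the corpus as JSON for reuse in the study."""
        corpus = [
            CorpusEntry(domain=domain, query=query, response=ask_llm(query))
            for domain, queries in QUERIES.items()
            for query in queries
        ]
        with open(path, "w", encoding="utf-8") as f:
            json.dump([asdict(entry) for entry in corpus], f, indent=2)

    if __name__ == "__main__":
        build_corpus()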