PI: Dr. Matthew Peterson
Co-PI: Helen Armstrong
Research Assistants: Ashley Anderson, Kayla Rondinelli, Rebecca Planchart, and Kweku Baidoo
The NC State Laboratory for Analytic Sciences team, led by TPM Christine B.
RESEARCH OBJECTIVE: Discover innovations in interface design that empower intelligence analysts to efficiently validate large language model (LLM) outputs.
Interact with the demo and read more about our research.
As we consider a future of human/machine teaming, we seek an interface environment through which humans might augment their cognitive abilities with AI. We look to a scenario in which human insight combines with AI support to leverage the unique strengths of both.
In our project “Developing Visual Conventions for Explainable LLM Outputs in Intelligence Analysis Summaries” (Conventions), our research team positioned human sensemaking in this complementary human/AI space.
When users can’t fully grasp how automated systems work, their willingness to rely on those systems depends heavily on trust, particularly in complex scenarios where a comprehensive understanding of an AI system is impractical. The literature establishes trust as a crucial factor in human-machine teaming with AI. Humans tend either to trust these systems too much (more than their capabilities warrant) or too little (dismissing them quickly, even when their results are accurate). Appropriate trust calibration via user interface design offers one possible answer.
Our Conventions project considered how uncertainty might be effectively communicated to analysts through visual conventions. We then embedded these conventions inside a larger sensemaking interface through which we explore several methods of trust calibration, including verification, multi-agent dialog, context, and user oversight. We began with an interactive demo (MAVs) that uses the common interface convention of a dashboard. We then created three future-facing speculative interfaces to explore how AI capabilities might be leveraged to support appropriate trust calibration between an intelligence analyst and an LLM.
This material is based upon work done, in whole or in part, in coordination with the Department of Defense (DoD). Any opinions, findings, conclusions, or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the DoD and/or any agency or entity of the United States Government.
In the first of these future-facing speculative interfaces, we considered verification: how might the interface use query recommendations, nudging, verification, and source investigation to support trust calibration?
In the second speculative interface, we dug into the concept of multi-agent dialog, designing the interaction as a conversational user interface that engages the human user with multiple agents. Envisioning different functionality as distinct agents helps users form a clear mental model of the AI system’s capabilities.
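As an illustration only, the minimal sketch below shows one way a conversational interface might attribute every reply to a named agent with a declared capability, so the user can see which part of the system is speaking. The agent names, keyword routing rule, and stub responses are hypothetical placeholders for this sketch and are not drawn from the Conventions prototypes.

```typescript
// Illustrative sketch: a conversational interface that surfaces distinct,
// named agents so the user can build a mental model of what each one does.
// Agent names, routing, and reply text are hypothetical placeholders.

interface Agent {
  name: string;          // shown to the user with every reply
  capability: string;    // one-line description of what this agent can do
  respond(query: string): string;
}

// Two deliberately separate capabilities: drafting summaries and listing
// the sources behind a claim so the analyst can verify it.
const summarizer: Agent = {
  name: "Summarizer",
  capability: "Condenses retrieved documents into a draft summary.",
  respond: (query) => `Draft summary for: "${query}" (stub response)`,
};

const sourceChecker: Agent = {
  name: "Source Checker",
  capability: "Lists the sources behind a claim for verification.",
  respond: (query) => `Sources relevant to: "${query}" (stub response)`,
};

const agents: Agent[] = [summarizer, sourceChecker];

// Naive keyword routing stands in for whatever dispatch logic a real system
// would use; the point is that each reply is attributed to exactly one agent.
function route(query: string): Agent {
  return /source|verify|evidence/i.test(query) ? sourceChecker : summarizer;
}

function converse(query: string): string {
  const agent = route(query);
  return `[${agent.name}] ${agent.respond(query)}`;
}

// Listing each agent and its capability up front reinforces the user's
// mental model of what the system can and cannot do.
for (const agent of agents) {
  console.log(`${agent.name}: ${agent.capability}`);
}

// Example exchange: each reply is labeled with the responding agent.
console.log(converse("Summarize the reporting on the port closure."));
console.log(converse("Which sources support the claim about the port closure?"));
```

Attributing replies to distinct agents, rather than to a single undifferentiated assistant, is one design choice for helping users calibrate how much to trust each kind of output.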
In the third, we considered context. We explored how the interface might intuitively respond to the needs of specific users, customers, and/or storylines to decrease cognitive load and tailor responses. This interface supports trust calibration both by providing more relevant data and by enabling the user to adjust the analytic and visualization sensitivities to match the context of the scenario.