NSF RETTL: Collaborative Research: Advancing STEM Online Learning by Augmenting Accessibility with Explanatory Captions and AI
Project Description (NSF IIS-2119531)
Videos are a popular medium for online learning, and captions are essential for making them accessible to all students. This research distinguishes two types of video captions: conventional closed captions and explanatory captions. Closed captions are a text representation of a video's spoken content, while explanatory captions are created to give students insight into its visual, textual, and audio content. Existing technologies have focused on automatically generating closed captions or improving their quality; for STEM learning, explanatory captions have the potential to play a new role. This project will devise Q/A mechanisms and interaction designs that enable students and instructors to collaboratively generate explanatory captions for STEM videos. The proposed technologies will augment accessibility and learning experiences for under-served populations, including the Deaf and Hard-of-Hearing (DHH) community of 48 million Americans, while also improving comprehension for non-native English speakers, even those without hearing impairments. Evaluation sites include Gallaudet University, the world’s only liberal arts university dedicated exclusively to educating DHH learners, and the University of Illinois at Urbana-Champaign, which has the largest international student population among U.S. public institutions and supports students with disabilities in inclusive learning environments.
This interdisciplinary research draws from and contributes to computer science, learning science, and accessibility practice in the following areas. The first step is discovering new knowledge about how accessibility-enabled videos (with explanatory and closed captions) broaden the participation of under-served populations in STEM learning. This knowledge will provide the foundation for developing a theory of how explanatory captions contribute to learning, together with effective mechanisms, based on crowdsourced human contributions and machine learning algorithms, for creating explanatory captions for STEM videos at different learning stages (e.g., preparing, tracking, troubleshooting, and reflecting). The investigators will then use the theory to create a novel chatbot that enables knowledge sharing among students with diverse backgrounds. Two theoretical frameworks, ICAP (interactive, constructive, active, and passive) and the Community of Inquiry, will guide the evaluation of how explanatory captions and chatbots contribute to learning. Finally, the team will develop an empirical understanding of how accessibility augmented with AI agents (e.g., chatbots) impacts students' and instructors' practices.
Faculty
Research Assistants
Publications
- QG-SMS: Enhancing Test Item Analysis via Student Modeling and Simulation
Annual Meeting of the Association for Computational Linguistics (ACL), 2025.
- Reference-based Metrics Disprove Themselves in Question Generation
Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.
- TOWER: Tree Organized Weighting for Evaluating Complex Instructions
Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.
- PLUG: Leveraging Pivot Language in Cross-Lingual Instruction Tuning
Annual Meeting of the Association for Computational Linguistics (ACL), 2024.
- Pre-training Language Models for Comparative Reasoning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023.
- Creative Research Question Generation for Human-Computer Interaction Research
Workshop on Human-AI Co-Creation with Generative Models (HAI-GEN) at
ACM International Conference on Intelligent User Interfaces (IUI), 2023.
- Scientific Comparative Argument Generation
Third Document Intelligence Workshop (DI) at
ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2022.
- A Unified Encoder-Decoder Framework with Entity Memory
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
- Retrieval Augmentation for Commonsense Reasoning: A Unified Approach
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
- Diversifying Content Generation for Commonsense Reasoning with Mixture of Knowledge Graph Experts
Findings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2022.
- Dict-BERT: Enhancing Language Model Pre-training with Dictionary
Findings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2022.