Overview
Our goal for this workshop is to educate researchers about the technological needs of people with vision impairments while empowering them to improve algorithms to meet these needs. One key component of this event will be tracking progress on six dataset challenges, where the tasks are to recognize objects in few-shot learning scenarios, answer visual questions, ground answers, recognize visual questions with multiple answer groundings, locate objects in few-shot learning scenarios, and classify images in a zero-shot setting. The second key component will be a discussion of current research and application issues, featuring invited speakers from both academia and industry who will share their experiences in building today’s state-of-the-art assistive technologies as well as designing next-generation tools.

Important Dates
- Challenges go live: Monday, January 20 (9am CST)
- Challenge submissions due: Saturday, May 3 (9am CST)
- Abstract submissions due: Saturday, May 10 (9am CST)
- Abstract acceptance notifications: Friday, May 16 (5pm CST)
- Half-day Workshop: Thursday, June 12
Submissions
We invite two types of submissions:
Challenge Submissions
We invite submissions about algorithms for the following six challenge tasks: recognize objects in few-shot learning scenarios, answer visual questions, ground answers, recognize visual questions with multiple answer groundings, locate objects in few-shot learning scenarios, and classify images in a zero-shot setting. We accept submissions describing algorithms that are not yet published, currently under review, or already published.
The teams with the top-performing submissions will be invited to give short talks during the workshop.
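As a concrete illustration of one of the tasks, below is a minimal zero-shot image classification sketch using CLIP via the Hugging Face transformers library. The checkpoint name, image path, and candidate labels are illustrative assumptions, not the official challenge protocol or evaluation setup.

```python
# Minimal zero-shot image classification sketch with CLIP
# (checkpoint, image path, and label set are placeholders, not the challenge spec).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("example.jpg")  # hypothetical input image
labels = [
    "a photo of a coffee mug",
    "a photo of a pill bottle",
    "a photo of a remote control",
]

# Score the image against every candidate label and pick the best match.
inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image  # shape: (1, num_labels)
probs = logits.softmax(dim=-1)
print(labels[probs.argmax().item()])
```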
Extended Abstracts
We invite submissions of extended abstracts on topics related to all challenge tasks as well as assistive technologies for people with visual impairments. Papers must be at most two pages (including references) and follow the CVPR formatting guidelines using the provided author kit. Reviewing will be single-blind, and accepted papers will be presented as posters. We will accept submissions on work that is not yet published, currently under review, or already published. There will be no proceedings. Please send your extended abstracts to workshop@vizwiz.org.
Please note that we will require all camera-ready content to be accessible via a screen reader. Since making accessible PDFs and presentations may be a new process for some authors, we will host training sessions beforehand to educate and assist authors in making their content accessible.
Challenge Results
- TBD
Program
Location:
Davidson C2, Music City Center
Address: 201 Rep. John Lewis Way S, Nashville, TN 37203
Schedule:
- 1:00-1:05pm: Opening remarks
- 1:05-1:15pm: Overview of three challenges related to VQA (VQA, Answer Grounding, Single Answer Grounding Recognition), winner announcements, and analysis of results
- 1:15-1:45pm: Invited talk and Q&A with computer vision researcher (Kristen Grauman)
- Talk title: “Skill Learning with First-Person and Instructional Video”
- 1:45-2:15pm: Invited talk and Q&A with Global Accessibility Awareness Day (GAAD) Foundation representative (Jennison Asuncion)
- Talk title: “Fireside Chat with Jennison Asuncion on the Impact of AI in the Digital Accessibility Industry, the Blindness Community, and on Him Personally”
- 2:15-2:45pm: Poster spotlight talks
- 2:45-3:15pm: Break
- 3:15-3:20pm: Mid-afternoon remarks
- 3:20-3:30pm: Overview of three zero-shot and few-shot learning challenges (few-shot video object recognition, few-shot private object localization, zero-shot classification), winner announcements, and results analysis
- 3:30-4:00pm: Invited talk and Q&A with industry representative from Apple (Jeff Bigham)
- Talk title: “System-Class Accessibility with Computer Vision”
- 4:00-4:30pm: Invited talk and Q&A with human-computer interaction researcher (Amy Pavel)
- Talk title: “Beyond Accuracy: Measuring What Matters for People Using Interactive Visual Assistance Technology”
- 4:30-5:00pm: Open Q&A panel with four invited speakers
- 5:00-5:05pm: Closing remarks
- 5:05-5:30pm: Poster session
Poster List:
- HQD-EM: Robust VQA through Hierarchical Question Decomposition and Ensemble-Adaptive Margins
Seong Hyeon Noh, Jae Won Cho
- LLM2Seg: LLM-Guided Few-Shot Object Localization with Visual Transformer
Wei-Chih Yin*, Pin-Hsuan Chou*, Chao-Chi Liao*, Yu-Chee Tseng, Cheng-Kuan Lin
- BLaVe-CoT: Consistency-Aware Visual Question Answering for Blind and Low Vision Users
Wanyin Cheng, Zanxi Ruan
- Multi-Perspective LVLM Prompting for Robust Zero-Shot Image Classification
Wonjun Choi, Jeong-Cheol Lee, Jun-Hyeok Seo, Dong-Gyu Lee
- Unified Visions: Multi-Modal Transformer Fusion for Comprehensive VQA Answering
Minju Baek, Dongheon Lee, Jaehong Yoon, Joonki Paik
- Zero-shot image classification method based on multi-model ensemble voting
Heng Yang, Lianping Lu, Kexin Zhang, Lingling Li, Licheng Jiao, Long Sun, Wenping Ma
- Test-Time Augmented Ensemble for Zero-Shot Learning
Dongheon Lee, Minju Baek, Jaehong Yoon, Joonki Paik
- Evaluating Camera Placement for Last-Mile Navigation by Blind Users
Apurv Varshney, Lucas Nadolskis, Michael Beyeler
- Evaluating VLMs as Accessibility Bridges for Event Sequence Visualizations
Kazi Tasnim Zinat, Saad Mohammad Abrar, Sharmila Duppala, Saimadhav Naga Sakhamuri, Zhicheng Liu
- Beyond Blanket Masking: Examining Granularity for Privacy Protection in Images Captured by Blind and Low Vision Users
Jeffri Murrugarra-LLerena, Haoran Niu, K. Suzanne Barber, Hal Daume III, Yang Trista Cao, Paola Cascante-Bonilla
- VideoA11y-40K: A Large-Scale Dataset for Accessible Video Understanding
Chaoyu Li, Sid Padmanabhuni, Maryam Cheema, Hasti Seifi, Pooyan Fazli
- Lightweight Grounding Model Combining a CLIP-based Encoder With an Upsampling Decoder
Jihee Yoon, Seunga Lee, Haesol Jeong, Junseok Kwon
Invited Speakers:

Kristen Grauman
Professor
UT Austin

Jennison Asuncion
Co-Founder & Vice Chair
GAAD Foundation

Jeffrey Bigham
Carnegie Mellon University, Apple

Amy Pavel
Assistant Professor
UT Austin
Organizers

Danna Gurari
University of Colorado Boulder

Jeffrey Bigham
Carnegie Mellon University, Apple

Ed Cutrell
Microsoft

Everley Tseng (Yu-Yun Tseng)
University of Colorado Boulder

Josh Myers-Dean
University of Colorado Boulder

Zhuoheng Li
University of Colorado Boulder
Contact Us
For questions, comments, or feedback, please contact Danna Gurari at danna.gurari@colorado.edu.
Sponsor
