YOLO World
An LLM/VLM tool to aid blind and visually impaired (BVI) people in day-to-day navigation
 
                 
Tech Stack
- Python
- PyTorch
- LLM
- Jupyter
- Magic Leap
Key Achievements
- Designed a pipeline to generate activity graphs from video input 
- Integrated pipeline with Magic Leap hardware 
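The activity-graph idea above can be sketched minimally as follows. This is a hypothetical illustration, not the project's actual implementation: the `Observation` structure and the labels stand in for real per-frame output from an open-vocabulary detector, and the "graph" is reduced to a simple object-to-last-seen-location mapping.

```python
from dataclasses import dataclass

# Hypothetical sketch: each video frame yields object detections;
# we fold them into a mapping from object label to the most recent
# (timestamp, location) pair, so later queries can recover
# "where was X last seen".

@dataclass
class Observation:
    timestamp: float   # seconds since the session started
    obj: str           # detected object label, e.g. "red book"
    location: str      # scene/location label, e.g. "a desk at home"

def build_activity_graph(observations):
    """Keep only the most recent sighting of each object."""
    graph = {}
    for obs in observations:
        last = graph.get(obs.obj)
        if last is None or obs.timestamp > last[0]:
            graph[obs.obj] = (obs.timestamp, obs.location)
    return graph

observations = [
    Observation(10.0, "red book", "a desk at home"),
    Observation(120.0, "white car", "next to a stop sign"),
    Observation(300.0, "cup", "a wooden table"),
]
graph = build_activity_graph(observations)
```

A real pipeline would add edges between objects and scenes rather than a flat mapping, but the last-seen lookup is the part the use case below depends on.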
Project Background
This project is an application that takes real-time video input, describes what it sees to the user, and allows the user to query past observations in natural language.
Use Case
Imagine a blind or visually impaired individual using the YOLO World application to navigate their daily life: they scan their home and receive the description, "There is a desk with a red book sitting on it."

They then walk to a coffee shop, receiving audio descriptions along the way: "There is a white car stopped next to a stop sign." At the coffee shop, they order a drink and sit down: "There is a cup sitting on a wooden table."

Suddenly, they think, "A book would go perfectly with this coffee," but they cannot remember where they left their book.

They can ask YOLO World, "Where did I leave my book?" and it will reply, "The book is sitting on a desk at your home."