Neo: Intelligent 3D Room Scanning

AWS, GCP, Firebase, Python, PyTorch, TensorFlow, Flutter, Swift, PostgreSQL

LiDAR-based navigation AI agent for visually impaired users; TreeHacks 2025

At TreeHacks 2025, I led the development of Neo, an application that uses LiDAR and point clouds to generate precise 3D scans of indoor environments as OBJ files. Neo is designed with accessibility in mind and supports users, including those with visual impairments, through the features below.

Key Features & Functionalities

  • Dynamic Scanning Controls: Toggle scanning/sampling, RGB visualization, and point cloud display. Easily clear/reset the point buffer to ensure accurate scanning.
  • Flexible Data Management: Save detailed scans as `.ply` files (in both ASCII and binary formats), export or delete previously saved scans, and benefit from an increased sampling rate for enhanced detail.
  • Voice-Guided Navigation: By integrating OpenAI's Whisper, fine-tuned to accurately interpret room scans, the app provides real-time, audio-based navigational cues. Users simply ask where specific objects or locations are, and Neo responds with clear, actionable guidance.
  • Historical Data Storage: A robust SQL database stores past scans, ensuring users can access and review their historical data seamlessly.
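
To make the `.ply` export above concrete, here is a minimal sketch of writing a colored point cloud in both ASCII and binary little-endian PLY. It is an illustration only, not Neo's actual export code: the function name `write_ply` and the `(x, y, z, r, g, b)` tuple layout are assumptions made for this example.

```python
import struct


def write_ply(path, points, binary=False):
    """Write colored points to a PLY file (hypothetical helper, not Neo's code).

    points: iterable of (x, y, z, r, g, b) with float coordinates
    and integer color channels in the range 0..255.
    """
    points = list(points)
    fmt = "binary_little_endian" if binary else "ascii"
    header = "\n".join([
        "ply",
        f"format {fmt} 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",
        "property uchar green",
        "property uchar blue",
        "end_header",
    ]) + "\n"
    with open(path, "wb") as f:
        # The header is always plain ASCII text, even for binary files.
        f.write(header.encode("ascii"))
        for x, y, z, r, g, b in points:
            if binary:
                # 3 little-endian float32 values + 3 unsigned bytes = 15 bytes.
                f.write(struct.pack("<fffBBB", x, y, z, r, g, b))
            else:
                f.write(f"{x} {y} {z} {r} {g} {b}\n".encode("ascii"))
```

Calling `write_ply("room.ply", points, binary=True)` produces the more compact file. The binary body takes a fixed 15 bytes per point, which is the usual reason to prefer it for dense scans, while the ASCII variant is easier to inspect by hand.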

Impact on the Blind Community

Neo's voice-guided navigation is built for blind and visually impaired users. By delivering real-time audio cues and detailed spatial information, the app aims to help users navigate indoor spaces with greater confidence, safety, and independence. Connecting a digital scan to the physical room in this way is intended to support inclusion and a more autonomous daily experience.

Technological Stack

  • Frontend: Developed using Swift (with ARKit integration for advanced LiDAR functionality), Flutter, React, and JavaScript to create a responsive, cross-platform user experience.
  • Backend: A Python-powered API (leveraging frameworks like Flask) orchestrates real-time data processing and integrates machine learning models to enhance scan detection and audio responsiveness.
  • Cloud & Data Management: Firebase underpins the app's authentication, real-time database, and cloud storage needs, ensuring scalability and secure data management. Additional technologies, such as Docker for containerized deployments and CI/CD pipelines, supported rapid development and iteration.
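
As a rough illustration of how a backend step might turn scan data into an audio cue, the sketch below converts a user position, heading, and target position into a short sentence using clock directions. Everything here is an assumption for the example: the function name `describe_direction`, the 2D floor-plane coordinates, and the clock-position phrasing are not taken from Neo's implementation.

```python
import math


def describe_direction(user_xy, heading_deg, target_xy, label):
    """Return a short spoken cue locating `label` relative to the user.

    user_xy, target_xy: (x, y) floor-plane positions in meters.
    heading_deg: direction the user faces, counterclockwise from the +x axis.
    (Hypothetical helper for illustration only.)
    """
    dx = target_xy[0] - user_xy[0]
    dy = target_xy[1] - user_xy[1]
    distance = math.hypot(dx, dy)
    # Angle of the target relative to the user's heading, wrapped to [-180, 180).
    bearing = math.degrees(math.atan2(dy, dx)) - heading_deg
    bearing = (bearing + 180.0) % 360.0 - 180.0
    # Clock positions: 12 is straight ahead, 3 is to the right.
    # A positive bearing is counterclockwise (to the left), so negate it.
    hour = round(-bearing / 30.0) % 12 or 12
    return f"The {label} is {distance:.1f} meters away at {hour} o'clock."


# Example: the user faces +y; the door is 2 m straight ahead.
print(describe_direction((0, 0), 90, (0, 2), "door"))
```

The returned string could then be handed to any text-to-speech component. Clock positions are used here only because they give a compact, unambiguous direction; a real system would also need the user's pose from the device and object positions from the scan.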