Unseen
Vision.

On-device AI for Spatial Perception.

World's first AI pendant built with and for
the blind and low-vision community.

97.2%
Detection Accuracy
10+
AI Agents
80+
Prototypes Shipped
01 The Pendant
PENDANT REF / V3.0 eyeOS HARDWARE
eyeOS Pendant device

The Pendant

Camera, LiDAR, on-device AI, speaker — worn at the chest. Spatial intelligence without a screen. Designed from the ground up for the blind and low-vision community: runs entirely on-device with no cloud dependency, integrates 10+ AI agents through the A2A protocol, and delivers environmental understanding in real time, at the weight of a keychain.

LiDAR
YDLIDAR GS2
Camera
Multimodal
AI Engine
Lightnet 7B
Connect
BT 5.0 and WiFi
Output
Speaker + Haptic
Battery
~12 hr est.
Weight
~45g
Form Factor
Chest-worn
Response
Real-time
02 Architecture

Lightnet 7B

Perception Engine
97.2%
mAP@0.5

Crosswalk-referenced traffic-light detection: the core perceptual task.

Rather than identifying signals in isolation, Lightnet 7B uses crosswalks as spatial anchors to resolve the correct pedestrian signal in complex multi-light intersections, where signals overlap, stack, or face conflicting directions across multiple lanes.
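The anchoring idea can be sketched as a geometric association step. Everything below, the function names and the alignment heuristic, is an illustrative assumption, not Lightnet 7B's actual logic:

```python
# Sketch of crosswalk-anchored signal selection (illustrative only).
# Boxes are (x_min, y_min, x_max, y_max) in image coordinates.

def center_x(box):
    return (box[0] + box[2]) / 2.0

def anchor_score(crosswalk, signal):
    """Score a candidate pedestrian signal against a crosswalk anchor:
    prefer signals horizontally aligned with the crosswalk and above it."""
    dx = abs(center_x(signal) - center_x(crosswalk))
    above = signal[3] <= crosswalk[1]          # signal sits above the crosswalk
    width = max(crosswalk[2] - crosswalk[0], 1e-6)
    alignment = max(0.0, 1.0 - dx / width)     # 1.0 = perfectly centered
    return alignment if above else 0.0

def resolve_signal(crosswalk, candidate_signals):
    """Pick the signal most plausibly governing this crosswalk."""
    scored = [(anchor_score(crosswalk, s), s) for s in candidate_signals]
    best_score, best = max(scored, key=lambda t: t[0])
    return best if best_score > 0 else None
```

The point of the anchor is visible in the return value: a perfectly detected but misaligned signal scores zero, so the crosswalk, not raw detection confidence, decides which light governs the crossing.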

Built on a YOLOv5 backbone with SE and CBAM dual attention mechanisms. Trained on 12,000+ annotated intersection frames spanning day, dusk, rain, and high-contrast conditions. The only model purpose-built for pedestrian signal detection at this granularity.
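For reference, a squeeze-and-excitation (SE) block of the kind named above reweights feature channels by their global response. A minimal pure-Python sketch; real SE blocks use learned fully connected layers, so the weights here are stand-ins:

```python
import math

def se_reweight(feature_maps, w1, w2):
    """Minimal squeeze-and-excitation sketch.
    feature_maps: list of channels, each a 2-D list of floats.
    w1, w2: stand-in excitation weights (learned in a real network)."""
    # Squeeze: global average pool each channel to one descriptor.
    desc = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature_maps]
    # Excitation: tiny two-layer gate -> per-channel weight in (0, 1).
    hidden = [max(0.0, sum(w1[i][j] * desc[j] for j in range(len(desc))))
              for i in range(len(w1))]
    gates = [1.0 / (1.0 + math.exp(-sum(w2[c][i] * hidden[i]
             for i in range(len(hidden)))))
             for c in range(len(desc))]
    # Scale: reweight each channel by its gate.
    return [[[v * gates[c] for v in row] for row in ch]
            for c, ch in enumerate(feature_maps)]
```

CBAM extends the same idea with a second, spatial attention stage; the squeeze-excite-scale pattern above is the common core.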

Lightnet 7B model architecture diagram

Lightnet 7B Model Architecture

A2A Protocol

Agent-to-Agent Network
10+
Agents

10+ specialized micro-models run simultaneously, each an expert in one perceptual task: Braille recognition · Real-time text OCR · Scene understanding · Pedestrian navigation · Depth estimation · Object tracking

Each agent is orchestrated by an Agent-to-Agent protocol that routes context across models, resolves conflicts when outputs disagree, and synthesizes a unified spatial response in under 120ms. Backed by multimodal foundation models with video and image language understanding.

Not a single model doing everything. A coordinated intelligence where each agent is an expert.
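One way such an orchestrator could look, as a hedged sketch: the agent set, the confidence-based conflict resolution, and the latency budget below are illustrative assumptions, not the A2A implementation:

```python
import time

# Illustrative agent-orchestration sketch. Each agent is a callable
# returning (label, confidence) for the shared frame context.

def orchestrate(context, agents, budget_s=0.120):
    """Run agents on shared context, resolve conflicts by confidence,
    and stop dispatching once the latency budget is spent."""
    start = time.monotonic()
    results = {}
    for name, agent in agents.items():
        if time.monotonic() - start > budget_s:
            break                       # degrade gracefully under the budget
        label, conf = agent(context)
        # Conflict resolution: keep the most confident claim per label.
        if label not in results or conf > results[label][1]:
            results[label] = (name, conf)
    return results

agents = {
    "signal":  lambda ctx: ("walk", 0.97),
    "depth":   lambda ctx: ("walk", 0.60),   # weaker, conflicting source
    "tracker": lambda ctx: ("vehicle_near", 0.88),
}
fused = orchestrate({"frame": 0}, agents)
```

Here the weaker depth claim on "walk" is overruled by the signal agent, while the tracker's independent hazard claim survives into the unified response.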

[Diagram: six specialist agents (real-time text OCR, Braille recognition, scene understanding, depth estimation, pedestrian navigation, object tracking) linked through the central A2A Protocol hub]
<120ms
End-to-End Orchestration Latency

A2A Protocol Agent-to-Agent Network

iGUI

Trimodal Context OS
3
Modalities

Three output channels. Zero learning curve.

Vision enhancement through optical passthrough. Auditory guidance via a bone-conduction speaker: ambient without isolating. Haptic feedback for directional urgency and confirmation.

Intelligence is stored as callable agents, activated by voice on demand. No screens, no menus, no app to launch. The interface disappears until the moment it's needed.
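A minimal sketch of trimodal dispatch, assuming a hypothetical event schema; `speech`, `urgency`, and `contrast_boost` are invented names, not the iGUI API:

```python
# Hypothetical trimodal dispatch: one perception event fans out to the
# three iGUI channels. Schema and channel names are illustrative.

def dispatch(event):
    """Map a perception event to visual, audio, and haptic outputs."""
    outputs = {}
    if event.get("contrast_boost"):
        outputs["visual"] = "enhance passthrough contrast"
    if event.get("speech"):
        outputs["audio"] = event["speech"]
    # Haptics encode urgency: stronger pulses for closer hazards.
    urgency = event.get("urgency", 0.0)
    if urgency > 0:
        outputs["haptic"] = "strong pulse" if urgency > 0.7 else "soft pulse"
    return outputs

msg = dispatch({"speech": "Signal is green, safe to cross.",
                "urgency": 0.3, "contrast_boost": True})
```

The "interface disappears" property falls out of the structure: an event with no speech, no urgency, and no contrast need produces no output at all.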

iGUI trimodal architecture diagram

iGUI Trimodal Context Architecture

03 Benchmark

eyeOS is the only device in its class purpose-built for spatial navigation by the visually impaired.

Three dimensions where purpose-built design produces quantifiable difference: detection precision, processing architecture, and agent coordination.

Below: a technical comparison against the nearest wearable AI peers.

Detection Accuracy
mAP@0.5
97.2%
eyeOS
31%
Rokid
23%
Meta
N/A
Omi
On Device Processing
% inference local
100%
eyeOS
28%
Rokid
18%
Meta
0%
Omi
AI Agent Coordination
count
10+
eyeOS
2
Rokid
1
Meta
1
Omi
Feature                  | eyeOS Pendant          | Meta Ray-Ban              | Omi AI               | Rokid Max
Primary function         | Spatial nav for VI     | Social capture + voice AI | Conversation capture | AR display streaming
Detection (mAP@0.5)      | 97.2% (Lightnet 7B)    | ~23% est.                 | N/A (no camera)      | ~31% est.
On-device inference      | 100% local             | Cloud-first (~18% local)  | Cloud only           | Partial (~28% local)
Inference latency        | <120ms per frame       | ~800ms (cloud RTT)        | ~1.2s (cloud RTT)    | ~400ms (partial)
Multi-agent coordination | 10+ specialized models | 1 (Meta AI)               | 1 (conversation)     | 2 (limited)
Visual input             | Camera + LiDAR         | Camera only               | No camera            | Camera only
Haptic feedback          | Yes
Open source              | Yes
Community field testing  | 18 months, 80+ devices
04 In the Field

True accessibility means no one left in the dark.

80+
Prototypes Shipped

Each unit hand-delivered to blind and low-vision participants at Nanjing School for the Blind.

Not lab tests. Real navigation, real conditions, real feedback loops that shaped every hardware iteration from first prototype to final form.

18
Months of Co-creation

Full co-design with the community from first prototype to architecture lock.

Bimonthly onsite visits. Every feature earned its place at the intersection, not in a meeting room.

<120ms
Inference Latency

On-device Lightnet 7B response per frame for pedestrian signal detection, with no cloud round trip required.

Fully functional in dead zones, tunnels, and low connectivity environments where cloud dependent devices fail entirely.

97.2%
mAP@0.5 Accuracy

Precision-recall benchmark on crosswalk-referenced traffic-light detection in dense urban intersections.

Validated across day, dusk, rain, and adverse-contrast conditions. mAP@0.5 is the standard metric for object detection performance at deployment scale.
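For context, mAP@0.5 counts a detection as correct only when its predicted box overlaps the ground truth with an intersection-over-union of at least 0.5. The helper names below are illustrative:

```python
def iou(a, b):
    """Intersection-over-union of boxes (x_min, y_min, x_max, y_max)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def is_true_positive(pred, truth, threshold=0.5):
    """The '@0.5' in mAP@0.5: a hit needs IoU >= 0.5 with ground truth."""
    return iou(pred, truth) >= threshold
```

The full metric then averages precision over recall levels and over classes; the IoU threshold above is what makes 97.2% a statement about localization, not just classification.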

05 Principles

Sincerely elevate the species.
We respect your privacy.

Perception as a Right

Over 250 million people live with moderate or severe visual impairment. 36 million are blind. They navigate a world built without them in mind.

Access to spatial awareness is not a product feature. It is a fundamental condition for autonomy.

Co-Created

Every feature was validated with real users before it shipped. 80+ prototypes delivered to Nanjing School for the Blind.

Bimonthly on-site visits. Direct feedback loops from people who rely on the system daily. Not UX panels, not simulations. 18 months of iterating at human speed.

Open by Default

Built on open source. Given back to open source.

Detection algorithms, annotated datasets, and training pipelines are released as community resources, not competitive moats. The mission scales when the infrastructure is shared.

06 Origin

AI is rewriting the grammar of hardware interaction. More intelligent models are leaving the cloud and entering the body of the machine. The question is no longer whether machines can see. The question is: who gets to benefit when they do?

Bottom-up waves don't await top-down benchmarks.

We chose the hardest entry point. We build spatial intelligence for those the world forgot to design for.

Tech advances through a prime life-force beyond man's body—romantic, wild, inevitable.

We've tasted creation's heat. That obligates us to participate—to build, to steward, and to help shape the world we owe.

2022
The Question

Nanjing, Zhujiang Road. A blind man navigating a broken sidewalk. Why can cars read traffic lights, but a person cannot safely cross the street?

2023
First Research

Nanjing School for the Blind. First hardware prototypes. Algorithm development begins. The crosswalk-referenced spatial-anchor methodology, the insight that would become Lightnet 7B, is developed, tested, and validated in the field.

2024
MVP to Market

MVP deployed as app + wearable combination. WAIC exhibition, Shanghai. CPPCC recognition. 80+ devices field-tested with the community over 18 months of co-creation.

2025
The Pendant

Full hardware integration: pendant form factor, A2A multi-agent architecture, Lightnet 7B on-device inference, iGUI trimodal output. Vision Computing. Rokid Acquired.

2026
Open Ecosystem

Detection models, datasets, and infrastructure released to the community. The platform expands beyond the pendant.

07 Team

Young, scrappy, and hungry.
We are Real Time Engagement.

DJ
Du
Founder
SZ
Sh1n3zZ
rponeawa
Software
JZ
Jinghan Zhang
Hengxi Liu
Zhengan Tian
ML / Vision
SY
Sym
Hardware
LK
Luke
Max Zhuang
Product

For we live by faith, not by sight.

2 Corinthians 5:7

Recognition

The mission continues at scale.
Perception is a right, not a privilege.
Build with Research and Empathy