EgoLife: Towards Egocentric Life Assistant

Files
Main article (3.888Mb)
Identifiers
URI: https://hdl.handle.net/20.500.12761/2041
Author(s)
Yang, Jingkang; Liu, Shuai; Guo, Hongming; Dong, Yuhao; Zhang, Xiamengwei; Zhang, Sicheng; Wang, Pengyun; Zhou, Zitang; Xie, Binzhu; Wang, Ziyue; Ouyang, Bei; Lin, Zhengyu; Cominelli, Marco; Cai, Zhongang; Li, Bo; Zhang, Yuanhan; Zhang, Peiyuan; Hong, Fangzhou; Widmer, Joerg; Gringoli, Francesco; Yang, Lei; Liu, Ziwei
Date
2025-06-15
Abstract
We introduce EgoLife, a project to develop an egocentric life assistant that accompanies and enhances personal efficiency through AI-powered wearable glasses. To lay the foundation for this assistant, we conducted a comprehensive data collection study where six participants lived together for one week, continuously recording their daily activities - including discussions, shopping, cooking, socializing, and entertainment - using AI glasses for multimodal egocentric video capture, along with synchronized third-person-view video references. This effort resulted in the EgoLife Dataset, a comprehensive 300-hour egocentric, interpersonal, multiview, and multimodal daily life dataset with intensive annotation. Leveraging this dataset, we introduce EgoLifeQA, a suite of long-context, life-oriented question-answering tasks designed to provide meaningful assistance in daily life by addressing practical questions such as recalling past relevant events, monitoring health habits, and offering personalized recommendations. To address the key technical challenges of (1) developing robust visual-audio models for egocentric data, (2) enabling identity recognition, and (3) facilitating long-context question answering over extensive temporal information, we introduce EgoButler, an integrated system comprising EgoGPT and EgoRAG. EgoGPT is an omni-modal model trained on egocentric datasets, achieving state-of-the-art performance on egocentric video understanding. EgoRAG is a retrieval-based component that supports answering ultra-long-context questions. Our experimental studies verify their working mechanisms and reveal critical factors and bottlenecks, guiding future improvements. By releasing our datasets, models, and benchmarks, we aim to stimulate further research in egocentric AI assistants.