
Segment Any Events with Language

ICLR 2026

Department of Computer Science, National University of Singapore

SEAL is the first Semantic-aware Segment Any Events model!


Given box prompts, our SEAL outputs both instance-level masks and open-vocabulary semantics.


Given a point prompt, our SEAL segments and classifies masks at both the instance level and the part level.

Class-agnostic object detection

I wanna drive.

Walking on the sidewalk.


Building on SEAL, we further propose SEAL++, which supports 1) prompt-free and 2) spatiotemporal Open-Vocabulary Event Instance Segmentation (OV-EIS). The left video shows class-agnostic detection from SEAL++. The two videos on the right present spatiotemporal OV-EIS.

Scene 1.

Scene 2.


Our SEAL++ also supports spatiotemporal OV-EIS with multiple semantics.
In the demo above, green indicates the Vehicle class and white indicates the Human class.

Abstract

Scene understanding with free-form language has been widely explored across diverse modalities such as images, point clouds, and LiDAR. However, related studies on event sensors are scarce or narrowly centered on semantic-level understanding. We introduce SEAL, the first Semantic-aware Segment Any Events framework, which addresses Open-Vocabulary Event Instance Segmentation (OV-EIS). Given a visual prompt, SEAL provides a unified framework that supports both event segmentation and open-vocabulary mask classification at multiple levels of granularity, including the instance level and the part level. To enable thorough evaluation of OV-EIS, we curate four benchmarks that cover label granularity from coarse to fine class configurations and semantic granularity from instance-level to part-level understanding. Extensive experiments show that SEAL substantially outperforms the proposed baselines in both accuracy and inference speed with a parameter-efficient architecture. In the Appendix, we further present a simple variant of SEAL that achieves generic spatiotemporal OV-EIS without requiring any visual prompts from users during inference. The code will be publicly available.

BibTeX