Referring Expression Comprehension: Grounding natural language in visual scenes using the Words-as-Classifiers approach
nlp machine-learning computer-vision deep-learning clip vgg19 referring-expressions vision-language multimodal-ai grounded-semantics refcoco
Updated Feb 7, 2026 - Python