gordonhu608

Highlights

  • Pro

Pinned

  1. InternRobotics/G2VLM (Public)

    [CVPR 2026] G2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning

    Python · 311 stars · 9 forks

  2. mlpc-ucsd/BLIVA (Public)

    [AAAI 2024] BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions

    Python · 260 stars · 25 forks

  3. uclanlp/OpenVLThinker (Public)

    OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement

    Python · 145 stars · 7 forks

  4. MQT-LLaVA (Public)

    [NeurIPS 2024] Matryoshka Query Transformer for Large Vision-Language Models

    Python · 124 stars · 11 forks

  5. mragbench/MRAG-Bench (Public)

    [ICLR 2025] Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models

    Python · 61 stars · 3 forks