Siwei Yang
Publications
WebAgent-90K: A Large-Scale Dataset for Fine-Tuning Agent for Automatic Web Browsing Tasks
In this paper, we present WebAgent-90K, a web-interaction dataset of around 90K tasks collected via Evol-Instruct and an automated web agent based on GPT-4V. The dataset can be used to train an automated web agent built on an open-source VLM. The LLaVA-v1.5 model we fine-tuned on WebAgent-90K performs comparably to GPT-4V on WebVoyager.
Bingchen Zhao, Siwei Yang, Cihang Xie
PDF
Code
HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing
This study introduces HQ-Edit, a high-quality instruction-based image editing dataset with around 200K edits. Unlike prior approaches relying on human feedback, we devise a scalable data collection pipeline leveraging self-instruct with advanced foundation models, namely GPT-4V and DALL-E 3.
Mude Hui, Siwei Yang, Bingchen Zhao, Yichun Shi, Heng Wang, Peng Wang, Yuyin Zhou, Cihang Xie
PDF
Cite
Code
Project Page
Demo
AQA-Bench: An Interactive Benchmark for Evaluating LLMs’ Sequential Reasoning Ability in Algorithmic Environments
This paper presents AQA-Bench, a benchmark for evaluating LLMs’ sequential reasoning abilities through interactive environments that require the model to execute algorithms such as binary search, DFS, and BFS. Our findings include (1) inverse scaling between model size and performance, (2) a nuanced impact of naive in-context examples due to over-fitting in ICL, (3) weak models failing mainly because they cannot start well, and (4) impressive improvement from a few given predecessor steps that follow the optimal policy.
Siwei Yang, Bingchen Zhao, Cihang Xie
PDF
Cite
Code
AsyInst: Asymmetric Affinity with DepthGrad and Color for Box-Supervised Instance Segmentation
Because of an optimization problem in the earlier symmetric pairwise affinity loss, that loss is compatible only with color affinity and not with other modalities. Our method alleviates this issue by introducing asymmetry, which not only makes the loss compatible with depth gradient affinity but also improves performance with color affinity.
Siwei Yang, Longlong Jing, Junfei Xiao, Hang Zhao, Alan Yuille, Yingwei Li
PDF
Cite
Contrastive Multi-Task Dense Prediction
We find that in a multi-task model, task-specific features follow a cross-task contrastive distribution; for example, pixels with the same semantic label have similar features for depth estimation. We therefore devise a regularization method that improves multi-task performance by enhancing this distribution.
Siwei Yang, Hangrong Ye, Dan Xu
PDF
Cite
XCon: Learning with Experts for Fine-grained Category Discovery
Category discovery within a fine-grained dataset is challenging. We present a method that learns to do so by partitioning the dataset into k sub-groups, and it shows improved performance on several fine-grained datasets.
Yixin Fei, Zhongkai Zhao, Siwei Yang, Bingchen Zhao
PDF
Cite
Code
Slides
Reducing the feature divergence of RGB and near-infrared images using Switchable Normalization
Instance normalization reduces feature divergence between modalities, while batch normalization preserves the discriminative distribution. Segmentation models therefore achieve better performance by using both kinds of normalization.
Siwei Yang, Shaozuo Yu, Bingchen Zhao, Yin Wang
PDF
Cite
Code
Project
Slides