IDRNet: Intervention-Driven Relation Network for Semantic Segmentation

Jin, Zhenchao; Hu, Xiaowei; Zhu, Lingting; Song, Luchuan; Yuan, Li; Yu, Lequan

Abstract:Co-occurrent visual patterns suggest that pixel relation modeling facilitates dense prediction tasks, which inspires the development of numerous context modeling paradigms, \emph{e.g.}, multi-scale-driven and similarity-driven context schemes. Despite the impressive results, these existing paradigms often suffer from inadequate or ineffective contextual information aggregation due to reliance on large amounts of predetermined priors. To alleviate the issues, we propose a novel \textbf{I}ntervention-\textbf{D}riven \textbf{R}elation \textbf{Net}work (\textbf{IDRNet}), which leverages a deletion diagnostics procedure to guide the modeling of contextual relations among different pixels. Specifically, we first group pixel-level representations into semantic-level representations with the guidance of pseudo labels and further improve the distinguishability of the grouped representations with a feature enhancement module. Next, a deletion diagnostics procedure is conducted to model relations of these semantic-level representations via perceiving the network outputs and the extracted relations are utilized to guide the semantic-level representations to interact with each other. Finally, the interacted representations are utilized to augment original pixel-level representations for final predictions. Extensive experiments are conducted to validate the effectiveness of IDRNet quantitatively and qualitatively. Notably, our intervention-driven context scheme brings consistent performance improvements to state-of-the-art segmentation frameworks and achieves competitive results on popular benchmark datasets, including ADE20K, COCO-Stuff, PASCAL-Context, LIP, and Cityscapes. Code is available at \url{this https URL}.

Comments:	Accepted by NeurIPs 2023
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2310.10755 [cs.CV]
	(or arXiv:2310.10755v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2310.10755

Computer Science > Computer Vision and Pattern Recognition

Title:IDRNet: Intervention-Driven Relation Network for Semantic Segmentation

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators