
MonoDETR: Depth-guided Transformer for Monocular 3D Object Detection

Official implementation of the paper 'MonoDETR: Depth-guided Transformer for Monocular 3D Object Detection'.

For our multi-view version, MonoDETR-MV on nuScenes dataset, please refer to MonoDETR-MV.

Introduction

MonoDETR is the first DETR-based model for monocular 3D detection that requires no additional depth supervision, anchors, or NMS, and it achieves leading performance on the KITTI val and test sets. We make the vanilla transformer in DETR depth-aware and guide the whole detection process with depth. In this way, each object adaptively estimates its 3D attributes from the depth-informative regions of the image, rather than being limited to center-around features.
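
For intuition, here is a minimal PyTorch sketch of the depth-guided idea: object queries cross-attend to a depth-aware feature map in addition to the visual features, so each query can gather 3D cues from depth-informative regions. The class, argument, and tensor names below are illustrative assumptions, not the repository's actual modules.

    import torch
    import torch.nn as nn

    class DepthGuidedDecoderLayer(nn.Module):
        # Illustrative decoder layer (assumed names, not the repo's implementation):
        # queries attend to depth-aware features first, then to visual features.
        def __init__(self, d_model=256, n_heads=8):
            super().__init__()
            self.depth_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            self.visual_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            self.norm1 = nn.LayerNorm(d_model)
            self.norm2 = nn.LayerNorm(d_model)

        def forward(self, queries, visual_feats, depth_feats):
            # queries:      (B, num_queries, d_model) learnable object queries
            # visual_feats: (B, H*W, d_model) flattened image features
            # depth_feats:  (B, H*W, d_model) depth-aware features from a depth head
            q = self.norm1(queries + self.depth_attn(queries, depth_feats, depth_feats)[0])
            q = self.norm2(q + self.visual_attn(q, visual_feats, visual_feats)[0])
            return q

    # Usage sketch with random tensors:
    layer = DepthGuidedDecoderLayer()
    q = torch.randn(2, 50, 256)      # 50 object queries
    vis = torch.randn(2, 1200, 256)  # flattened visual tokens
    dep = torch.randn(2, 1200, 256)  # flattened depth tokens
    out = layer(q, vis, dep)         # -> (2, 50, 256)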

Main Results

This repo contains only an intermediate version of MonoDETR. Our paper is still under review, but it has already been plagiarized several times, nearly character for character, in submissions to NeurIPS, CVPR, and other conferences. Given this, we plan to release the complete code after our paper is accepted. Thanks for your understanding.

The randomness of training for monocular detection can cause a variance of about ±1 AP3D. For reproducibility, we provide four training logs of MonoDETR on the KITTI val set for the car category (a more stable version is still being tuned):

We have released the checkpoints of our implementation for reproducibility. The module names may have some mismatches, which will be rectified in a few days.

Models     Val, AP3D|R40              Logs   Ckpts
           Easy     Mod.     Hard
MonoDETR   28.84%   20.61%   16.38%   log    ckpt
MonoDETR   26.66%   20.14%   16.88%   log    ckpt
MonoDETR   29.53%   20.13%   16.57%   log    ckpt
MonoDETR   27.11%   20.08%   16.18%   log    ckpt

Results of MonoDETR on the test set from the official KITTI benchmark for the car category:

Models     Test, AP3D|R40
           Easy     Mod.     Hard
MonoDETR   24.52%   16.26%   13.93%
MonoDETR   25.00%   16.47%   13.58%

Installation

  1. Clone this project and create a conda environment:

    git clone https://github.com/ZrrSkywalker/MonoDETR.git
    cd MonoDETR
    
    conda create -n monodetr python=3.8
    conda activate monodetr
    
  2. Install PyTorch and torchvision matching your CUDA version. The pinned cudatoolkit version below is only an example (CUDA 11.3); pick the one that matches your driver:

    conda install pytorch torchvision cudatoolkit=11.3 -c pytorch
    
  3. Install the requirements and compile the deformable attention ops:

    pip install -r requirements.txt
    
    cd lib/models/monodetr/ops/
    bash make.sh
    
    cd ../../../..
    
  4. Create a directory for saving training logs:

    mkdir logs
    
  5. Download the KITTI dataset and arrange the directory structure as follows:

    │MonoDETR/
    ├──...
    ├──data/KITTIDataset/
    │   ├──ImageSets/
    │   ├──training/
    │   ├──testing/
    ├──...
    

    You can also change the data path via "dataset/root_dir" in configs/monodetr.yaml. A quick layout check is sketched below.
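
Before training, you can optionally sanity-check the layout from step 5 with a few lines of Python. This helper is not part of the repo; adjust root if you changed "dataset/root_dir":

    from pathlib import Path

    root = Path("data/KITTIDataset")  # or your custom dataset/root_dir
    for sub in ("ImageSets", "training", "testing"):
        path = root / sub
        print(f"{path}: {'ok' if path.is_dir() else 'MISSING'}")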

Get Started

Train

You can modify the model and training settings in configs/monodetr.yaml and specify the GPU in train.sh:

bash train.sh configs/monodetr.yaml > logs/monodetr.log

Test

The best checkpoint will be evaluated by default. You can change this via "tester/checkpoint" in configs/monodetr.yaml:

bash test.sh configs/monodetr.yaml
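
For reference, the corresponding entry in configs/monodetr.yaml should look roughly like the sketch below; the path shown is a placeholder, not a file shipped with the repo:

    tester:
      checkpoint: path/to/checkpoint_best.pth  # placeholder: point this at your trained checkpoint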

Acknowledgment

This repo benefits from the excellent Deformable-DETR and MonoDLE.

Citation

@article{zhang2022monodetr,
  title={MonoDETR: Depth-aware Transformer for Monocular 3D Object Detection},
  author={Zhang, Renrui and Qiu, Han and Wang, Tai and Xu, Xuanzhuo and Guo, Ziyu and Qiao, Yu and Gao, Peng and Li, Hongsheng},
  journal={arXiv preprint arXiv:2203.13310},
  year={2022}
}

Contact

If you have any questions about this project, please feel free to contact zhangrenrui@pjlab.org.cn.
