
Make-It-3D Jittor Implementation

We provide a Jittor implementation of our paper "Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior".

Junshu Tang, Tengfei Wang, Bo Zhang, Ting Zhang, Ran Yi, Lizhuang Ma, and Dong Chen.

Abstract

In this work, we investigate the problem of creating high-fidelity 3D content from only a single image. This is inherently challenging: it essentially involves estimating the underlying 3D geometry while simultaneously hallucinating unseen textures. To address this challenge, we leverage prior knowledge from a well-trained 2D diffusion model to act as 3D-aware supervision for 3D creation. Our approach, Make-It-3D, employs a two-stage optimization pipeline: the first stage optimizes a neural radiance field by incorporating constraints from the reference image at the frontal view and diffusion prior at novel views; the second stage transforms the coarse model into textured point clouds and further elevates the realism with diffusion prior while leveraging the high-quality textures from the reference image. Extensive experiments demonstrate that our method outperforms prior works by a large margin, resulting in faithful reconstructions and impressive visual quality. Our method presents the first attempt to achieve high-quality 3D creation from a single image for general objects and enables various applications such as text-to-3D creation and texture editing.

Todo (Latest update: 2024/06/29)

  • Release coarse stage training code
  • Release refine stage training code
  • Release coarse stage training code with Instant NGP
  • Release all training code (coarse + refine stage)

Demo of 360° geometry

SAM + Make-It-3D


Installation

1. Download Jittor-related libraries

Please download the required libraries from here. The directory structure of the downloaded folder is as follows:

makeit3d_requirement/
│
├── jittor-1.3.9.7/
│   ├── setup.py
│   └── ...
│
├── jtorch/
│   ├── setup.py
│   └── ...
│
├── diffuser_jittor/
│   ├── setup.py
│   └── ...
│
├── transformers_jittor/
│   ├── setup.py
│   └── ...
│
├── JDiffusion/
│   ├── setup.py
│   └── ...
│
├── JNeRF/
│   ├── setup.py
│   └── ...
└── ...

2. Compile the Jittor-related libraries

After downloading the makeit3d_requirement folder, you need to compile all of the libraries. For each library listed above, run the following command in the directory containing its setup.py file:

pip install -e .

Note: Due to the dependencies between the components, it is best to compile in the order shown in the above diagram.
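
If you prefer to script this step, here is a minimal sketch that installs the libraries in the dependency order shown above; the folder names follow the diagram, and the root path is an assumption you may need to adjust:

import os
import subprocess

# Install order matters because of the dependencies between the components.
ROOT = "makeit3d_requirement"  # path to the downloaded folder (adjust if needed)
LIBS = [
    "jittor-1.3.9.7",
    "jtorch",
    "diffuser_jittor",
    "transformers_jittor",
    "JDiffusion",
    "JNeRF",
]

for lib in LIBS:
    # Equivalent to running `pip install -e .` inside each library folder.
    subprocess.run(["pip", "install", "-e", "."],
                   cwd=os.path.join(ROOT, lib), check=True)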

3. Install other dependencies

Other dependencies:

pip install -r requirements.txt

4. Download the pre-trained model


Training

Coarse stage

We use a progressive training strategy to generate a full 360° 3D geometry. Run the command below, replacing the workspace name NAME, the path of the reference image IMGPATH, and the prompt PROMPT describing the image. We first optimize the scene under frontal camera views.

python main.py --workspace ${NAME} --ref_path "${IMGPATH}" --phi_range 135 225 --iters 10000 --backbone vanilla --text ${PROMPT}
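
The progressive widening of camera views is handled by the training code; as a toy illustration only (not the repository's actual schedule), the azimuth sampling range might grow from the frontal views toward full 360° coverage like this:

import random

def sample_phi(step, total_steps, start_range=(135, 225), full_range=(0, 360)):
    # Linearly widen the azimuth (phi) range from frontal views to a full circle.
    t = min(step / total_steps, 1.0)
    lo = start_range[0] + t * (full_range[0] - start_range[0])
    hi = start_range[1] + t * (full_range[1] - start_range[1])
    return random.uniform(lo, hi)

# Early steps stay near the frontal view (phi in [135, 225]);
# by the end of training, sampled cameras cover the full circle.
print(sample_phi(0, 10000), sample_phi(10000, 10000))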

We provide example folders under results; you can run the following commands for a quick start:

python main.py --workspace teddy --ref_path demo/teddy.png --phi_range 135 225 --iters 10000 --backbone vanilla --text "a teddy bear"
python main.py --workspace teddy2 --ref_path demo/teddy-2.png --phi_range 135 225 --iters 10000 --backbone vanilla --text "a teddy bear"
  • If you want to run Make-It-3D on your own example, please make sure to obtain the depth map and mask according to the guidance in preprocess before training; a preprocessing sketch is shown after this section.

  • To speed up training, you can simply remove the backbone keyword and use Instant NGP to accelerate the coarse stage. For example, run the following command to get Instant NGP acceleration:

python main.py --workspace ${NAME} --ref_path "${IMGPATH}" --phi_range 135 225 --iters 10000 --text ${PROMPT}
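
As mentioned in the first bullet above, your own images need a depth map and a foreground mask before training. Below is a minimal preprocessing sketch using off-the-shelf tools (rembg for the mask, the Hugging Face transformers depth-estimation pipeline for depth); this is only an assumption on our part, and the guidance in preprocess may use different models and filenames:

from PIL import Image
from rembg import remove              # pip install rembg
from transformers import pipeline     # pip install transformers

img = Image.open("demo/your_image.png").convert("RGB")   # hypothetical input

# Foreground mask: rembg returns an RGBA image; the alpha channel is the mask.
mask = remove(img).split()[-1]
mask.save("demo/your_image_mask.png")

# Monocular depth estimate for the reference view.
depth_estimator = pipeline("depth-estimation")
depth = depth_estimator(img)["depth"]   # a PIL image of the predicted depth
depth.save("demo/your_image_depth.png")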

Refine stage

We provide an example for the refine stage. Before refine-stage training, you should download the following examples into your workspace. Make sure the downloaded files are placed in the following directory structure:

Make-It-3D/
│
├── results/
│   ├── $WORKSPACE_NAME$/
│   │   ├── mvimg/
│   │   │   ├── df_epxxx_000_depth.png
│   │   │   ├── df_epxxx_000_mask.png
│   │   │   ├── df_epxxx_000_normal.png
│   │   │   ├── df_epxxx_000_rgb.png
│   │   │   ├── df_epxxx_poses.npy
│   │   │   └── ...
│   │   └── refine/
│   └── ...
└── ...
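
Before launching the refine stage, a small sanity-check sketch (assuming a hypothetical workspace named teddy) can verify that the coarse-stage multi-view renders and poses are in place, following the filename patterns in the tree above:

import glob
import os

workspace = "results/teddy"  # hypothetical workspace name
mvimg = os.path.join(workspace, "mvimg")

poses = glob.glob(os.path.join(mvimg, "df_ep*_poses.npy"))
rgbs = glob.glob(os.path.join(mvimg, "df_ep*_rgb.png"))
masks = glob.glob(os.path.join(mvimg, "df_ep*_mask.png"))

assert poses, "missing camera poses; run the coarse stage or download the example first"
assert rgbs and masks, "missing multi-view renders in mvimg/"
print(f"found {len(rgbs)} views and {len(poses)} pose file(s)")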

Teddy bear

You can refine the teddy bear texture by following this guidance:

python main.py --workspace ${WORKSPACE_NAME} --ref_path "demo/teddy.png" --phi_range 0 90 --fovy_range 50 70 --fov 60 --refine --refine_iter 3000 --backbone vanilla --text "a teddy bear"

Important Note

Hallucinating 3D geometry and generating novel views from a single image of a general object is a challenging task. While our method demonstrates strong capability in creating 3D content from most images with a single centered object, it may still encounter difficulties in reconstructing solid geometry in complex cases. If you encounter any bugs, please feel free to contact us.

Citation

If you find this code helpful for your research, please cite:

@InProceedings{Tang_2023_ICCV,
    author    = {Tang, Junshu and Wang, Tengfei and Zhang, Bo and Zhang, Ting and Yi, Ran and Ma, Lizhuang and Chen, Dong},
    title     = {Make-It-3D: High-fidelity 3D Creation from A Single Image with Diffusion Prior},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {22819-22829}
}

Acknowledgments

This code borrows heavily from Stable-Dreamfusion.
