Why do the hyperparameters for the training pipeline (train_search.py) and the final evaluation pipeline (train.py) differ so much? #107

Open
NdaAzr opened this issue Aug 1, 2019 · 3 comments

Comments

NdaAzr commented Aug 1, 2019

I am wondering why the hyperparameters in the training (search) pipeline are different from those in the final evaluation pipeline.

For example, here are the hyperparameters for CIFAR, in this format: training pipeline value -> final evaluation pipeline value:
cells: 8 -> 20
batch size: 64 -> 96
initial channels: 16 -> 36
epochs: 50 -> 600
drop path probability: 0.3 -> 0.2
auxiliary head: no -> yes (weight 0.4)
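
To make the comparison concrete, here is a minimal sketch of the two configurations side by side (the key names are my assumption of the corresponding argparse flags in train_search.py and train.py, so check them against your checkout):

# Sketch only: search vs. final-evaluation settings from the list above.
# "layers" is the number of cells; key names assumed to mirror the repo's flags.
search_cfg = dict(layers=8, batch_size=64, init_channels=16,
                  epochs=50, drop_path_prob=0.3, auxiliary=False)
eval_cfg = dict(layers=20, batch_size=96, init_channels=36,
                epochs=600, drop_path_prob=0.2, auxiliary=True,
                auxiliary_weight=0.4)

for key in sorted(set(search_cfg) | set(eval_cfg)):
    print(f"{key:>17}: search={search_cfg.get(key)}  eval={eval_cfg.get(key)}")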

NdaAzr commented Aug 1, 2019

I found the answer to this question here:

https://openreview.net/forum?id=S1eYHoC5FX

For convolutional cells:

Our setup of #cells (8->20), #epochs (600) and weight for the auxiliary head (0.4) in the final evaluation exactly follows Zoph et al., 2018. The #init_channels is enlarged from 16 to 36 to ensure a comparable model size (~3M) with other baselines. Given those settings, we then use the largest possible batch size (96) for a single GPU. The drop path probability was tuned wrt the validation set among the choices of (0.1, 0.2, 0.3) given the best cell learned by DARTS.

NdaAzr closed this as completed Aug 1, 2019
NdaAzr reopened this Aug 1, 2019
YANGWAGN commented Sep 2, 2019

Hi, NdaAzr! In the code, I understand how to use train_search.py; however, I don't see the code that obtains the best architecture and saves the final architecture parameters. Also, how do I construct the architecture from those parameters in train.py?
Thank you!

NdaAzr commented Sep 3, 2019

Hi @YANGWAGN,

When you run train_search.py, model.genotype() gives you the best learned cell. So you need to train it for a number of epochs and keep the genotype with the highest validation accuracy.
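
A minimal sketch of how that loop might look (model.genotype() follows this repo's search network; infer() is a stand-in for the repo's per-epoch validation step, so treat this as an illustration rather than the exact code):

import logging

def search_and_keep_best(model, epochs, infer):
    """Run the search for `epochs` epochs and keep the best-scoring genotype.

    `model` is assumed to expose .genotype() like the search network in this
    repo; `infer` is assumed to return the validation accuracy for one epoch.
    """
    best_acc, best_genotype = 0.0, None
    for epoch in range(epochs):
        valid_acc = infer(epoch)             # one epoch of search + validation
        genotype = model.genotype()          # cell discovered so far
        logging.info('epoch %d acc %f genotype %s', epoch, valid_acc, genotype)
        if valid_acc > best_acc:
            best_acc, best_genotype = valid_acc, genotype
    return best_genotype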

Then, you need to add this genotype to genotypes.py and run train.py. See this example:

DARTS_v2 = Genotype(normal=[('dil_conv_3x3', 0), ('skip_connect', 1), ('skip_connect', 1), ('sep_conv_3x3', 2), ('sep_conv_3x3', 2), ('sep_conv_3x3', 0), ('skip_connect', 1), ('dil_conv_3x3', 0)], normal_concat=range(2, 6), reduce=[('dil_conv_3x3', 1), ('sep_conv_3x3', 0), ('max_pool_3x3', 0), ('dil_conv_5x5', 2), ('dil_conv_5x5', 3), ('max_pool_3x3', 1), ('max_pool_3x3', 0), ('max_pool_3x3', 1)], reduce_concat=range(2, 6))

DARTS = DARTS_v2
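
Once that entry exists in genotypes.py, train.py builds the evaluation network from it by name. A rough sketch of that lookup (module and constructor names are assumed to match this repo's cnn/ code, so verify against train.py):

# Sketch only: resolve a genotype by name and build the evaluation network.
# `genotypes` and `model` are modules from this repo; NetworkCIFAR's argument
# order (C, num_classes, layers, auxiliary, genotype) is assumed from model.py.
import genotypes
from model import NetworkCIFAR

genotype = getattr(genotypes, 'DARTS')          # same idea as train.py's --arch flag
net = NetworkCIFAR(36, 10, 20, True, genotype)  # init_channels, classes, cells, auxiliary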
