Some problems when I train the model #9

Open
ustczhouyu opened this issue Nov 24, 2018 · 7 comments

Comments

@ustczhouyu

I am using ICDAR2015 to train the model.
The file sample_train_data/MLT/trainMLT.txt lists the ICDAR2015 localization training images, such as icdar-2015-Ch4/Train/img_1.jpg, and sample_train_data/MLT_CROPS/gt.txt lists the ICDAR2015 recognition training images, such as word_1.png, "Genaxis Theatre".
I have not changed any other paths. I start training with:
python3 train.py -train_list=sample_train_data/MLT/trainMLT.txt -batch_size=8 -num_readers=5 -debug=0 -input_size=512 -ocr_batch_size=256 -ocr_feed_list=sample_train_data/MLT_CROPS/gt.txt
The output is:
root@10ca3ad2a7d1:/home/zy/jupyter/recognition/spotter/E2E-MLT-master# python3 train.py -train_list=sample_train_data/MLT/trainMLT.txt -batch_size=8 -num_readers=5 -debug=0 -input_size=512 -ocr_batch_size=256 -ocr_feed_list=sample_train_data/MLT_CROPS/gt.txt
Using E2E-MLT
loading model from e2e-mlt.h5
e2e-mlt.h5
1000 training images in sample_train_data/MLT/trainMLT.txt
1000 training images in sample_train_data/MLT/trainMLT.txt
1000 training images in sample_train_data/MLT/trainMLT.txt
1000 training images in sample_train_data/MLT/trainMLT.txt
1000 training images in sample_train_data/MLT/trainMLT.txt
4468 training images in sample_train_data/MLT_CROPS/gt.txt
4468 training images in sample_train_data/MLT_CROPS/gt.txt
I waited for half an hour, but there was no further output. Can you help me? Thank you.

@MichalBusta
Owner

Hi, this looks like a problem with data feeding.

  • You can try the -debug=1 flag to see the training data.

There is a piece of bad code in data_gen.py that silently skips any image that does not exist or cannot be read:

  if not os.path.exists(im_name):
      continue
  im = cv2.imread(im_name)
  if im is None:
      continue
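One way to rule out the silent-skip case before training is to scan a list file for entries that do not exist on disk. The sketch below is a minimal check and is not part of the repository: `find_missing_images` is a hypothetical helper, and it assumes that each line starts with an image path (optionally followed by a comma and a label, as in gt.txt) and that relative paths are resolved against the directory containing the list file.

```python
import os


def find_missing_images(list_file, base_dir=None):
    """Return the image paths listed in list_file that do not exist on disk.

    Assumptions (not confirmed by the repository): the image path is the
    first comma-separated field of each line, and relative paths are
    relative to the directory of the list file unless base_dir is given.
    """
    if base_dir is None:
        base_dir = os.path.dirname(os.path.abspath(list_file))
    missing = []
    # utf-8-sig tolerates a byte-order mark at the start of the file.
    with open(list_file, encoding="utf-8-sig") as f:
        for line in f:
            entry = line.strip()
            if not entry:
                continue  # ignore blank lines
            rel_path = entry.split(",", 1)[0].strip()
            if os.path.isabs(rel_path):
                im_name = rel_path
            else:
                im_name = os.path.join(base_dir, rel_path)
            if not os.path.exists(im_name):
                missing.append(im_name)
    return missing
```

If this reports most or all entries as missing, a loader that skips such files with `continue` could keep looping without ever producing a batch, which would match the symptom of training that prints nothing further.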

@MiZhangWhuer

Hi @MichalBusta @ustczhouyu, I ran into the same issue. I worked around it by commenting out the following lines associated with dg_ocr:

  # imageso, labels, label_length = next(dg_ocr)
  # im_data_ocr = net_utils.np_to_variable(imageso, is_cuda=opts.cuda).permute(0, 3, 1, 2)
  # features = net.forward_features(im_data_ocr)
  # labels_pred = net.forward_ocr(features)
  #
  # probs_sizes =  torch.IntTensor( [(labels_pred.permute(2,0,1).size()[0])] * (labels_pred.permute(2,0,1).size()[1]) )
  # label_sizes = torch.IntTensor( torch.from_numpy(np.array(label_length)).int() )
  # labels = torch.IntTensor( torch.from_numpy(np.array(labels)).int() )
  # loss_ocr = ctc_loss(labels_pred.permute(2,0,1), labels, probs_sizes, label_sizes) / im_data_ocr.size(0) * 0.5
  #
  # loss_ocr.backward()

I think the main reason is that the two threads, 'dg_ocr' and 'data_generator', conflict with each other in each training epoch. @MichalBusta, do you have another approach to solve this problem?
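To find out which of the two generators actually stalls, one option is to request a single item from each with a time limit. The following is a minimal diagnostic sketch, not code from the repository: `next_with_timeout` is a hypothetical helper that works on any Python iterator, and it is assumed that the objects named dg_ocr and data_generator in train.py can be passed to `next()`.

```python
import queue
import threading


def next_with_timeout(gen, timeout_s):
    """Try to fetch one item from gen within timeout_s seconds.

    Returns (True, item) on success, (False, exception) if next(gen)
    raised, and (False, None) if no item arrived in time.
    """
    result = queue.Queue(maxsize=1)

    def worker():
        try:
            result.put((True, next(gen)))
        except Exception as exc:  # includes StopIteration
            result.put((False, exc))

    # A daemon thread does not keep the process alive if the generator hangs.
    threading.Thread(target=worker, daemon=True).start()
    try:
        return result.get(timeout=timeout_s)
    except queue.Empty:
        return (False, None)
```

Calling this once per generator before the training loop would show whether one of them never delivers a batch. After a timeout the worker thread is still blocked inside `next(gen)`, so the helper is meant for a one-off check and not for use inside the training loop.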

@ustczhouyu
Author
ustczhouyu commented Feb 28, 2019 via email

@MichalBusta
Owner

Hi, nice to know that you have synthesized a multilingual dataset, the Synthetic Multi-Language in Natural Scene Dataset. I don't know how to download it. Could you send it to me? Thank you very much.

See the Data section of https://github.com/MichalBusta/E2E-MLT

@ycjcy
ycjcy commented Mar 11, 2019

@ustczhouyu @MiZhangWhuer @MichalBusta Hi, I ran into the same problem and made the changes described above, but the error still occurs. I hope you can suggest a solution. I look forward to your reply. Thank you.

@LittlePinkRobin

@ycjcy @MichalBusta If you are running the sample data provided in the repository, try setting -batch_size=2. I noticed that with -batch_size=8 it never hits the terminating case.
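The effect described here can be illustrated with a toy loader. This sketch does not reproduce data_gen.py: `first_batch` is a hypothetical function, and the rule that the batch is reset at the start of every pass over the data is an assumption made only to show how a loader can fail to reach its "batch is full" condition when too few usable samples are available.

```python
def first_batch(samples, batch_size, max_passes=100):
    """Toy loader: return the first full batch, or None if none forms.

    samples is a list in which None marks an unreadable image. The batch
    is reset at the start of every pass (an illustrative assumption), and
    max_passes replaces the endless loop a real loader might use.
    """
    for _ in range(max_passes):
        batch = []
        for sample in samples:
            if sample is None:
                continue  # skipped like a missing or unreadable file
            batch.append(sample)
            if len(batch) == batch_size:
                return batch
    return None
```

With three usable samples, `first_batch(["a", None, "b", None, "c"], 2)` returns a batch, while a batch size of 8 gives `None` after `max_passes` passes. Without the pass limit, the second call would loop forever.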

@duxiangcheng

@MichalBusta @MiZhangWhuer @ycjcy @LittlePinkRobin @ustczhouyu Hello everyone! What is the function of "-ocr_feed_list" in train.py, and where can I get the cropped images? Thanks.
