Issues: NVIDIA/Megatron-LM

Issues list

In BERT pretraining how to specify DATA_PATH to take multiple files [stale: No activity in 60 days on issue or PR]
#117 opened Jul 7, 2021 by armundle
The training of T5 using FP16 is unstable [stale]
#115 opened Jul 6, 2021 by zhuhong
How about supporting alternatives to fine-tuning? [stale]
#114 opened Jul 6, 2021 by hwijeen
AttributeError: 'Parameter' object has no attribute 'main_grad' [stale]
#112 opened Jun 27, 2021 by xyltt
Problems about model parallel [stale]
#111 opened Jun 11, 2021 by Shaw95
Preocessing data about T5 [stale]
#110 opened Jun 11, 2021 by Hanlard
Unclear description of ICT pretraining [stale]
#108 opened Jun 7, 2021 by hangzhang-nlp
Distributed training all-reduce order [stale]
#107 opened May 31, 2021 by zhiqi-0
Add new attention features on megatron to optimize the performance. [stale]
#106 opened May 22, 2021 by rainmaker712
how to export the gpt2 to onnx model? [stale]
#99 opened May 7, 2021 by HuuY
Can not create embeddings from Megatron [stale]
#91 opened Apr 17, 2021 by Benan-Akca
Further pretraining using BERT-base weights [stale]
#182 opened Jan 25, 2022 by genesith
bert_dataset.py - ValueError: Seed must be between 0 and 2**32 - 1 [stale]
#88 opened Mar 23, 2021 by bugface
Support LAMB optimizer [stale]
#87 opened Mar 23, 2021 by bugface
loss curve in pretraining BERT is very strange [stale]
#86 opened Mar 16, 2021 by zjujh1995
Fused kernel compilation could get stuck [bug: Something isn't working] [stale]
#82 opened Mar 14, 2021 by rhythmswing
How to calculate FLOPS? [stale]
#76 opened Mar 9, 2021 by ShivanshuPurohit
GLUE tasks for BERT [stale]
#74 opened Mar 8, 2021 by casually-PYlearner
the issue of GPU utilize [stale]
#73 opened Mar 2, 2021 by ywd-pku
Do we need to use preprocess.py before loading dataset? [stale]
#94 opened Apr 20, 2021 by Benan-Akca