'''Generative Pre-trained Transformer 3''' ('''GPT-3''') is a [[large language model]] released by [[OpenAI]] in 2020.


Like its predecessor, [[GPT-2]], it is a decoder-only<ref name="OpenAI_Radford_20200611" /> [[transformer model]], a deep neural network architecture that supersedes recurrence- and convolution-based architectures with a technique known as "[[Attention (machine learning)|attention]]".<ref name="2018_Attention_Paper">{{cite journal |last1=Vaswani |first1=Ashish |author1-link=Ashish Vaswani |last2=Shazeer |first2=Noam |last3=Parmar |first3=Niki |last4=Uszkoreit |first4=Jakob |last5=Jones |first5=Llion |last6=Gomez |first6=Aidan N |author6-link=Aidan Gomez |last7=Kaiser |first7=Łukasz |last8=Polosukhin |first8=Illia |title=Attention Is All You Need |journal=Advances in Neural Information Processing Systems |date=2017 |volume=30 |url=https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf |publisher=Curran Associates, Inc.}}</ref> This attention mechanism allows the model to focus selectively on the segments of input text it predicts to be most relevant.<ref name="jointly">{{cite arXiv |last1=Bahdanau |first1=Dzmitry |last2=Cho |first2=Kyunghyun |last3=Bengio |first3=Yoshua |eprint=1409.0473 |title=Neural Machine Translation by Jointly Learning to Align and Translate |class=cs.CL |date=1 September 2014}}</ref> GPT-3 has 175 billion [[Parameter (machine learning)|parameters]], each stored with 16-bit precision; at 2 bytes per parameter, the model therefore requires 350&nbsp;GB of storage. It has a [[context window]] size of 2048 [[Lexical analysis|tokens]], and has demonstrated strong "[[zero-shot]]" and "[[Few-shot learning (natural language processing)|few-shot]]" learning abilities on many tasks.<ref name="OpenAI_Radford_20200611">{{Cite web |page=12 |access-date=July 31, 2020 |date=June 11, 2018 |last1=Radford |first1=Alec |last2=Narasimhan |first2=Karthik |last3=Salimans |first3=Tim |last4=Sutskever |first4=Ilya |title=Improving Language Understanding by Generative Pre-Training |url=https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf |archive-date=January 26, 2021 |archive-url=https://web.archive.org/web/20210126024542/https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf |url-status=live}}</ref>
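
The attention computation at the core of this architecture is the scaled dot-product attention defined in the Vaswani et al. paper cited above: given query, key, and value matrices <math>Q</math>, <math>K</math>, and <math>V</math> derived from the input, and key dimension <math>d_k</math>,

<math display="block">\operatorname{Attention}(Q, K, V) = \operatorname{softmax}\!\left(\frac{QK^\mathsf{T}}{\sqrt{d_k}}\right)V.</math>

The 350&nbsp;GB storage figure follows directly from the parameter count: <math>175 \times 10^{9} \text{ parameters} \times 2 \text{ bytes/parameter} = 3.5 \times 10^{11} \text{ bytes} = 350\text{ GB}</math>.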


On September 22, 2020, [[Microsoft]] announced that it had licensed GPT-3 exclusively. Others can still receive output from its public API, but only Microsoft has access to the underlying model.<ref name="MSgotcode">{{Cite magazine |title=OpenAI is giving Microsoft exclusive access to its GPT-3 language model |url=https://www.technologyreview.com/2020/09/23/1008729/openai-is-giving-microsoft-exclusive-access-to-its-gpt-3-language-model/ |date=September 23, 2020 |last=Hao |first=Karen |access-date=2020-09-25 |magazine=[[MIT Technology Review]] |language=en |quote="The companies say OpenAI will continue to offer its public-facing [[API]], which allows chosen users to send text to GPT-3 or OpenAI's other models and receive its output. Only Microsoft, however, will have access to GPT-3's underlying code, allowing it to embed, repurpose, and modify the model as it pleases." |archive-date=February 5, 2021 |archive-url=https://web.archive.org/web/20210205121656/https://www.technologyreview.com/2020/09/23/1008729/openai-is-giving-microsoft-exclusive-access-to-its-gpt-3-language-model/ |url-status=live }}</ref>
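
As an illustration of the few-shot usage pattern the public API supports, the minimal sketch below builds a prompt containing worked examples (the English-to-French demonstrations come from the GPT-3 paper's few-shot figure) and requests a completion. It assumes the pre-1.0 <code>openai</code> Python client and the original "davinci" engine name; both are contemporaneous details, not drawn from the sources cited above.

<syntaxhighlight lang="python">
import openai  # assumes the pre-1.0 client, e.g. pip install "openai<1.0"

openai.api_key = "YOUR_API_KEY"  # placeholder credential

# Few-shot prompting: the task is demonstrated inside the prompt itself,
# with no gradient updates to the model's 175 billion parameters.
prompt = (
    "Translate English to French.\n"
    "sea otter => loutre de mer\n"
    "peppermint => menthe poivrée\n"
    "cheese =>"
)

response = openai.Completion.create(
    engine="davinci",   # GPT-3 engine name at launch (assumption)
    prompt=prompt,
    max_tokens=16,
    temperature=0.0,    # near-deterministic output for a lookup-style task
    stop=["\n"],        # end the completion at the end of the answer line
)

print(response["choices"][0]["text"].strip())  # e.g. "fromage"
</syntaxhighlight>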