'''Generative Pre-trained Transformer 3''' ('''GPT-3''') is a [[large language model]] released by [[OpenAI]] in 2020.
Like its predecessor, [[GPT-2]], it is a decoder-only<ref name="OpenAI_Radford_20200611" /> [[transformer model]], a deep neural network architecture that supersedes recurrence- and convolution-based architectures with a technique known as "[[Attention (machine learning)|attention]]".<ref name="2018_Attention_Paper">{{cite journal |last1=Vaswani |first1=Ashish |author1-link= Ashish Vaswani |last2=Shazeer |first2=Noam |last3=Parmar |first3=Niki |last4=Uszkoreit |first4=Jakob |last5=Jones |first5=Llion |last6=Gomez |first6=Aidan N |author6-link= Aidan Gomez |last7=Kaiser |first7=Łukasz |last8=Polosukhin |first8=Illia |title=Attention is All you Need |journal=Advances in Neural Information Processing Systems |date=2017 |volume=30 |url=https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf |publisher=Curran Associates, Inc.}}</ref> This attention mechanism allows the model to selectively focus on the segments of input text it predicts to be most relevant.<ref name="jointly">{{cite arXiv |last1= Bahdanau |first1 = Dzmitry |last2 = Cho |first2= Kyunghyun |last3= Bengio |first3= Yoshua |eprint = 1409.0473 |title= Neural Machine Translation by Jointly Learning to Align and Translate |class= cs.CL |date= 1 September 2014}}</ref> GPT-3 has 175 billion [[Parameter (machine learning)|parameters]], each stored with 16-bit precision; at 2 bytes per parameter, the model requires 350 GB of storage. It has a [[context window]] of 2048 [[Lexical analysis|tokens]] and has demonstrated strong "[[zero-shot]]" and "[[Few-shot learning (natural language processing)|few-shot]]" learning abilities on many tasks.<ref name="OpenAI_Radford_20200611">{{Cite web| page = 12| access-date = July 31, 2020| date = June 11, 2018| last1 = Radford| first1 = Alec| last2 = Narasimhan| first2 = Karthik| last3 = Salimans| first3 = Tim| last4 = Sutskever| first4 = Ilya| title = Improving Language Understanding by Generative Pre-Training| url = https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf| archive-date = January 26, 2021| archive-url = https://web.archive.org/web/20210126024542/https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf| url-status = live}}</ref>
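The following minimal sketch illustrates scaled dot-product attention, the core operation behind the mechanism described above. It is a single attention head with no causal masking or learned projections, and the dimensions are illustrative toy values rather than GPT-3's actual ones:

<syntaxhighlight lang="python">
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention over a sequence of token vectors.

    Q, K, V: arrays of shape (seq_len, d) holding query, key, and value
    vectors. Each output row is a weighted mix of the value vectors,
    with weights reflecting how relevant each position is judged to be.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # pairwise relevance scores
    scores -= scores.max(axis=-1, keepdims=True)    # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: each row sums to 1
    return weights @ V

# Toy usage: 4 tokens with 8-dimensional vectors (GPT-3 itself uses a
# 2048-token context window and far larger vector dimensions).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
print(attention(x, x, x).shape)  # (4, 8)
</syntaxhighlight>

In a full transformer such as GPT-3, this operation is applied in parallel across many attention heads, with Q, K, and V produced by learned linear projections of the input and a causal mask preventing each token from attending to later positions.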
On September 22, 2020, [[Microsoft]] announced that it had licensed GPT-3 exclusively. Others can still receive output from its public API, but only Microsoft has access to the underlying model.<ref name="MSgotcode">{{Cite magazine |title=OpenAI is giving Microsoft exclusive access to its GPT-3 language model |url=https://www.technologyreview.com/2020/09/23/1008729/openai-is-giving-microsoft-exclusive-access-to-its-gpt-3-language-model/ |date=September 23, 2020 |last=Hao |first=Karen |access-date=2020-09-25 |magazine=[[MIT Technology Review]] |language=en |quote="The companies say OpenAI will continue to offer its public-facing [[API]], which allows chosen users to send text to GPT-3 or OpenAI's other models and receive its output. Only Microsoft, however, will have access to GPT-3's underlying code, allowing it to embed, repurpose, and modify the model as it pleases." |archive-date=February 5, 2021 |archive-url=https://web.archive.org/web/20210205121656/https://www.technologyreview.com/2020/09/23/1008729/openai-is-giving-microsoft-exclusive-access-to-its-gpt-3-language-model/ |url-status=live }}</ref>