Talk:DALL-E
DALL-E was nominated as a good article, but it did not meet the good article criteria at the time (No date specified. To provide a date use: {{FailedGA|insert date in any format here}}). There are suggestions below for improving the article. If you can improve it, please do; it may then be renominated.
This article has not yet been rated on Wikipedia's content assessment scale. It is of interest to the following WikiProjects:
Please add the quality rating to the {{WikiProject banner shell}} template instead of this project banner. See WP:PIQA for details.
A fact from DALL-E appeared on Wikipedia's Main Page in the Did you know column on 18 March 2021. The text of the entry was as follows:
DALL·E or DALL-E?
Do we know if the official name of the AI is DALL·E or DALL-E? OpenAI seems to be using DALL·E everywhere on their website, while external sources use DALL-E. LittleWhole (talk) 08:21, 7 January 2021 (UTC)
- For what it's worth, roughly the same happened to WALL-E. Azai~enwiki (talk) 08:47, 11 January 2021 (UTC)
Did you know nomination
- The following is an archived discussion of the DYK nomination of the article below. Please do not modify this page. Subsequent comments should be made on the appropriate discussion page (such as this nomination's talk page, the article's talk page or Wikipedia talk:Did you know), unless there is consensus to re-open the discussion at this page. No further edits should be made to this page.
The result was: promoted by DanCherek (talk) 14:49, 12 March 2021 (UTC)
- ... that the machine learning model DALL-E, while widely reported on for its "surreal" and "quirky" output, also learned how to solve Raven's Matrices (a form of IQ test) without being specifically taught to? Source: Input, NBC, Nature, VentureBeat, Wired, CNN, New Scientist and BBC pieces (refs in article) for surrealness and quirkiness, TheNextWeb piece (ref also in article) for Raven's Matrices
5x expanded by JPxG (talk). Self-nominated at 19:20, 7 March 2021 (UTC).
- This interesting article qualifies for DYK as a five-fold expansion and is new enough and long enough. The hook facts are cited inline, the article is neutral and I detected no copyright issues. A QPQ has been done. Cwmhiraeth (talk) 07:26, 8 March 2021 (UTC)
GA Review
- This review is transcluded from Talk:DALL-E/GA1. The edit link for this section can be used to add comments to the review.
Reviewer: RoySmith (talk · contribs) 16:29, 2 April 2021 (UTC)
Starting review. My plan is to do two major passes through the article, first for prose, the second to verify the references. In general, all my comments will be suggestions which you can accept or reject as you see fit. -- RoySmith (talk) 16:29, 2 April 2021 (UTC)
- I put this on hold for a week, with no response, so closing this as a failed review. -- RoySmith (talk) 15:39, 10 April 2021 (UTC)
- Shit! I have been busy with a bunch of online stuff happening at the same time. Anyway, I will go through all of these things and probably do some expansion, and nominate again later. jp×g 18:43, 11 April 2021 (UTC)
Prose
Lead section
- "It uses a 12-billion parameter[2] version of the GPT-3 Transformer model to interpret natural language inputs (such as "a green leather purse shaped like a pentagon" or "an isometric view of a sad capybara") and generate corresponding images." For the lead section, I'd leave out the whole "(such as ...)" parenthetical phrase. That's covered in more detail in the main body.
- This one, I'm not sure how familiar a layman would be with the phrase "natural language" being used to specifically mean "a sentence spoken the way you'd say it to a person" rather than "written in a human language". After all, fetchString = cur.execute("SELECT * FROM threads2 WHERE replycount > 20 AND viewcount/replycount > 300 ORDER BY forumid, replycount DESC LIMIT 1000") is an English-language sentence. You are correct that this is an unreasonably long sentence, though. I will ponder this. jp×g 19:10, 2 April 2021 (UTC)
- JPxG, You've linked to natural language. Somebody who is not familiar with AI jargon can click through to find out what it means, and the fact that it's linked should be a clue that it has some special meaning. But, maybe if you want to give a little more, replace the current "(such as ... capybara)" with something like "(conventional human language)"? I'm assuming this was trained on an English-language corpus; you should verify that and mention it somewhere if it is indeed the case. -- RoySmith (talk) 20:07, 2 April 2021 (UTC)
- "It is able to create images" -> "It can create images"
- Done. jp×g 19:07, 2 April 2021 (UTC)
- The long lists of citations on some sentences ("...texture of a porcupine").[2][4][5][6][7][8]") seem like WP:REFBOMB and detract from readability (particularly in the lead section). Can these be trimmed to just the most important sources that actually support the statement?
- This is a silly artifact of how I wrote the article (write a couple-sentence stub, locate and format all references, then flesh out an article by moving them down into the expanded text), which is definitely unintentional. Fixed. jp×g 19:07, 2 April 2021 (UTC)
- "DALL-E's name is a" -> "The name is a"
- Fixed. jp×g 19:13, 2 April 2021 (UTC)
- "in conjunction with another model, CLIP" -> "in conjunction with CLIP"
- Fixed. jp×g 19:13, 2 April 2021 (UTC)
- "OpenAI has refused to release source code for either mode" While this may be true, it's non-neutral. The implication is, "They should have released the source code, they were asked to do so, and they refused". I see you talk about this more later, but here in the lead, you state it in Wikipedia voice, which violates WP:NPOV. Must fix.
- This one is a little tricky: in GPT-2 (frankly, a better article) I wrote at much greater length about the issue there. To wit, OpenAI was founded as a nonprofit, and received funding to develop models, with the explicit goal of making their research open to the public (as opposed to organizations like DeepMind). The decision to not release GPT-2 was widely criticized, and they ended up releasing it anyway after determining that the abuse concerns were not based in fact. However, I agree that this should probably be explained in greater detail than just saying they "refused". jp×g 19:38, 2 April 2021 (UTC)
- "one of OpenAI's objectives through DALL-E's development" -> "one of DALL-E's objectives"
- Fixed. jp×g 19:40, 2 April 2021 (UTC)
Architecture
- "model was first developed by OpenAI in 2018" -> I think "initially" works better than "first"
- Done. jp×g 19:41, 2 April 2021 (UTC)
- "was scaled up to produce GPT-2 in 2019.[10] In 2020, GPT-2 was augmented similarly to produce GPT-3,[11] of which DALL-E is a implementation.[2][12]" -> "was scaled up to produce GPT-2 in 2019, and GPT-3 (which DALL-E uses) in 2020."
- I have made an attempt here, but it is still a little awkward. Let me know what you think. jp×g 19:43, 2 April 2021 (UTC)
- JPxG, What you've got now is better. I could suggest some other alternatives, but I think it's fine now. -- RoySmith (talk) 20:10, 2 April 2021 (UTC)
- "It uses zero-shot learning", clarify whether "it" refers to GPT-3, DALL-E, or GPT in general.
- I've moved that sentence down to an appropriate location, where I think it is much clearer (and where it makes more sense to be anyway). jp×g 19:47, 2 April 2021 (UTC)
- "scaled down from GPT-3's parameter size of 175 billion" -> scaled down from GPT-3's 175 billion"
- "large amounts of images" -> "a large number of images".
- I will dig my heels in very slightly on this one; I say amounts (plural) since it generates a large amount (singular) in response to each prompt (singular). jp×g 19:47, 2 April 2021 (UTC)
- JPxG, The problem is, "amount" implies a measurement, as opposed to a count. You can have "a large amount of image data", but you can't have "a large amount of images". You can have "a large number of images", or "many images", or "a voluminous quantity of images", or "a boatload of images". -- RoySmith (talk) 20:15, 2 April 2021 (UTC)
- Much to think about. I think you are correct; will fix. jp×g 20:43, 2 April 2021 (UTC)
- "another OpenAI model, CLIP,", this should start a new sentence.
- ""understand and rank" its output", I think "its" refers to DALL-E's here, but clarify.
- " (like ImageNet)", I'd leave that out completely. Since you're talking about "most classifier models", calling out one in particular doesn't add any value.
- ImageNet is a curated dataset of labeled images, not a classifier model. I have edited it to make this a little clearer. jp×g 19:49, 2 April 2021 (UTC)
- JPxG, But, there's still lots of curated image datasets. What's so special about ImageNet that it needs to be called out as the one example you mention as something that wasn't used? -- RoySmith (talk) 20:18, 2 April 2021 (UTC)
- "Rather than learn from a single label", avoid repetition of the word "rather".
- Rather than use something instead of that "rather", I have used "instead" rather than "rather" for the previous "rather". jp×g 19:51, 2 April 2021 (UTC)
- Rather than use something instead of that
- "CLIP learns to associate" -> "CLIP associates"
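The generate-then-rank workflow these comments describe (DALL-E produces many candidate images; CLIP scores each against the prompt and keeps the best) can be sketched roughly as follows. This is a toy illustration only: the embedding vectors and the `rank_candidates` helper are hypothetical stand-ins, not OpenAI's actual models or API.

```python
# Hedged sketch of the DALL-E + CLIP "generate, then rank" idea.
# Embeddings here are plain toy vectors; real CLIP embeddings come from
# trained image and text encoders.
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def rank_candidates(prompt_embedding, image_embeddings):
    """Return candidate indices sorted best-first by similarity to the prompt."""
    scores = [cosine_similarity(prompt_embedding, e) for e in image_embeddings]
    return sorted(range(len(image_embeddings)), key=lambda i: scores[i], reverse=True)


# Toy example: three candidate "images" represented as embedding vectors.
prompt = [1.0, 0.0, 0.5]
candidates = [[0.9, 0.1, 0.4], [0.0, 1.0, 0.0], [1.0, 0.0, 0.5]]
print(rank_candidates(prompt, candidates))  # best match (index 2) first
```

The point is only the ordering step: generation and ranking are separate models, so a low-quality sample can be filtered out before a human ever sees it.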
Performance
- As above, WP:REFBOMB
- Fixed. jp×g 19:39, 2 April 2021 (UTC)
- "quoted Neil Lawrence ... describing it as ..." I think you mean, "quoted Neil Lawrence ... who described it as ..."
- " He also quoted Mark Riedl" clarify who "he" is.
Implications
- "DALL-E demonstrated 'it is becoming'", not sure, but maybe, "DALL-E demonstrated that 'it is becoming'"
My overall impression is that this reads like a publicity piece for OpenAI. The vast majority of the quotes are extolling the virtues of the system, with only one or two examples of problems, and even those are in the context of, "but here's an example of what it does better". The REFBOMB aspect is part of the problem, but it's deeper than that. I'm going to put the rest of the review on hold for a week to give you a chance to address that. -- RoySmith (talk) 17:54, 2 April 2021 (UTC)
- Thanks for taking the time to review! I will go through it now. jp×g 19:01, 2 April 2021 (UTC)
- Okay, have gone through it. I think that the lack of negativity in the article is mostly a consequence of OpenAI's embargo; nobody can access the code outside of an extremely narrowly-controlled demo which is more like a photo album than an interface (which, nevertheless, I strongly advise you to check out to form an opinion on the model). They also took a fair bit of time to release the paper, which I will admit to not having had time to go through yet, and I think this would also enable a fairly neutral description of what it does. Some of the more cynically minded opinion-havers called the GPT-2 embargo a deliberate strategy to hype up the GPT-2's capabilities when they did it with that model. I will go hunting for some more stuff to add to the article, though. jp×g 20:00, 2 April 2021 (UTC)
Images
I took another look at this. The infobox image File:DALL-E sample.png claims to be MIT-licensed from https://openai.com/blog/dall-e/. I can't find the image there. The Commons page is also lacking author information.
A Commons file used on this page or its Wikidata item has been nominated for deletion
The following Wikimedia Commons file used on this page or its Wikidata item has been nominated for deletion:
Participate in the deletion discussion at the nomination page. —Community Tech bot (talk) 14:37, 6 May 2022 (UTC)
What to do with DALL-E 2
Last month, OpenAI released DALL-E 2, a sequel to the DALL-E program. Should this page cover only DALL-E 1, with a new page created for DALL-E 2, or should the DALL-E page act as an umbrella for all future DALL-E iterations?
Camdoodlebop (talk) 01:38, 24 May 2022 (UTC)
- I'm more inclined to agree with the second suggestion, but let's see what other editors have to say. - Munmula (talk), second account of Alumnum 02:42, 24 May 2022 (UTC)
- I think the latter is better, as DALLE2 is a successor of the first model. Artem.G (talk) 08:10, 24 May 2022 (UTC)
New sources on Dall-E Mini and related projects
- https://www.businessinsider.com/dall-e-mini
- https://www.cnet.com/culture/everything-to-know-about-dall-e-mini-the-mind-bending-ai-art-creator/
- https://www.vice.com/en/article/3ad8yw/we-asked-an-ai-to-draw-a-self-portrait
- https://www.theguardian.com/culture/2022/jun/09/what-exactly-is-ai-generated-art-how-does-it-work-will-it-replace-human-visual-artists
Lizardcreator (talk) 02:24, 15 June 2022 (UTC)
Undue weight
There is undue weight given to open-source models that try to imitate DALL-E (for example, DALL-E Flow wasn't mentioned anywhere as anything notable), and undue weight to the "hidden language" developed by the model. The claim of a "language" is a very strong one, and it's too recent to include in an encyclopedic article right now, as very few reliable sources can be cited.
I think both sections should be trimmed, but it should be discussed first. Artem.G (talk) 13:19, 22 June 2022 (UTC)
Article rewrite
I've BOLDly rewritten and reshuffled most of the article, including removing a large amount of WP:SYNTH and miscellaneous other poor organisation, along with drastically slashing the weight of the open source implementations. I'd welcome any comments people have about these changes. BrigadierG (talk) 16:00, 18 July 2022 (UTC)
- I'll be honest, I kind of half-assed this article when I wrote it in January 2021, and I certainly haven't been keeping it up to date since then. I think that the open-source implementations are mostly not relevant, and a lot of the stuff that went out was not very good. On the other hand -- I see you're in the middle of rewriting, so I don't know if this stuff is going anywhere or if it's just being removed entirely, but it looks like you removed some stuff about the actual implementation of the model (such as "CLIP was trained to predict which caption (out of a 'random selection' of 32,768 possible captions) was most appropriate for an image, allowing it to subsequently identify objects in images outside its training set"). If anything, the original article wasn't very good because I skimmed over many details of how the model worked (mostly because I hadn't bothered to read the actual paper yet, lol). Anyway, I will shut up for a bit and wait to see where you're going with this before I form a whole opinion about it. jp×g 18:51, 19 July 2022 (UTC)
- My first pass was to kill the UNDUE material and improve article structure, my second pass is to bring the article up to date with more detail. BrigadierG (talk) 10:29, 20 July 2022 (UTC)
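The caption-prediction behaviour quoted above (CLIP picking the most appropriate caption for an image out of a large pool of candidates) can be illustrated with a toy sketch. The embedding vectors and `best_caption` helper below are hypothetical illustrations of the selection step only, not CLIP's actual training procedure or API:

```python
# Toy illustration of caption prediction: given one image embedding and N
# candidate caption embeddings, return the index of the best-matching
# caption by cosine similarity. All vectors here are made-up examples.
import math


def best_caption(image_embedding, caption_embeddings):
    """Index of the caption embedding most similar to the image embedding."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def norm(a):
        return math.sqrt(sum(x * x for x in a))

    sims = [dot(image_embedding, c) / (norm(image_embedding) * norm(c))
            for c in caption_embeddings]
    return max(range(len(sims)), key=lambda i: sims[i])


image = [0.2, 0.9]
captions = [[1.0, 0.0], [0.1, 1.0], [0.5, 0.5]]
print(best_caption(image, captions))  # the second caption (index 1) wins
```

In the real model the candidate pool is large (the quote mentions 32,768 captions), but the selection step is the same argmax over similarity scores.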
- @BrigadierG: Thanks for taking this on, in particular for cutting out the editorializing and for spotting that faked citation.
- However, something seems to have gone wrong with the citations in this "squashing" - unless I'm overlooking something, [1] doesn't make any claims about anatomical diagrams, X-ray images, mathematical proofs, or blueprints. Regards, HaeB (talk) 08:43, 28 July 2022 (UTC)
- Excellent spot. I'll be honest, I didn't read that source, the claim seemed to *so obviously* match the source that I didn't bother verifying. My mistake, great job. I get it, I get it, WP:AGF, but by god, are people just inventing things? BrigadierG (talk) 13:39, 28 July 2022 (UTC)
- Thanks, I have removed it accordingly. It's probably worth checking various other citations too. Regards, HaeB (talk) 03:35, 31 July 2022 (UTC)
Image examples
If it helps alleviate NOT GALLERY concerns, perhaps we can agree on a few good examples of DALL-E images to be featured independently alongside the prose, rather than a dedicated gallery section?
I don't like content disputes, so I'm happy with a compromise here, but it would be a loss not to represent the product with at least a few samples. I have no preference as to which examples. ASUKITE 14:43, 31 July 2022 (UTC)
- We should only be listing images that have something more of note than simply "this is interesting" (along with the rest of WP:ATA). The test for inclusion I think should be as follows:
- 1. Has the image in question been cited by OpenAI or a WP:RS as displaying a significant capability of DALL-E?
- 2. Is that significant capability better covered or covered more widely in RS by an image already included in the article?
- Note that while artists pages may include a significant number of their works, they are not present in isolation - they show a key part of that artist's life or style. That's what distinguishes artistic commentary from WP:NOTGALLERY. BrigadierG (talk) 15:57, 31 July 2022 (UTC)
Thanks. Art isn't usually my topic of choice. I'll see if I can pick a couple of decent samples at some point, now that the novelty of getting access to the beta has passed somewhat. ASUKITE 19:08, 31 July 2022 (UTC)
- I think a small gallery would be really useful for people - there is a discussion of what the model can and can not do, and showing more than one picture in the infobox will demonstrate the capabilities the model have. Artem.G (talk) 09:57, 2 August 2022 (UTC)
- I think a gallery of varying examples would be a great idea. There is a precedent for this as the Japanese page for Stable Diffusion features a gallery of various different styles that the program can generate. Camdoodlebop (talk) 00:49, 11 September 2022 (UTC)
- I've given it a second thought and I've changed my mind: I do not think an image gallery is necessary for this page, so count me in as a no. One example should be enough I think. Camdoodlebop (talk) 23:10, 11 September 2022 (UTC)
No mention of Raven's Matrices
There is no mention of Raven's Matrices in the referenced article (https://en.wikipedia.org/wiki/DALL-E#cite_note-dale-25) If someone could find a better reference please do. AcuteTriceratops (talk) 01:28, 12 August 2022 (UTC)
- The source links to DALL-E's blog post, which explicitly mentions Raven's Matrices. I've added that as a source and removed the dubious tab. BrigadierG (talk) 20:44, 13 August 2022 (UTC)