[L] Make refinements to and incorporate P18 based section-level image suggestions
Closed, Resolved · Public

Assigned To

Authored By

	CBogen
	Feb 28 2023, 4:29 PM

Description

Based on the first round of section-level image suggestions evaluation results, we will be moving forward with the section alignment and intersection based suggestions, which all scored very well. We will not be moving forward with depicts-based suggestions.

We decided to temporarily pause on section topics/P18 (Wikidata image) based suggestions until we can do more work to refine and evaluate those results -- while those still remain in the pipeline data, we can remove them on the client side. However, the second round of evaluation showed that with updates, section topics/P18 results were much more promising.

This ticket is to do more work to refine and evaluate the P18/section topics based results so that we can hopefully adopt them in the next version of the Growth tool and notifications.

Acceptance Criteria

~~Determine if there are any more refinements we want to make to the data to improve section topics suggestions, or if we just want to do another round of ambassador evaluation with the current data~~ - see https://phabricator.wikimedia.org/T330773#8721136
Complete refinements, if necessary
~~Run another round of evaluation using https://section-image-suggestions-test.toolforge.org/ with updated data~~ will be covered in T330784

Details

	Title	Reference	Author	Source Branch	Dest Branch
	Section topics suggestions, round 2	repos/structured-data/image-suggestions!28	mlitn	T330773	main


Related Objects

		Status	Subtype	Assigned	Task
		Resolved		None	T311814 [EPIC] Section-level image suggestions data pipeline
		Resolved		matthiasmullie	T330773 [L] Make refinements to and incorporate P18 based section-level image suggestions

Event Timeline

CBogen created this task. · Feb 28 2023, 4:29 PM

Restricted Application added a subscriber: Aklapper. · Feb 28 2023, 4:29 PM

CBogen added a parent task: T311814: [EPIC] Section-level image suggestions data pipeline. · Feb 28 2023, 4:30 PM

CBogen renamed this task from Second round of manual evaluation for section-level image suggestions focused on P18 to Another round of manual evaluation for section-level image suggestions focused on P18. · Feb 28 2023, 6:21 PM

CBogen updated the task description. · Feb 28 2023, 6:27 PM

mfossati subscribed. · Mar 1 2023, 9:15 AM

mfossati mentioned this in T330852: Skip p18-based image suggestions from newcomer tasks. · Mar 1 2023, 11:12 AM

AUgolnikova-WMF mentioned this in T330903: ECs do not receive notifications for p18-based image suggestions. · Mar 1 2023, 3:48 PM

I think that we should tackle T330516: Sections with images still appearing in the section-level image suggestions pipeline, T323505#8656938, and ~~T330841: [L] Exclude tables and lists from section alignment-based image suggestions~~ before calling the evaluation round.
Update: the section alignment ticket is actually not needed for p18 data.

In T330773#8657661, @mfossati wrote:

I think that we should tackle T330516: Sections with images still appearing in the section-level image suggestions pipeline, T323505#8656938, and T330841: [L] Exclude tables and lists from section alignment-based image suggestions before calling the evaluation round.

Agreed; this is blocked on the first evaluation for those tickets -- T330784: [M] Another round of manual evaluation for SLIS.

@CBogen totally makes sense 👍

CBogen edited projects, added Structured-Data-Backlog; removed Structured-Data-Backlog (Current Work). · Mar 1 2023, 5:56 PM

CBogen moved this task from Triage to Section Level Image Suggestions on the Structured-Data-Backlog board. · Mar 6 2023, 5:27 PM

As I understand things, simplified:

section topics are based on blue links within that section
- e.g. for https://en.wikipedia.org/wiki/Cat#Skeleton, we’d find that “cervical vertebrae” link (https://en.wikipedia.org/wiki/Cervical_vertebrae)
from there, we figure out what entity is linked to that page
- e.g. for “cervical vertebrae”, that is Q900457 (https://www.wikidata.org/wiki/Special:EntityPage/Q900457)
from there, we grab their P18
- e.g. for Q900457, that is https://commons.wikimedia.org/wiki/File:Cervical_vertebrae_animation_small.gif
that image would then be suggested for the “Skeleton” section of the “Cat” article
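
The lookup chain above can be sketched in a few lines of Python. This is only an illustration: the three dictionaries are hypothetical in-memory stand-ins for the section links, the page-to-entity mapping and the P18 statements, and `suggest_by_topic_p18` is a made-up name, not a function from the pipeline.

```python
# Hypothetical in-memory stand-ins for the real data sources.
SECTION_LINKS = {("Cat", "Skeleton"): ["Cervical vertebrae"]}          # blue links per section
PAGE_TO_ENTITY = {"Cervical vertebrae": "Q900457"}                     # linked page -> Wikidata entity
ENTITY_P18 = {"Q900457": "Cervical_vertebrae_animation_small.gif"}     # entity -> P18 image


def suggest_by_topic_p18(page, section):
    """Return the P18 images of every entity linked from the given section."""
    suggestions = []
    for linked_page in SECTION_LINKS.get((page, section), []):
        qid = PAGE_TO_ENTITY.get(linked_page)
        image = ENTITY_P18.get(qid)
        if image is not None:
            suggestions.append(image)
    return suggestions


print(suggest_by_topic_p18("Cat", "Skeleton"))
```

Note that nothing in this chain looks at the entity of the "Cat" page itself; only the topic entity is consulted.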

We’ve found that those suggestions often aren’t good enough.
That is unsurprising: we only check that the image is related to the topic. Unless the topic is highly specific to the page’s subject, the image may have nothing to do with that subject (as is the case in the example above).

IMO, we should check that an image is related to both the entity associated with the topic AND the entity associated with the page.
If we did that, I expect our suggestions would be a lot more relevant.
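
A minimal sketch of that check, assuming we already have, per entity, the set of images related to it. The image sets and the `Hypothetical_*` file names are invented for illustration, and `cross_referenced_suggestions` is a made-up helper name; Q146 is the Wikidata entity for the house cat.

```python
# Hypothetical image sets per Wikidata entity; only the GIF name comes from the example above.
IMAGES_BY_ENTITY = {
    "Q900457": {"Cervical_vertebrae_animation_small.gif", "Hypothetical_cat_skeleton.jpg"},
    "Q146": {"Hypothetical_cat_skeleton.jpg", "Hypothetical_cat_portrait.jpg"},
}


def cross_referenced_suggestions(topic_qid, page_qid, images_by_entity=IMAGES_BY_ENTITY):
    """Keep only images related to BOTH the topic entity and the page entity."""
    topic_images = images_by_entity.get(topic_qid, set())
    page_images = images_by_entity.get(page_qid, set())
    return topic_images & page_images


print(cross_referenced_suggestions("Q900457", "Q146"))
```

With these made-up sets, the vertebrae animation is dropped because it is related to the topic only, while the image related to both entities survives.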

Doing so, however, drastically reduces the number of suggestions.
I did a quick test, and the number of available suggestions (based on section topics) would drop from 173961759 (~174M) to 206245 (~206K). That is even before filtering out sections with lists and images already used on the page, so actual numbers would be even lower.
While significantly lower, that still leaves us a decent number of (theoretically likely good, I think) suggestions.
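
To put that drop in proportion, here is the arithmetic on the two counts quoted above (the variable names are just for this snippet):

```python
before = 173_961_759  # section topics suggestions matching the topic entity only
after = 206_245       # suggestions matching both topic and page entity

kept_fraction = after / before
reduction_factor = before / after

print(f"kept: {kept_fraction:.2%}, reduction: ~{reduction_factor:.0f}x")
# prints "kept: 0.12%, reduction: ~843x"
```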

From there, I think we could take things further, and reconsider sources other than only Wikidata P18 (which suffered the same issues; here’s a quote from @mfossati about depicts statements: “the intuition is that a given image indeed depicts a section topic, but is unrelated to the actual Wikipedia content where that section topics originates”)

There are 2 benefits to including more sources:

Bringing in more data is likely to substantially grow the subset of suggestions that match both criteria (page & topic)
Bringing in more data and cross-referencing it helps us be more comfortable about some suggestions (i.e. a suggestion that matches in multiple sources is more likely to be relevant than one that matches in just 1)
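
The second benefit could look something like the sketch below: count in how many sources an image turns up and rank by that count. The source names as dictionary keys, the file names and `rank_by_source_agreement` are all hypothetical; this is not how the pipeline scores suggestions.

```python
from collections import Counter


def rank_by_source_agreement(matches_by_source):
    """Rank images by the number of sources that suggest them (most agreement first)."""
    counts = Counter()
    for images in matches_by_source.values():
        counts.update(set(images))  # each source counts at most once per image
    # Sort by descending count, then by file name for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical matches for one section.
matches = {
    "P18": {"A.jpg"},
    "P373": {"A.jpg", "B.jpg"},
    "lead_image": {"A.jpg", "B.jpg"},
}
print(rank_by_source_agreement(matches))
```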

Luckily, we already have some of these things from page-level image suggestions, where we have 4 groups:

suggestions for an entity based on Wikidata P18 (Commons image)
suggestions for an entity based on Wikidata P373 (in Commons category)
suggestions for an entity based on lead image data
suggestions for an entity based on SDC/depicts

I figured I’d give those a shot: generate a bunch of suggestions and evaluate how relevant they subjectively feel to me.

But first, a baseline, for the implementations we already have:

Suggestions based on section alignment
- Accuracy: 66-88%
Suggestions based on topics matching P18 only (which we’ve already discarded)
- Accuracy: 8-12%

What do these numbers mean?
The lower number is good suggestions, the higher number is “ok, but not great” (e.g. relevant to the article’s contents, but belongs in another section)

Note that these ranges should be taken with a massive grain of salt: I only evaluated 50 suggestions each, all on enwiki. And I’m not an expert in 99% of these pages, so I may have misjudged some.
Some filters (e.g. duplicates, sections with tables) had not yet been applied, which may also skew findings slightly (i.e. it is possible that, after filtering out images that are already on page, accuracy is slightly lower)
Still, given that samples are random and have been evaluated consistently, they should be safe to compare.
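
For clarity, this is how such a range falls out of a batch of manual labels. The label counts below are hypothetical, picked so that a sample of 50 reproduces the section alignment baseline of 66-88%; the actual per-sample judgements are in the paste linked further down.

```python
def accuracy_range(labels):
    """Return (lower, upper) accuracy in percent.

    lower counts only "good" suggestions; upper also counts "ok" ones.
    """
    total = len(labels)
    good = sum(1 for label in labels if label == "good")
    ok = sum(1 for label in labels if label == "ok")
    return round(100 * good / total), round(100 * (good + ok) / total)


# Hypothetical evaluation of 50 suggestions.
labels = ["good"] * 33 + ["ok"] * 11 + ["bad"] * 6
print(accuracy_range(labels))  # prints (66, 88)
```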

On to the meat of this comment - here are my findings after sampling/evaluating 50 images for all combinations of image suggestions from other sources, provided that they match both the topic & the page entities:

Suggestions where topic entity matches P18; page entity matches P18
- Accuracy: 73-92%
Suggestions where topic entity matches P18; page entity matches P373
- Accuracy: 72-85%
Suggestions where topic entity matches P18; page entity matches lead image
- Accuracy: 70-84%
Suggestions where topic entity matches P373; page entity matches P18
- Accuracy: 86-98%
Suggestions where topic entity matches P373; page entity matches P373
- Accuracy: 58-76%
Suggestions where topic entity matches P373; page entity matches lead image
- Accuracy: 80-92%
Suggestions where topic entity matches lead image; page entity matches P18
- Accuracy: 72-92%
Suggestions where topic entity matches lead image; page entity matches P373
- Accuracy: 70-82%
Suggestions where topic entity matches lead image; page entity matches lead image
- Accuracy: 66-80%

Judging from the numbers above, suggestions based on section topics appear similarly relevant to suggestions based on section alignment, provided we cross-reference those images with both the topic AND the page.
This seems to be true for all sources tested.
See https://phabricator.wikimedia.org/P45919 for the full list of samples.

Sidenote: because topic & page entities are applied a little differently, I figured I’d also check whether the quality of a source differs substantially depending on the type of entity we validate it against.
Broken down in a matrix, here’s the (subjective) accuracy of each source, for each type of entity:

source			topic		page
P18 (image)		72-87%		77-94%
P373 (category)		74-88%		66-81%
lead image		69-84%		73-86%

P18 seems to have a slight edge overall, but the others aren’t far behind.
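
As a rough sanity check on the matrix, one can average the nine ranges listed earlier per source and per entity type. This is only an assumption about how such a breakdown could be derived: plain unweighted means land within about one percentage point of the matrix, so the original figures were possibly computed from the pooled samples instead.

```python
# Reported (good, good-or-ok) percentages, keyed by (topic source, page source).
RANGES = {
    ("P18", "P18"): (73, 92),
    ("P18", "P373"): (72, 85),
    ("P18", "lead"): (70, 84),
    ("P373", "P18"): (86, 98),
    ("P373", "P373"): (58, 76),
    ("P373", "lead"): (80, 92),
    ("lead", "P18"): (72, 92),
    ("lead", "P373"): (70, 82),
    ("lead", "lead"): (66, 80),
}


def mean_range(role, source):
    """Unweighted mean range for a source used for the topic (role=0) or page (role=1) entity."""
    rows = [rng for key, rng in RANGES.items() if key[role] == source]
    low = sum(r[0] for r in rows) / len(rows)
    high = sum(r[1] for r in rows) / len(rows)
    return low, high


for source in ("P18", "P373", "lead"):
    print(source, "topic:", mean_range(0, source), "page:", mean_range(1, source))
```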

Note: I was unable to test SDC/depicts statements, but I’m fairly confident scores would be similarly good from that source.

I haven’t yet run a full count on how many suggestions we’d be left with if we were to use all of these sources, but it would definitely bring us back to many millions of section topics based image suggestions.
And judging from my samples evaluation, unlike the current “topic entity = P18” implementation, those are millions of relevant suggestions.

Quick recap:

current section topics-based suggestions are no good; we already knew that and planned to take that out
the reason they’re no good is probably because we don’t also cross-reference topics-based suggestions with the subject of the page
doing so appears to make suggestions relevant; looks to be of similar quality as alignment-based suggestions
...but leaves us with significantly fewer suggestions
it looks like suggestions from additional sources (P373, lead image, probably also depicts) also remain relevant provided we cross-reference both topic & page
...which significantly increases the amount of suggestions again

In terms of work, that would mean:

1. also cross-referencing images with page entity (in addition to topic entity)
2. repurpose the work from page image suggestions for these other sources; some refactoring will be needed
3. investigate/fix the depicts data skew issues (or drop that one as a source, for now)
4. another round of manual evaluation once the actual work is done to confirm/deny my preliminary findings

1 & 2 aren’t that much work. I’d estimate about an L combined.
3 is unknown & has the potential to be big. But we could decide to skip SDC/depicts for now (although this may eventually require some looking into for page suggestions anyway)

IMO, these initial findings seem very promising and likely wouldn’t add too much workload. I would recommend we proceed to implement the suggested changes & evaluate them properly (in a less quick-and-dirty fashion than above).

matthiasmullie renamed this task from Another round of manual evaluation for section-level image suggestions focused on P18 to Another round of manual evaluation for section-level image suggestions. · Mar 23 2023, 1:30 PM

matthiasmullie updated the task description.

CBogen renamed this task from Another round of manual evaluation for section-level image suggestions to [L] Another round of manual evaluation for section-level image suggestions. · Mar 23 2023, 2:01 PM

CBogen assigned this task to matthiasmullie.

CBogen updated the task description.

CBogen edited projects, added Structured-Data-Backlog (Current Work); removed Structured-Data-Backlog.

CBogen moved this task from Incoming to Ready for Estimation on the Structured-Data-Backlog (Current Work) board.

CBogen moved this task from Ready for Estimation to Ready for Development on the Structured-Data-Backlog (Current Work) board.

In T330773#8721136, @matthiasmullie wrote:

In terms of work, that would mean:

1. also cross-referencing images with page entity (in addition to topic entity)

2. repurpose the work from page image suggestions for these other sources; some refactoring will be needed

3. investigate/fix the depicts data skew issues (or drop that one as a source, for now)

4. another round of manual evaluation once the actual work is done to confirm/deny my preliminary findings

1 & 2 aren’t that much work. I’d estimate about an L combined.
3 is unknown & has the potential to be big. But we could decide to skip SDC/depicts for now (although this may eventually require some looking into for page suggestions anyway)

Based on discussion in Slack, we will move forward with 1 & 2 and we will skip 3 for now. 4 will be covered in T330784.

CBogen renamed this task from [L] Another round of manual evaluation for section-level image suggestions to [L] Make refinements to and incorporate P18 based section-level image suggestions. · Mar 23 2023, 2:04 PM

CBogen mentioned this in T330784: [M] Another round of manual evaluation for SLIS. · Mar 23 2023, 2:07 PM

CBogen mentioned this in T330934: [L] Send image suggestion notification (for article + section) to experienced users. · Mar 23 2023, 2:13 PM

matthiasmullie moved this task from Ready for Development to Doing on the Structured-Data-Backlog (Current Work) board. · Apr 3 2023, 7:11 PM

matthiasmullie moved this task from Doing to Code Review on the Structured-Data-Backlog (Current Work) board. · Apr 18 2023, 12:08 PM

mfossati updated https://gitlab.wikimedia.org/repos/structured-data/image-suggestions/-/merge_requests/28

Section topics suggestions, round 2

mfossati merged https://gitlab.wikimedia.org/repos/structured-data/image-suggestions/-/merge_requests/28

Section topics suggestions, round 2