Wikidata:Properties for deletion/P7859: Difference between revisions

ISNIplus (talk | contribs)
Dcflyer (talk | contribs)
m please do not copy my signed comment from elsewhere
Tags: Reverted 2017 source edit
One user wants to revive the usage of P7859 :
-------

:Hi, @[[User:ISNIplus|ISNIplus]]. Please revert your batch [https://editgroups.toolforge.org/b/QSv2/237296/ 237296].
:It's troubling that mismatched and erroneous identifiers ({{P|244}}) were [[Special:Diff/2245179123|intentionally]] added to potentially ~ 4000 (or more) items, and even more so with the expectation that other editors become obliged to fix the errors that you introduced into Wikidata, instead of yourself, as was stated again in a revert edit summary [[Special:Diff/2245418369|comment]] to [[User:MSGJ|Martin (MSGJ)]]. Especially given the fact that you were [[Special:Diff/2245327364|advised]] by [[User:Epìdosis|Epìdosis]], ". . . but since the number of users fixing these constraint violations is very low, I would prefer not to overburden them, so I think the best option would be solving them manually without doing this passage."
:Here are just a few examples: the addition of {{Q|Q1236085}}'s P244 to {{Q|Q2943}}, the addition of the LC Name Authority File (LCNAF) identifier for Rodan (Musical group) to {{Q|Q602}}, and the addition of the LCNAF identifier for World Health Organization Country Office in Pakistan to {{Q|Q7817}}.
:Some or all of the following additional batches introduced incorrect values of P244 into items, as well: [https://quickstatements.toolforge.org/#/batch/237149 237149], [https://quickstatements.toolforge.org/#/batch/237148 237148], [https://quickstatements.toolforge.org/#/batch/237147 237147], [https://quickstatements.toolforge.org/#/batch/237146 237146], [https://quickstatements.toolforge.org/#/batch/237145 237145], [https://quickstatements.toolforge.org/#/batch/237144 237144]. -- [[User:Dcflyer|DCflyer*]] ([[User talk:Dcflyer|<span class="signature-talk">{{int:Talkpagelinktext}}</span>]]) 22:34, 8 September 2024 (UTC)
------



Revision as of 23:43, 8 September 2024

P7859 (P7859): (delete | history | links | entity usage | logs | discussion)

As recently announced by a banner on the WorldCat Identities website, "The WorldCat Identities web application will be retired and shut down in the coming months and the data is no longer being updated. The most recent version of the data is from July of 2022. As OCLC continues to build out the WorldCat Entities ecosystem, please use it as a source for persistent Person identifiers. https://id.oclc.org/worldcat/entity" (note: WorldCat Entities is represented here by WorldCat Entities ID (P10832)). Whilst I usually support keeping the values of obsolete external IDs for historical purposes, I think in this case keeping them is not worthwhile: the great majority of the 1.93 M IDs are either VIAF-based or LCCN-based, and the very few "np" and "nc" IDs (4k) often have a dubious identification value. My proposal is to delete the property as soon as the website actually goes offline (while the website is still online, I would keep it), so probably before the end of 2023. --Epìdosis 21:20, 2 March 2023 (UTC)[reply]

The proponent and the creator of the property have been contacted in their talk pages; this page has also been linked in the talk pages of the templates indicated in the {{ExternalUse}}. --Epìdosis 21:40, 2 March 2023 (UTC)[reply]
I'm the original proponent and I agree to delete it when the WorldCat Identities website is shut down. "either VIAF-based or LCCN-based" does not diminish its value, since one cannot pick one of the two alternatives without having the WorldCat Identities data. But without the website, it's useless. -- Vladimir Alexiev (talk) 06:35, 17 March 2023 (UTC)[reply]
@Epìdosis: I don't see why it would need to be deleted. The "worth" is exactly historical purposes and it costs nothing, I'd assume. Cheers, Ederporto (talk) 14:41, 14 March 2023 (UTC)[reply]
I'll object to that. The presence of the np and nc IDs indicates that there are library records related to the associated items. They are a big clue to users and editors that there will be relevant information in library catalogues to retrieve and link to that item. I'd recommend marking them as deprecated statements with reason for deprecated rank (P2241) = withdrawn identifier value (Q21441764). From Hill To Shore (talk) 19:08, 27 March 2023 (UTC)[reply]
@From Hill To Shore: Please keep it simple. If we have two WorldCat ids users need to know that identities are not entities or was it tentities? Now translate this into Arabic, Japanese, and Hindi. Many of the WorldCat ids are outdated or wrong. WorldCat ids have changed over the years. Leaving the mistakes and creating the opportunity for new ones is not a good solution. --Kolja21 (talk) 23:48, 27 March 2023 (UTC)[reply]
I am not sure why you are concerned about translation. The whole premise of Wikidata is that statements are machine-readable and easily translated into any language. If reason for deprecated rank (P2241) = withdrawn identifier value (Q21441764) is not translated into a particular language, simply add the relevant labels to the property and the item. Also, I have no idea why you are urging me to "keep it simple." Where do you draw the line on wiping deprecated information from our database in the interests of keeping things "simple"? I am objecting here because an editor is proposing unilateral action to delete ahead of this discussion being concluded. If more editors join the discussion and disagree with my position, then the consensus will be against me and deletion will proceed. From Hill To Shore (talk) 05:35, 28 March 2023 (UTC)[reply]
There is already reason for deprecated rank (P2241) = withdrawn identifier value (Q21441764) for withdrawn identifier values. So we can't use this qualifier for a project that has ceased to exist. --Kolja21 (talk) 13:19, 28 March 2023 (UTC)[reply]
I'm sorry but I have absolutely no idea what point you are trying to make in your comment there. "Because the statement exists, we can't use the statement." From Hill To Shore (talk) 15:37, 28 March 2023 (UTC)[reply]
It's not so difficult. withdrawn identifier value (Q21441764) is used for a single withdrawn identifier. If a project ceased to exist this is something else. If you have problems to understand this distinction you might understand why users will have problems distinguishing between six properties connected to WorldCat. This is what is meant with: "Keep it simple." --Kolja21 (talk) 17:13, 28 March 2023 (UTC)[reply]
If you think withdrawn identifier value (Q21441764) is not the best description for this deprecation then simply create a new item with a deprecation reason you find suitable. "I don't like your choice of reason," is not an argument to delete instead of deprecate. As we have seen through the life of Worldcat Identities, many IDs have changed from "nc"/"np" prefixed IDs to "VIAF"/"LCCN" prefixed IDs. This is because the nc/np entries are for items in the library catalogue and libraries are likely to generate new IDs for them over time. There is a strong likelihood that this behaviour will continue and the residual nc/np items will gain WorldCat Entities ID (P10832) over a period of time. Keeping the nc/np entries as deprecated will make it easier to find matches later rather than have us repeat the identification process all over again. I have no problem with removing P7859 (P7859) when we have a WorldCat Entities ID (P10832) present. Your focus is on preventing user confusion, which I don't see as an issue, unless you are advocating the removal of all deprecated information (what makes this case special compared to any other case of deprecation?). My focus is on preserving the useful curation work we have completed that may help us continue to match items to new library IDs. From Hill To Shore (talk) 07:45, 29 March 2023 (UTC)[reply]
"My focus is on preserving the useful curation work we have completed" - which of the P7859 statements are result of such work and would you support that those that are not can be deleted? 77.191.135.37 03:18, 17 April 2023 (UTC)[reply]
  •  Keep and link to archived page. WorldCat identities indexed the names of entities in languages other than English, and as far as I can tell WorldCat entities is effectively English-only. Take a look, for example, at the item for محمّد سعد الله خان کھيتران: Muhammad Saad Ullah Khan Khetran (Q113960737) ; this person's name in their native language Saraiki was on their WorldCat identities page but not on the entities page. While it may be noted that the Library of Congress link lists 6 native labels, 5 out of 6 of them are incomplete and/or spelled incorrectly. Most writers of languages that are not one of a few widely spoken ones like English, French, etc. have erroneous information recorded throughout in their records in various databases. So we could really use any and all links to information which can be cross referenced to help determine the correct information. When and if this information is available via WorldCat entities is when the deletion of this property should be discussed.
عُثمان (talk) 18:24, 24 April 2023 (UTC)[reply]
@Grimes2: A rare visitor. How did you find this discussion?  Comment BTW: P7859 and P10832 should be displayed one below the other so that the comparison is easier. 2A02:2454:986D:F700:493E:E780:1887:62CE 03:38, 7 May 2023 (UTC)[reply]
  •  Keep: Certainly until P10832 is populated following a migration strategy. Especially useful when the P7859 redirects to the new P10832 entity. -- DeirgeDel tac 20:30, 29 May 2023 (UTC)[reply]
  •  Comment: May I be so bold as to venture that the evidence is overwhelming that P7859 should be retained until P10832 is populated. There is then the question of what is using the P7859 value outside of Wikidata. Is it only the authority control template, or something else? While that debate may not be helpful, a more productive approach might be to agree a roadmap on how P10832 is to be populated. There are bits of that spread throughout the above vote, but it would be helpful to see an agreed way forward in one place. That is, focus on getting P10832 populated rather than P7859 deleted. Thank you. -- DeirgeDel tac 10:48, 1 June 2023 (UTC)[reply]
    Åma (Q21477069): mountain in Antarctica, P7859 = https://worldcat.org/identities/viaf-168989993/ = 5 IDs = many things, but no mountain in Antarctica. Most WorldCat Identities IDs were imported through other IDs and never checked. The only thing that is overwhelming is bad quality of this property. Taking a secondary source for migration while Wikidata has the original source (LCAuth etc.) would be counterproductive. It's a basic rule: Use citable and original sources! --Kolja21 (talk) 17:22, 1 June 2023 (UTC)[reply]
    P7859 values which redirect to VIAF and not to WorldCat Entities are of no or limited use in determining P10832. P7859 values which redirect to WorldCat Entities are more useful. I have no clue about any P7859 value not directing to a WorldCat entity. I'd love to look at this more but I don't have the resources. Thank you. -- DeirgeDel tac 23:07, 1 June 2023 (UTC)[reply]
  •  Keep as per previous comments. I think it's of high importance for archival use, and it could simply be modified to use a "Deprecated rank" instead of being removed (like many of the old Google-related IDs). I also noticed a user has been gung ho on deleting these before a consensus has been reached. I hope he's as motivated to restore them if the decision is made to keep.--Bricks&Wood (talk) 04:14, 4 September 2024 (UTC)[reply]

Migration to P10832

enWikiQuote migration case

As a spin-off from the WorldCat changes I've been exploring maintaining some functionality after the loss of the "Worldcat id" template by replacing it with authority control collecting P10832 from the associated Wikidata item. There are only about 90 articles/Wikidata items involved, with just over a handful having P10832 already populated. So I have been looking at getting the rest of the 90 populated using P7859. I have kluged up a program which uses P7859 from Wikidata to look up the P10832 via the redirect in a number of cases and produces wikitable output as well as QuickStatements input (it avoids producing QS if P10832 is already populated). I've run a small batch of 5 as QuickStatements batch 206942, if anyone wishes to make any comments. There's a little more information at q:en:User:DeirgeDel/xpop2 and q:en:Wikiquote:Village pump#P10832 population task plan F. Thank you. -- DeirgeDel tac 12:12, 30 May 2023 (UTC)[reply]

Due to an unexpected need to clear some of my to-do list I've boldly gone ahead and added the P10832 values needed by the set of 90 articles in Wikiquote. I had to do a handful manually, and a frequent reason was the WorldCat entity not having an English label. ✓ Done -- DeirgeDel tac 23:13, 30 May 2023 (UTC)[reply]

Removal of P7859 before consensus reached

I observe P7859 values are being removed. I believe this is before consensus has been reached in the above discussion. In particular there is the interesting use-case of Joseph M. Crofts (Q115943526), an item created by myself when my DeirgeDel account was named Deirge Ó Dhaoinebeaga. @Epìdosis: removed the P7859 statement at Special:Diff/1906578499. I recovered that statement; the P7859 value I had set was "lccn-nb2002-21760", and unfortunately that linked to a WorldCat Identities URL which redirected to notfound. However, I remembered a "trick" or "rule of thumb" from work on Wikiquote: if a WorldCat Identity has two hyphens, sometimes changing the second hyphen to a zero produces a WorldCat Identity that redirects to a URL that identifies the required P10832 WorldCat Entity ID value. Thus from this change a link was generated which WorldCat sent to the redirect target URL, which exposed the WorldCat Entity ID "E39PCjG8wDQPxYPBqJMgkxRvQC" for P10832 as well as links to the VIAF ID and the Library of Congress Name Authority File ID. While Epìdosis has been helpful adding additional identifiers, I am concerned that removal of P7859 makes it difficult to locate the P10832, and I suggest these removals need to be stopped and possibly even reverted. -- DeirgeDel tac 13:35, 2 June 2023 (UTC)[reply]
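The "second hyphen to zero" rule of thumb described above can be sketched in a few lines of Python. This is only an illustration of the string rewrite: the function name is hypothetical, and, as the comment itself says, the rewritten ID only sometimes resolves, so the result still has to be checked against WorldCat by hand.

```python
def second_hyphen_to_zero(identity_id: str) -> str:
    """Replace the second hyphen of a WorldCat Identities ID with '0'.

    Hypothetical helper illustrating the rule of thumb from the discussion;
    IDs with fewer than two hyphens are returned unchanged.
    """
    first = identity_id.find("-")
    if first == -1:
        return identity_id
    second = identity_id.find("-", first + 1)
    if second == -1:
        return identity_id
    return identity_id[:second] + "0" + identity_id[second + 1:]


# The ID quoted in the comment above.
print(second_hyphen_to_zero("lccn-nb2002-21760"))
```

The rewrite is purely textual; whether the new ID redirects to a WorldCat Entities record is a separate check.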

Hi @DeirgeDel:, the removals I performed today of a few thousand values (which are BTW concluded) were based on 3 criteria: 1) IDs which, due to botched format, were invalid or anyway unusable to get new P10832 values (https://editgroups.toolforge.org/b/QSv2T/1685691070809/ + https://editgroups.toolforge.org/b/QSv2T/1685691187036/); 2) IDs which were present in items not containing a VIAF ID (P214) ID (https://editgroups.toolforge.org/b/QSv2/207071/) - on the basis of the reasoning that, in many cases, it could have happened that the VIAF ID from which the WorldCat ID was copied could have been removed because of a mismatch or a conflation and thus the WorldCat ID could be itself a mismatch or a conflation; 3) IDs which were referenced with a VIAF ID (P214) ID which is not present anymore in the item (all the other batches of today) - on the basis of the reasoning that, in many cases, it could have happened that the VIAF ID from which the WorldCat ID was copied could have been removed because of a mismatch or a conflation and thus the WorldCat ID could be itself a mismatch or a conflation. If we want to adopt caution in the import of WorldCat Entities IDs from WorldCat Identities IDs, I thought that the above cases were doubtful enough to be excluded from the conversion, and thus were to be removed (this is especially true for cases 1 and 3). --Epìdosis 13:45, 2 June 2023 (UTC) P.S. and https://editgroups.toolforge.org/b/QSv2/207093/ removes only deprecated values, which are surely not suitable for conversion to P10832. --Epìdosis 14:02, 2 June 2023 (UTC)[reply]
I think you might consider what it says in Help:Deprecation. Specifically, deprecating but not deleting properties that are "now known to be wrong, but were once thought correct". What is your reason why your deletions are distinguishable from the general rule? --William Graham (talk) 16:55, 2 June 2023 (UTC)[reply]
The usefulness of keeping as deprecated the IDs which are obsolete or inexact because of a conflation lies mainly in avoiding that they are re-added with normal rank; since here the database is defunct, there is no risk of re-addition. Since the discussion above is mainly centered on the need of keeping these IDs only because they are useful for finding new P10832 values, it follows that the IDs which cannot be used for this aim, or that would be harmful if used for this aim (because they could lead to adding mismatched IDs), should be deleted, IMHO. --Epìdosis 20:14, 2 June 2023 (UTC)[reply]
@Epìdosis: you shouldn't be removing this property while this deletion request is still open. That's really bad practice and as an administrator on this project you should lead by example.
What you should do is apologize for being too early, undo your batches and wait for an uninvolved administrator to close the deletion request. Multichill (talk) 14:12, 3 June 2023 (UTC)[reply]
Hi @Multichill:, I partially agree. I apologize for the batch numbered 2 (https://editgroups.toolforge.org/b/QSv2/207071/), which removed the IDs which were present in items not containing a VIAF ID (P214) ID, and I'm now undoing it; although I'm still convinced that some percentage of these IDs is surely there because a VIAF was previously present, then was removed because it was perceived as imprecise or conflated, but the WorldCat Identities ID still remained there, I have effectively no precise clue of how high this percentage is, so I think it is reasonable restoring the entire batch. However, for the other batches I personally don't agree, for the following reason: leaving aside the fact that the property is being deleted, I am still convinced for the above motivations that the IDs removed in the batches 1 and 3 have never been pertinent to the item containing them and thus had to be removed simply because they were mismatched; restoring them and using them for copying new P10832 values could lead to worse conflations, as I (and others) have explained above. For more context, I have recently also removed a batch of 13k VIAFs which were conflated (per this discussion) and I have received no complaint so far; I repeat, it is not a matter of removing values of a identifier proposed for deletion, it is a matter of removing IDs which are mismatched (an operation which is commonly done for identifiers not proposed for deletion). Of course, if you have evidence that my reasoning is wrong and that the IDs removed in batches 1 and 3 are not mismatched, I will apologize also for them and I will immediately undo them. Thanks, --Epìdosis 15:15, 3 June 2023 (UTC)[reply]
That's fine with me. Multichill (talk) 15:22, 3 June 2023 (UTC)[reply]

Before talking about migration we need to think about how to improve P7859. The removal of wrong IDs is part of the maintenance work. Let's start with Q6243526#P7859: 6 IDs: 5x VIAF, 1x LCCN. After this work is done there are 1.918.337 IDs left to be checked. When all IDs have been checked, we can talk about migration. --Kolja21 (talk) 17:56, 3 June 2023 (UTC)

I'm boldly suggesting this good-faith comment has in some ways drifted from the topic "Removal of P7859 before consensus reached", and I've boldly forked it to a new section where it might be discussed in more detail. I hope that's OK with everyone. -- DeirgeDel tac 23:40, 3 June 2023 (UTC)

I've had, and am having, a few pretty busy RL days and I can't spend as much time on this as I'd like. Can I place some comments here in simple form, and I apologise if I'm being stupid. Some of the above discussion can be really hard to read if not read very carefully. If I'm correct, @Epìdosis: wishes to run data cleansing batches: -- DeirgeDel tac 23:40, 3 June 2023 (UTC)[reply]

  • P7859 values will be removed when they are of the form "viaf*". These appear to provide no additional help in identifying a P10832 value that the P214 value itself does not already provide. -- DeirgeDel tac 23:40, 3 June 2023 (UTC)[reply]
  • P7859 values that are of the form "lccn*" are not being removed, certainly at this stage, even if they do not currently provide a link to a WorldCat Entity ID. -- 23:40, 3 June 2023 (UTC)
  • VIAF itself at times requires data cleansing; I think VIAF sometimes has duplicate Id's that need to be merged. -- DeirgeDel tac 23:40, 3 June 2023 (UTC)[reply]

Apologies if I've got the wrong end of the stick. -- DeirgeDel tac 23:40, 3 June 2023 (UTC)[reply]

@DeirgeDel: of course I confirm that unfortunately VIAF has a lot of duplications (and, more worryingly, conflations). Regarding the two previous points, that's not exactly what I meant in point 3, so I'll try to explain it differently: nearly all WorldCat Identities values have been imported from VIAF values. So, in cases where the VIAF value A used to import a WorldCat Identities value X isn't present anymore in item Z, my batches just removed the WorldCat Identities value X (whichever form it had, either "viaf-" or "lccn-"), on the basis of the high risk that X was a residue of a conflation of item Z with VIAF value A, the two representing different entities. --Epìdosis 16:50, 4 June 2023 (UTC)[reply]
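Criteria 2 and 3 as described in this thread can be restated as a small check. This is a sketch under assumptions, not the code behind the actual batches: the function name, its arguments, and the returned labels are all hypothetical. Note also that the batch for criterion 2 was later undone by its author.

```python
from typing import Optional, Set


def removal_criterion(item_viaf_ids: Set[str],
                      reference_viaf_id: Optional[str]) -> Optional[str]:
    """Return which criterion (if any) would flag a P7859 value.

    item_viaf_ids: VIAF IDs (P214) currently on the item (hypothetical input).
    reference_viaf_id: the VIAF ID cited in the P7859 statement's reference,
    or None if the statement has no such reference (hypothetical input).
    """
    if not item_viaf_ids:
        # Criterion 2: the item no longer carries any VIAF ID.
        return "criterion 2"
    if reference_viaf_id is not None and reference_viaf_id not in item_viaf_ids:
        # Criterion 3: the VIAF ID used for the import has since been removed.
        return "criterion 3"
    return None


print(removal_criterion({"306425053"}, "168989993"))
```

Criterion 1 (botched format) is left out because the thread does not spell out what counted as an invalid format.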

Pre-migration data cleansing of P7859

I've felt @Kolja21:'s comment in "Removal of P7859 before consensus reached" has moved slightly away from that topic, and I'd like to look at this use case specifically and perhaps at migration / P10832 population separately. I hope to respond in detail to this particular use case and I'd not want that wrapped up in the above discussion:- Before talking about migration we need to think about how to improve P7859. The removal of wrong IDs is part of the maintenance work. Let's start with Q6243526#P7859: 6 IDs: 5x VIAF, 1x LCCN. After this work is done there are 1.918.337 IDs left to be checked. When all IDs have been checked, we can talk about migration. --Kolja21 (talk) 17:56, 3 June 2023 (UTC)[reply]

In this case (as for every other case) we look at the LCCN value, "lccn-no2012115736". That links to a WorldCat Identities record that redirects to the WorldCat Entities record E39PCjD79fj4FJ7m3KHHvFBWrC. Thus E39PCjD79fj4FJ7m3KHHvFBWrC is a candidate for the value of P10832, and, quite frankly, the WorldCat record itself says it is associated with Wikidata item Q6243526. Thus I have every confidence that P10832 could rightfully be set to "E39PCjD79fj4FJ7m3KHHvFBWrC" for Q6243526 regardless of any mess in the state of P7859. In many ways it is easier to focus on getting a value of P10832 set rather than migrating it from P7859. P7859 is simply one method that might allow this to happen efficiently.  – The preceding unsigned comment was added by DeirgeDel (talk • contribs).
I fear the case of Q6243526#P7859 is a bit more intricate, because "viaf-" IDs can also sometimes be used to get a valid P10832; in this case, 2 out of 5 "viaf-" IDs pointed to presently valid P10832 values (so, in fact, P10832 is at least triple for this person presently). In fact: viaf-4896153063221319320008 = viaf-288715031 = viaf-306425053 = https://viaf.org/viaf/306425053/; but lccn-no2012115736 = https://id.oclc.org/worldcat/entity/E39PCjD79fj4FJ7m3KHHvFBWrC.html + viaf-1792159474074827660233 = https://id.oclc.org/worldcat/entity/E39PCjJVGdCdwDVyXRMrJbfd6X.html + viaf-610152636065020050681 = https://id.oclc.org/worldcat/entity/E39PCjF3G4jYpmVQvrJWvhGMbm.html. --Epìdosis 16:50, 4 June 2023 (UTC)[reply]

Practical way of migrating to P10832

Given the above discussion, I would try to outline a possible path of migrating present P7859 values (WorldCat Identities) to P10832 values (WorldCat Entities). I would start dividing P7859 values in 3 parts:

  1. IDs leading to "Not Found" page (e.g. https://www.wikidata.org/w/index.php?title=Q20009745&oldid=1907624955#P7859)
  2. IDs leading to a VIAF cluster (e.g. https://www.wikidata.org/w/index.php?title=Q21542865&oldid=1908131483#P7859)
  3. IDs leading to a WorldCat Entities entity (e.g. https://www.wikidata.org/w/index.php?title=Q314447&oldid=1890759824#P7859)

I would firstly propose that IDs of the first two parts should be removed; for the third part, they should remain there until the migration is completed. Secondly, if we consider safe adding new P10832 values on the basis of P7859 values (I personally have some doubts, especially for non-human items, but there seems to be consensus for this operation), the migration could simply consist in adding new P10832 on the basis of the redirects of P7859 values (if judged useful, with some sort of reference; otherwise, simply with no reference). I have exemplified the proposed removal here and the proposed migration here (for the proposed migration, we could also decide to use references, as I said). If there is consensus for these two operations, I can perform them through QuickStatements slowly in the next weeks. --Epìdosis 16:50, 4 June 2023 (UTC)[reply]
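The three-way split proposed above depends on where a P7859 link ends up after redirection. A minimal sketch of that classification follows, assuming the final URL has already been obtained by some other means (no network access is performed here). The function names and category labels are hypothetical; the host and path patterns are taken from the example URLs quoted in this discussion, and anything else (including the "Not Found" page, whose URL is not given here) falls into a catch-all bucket.

```python
from typing import Optional
from urllib.parse import urlparse


def classify_target(final_url: str) -> str:
    """Classify the final redirect target of a WorldCat Identities link."""
    parsed = urlparse(final_url)
    host = parsed.netloc.lower()
    if host == "id.oclc.org" and parsed.path.startswith("/worldcat/entity/"):
        return "worldcat-entity"   # part 3: candidate for migration
    if host == "viaf.org":
        return "viaf-cluster"      # part 2
    return "other"                 # part 1 ("Not Found") or anything unexpected


def entity_id_from_url(final_url: str) -> Optional[str]:
    """Extract the WorldCat Entities ID from an entity URL, else None."""
    if classify_target(final_url) != "worldcat-entity":
        return None
    last = urlparse(final_url).path.rstrip("/").rsplit("/", 1)[-1]
    return last[:-5] if last.endswith(".html") else last


print(classify_target("https://viaf.org/viaf/306425053/"))
print(entity_id_from_url(
    "https://id.oclc.org/worldcat/entity/E39PCjD79fj4FJ7m3KHHvFBWrC.html"))
```

Keeping the classification separate from the HTTP step makes it easy to review the buckets before any removal or migration batch is prepared.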

@Epìdosis: I thank you for looking at this in a practical way and I must apologise for the limited time I can put in on this. My focus is on the "can-do" of populating accurate P10832 values rather than the removal of other values that are potentially of some use/partial use in P10832 population. I acknowledge that work entities are more problematic in general and that where P7859 yields not found there is a high chance of issues with other identifiers as well; your example for case 1, Museo di storia naturale di Rosignano (Q20009745), indicates an issue at a bigger level than P7859 that probably ought to be dealt with holistically. Equally, an lccn-xxxxx-xxxx ID with two dashes, while leading to not found, is in case 1 but can sometimes be resolved by converting the second hyphen to a zero. In terms of your case 2 example, Marco Zanetti (Q21542865), that links to a personal VIAF ID and I'd like to spend time looking at that in detail, but I am concerned that this case does not extrapolate to every case in that identified group and that in some cases retention might help resolve difficulties. In terms of your exemplified removals I take the case of Leonard Lansink (Q100312), where the P7859 entry indicates that a WorldCat Entity ID record might exist. Doing a person search for WorldCat Entities yields the WorldCat Entity ID record E39PBJtmpJmHRBf7MYHXXWM9Dq. It's then important to verify that the data in E39PBJtmpJmHRBf7MYHXXWM9Dq matches what is held in the Wikidata item record, e.g. (VIAF ID="303829074", GND ID="115196137" ...), after which it is safe to set P10832 to "E39PBJtmpJmHRBf7MYHXXWM9Dq". That brings us to the references to set for P10832. It is possibly useful to set a value to indicate that P10832 was derived from a WorldCat redirect (on a particular date), but it would also be useful to indicate that the contents of the WorldCat Entity ID record contain references indicating that it corresponds to the Wikidata item ID.
This is not a reference of the form "the GND database has a reference to the WorldCat Entity ID" (which would be nice but isn't happening for the moment), but rather that the WorldCat Entity ID record confirms it corresponds to the GND ID, which the Wikidata item also confirms it relates to. In summary, I will be opposing removals, certainly for the moment, but will be supporting proceeding with case (3) for person entities, especially if referencing is agreed and ideally if a method of automated validation can be agreed. Thank you. -- DeirgeDel tac 22:24, 4 June 2023 (UTC)[reply]

New proposed plan for migration to P10832

Nine months after my previous plan, I would like to draft a second, more detailed one, articulated in the following 5 ordered steps:

  1. find P10832 through Library of Congress authority ID (P244): use the links in the third column of https://qlever.cs.uni-freiburg.de/wikidata/NOsygr to find P10832 values and add them with references constructed in this way: matched by identifier from (P11797) = Library of Congress Authorities (Q13219454) + Library of Congress authority ID (P244) = id + retrieved (P813) = retrieval date
  2. find P10832 through VIAF ID (P214): use the links in the third column of https://qlever.cs.uni-freiburg.de/wikidata/rktIZX to find P10832 values and add them with references constructed in this way: matched by identifier from (P11797) = Virtual International Authority File (Q54919) + VIAF ID (P214) = id + retrieved (P813) = retrieval date
  3. find P10832 through P7859: use the links in the third column of https://qlever.cs.uni-freiburg.de/wikidata/TbOMjH to find P10832 values and add them with references constructed in this way: matched by identifier from (P11797) = WorldCat Identities (Q76630151) + retrieved (P813) = retrieval date
  4. delete P7859 values in main statements (query: https://qlever.cs.uni-freiburg.de/wikidata/Zi6JSH) and references containing P7859 values (query: https://qlever.cs.uni-freiburg.de/wikidata/i1e1mT)
  5. delete P7859

Any comments are welcome! --Epìdosis 18:01, 16 March 2024 (UTC)[reply]
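The reference pattern in steps 1 to 3 can be illustrated by building one QuickStatements (V1, tab-separated) command per match. This is a sketch, not the code used for the actual bot or batches: the function name and its parameters are hypothetical, the item, entity ID and LCCN are the ones quoted earlier in this discussion for Q6243526, and the retrieval date is only a placeholder.

```python
from datetime import date
from typing import Optional


def build_qs_line(qid: str, entity_id: str, source_qid: str, retrieved: date,
                  id_property: Optional[str] = None,
                  id_value: Optional[str] = None) -> str:
    """Build one QuickStatements V1 line adding P10832 with a reference.

    Reference parts use the S-prefixed form of the property IDs:
    S11797 (matched by identifier from), optionally the identifier
    property and value used for the match, and S813 (retrieved).
    """
    parts = [qid, "P10832", f'"{entity_id}"', "S11797", source_qid]
    if id_property and id_value:
        # e.g. "P244" becomes "S244" inside a reference
        parts += ["S" + id_property[1:], f'"{id_value}"']
    # Day precision (/11) for the retrieval date
    parts += ["S813", f"+{retrieved.isoformat()}T00:00:00Z/11"]
    return "\t".join(parts)


# Step 1 pattern (match through P244); the date is a placeholder.
print(build_qs_line("Q6243526", "E39PCjD79fj4FJ7m3KHHvFBWrC", "Q13219454",
                    date(2024, 3, 16), "P244", "no2012115736"))
```

For step 3 the identifier property and value would simply be omitted, with `source_qid` set to Q76630151, as in the plan above.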

I like this series of steps! I am attempting to compile lists of values with respect to the first three steps, although I don't know how fast I can make this compilation happen. Mahir256 (talk) 18:54, 16 March 2024 (UTC)[reply]
So preparation of Step 1 is almost done: the bot code is ready to be tested and the list of P244 values is almost done being checked against WorldCat, so I hope that within the next few days I can start that part of the migration. (Thanks @Epìdosis: for doing some early QS runs for that step!) Mahir256 (talk) 19:47, 25 April 2024 (UTC)[reply]
Alright, for those P7859 values containing "lccn-" that resolve to P10832 values, step 1 and step 4 (with respect to main statements) have begun. Mahir256 (talk) 16:32, 29 April 2024 (UTC)[reply]
As part of point 3 of the above plan, of which points 1 and 2 have now been completed, I will tomorrow start a batch removing nearly 4.5k "nc" and "np" values, which cannot be converted into WorldCat Entities ID (P10832). I will link it here. Epìdosis 18:47, 24 June 2024 (UTC) This is the batch.[reply]

Identifiers containing "viaf-"

It appears that for many P7859 values based on a VIAF ID (P214), the P7859 value now merely redirects to viaf.org, rather than to oclc.org as might have happened before. Given that the path to migrate to P10832 is therefore removed for those values which merely redirect to viaf.org, should the values in question be simply removed? Mahir256 (talk) 16:53, 29 April 2024 (UTC)[reply]

Imho the VIAF based IDs should be removed since VIAF is a cluster which also contains namesakes. --Kolja21 (talk) 21:50, 20 May 2024 (UTC)[reply]
Mahir, can your bot detect when viaf-something redirects to viaf.org? I would favor a deletion in these cases, if the viaf id is present in P214. ISNIplus (talk) 13:43, 29 August 2024 (UTC)[reply]

Continued batch removal of IDs that can be replaced

I strongly object. Example:

  1. https://www.wikidata.org/w/index.php?title=Q61134582&diff=prev&oldid=2101089930

ISNIplus (talk) 20:59, 15 May 2024 (UTC)[reply]

Stop the Twofivesixbot!

Yesterday I had already called for the bot to be stopped immediately on the discussion page of Mahir256, who runs the Twofivesixbot. Both he and Epìdosis rejected this request and Epìdosis justified this by saying that no one had spoken out against the migration. However, this statement is incorrect because I, for example, as a clear supporter of keeping the property, was not informed of the new discussion, for example via a ping.

In my view, it was more the case that the supporters of deletion looked for a new way to delete the property without contacting the supporters of keeping the property, even though there is no consensus. That is why I am again calling for the bot to be stopped immediately and for a vote on whether these edits should be carried out or not. --Gymnicus (talk) 16:29, 13 June 2024 (UTC)[reply]

As a clarification, in User talk:Mahir256#Twofivesixbot deletes statements I meant that no one had spoken out against the migration after the bot run had started; you are right in observing that I did not explicitly specify "after the bot run had started". For the remainder, I am still convinced of what I wrote there. --Epìdosis 19:27, 13 June 2024 (UTC)
But now I have spoken out against this bot run and therefore I continue to demand that this run be stopped immediately and that a proper vote be held on whether this migration is wanted or not. If Mahir256 does not stop the bot run within 24 hours, I will post a complaint on the administrator board. --Gymnicus (talk) 21:28, 13 June 2024 (UTC)[reply]
There was a thorough discussion and the bot does a good job. No reason to stop it because of a single user. --Kolja21 (talk) 01:23, 29 June 2024 (UTC)[reply]
There was no discussion, at least none in which the opposing side, which was clearly against any deletion, was even heard. As already mentioned, the deletion advocates are looking for a way around this so that the deletion can be carried out without consensus. The fact that this is being done by two administrators is, in my view, outrageous. Based on my experience, I am used to nothing less from Mahir256, but I am more than disappointed with Epìdosis. And the fact that the other administrators on the admin board did not even ask for a statement from the two of them is beyond audacity and shows once again that administrators are favored here and are seen as infallible. Gymnicus (talk) 15:26, 29 June 2024 (UTC)[reply]

2024-08

@Epìdosis: can you post statistics:

  1. ids starting viaf-
  2. ids starting lccn-
  3. ids starting with something else

? Recently each viaf- based id that I found redirected to viaf org. Would be nice to have these removed soon. ISNIplus (talk) 13:48, 29 August 2024 (UTC)[reply]

Presently, out of 530901 values of P7859, 338759 with viaf- and 192142 with lccn-. Epìdosis 14:04, 29 August 2024 (UTC)[reply]
Thank you. The first query restricted to humans and those where the part of the id after viaf- is also present in P214 https://w.wiki/B3zL : 244280 results. ISNIplus (talk) 10:49, 30 August 2024 (UTC)[reply]
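The prefix statistics requested above can be reproduced on any exported list of P7859 values. The following is only a minimal sketch: the function names are illustrative, and the sample value starting with "np" is hypothetical (the other two prefixes and the totals are the ones quoted in this thread).

```python
from collections import Counter

def prefix_group(value):
    """Return the prefix group of a P7859 value: 'viaf-', 'lccn-' or 'other'."""
    for prefix in ("viaf-", "lccn-"):
        if value.startswith(prefix):
            return prefix
    return "other"

def count_by_prefix(values):
    """Count P7859 values per prefix group."""
    return Counter(prefix_group(v) for v in values)

# Small sample; the last value is a hypothetical "something else" ID.
sample = [
    "viaf-1504155044720972520000",
    "lccn-nr89011057",
    "lccn-n86812674",
    "np-example",
]
counts = count_by_prefix(sample)

# Consistency check on the figures reported above.
reported_total = 338759 + 192142  # viaf- plus lccn-
```

With the reported figures, the two groups add up to the stated total of 530901, so no value with another prefix was left at that point.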

2024-09

@Epìdosis, Kolja21: I only found the following results when clicking on IDs starting with:

  1. viaf- : redirect to viaf
  2. lccn- : not found

Regarding presence of the corresponding value in P214 and P244:

  1. present: P7859 can be removed
  2. not present:
    1. in case of VIAF - remove, since it is only a cluster ID
    2. in case of LC
      1. test if the value exists in WD somewhere else
      2. if not in WD, test if it exists in source
      3. if in source, test if it could be added to an existing item

If those "present" are removed it will be visible how many are "not present" and how many belong to humans or have a WorldCat Entities ID.

Opinions? ISNIplus (talk) 12:50, 1 September 2024 (UTC)[reply]

I agree, IMHO you can proceed removing all the "present" ones. Epìdosis 16:23, 1 September 2024 (UTC)[reply]
+1. No use to keep redirects to VIAF. --Kolja21 (talk) 16:42, 1 September 2024 (UTC)[reply]
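The procedure proposed above ("present" versus "not present") can be written down as a small decision function. This is a sketch of the proposal as worded in this section, not of any bot that was actually run; the function name and the returned strings are illustrative assumptions.

```python
def proposed_action(p7859_value, p214_values, p244_values):
    """Map one P7859 value to the action proposed in the 2024-09 section.

    p214_values / p244_values: sets of VIAF and LC IDs already on the item.
    """
    if p7859_value.startswith("viaf-"):
        ident = p7859_value[len("viaf-"):]
        if ident in p214_values:
            return "remove: value present in P214"
        # Not present: VIAF is only a cluster ID, so removal was still proposed.
        return "remove: VIAF is only a cluster ID"
    if p7859_value.startswith("lccn-"):
        ident = p7859_value[len("lccn-"):]
        if ident in p244_values:
            return "remove: value present in P244"
        # Not present: three checks, in the order given above.
        return ("check: 1) elsewhere in Wikidata, 2) in the source, "
                "3) whether it fits an existing item")
    return "review manually"

example = proposed_action("lccn-n50081967", set(), {"n50081967"})
```

Only the "present" branches were approved for batch removal in the replies; the "not present" branches still need a person or a separate check.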

VIAF - 2024-09

@Epìdosis, Kolja21: "present" for viaf- yields only one result https://w.wiki/B5LW : Q86431843 - the page is protected, Epìdosis, can you remove? The remaining P7859 "viaf-" items can be split into two groups:

  1. some P214 exists https://w.wiki/B5Lb : 1592 results
    1. the first result is "Paul Oskar Kristeller (Q63895)" - https://worldcat.org/identities/viaf-1504155044720972520000/ redirects to http://viaf.org/viaf/56612115 - that value is "present" in P214.
    2. 1354 are humans https://w.wiki/B5M4 - of these 859 have a GND https://w.wiki/B5M7, 1068 an ISNI https://w.wiki/B5MA, 680 a LC https://w.wiki/B5MC and 559 WorldCat Entities https://w.wiki/B5ML
    3. I suggest removing the P7859 values - P214 is probably better maintained; P7859 could be 1) a redirect to a value "present" in P214, 2) wrong, or 3) another value or a redirect to another value - but when several values for the same item exist, the cluster is maybe more likely to be merged later - opinion? remove? who would check these otherwise?
  2. no P214 exists https://w.wiki/B5Lf : 344 results
    1. of these 295 are humans https://w.wiki/B5MP - none has WorldCat Entities https://w.wiki/B5Mk
    2. for the first two (Emilio Rúa (Q112898516), Jens G. Nørby (Q112942988)) the link redirected to WorldCat Entities, I added that value and added the viaf- value to P214 and in VIAF an ISNI was present which I added too.
    3. in case of the former two and others ("Jan Skiba" (Q97143763), "Dimitris Kavroudakis" (Q97467354)) the value had been removed by Epidosis but the removal undone by him https://editgroups.toolforge.org/b/EG/98011a0/
    4. several that I checked never had P214 before, so someone added these directly from WorldCat, so they probably have at least one work there, the accuracy could be higher than from normal P214-name-matches
    5. the 295 values could probably be copied to P214 and have the same error rate or lower as when humans normally add a value to P214
    6. I started reviewing the 295 manually, down to 263 now.

ISNIplus (talk) 13:48, 2 September 2024 (UTC)[reply]

I agree about the need of checking manually the point 2 (fortunately not many items), thanks for doing it @ISNIplus:. For point 1, I agree that in the great great majority of cases we should expect the value of P214 to be correct, or at least more correct than P7859, so I would support removing them. Epìdosis 16:08, 2 September 2024 (UTC)[reply]
BTW: There are also two types of errors:
--Kolja21 (talk) 16:10, 2 September 2024 (UTC)[reply]
Sorry, I made more clear what my comment from today referred to, by adding in front of it "VIAF - 2024-09" - so I will not respond to Demmin here as it is lccn- based, lccn- analysis expected earliest tomorrow, since the removal of the "present" values will last until then, but not sure if I will have time tomorrow.
Regarding Lucie Guerín, thank you, I also saw such errors. Sometimes I created a new item, so that the error is less likely to be made again, for L. Guerín too, there is now Q130220212. ISNIplus (talk) 16:36, 2 September 2024 (UTC)[reply]

@Epìdosis, Kolja21: update:

  1. some P214 exists https://w.wiki/B5Lb : 0 results
  2. no P214 exists https://w.wiki/B5Lf : 196 results - should be visible at Wikidata:Database reports/Complex constraint violations/P7859#Value doesn't match P214 if the page is updated

ISNIplus (talk) 03:08, 3 September 2024 (UTC)[reply]

  1. viaf- on non-humans https://w.wiki/B5u5 : 0 results
  2. viaf- on humans https://w.wiki/B5$F : 89 results (value already copied to P214, but some redirect to WorldCat - manual verification ongoing)
ISNIplus (talk) 20:05, 3 September 2024 (UTC)[reply]
down to 39 ISNIplus (talk) 00:48, 4 September 2024 (UTC)[reply]

@Epìdosis, Kolja21: the manual review has ended. No more viaf- in P7859 ( https://w.wiki/B6Cd ). Actions frequently taken:

  1. new items created, because P7859 was misplaced
  2. corresponding P214 was deprecated and marked conflated - value deleted
  3. corresponding P214 claim created, and from VIAF, if present added ISNI, GND, IDref, BnF, LC, BNE, sometimes PLWABN

ISNIplus (talk) 08:26, 4 September 2024 (UTC)[reply]

LCCN - 2024-09-03

I think something went wrong: ISNIplus started deleting P214 properties that have P10832 associated with them, but it is not specified. In my manual follow-up checks, this has been the case 100 percent of the time so far (e.g.: Virág Judit Gallery, Budapest (Q105735314), Arcanum Adatbázis (Q56415347), Albert Szent-Györgyi University of Medicine (Q61052428)). Maybe the bot should be stopped and this thought through. Pallor (talk) 09:07, 3 September 2024 (UTC)

Thank you! Each of these cases belongs to the batches named "not q5 -P7859 lccn- if corresponding lc value in P244". I added a section headline before your comment. Will respond more later. ISNIplus (talk) 10:37, 3 September 2024 (UTC)[reply]
@Pallor: Sorry for the delay, but I wanted to finish the work on viaf- first.
I am an opponent of removal without adding P10832
  1. see my comment above from 2024-05-15 in section "Continued batch removal of IDs that can be replaced".
  2. I reviewed more than 295 human items manually, it was tedious and no pleasure, see above searching for 295.
Regarding lccn- :
  1. I did check dozens of lccn- manually before, and only found redirects to viaf.org that resulted in "not found".
  2. Mahir (see above) had a bot running following these redirects and adding P10832
The three cases you provided (thanks again!) indicate that possibly more P10832 exist for non-human items that had a P7859 value. Since the removal was done by QS batches, the removal statements could be downloaded and the P7859 values be looked up by a bot. Each batch is named "not q5 -P7859 lccn- if corresponding lc value in P244". ISNIplus (talk) 08:49, 4 September 2024 (UTC)
The truth is that we have known before that many LCCN IDs have a Worldcat Entity even if the redirection does not work. For example, I only deal with elements related to Hungary, and within that, only those that are on my watch list, that is, a very small part of the entire stock, but I had a WCE record for 90 percent of the elements that I just reviewed, and if we look at public administrative units, then 100 percent. All I could do was manually add the WCE record for the 19 Hungarian counties, with quickstatements for the 175 districts, but I don't want to retrieve the IDs deleted from the settlements. This would be followed by the organizations of which I have listed three here, but approx. there were ten that I checked and found a WCE record for nine (regardless of whether the redirection worked or not).
And I emphasize that these are only on the margins of the data mass, what about the rest? My suspicion is that now hundreds of thousands of values have been deleted that would have helped us to see which elements might have a WCE record. Pallor (talk) 10:27, 4 September 2024 (UTC)
Re "The truth is that we have known before that many LCCN IDs have a Worldcat Entity even if the redirection does not work." - who is this group of "we", why wasn't I included?
Re "My suspicion is that now hundreds of thousands of values have been deleted that would have helped us to see which elements might have a WCE record." - What is the basis of your "suspicion"? The batches named "not q5 -P7859 lccn- if corresponding lc value in P244" included 119173 commands for removal, with a bit over 100 not done due to "ERROR".
Please 1) give an example where the redirection didn't work 2) say where that was mentioned before 3) explain how P7859 could be helpful. From your mentioned batch of districts https://quickstatements.toolforge.org/#/batch/237079 I looked at the first, namely Ajka District Q554544 and P7859 seems to have never been present [2] - so I don't understand why you mention these here (same for 2nd and 3rd, didn't check any further).
ISNIplus (talk) 12:29, 4 September 2024 (UTC)[reply]
1; Epidosis must have known about it, I'm sorry if you didn't inform me about it. Please don't delete more identifiers for now, let's develop a concept together first.
2; You're probably right, not hundreds of thousands, but "only" a hundred thousand (119173), but that's a lot, because we've lost so many potential connections. I'm not saying that every deleted LCCN P7859 had a P10832 counterpart, but I'm definitely saying that the ones that did have a connection, the deletion was a bad decision.
1; I don't understand this now: why do I have to give an example of what you yourself wrote about "I did check dozens of lccn- manually before, and only found redirects to viaf.org that resulted in "not found"." Now shall I give you an example of what you found?
2; P7859 is useful because it points out that this element may have a P10832 counterpart. If we delete this (P7859), the item becomes average, no one will think to look in the WorldCat Entities database to see if there is a record there.
3; This was probably misunderstood: I did not claim that there was a P7859 cancellation in relation to the Hungarian districts, but that I myself am trying to participate in the data cleaning/matching with my modest means. But it is still true that in the case of Hungarian counties and cities, P7859 was deleted from several places where P10832 would have been, but was not entered. It is unfortunate, but I can only assume that this may be true in relation to the administrative units of all other countries. (If you haven't checked the districts any further, can I trust that your check will be complete in the case of the deleted settlements?) Pallor (talk) 00:29, 5 September 2024 (UTC)[reply]
Hope this addresses all (numbers don't correspond to the above numbers):
  1. Re the 119173 non-humans ("not q5 -P7859 lccn- if corresponding lc value in P244"): All I checked redirected to https://worldcat.org/identities/notfound. Their placement could have been right or wrong - but there was no reference and no way to see if they were right. External IDs come mostly without references, since they are the reference itself, but if they lead to https://worldcat.org/identities/notfound then there isn't even that kind of reference. - It is not clear how many had a redirect to a content page, despite my assumption none would have one. But since nothing is lost, first of all the value is still in P244, a bot could be run to check the redirects.
  2. A bot could check all P244 not only those that had P7859 against "https://worldcat.org/identities/lccn-<P244>/". So many more can potentially be found. For a start, the bot could check the 119173. Results could be:
    1. redirect to a value for P10832 exists: the value can be added
    2. redirect to a value for P10832 does not exist: how the presence of P7859 would help? You wrote "that many LCCN IDs have a Worldcat Entity even if the redirection does not work. For example, I only deal with elements related to Hungary, and within that, only those that are on my watch list, that is, a very small part of the entire stock, but I had a WCE record for 90 percent of the elements that I just reviewed, and if we look at public administrative units, then 100 percent." - please give an example. For the districts you found WCE without P7859.
  3. Last but not least, for several human items I don't understand the value of WCE at all, e.g. Talk:Q25930834#WorldCat Entities as a copy of Wikidata .
ISNIplus (talk) 03:35, 5 September 2024 (UTC)[reply]
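The redirect check suggested above for a bot boils down to two steps: build the WorldCat Identities URL from a P244 value, then classify where it ends up. The sketch below covers only the string handling, using the URL patterns quoted in this discussion; the function names are illustrative, and fetching the final URL (and rate-limiting such requests) is deliberately left out.

```python
from urllib.parse import urlparse

def identities_url(lc_id):
    """Build the WorldCat Identities URL for an LC ID (pattern quoted above)."""
    return f"https://worldcat.org/identities/lccn-{lc_id}/"

def classify_redirect(target_url):
    """Classify the final URL that an Identities link redirects to."""
    parsed = urlparse(target_url)
    host = parsed.netloc.lower()
    path = parsed.path
    if host.endswith("worldcat.org") and path.rstrip("/").endswith("/identities/notfound"):
        return "not found"
    if host == "entities.oclc.org" and path.startswith("/worldcat/entity/"):
        return "WorldCat Entities"  # a candidate value for P10832
    if host in ("viaf.org", "www.viaf.org"):
        return "VIAF"
    return "other"

url = identities_url("n50081967")
kind = classify_redirect(
    "https://entities.oclc.org/worldcat/entity/E39PCjrFwDFjy8hbvDPmGCfJ8P.html"
)
```

Anything classified as "other" would need a manual look, since only the three targets above have been reported here.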
ISNIplus: my combined answer:
The significance of the presence of P7859 is that it indicates that this item may have a record in the WCE database. Whether the redirection works and it is easy to determine which is the related P10832 record, or the redirection does not work and only a unique search to determine the P10832 record is irrelevant. P7859 is a signal. When you delete P7859 without P10832 in the element, we lose a potential connection.
I see that the WCE database is also developing. I have also been dealing with pairing for months, but only on the marginal line I mentioned earlier. My view is definitely that there are more and more items in WCE that weren't there before, or were there under different names. I couldn't find them before, now I can. You see that a name appears in "text" format (eg: X. Y.'s letter to W. Z.), but before there was no record for X. Y., now there is. Or there is no record now, but there will be later.
We will only find them if we look back from time to time at the previously unpaired elements. The existence of P7859 is a big help for this, which is why I consider the recent deletions to be a serious mistake. Pallor (talk) 21:30, 6 September 2024 (UTC)[reply]
P.s.: Most of what you see here was placed by me (before you note: I was not consistent in using "novalue" and "somevalue"). These indications indicate that there was P7859, so presumably there is (or will be) P10832, but when I checked this it wasn't. This mark is like a Wikidata item where you left P7859 in even though you didn't find P10832. Pallor (talk) 21:36, 6 September 2024 (UTC)[reply]
Köszönöm szépen. In the WCE link one can substitute .html with .json and get structured data; this can help to check the correctness of existing IDs. Today I counted humans that have P10832: it's over 1309241; from VIAF only ISNI and GND have more, and LC has a bit less. Regarding the signal I agree: the deletion at the end of August (?) did not consider that items in WCE might exist even if the link is broken. I tried to find a download link for my QS, but couldn't find one; maybe it is in editgroups, which was broken when I looked; will check again. Maybe the signal can be brought back using P10832, e.g. value unknown and a lccn- reference, similar to what you did. Unfortunately I have no software to mass check IDs, but if no one else is doing it, I *may* try to do it. ISNIplus (talk) 23:31, 6 September 2024 (UTC)
ISNIplus I'm still not convinced that what you're doing is good. You delete, you delete, but you don't check if there is P10832. Why are you doing this? Pallor (talk) 21:54, 7 September 2024 (UTC)[reply]
What do you refer to? ISNIplus (talk) 22:02, 7 September 2024 (UTC)[reply]
For example this: Q66669898 or this: Q69815148... Pallor (talk) 23:10, 7 September 2024 (UTC)[reply]
These are links to items. Regarding the first, please explain "I'm still not convinced that what you're doing is good. You delete, you delete, but you don't check if there is P10832." ISNIplus (talk) 23:14, 7 September 2024 (UTC)[reply]
I don't understand what you don't understand.
Do not delete identifier P7859 from elements where you cannot immediately insert P10832. This is what I ask, and this is what the community asks of you. The examples I have listed support the fact that you are constantly going against the will of the community. Please stop this. Pallor (talk) 08:41, 8 September 2024 (UTC)
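The .html to .json substitution mentioned earlier in this thread is a plain suffix swap on the WorldCat Entities URL. A minimal sketch, assuming the URL shape quoted on this page; the function names are illustrative, and nothing is assumed about the structure of the JSON that comes back.

```python
def wce_json_url(html_url):
    """Turn a WorldCat Entities .html URL into the corresponding .json URL."""
    suffix = ".html"
    if not html_url.endswith(suffix):
        raise ValueError("expected a URL ending in .html")
    return html_url[: -len(suffix)] + ".json"

def wce_entity_id(url):
    """Extract the entity ID (the candidate P10832 value) from a WCE URL."""
    last = url.rstrip("/").rsplit("/", 1)[-1]
    for suffix in (".html", ".json"):
        if last.endswith(suffix):
            return last[: -len(suffix)]
    return last

html_url = "https://entities.oclc.org/worldcat/entity/E39PCjrFwDFjy8hbvDPmGCfJ8P.html"
json_url = wce_json_url(html_url)
entity_id = wce_entity_id(html_url)
```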

LCCN - 2024-09-04 AM

Statistics for items having P7859 with "lccn-":

  1. https://w.wiki/B6Dr all : 64181
  2. https://w.wiki/B6Du human : 44766
  3. https://w.wiki/B6Dv P244 exists : 6229
  4. https://w.wiki/B6Dx P244 is the same : 140
  5. https://w.wiki/B6Dy P244 is the same and P10832 exists : 29
  6. https://w.wiki/B6E3 part after lccn- contains "-" : 723
  7. https://w.wiki/B6E6 part after lccn- contains "-" and P244 exists : 685
  8. https://w.wiki/B6E8 part after lccn- contains "-" and P244 is the same when "-" removed : 541
  9. https://w.wiki/B6EA part after lccn- contains "-" and P244 is the same when "-" removed and P10832 exists : 341
  10. https://w.wiki/B6KK part after lccn- contains "-" and P244 is the same when "-" replaced with "0" : 117
  11. https://w.wiki/B6KC part after lccn- contains "-" and P244 is the same when "-" replaced with "0" and P10832 exists : 70
  12. https://w.wiki/B6Lj part after lccn- contains "-" and P244 is the same when "-" replaced with "00" : 22
  13. https://w.wiki/B6Ln part after lccn- contains "-" and P244 is the same when "-" replaced with "00" and P10832 exists : 16
  14. https://w.wiki/B6JJ no P244 and no P10832 : 57543

ISNIplus (talk) 09:39, 4 September 2024 (UTC)[reply]

Regarding 'part after lccn- contains "-"'

  1. Removal
    1. I checked four of the 341 - in each case redirect didn't work with "-" but did work without "-" and lead to the existing P10832. ISNIplus (talk) 13:13, 4 September 2024 (UTC)[reply]
      The remaining 337 processed via https://quickstatements.toolforge.org/#/batch/237125 ISNIplus (talk) 13:22, 4 September 2024 (UTC)[reply]
    2. I checked four of the 70 - in each case redirect didn't work with "-" but did work with "0" instead and lead to the existing P10832. Remaining 68 processed via https://quickstatements.toolforge.org/#/batch/237127 ISNIplus (talk) 13:44, 4 September 2024 (UTC)[reply]
    3. 16 removed https://quickstatements.toolforge.org/#/batch/237134 ISNIplus (talk) 14:21, 4 September 2024 (UTC)[reply]
  2. Keep but adjust
    1. 199 items: adjust P7859 to P244 by removing "-" https://quickstatements.toolforge.org/#/batch/237130 ISNIplus (talk) 14:09, 4 September 2024 (UTC)[reply]
    2. 47 items: adjust P7859 to P244 by replacing "-" with "0" https://quickstatements.toolforge.org/#/batch/237131 ISNIplus (talk) 14:09, 4 September 2024 (UTC)[reply]
    3. 6 items: adjust P7859 to P244 by replacing "-" with "00" https://quickstatements.toolforge.org/#/batch/237135 ISNIplus (talk) 14:26, 4 September 2024 (UTC)[reply]
  3. 723 down to 38 and no P244 is present. ISNIplus (talk) 14:55, 4 September 2024 (UTC)[reply]
    723 down to 0. ISNIplus (talk) 16:33, 4 September 2024 (UTC)[reply]
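The three adjustments used above for values whose part after lccn- contains "-" (remove it, replace it with "0", replace it with "00") can be generated mechanically and compared with the item's P244. A minimal sketch; the function names and the sample value "lccn-n79-12345" are hypothetical.

```python
def lc_candidates(p7859_value):
    """Return candidate LC IDs for an lccn- value, in the order tried above."""
    prefix = "lccn-"
    if not p7859_value.startswith(prefix):
        return []
    ident = p7859_value[len(prefix):]
    candidates = []
    for filler in ("", "0", "00"):  # remove "-", or replace it with "0" / "00"
        candidate = ident.replace("-", filler)
        if candidate not in candidates:
            candidates.append(candidate)
    return candidates

def match_p244(p7859_value, p244_values):
    """Return the first candidate that is already present in P244, else None."""
    for candidate in lc_candidates(p7859_value):
        if candidate in p244_values:
            return candidate
    return None

candidates = lc_candidates("lccn-n79-12345")         # hypothetical value
matched = match_p244("lccn-n79-12345", {"n79012345"})
```

A value without "-" yields a single candidate, so the same function also covers the plain "P244 is the same" comparison.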

LCCN - 2024-09-04 PM

Statistics for items having P7859 with "lccn-":

  1. https://w.wiki/B6Dr all : 63681
  2. https://w.wiki/B6Du human : 44271
  3. https://w.wiki/B6Dv P244 exists : 5775
  4. https://w.wiki/B6Dx P244 is the same : 369 (link with ext. URLs https://w.wiki/B6S4)
  5. https://w.wiki/B6Dy P244 is the same and P10832 exists : 0
  6. https://w.wiki/B6JJ no P244 and no P10832 : 57501
  7. https://w.wiki/B6Sw no P244 exists : 57906

ISNIplus (talk) 16:42, 4 September 2024 (UTC)[reply]

@Mahir256: can you run your bot again on these items to find values for P10832? @Epìdosis, Kolja21: opinion about how to proceed? ISNIplus (talk) 12:50, 4 September 2024 (UTC)[reply]

@ISNIplus: Thanks for your good work. I just stumbled across the ceb issue:
--Kolja21 (talk) 15:42, 4 September 2024 (UTC)[reply]
https://w.wiki/B6Dy "P244 is the same and P10832 exists" increased to 180 due to copying the lccn- value to P244, first that I checked is Rheinfelden, like the Romano d'Ezzelino example. https://www.wikidata.org/w/index.php?title=Q269667&diff=2243669242&oldid=2179408808 ISNIplus (talk) 03:45, 5 September 2024 (UTC)[reply]

@Epìdosis, Kolja21: there are ~63000 lccn- for which no P244 is present (57906 items without any P244). I cannot judge the quality of the LC within P7859, but when cleaning those that contained "-" in the string after lccn- I found several that were - in adjusted form - good for insertion into P244. So, maybe it is best to copy these values to P244, a property which is much better maintained and where people will find inconsistencies and duplicates. If the LC value is preserved that way, one could even delete P7859.

@Bargioni: could you check which P7859 lccn-links are resolving to "not found"? Could moreIdentifiers be extended to look for WCE if P244 exists but not P10832? ISNIplus (talk) 17:29, 4 September 2024 (UTC)[reply]

Thanks very much for your manual checks.
Copying values to Library of Congress authority ID (P244) (I would suggest to leave a reference matched by identifier from (P11797)WorldCat Identities (Q76630151) to these IDs to make it easier checking them afterwards) seems reasonable to me.
I'm sure moreIdentifiers cannot "be extended to look for WCE if P244 exists but not P10832", since it is based exclusively on VIAF and VIAF presently doesn't take into account WCE. For P7859 lccn-links resolving to not found, maybe the program already used by @Mahir256: can be readapted to do it with not much work. Epìdosis 17:44, 4 September 2024 (UTC)[reply]
Batches running for 57948 claims including ref as suggested by you, they are named "+P244,S11797=Q76630151".
moreIdentifiers would just need a call to another URI and work with the response(s), but probably better to have a bot running, could be too many useless calls to WCE. I could create an account ISNIplusBot, and maybe adjust and run existing code. Or maybe I use a local program, and add results via QS later. ISNIplus (talk) 19:56, 4 September 2024 (UTC)
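For illustration, the batch lines behind a name like "+P244,S11797=Q76630151" could be generated as below. This is a sketch assuming the tab-separated QuickStatements V1 syntax (string values in double quotes, reference properties written with an S prefix, a leading "-" to remove a statement); the function names are illustrative, and the sandbox item with the LC value "n00000000" is a hypothetical example, not a real match.

```python
def add_p244_command(qid, p7859_value):
    """Copy the LC ID from an lccn- value to P244, with the suggested reference."""
    lc_id = p7859_value[len("lccn-"):]
    # Reference: matched by identifier from (P11797) = WorldCat Identities (Q76630151)
    return "\t".join([qid, "P244", f'"{lc_id}"', "S11797", "Q76630151"])

def remove_p7859_command(qid, p7859_value):
    """Remove the P7859 statement once its LC ID is preserved in P244."""
    return "-" + "\t".join([qid, "P7859", f'"{p7859_value}"'])

# Q4115189 is the Wikidata Sandbox item; the LC value is made up.
add_line = add_p244_command("Q4115189", "lccn-n00000000")
remove_line = remove_p7859_command("Q4115189", "lccn-n00000000")
```

Keeping addition and removal as separate commands makes it possible to run the additions first and hold back the removals until the copied values have been reviewed.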

LCCN - 2024-09-05 adding WCE but not removing WCI

@Pallor:, two minutes after I added P244 (LC) you added P10832, but didn't remove P7859 [4] - can you explain why? Others then will have to look at the item again. With a longer distance between my edit and yours: Q62075614, Q66361504. ISNIplus (talk) 18:55, 5 September 2024 (UTC)[reply]

I don't delete P7859, which has two reasons:

1; deletions can be done extremely efficiently with a bot when we get to the end of the project and every item with P7859 has been assigned P10832. 2; the community's support for the deletion was not clear. I read that P7859 should not be deleted for the time being. Pallor (talk) 00:39, 6 September 2024 (UTC)

1) "every item P7859 is assigned P10832" - so you don't care if a specific P7859 redirects to a specific P10832? In two of the cases above the redirects from P7859 pointed to https://worldcat.org/identities/notfound . 2) "the community's support for the cancellation was not clear" - For 2a: P7859 for which the P10832 has been added? 2b: P7859 pointing to https://worldcat.org/identities/notfound ? ISNIplus (talk) 00:45, 6 September 2024 (UTC)[reply]

LCCN - 2024-09-06 AM

Statistics for items having P7859 with "lccn-":

  1. https://w.wiki/B6Dr all : 61685
  2. https://w.wiki/B6Du human : 42478
  3. https://w.wiki/B6Dv P244 exists : 61605
  4. https://w.wiki/B6Dx P244 is the same : 57713 (link with ext. URLs https://w.wiki/B6S4)
  5. https://w.wiki/B6Dy P244 is the same and P10832 exists : 0
  6. https://w.wiki/B6JJ no P244 and no P10832 : 80
  7. https://w.wiki/B6Sw no P244 exists : 80

ISNIplus (talk) 03:13, 6 September 2024 (UTC)[reply]

I reviewed the 80 manually, now down to 0. ISNIplus (talk) 15:19, 6 September 2024 (UTC)[reply]

LCCN - 2024-09-06 PM

Statistics for items having P7859 with "lccn-":

  1. https://w.wiki/B6Dr all : 61590
  2. https://w.wiki/B6Du human : 42421
  3. https://w.wiki/B6Dv P244 exists : 61590
  4. https://w.wiki/B6Dx P244 is the same : 57699 (link with ext. URLs https://w.wiki/B6Ru)
  5. ?? P244 is not the same : 61590-57699=3891 (see also: Wikidata:Database reports/Complex constraint violations/P7859)
  6. https://w.wiki/B6Dy P244 is the same and P10832 exists : 0
  7. https://w.wiki/B6JJ no P244 and no P10832 : 0
  8. https://w.wiki/B6Sw no P244 exists : 0

ISNIplus (talk) 15:20, 6 September 2024 (UTC)[reply]
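The "P244 is the same" / "P244 is not the same" split in these statistics is a comparison of the part after lccn- with the item's P244 values. A minimal sketch: the function name is illustrative, the two mismatching pairs are the ones examined further down this page, and the matching sandbox entry is hypothetical.

```python
def split_by_p244_match(items):
    """Split items into (same, different) by comparing lccn- values with P244.

    items: dict mapping QID -> (P7859 value, set of P244 values).
    """
    same, different = [], []
    for qid, (p7859_value, p244_values) in items.items():
        lc_id = p7859_value[len("lccn-"):]
        if lc_id in p244_values:
            same.append(qid)
        else:
            different.append(qid)
    return same, different

items = {
    "Q5656": ("lccn-nr89011057", {"n50081967"}),
    "Q117775165": ("lccn-n86812674", {"n2008042997"}),
    "Q4115189": ("lccn-n00000000", {"n00000000"}),  # hypothetical match
}
same, different = split_by_p244_match(items)

# The figure given in row 5 above: items with P244 minus items where it matches.
not_the_same = 61590 - 57699
```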

For "P244 is the same : 57699" one could remove P7859, the P244 value is tagged with "matched by identifier from : WorldCat Identities" - so no data is lost, just duplication removed. @Epìdosis, Kolja21: Opinion? ISNIplus (talk) 15:35, 6 September 2024 (UTC)[reply]

I agree that we can remove them. Epìdosis 15:59, 6 September 2024 (UTC)[reply]
+1. --Kolja21 (talk) 19:21, 6 September 2024 (UTC)[reply]

LCCN - 2024-09-07 PM

Statistics for items having P7859 with "lccn-":

  1. https://w.wiki/B6Dr all : 3993
  2. https://w.wiki/B6Du human : 2551
  3. https://w.wiki/B6Dv P244 exists : 3993
  4. https://w.wiki/B6Dx P244 is the same : 104 (link with ext. URLs https://w.wiki/B6Ru)
  5. ?? P244 is not the same : 3993-104=3889 (see also: Wikidata:Database reports/Complex constraint violations/P7859)
  6. https://w.wiki/B6Dy P244 is the same and P10832 exists : 0
  7. https://w.wiki/B6JJ no P244 and no P10832 : 0
  8. https://w.wiki/B6Sw no P244 exists : 0

ISNIplus (talk) 20:51, 7 September 2024 (UTC)[reply]

There are 3993 items having P7859 based on lccn- and P244; for 3889 of them the values don't match.

Examples for Q5:

  1. Héloïse d’Argenteuil Q5656 (item with lowest ID)
    1. P7859 lccn-nr89011057 1 reference VIAF ID 71392176
      1. https://worldcat.org/identities/lccn-nr89011057/ redirects to https://worldcat.org/identities/notfound
      2. https://id.loc.gov/authorities/names/nr89011057.html
        1. Identifies RWO
          1. https://isni.org/isni/0000000121386094 - found on Q5656
          2. https://d-nb.info/gnd/118548980 - found on Q5656
        2. Closely Matching Concepts from Other Schemes : Wikidata Elizabeth Boyd Q18546156 which has nr89011057
      3. https://viaf.org/viaf/71392176/
        1. https://id.loc.gov/authorities/names/nr89011057
    2. P244 n50081967 1 reference imported from Wikimedia project
      1. https://id.loc.gov/authorities/names/n50081967.html
        1. Identifies RWO https://isni.org/isni/0000000121386094 - found on Q5656
      2. https://worldcat.org/identities/lccn-n50081967/ redirects to https://entities.oclc.org/worldcat/entity/E39PCjrFwDFjy8hbvDPmGCfJ8P.html
  2. Julia Fitzgerald Q117775165 (item with highest ID)
    1. P7859 lccn-n86812674 0 references
      1. https://worldcat.org/identities/lccn-n86812674/ redirects to https://entities.oclc.org/worldcat/entity/E39PCjKBjFH9FjQY4qcqk9PRcd.html
    2. P244 n2008042997 0 references
      1. https://id.loc.gov/authorities/n2008042997 Monson, Christine - now created: Christine Monson (Q130257195)
        1. http://viaf.org/viaf/sourceID/LC%7Cn+2008042997#skos:Concept redirects to https://viaf.org/viaf/274979320/#skos:Concept
      2. 2023-04-18 1) the (at least now) wrong WCE was added to P7859 2) next edit replaces it with WCI based on correct LC 3) the wrong P244 was added
  3. Aleksandr Agin (Q4056926) https://www.wikidata.org/w/index.php?title=Q4056926&diff=2245202918&oldid=2210393559 - P244 LC not found

@Epìdosis, Kolja21: the first two examples above of a mismatch between lccn-... and P244 show two types of LC errors: 1) LC of P7859 belongs to another item and P244 is correct and 2) vice versa.

In the 2nd example the P244 error existed for more than a year, nobody corrected it.

For the 3993 mismatches one could copy the P7859 value to P244 and give as ref: "matched by identifier from [P11797] : WorldCat Identities [Q76630151]" - the two different values will exist in P244 forcing constraint violations and make it more likely people look at the item. Opinion? ISNIplus (talk) 23:08, 7 September 2024 (UTC)[reply]

I think "copy the P7859 value to P244 and give as ref: "matched by identifier from [P11797] : WorldCat Identities [Q76630151]" - the two different values will exist in P244 forcing constraint violations and make it more likely people look at the item" could be a reasonable solution, at least in theory, but since the number of users fixing these constraint violations is very low, I would prefer not to overburden them, so I think the best option would be solving them manually without doing this passage. Epìdosis 07:22, 8 September 2024 (UTC)[reply]
I found more good LC in P7859, so decided to copy. But they can still be found via query for manual review/removal. ~340 items left that link to P7859, see below. ISNIplus (talk) 11:41, 8 September 2024 (UTC)[reply]

2024-09-08 new claims LC WCE normal and WCI deprecated

@Epìdosis, Kolja21: Just saw new normal rank LC WCE and new WCI deprecated [5]. I think WCI P7859 confuses people. It is often given as a suggestion, while WCE is not, when starting to add a new property without having typed anything yet.

Currently there are 340 items linking to P7859 https://www.wikidata.org/w/index.php?title=Special:WhatLinksHere/Property:P7859&limit=340 , will manually review, but 340 is too much. Your help very welcome.

Epidosis, can you write a query that finds items where P7859 is used as reference or qualifier? ISNIplus (talk) 11:38, 8 September 2024 (UTC)[reply]

I reviewed the 32 format violations (Wikidata:Database_reports/Constraint_violations/P7859#Format) manually, for no value I (mostly?):
  1. added WCE and deleted no value
  2. moved no value to WCE
Currently 308 : https://www.wikidata.org/w/index.php?title=Special:WhatLinksHere/Property:P7859&limit=310
ISNIplus (talk) 12:46, 8 September 2024 (UTC)[reply]


@Vladimir Alexiev: 270 left https://www.wikidata.org/w/index.php?title=Special:WhatLinksHere/Property:P7859&limit=270 - sometimes in reference for occupation, frequently found "non-fiction writer". I always check if I can find an item in WCE, but for these occupations, there is sometimes only a text, not a human item. If you can help to clean these last 270, this would be great! ISNIplus (talk) 13:49, 8 September 2024 (UTC)[reply]

@Vladimir: There are wrong imports like VIAF ➔ LCAuth n89287411 (Ābād, fl. 1918) ➔ WorldCat Identities added to József Abád (Q481610): Hungarian mathematics teacher, volleyball coach (1910-1978). Later marked as "conflation" but it's not a conflation just the import of an unchecked ID. Wrong from the beginning. @ISNIplus: Imho all WorldCat Identities with a deprecated rank can be deleted. --Kolja21 (talk) 14:19, 8 September 2024 (UTC)[reply]
@Kolja21: If they have no P10832 one can look them up in WCE.
Less than 240 left. https://www.wikidata.org/w/index.php?title=Special:WhatLinksHere/Property:P7859&limit=240
Maybe we can finish that this Sunday and finally delete P7859. ISNIplus (talk) 14:24, 8 September 2024 (UTC)

@Epìdosis: on Q22081788 I was looking for P7859 and, after no success, I checked the history: you had just removed it. If that happened to you too - I am off now, no interference from my side. Less than 200 https://www.wikidata.org/w/index.php?title=Special:WhatLinksHere/Property:P7859&limit=200 ISNIplus (talk) 14:55, 8 September 2024 (UTC)

There are no P7859 left in references, everything cleaned and substituted with good sources wherever possible. Epìdosis 14:58, 8 September 2024 (UTC)
Still 197 links from item pages to P7859 https://www.wikidata.org/w/index.php?title=Special:WhatLinksHere/Property:P7859&limit=200 ISNIplus (talk) 15:42, 8 September 2024 (UTC)

P7859 revival proposal

One user wants to revive the usage of P7859:



Doesn't seem reasonable at all. Of course there are errors, but there are also errors in the broken links. Nobody becomes obliged. WD is a voluntary project. ISNIplus (talk) 23:33, 8 September 2024 (UTC)

@Dcflyer: you wrote "and even more so with the expectation that other editors become obliged to fix the errors that you introduced into Wikidata, instead of yourself, as was stated again in a revert edit summary comment to Martin (MSGJ)." - That is just not true. Please look again. ISNIplus (talk) 23:36, 8 September 2024 (UTC)
@Dcflyer: you wrote 'Especially given the fact that you were advised by Epìdosis, ". . . but since the number of users fixing these constraint violations is very low, I would prefer not to overburden them, so I think the best option would be solving them manually without doing this passage."' - you probably saw my reply there too? Why do you present only selected facts? ISNIplus (talk) 23:38, 8 September 2024 (UTC)