Property talk:P4839
Documentation
input form for an entity in Wolfram Language
Entity\["[A-Z][a-zA-Z]+", \{?"[^"]+"(?:, (?:\d+|"[a-zA-Z]+"))*\}?\]
Format constraint: the value must match the pattern above (PCRE syntax).
List of violations of this constraint: Database reports/Constraint violations/P4839#Format, SPARQL
List of violations of this constraint: Database reports/Constraint violations/P4839#Unique value, SPARQL (every item), SPARQL (by value)
List of violations of this constraint: Database reports/Constraint violations/P4839#Entity types
List of violations of this constraint: Database reports/Constraint violations/P4839#single best value, SPARQL
type is currently not a value of Wolfram Language entity type (P7497) (Help)
Violations query:
SELECT * { { SELECT ?type (COUNT(*) as ?count) (SAMPLE(?value) as ?samplevalue) (SAMPLE(?item) as ?item) { ?item wdt:P4839 ?value . BIND(strafter(strbefore(?value, "\","),"Entity[\"") as ?type) } GROUP BY ?type } MINUS { [] wdt:P7497 ?type } } ORDER BY DESC(?count)
List of this constraint violations: Database reports/Complex constraint violations/P4839#First part should be a value of Wolfram Language entity type (P7497)
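For readers less familiar with SPARQL string functions, the type extraction in the query above (strbefore followed by strafter) can be mirrored outside the query service. The following Python sketch is illustrative only; the function name entity_type is hypothetical and not part of any Wikidata or Wolfram tooling.

```python
def entity_type(value: str) -> str:
    """Mimic strafter(strbefore(value, '",'), 'Entity["') from the query above."""
    # strbefore: text before the first '",' (empty string if the marker is absent)
    head, sep, _ = value.partition('",')
    if not sep:
        return ""
    # strafter: text after the first 'Entity["' (empty string if the marker is absent)
    _, sep, tail = head.partition('Entity["')
    return tail if sep else ""


print(entity_type('Entity["Person", "AlbertEinstein::6tb7g"]'))  # prints Person
```

As in the query, a value that does not contain both markers yields an empty type.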
Check Entity listed in https://reference.wolfram.com/language/guide/EntityTypes.html
Violations query:
SELECT ?item ?value { ?item p:P4839 [ ps:P4839 ?value ]. BIND( REGEX( STR( ?value ), "^(.+(Language|Word|GrammaticalUnit|WritingScript|Alphabet|Character|Concept|WritingDirection|WritingScriptBaseline|WritingScriptType|HistoricalCountry|HistoricalEvent|HistoricalPeriod|HistoricalSite|Shipwreck|MilitaryConflict|Person|PersonTitle|GivenName|Surname|Gender|Emotion|Food|FoodType|BasicFoodGroup|USDAFoodGroup|FoodTypeGroup|FoodAlcoholLabel|FoodCaffeineLabel|FoodCalorieLabel|FoodFiberLabel|FoodFatLabel|FoodIronLabel|FoodSodiumLabel|FoodSugarLabel|FoodBoneContent|FoodSkinContent|FoodSeedContent|FoodCrustType|FoodFatType|FoodGeometryType|FoodPeelingType|FoodProcessingType|FoodServingType|FoodStorageType|FoodSugarType|FoodBeefGrade|FoodMeatCut|FoodMeatQuality|FoodPattyCount|FoodBrandName|FoodSubBrandName|FoodManufacturer|FoodAge|FoodComposition|FoodConcentration|FoodCulture|FoodDataSource|FoodFlavor|FoodIntendedUse|FoodLocation|FoodMoistureLevel|FoodNutritionalSupplement|FoodNutritionalSupplementNotAdded|FoodPackaging|FoodPart|FoodPreparation|FoodSeafoodVariety|FoodSize|FoodState|FoodSugarType|FoodTexture|FoodTrimmingLevel|FoodVariety|FoodVegetablePart|Financial|Company|CurrencyDenomination|SportObject|SportMatch|MusicalInstrument|BoardGame|PopularCurve|YogaPose|YogaPosition|YogaSequence|YogaProp|PilatesExercisePokemon|Digimon|Language|Religion|Mythology|Movie|MusicAct|MusicAlbum|MusicAlbumRelease|MusicWork|MusicWorkRecording|BroadcastStation|BroadcastStationClassification|Book|Artwork|Periodical|FictionalCharacter|Museum|LibraryBranch|LibrarySystem|FrequencyAllocation|BroadcastStation|MeasurementDevice|Building|Bridge|Tunnel|Dam|Mine|Aircraft|Airline|Airport|Ship|WeatherStation|TropicalStorm|Cloud|AtmosphericLayer|Earthquake|GeologicalLayer|GeologicalPeriod|Mineral|FamousGem|TidalConstituent|TideStation|Satellite|Rocket|DeepSpaceProbe|MannedSpaceMission|Planet|PlanetaryMoon|MinorPlanet|Comet|SolarSystemFeature|MeteorShower|Exoplanet|Star|Galaxy|StarCluster|Nebula|Supernova|Pulsar|AstronomicalRadioSource|Constellation|Icon|Color|ColorSet|LightColor|FileFormat|DisplayFormat|NotableComputer|InternetDomain|IPAddress|NetworkService|TopLevelDomain|ProgrammingLanguage|WolframLanguageSymbol|Polyhedron|Solid|Lamina|Surface|SpaceCurve|PlaneCurve|Lattice|LatticeSystem|PeriodicTiling|NonperiodicTiling|Graph|Knot|FiniteGroup|MathematicalFunction|IntegerSequence|ContinuedFraction|FunctionSpace|TopologicalSpaceType|FamousMathProblem|FamousMathGame|ComputationalComplexityClass|MathWorld|ContinuedFractionResult|ContinuedFractionSource|FunctionalAnalysisSource|Plant|Species|Dinosaur|DogBreed|CatBreed|AnatomicalStructure|AnimalAnatomicalStructure|Neuron|Disease|MedicalTest|Protein|AnatomicalFunctionalConcept|AnatomicalTemporalConcept|CognitiveTask|ICDNine|ICDTen|Gene|SNP|Protein|Chemical|Element|Isotope|Particle|Mineral|Laser|CrystalFamily|CrystalSystem|CrystallographicSpaceGroup|PhysicalSystem|PhysicalConstant|FamousPhysicsProblem|FamousChemistryProblem|Color|ColorSet|LightColor|MeasurementDevice|Country|AdministrativeDivision|City|Neighborhood|MetropolitanArea|ZIPCode|USCongressionalDistrict|DistrictCourt|Ocean|Island|UnderseaFeature|Reef|Beach|Lake|Mountain|Volcano|River|Glacier|Waterfall|EarthImpact|Desert|Forest|GeographicRegion|Airport|Park|AmusementPark|AmusementParkRide|Stadium|Bridge|Canal|Tunnel|Dam|Mine|Cave|OilField|Building|Castle|Cemetery|HistoricalSite|PreservationStatus|ReserveLand|Shipwreck|University|SchoolDistrict|PublicSchool|PrivateSchool|Museum|LibrarySystem|LibraryBranch|WeatherStation|AstronomicalObservatory|ParticleAccelerator|NuclearReactor|NuclearTestSite|NuclearExplosion|TimeZone).+)$" ) AS ?regexresult ) . FILTER( ?regexresult = false ) . FILTER( ?item NOT IN ( wd:Q4115189, wd:Q13406268, wd:Q15397819 ) ) . }
List of this constraint violations: Database reports/Complex constraint violations/P4839#Check Entity name
Constraint regex too tight
Just a note: the regular expression doesn't allow for colons within the strings, which are valid in WL. For instance, Entity["Species", "Species:HomoSapiens"] is the entity code for Q5, but this shows a constraint violation. --Hebejebelus (talk) 13:36, 26 January 2019 (UTC)
- @Hebejebelus: I made the minimal change to regex, I don't know where the "third string" is used so I didn't change that to accept a single colon. I also added your string to Homo Sapiens. I see you suggested to add it to human (Q5) instead, that separation seems to be debated anyway.
- It may be noted that there are properties that aren't supported for P4839, so Entity["Species", "Species:HomoSapiens"][EntityProperty["Species", "ScientificName"]] isn't accepted by regex, and I think should not. Jagulin (talk) 05:08, 22 August 2019 (UTC)
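To check a value against the format constraint before saving it, the pattern from the documentation header can be tried locally. This is a minimal Python sketch, assuming the pattern as currently shown at the top of this page (it may have changed since) and assuming the whole value has to match; the helper name matches_format is made up for illustration.

```python
import re

# Pattern copied from the format constraint shown on this page (PCRE syntax).
# Python's re module accepts it unchanged; full-string matching is an assumption.
P4839_FORMAT = re.compile(
    r'Entity\["[A-Z][a-zA-Z]+", \{?"[^"]+"(?:, (?:\d+|"[a-zA-Z]+"))*\}?\]'
)


def matches_format(value: str) -> bool:
    """Return True if the whole value matches the P4839 format pattern."""
    return P4839_FORMAT.fullmatch(value) is not None


# A colon inside the canonical name is accepted.
print(matches_format('Entity["Species", "Species:HomoSapiens"]'))  # prints True
# An entity followed by a property lookup is rejected.
print(matches_format('Entity["Species", "Species:HomoSapiens"]'
                     '[EntityProperty["Species", "ScientificName"]]'))  # prints False
```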
Data type
Why isn't this an external identifier? --99of9 (talk) 01:21, 1 May 2019 (UTC)
- @99of9: I think we should change it. Somehow the creation discussion and its review miss it/didn't fix it. --- Jura 11:38, 11 August 2019 (UTC)
- @Jura1: Changing it now would be a big effort given the huge usage and all the mix-n-match sets. I was mostly curious as to whether there was a good reason. --99of9 (talk) 03:49, 12 August 2019 (UTC)
- There is a script that does that, it shouldn't impact mix-n-match. Many properties were converted with even more uses than this one. --- Jura 16:42, 12 August 2019 (UTC)
- I came with the same question: Wikidata property related to software (Q21126229) seems incorrect for this item. The proposal clearly talks about it as an ID. Conceptually, from Wolfram's point of view it's a more complex search string, but with the restrictions added for WD, Wolfram Language entity code (P4839) should be reclassified and renamed from "code" to "ID". (Note that the Wolfram Language unit code (P7007) is slightly different to me: the proposal talks about code rather than an ID. The discussion on that item doesn't exactly change the outcome here.)
- @Jura1: Can you take care of it, raising the consensus discussion elsewhere if needed? Jagulin (talk) 04:17, 22 August 2019 (UTC)
- I left a note on Project chat [1]. --- Jura 08:52, 22 August 2019 (UTC)
- I created a ticket to change the datatype. Lea Lacroix (WMDE) (talk) 14:22, 30 September 2019 (UTC)
Weak oppose (struck, changed to Support). Hi. Just saw this, so I hope I'm not too late to the discussion. ("weak" oppose because there might be some higher principle that is stronger than the argument I want to make.) The string stored in this property is actual Wolfram Language (Q15241057) code, as can be seen in the following:
In[1]:= Needs["GraphStore`"]
In[2]:= result = SPARQLExecute[
"https://query.wikidata.org/sparql",
"select * where { wd:Q937 wdt:P4839 ?entity }"
]
Out[2]= {<|"entity" -> "Entity[\"Person\", \"AlbertEinstein::6tb7g\"]"|>}
In[3]:= entityCode = result[[1, "entity"]]
Out[3]= "Entity[\"Person\", \"AlbertEinstein::6tb7g\"]"
In[4]:= SyntaxQ[entityCode]
Out[4]= True
In[5]:= entity = ToExpression[entityCode]
Out[5]= Entity["Person", "AlbertEinstein::6tb7g"]
The second input queries the Wolfram Language entity code (P4839) for Albert Einstein (Q937). The third input extracts the single result and stores it in the variable entityCode. SyntaxQ is a built-in function that confirms that this is valid code. Note that this "code" is not usable yet in the language to identify anything: It first has to be parsed, which is what ToExpression does. So I think this demonstrates that this is "code", just as with Wolfram Language unit code (P7007). Toni 001 (talk) 11:17, 2 October 2019 (UTC)
Now that I showed some code, I wanted to hint that in production-quality code one would not use the single-argument ToExpression[...], but rather the safer ToExpression[..., HoldComplete]. This is important because this property could be filled with any malicious expression, say Quit. The HoldComplete wrapper prevents evaluation and gives an opportunity to inspect the expression. Luckily, a whitelist of certain patterns will be sufficient. Here is a demonstration of how the value of this property can be safely converted to an expression:
(* wrapper that applies a predicate to an unevaluated argument *)
unev[pred_] := Function[Null, pred[Unevaluated[#]], HoldAllComplete];
(* strong patterns for strings and integers *)
$str = _String?(unev[StringQ]);
$int = _Integer?(unev[IntegerQ]);
(* checks whether all leaves of a held expression match a pattern *)
safeExprQ[expr_HoldComplete, patt_] := MatchQ[
  Level[expr, {-1}, HoldComplete, Heads -> True],
  HoldComplete[HoldComplete, patt ..]
];
safeExprQ[_, _] := False;
(* safely convert a string to an expression *)
FromEntityCode[entityCode_String] := Module[
  {res},
  (* parse the string but keep the result held, so nothing is evaluated *)
  res = Quiet[ToExpression[entityCode, InputForm, HoldComplete]];
  If[
    Or[
      FailureQ[res],
      ! safeExprQ[res, $str | $int | Entity | List],
      ! MatchQ[res, HoldComplete[Entity[_String, _]]]
    ],
    (* the second argument makes Return exit the enclosing Module *)
    Return[
      Failure["InvalidEntityCode", <|
        "MessageTemplate" -> "The value `EntityCode` for the entity code (P4839) is invalid.",
        "MessageParameters" -> <|"EntityCode" -> entityCode|>
      |>],
      Module
    ];
  ];
  (* only whitelisted leaves remain, so releasing the hold is safe *)
  res = ReleaseHold[res];
  res
];
Example:
In[7]:= FromEntityCode["Entity[\"Person\", \"AlbertEinstein::6tb7g\"]"]
Out[7]= Entity["Person", "AlbertEinstein::6tb7g"]
In[8]:= FromEntityCode["weird"]
Out[8]= Failure["InvalidEntityCode", ...]
Toni 001 (talk) 12:10, 2 October 2019 (UTC)
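The same whitelist idea can be sketched outside the Wolfram Language for scripts that only need the type and canonical name and must never evaluate anything. This is a rough Python sketch, not an equivalent of FromEntityCode above: parse_entity_code is a hypothetical name, and the accepted shapes (a quoted string, or a braced list of quoted strings and integers) are an assumption based on the format constraint of this property.

```python
import re

# Whitelisted leaf shapes: a double-quoted string without quotes or backslashes,
# or a non-negative integer (assumption, modelled on the format constraint).
_TOKEN = r'"[^"\\]*"|\d+'
_ENTITY = re.compile(
    r'Entity\[\s*"(?P<type>[A-Z][A-Za-z]*)"\s*,\s*'
    r'(?P<name>"[^"\\]*"|\{\s*(?:%s)(?:\s*,\s*(?:%s))*\s*\})\s*\]' % (_TOKEN, _TOKEN)
)


def parse_entity_code(code):
    """Return (type, canonical_name), or None if the code is not whitelisted.

    The input is only matched against a fixed pattern and never evaluated.
    """
    match = _ENTITY.fullmatch(code.strip())
    if match is None:
        return None
    name = match.group("name")
    if name.startswith("{"):
        # Braced list: convert each leaf to an int or an unquoted string.
        leaves = re.findall(_TOKEN, name)
        name = [int(leaf) if leaf.isdigit() else leaf[1:-1] for leaf in leaves]
    else:
        name = name[1:-1]  # strip the surrounding quotes
    return match.group("type"), name


print(parse_entity_code('Entity["Person", "AlbertEinstein::6tb7g"]'))
print(parse_entity_code("Quit"))  # prints None
```

Anything outside the whitelisted shape, including bare symbols such as Quit, is rejected rather than interpreted.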
- Thanks, Toni. I don't quite understand what the issue is though. Are you saying we should be storing something else as the identifier than we currently are? If so what? (Sorry I'm not familiar enough with WA to tell right away.) --Lydia Pintscher (WMDE) (talk) 13:42, 2 October 2019 (UTC)
- Some comments (nothing that would prevent making this an ID, but things worth considering):
- This property is very useful as it is right now, so I'm not proposing any change. (In fact, at Wolfram Research (Q1367937) we are looking at it very carefully and might be using it for upcoming functionality.) My point is: "yes, this is code". My weaker point is that "code and ID might be conflicting concepts", but this might depend on the context. This can be seen when looking at a few examples like "ISBN-10", "CAS Registry Number", ...; putting this property in the list makes it stand out because it is the only one that can't be used as it is, but needs to be "parsed".
- Another way to look at it is that an entity actually consists of two IDs: A "type" (first argument of the function Entity) and a "canonical name" (second argument); the latter is defined only in the context of the former. But creating individual ID properties for those two parts would not help at the moment because the "canonical name" is not always a string.
- Finally, Wolfram Alpha (Q207006) accepts natural language input, but as an additional feature it accepts (a subset of) WL code; that's why there can be a formatter URL (P1630) for this property.
Toni 001 (talk) 14:34, 2 October 2019 (UTC)
- Unless we should actually be storing something else (which was extensively debated on creation) or this property should be split into several properties, I don't see what should prevent us from using Wikidata's external-id datatype. The current property values uniquely identify at WA. --- Jura 15:34, 4 October 2019 (UTC)
- @Toni 001, Lydia Pintscher (WMDE): can we move ahead with this? --- Jura 10:29, 8 October 2019 (UTC)
- No real objection from me, but see also my comment on this proposal, where I make this ID vs. code distinction. Toni 001 (talk) 14:11, 8 October 2019 (UTC)
- By the way, I don't want to be holding up this change, especially if I'm the only one being a little skeptical. Maybe I'm missing a point in this discussion: Is it just for the values being listed in the "identifiers" section? Is there a technical / philosophical / practical / ... reason? My reasons would fall into the "(very) technical" category. Toni 001 (talk) 15:46, 8 October 2019 (UTC)
- Maybe there could be a "computer code" datatype for all properties whose values are valid syntax in some programming language. It could be just like "monolingual text", but instead of the (human) language an item representing the programming language is specified. Toni 001 (talk) 13:56, 9 October 2019 (UTC)
Ok so to make sure I understand this correctly: WA doesn't actually have unique IDs for entities? There is only a type plus an ID that uniquely identifies an entity? Is that the root of the issue? --Lydia Pintscher (WMDE) (talk) 10:55, 10 October 2019 (UTC)
- @Lydia Pintscher (WMDE): That summary sounds correct to me. To reformulate that in my programmer's mind: There is no ID, say 123abc, which could be fed anywhere to produce the expression/code/function call Entity[type, canonicalName]. The formatter URL works not because the string "Entity[\"type\", ...]" is an ID (well, depending on the precise definition of ID, of course), but because the natural language parser recognizes that fragment as code, finds the head to be Entity, then resolves the two arguments. The first argument, the type, does have official URLs, for instance Planet. Toni 001 (talk) 11:44, 10 October 2019 (UTC)
- There are other identifiers that have various parts that form an identifier in combination. Given the above, can we also convert Wolfram Language unit code (P7007)? BTW, a property for Wolfram Language entity types would probably be useful as well. --- Jura 09:40, 13 October 2019 (UTC)
- I changed my mind. Being code or a composite identifier does not make this less an identifier. So I support changing this to an external ID. I would like to keep the current name though, to emphasize the code-character. Toni 001 (talk) 10:05, 15 October 2019 (UTC)
- I went ahead and made a proposal for types: Wikidata:Property proposal/Wolfram Language entity type. --- Jura 06:47, 19 October 2019 (UTC)
Done thanks to all involved, notably @Lea Lacroix (WMDE), Lydia Pintscher (WMDE), Ladsgroup: at WMDE. --- Jura 11:40, 15 November 2019 (UTC)
Mixnmatch P2264
This item is getting clogged up with numerous Mix'n'match catalog ID (P2264) statements. What is the purpose of those? In the original example P2264 is used to identify AAT. For this item, P2264 seems to list any query that uses Wolfram. Should they be removed? Jagulin (talk) 05:20, 22 August 2019 (UTC)
- Even there, it's just meant to link to a catalogue: Wikidata:Property_proposal/Archive/39#P2264. As the scope of P4839 is larger, I suppose it was broken down into several. This also allows applying a distinct P31 value for each. --- Jura 11:57, 23 August 2019 (UTC)
How do I find the entity code?
I'd like to add the entity code for Pomona College (Q7227384), but I can't figure out how to find it. Could we please add some instructions here? Courtesy ping IvanP. Sdkb talk 23:50, 14 July 2020 (UTC)
- @Sdkb: Type Pomona College into Wolfram|Alpha, mouse over the input interpretation, then click on Plain Text (bottom right corner): Entity["University", "PomonaCollege::jycd5"]. -- IvanP (talk) 20:07, 16 July 2020 (UTC)
- @IvanP: Thanks! I added usage instructions to this property. Sdkb talk 20:28, 16 July 2020 (UTC)
Missing reference for number of entries in the database
https://www.wikidata.org/wiki/Property:P4839#P4876 could someone fix that? --So9q (talk) 09:15, 27 January 2021 (UTC)
Shouldn't wiki entries with instance: Human only accept the Entity["Person"] wolfram language code?
[Gaye] for example has 2 entries, one for the music act and one for the person as entity. LuukH87 (talk) 14:15, 20 January 2022 (UTC)
Canonical and alternate names
Some entities accept alternate names in the second argument. Send it through EntityValue[...] to determine what the canonical name is:
In[3]:= Internal`ClearEntityValueCache[]
In[4]:= EntityValue[Entity["Language", "French"]]
Out[4]= Entity["Language", "French::367gk"]
Toni 001 (talk) 05:56, 6 April 2022 (UTC)
- This requires a local Wolfram installation to do the conversion. I have a script that converts between Wolfram code and Wikidata entities, and it breaks when only the canonical form is present. I suggest we keep all forms that the Wolfram Language accepts, but mark the canonical form as the preferred statement. Vincent cloutier (talk) 18:14, 6 April 2022 (UTC)