Property talk:P5187
word stem of the subject lexeme
Represents | word stem (Q210523)
---|---
Data type | Monolingual text
Example | According to this template: when possible, data should only be stored as statements
See also | conjugation class (P5186), combines lexemes (P5238)
Proposal discussion | Proposal discussion
List of violations of this constraint: Database reports/Constraint violations/P5187#Scope, SPARQL
List of violations of this constraint: Database reports/Constraint violations/P5187#lexical category
This property is being used by: Please notify projects that use this property before big changes (renaming, deletion, merge with another property, etc.) |
Language of the stem should be the same as the language of the main lemma. (Help)
Violations query:
SELECT * WHERE {
  ?item wikibase:lemma ?lemma ;
        wdt:P5187 ?stem .
  BIND ( SUBSTR(LANG(?lemma), 1, 2) AS ?langLemma )
  BIND ( SUBSTR(LANG(?stem), 1, 2) AS ?langStem )
  FILTER (
    ( ?langLemma != ?langStem )
    && ( ?langLemma != 'pn' )
    && ( ?langLemma != 'pa' )
    && ( ?langLemma != 'hi' )
    && ( ?langLemma != 'ur' )
    && ( ?langLemma != 'tg' )
    && ( ?langLemma != 'fa' )
  )
}
List of violations of this constraint: Database reports/Complex constraint violations/P5187#Inconsistency between the languages
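The logic of the violations query above can be sketched outside SPARQL. The following is a minimal Python illustration, not part of the constraint itself; the function name `violates_language_constraint` and the constant `EXEMPT_LEMMA_PREFIXES` are hypothetical names chosen for this example. It assumes language tags are passed in as plain strings, as `LANG()` would return them.

```python
# Two-letter prefixes of lemma languages that the violations query exempts.
EXEMPT_LEMMA_PREFIXES = {"pn", "pa", "hi", "ur", "tg", "fa"}


def violates_language_constraint(lemma_lang: str, stem_lang: str) -> bool:
    """Return True when the pair would be reported by the violations query.

    Mirrors SUBSTR(LANG(...), 1, 2): only the first two characters of each
    language tag are compared, so "ja-hira" and "ja" count as the same.
    Only the lemma's prefix is checked against the exemption list.
    """
    lemma_prefix = lemma_lang[:2]
    stem_prefix = stem_lang[:2]
    return lemma_prefix != stem_prefix and lemma_prefix not in EXEMPT_LEMMA_PREFIXES


if __name__ == "__main__":
    print(violates_language_constraint("ja-hira", "ja"))  # same prefix
    print(violates_language_constraint("la", "en"))       # different prefix
    print(violates_language_constraint("tg-cyrl", "fa"))  # exempt lemma language
```

Note that the exemption applies to the lemma's language only, so a stem tagged with an exempt language on a non-exempt lemma is still reported.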
Format: "porta", "port-a", or "port[a]"
Which form should be used on Lexeme:L1642 (Latin portare)? I think an indication of the part of the stem that is replaced in conjugation would be helpful. I used "port-a" for now.
--- Jura 11:04, 27 May 2018 (UTC)
- @Jura1: first, a detail: this property is limited to Japanese. It would be preferable to change that before using it on non-Japanese verbs. Then, good question. "port-a" could fit. But wouldn't it be better to use multiple values? Here it would be "port" and "porta" (the perfect stem and the present stem, if I recall my Latin correctly), which can be indicated with a qualifier.
- Cdlt, VIGNERON (talk) 13:55, 2 June 2018 (UTC)
Link missing
@Jura1, VIGNERON: This property was proposed with the motivation: "needed to determine the manner of conjugation of Japanese verbs and adjectives". If we use it for word stems in general, how can I create a link from Assoziationsüberschuss (Lexeme:L2169) to:
- Assoziation (Lexeme:L2171)
- Überschuss (Lexeme:L2170)
--Zitatesammler (talk) 13:28, 2 June 2018 (UTC)
- @Zitatesammler: I wouldn't use it for « word stems in general », but only for « stem of a verb in general » (and even that is debatable).
- Anyhow, this property is a monolingual string, so it's not possible to link to lexemes. In your case, I think these are not really stems but compounds, and there is a proposal for that: Wikidata:Property proposal/compound of.
- Cdlt, VIGNERON (talk) 13:47, 2 June 2018 (UTC)
Extension to all verbs
@Okkn, Jura1, Fnielsen, Şêr, Kareyac: (top-5 users of this property)
Any objection to extending this property to all verbs? (and maybe all lexemes?)
At the least, I suggest adding non-Japanese examples (which can be taken from WhatLinksHere).
Cheers, VIGNERON (talk) 15:24, 13 June 2018 (UTC)
- @VIGNERON: I think there is no problem to use this property on non-Japanese lexemes. In addition, I have added this property not only to Japanese verbs, but also to Japanese adjectival nouns (such as Lexeme:L2454). --Okkn (talk) 15:41, 13 June 2018 (UTC)
- @VIGNERON: Support in general. Can it conflict with derived from lexeme (P5191)? - Kareyac (talk) 16:28, 13 June 2018 (UTC)
- Word stems cannot be an independent Lexeme, so this property won't conflict with derived from lexeme (P5191). --Okkn (talk) 04:23, 14 June 2018 (UTC)
- @Kareyac: How could it conflict? As I see it, the first property is about etymology while the second is about morphology, and the first property links to another lexeme while the second is a monolingual text. But maybe I'm missing something here. Cdlt, VIGNERON (talk) 14:23, 14 June 2018 (UTC)
- @Okkn: @VIGNERON: Now it's clearer to me. Thanks for the explanation. - Kareyac (talk) 14:32, 14 June 2018 (UTC)
- Support. I must admit that I did not see that it was restricted to Japanese verbs. I have used it for all kinds of lexemes. The Q item linked (word stem (Q210523)) is the general notion of stem, not just Japanese. I am wondering whether there is something special about the Japanese notion of stem that warrants its own property? -- Finn Årup Nielsen (fnielsen) (talk) 16:49, 13 June 2018 (UTC)
- That is because the original proposal of this property intended to use this property on Japanese lexemes. Of course I think there is no need to restrict it to Japanese words. --Okkn (talk) 04:23, 14 June 2018 (UTC)
Done: documentation changed, see Special:Diff/695698956. Cdlt, VIGNERON (talk) 14:23, 14 June 2018 (UTC)
Multiple values
Hi y'all,
@Fnielsen, Okkn, Mitrolayzing, Liamjamesperritt, Circeus, So9q: @Shlomo, Adrijaned, Iniquity, Pamputt: (top-10 users of this property)
Some cases are simple and there is only one stem, but what is the best way to use this property when there are multiple values?
Here are some examples I found:
- تاجیکستان/Тоҷикистон/Toçikiston (L220599) or زبان تاجیکی/забони тоҷикӣ/zaboni toçikī (L230362) multi-script Tajik nouns: 3 values, obviously one for each script, strange but no major trouble here
- French irregular verbs
- ennuyer (L9346) a common French irregular verb: 2 values given, all verbs ending in -uyer and -oyer are in the same case, the "y" becomes a "i" when the next letter is a silent "e", can it be specified with a qualifier?
- aller (L750) a French irregular *and* suppletive verb (rare case): 3 values given at the Lexeme level with no way to know when and how they apply, should we put qualifiers? (and which one, there is not really a rule here - at least not a simple one that I can see) or move the property to the Form level? (what has been done for the derived from lexeme (P5191) which follows the same irregular pattern).
- patrino (L270299) an Esperanto noun (by ThelmOSO): 3 values given, I guess only the first is really the "stem", or is it? And if not, how to store this data? (fixed)
- atl (L8355) a Nahuatl noun: 2 values, the second has the qualifier has characteristic (P1552) = exception (Q779608), which is kind of helpful but not very precise
I came here after a discussion with Uziel302 about Lexemes like ales (L256931), where he put the information in the representation of each form. At least it is specific, but it's a bit strange; it doesn't feel to me like the right place.
What do you think? any idea, comment, remark?
Cheers, VIGNERON (talk) 15:40, 1 April 2020 (UTC)
- I think we should have stem both on lexeme level for the usual case and on the form level for cases of different stems for different forms. Uziel302 (talk) 16:02, 1 April 2020 (UTC)
- I fixed patrino (L270299). It has only one stem "patrin" at least as I understand stems from the wikipedia page about them.--So9q (talk) 16:55, 1 April 2020 (UTC)
- Hi @So9q, VIGNERON: in Esperanto (and some other languages, I suppose) every part of a word can be a stem (Q111029)/root (Q210523) (-in- -> ino / ina / ineto etc.). In my opinion it is worth mentioning individual morphemes/roots (maybe there is a better tool for it?). Wiktionaries for example: eo: patrino, en: patrino, pl: patrino.
In this case, "the main root/stem" is patr- (or patrin-?), I agree with that. However, there are also compound words like vapor-o-ŝip-o, or reĝ-id-in-o and reĝ-in-id-o. Unfortunately I don't know what to do with words like these (maybe something like: root (P5920), combines lexemes (P5238)?) --ThelmOSO (talk) 21:20, 1 April 2020 (UTC)
- @ThelmOSO: yes, in almost all languages various parts of a word *can be* a stem, but that doesn't mean they *actually are*. When you describe a specific form of a word, there is one and only one stem (I don't know of any exception), and usually a lexeme also has only one stem (but here there are exceptions, which is exactly what this discussion is about). Thanks @So9q: for the fix. One remark: I would have used combines lexemes (P5238) and not derived from lexeme (P5191) (and no, root (P5920) is something entirely different, it's closer to root, not stem; sadly English only has 2 words for that so it's a bit confusing, in French we have 3 words: "racine", "radical" and "thème").
- Cdlt, VIGNERON (talk) 08:48, 2 April 2020 (UTC)
- @VIGNERON: I agree :) I meant to use combines, fixed.--So9q (talk) 20:59, 7 April 2020 (UTC)
- Comment: actually the 3 stems in زبان تاجیکی/забони тоҷикӣ/zaboni toçikī (L230362) are not from different scripts, they are from different words in the phrase. Is it an appropriate use? --Infovarius (talk) 02:26, 3 April 2020 (UTC)
- @Infovarius: thanks for noticing, I looked too quickly and thought تاجیکستان/Тоҷикистон/Toçikiston (L220599) and زبان تاجیکی/забони тоҷикӣ/zaboni toçikī (L230362) were in the same situation, but indeed the second one is not the same. I don't know Tajik, but isn't the solution to move this data to the property combines lexemes (P5238)? (a bit like for the Esperanto, q.v. supra). @Mitrolayzing: could you give some clarification here? Cheers, VIGNERON (talk) 14:58, 3 April 2020 (UTC)
- Agree, combines lexemes (P5238) is better. --Infovarius (talk) 22:12, 4 April 2020 (UTC)
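Uziel302's suggestion in this thread (a stem on the lexeme level for the usual case, and on the form level where forms differ) amounts to a lookup with fallback. Here is a minimal Python sketch of that idea. It assumes a hypothetical in-memory structure (`Lexeme`, `Form`, `stem_for`), not the actual Wikibase data model, and the stem strings given for ennuyer are illustrative values, not a copy of the statements on L9346.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Form:
    representation: str
    stem: Optional[str] = None  # hypothetical form-level stem, if any


@dataclass
class Lexeme:
    lemma: str
    stem: Optional[str] = None  # hypothetical lexeme-level default stem
    forms: List[Form] = field(default_factory=list)


def stem_for(lexeme: Lexeme, form: Form) -> Optional[str]:
    """Prefer the form-level stem; fall back to the lexeme-level one."""
    return form.stem if form.stem is not None else lexeme.stem


# Illustrative data: the "y" becomes "i" before a silent "e".
ennuyer = Lexeme(
    lemma="ennuyer",
    stem="ennuy",
    forms=[Form("ennuyons"), Form("ennuie", stem="ennui")],
)

for form in ennuyer.forms:
    print(form.representation, stem_for(ennuyer, form))
```

With this layout a regular lexeme needs only one value, while a suppletive verb such as aller could carry a value on each form that deviates from the default.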
Equivalent property to http://www.lexinfo.net/ontology/2.0/lexinfo#radical ?
Before I add this as an equivalent property (P1628) statement: I think this is the same as http://www.lexinfo.net/ontology/2.0/lexinfo#radical, but I am double-checking with others here. I noticed "radical" as an alias in French, and "radical" is currently a messy linguistic term in Latin-based languages, so I'm not 100% sure of the equivalence until I can get someone's "yes".
- @Thadguidry: sorry for the late reply. It seems that this Lexinfo property is not the equivalent of word stem (P5187) but of radical (P5280). Cheers, VIGNERON (talk) 06:34, 15 August 2022 (UTC)
- @VIGNERON Ah! Thanks, I've added it to radical (P5280) Thadguidry (talk) 16:20, 15 August 2022 (UTC)