Please provide an up-to-date list or table of fallback languages
Closed, Resolved (Public)

Description

I need something like https://www.mediawiki.org/wiki/Localisation_statistics , or like https://commons.wikimedia.org/wiki/Module:Fallbacklist - basically, a single page (not a visual) that allows me to quickly search for a language name and figure out which fallback language/cluster it belongs to.

If I only had to check a few languages, I'm aware, per Amir, that I could manually look for the $fallback line in the relevant file in https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/refs/heads/master/languages/messages/ , but that is not feasible for dozens or hundreds of checks.

If we had an up-to-date list, we may be able to build better delivery lists for MassMessage rather than always resorting to using English to message wikis. (Think https://meta.wikimedia.org/wiki/Distribution_list/Global_message_delivery/fr, but it also has Wolof and others.)
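
As a rough illustration of how such a list could feed delivery lists, here is a minimal Python sketch that groups language codes by their first fallback. The `fallbacks` dictionary is a hand-written sample (the pairs are taken from the list later in this task), and the function name is hypothetical; this is not an existing MassMessage feature.

```python
from collections import defaultdict


def group_by_first_fallback(fallbacks):
    """Group language codes under the first language they fall back to."""
    groups = defaultdict(list)
    for code, chain in sorted(fallbacks.items()):
        if chain:  # skip languages with no fallback at all
            groups[chain[0]].append(code)
    return dict(groups)


# Sample data only; a real run would use the full, current list.
sample = {"wo": ["fr"], "br": ["fr"], "an": ["es"], "en": []}
clusters = group_by_first_fallback(sample)
print(clusters.get("fr", []))
```

Grouping on the first fallback only is a design choice for the sketch; a delivery list could just as well include every language whose chain mentions the target language anywhere.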

If folks know how to build such a resource, even though they can't personally do it - please just document it? Thanks.

Event Timeline

@Legoktm The way info is available at https://commons.wikimedia.org/wiki/Module:Fallbacklist works for me, if it makes sense.
I believe that one may be based on the data the .svg uses, and Amir commented it's too outdated and didn't recommend using it.

@Amire80 I notice that @Nikerabbit actually updated https://www.mediawiki.org/wiki/File:MediaWiki_fallback_chains.svg back then.
It's still not in a non-visual format, but other than that I would have hoped for it to be accurate.
However, I see that Ukrainian is still shown falling back to Russian, which I thought was no longer the case?

It's not uk -> ru, but rue -> uk, ru. The graph cannot distinguish such cases.

A text-based version of it would, though.
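
To make the ambiguity concrete, here is a small Python sketch. It assumes (this is a guess about how the graph is drawn, not something confirmed in this thread) that a chain is rendered as consecutive arrows, so `rue: uk, ru` becomes rue -> uk -> ru. The function name is hypothetical.

```python
def edges_as_drawn(fallbacks):
    """Flatten ordered fallback chains into the arrows a graph would show."""
    edges = set()
    for code, chain in fallbacks.items():
        previous = code
        for fallback in chain:
            edges.add((previous, fallback))
            previous = fallback
    return edges


# Only rue has a chain here; uk has no fallback of its own.
actual = {"rue": ["uk", "ru"]}
# A different configuration in which uk itself falls back to ru.
misread = {"rue": ["uk"], "uk": ["ru"]}

# Both configurations produce the same arrows, so the picture alone
# cannot tell them apart; the ordered text form can.
print(edges_as_drawn(actual) == edges_as_drawn(misread))
```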

developer@dev:~/mediawiki/workdir/languages/messages (master)$ grep '$fallback ' * | sed 's/^Messages\(.*\)\..* '"'"'\(.*\).;.*/\L\1: \2/;s/_/-/'
ab: ru
abs: id
ace: id
ady: ady-cyrl
aeb-arab: ar
aeb: aeb-arab
aln: sq
alt: ru
ami: zh-hant
an: es
anp: hi
arn: es
arq: ar
ary: ar
arz: ar
ast: es
atj: fr
avk: fr, es, ru
av: ru
awa: hi
ay: es
azb: fa
ban-bali: ban
ban: id
ba: ru
bar: de
bbc-latn: id
bbc: bbc-latn
bcc: fa
be-tarask: be
bgn: fa
bh: bho
bi: en
bjn: id
bm: fr
bpy: bn
bqi: fa
br: fr
btm: id
bug: id
bxr: ru
ca: oc
MessagesCbk-zam.php:$fallback = "es";
cdo: nan, zh-hant
ce: ru
co: it
crh-cyrl: ru
crh: crh-latn
csb: pl
cs: sk
cv: ru
de-at: de
de-ch: de
de-formal: de
dsb: hsb, de
dtp: ms
dty: ne
egl: it
eml: it
en-ca: en
en-gb: en
MessagesEn.php:$fallback = false;
es-formal: es
ext: es
ff: fr
fit: fi
frc: fr
frp: fr
frr: de
fur: it
gag: tr
gan-hans: zh-hans
gan-hant: zh-hant, zh-hans
gan: gan-hant, zh-hant, zh-hans
gcr: fr
glk: fa
gl: pt
gn: es
gom-deva: hi
gom: gom-deva
gor: id
gsw: de
hak: zh-hant
hif: hif-latn
hrx: de
hsb: dsb, de
ht: fr
hu-formal: hu
hyw: hy
ii: zh-cn, zh-hans
inh: ru
io: eo
iu: ike-cans
jam: en
jut: da
jv: id
kaa: kk-latn, kk-cyrl
kab: fr
kbd-cyrl: ru
kbd: kbd-cyrl
kbp: fr
khw: ur
kiu: tr
kjp: my
kk-arab: kk-cyrl
kk-cn: kk-arab, kk-cyrl
kk-kz: kk-cyrl
kk-latn: kk-cyrl
kk: kk-cyrl
kk-tr: kk-latn, kk-cyrl
kl: da
koi: ru
ko-kp: ko
krc: ru
krl: fi
ksh: de
ks: ks-arab
ku-arab: ckb
kum: ru
ku: ku-latn
kv: ru
lad: es
lbe: ru
lb: de
lez: ru, az
lij: it
li: nl
liv: et
lki: fa
lld: it, rm, fur
lmo: pms, eml, lij, vec, it
ln: fr
lrc: fa
ltg: lv
luz: fa
lzh: zh-hant
lzz: tr
mai: hi
map-bms: jv, id
mdf: myv, ru
mg: fr
mhr: mrj, ru
min: id
mnw: my
mo: ro
mrj: mhr, ru
mwl: pt
myv: mdf, ru
mzn: fa
nah: es
nan: cdo, zh-hant
nap: it
nb: nn
nds-nl: nl
nds: de
nl-informal: nl
nn: nb
nrm: fr
oc: ca, fr
olo: fi
os: ru
pcd: fr
pdc: de
pdt: de
pfl: de
pih: en
pms: it
pnt: el
pt-br: pt
pt: pt-br
qug: qu, es
qu: qug, es
rgn: it
rmy: ro
roa-tara: it
rue: uk, ru
rup: ro
ruq-cyrl: mk
ruq-latn: ro
ruq: ruq-latn, ro
sah: ru
sa: hi
scn: it
sco: en
sdc: it
sdh: cbk, fa
ses: fr
sg: fr
sgs: lt
sh: bs, sr-el, hr
shy-latn: fr, arq
sk: cs
skr-arab: ur, pnb
skr: skr-arab
sli: de
srn: nl
sr: sr-ec
stq: de
sty: ru
su: id
szl: pl
szy: zh-tw, zh-hant, zh-hans
tay: zh-hant
tcy: kn
tet: pt
tg: tg-cyrl
tly: ru
trv: zh-hant
tt-cyrl: ru
tt: tt-cyrl, ru
ty: fr
tyv: ru
udm: ru
ug: ug-arab
vec: it
vep: et
vls: nl
vmf: de
vot: fi
vro: et
wa: fr
wo: fr
wuu: zh-hans
xal: ru
xmf: ka
yi: he
za: zh-hans
zea: nl
zgh: kab
zh-cn: zh-hans
zh-hant: zh-hans
zh-hk: zh-hant, zh-hans
zh-mo: zh-hk, zh-hant, zh-hans
zh-my: zh-sg, zh-hans
zh: zh-hans
zh-sg: zh-hans
zh-tw: zh-hant, zh-hans
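
Two lines in the output above (MessagesCbk-zam.php and MessagesEn.php) passed through the sed expression unchanged, since one value is double-quoted and the other is `false`. As a possible alternative, here is a minimal Python sketch that parses the same `grep` output and handles both cases. The function name and the choice to represent "no fallback" as an empty list are assumptions made for illustration.

```python
import re

# Matches grep output such as:  MessagesAvk.php:$fallback = 'fr, es, ru';
FALLBACK_LINE = re.compile(
    r"^Messages(?P<name>[^.]+)\.php:\$fallback\s*=\s*(?P<value>.+?);"
)


def parse_fallback_line(line):
    """Return (language code, list of fallback codes), or None if no match."""
    match = FALLBACK_LINE.match(line.strip())
    if match is None:
        return None
    # File names use underscores and a capital letter; codes use hyphens.
    code = match.group("name").lower().replace("_", "-")
    value = match.group("value").strip()
    if value == "false":
        return code, []  # explicit "no fallback", as seen for en
    # Strip one pair of matching single or double quotes.
    if len(value) >= 2 and value[0] in "'\"" and value[-1] == value[0]:
        value = value[1:-1]
    chain = [part.strip() for part in value.split(",") if part.strip()]
    return code, chain


print(parse_fallback_line("MessagesAvk.php:$fallback = 'fr, es, ru';"))
print(parse_fallback_line('MessagesCbk_zam.php:$fallback = "es";'))
print(parse_fallback_line("MessagesEn.php:$fallback = false;"))
```

Feeding it the lines from `grep '$fallback ' *` run in languages/messages should give one pair per file; lines that do not match return None rather than leaking through.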

TYSM! I will figure out where to put this ASAP ^_^

I've clarified the link (formerly just "(grep)") in the last sentence of https://www.mediawiki.org/wiki/Manual:Language#Fallback_languages for the link to the code search.
IIUC that link is giving the same output as Nikerabbit's comment above? If so, that might cover it? Alternatively we could add a link to the comment above, into that page, but that will rapidly get outdated.

That doesn't look like a friendly page for someone as lost as I was. But agreed that we may not need much more than a link at this time.

Pginer-WMF claimed this task.
Pginer-WMF subscribed.

Based on the previous comments I'll close the ticket, but feel free to reopen if additional details are needed.

Using this list, I converted each language code to its language name (using this table as a reference). See the graph visualization: https://altilunium.github.io/MediaWiki-Language-Fallback/

Lombard -> Piedmontese -> Emiliano-Romagnolo -> Ligurian -> Venetian -> Italian
Kotava -> French -> Spanish -> Russian
Gan Chinese -> Gan (Traditional) -> Traditional Chinese -> Simplified Chinese
Ladin -> Italian -> Romansh -> Friulian
Serbo-Croatian -> Bosnian -> Serbian (Latin script) -> Croatian
Sakizaya -> Chinese (Taiwan) -> Traditional Chinese -> Simplified Chinese
Chinese (Macau) -> Chinese (Hong Kong) -> Traditional Chinese -> Simplified Chinese
Min Dong Chinese -> Min Nan Chinese -> Traditional Chinese
Lower Sorbian -> Upper Sorbian -> German
Gan (Traditional) -> Traditional Chinese -> Simplified Chinese
Upper Sorbian -> Lower Sorbian -> German
Sichuan Yi -> Chinese (China) -> Simplified Chinese
Kara-Kalpak -> Kazakh (Latin script) -> Kazakh (Cyrillic script)
Kazakh (China) -> Kazakh (Arabic script) -> Kazakh (Cyrillic script)
Kazakh (Turkey) -> Kazakh (Latin script) -> Kazakh (Cyrillic script)
Lezghian -> Russian -> Azerbaijani
Basa Banyumasan -> Javanese -> Indonesian
Moksha -> Erzya -> Russian
Eastern Mari -> Western Mari -> Russian
Western Mari -> Eastern Mari -> Russian
Erzya -> Moksha -> Russian
Min Nan Chinese -> Min Dong Chinese -> Traditional Chinese
Occitan -> Catalan -> French
Chimborazo Highland Quichua -> Quechua -> Spanish
Quechua -> Chimborazo Highland Quichua -> Spanish
Rusyn -> Ukrainian -> Russian
Megleno-Romanian -> Megleno-Romanian (Latin script) -> Romanian
Southern Kurdish -> Chavacano -> Persian
Shawiya (Latin script) -> French -> Algerian Arabic
Saraiki (Arabic script) -> Urdu -> Western Punjabi
Tatar -> Tatar (Cyrillic script) -> Russian
Chinese (Hong Kong) -> Traditional Chinese -> Simplified Chinese
Chinese (Malaysia) -> Chinese (Singapore) -> Simplified Chinese
Chinese (Taiwan) -> Traditional Chinese -> Simplified Chinese
Abkhazian -> Russian
Ambonese Malay -> Indonesian
Achinese -> Indonesian
Adyghe -> Adyghe (Cyrillic script)
Tunisian Arabic (Arabic script) -> Arabic
Tunisian Arabic -> Tunisian Arabic (Arabic script)
Gheg Albanian -> Albanian
Southern Altai -> Russian
Amis -> Traditional Chinese
Aragonese -> Spanish
Angika -> Hindi
Mapuche -> Spanish
Algerian Arabic -> Arabic
Moroccan Arabic -> Arabic
Egyptian Arabic -> Arabic
Asturian -> Spanish
Atikamekw -> French
Avaric -> Russian
Awadhi -> Hindi
Aymara -> Spanish
South Azerbaijani -> Persian
Balinese (Balinese script) -> Balinese
Balinese -> Indonesian
Bashkir -> Russian
Bavarian -> German
Batak Toba (Latin script) -> Indonesian
Batak Toba -> Batak Toba (Latin script)
Southern Balochi -> Persian
Belarusian (Taraskievica orthography) -> Belarusian
Western Balochi -> Persian
Bhojpuri -> Bhojpuri
Bislama -> English
Banjar -> Indonesian
Bambara -> French
Bishnupriya -> Bangla
Bakhtiari -> Persian
Breton -> French
Batak Mandailing -> Indonesian
Buginese -> Indonesian
Russia Buriat -> Russian
Catalan -> Occitan
Zam -> Spanish
Chechen -> Russian
Corsican -> Italian
Crimean Tatar (Cyrillic script) -> Russian
Crimean Tatar -> Crimean Tatar (Latin script)
Kashubian -> Polish
Czech -> Slovak
Chuvash -> Russian
Austrian German -> German
Swiss High German -> German
German (formal address) -> German
Central Dusun -> Malay
Doteli -> Nepali
Emilian -> Italian
Emiliano-Romagnolo -> Italian
Canadian English -> English
British English -> English
Spanish (formal address) -> Spanish
Extremaduran -> Spanish
Fulah -> French
Tornedalen Finnish -> Finnish
Cajun French -> French
Arpitan -> French
Northern Frisian -> German
Friulian -> Italian
Gagauz -> Turkish
Gan (Simplified) -> Simplified Chinese
Guianan Creole -> French
Gilaki -> Persian
Galician -> Portuguese
Guarani -> Spanish
Goan Konkani (Devanagari script) -> Hindi
Goan Konkani -> Goan Konkani (Devanagari script)
Gorontalo -> Indonesian
Swiss German -> German
Hakka Chinese -> Traditional Chinese
Fiji Hindi -> Fiji Hindi (Latin script)
Hunsrik -> German
Haitian Creole -> French
Hungarian (formal address) -> Hungarian
Western Armenian -> Armenian
Ingush -> Russian
Ido -> Esperanto
Inuktitut -> Eastern Canadian (Aboriginal syllabics)
Jamaican Creole English -> English
Jutish -> Danish
Javanese -> Indonesian
Kabyle -> French
Kabardian (Cyrillic script) -> Russian
Kabardian -> Kabardian (Cyrillic script)
Kabiye -> French
Khowar -> Urdu
Kirmanjki -> Turkish
Eastern Pwo -> Burmese
Kazakh (Arabic script) -> Kazakh (Cyrillic script)
Kazakh (Kazakhstan) -> Kazakh (Cyrillic script)
Kazakh (Latin script) -> Kazakh (Cyrillic script)
Kazakh -> Kazakh (Cyrillic script)
Kalaallisut -> Danish
Komi-Permyak -> Russian
Korean (North Korea) -> Korean
Karachay-Balkar -> Russian
Karelian -> Finnish
Colognian -> German
Kashmiri -> Kashmiri (Arabic script)
Kurdish (Arabic script) -> Central Kurdish
Kumyk -> Russian
Kurdish -> Kurdish (Latin script)
Komi -> Russian
Ladino -> Spanish
Lak -> Russian
Luxembourgish -> German
Ligurian -> Italian
Limburgish -> Dutch
Livonian -> Estonian
Laki -> Persian
Lingala -> French
Northern Luri -> Persian
Latgalian -> Latvian
Southern Luri -> Persian
Literary Chinese -> Traditional Chinese
Laz -> Turkish
Maithili -> Hindi
Malagasy -> French
Minangkabau -> Indonesian
Mon -> Burmese
Moldovan -> Romanian
Mirandese -> Portuguese
Mazanderani -> Persian
Nahuatl -> Spanish
Neapolitan -> Italian
Norwegian Bokmal -> Norwegian Nynorsk
Low Saxon -> Dutch
Low German -> German
Dutch (informal address) -> Dutch
Norwegian Nynorsk -> Norwegian Bokmal
Norman -> French
Livvi-Karelian -> Finnish
Ossetic -> Russian
Picard -> French
Pennsylvania German -> German
Plautdietsch -> German
Palatine German -> German
Norfuk / Pitkern -> English
Piedmontese -> Italian
Pontic -> Greek
Brazilian Portuguese -> Portuguese
Portuguese -> Brazilian Portuguese
Romagnol -> Italian
Vlax Romani -> Romanian
Tarantino -> Italian
Aromanian -> Romanian
Megleno-Romanian (Cyrillic script) -> Macedonian
Megleno-Romanian (Latin script) -> Romanian
Yakut -> Russian
Sanskrit -> Hindi
Sicilian -> Italian
Scots -> English
Sassarese Sardinian -> Italian
Koyraboro Senni -> French
Sango -> French
Samogitian -> Lithuanian
Slovak -> Czech
Saraiki -> Saraiki (Arabic script)
Lower Silesian -> German
Sranan Tongo -> Dutch
Serbian -> Serbian (Cyrillic script)
Saterland Frisian -> German
Siberian Tatar -> Russian
Sundanese -> Indonesian
Silesian -> Polish
Tayal -> Traditional Chinese
Tulu -> Kannada
Tetum -> Portuguese
Tajik -> Tajik (Cyrillic script)
Talysh -> Russian
Taroko -> Traditional Chinese
Tatar (Cyrillic script) -> Russian
Tahitian -> French
Tuvinian -> Russian
Udmurt -> Russian
Uyghur -> Uyghur (Arabic script)
Venetian -> Italian
Veps -> Estonian
West Flemish -> Dutch
Main-Franconian -> German
Votic -> Finnish
Voro -> Estonian
Walloon -> French
Wolof -> French
Wu Chinese -> Simplified Chinese
Kalmyk -> Russian
Mingrelian -> Georgian
Yiddish -> Hebrew
Zhuang -> Simplified Chinese
Zeelandic -> Dutch
Standard Moroccan Tamazight -> Kabyle
Chinese (China) -> Simplified Chinese
Traditional Chinese -> Simplified Chinese
Chinese -> Simplified Chinese
Chinese (Singapore) -> Simplified Chinese
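
For anyone wanting to regenerate a list like the one above, the conversion step could be sketched roughly as follows in Python. The `names` dictionary here is a tiny hand-written sample, not the reference table used for the list, and the function name is hypothetical. Codes with no known name are left as codes, so a gap in the name table stays visible instead of producing a misleading label.

```python
def format_chain(code, chain, names):
    """Render one fallback chain as 'Name -> Name -> Name'."""
    labels = [names.get(item, item) for item in [code, *chain]]
    return " -> ".join(labels)


# Sample names only; a full run needs a complete code-to-name table.
names = {"rue": "Rusyn", "uk": "Ukrainian", "ru": "Russian"}

print(format_chain("rue", ["uk", "ru"], names))
print(format_chain("cbk-zam", ["es"], names))  # unknown codes stay as codes
```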

@Nikerabbit @abi_ Does the team have any means to turn that list into an image of some sort for better visualization? TY!

@Elitre There's the image at https://commons.wikimedia.org/wiki/File:MediaWiki_fallback_chains.svg that is included in many documentation pages. It was last updated in August 2022 (and then some formatting tweaks by a volunteer since then). Were you thinking of something more like the tool that Rtnf made? Or a different style of visualization entirely?

By using this list, I [...]
See graph visualization here

Oh wow, that's really nice! I wonder if this could be converted into a low-maintenance tool on Toolforge that automatically updated, and perhaps even had localized names (so that German-speakers see "Deutsch" instead of "German")?
Or just use that tool (after lining the nodes up clearly, without overlapping text) to capture some screenshots for Commons?
P.S. Try to avoid using the word "here" as a link label, because it leads to people missing the links (because the word is so short) and is also an accessibility problem (because it doesn't provide context for users of screen-reader software, who often navigate a page by just jumping between links).
Thanks again for the natural-language list and the graph visualization!

I was thinking of whatever puts things in a visual format like that map, and/or updating that map.