fix(language): update Ukrainian Latin to national standard (@paiv) (#6584)

### Description

In short, this PR applies Ukraine's national standard [DSTU
9112:2021 (A)](https://en.wikipedia.org/wiki/DSTU_9112:2021) and
replaces the non-standard transliteration originally submitted.

- Updates #3855

### Context

Unfortunately, this topic has been a bit toxic in Ukraine, and I am sorry
to bring it to the developers of this popular tool. I am trying to stay
neutral [¹](https://paiv.github.io/blog/2024/11/26/ukrainian-latin.html
"The state of Ukrainian Latin"), with Ukraine's ultimate benefit in
mind.

At the moment, there is no definitive Ukrainian Latin script. The state
standard (KMU 55) is a lossy transliteration that is not usable for
general writing and is applied only to personal names and place names.

Practitioners of Latin script for the Ukrainian language are still only a
marginal group, without a unifying movement. Until 2021, basically
everyone had their own transliteration method, derived from two dozen
historical schemes.

In 2021 came Ukraine's national standard, DSTU 9112:2021. Among the
alternatives, it is objectively good enough for general writing. It
does not prescribe a transition from Cyrillic. Its future is in
integration with European languages, gradually replacing the legacy
KMU 55.

Thus my argument to the practitioners of Ukrainian Latin script is to
adopt the DSTU standard, given its prospects and unifying power, and to
phase out non-standard schemes, of which #3855 is only one.

I hope @tymof1j, as the original contributor, can critically review
these notes with the benefit of two years' hindsight.

The script used for conversion from Cyrillic:
https://gist.github.com/paiv/df2f38ed86a103471a49cfa8064d0d2e
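
For readers who do not want to open the gist, here is a minimal sketch of what a per-letter conversion could look like. It is not the linked script. The table is an assumption: it covers only the letters whose mapping can be inferred from the word-list changes in this commit (for example `що` to `ŝo`, `його` to `joğo`), and the names `SYSTEM_A_SKETCH` and `transliterate` are hypothetical. Any context-dependent rules of the standard, and letters not evidenced in the diff, are left out.

```python
# Partial Cyrillic -> Latin table. Assumption: limited to letters whose
# mapping can be inferred from the word-list changes in this commit.
# This is NOT the complete DSTU 9112:2021 (A) table.
SYSTEM_A_SKETCH = {
    "а": "a", "б": "b", "в": "v", "г": "ğ", "д": "d", "е": "e",
    "є": "je", "ж": "ž", "з": "z", "и": "y", "і": "i", "ї": "ï",
    "й": "j", "к": "k", "л": "l", "м": "m", "н": "n", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "у": "u", "ф": "f",
    "х": "x", "ц": "c", "ч": "č", "ш": "š", "щ": "ŝ", "ь": "j",
    "ю": "ju", "я": "ja",
}


def transliterate(text: str) -> str:
    """Map each Cyrillic letter independently; pass anything else through."""
    out = []
    for ch in text:
        latin = SYSTEM_A_SKETCH.get(ch.lower())
        if latin is None:
            # Not in the sketch table (e.g. ґ, apostrophe, punctuation).
            out.append(ch)
        elif ch.isupper():
            out.append(latin.capitalize())
        else:
            out.append(latin)
    return "".join(out)


if __name__ == "__main__":
    for word in ["що", "його", "сьогодні", "український", "її"]:
        print(word, "->", transliterate(word))
```

The sample words correspond to entries changed in this commit, so the printed forms can be compared against the updated word list.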

----
To reiterate, Ukrainian Latin script is not established, and people
come here not only to practice typing but also to get used to the
concept of Ukrainian Latin. Hosting one of dozens of unofficial
alternatives of Ukrainian Latin without giving wider context is not
appropriate. People should start with Ukraine's national standard, then
learn about the alternatives if interested.

The problem with the national standard is that it is young and has little
adoption and tooling. Those will come in time. I have posted examples of
possible keyboard setups here: https://paiv.github.io/latynka-keyboard/

I used system A of the national standard, which uses diacritics. An
alternative would be system B, which needs only basic Latin letters. It
is more verbose but easier to type. I believe system A is preferable
for general text.
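
To make the typing cost of system A concrete, the sketch below collects the non-ASCII letters that appear in a word list, which is roughly the set of extra keys a keyboard layout has to provide. The helper name `extra_letters` is hypothetical, and the sample is a handful of entries copied from the updated word list in this commit, not the full file.

```python
def extra_letters(words):
    """Return the set of non-ASCII characters used across the words."""
    return {ch for word in words for ch in word if not ch.isascii()}


# A few entries copied from the updated word list in this commit.
sample = ["ŝo", "joğo", "čy", "vže", "naš", "ïï", "xto", "denj"]

if __name__ == "__main__":
    # Sorted only to make the printed order stable.
    print(sorted(extra_letters(sample)))
```

A word list typeable on a plain Latin keyboard would yield an empty set here.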

If this is too much for this project, I would rather remove Ukrainian
Latin until a script is established in Ukraine than host one
non-standard variant. But it would be nice to keep this platform for
teaching the concept of Ukrainian Latin script.
Pavel Ivashkov 2025-05-26 16:57:13 +03:00 committed by GitHub
parent 16eda17eb6
commit ea144996f3
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
5 changed files with 18827 additions and 18832 deletions

@ -7,7 +7,7 @@
"u",
"v",
"ne",
"ščo",
"ŝo",
"z",
"buty",
"do",
@ -27,29 +27,29 @@
"svij",
"vid",
"vin",
"jogo",
"joğo",
"takyj",
"ale",
"rik",
"vona",
"iz",
"odyn",
"mogty",
"moğty",
"tak",
"maty",
"ljudyna",
"čy",
"vy",
"toj",
"šče",
"ŝe",
"čas",
"jiji",
"ïï",
"koly",
"vono",
"ukrajins'kyj",
"ukraïnsjkyj",
"ty",
"vže",
"ščob",
"ŝob",
"ž",
"inšyj",
"to",
@ -59,7 +59,7 @@
"sebe",
"takož",
"vsi",
"jakščo",
"jakŝo",
"tomu",
"te",
"abo",
@ -68,24 +68,24 @@
"naš",
"pytannja",
"pravo",
"den'",
"denj",
"zakon",
"možna",
"pislja",
"hto",
"til'ky",
"krajina",
"xto",
"tiljky",
"kraïna",
"de",
"znaty",
"mij",
"duže",
"deržavnyj",
"navit'",
"navitj",
"čerez",
"ščodo",
"ŝodo",
"slovo",
"žyttja",
"golova",
"ğolova",
"tut",
"b",
"same",
@ -101,14 +101,14 @@
"raz",
"deržava",
"tam",
"bagato",
"bağato",
"dytyna",
"sjogodni",
"sjoğodni",
"kožnyj",
"vlada",
"misto",
"svit",
"bil'še",
"biljše",
"deputat",
"ni",
"todi",
@ -118,7 +118,7 @@
"systema",
"treba",
"zaraz",
"nacional'nyj",
"nacionaljnyj",
"kazaty",
"narodnyj",
"dva",
@ -129,20 +129,19 @@
"polityčnyj",
"zrobyty",
"mova",
"jihnij",
"hotity",
"ïxnij",
"xotity",
"častyna",
"pracjuvaty",
"miž",
"proekt",
"ruka",
"dijal'nist'",
"dijaljnistj",
"rozvytok",
"proces",
"prosto",
"samyj",
"oblast'",
"jakyjs'",
"oblastj",
"jakyjsj",
"robyty",
"robota",
"šanovnyj",
@ -151,33 +150,33 @@
"zmina",
"syla",
"sud",
"govoryty",
"ğovoryty",
"vvažaty",
"umova",
"potim",
"istorija",
"teper",
"kolega",
"kil'ka",
"grupa",
"social'nyj",
"koleğa",
"kiljka",
"ğrupa",
"socialjnyj",
"pered",
"možlyvist'",
"možlyvistj",
"sytuacija",
"uvaga",
"uvağa",
"žinka",
"rezul'tat",
"rezuljtat",
"zokrema",
"organizacija",
"orğanizacija",
"davaty",
"uže",
"partija",
"mižnarodnyj",
"komitet",
"čolovik",
"dopomoga",
"dopomoğa",
"nemaje",
"organ",
"orğan",
"niž",
"djakuvaty",
"dija",
@ -185,21 +184,21 @@
"počaty",
"povynnyj",
"dumka",
"verhovnyj",
"verxovnyj",
"zemlja",
"ostannij",
"oko",
"ščos'",
"ŝosj",
"informacija",
"tysjača",
"vijna",
"sered",
"ničogo",
"ničoğo",
"vlasnyj",
"zakonoproekt",
"zakonoprojekt",
"pevnyj",
"urjad",
"hoča",
"xoča",
"vypadok",
"bik"
]

File diff suppressed because it is too large

File diff suppressed because it is too large

File diff suppressed because it is too large

@ -3,26 +3,25 @@
"noLazyMode": true,
"words": [
"ja",
"'",
"j",
"y",
"m",
"u",
"o",
"i",
"ju",
"j",
"a",
"e",
"h",
"ji",
"x",
"ï",
"v",
"š",
"je",
"k",
"sja",
"nja",
"s'",
"t'",
"sj",
"tj",
"my",
"ty",
"ly",
@ -35,7 +34,7 @@
"nu",
"ku",
"mo",
"go",
"ğo",
"lo",
"ni",
"vi",
@ -49,15 +48,15 @@
"ka",
"te",
"ne",
"yh",
"ah",
"jah",
"oji",
"yx",
"ax",
"jax",
"oï",
"iv",
"av",
"ješ",
"osja",
"'sja",
"jsja",
"ysja",
"esja",
"šsja",
@ -65,14 +64,14 @@
"vsja",
"jusja",
"nnja",
"os'",
"ys'",
"es'",
"as'",
"vs'",
"jus'",
"jut'",
"st'",
"osj",
"ysj",
"esj",
"asj",
"vsj",
"jusj",
"jutj",
"stj",
"ymy",
"amy",
"jamy",
@ -84,7 +83,7 @@
"omu",
"jmo",
"jemo",
"ogo",
"oğo",
"alo",
"nni",
"ovi",
@ -98,26 +97,26 @@
"ala",
"jte",
"jete",
"nyh",
"noji",
"nyx",
"noï",
"mosja",
"losja",
"t'sja",
"tjsja",
"tysja",
"lysja",
"tesja",
"lasja",
"mos'",
"los'",
"tys'",
"lys'",
"tes'",
"las'",
"ist'",
"mosj",
"losj",
"tysj",
"lysj",
"tesj",
"lasj",
"istj",
"nymy",
"nnjam",
"nomu",
"nogo",
"noğo",
"osti",
"istju"
]