aboutsummaryrefslogtreecommitdiff
path: root/localedata/unicode-gen
AgeCommit message (Collapse)Author
2017-10-13localedata: Reorganize Unicode LC_CTYPE inclusion.Carlos O'Donell
The commit does the following things: * Move non-transliteration Unicode generated data to i18n_ctype. * Copy the i18n_ctype data into i18n and add transliteration. In the future, any locale which needs Unicode LC_CTYPE data can also just use `copy i18n_ctype` and get the base character classes and maps without transliteration. Tested by compiling all the locales and my prototype C.UTF-8 which uses it. Signed-off-by: Carlos O'Donell <carlos@redhat.com>
2017-09-06Improve utf8_gen.py to set the width for characters with ↵Mike FABIAN
Prepended_Concatenation_Mark property to 1 [BZ #22070] * localedata/unicode-gen/utf8_gen.py: Set the width for characters with Prepended_Concatenation_Mark property to 1 * localedata/charmaps/UTF-8: Updated using the improved script.
2017-09-06Write all ranges of neighbouring characters with the same width using the ↵Mike FABIAN
range notation in charmaps/UTF-8 Writing ranges of neighbouring characters with the same with like this <U000E0100>...<U000E01EF> 0 in charmaps/UTF-8 is more efficient than writing many single character lines like: <U000E0100> 0 <U000E0101> 0 ... [BZ #21750] * unicode-gen/utf8_gen.py: Write all ranges of neighbouring characters with the same width using the range notation in charmaps/UTF-8.
2017-08-17Resolve some historically special cases of ambiguous widthThorsten Glaser
[BZ #21750] * unicode-gen/utf8_gen.py (U+00AD): Set width to 1. * unicode-gen/utf8_gen.py (U+1160..U+11FF): Set width to 0. * unicode-gen/utf8_gen.py (U+3248..U+324F): Set width to 2. * unicode-gen/utf8_gen.py (U+4DC0..U+4DFF): Likewise.
2017-08-17Handle more cases of combining charactersThorsten Glaser
[BZ #21750] * unicode-gen/utf8_gen.py: Treat category Me and Mn as combining.
2017-08-17UnicodeData has precedence over EastAsianWidthThorsten Glaser
[BZ #19852] [BZ #21750] * unicode-gen/utf8_gen.py: Process EastAsianWidth lines before UnicodeData lines so the latter have precedence; remove hack to group output by EastAsianWidth ranges.
2017-06-22Bug 21533: Update to Unicode 10.0.0Mike FABIAN
* Unicode 10.0.0 Support: Character encoding, character type info, and transliteration tables are all updated to Unicode 10.0.0, using generator scripts contributed by Mike FABIAN (Red Hat).
2017-02-21Bug 20313: Update to Unicode 9.0.0Mike FABIAN
* Unicode 9.0.0 Support: Character encoding, character type info, and transliteration tables are all updated to Unicode 9.0.0, using generator scripts contributed by Mike FABIAN (Red Hat).
2017-01-01Update copyright dates with scripts/update-copyrights.Joseph Myers
2016-06-11unicode-gen: include standard comment file headerMike Frysinger
We deployed this header to all the locale files, so make sure we include it in the generated ones too so we don't lose it.
2016-01-04Update copyright dates with scripts/update-copyrights.Joseph Myers
2015-12-11Automate LC_CTYPE generation for tr_TR, update to Unicode 8.0.0 (bug 18491).Joseph Myers
This patch makes the automation of Unicode LC_CTYPE generation also support generating the modified LC_CTYPE used for Turkish (where case conversions of 'i' and 'I' differ from ASCII conventions), so allowing that to be more readily kept in sync for future Unicode updates. The patch includes the locale update generated by the scripts. Tested for x86_64. [BZ #18491] * unicode-gen/unicode_utils.py (to_upper_turkish): New function. (to_lower_turkish): Likewise. * unicode-gen/gen_unicode_ctype.py (output_tables): Support producing output with Turkish case conversions. (--turkish): New command-line option. * unicode-gen/Makefile (GENERATED): Add tr_TR. (tr_TR): New rule. * locales/tr_TR: Regenerate LC_CTYPE.
2015-12-10Update to Unicode 8.0.0.Mike FABIAN
Update __STDC_ISO_10646__ to 201505L for Unicode 8.0.0. Update character encoding, ctype, and transliteration tables. New scripts autogenerate transliteration tables.
2015-12-09Update transliteration support to Unicode 7.0.0.Carlos O'Donell
The transliteration files are now autogenerated from upstream Unicode data.
2015-02-23Amendments to Unicode 7 update.Alexandre Oliva
for ChangeLog * include/stdc-predef.h (__STDC_ISO_10646__): Update to 201304L, for Unicode 7. for localedata/ChangeLog * unicode-gen/ctype_compatibility.py: Use date ranges in copyright notice. * unicode-gen/ctype_compatibility_test_cases.py: Likewise. * unicode-gen/gen_unicode_ctype.py: Likewise. * unicode-gen/utf8_compatibility.py: Likewise. * unicode-gen/utf8_gen.py: Likewise. Use upper case for global variables, use tuples for global constant arrays. From Mike FABIAN. Suggested by Mike Frysinger <vapier@gentoo.org>.
2015-02-20Unicode 7.0.0 update; added generator scripts.Alexandre Oliva
for localedata/ChangeLog [BZ #17588] [BZ #13064] [BZ #14094] [BZ #17998] * unicode-gen/Makefile: New. * unicode-gen/unicode-license.txt: New, from Unicode. * unicode-gen/UnicodeData.txt: New, from Unicode. * unicode-gen/DerivedCoreProperties.txt: New, from Unicode. * unicode-gen/EastAsianWidth.txt: New, from Unicode. * unicode-gen/gen_unicode_ctype.py: New generator, from Mike FABIAN <mfabian@redhat.com>. * unicode-gen/ctype_compatibility.py: New verifier, from Pravin Satpute <psatpute@redhat.com> and Mike FABIAN. * unicode-gen/ctype_compatibility_test_cases.py: New verifier module, from Mike FABIAN. * unicode-gen/utf8_gen.py: New generator, from Pravin Satpute and Mike FABIAN. * unicode-gen/utf8_compatibility.py: New verifier, from Pravin Satpute and Mike FABIAN. * charmaps/UTF-8: Update. * locales/i18n: Update. * gen-unicode-ctype.c: Remove. * tst-ctype-de_DE.ISO-8859-1.in: Adjust, islower now returns true for ordinal indicators.