path: root/locale/programs/ld-collate.c
Age | Commit message | Author
2024-01-01 | Update copyright dates with scripts/update-copyrights | Paul Eggert
2023-01-06 | Update copyright dates with scripts/update-copyrights | Joseph Myers
2022-09-22 | Use '%z' instead of '%Z' on printf functions | Adhemerval Zanella Netto
The Z modifier is a nonstandard synonym for z (one that predates z itself), and the compiler might issue a warning for an invalid conversion specifier. Reviewed-by: Florian Weimer <fweimer@redhat.com>
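As a minimal illustration of the standard modifier (not code from the patch; `format_size` is a hypothetical helper introduced here), a `size_t` can be printed portably with the C99 `z` length modifier:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format VALUE into BUF using the standard C99 'z'
   length modifier.  Returns what snprintf returns, i.e. the number of
   characters the full output needs, excluding the terminating NUL.  */
static int
format_size (char *buf, size_t buflen, size_t value)
{
  assert (buf != NULL && buflen > 0);
  return snprintf (buf, buflen, "%zu", value);
}
```

With `%Zu` in place of `%zu`, GCC's format checking may flag the conversion, which is the kind of warning the commit message refers to.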
2022-03-31 | locale: Remove set but unused variable on ld-collate.c | Adhemerval Zanella
Checked on x86_64-linux-gnu and i686-linux-gnu.
2022-01-01 | Update copyright dates with scripts/update-copyrights | Paul Eggert
I used these shell commands:
  ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
  (cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted of lines saying "FOO: warning: copyright statement not found" for each of 7061 files FOO.
I then removed trailing white space from math/tgmath.h, support/tst-support-open-dev-null-range.c, and sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following obscure pre-commit check failure diagnostics from Savannah. I don't know why I run into these diagnostics whereas others evidently do not.
  remote: *** 912-#endif
  remote: *** 913:
  remote: *** 914-
  remote: *** error: lines with trailing whitespace found
  ...
  remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
2021-09-06 | Add 'codepoint_collation' support for LC_COLLATE. | Carlos O'Donell
Support a new directive 'codepoint_collation' in the LC_COLLATE section of a locale source file. This new directive causes all collation rules to be dropped and instead STRCMP (strcmp or wcscmp) is used for collation of the input character set. This is required to allow for a C.UTF-8 that contains zero collation rules (minimal size) and sorts using code point sorting. To date the only implementation of a locale with zero collation rules is the C/POSIX locale. The C/POSIX locale provides identity tables for _NL_COLLATE_COLLSEQMB and _NL_COLLATE_COLLSEQWC that map to ASCII even though it has zero rules. This has led to existing fnmatch, regexec, and regcomp implementations that require these tables. It is not correct to use these tables when nrules == 0, but the conservative fix is to provide these tables when nrules == 0. This ensures that existing static applications using a new C.UTF-8 locale with 'codepoint_collation' at least have functional range expressions with ASCII, e.g. [0-9] or [a-z]. Such static applications would not have the fixes to fnmatch, regexec and regcomp that avoid the use of the tables when nrules == 0. Future fixes to fnmatch, regexec, and regcomp would allow range expressions to use the full set of code points for such ranges. Tested on x86_64 and i686 without regression. Reviewed-by: Florian Weimer <fweimer@redhat.com>
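To make "code point sorting" concrete, here is a small sketch. It is not glibc's implementation; `cmp_codepoint` and `sort_by_codepoint` are hypothetical names. It relies on two facts: strcmp compares bytes as unsigned char, and UTF-8 is designed so that byte-wise order matches code point order.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort callback: compare two UTF-8 strings byte by byte.  Because UTF-8
   preserves code point order under unsigned byte comparison, this orders
   the strings by code point without any collation rules.  */
static int
cmp_codepoint (const void *a, const void *b)
{
  const char *const *sa = a;
  const char *const *sb = b;
  return strcmp (*sa, *sb);
}

/* Hypothetical helper: sort an array of N UTF-8 strings in place.  */
static void
sort_by_codepoint (const char **strings, size_t n)
{
  assert (strings != NULL || n == 0);
  qsort (strings, n, sizeof *strings, cmp_codepoint);
}
```

Sorting the UTF-8 encodings of U+10FFFF, U+FFFF, U+07FF and U+007F this way places them in ascending code point order, with U+007F first.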
2021-09-03 | Remove "Contributed by" lines | Siddhesh Poyarekar
We stopped adding "Contributed by" or similar lines in sources in 2012 in favour of git logs and keeping the Contributors section of the glibc manual up to date. Removing these lines makes the license header a bit more consistent across files and also removes the possibility of error in attribution when license blocks or files are copied across since the contributed-by lines don't actually reflect reality in those cases. Move all "Contributed by" and similar lines (Written by, Test by, etc.) into a new file CONTRIBUTED-BY to retain record of these contributions. These contributors are also mentioned in manual/contrib.texi, so we just maintain this additional record as a courtesy to the earlier developers. The following scripts were used to filter a list of files to edit in place and to clean up the CONTRIBUTED-BY file respectively. These were not added to the glibc sources because they're not expected to be of any use in future given that this is a one time task: https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-04-26 | LC_COLLATE: Fix last character ellipsis handling (Bug 22668) | Hanataka Shinya
During ellipsis processing the collation cursor was not correctly moved to the end of the ellipsis after processing. The code inserted the new entry after the cursor, but before the real end of the ellipsis:
  [cursor]
  ... element_t <-> element_t <-> element_t <-> element_t
  "<U0000>" "<U0001>" "<U007F>"
  startp endp
At the end of the function we have:
  [cursor]
  ... element_t <-> element_t <-> element_t
  "<U007E>" "<U007F>"
  endp
The cursor should be pointing at endp, the last element in the doubly-linked list, otherwise when execution returns to the caller we will start inserting the next line after <U007E>. Subsequent operations end up unlinking the ellipsis end entry or just leaving it in the list dangling from the end.
This kind of dangling is immediately visible in C.UTF-8 with the following sorting from strcoll:
  <U0010FFFF>
  <U0000FFFF>
  <U000007FF>
  <U0000007F>
With the cursor correctly adjusted the end entry is correctly given the right location and thus the right weight.
Retested and no regressions on x86_64 and i686.
Co-authored-by: Carlos O'Donell <carlos@redhat.com>
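The cursor invariant described above can be sketched with a toy doubly-linked list. This is not the ld-collate.c code: `struct elem`, `insert_after` and `expand_range` are hypothetical names. It only shows why the cursor must end up on the last element of the expanded range.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node standing in for a collation element.  */
struct elem
{
  unsigned int value;
  struct elem *prev;
  struct elem *next;
};

/* Link NEW_ELEM directly after POS.  */
static void
insert_after (struct elem *pos, struct elem *new_elem)
{
  new_elem->prev = pos;
  new_elem->next = pos->next;
  if (pos->next != NULL)
    pos->next->prev = new_elem;
  pos->next = new_elem;
}

/* Expand a range of COUNT consecutive values starting at FIRST into
   NODES, linking each one after the cursor.  The cursor is advanced with
   every insertion, so the returned cursor is the last element of the
   range and later insertions land after it, not before it.  */
static struct elem *
expand_range (struct elem *cursor, struct elem *nodes, size_t count,
              unsigned int first)
{
  assert (cursor != NULL);
  for (size_t i = 0; i < count; i++)
    {
      nodes[i].value = first + (unsigned int) i;
      insert_after (cursor, &nodes[i]);
      cursor = &nodes[i];
    }
  return cursor;
}
```

If the final cursor update were skipped, the next insertion would go in front of the range's end element, which is the misplacement the commit message describes.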
2021-01-02 | Update copyright dates with scripts/update-copyrights | Paul Eggert
I used these shell commands:
  ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
  (cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted of lines saying "FOO: warning: copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this diagnostic from Savannah:
  remote: *** pre-commit check failed ...
  remote: *** error: lines with trailing whitespace found
  remote: error: hook declined to update refs/heads/master
2020-12-11 | treewide: fix incorrect spelling of indices in comments | Dmitry V. Levin
Replace 'indeces' with 'indices'. The most annoying of these typos were those found in elf.h, which is a public header file copied to other projects.
2020-01-01 | Update copyright dates with scripts/update-copyrights. | Joseph Myers
2019-09-07 | Prefer https to http for gnu.org and fsf.org URLs | Paul Eggert
Also, change sources.redhat.com to sourceware.org. This patch was automatically generated by running the following shell script, which uses GNU sed, and which avoids modifying files imported from upstream:
  sed -ri '
    s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g
    s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g
  ' \
    $(find $(git ls-files) -prune -type f \
        ! -name '*.po' \
        ! -name 'ChangeLog*' \
        ! -path COPYING ! -path COPYING.LIB \
        ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \
        ! -path manual/texinfo.tex ! -path scripts/config.guess \
        ! -path scripts/config.sub ! -path scripts/install-sh \
        ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \
        ! -path INSTALL ! -path locale/programs/charmap-kw.h \
        ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \
        ! '(' -name configure \
              -execdir test -f configure.ac -o -f configure.in ';' ')' \
        ! '(' -name preconfigure \
              -execdir test -f preconfigure.ac ';' ')' \
        -print)
and then by running 'make dist-prepare' to regenerate files built from the altered files, and then executing the following to clean up:
  chmod a+x sysdeps/unix/sysv/linux/riscv/configure
  # Omit irrelevant whitespace and comment-only changes,
  # perhaps from a slightly-different Autoconf version.
  git checkout -f \
    sysdeps/csky/configure \
    sysdeps/hppa/configure \
    sysdeps/riscv/configure \
    sysdeps/unix/sysv/linux/csky/configure
  # Omit changes that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines
  git checkout -f \
    sysdeps/powerpc/powerpc64/ppc-mcount.S \
    sysdeps/unix/sysv/linux/s390/s390-64/syscall.S
  # Omit change that caused a pre-commit check to fail like this:
  # remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline
  git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
2019-03-21 | Fix parentheses error in iconvconfig.c and ld-collate.c [BZ #24372] | Gabriel F. T. Gomes
When -Werror=parentheses is in use, iconvconfig.c builds fail with:
  iconvconfig.c: In function ‘write_output’:
  iconvconfig.c:1084:34: error: suggest parentheses around ‘+’ inside ‘>>’ [-Werror=parentheses]
    hash_size = next_prime (nnames + nnames >> 1);
                            ~~~~~~~^~~~~~~~
This patch adds parentheses to the expression, not where the compiler warning suggests, but where they produce the expected result, i.e. where they have the effect of multiplying nnames by 1.5. Likewise for elem_size in ld-collate.c.
Tested for powerpc64le.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
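The precedence issue can be shown in isolation. The sketch below is illustrative only: `scale_as_parsed` and `scale_as_intended` are hypothetical names, not functions from iconvconfig.c or ld-collate.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What the compiler computes for "n + n >> 1": '+' binds tighter than
   '>>', so the sum is formed first and then halved, giving n back (as
   long as n + n does not wrap).  */
static size_t
scale_as_parsed (size_t n)
{
  return (n + n) >> 1;
}

/* What was intended: n plus half of n, i.e. n * 1.5 rounded down, using
   integer operations only.  */
static size_t
scale_as_intended (size_t n)
{
  assert (n <= SIZE_MAX - (n >> 1));   /* guard against wrap-around */
  return n + (n >> 1);
}
```

For n = 100 the first form yields 100 and the second yields 150, so following the compiler's suggested parenthesization would have silenced the warning while leaving the table no larger than nnames.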
2019-03-21 | iconv, localedef: avoid floating point rounding differences [BZ #24372] | DJ Delorie
Two cases of "int * 1.4" may result in imprecise results, which in at least one case resulted in i686 and x86-64 producing different locale files. This replaced that floating point multiply with integer operations. While the hash table margin is increased from 40% to 50%, testing shows only 2% increase in overall size of the locale archive. https://bugzilla.redhat.com/show_bug.cgi?id=1311954 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2019-01-01 | Update copyright dates with scripts/update-copyrights. | Joseph Myers
* All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights.
* locale/programs/charmap-kw.h: Regenerated.
* locale/programs/locfile-kw.h: Likewise.
2018-01-01 | Update copyright dates with scripts/update-copyrights. | Joseph Myers
* All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights.
* locale/programs/charmap-kw.h: Regenerated.
* locale/programs/locfile-kw.h: Likewise.
2017-10-13 | locale: Fix localedef exit code (Bug 22292) | Carlos O'Donell
The error and warning handling in localedef, locale, and iconv is a bit of a mess. We use ugly constructs like this:
  WITH_CUR_LOCALE (error (1, errno, gettext ("\
  cannot read character map directory `%s'"), directory));
to issue errors, and read error_message_count directly from the error API to detect errors. The problem with that is that the code also uses error to print warnings and informative messages. All of this leads to problems where just having warnings will produce an exit status as if errors had been seen.
To fix this situation I have adopted the following high-level changes:
* All errors are counted distinctly.
* All warnings are counted distinctly.
* Informative messages are not counted.
* Increasing verbosity cannot generate *more* errors; it previously did for errors conditional on verbose, and this is now fixed.
* Increasing verbosity *can* generate *more* warnings.
* Making the output quiet cannot generate *fewer* errors; it previously did for errors conditional on be_quiet, and this is now fixed.
* Each of error, warning, and informative message has its own function to call, defined in record-status.h: record_error, record_warning, and record_verbose.
* The record_error function always records an error, but conditional on be_quiet may not print it.
* The record_warning function always records a warning, but conditional on be_quiet may not print it.
* The record_verbose function only prints the verbose message if verbose is true and be_quiet is false.
This has allowed the following fix:
* Previously any warnings were being treated as errors because they incremented error_message_count, but now we properly return an exit status of 1 if there are warnings but output was generated.
All of this allows localedef to correctly decide whether errors or warnings were present, and produce the correct exit code.
The locale and iconv programs now also use record-status.h and we have removed the WITH_CUR_LOCALE hack, and instead have internal push_locale/pop_locale functions centralized in the record routines.
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
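The counting rules above can be sketched as follows. This is a simplified model, not the interface in record-status.h: `struct status`, `note_error`, `note_warning` and `note_verbose` are hypothetical names, and the real record_* functions have different signatures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical status record: errors and warnings are counted
   separately, and informative messages are not counted at all.  */
struct status
{
  bool verbose;
  bool be_quiet;
  unsigned int errors;
  unsigned int warnings;
};

/* Always count the error; print it unless output is quiet.  */
static void
note_error (struct status *st, const char *msg)
{
  assert (st != NULL);
  st->errors++;
  if (!st->be_quiet)
    fprintf (stderr, "error: %s\n", msg);
}

/* Always count the warning; print it unless output is quiet.  */
static void
note_warning (struct status *st, const char *msg)
{
  assert (st != NULL);
  st->warnings++;
  if (!st->be_quiet)
    fprintf (stderr, "warning: %s\n", msg);
}

/* Never counted; printed only when verbose and not quiet.  */
static void
note_verbose (const struct status *st, const char *msg)
{
  assert (st != NULL);
  if (st->verbose && !st->be_quiet)
    fprintf (stderr, "%s\n", msg);
}
```

Because counting is unconditional and only printing depends on the flags, neither be_quiet nor verbose can change how many errors are seen, which is the property the commit message sets out to guarantee.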
2017-01-01 | Update copyright dates with scripts/update-copyrights. | Joseph Myers
2016-01-04 | Update copyright dates with scripts/update-copyrights. | Joseph Myers
2015-10-08 | strcoll: Remove incorrect STRDIFF-based optimization (Bug 18589). | Carlos O'Donell
The optimization introduced in commit f13c2a8dff2329c6692a80176262ceaaf8a6f74e causes regressions in sorting for languages that have digraphs that change sort order, like cs_CZ, which sorts ch between h and i. My analysis shows the fast-forwarding optimization in STRCOLL advances through a digraph while possibly stopping in the middle, which results in a subsequent skipping of the digraph and incorrect sorting. The optimization is incorrect as implemented and because of that I'm removing it for 2.23, and I will also commit this fix for 2.22 where it was originally introduced. This patch reverts the optimization and introduces a new bug-strcoll2.c regression test that tests both cs_CZ.UTF-8 and da_DK.ISO-8859-1 and ensures they sort one digraph each correctly. The optimization can't be applied without regressing this test. Checked on x86_64: bug-strcoll2.c fails without this patch and passes after. This will also get a fix on 2.22, which has the same bug.
2015-05-12 | Improve strcoll with strdiff. | Leonhard Holz
This patch improves the strcoll hot case by finding the first byte that mismatches. In the likely case, that is enough to determine the comparison result.
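The idea of locating the first mismatching byte can be sketched as below. This is an illustration under assumptions, not the STRDIFF code from the patch; `first_diff` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the offset of the first byte at which S
   and T differ, or the length of S if S is a prefix of T.  */
static size_t
first_diff (const char *s, const char *t)
{
  assert (s != NULL && t != NULL);
  size_t n = 0;
  while (s[n] != '\0' && s[n] == t[n])
    n++;
  return n;
}
```

For the strings "chata" and "cihla" the offset is 1, which falls inside the "ch" of the first string. Starting the comparison at such an offset can split a digraph, the problem the 2015-10-08 entry gives as the reason for reverting this optimization.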
2015-01-02 | Update copyright dates with scripts/update-copyrights. | Joseph Myers
2014-01-01 | Update copyright notices with scripts/update-copyrights | Allan McRae
2013-10-18 | Fix localedef collation handling of <U0000> (bug 15948). | Richard Sandiford
2013-10-08 | Clean up locale file alignment handling. | Joseph Myers
2013-10-03 | Remove locale file dependence on int32_t alignment. | Joseph Myers
2013-09-24 | Add localedef --big-endian and --little-endian options. | Joseph Myers
2013-09-06 | Make localedef output generation use more logical interfaces. | Richard Sandiford
2013-08-30 | Fix then/than typos. | Ondřej Bílka
2013-05-16 | Add #include <stdint.h> for uint[32|64]_t usage (except installed headers). | Ryan S. Arnold
2013-01-02 | Update copyright notices with scripts/update-copyrights. | Joseph Myers
2012-02-09 | Replace FSF snail mail address with URLs. | Paul Eggert
2011-06-10 | Quash some new warnings from GCC 4.6. | Roland McGrath
2011-04-22 | Remove doubled words. | Jim Meyering
2009-09-07 | Fix endless loop in localedef. | Ulrich Drepper
localedef got into an endless loop when order_start was used for the unnamed_section twice and the first use didn't actually result in any definition.
2008-05-27 | Remove more useless "if" tests before "free". | Ulrich Drepper
* include/inline-hashtab.h (htab_delete): Likewise.
* libio/freopen.c (freopen): Likewise.
* libio/freopen64.c (freopen64): Likewise.
* locale/programs/ld-collate.c (collate_read): Likewise.
* misc/fstab.c (libc_freeres_fn): Likewise.
* posix/glob.c (globfree): Likewise.
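These cleanups rest on the ISO C guarantee that free (NULL) does nothing, so a null test in front of free is redundant. A minimal sketch, with `struct buffer` and `buffer_reset` as hypothetical names:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical owner of a heap allocation.  */
struct buffer
{
  char *data;
  size_t len;
};

/* Release the buffer's storage.  No "if (b->data != NULL)" test is
   needed: ISO C specifies that free (NULL) performs no action.  */
static void
buffer_reset (struct buffer *b)
{
  assert (b != NULL);
  free (b->data);
  b->data = NULL;
  b->len = 0;
}
```

Calling `buffer_reset` on an already empty buffer is therefore safe, and so is calling it twice in a row.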
2008-04-08 | (collate_read): Ignore script lines as well when ignoring the whole category. | Ulrich Drepper
2008-03-19 | Remove useless "if" before "free". | Ulrich Drepper
2007-11-22 | * locale/programs/ld-collate.c (collate_read): Fix loop to match macro name. | Ulrich Drepper
2007-10-12 | * locale/programs/ld-collate.c (collate_read): Optimize a bit. | Ulrich Drepper
(skip_to): Fix problems with parameter of elifdef/elifndef.
2007-10-12 | (collate_read): If ignore_content and nowtok is tok_define, eat any tok_eol tokens. | Ulrich Drepper
2007-10-11 | * locale/programs/locfile-token.h: Remove tok_elif, add tok_elifdef and tok_elifndef. | Ulrich Drepper
* locale/programs/locfile-kw.gperf: Likewise.
* locale/programs/ld-collate.c: Implement primitive preprocessor.
2007-10-02 | [BZ #645] | Ulrich Drepper
2007-10-02 Ulrich Drepper <drepper@redhat.com>
  [BZ #645]
  * locale/programs/ld-collate.c (collate_finish): Compare against last used section which is known to have rules defined.
  (collate_read): After order_start, correctly record order of sections and queue sections up.
2007-10-02 | * locales/am_ET (LC_COLLATE): Define new script after copy. | Ulrich Drepper
2007-09-30 | * locales/sa_IN: New file. | Ulrich Drepper
* SUPPORTED (SUPPORTED-LOCALES): Add sa_IN.
2007-08-26 | * locale/programs/ld-collate.c (collate_output): Avoid warning if NDEBUG is defined. | Ulrich Drepper
2007-07-28 | * nss/nsswitch.c (__nss_lookup_function): Don't cast &ni->known to void **. | Ulrich Drepper
* nss/nsswitch.h (service_user): Use void * type for KNOWN field.
* nss/nss_files/files-hosts.c (LINE_PARSER): Cast host_addr to char * to avoid warning.
* nis/nss_nis/nis-hosts.c (LINE_PARSER): Likewise.
* timezone/Makefile (CFLAGS-zdump.c): Add -fwrapv.
* locale/programs/ld-ctype.c (ctype_finish, set_class_defaults, allocate_arrays): Cast second argument to charmap_find_symbol to char * to avoid warnings.
* locale/programs/repertoire.c (repertoire_new_char): Change from_nr, to_nr and cnt to unsigned long, adjust printf format string.
* locale/programs/ld-collate.c (insert_value, handle_ellipsis): Cast second argument to new_element to char * to avoid warnings.
* locale/weightwc.h (findidx): Cast &extra[-i] to const int32_t *.
* intl/gettextP.h (struct loaded_domain): Change plural to const struct expression *.
* intl/plural-eval.c (plural_eval): Change first argument to const struct expression *.
* intl/plural-exp.c (EXTRACT_PLURAL_EXPRESSION): Change first argument to const struct expression **.
* intl/plural-exp.h (EXTRACT_PLURAL_EXPRESSION, plural_eval): Adjust prototypes.
* intl/loadmsgcat (_nl_unload_domain): Cast away const in call to __gettext_free_exp.
* posix/fnmatch.c (fnmatch): Rearrange code to avoid maybe uninitialized wstring/wpattern var warnings.
* posix/runtests.c (struct a_test): Make data field const char *.
* stdio-common/tst-sprintf2.c (main): Don't declare u, v and buf vars if not LDBL_MANT_DIG >= 106.
* stdio-common/Makefile (CFLAGS-vfwprintf.c): Add -Wno-uninitialized.
* stdio-common/vfprintf.c (vfprintf): Cast first argument to __find_specmb to avoid warning.
* rt/tst-mqueue1.c (do_one_test): Add casts to avoid warnings.
* debug/test-strcpy_chk.c (do_tests, do_random_tests): Add casts to avoid warnings.
* sysdeps/ieee754/ldbl-96/s_roundl.c (huge): Add L suffix to initializer.
* sysdeps/unix/clock_gettime.c (clock_gettime): Only define tv var when it will be actually used.
* sunrpc/rpc_cmsg.c (xdr_callmsg): Cast IXDR_PUT_* to void to avoid warnings.
2007-07-24 | * sysdeps/unix/sysv/linux/powerpc/lowlevellock.h | Ulrich Drepper
(__lll_private_flag): Define.
(lll_futex_wait): Define as a wrapper around lll_futex_timed_wait.
(lll_futex_timed_wait, lll_futex_wake, lll_futex_wake_unlock): Use __lll_private_flag.
(lll_private_futex_wait, lll_private_futex_timedwait, lll_private_futex_wake): Define as wrapper around non-_private macros.
* sysdeps/unix/sysv/linux/x86_64/lowlevellock.h (__lll_private_flag): Define.
(lll_futex_timed_wait, lll_futex_wake): Use __lll_private_flag.
(lll_private_futex_wait, lll_private_futex_timedwait, lll_private_futex_wake): Define as wrapper around non-_private macros.
2007-07-16 | * elf/ldconfig.c: Allow GPLv2 or any later version. | Roland McGrath
* elf/readlib.c: Likewise. * elf/chroot_canon.c: Likewise. * elf/cache.c: Likewise. * nscd/mem.c: Likewise. * nscd/getpwuid_r.c: Likewise. * nscd/grpcache.c: Likewise. * nscd/aicache.c: Likewise. * nscd/getsrvbynm_r.c: Likewise. * nscd/nscd.c: Likewise. * nscd/servicescache.c: Likewise. * nscd/getsrvbypt_r.c: Likewise. * nscd/initgrcache.c: Likewise. * nscd/gethstbyad_r.c: Likewise. * nscd/gethstbynm2_r.c: Likewise. * nscd/getgrnam_r.c: Likewise. * nscd/nscd_setup_thread.c: Likewise. * nscd/getpwnam_r.c: Likewise. * nscd/gai.c: Likewise. * nscd/connections.c: Likewise. * nscd/dbg_log.c: Likewise. * nscd/cache.c: Likewise. * nscd/hstcache.c: Likewise. * nscd/nscd_conf.c: Likewise. * nscd/getgrgid_r.c: Likewise. * nscd/pwdcache.c: Likewise. * catgets/gencat.c: Likewise. * locale/programs/linereader.h: Likewise. * locale/programs/locarchive.c: Likewise. * locale/programs/ld-paper.c: Likewise. * locale/programs/locfile-kw.h: Likewise. * locale/programs/ld-address.c: Likewise. * locale/programs/xmalloc.c: Likewise. * locale/programs/ld-time.c: Likewise. * locale/programs/localedef.c: Likewise. * locale/programs/simple-hash.c: Likewise. * locale/programs/xstrdup.c: Likewise. * locale/programs/ld-numeric.c: Likewise. * locale/programs/locfile-kw.gperf: Likewise. * locale/programs/ld-collate.c: Likewise. * locale/programs/charmap-kw.gperf: Likewise. * locale/programs/charmap.h: Likewise. * locale/programs/charmap-kw.h: Likewise. * locale/programs/config.h: Likewise. * locale/programs/locfile.c: Likewise. * locale/programs/ld-ctype.c: Likewise. * locale/programs/charmap.c: Likewise. * locale/programs/ld-messages.c: Likewise. * locale/programs/repertoire.h: Likewise. * locale/programs/locale.c: Likewise. * locale/programs/ld-name.c: Likewise. * locale/programs/linereader.c: Likewise. * locale/programs/locfile.h: Likewise. * locale/programs/3level.h: Likewise. * locale/programs/ld-monetary.c: Likewise. * locale/programs/ld-measurement.c: Likewise. 
* locale/programs/charmap-dir.c: Likewise. * locale/programs/ld-identification.c: Likewise. * locale/programs/localedef.h: Likewise. * locale/programs/charmap-dir.h: Likewise. * locale/programs/repertoire.c: Likewise. * locale/programs/simple-hash.h: Likewise. * locale/programs/ld-telephone.c: Likewise. * locale/programs/locale-spec.c: Likewise. * locale/programs/locfile-token.h: Likewise. * posix/getconf.c: Likewise. * iconv/dummy-repertoire.c: Likewise. * iconv/iconv_charmap.c: Likewise. * iconv/iconvconfig.c: Likewise. * iconv/iconv_prog.c: Likewise. * malloc/memusagestat.c: Likewise. * sysdeps/unix/sysv/linux/nscd_setup_thread.c: Likewise.
2007-04-28 | * locale/programs/ld-collate.c (collate_read): Allow order_start after copy. | Ulrich Drepper