path: root/wcsmbs
Age | Commit message | Author
2022-11-01 | configure: Use -Wno-ignored-attributes if compiler warns about multiple aliases | Adhemerval Zanella
clang emits a warning when a double alias redirection is used, to warn that the original symbol will be used even when the weak definition is overridden. However, this is a common pattern for weak_alias, where multiple aliases are set to the same symbol. Reviewed-by: Fangrui Song <maskray@google.com>
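The pattern described above, several weak aliases bound to one definition, can be sketched as follows. This is a minimal illustration and not glibc's weak_alias macro itself: the names sketch_impl, sketch_first and sketch_second are hypothetical, and the GNU C alias attribute assumes an ELF target with GCC or clang.

```c
#include <assert.h>

/* Hypothetical implementation symbol, standing in for an internal name.  */
int
sketch_impl (int x)
{
  return x * 2;
}

/* Two weak aliases that resolve to the same definition.  */
extern __typeof (sketch_impl) sketch_first
  __attribute__ ((weak, alias ("sketch_impl")));
extern __typeof (sketch_impl) sketch_second
  __attribute__ ((weak, alias ("sketch_impl")));

/* Both aliases are expected to share the address of the definition
   as long as nothing overrides the weak symbols.  */
static void
sketch_check_aliases (void)
{
  assert (sketch_first == sketch_impl);
  assert (sketch_second == sketch_impl);
}
```

Calling either alias runs the same code; the warning discussed in the commit concerns what happens when one of the weak names is overridden elsewhere.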
2022-10-18 | Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources | Florian Weimer
In the future, this will result in a compilation failure if the macros are unexpectedly undefined (due to header inclusion ordering or header inclusion missing altogether). Assembler sources are more difficult to convert. In many cases, they are hand-optimized for the mangling and no-mangling variants, which is why they are not converted. sysdeps/s390/s390-32/__longjmp.c and sysdeps/s390/s390-64/__longjmp.c are special: These are C sources, but most of the implementation is in assembler, so the PTR_DEMANGLE macro has to be undefined in some cases, to match the assembler style. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2022-10-18 | Introduce <pointer_guard.h>, extracted from <sysdep.h> | Florian Weimer
This allows us to define a generic no-op version of PTR_MANGLE and PTR_DEMANGLE. In the future, we can use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources, avoiding an unintended loss of hardening due to missing include files or unlucky header inclusion ordering. In i386 and x86_64, we can avoid a <tls.h> dependency in the C code by using the computed constant from <tcb-offsets.h>. <sysdep.h> no longer includes these definitions, so there is no cyclic dependency anymore when computing the <tcb-offsets.h> constants. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
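As a rough illustration of why an always-defined pair of macros helps, the sketch below shows a mangling variant and a no-op fallback behind the same names. Everything here is an assumption for illustration: SKETCH_PTR_MANGLE, SKETCH_PTR_DEMANGLE, SKETCH_NO_POINTER_GUARD and the fixed guard value are hypothetical, and a plain XOR stands in for whatever each architecture really does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical guard value; a real guard is a per-process random secret.  */
static uintptr_t sketch_guard = (uintptr_t) 0x5bd1e995u;

#ifdef SKETCH_NO_POINTER_GUARD
/* Generic no-op fallback: the macros always exist, they just do nothing.  */
# define SKETCH_PTR_MANGLE(var) ((void) (var))
# define SKETCH_PTR_DEMANGLE(var) ((void) (var))
#else
/* XOR is its own inverse, so demangling reuses the same operation.  */
# define SKETCH_PTR_MANGLE(var) \
  ((var) = (__typeof (var)) ((uintptr_t) (var) ^ sketch_guard))
# define SKETCH_PTR_DEMANGLE(var) SKETCH_PTR_MANGLE (var)
#endif

/* Round-trips a pointer through mangle and demangle.  */
static void *
sketch_round_trip (void *p)
{
  void *stored = p;
  SKETCH_PTR_MANGLE (stored);    /* Form that would be kept in memory.  */
  SKETCH_PTR_DEMANGLE (stored);  /* Form recovered just before use.  */
  assert (stored == p);          /* The round trip must restore P.  */
  return stored;
}
```

Because both variants define the same two names, a caller that forgets a header gets a compile error instead of silently losing the hardening.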
2022-09-22 | Use '%z' instead of '%Z' on printf functions | Adhemerval Zanella Netto
The Z modifier is a nonstandard synonym for z (that predates z itself) and the compiler might issue a warning for an invalid conversion specifier. Reviewed-by: Florian Weimer <fweimer@redhat.com>
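For reference, a small example of the standard C99 z length modifier that replaces the old Z spelling; sketch_format_size is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Formats a size_t using the standard 'z' length modifier.
   Returns what snprintf returns: the length the full text needs.  */
static int
sketch_format_size (char *buf, size_t buflen, size_t value)
{
  assert (buf != NULL && buflen > 0);
  return snprintf (buf, buflen, "%zu", value);
}
```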
2022-08-30 | Apply asm redirections in wchar.h before first use | Raphael Moreira Zinsly
Similar to d0fa09a770, but for wchar.h. Fixes [BZ #27087] by applying all long double related asm redirections before using functions in bits/wchar2.h. Moves the function declarations from wcsmbs/bits/wchar2.h to a new file wcsmbs/bits/wchar2-decl.h that will be included first in wcsmbs/wchar.h. Tested with build-many-glibcs.py. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2022-08-01 | wcsmbs: Add missing test-c8rtomb/test-mbrtoc8 dependency | H.J. Lu
Make test-c8rtomb.out and test-mbrtoc8.out depend on $(gen-locales) for xsetlocale (LC_ALL, "de_DE.UTF-8"); xsetlocale (LC_ALL, "zh_HK.BIG5-HKSCS"); Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-08-01 | stdlib: Suppress gcc diagnostic that char8_t is a keyword in C++20 in uchar.h. | Tom Honermann
gcc 13 issues the following diagnostic for the uchar.h header when the -Wc++20-compat option is enabled in C++ modes that do not enable char8_t as a builtin type (C++17 and earlier by default; subject to _GNU_SOURCE and the gcc -f[no-]char8_t option). warning: identifier ‘char8_t’ is a keyword in C++20 [-Wc++20-compat] This change modifies the uchar.h header to suppress the diagnostic through the use of '#pragma GCC diagnostic' directives for gcc 10 and later (the -Wc++20-compat option was added in gcc version 10). Unfortunately, a bug in gcc currently prevents those directives from having the intended effect as reported at https://gcc.gnu.org/PR106423. A patch for that issue has been submitted and is available in the email thread archive linked below. https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598736.html
2022-07-06 | stdlib: Tests for mbrtoc8, c8rtomb, and the char8_t typedef. | Tom Honermann
This change adds tests for the mbrtoc8 and c8rtomb functions adopted for C++20 via WG21 P0482R6 and for C2X via WG14 N2653, and for the char8_t typedef adopted for C2X from WG14 N2653. The tests for mbrtoc8 and c8rtomb specifically exercise conversion to and from Big5-HKSCS because of special cases that arise with that encoding. Big5-HKSCS defines some double byte sequences that convert to more than one Unicode code point. In order to test this, the locale dependencies for running tests under wcsmbs are expanded to include zh_HK.BIG5-HKSCS. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2022-07-06 | stdlib: Implement mbrtoc8, c8rtomb, and the char8_t typedef. | Tom Honermann
This change provides implementations for the mbrtoc8 and c8rtomb functions adopted for C++20 via WG21 P0482R6 and for C2X via WG14 N2653. It also provides the char8_t typedef from WG14 N2653. The mbrtoc8 and c8rtomb functions are declared in uchar.h in C2X mode or when the _GNU_SOURCE macro or C++20 __cpp_char8_t feature test macro is defined. The char8_t typedef is declared in uchar.h in C2X mode or when the _GNU_SOURCE macro is defined and the C++20 __cpp_char8_t feature test macro is not defined (if __cpp_char8_t is defined, then char8_t is a builtin type). Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2022-05-23 | locale: Add more cached data to LC_CTYPE | Florian Weimer
This data will be used in number formatting. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2022-05-23 | locale: Remove private union from struct __locale_data | Florian Weimer
This avoids an alias violation later. This commit also fixes an incorrect double-checked locking idiom in _nl_init_era_entries. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2022-05-23 | locale: Remove cleanup function pointer from struct __localedata | Florian Weimer
We can call the cleanup functions directly from _nl_unload_locale if we pass the category to it. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2022-05-13 | wcrtomb: Make behavior POSIX compliant | Siddhesh Poyarekar
The GNU implementation of wcrtomb assumes that there are at least MB_CUR_MAX bytes available in the destination buffer passed to wcrtomb as the first argument. This is not compatible with the POSIX definition, which only requires enough space for the input wide character. This does not break much in practice because when users supply buffers smaller than MB_CUR_MAX (e.g. in ncurses), they compute and dynamically allocate the buffer, which results in enough spare space (thanks to usable_size in malloc and padding in alloca) that no actual buffer overflow occurs. However when the code is built with _FORTIFY_SOURCE, it runs into the hard check against MB_CUR_MAX in __wcrtomb_chk and hence fails. It wasn't evident until now since dynamic allocations would result in wcrtomb not being fortified but since _FORTIFY_SOURCE=3, that limitation is gone, resulting in such code failing. To fix this problem, introduce an internal buffer that is MB_LEN_MAX long and use that to perform the conversion and then copy the resultant bytes into the destination buffer. Also move the fortification check into the main implementation, which checks the result after conversion and aborts if the resultant byte count is greater than the destination buffer size. One complication is that applications that assume the MB_CUR_MAX limitation to be gone may not be able to run safely on older glibcs if they use static destination buffers smaller than MB_CUR_MAX; dynamic allocations will always have enough spare space that no actual overruns will occur. One alternative to fixing this is to bump symbol version to prevent them from running on older glibcs but that seems too strict a constraint. Instead, since these users will only have made this decision on reading the manual, I have put a note in the manual warning them about the pitfalls of having static buffers smaller than MB_CUR_MAX and running them on older glibc. 
Benchmarking: The wcrtomb microbenchmark shows significant increases in maximum execution time for all locales, ranging from 10x for ar_SA.UTF-8 to 1.5x-2x for nearly everything else. The mean execution time however saw practically no impact, with some results even being quicker, indicating that cache locality has a much bigger role in the overhead. Given that the additional copy uses a temporary buffer inside wcrtomb, it's likely that a hot path will end up putting that buffer (which is responsible for the additional overhead) in a similar place on stack, giving the necessary cache locality to negate the overhead. However in situations where wcrtomb ends up getting called at wildly different spots on the call stack (or is on different call stacks, e.g. with threads or different execution contexts) and is still a hotspot, the performance lag will be visible. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
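The approach described in this commit (convert into an internal MB_LEN_MAX buffer, check the byte count against the destination size, then copy) can be sketched as a wrapper around the public wcrtomb. This is only an illustration under stated assumptions: sketch_wcrtomb and its explicit dstlen parameter are hypothetical, and abort stands in for the fortification failure path.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical wrapper: converts WC into a scratch buffer of MB_LEN_MAX
   bytes, then copies only the bytes actually produced into DST.  */
static size_t
sketch_wcrtomb (char *dst, size_t dstlen, wchar_t wc, mbstate_t *ps)
{
  char scratch[MB_LEN_MAX];
  size_t n = wcrtomb (scratch, wc, ps);

  if (n == (size_t) -1)
    return n;                 /* Encoding error; wcrtomb has set errno.  */
  if (n > dstlen)
    abort ();                 /* Stand-in for the fortified failure path.  */
  memcpy (dst, scratch, n);
  return n;
}
```

With this shape the destination only has to hold the bytes for the character being converted, which is the POSIX requirement the commit message refers to.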
2022-01-12 | debug: Synchronize feature guards in fortified functions [BZ #28746] | Siddhesh Poyarekar
Some functions (e.g. stpcpy, pread64, etc.) had moved to POSIX in the main headers as they got incorporated into the standard, but their fortified variants remained under __USE_GNU. As a result, these functions did not get fortified when _GNU_SOURCE was not defined. Add test wrappers that check all functions tested in tst-chk0 at all levels with _GNU_SOURCE undefined and then use the failures to (1) exclude checks for _GNU_SOURCE functions in these tests and (2) Fix feature macro guards in the fortified function headers so that they're the same as the ones in the main headers. This fixes BZ #28746. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2022-01-01 | Update copyright dates with scripts/update-copyrights | Paul Eggert
I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted of lines saying "FOO: warning: copyright statement not found" for each of 7061 files FOO. I then removed trailing white space from math/tgmath.h, support/tst-support-open-dev-null-range.c, and sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following obscure pre-commit check failure diagnostics from Savannah. I don't know why I run into these diagnostics whereas others evidently do not. remote: *** 912-#endif remote: *** 913: remote: *** 914- remote: *** error: lines with trailing whitespace found ... remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
2021-11-10 | Support C2X printf %b, %B | Joseph Myers
C2X adds a printf %b format (see <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2630.pdf>, accepted for C2X), for outputting integers in binary. It also has recommended practice for a corresponding %B format (like %b, but %#B starts the output with 0B instead of 0b). Add support for these formats to glibc. One existing test uses %b as an example of an unknown format, to test how glibc printf handles unknown formats; change that to %v. Use of %b and %B as user-registered format specifiers continues to work (and we already have a test that covers that, tst-printfsz.c). Note that C2X also has scanf %b support, plus support for binary constants starting 0b in strtol (base 0 and 2) and scanf %i (strtol base 0 and scanf %i coming from a previous paper that added binary integer literals). I intend to implement those features in a separate patch or patches; as discussed in the thread starting at <https://sourceware.org/pipermail/libc-alpha/2020-December/120414.html>, they will be more complicated because they involve adding extra public symbols to ensure compatibility with existing code that might not expect 0b constants to be handled by strtol base 0 and 2 and scanf %i, whereas simply adding a new format specifier poses no such compatibility concerns. Note that the actual conversion from integer to string uses existing code in _itoa.c. That code has special cases for bases 8, 10 and 16, probably so that the compiler can optimize division by an integer constant in the code for those bases. If desired such special cases could easily be added for base 2 as well, but that would be an optimization, not actually needed for these printf formats to work. Tested for x86_64 and x86. Also tested with build-many-glibcs.py for aarch64-linux-gnu with GCC mainline to make sure that the test does indeed build with GCC 12 (where format checking warnings are enabled for most of the test).
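To make the output format concrete without depending on a C library that already supports %b, here is a hypothetical helper that produces the text %#b is specified to produce for an unsigned int: binary digits with a 0b prefix when the value is nonzero. It is not the _itoa.c code mentioned above; sketch_format_binary is an assumed name.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Writes VALUE in binary into BUF, prefixed with "0b" when nonzero.
   Returns the string length, or 0 if BUF is too small.  */
static size_t
sketch_format_binary (char *buf, size_t buflen, unsigned int value)
{
  char digits[sizeof (unsigned int) * CHAR_BIT];
  size_t ndigits = 0;
  int nonzero = value != 0;

  assert (buf != NULL);

  /* Generate digits least-significant first.  */
  do
    digits[ndigits++] = (char) ('0' + (value & 1u));
  while ((value >>= 1) != 0);

  if (buflen < (nonzero ? 2u : 0u) + ndigits + 1)
    return 0;

  size_t pos = 0;
  if (nonzero)
    {
      buf[pos++] = '0';
      buf[pos++] = 'b';
    }
  /* Emit the digits most-significant first.  */
  while (ndigits > 0)
    buf[pos++] = digits[--ndigits];
  buf[pos] = '\0';
  return pos;
}
```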
2021-10-20 | Make sure that the fortified function conditionals are constant | Siddhesh Poyarekar
In _FORTIFY_SOURCE=3, the size expression may be non-constant, resulting in branches in the inline functions remaining intact and causing a tiny overhead. Clang (and in future, gcc) make sure that the -1 case is always safe, i.e. any comparison of the generated expression with (size_t)-1 is always false so that bit is taken care of. The rest is avoidable since we want the _chk variant whenever we have a size expression and it's not -1. Rework the conditionals in a uniform way to clearly indicate two conditions at compile time: - Either the size is unknown (-1) or we know at compile time that the operation length is less than the object size. We can call the original function in this case. It could be that either the length, object size or both are non-constant, but the compiler, through range analysis, is able to fold the *comparison* to a constant. - The size and length are known and the compiler can see at compile time that operation length > object size. This is valid grounds for a warning at compile time, followed by emitting the _chk variant. For everything else, emit the _chk variant. This simplifies most of the fortified function implementations and at the same time, ensures that only one call from _chk or the regular function is emitted. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
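The dispatch described above can be approximated with a macro. This is a simplified sketch, not the glibc headers: sketch_memcpy, sketch_memcpy_chk and SKETCH_OBJSIZE are hypothetical names, the compile-time warning branch is left out, and __builtin_object_size is used where _FORTIFY_SOURCE=3 would use __builtin_dynamic_object_size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical checked variant: verifies the length at run time.  */
static void *
sketch_memcpy_chk (void *dst, const void *src, size_t n, size_t dstsize)
{
  if (n > dstsize)
    abort ();
  return memcpy (dst, src, n);
}

#define SKETCH_OBJSIZE(p) __builtin_object_size (p, 0)

/* Call plain memcpy when the size is unknown (-1) or when the compiler can
   fold "n <= size" to a constant that is true; otherwise call the checked
   variant.  N may be evaluated more than once, so pass an expression
   without side effects.  */
#define sketch_memcpy(dst, src, n) \
  ((SKETCH_OBJSIZE (dst) == (size_t) -1 \
    || (__builtin_constant_p ((n) <= SKETCH_OBJSIZE (dst)) \
        && (n) <= SKETCH_OBJSIZE (dst))) \
   ? memcpy (dst, src, n) \
   : sketch_memcpy_chk (dst, src, n, SKETCH_OBJSIZE (dst)))
```

Since the condition is built only from builtins that the compiler folds, an optimizing build is expected to keep just one of the two calls.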
2021-09-03 | Remove "Contributed by" lines | Siddhesh Poyarekar
We stopped adding "Contributed by" or similar lines in sources in 2012 in favour of git logs and keeping the Contributors section of the glibc manual up to date. Removing these lines makes the license header a bit more consistent across files and also removes the possibility of error in attribution when license blocks or files are copied across since the contributed-by lines don't actually reflect reality in those cases. Move all "Contributed by" and similar lines (Written by, Test by, etc.) into a new file CONTRIBUTED-BY to retain record of these contributions. These contributors are also mentioned in manual/contrib.texi, so we just maintain this additional record as a courtesy to the earlier developers. The following scripts were used to filter a list of files to edit in place and to clean up the CONTRIBUTED-BY file respectively. These were not added to the glibc sources because they're not expected to be of any use in future given that this is a one time task: https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-05-16 | Enable support for GCC 11 -Wmismatched-dealloc. | Martin Sebor
To help detect common kinds of memory (and other resource) management bugs, GCC 11 adds support for the detection of mismatched calls to allocation and deallocation functions. At each call site to a known deallocation function GCC checks the set of allocation functions the former can be paired with and, if the two don't match, issues a -Wmismatched-dealloc warning (something similar happens in C++ for mismatched calls to new and delete). GCC also uses the same mechanism to detect attempts to deallocate objects not allocated by any allocation function (or pointers past the first byte into allocated objects) by -Wfree-nonheap-object. This support is enabled for built-in functions like malloc and free. To extend it beyond those, GCC extends attribute malloc to designate a deallocation function to which pointers returned from the allocation function may be passed to deallocate the allocated objects. Another, optional argument designates the positional argument to which the pointer must be passed. This change is the first step in enabling this extended support for Glibc.
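A minimal sketch of the extended attribute follows, pairing a hypothetical allocator with its deallocator. The names sketch_acquire, sketch_release, SKETCH_ATTR_DEALLOC and SKETCH_NOINLINE are assumptions for illustration; the two-argument form of attribute malloc is only enabled here for GCC 11 or later, and the macro expands to nothing elsewhere.

```c
#include <assert.h>
#include <stdlib.h>

#if defined __GNUC__ && !defined __clang__ && __GNUC__ >= 11
/* FN is the matching deallocator; ARGNO is the position of the pointer
   argument in FN.  */
# define SKETCH_ATTR_DEALLOC(fn, argno) __attribute__ ((malloc (fn, argno)))
#else
# define SKETCH_ATTR_DEALLOC(fn, argno)
#endif

#ifdef __GNUC__
/* Keep the pair out of line so the checker sees the wrappers and not
   the malloc and free calls inside them.  */
# define SKETCH_NOINLINE __attribute__ ((noinline))
#else
# define SKETCH_NOINLINE
#endif

void sketch_release (void *p);

/* Pointers returned by sketch_acquire should go to sketch_release.  */
SKETCH_ATTR_DEALLOC (sketch_release, 1)
void *sketch_acquire (size_t size);

SKETCH_NOINLINE void *
sketch_acquire (size_t size)
{
  return malloc (size);
}

SKETCH_NOINLINE void
sketch_release (void *p)
{
  free (p);
}
```

With the attribute in place, passing the result of sketch_acquire straight to free is the kind of call GCC 11 can flag with -Wmismatched-dealloc.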
2021-01-02 | Update copyright dates with scripts/update-copyrights | Paul Eggert
I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted of lines saying "FOO: warning: copyright statement not found" for each of 6694 files FOO. I then removed trailing white space from benchtests/bench-pthread-locks.c and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this diagnostic from Savannah: remote: *** pre-commit check failed ... remote: *** error: lines with trailing whitespace found remote: error: hook declined to update refs/heads/master
2020-12-31 | nonstring: Enable __FORTIFY_LEVEL=3 | Siddhesh Poyarekar
Use __builtin_dynamic_object_size in the remaining functions that don't have compiler builtins as is the case for string functions.
2020-12-08 | Make strtoimax, strtoumax, wcstoimax, wcstoumax into aliases | Joseph Myers
The functions strtoimax, strtoumax, wcstoimax, wcstoumax currently have three implementations each (wordsize-32, wordsize-64 and dummy implementation in stdlib/ using #error), defining the functions as thin wrappers round corresponding *_internal functions. Simplify the code by changing them into aliases of functions such as strtol and wcstoull. This is more consistent with how e.g. imaxdiv is handled. Tested for x86_64 and x86.
2020-06-01 | mbstowcs: Document, test, and fix null pointer dst semantics (Bug 25219) | Carlos O'Donell
The function mbstowcs, by an XSI extension to POSIX, accepts a null pointer for the destination wchar_t array. This API behaviour allows you to use the function to compute the length of the required wchar_t array, i.e. it does the conversion without storing it and returns the number of wide characters required. We remove the __write_only__ markup for the first argument because it is not true, since the destination may be a null pointer, and so the length argument may not apply. We remove the markup because otherwise the new test case cannot be compiled with -Werror=nonnull. We add a new test case for mbstowcs which exercises the null pointer destination behaviour, which we have now explicitly documented. The mbsrtowcs and mbsnrtowcs functions behave similarly, and mbsrtowcs is documented as doing this in C11, even if the standard doesn't come out and call out this specific use case. We add one note to each of mbsrtowcs and mbsnrtowcs to call out that they support a null pointer for the destination. The wcsrtombs function behaves similarly but in the other direction and allows you to use a null destination pointer to compute how many bytes you would need to convert the wide character input. We document this particular case also, but leave wcsnrtombs as a reference to wcsrtombs, so the reader must still read the details of the semantics for wcsrtombs.
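The length-computing use of a null destination looks roughly like this in practice; sketch_to_wide is a hypothetical helper, and the conversion follows the current locale (the "C" locale if setlocale has not been called).

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Converts the multibyte string S into a newly allocated wide string.
   Returns NULL on conversion or allocation failure; the caller frees.  */
static wchar_t *
sketch_to_wide (const char *s)
{
  assert (s != NULL);

  /* With a null destination, mbstowcs only counts the wide characters
     needed, not including the terminating null.  */
  size_t n = mbstowcs (NULL, s, 0);
  if (n == (size_t) -1)
    return NULL;

  wchar_t *w = malloc ((n + 1) * sizeof *w);
  if (w == NULL)
    return NULL;

  /* Second pass stores the characters and the terminator.  */
  mbstowcs (w, s, n + 1);
  return w;
}
```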
2020-04-30 | Rename __LONG_DOUBLE_USES_FLOAT128 to __LDOUBLE_REDIRECTS_TO_FLOAT128_ABI | Paul E. Murphy
Improve the commentary to aid future developers who will stumble upon this novel, yet not always perfect, mechanism to support alternative formats for long double. Likewise, rename __LONG_DOUBLE_USES_FLOAT128 to __LDOUBLE_REDIRECTS_TO_FLOAT128_ABI now that development work has settled down. The command used was git grep -l __LONG_DOUBLE_USES_FLOAT128 ':!./ChangeLog*' | \ xargs sed -i 's/__LONG_DOUBLE_USES_FLOAT128/__LDOUBLE_REDIRECTS_TO_FLOAT128_ABI/g' Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-02-17 | Prepare redirections for IEEE long double on powerpc64le | Gabriel F. T. Gomes
All functions that have a format string, which can consume a long double argument, must have one version for each long double format supported on a platform. On powerpc64le, these functions currently have two versions (i.e.: long double with the same format as double, and long double with IBM Extended Precision format). Support for a third long double format option (i.e. long double with IEEE long double format) is being prepared and all the aforementioned functions now have a third version (not yet exported on the master branch, but the code is in). For these functions to get selected (during build time), references to them in user programs (or dependent libraries) must get redirected to the aforementioned new versions of the functions. This patch installs the header magic required to perform such redirections. Notice, however, that since the redirections only happen when __LONG_DOUBLE_USES_FLOAT128 is set to 1, and no platform (including powerpc64le) currently does it, no redirections actually happen. Redirections and the exporting of the new functions will happen at the same time (when powerpc64le adds ldbl-128ibm-compat to their Implies). Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com> Reviewed-by: Paul E. Murphy <murphyp@linux.vnet.ibm.com>
2020-01-01 | Update copyright dates with scripts/update-copyrights. | Joseph Myers
2019-11-22 | Use DEPRECATED_SCANF macro for remaining C99-compliant scanf functions | Gabriel F. T. Gomes
When commit 03992356e6fedc5a5e9d32df96c1a2c79ea28a8f (Author: Zack Weinberg <zackw@panix.com>, Date: Sat Feb 10 11:58:35 2018 -0500, "Use C99-compliant scanf under _GNU_SOURCE with modern compilers.") added the DEPRECATED_SCANF macro to select when redirections of *scanf functions to their ISO C99 compliant versions should happen, it accidentally missed doing it for vfwscanf, vwscanf, and vswscanf. Tested for powerpc64le and with build-many-glibcs (i686-linux-gnu and nios2-linux-gnu are failing with current master, and with this patch, but I didn't see a regression). Change-Id: I706b344a3fb50be017cdab9251d9da18a3ba8c60
2019-09-07 | Prefer https to http for gnu.org and fsf.org URLs | Paul Eggert
Also, change sources.redhat.com to sourceware.org. This patch was automatically generated by running the following shell script, which uses GNU sed, and which avoids modifying files imported from upstream: sed -ri ' s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g ' \ $(find $(git ls-files) -prune -type f \ ! -name '*.po' \ ! -name 'ChangeLog*' \ ! -path COPYING ! -path COPYING.LIB \ ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \ ! -path manual/texinfo.tex ! -path scripts/config.guess \ ! -path scripts/config.sub ! -path scripts/install-sh \ ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \ ! -path INSTALL ! -path locale/programs/charmap-kw.h \ ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \ ! '(' -name configure \ -execdir test -f configure.ac -o -f configure.in ';' ')' \ ! '(' -name preconfigure \ -execdir test -f preconfigure.ac ';' ')' \ -print) and then by running 'make dist-prepare' to regenerate files built from the altered files, and then executing the following to cleanup: chmod a+x sysdeps/unix/sysv/linux/riscv/configure # Omit irrelevant whitespace and comment-only changes, # perhaps from a slightly-different Autoconf version. git checkout -f \ sysdeps/csky/configure \ sysdeps/hppa/configure \ sysdeps/riscv/configure \ sysdeps/unix/sysv/linux/csky/configure # Omit changes that caused a pre-commit check to fail like this: # remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines git checkout -f \ sysdeps/powerpc/powerpc64/ppc-mcount.S \ sysdeps/unix/sysv/linux/s390/s390-64/syscall.S # Omit change that caused a pre-commit check to fail like this: # remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
2019-07-31 | iconv: Revert steps array reference counting changes | Florian Weimer
The changes introduce a memory leak for gconv steps arrays whose first element is an internal conversion, which has a fixed reference count which is not decremented. As a result, after the change in commit 50ce3eae5ba304650459d4441d7d246a7cefc26f, the steps array is never freed, resulting in an unbounded memory leak. This reverts commit 50ce3eae5ba304650459d4441d7d246a7cefc26f ("gconv: Check reference count in __gconv_release_cache [BZ #24677]") and commit 7e740ab2e7be7d83b75513aa406e0b10875f7f9c ("libio: Fix gconv-related memory leak [BZ #24583]"). It reintroduces bug 24583. (Bug 24677 was just a regression caused by the second commit.)
2019-05-21 | wcsmbs: Fix data race in __wcsmbs_clone_conv [BZ #24584] | Florian Weimer
This also adds an overflow check and documents the synchronization requirement in <gconv.h>.
2019-05-21 | libio: Fix gconv-related memory leak [BZ #24583] | Florian Weimer
struct gconv_fcts for the C locale is statically allocated, and __gconv_close_transform deallocates the steps object. Therefore this commit introduces __wcsmbs_close_conv to avoid freeing the statically allocated steps objects.
2019-04-04 | wcsmbs: Use loop_unroll on wcsrchr | Adhemerval Zanella
This allows an architecture to set explicit loop unrolling. Checked on aarch64-linux-gnu. * wcsmbs/wcsrchr.c (WCSRCHR): Use loop_unroll.h to parametrize the loop unroll.
2019-04-04 | wcsmbs: Use loop_unroll on wcschr | Adhemerval Zanella
This allows an architecture to set explicit loop unrolling. Checked on aarch64-linux-gnu. * wcsmbs/wcschr.c (WCSCHR): Use loop_unroll.h to parametrize the loop unroll.
2019-04-04 | wcsmbs: Add wcscpy loop unroll option | Adhemerval Zanella
This allows an architecture to use the old generic implementation and also set explicit loop unrolling. Checked on aarch64-linux-gnu. * include/loop_unroll.h: New file. * wcsmbs/wcscpy (__wcscpy): Add option to use loop unrolling besides generic implementation.
2019-02-27 | wcsmbs: optimize wcsnlen | Adhemerval Zanella
This patch rewrites wcsnlen using wmemchr. The generic wmemchr already uses the strategy (loop unrolling and tail handling), and using it allows architectures that have an optimized wmemchr (s390 and x86_64) to optimize wcsnlen as well. Checked on x86_64-linux-gnu. * wcsmbs/wcsnlen.c (__wcsnlen): Rewrite using wmemchr.
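Expressed with public functions only, the rewrite amounts to something like the sketch below; sketch_wcsnlen is a hypothetical name, not the glibc symbol.

```c
#include <assert.h>
#include <wchar.h>

/* Length of S, looking at no more than MAXLEN wide characters.  */
static size_t
sketch_wcsnlen (const wchar_t *s, size_t maxlen)
{
  assert (s != NULL);
  /* wmemchr finds the terminator, if there is one within MAXLEN.  */
  const wchar_t *nul = wmemchr (s, L'\0', maxlen);
  return nul != NULL ? (size_t) (nul - s) : maxlen;
}
```

All of the searching is delegated to wmemchr, so any architecture-specific wmemchr speeds up this function too.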
2019-02-27 | wcsmbs: optimize wcsncpy | Adhemerval Zanella
This patch rewrites wcsncpy using wcsnlen, wmemset, and wmemcpy. This is similar to the optimization done on strncpy by f6482cf29d and 6423d4754c. Checked on x86_64-linux-gnu. * wcsmbs/wcsncpy.c (__wcsncpy): Rewrite using wcsnlen, wmemset, and wmemcpy.
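A sketch of the same composition using public functions; sketch_wcsncpy and the local sketch_bounded_len helper (a stand-in for wcsnlen, written with wmemchr so the example needs only ISO C) are hypothetical names.

```c
#include <assert.h>
#include <wchar.h>

/* Stand-in for wcsnlen: length of S, capped at MAXLEN.  */
static size_t
sketch_bounded_len (const wchar_t *s, size_t maxlen)
{
  const wchar_t *nul = wmemchr (s, L'\0', maxlen);
  return nul != NULL ? (size_t) (nul - s) : maxlen;
}

/* Copies at most N wide characters from SRC to DEST, padding with null
   wide characters when SRC is shorter than N.  */
static wchar_t *
sketch_wcsncpy (wchar_t *dest, const wchar_t *src, size_t n)
{
  assert (dest != NULL && src != NULL);
  size_t len = sketch_bounded_len (src, n);
  if (len != n)
    wmemset (dest + len, L'\0', n - len);   /* Pad the tail.  */
  return wmemcpy (dest, src, len);          /* wmemcpy returns DEST.  */
}
```

As with wcsncpy itself, the result is not null-terminated when SRC has N or more characters.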
2019-02-27 | wcsmbs: optimize wcsncat | Adhemerval Zanella
This patch rewrites wcsncat using wcslen, wcsnlen, and wmemcpy. This is similar to the optimization done on strncat by 3eb38795db and e80514b5a8. Checked on x86_64-linux-gnu. * wcsmbs/wcsncat.c (wcsncat): Rewrite using wcslen, wcsnlen, and wmemcpy.
2019-02-27 | wcsmbs: optimize wcscpy | Adhemerval Zanella
This patch rewrites wcscpy using wcslen and wmemcpy. This is similar to the optimization done on strcpy by b863d2bc4d. Checked on x86_64-linux-gnu. * wcsmbs/wcscpy.c (__wcpcpy): Rewrite using wcslen and wmemcpy.
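In terms of public functions the idea is a length computation followed by one bulk copy; sketch_wcscpy is a hypothetical name.

```c
#include <assert.h>
#include <wchar.h>

/* Copies SRC, including its terminating null wide character, to DEST.  */
static wchar_t *
sketch_wcscpy (wchar_t *dest, const wchar_t *src)
{
  assert (dest != NULL && src != NULL);
  /* The "+ 1" copies the terminator along with the characters.  */
  return wmemcpy (dest, src, wcslen (src) + 1);
}
```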
2019-02-27 | wcsmbs: optimize wcscat | Adhemerval Zanella
This patch rewrites wcscat using wcslen and wcscpy. This is similar to the optimization done on strcat by 6e46de42fe. The strcpy changes are mainly to add the internal alias to avoid PLT calls. Checked on x86_64-linux-gnu and a build against the affected architectures. * include/wchar.h (__wcscpy): New prototype. * sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy-ppc32.c (__wcscpy): Route internal symbol to generic implementation. * sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy.c (wcscpy): Add internal __wcscpy alias. * sysdeps/powerpc/powerpc64/multiarch/wcscpy.c (wcscpy): Likewise. * sysdeps/s390/wcscpy.c (wcscpy): Likewise. * sysdeps/x86_64/multiarch/wcscpy.c (wcscpy): Likewise. * wcsmbs/wcscpy.c (wcscpy): Add * sysdeps/x86_64/multiarch/wcscpy-c.c (WCSCPY): Adjust macro to use generic implementation. * wcsmbs/wcscat.c (wcscat): Rewrite using wcslen and wcscpy.
2019-02-27 | wcsmbs: optimize wcpncpy | Adhemerval Zanella
This patch rewrites wcpncpy using wcslen, wmemcpy, and wmemset. This is similar to the optimization done on stpncpy by 48497aba8e. Checked on x86_64-linux-gnu. * wcsmbs/wcpncpy.c (__wcpcpy): Rewrite using wcslen, wmemcpy, and wmemset.
2019-02-27 | wcsmbs: optimize wcpcpy | Adhemerval Zanella
This patch rewrites wcpcpy using wcslen and wmemcpy. This is similar to the optimization done on stpcpy by f559d8cf29. Checked on x86_64-linux-gnu and string tests on a simulated m68k-linux-gnu. * sysdeps/m68k/wcpcpy.c: Remove file. * wcsmbs/wcpcpy.c (__wcpcpy): Rewrite using wcslen and wmemcpy.
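A sketch of that composition with public functions only; sketch_wcpcpy is a hypothetical name. The only difference from a wcscpy-style copy is the return value, which points at the terminator written into DEST.

```c
#include <assert.h>
#include <wchar.h>

/* Copies SRC to DEST and returns a pointer to the terminating null wide
   character in DEST, which makes chained appends cheap.  */
static wchar_t *
sketch_wcpcpy (wchar_t *dest, const wchar_t *src)
{
  assert (dest != NULL && src != NULL);
  size_t len = wcslen (src);
  return wmemcpy (dest, src, len + 1) + len;
}
```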
2019-02-04 | Fix handling of collating elements in fnmatch (bug 17396, bug 16976) | Andreas Schwab
This fixes the same bug in fnmatch that was fixed by commit 7e2f0d2d77 for regexp matching. As a side effect it also removes the use of an unbound VLA.
2019-01-03 | Use C99-compliant scanf under _GNU_SOURCE with modern compilers. | Zack Weinberg
The only difference between noncompliant and C99-compliant scanf is that the former accepts the archaic GNU extension '%as' (also %aS and %a[...]) meaning to allocate space for the input string with malloc. This extension conflicts with C99's use of %a as a format _type_ meaning to read a floating-point number; POSIX.1-2008 standardized equivalent functionality using the modifier letter 'm' instead (%ms, %mS, %m[...]). The extension was already disabled in most conformance modes: specifically, any mode that doesn't involve _GNU_SOURCE and _does_ involve either strict conformance to C99 or loose conformance to both C99 and POSIX.1-2001 would get the C99-compliant scanf. With compilers new enough to use -std=gnu11 instead of -std=gnu89, or equivalent, that includes the default mode. With this patch, we now provide C99-compliant scanf in all configurations except when _GNU_SOURCE is defined *and* __STDC_VERSION__ or __cplusplus (whichever is relevant) indicates C89/C++98. This leaves the old scanf available under e.g. -std=c89 -D_GNU_SOURCE, but removes it from e.g. -std=gnu11 -D_GNU_SOURCE (it was already not present under -std=gnu11 without -D_GNU_SOURCE) and from -std=gnu89 without -D_GNU_SOURCE. There needs to be an internal override so we can compile the noncompliant scanf itself. This is the same problem we had when we removed 'gets' from _GNU_SOURCE and it's dealt with the same way: there's a new __GLIBC_USE symbol, DEPRECATED_SCANF, which defaults to off under the appropriate conditions for external code, but can be overridden by individual files within stdio. We also run into problems with PLT bypass for internal uses of sscanf, because libc_hidden_proto uses __REDIRECT and so does the logic in stdio.h for choosing which implementation of scanf to use; __REDIRECT isn't transitive, so include/stdio.h needs to bridge the gap with a macro. As far as I can tell, sscanf is the only function in this family that's internally called by unrelated code. 
Finally, there are several tests in stdio-common that use the extension. bug21.c is a regression test for a crash; it still exercises the relevant code when changed to use %ms instead of %as. scanf14.c through scanf17.c are more complicated since they are actually testing the subtleties of the extension - under what circumstances is 'a' treated as a modifier letter, etc. I changed all of them to use %ms instead of %as as well, but duplicated scanf14.c and scanf16.c as scanf14a.c and scanf16a.c. These still use %as and are compiled with -std=gnu89 to access the old extension. A bunch of diagnostic overrides and manual workarounds for the old stdio.h behavior become unnecessary. Yay! * include/features.h (__GLIBC_USE_DEPRECATED_SCANF): New __GLIBC_USE parameter. Only use deprecated scanf when __USE_GNU is defined and __STDC_VERSION__ is less than 199901L or __cplusplus is less than 201103L, whichever is relevant for the language being compiled. * libio/stdio.h, libio/bits/stdio-ldbl.h: Decide whether to redirect scanf, fscanf, sscanf, vscanf, vfscanf, and vsscanf to their __isoc99_ variants based only on __GLIBC_USE (DEPRECATED_SCANF). * wcsmbs/wchar.h: wcsmbs/bits/wchar-ldbl.h: Likewise for wscanf, fwscanf, swscanf, vwscanf, vfwscanf, and vswscanf. 
* libio/iovsscanf.c * libio/fwscanf.c * libio/iovswscanf.c * libio/swscanf.c * libio/vscanf.c * libio/vwscanf.c * libio/wscanf.c * stdio-common/fscanf.c * stdio-common/scanf.c * stdio-common/vfscanf.c * stdio-common/vfwscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-compat.c * sysdeps/ieee754/ldbl-opt/nldbl-fscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-fwscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-iovfscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-scanf.c * sysdeps/ieee754/ldbl-opt/nldbl-sscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-swscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vfscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vfwscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vsscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vswscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vwscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-wscanf.c: Override __GLIBC_USE_DEPRECATED_SCANF to 1. * stdio-common/sscanf.c: Likewise. Remove ldbl_hidden_def for __sscanf. * stdio-common/isoc99_sscanf.c: Add libc_hidden_def for __isoc99_sscanf. * include/stdio.h: Provide libc_hidden_proto for __isoc99_sscanf, not sscanf. [!__GLIBC_USE (DEPRECATED_SCANF)]: Define sscanf as __isoc99_sscanf with a preprocessor macro. * stdio-common/bug21.c, stdio-common/scanf14.c: Use %ms instead of %as, %mS instead of %aS, %m[] instead of %a[]; remove DIAG_IGNORE_NEEDS_COMMENT for -Wformat. * stdio-common/scanf16.c: Likewise. Add __attribute__ ((format (scanf))) to xscanf, xfscanf, xsscanf. * stdio-common/scanf14a.c: New copy of scanf14.c which still uses %as, %aS, %a[]. Remove DIAG_IGNORE_NEEDS_COMMENT for -Wformat. * stdio-common/scanf16a.c: New copy of scanf16.c which still uses %as, %aS, %a[]. Add __attribute__ ((format (scanf))) to xscanf, xfscanf, xsscanf. * stdio-common/scanf15.c, stdio-common/scanf17.c: No need to override feature selection macros or provide definitions of u_char etc. * stdio-common/Makefile (tests): Add scanf14a and scanf16a. (CFLAGS-scanf15.c, CFLAGS-scanf17.c): Remove. 
(CFLAGS-scanf14a.c, CFLAGS-scanf16a.c): New. Compile these files with -std=gnu89.
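To make the replacement for the old extension concrete, here is a minimal sketch of the POSIX.1-2008 'm' modifier that the tests were converted to. It assumes a libc whose sscanf supports %ms (glibc does); the helper name read_first_word is hypothetical and not part of the patch.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, for illustration only.  Read the first
   whitespace-delimited word from INPUT.  The 'm' modifier asks
   sscanf to allocate the buffer with malloc, so the caller must
   free the result.  Returns NULL if no word could be read.  */
char *
read_first_word (const char *input)
{
  char *word = NULL;

  /* "%ms" is the standardized spelling of what the GNU extension
     wrote as "%as".  */
  if (sscanf (input, "%ms", &word) != 1)
    return NULL;
  return word;
}
```

Calling read_first_word ("hello world") should return a heap-allocated copy of "hello"; input containing only whitespace yields NULL.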
2019-01-01  Update copyright dates with scripts/update-copyrights.  Joseph Myers
* All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise.
2018-12-05  Use SCANF_ISOC99_A instead of _IO_FLAGS2_SCANF_STD.  Zack Weinberg
Change the callers of __vfscanf_internal and __vfwscanf_internal that want C99-compliant behavior to communicate this via the new flags argument, rather than setting bits on the FILE object. This also means these functions do not need to do their own locking. Tested for powerpc and powerpc64le.
2018-12-05  Add __vfscanf_internal and __vfwscanf_internal with flags arguments.  Zack Weinberg
There are two flags currently defined: SCANF_LDBL_IS_DBL is the mode used by __nldbl_ scanf variants, and SCANF_ISOC99_A is the mode used by __isoc99_ scanf variants. In this patch, the new functions honor these flag bits if they're set, but they still also look at the corresponding bits of environmental state, and callers all pass zero. The new functions do *not* have the "errp" argument possessed by _IO_vfscanf and _IO_vfwscanf. All internal callers passed NULL for that argument. External callers could theoretically exist, so I preserved wrappers, but they are flagged as compat symbols and they don't preserve the three-way distinction among types of errors that was formerly exposed. These functions probably should have been in the list of deprecated _IO_ symbols in 2.27 NEWS -- they're not just aliases for vfscanf and vfwscanf. (It was necessary to introduce ldbl_compat_symbol for _IO_vfscanf. Please check that part of the patch very carefully, I am still not confident I understand all of the details of ldbl-opt.) This patch also introduces helper inlines in libio/strfile.h that encapsulate the process of initializing an _IO_strfile object for reading. This allows us to call __vfscanf_internal directly from sscanf, and __vfwscanf_internal directly from swscanf, without duplicating the initialization code. (Previously, they called their v-counterparts, but that won't work if we want to control *both* C99 mode and ldbl-is-dbl mode using the flags argument to __vfscanf_internal.) It's still a little awkward, especially for wide strfiles, but it's much better than what we had. Tested for powerpc and powerpc64le.
2018-10-22  Stop c32rtomb and mbrtoc32 aliasing wcrtomb and mbrtowc (bug 23793).  Joseph Myers
glibc does: /* There should be no difference between the UTF-32 handling required by c32rtomb and the wchar_t handling which has long since been implemented in wcrtomb. */ weak_alias (__wcrtomb, c32rtomb) /* There should be no difference between the UTF-32 handling required by mbrtoc32 and the wchar_t handling which has long since been implemented in mbrtowc. */ weak_alias (__mbrtowc, mbrtoc32) The reasoning in those comments to justify those aliases is incorrect: ISO C requires that, for the case of a NULL mbstate_t* being passed, each function has its *own* internal static mbstate_t. Thus a program must be able to use both wcrtomb and c32rtomb at the same time with each keeping its own separate state, and likewise for mbrtowc and mbrtoc32. This patch duly sets up separate char32_t functions that wrap the wchar_t ones. Note that the included test only covers the mbrtoc32 / mbrtowc pair. While I think the change made is logically correct for c32rtomb / wcrtomb as well, I'm not sure we have a locale with a suitable state-dependent multibyte encoding for testing that part of the change. Tested for x86_64. [BZ #23793] * wcsmbs/c32rtomb.c: New file. * wcsmbs/mbrtoc32.c: Likewise. * wcsmbs/tst-c32-state.c: Likewise. * wcsmbs/mbrtowc.c (mbrtoc32): Do not define as alias. * wcsmbs/wcrtomb.c (c32rtomb): Likewise. * wcsmbs/Makefile (routines): Add mbrtoc32 and c32rtomb. (tests): Add tst-c32-state. [$(run-built-tests) = yes] ($(objpfx)tst-c32-state.out): Depend on $(gen-locales).
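The wrapper idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of wcsmbs/mbrtoc32.c: the name sketch_mbrtoc32 is hypothetical, and the code assumes wchar_t values are UTF-32 code points, as they are on glibc.

```c
#include <stddef.h>
#include <uchar.h>
#include <wchar.h>

/* Hypothetical wrapper, for illustration only.  It delegates to
   mbrtowc but keeps its own static conversion state, so a caller
   passing PS == NULL never shares state with mbrtowc itself.  */
size_t
sketch_mbrtoc32 (char32_t *pc32, const char *s, size_t n, mbstate_t *ps)
{
  /* Private state, used only when the caller passes PS == NULL.  */
  static mbstate_t own_state;
  wchar_t wc = 0;
  size_t ret;

  if (ps == NULL)
    ps = &own_state;

  ret = mbrtowc (&wc, s, n, ps);

  /* Store a result only for a completed conversion, and never when
     S is NULL, because the output pointer is ignored in that case.  */
  if (s != NULL && pc32 != NULL
      && ret != (size_t) -1 && ret != (size_t) -2)
    *pc32 = (char32_t) wc;
  return ret;
}
```

With a weak alias, both public names would resolve to one function and therefore one static state object; a separate function body is what gives each entry point its own.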
2018-10-19  Handle surrogate pairs in c16rtomb (bug 23794, DR#488, C2X).  Joseph Myers
The c16rtomb implementation has: // XXX The ISO C 11 spec I have does not say anything about handling // XXX surrogates in this interface. The DR#488 resolution, as applied to C2X, requires surrogate pairs to be handled here (so the first call returns 0 and stores the high surrogate in the mbstate_t, while the second call combines the surrogates, produces a multibyte character and returns the number of bytes written). This patch implements that. (mbrtoc16 already handled producing surrogates as output.) Tested for x86_64. [BZ #23794] * wcsmbs/c16rtomb.c (c16rtomb): Save first character of surrogate pair and return 0 in that case, and use saved character to interpret following character. * wcsmbs/tst-c16-surrogate.c: New file. * wcsmbs/Makefile (tests): Add tst-c16-surrogate.c. [$(run-built-tests) = yes] ($(objpfx)tst-c16-surrogate.out): Depend on $(gen-locales)
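The two-call protocol described above can be sketched as follows. This is a simplified illustration, not the glibc implementation: struct c16_state, encode_utf8 and sketch_c16_to_utf8 are hypothetical names, the output encoding is assumed to be UTF-8, and glibc keeps the saved surrogate inside mbstate_t rather than in a separate struct.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical conversion state, for illustration only.  */
struct c16_state
{
  uint16_t pending_high;	/* Saved high surrogate, or 0 if none.  */
};

/* Encode code point CP as UTF-8 into OUT; return the byte count.  */
static size_t
encode_utf8 (char *out, uint32_t cp)
{
  if (cp < 0x80)
    {
      out[0] = (char) cp;
      return 1;
    }
  if (cp < 0x800)
    {
      out[0] = (char) (0xC0 | (cp >> 6));
      out[1] = (char) (0x80 | (cp & 0x3F));
      return 2;
    }
  if (cp < 0x10000)
    {
      out[0] = (char) (0xE0 | (cp >> 12));
      out[1] = (char) (0x80 | ((cp >> 6) & 0x3F));
      out[2] = (char) (0x80 | (cp & 0x3F));
      return 3;
    }
  out[0] = (char) (0xF0 | (cp >> 18));
  out[1] = (char) (0x80 | ((cp >> 12) & 0x3F));
  out[2] = (char) (0x80 | ((cp >> 6) & 0x3F));
  out[3] = (char) (0x80 | (cp & 0x3F));
  return 4;
}

/* Convert one UTF-16 code unit.  Returns 0 after saving a high
   surrogate, (size_t) -1 for a malformed sequence, and otherwise the
   number of bytes written to OUT (which must hold at least 4).  */
size_t
sketch_c16_to_utf8 (char *out, uint16_t c16, struct c16_state *st)
{
  if (st->pending_high != 0)
    {
      uint16_t high = st->pending_high;
      st->pending_high = 0;
      if (c16 < 0xDC00 || c16 > 0xDFFF)
	return (size_t) -1;	/* High surrogate not followed by low.  */
      /* Combine the pair into a supplementary-plane code point.  */
      uint32_t cp = 0x10000u + (((uint32_t) high - 0xD800u) << 10)
		    + ((uint32_t) c16 - 0xDC00u);
      return encode_utf8 (out, cp);
    }
  if (c16 >= 0xD800 && c16 <= 0xDBFF)
    {
      st->pending_high = c16;	/* First call of the pair.  */
      return 0;
    }
  if (c16 >= 0xDC00 && c16 <= 0xDFFF)
    return (size_t) -1;		/* Lone low surrogate.  */
  return encode_utf8 (out, c16);
}
```

For example, feeding 0xD83D and then 0xDE00 (the pair for U+1F600) returns 0 on the first call and 4 on the second, with the bytes F0 9F 98 80 written to the buffer.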
2018-06-15  Add tests for sign of NaN returned by strtod (bug 23007).  Joseph Myers
This patch adds tests for bug 23007, strtod ignoring any sign in the input string in the case of a NaN result. Tested for x86_64. [BZ #23007] * stdlib/tst-strtod-nan-sign-main.c: New file. * stdlib/tst-strtod-nan-sign.c: Likewise. * wcsmbs/tst-wcstod-nan-sign.c: Likewise. * stdlib/Makefile (tests): Add tst-strtod-nan-sign. ($(objpfx)tst-strtod-nan-sign): Depend on $(libm). * wcsmbs/Makefile (tests): Add tst-wcstod-nan-sign. ($(objpfx)tst-wcstod-nan-sign): Depend on $(libm).
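The property under test can be sketched as a small check. This is not the code in tst-strtod-nan-sign-main.c; the helper name nan_sign_matches is hypothetical, and the check assumes a libc in which bug 23007 is fixed, so that a leading '-' sets the sign bit of the returned NaN.

```c
#include <math.h>
#include <stdlib.h>

/* Hypothetical helper, for illustration only.  Return 1 if strtod
   turns TEXT into a NaN whose sign bit is set exactly when TEXT
   starts with '-', and 0 otherwise.  */
int
nan_sign_matches (const char *text)
{
  double value = strtod (text, NULL);
  int want_negative = text[0] == '-';

  if (!isnan (value))
    return 0;
  /* signbit returns nonzero for a set sign bit, so normalize it
     before comparing.  */
  return (signbit (value) != 0) == want_negative;
}
```

On an affected libc, nan_sign_matches ("-NaN") would return 0 because the sign in the input is dropped.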
2018-05-16  math: Merge strtod_nan_*.h into math-type-macros-*.h  Florian Weimer
This change will eventually make it possible to compile stdlib/strtod_nan_main.c as part of math/s_nan_template.c.