path: root/iconv
Age  Commit message  Author
2024-01-01Update copyright dates not handled by scripts/update-copyrightsPaul Eggert
I've updated copyright dates in glibc for 2024. This is the patch for the changes not generated by scripts/update-copyrights and subsequent build / regeneration of generated files.
2024-01-01Update copyright dates with scripts/update-copyrightsPaul Eggert
2023-10-04Fix off-by-one OOB write in iconv/tst-iconv-mtSzabolcs Nagy
The iconv buffer sizes must not include the \0 string terminator. And the output termination with *outbufpos = '\0' was OOB. Consistently use non-null-terminated buffer sizes. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
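A minimal sketch of the sizing rule described above, not the code from tst-iconv-mt: the byte counts passed to `iconv` exclude the terminator, and one byte of the output buffer is held back so that the final `'\0'` store stays in bounds. The helper name `convert_string` is hypothetical.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert the NUL-terminated string IN from charset
   FROM to charset TO.  OUT has room for OUTSIZE bytes in total, including
   the terminator.  Returns 0 on success and -1 on any failure.  */
int
convert_string (const char *to, const char *from, const char *in,
                char *out, size_t outsize)
{
  if (outsize == 0)
    return -1;

  iconv_t cd = iconv_open (to, from);
  if (cd == (iconv_t) -1)
    return -1;

  char *inpos = (char *) in;
  size_t inleft = strlen (in);     /* The terminator is not counted.  */
  char *outpos = out;
  size_t outleft = outsize - 1;    /* Hold one byte back for '\0'.  */

  size_t rc = iconv (cd, &inpos, &inleft, &outpos, &outleft);
  iconv_close (cd);
  if (rc == (size_t) -1)
    return -1;

  /* outpos is at most out + outsize - 1, so this store is in bounds.  */
  *outpos = '\0';
  return 0;
}
```

If the full buffer size were passed as `outleft`, a conversion that exactly filled the buffer would leave `outpos` one past the end, and the terminating store would be the off-by-one write the commit describes.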
2023-08-02iconv: restore verbosity with unrecognized encoding names (bug 30694)Andreas Schwab
Commit 91927b7c76 ("Rewrite iconv option parsing [BZ #19519]") changed the iconv program to call __gconv_open directly instead of the iconv_open wrapper, but the former does not set errno. Update the caller to interpret the return codes like iconv_open does.
2023-05-27Fix misspellings in iconv/ and iconvdata/ -- BZ 25337Paul Pluzhnikov
All the changes are in comments or '#error' messages. Applying this commit results in bit-identical rebuild of iconvdata/*.so Reviewed-by: Florian Weimer <fw@deneb.enyo.de>
2023-04-22Use O_CLOEXEC in more places (BZ #15722)Sergey Bugaev
When opening a temporary file without O_CLOEXEC we risk leaking the file descriptor if another thread calls (fork and then) exec while we have the fd open. Fix this by consistently passing O_CLOEXEC everywhere where we open a file for internal use (and not to return it to the user, in which case the API defines whether or not the close-on-exec flag shall be set on the returned fd). Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-Id: <20230419160207.65988-4-bugaevc@gmail.com>
2023-03-27Move libc_freeres_ptrs and libc_subfreeres to hidden/weak functionsAdhemerval Zanella Netto
They are both used by __libc_freeres to free all library malloc allocated resources to help tooling like mtrace or valgrind with memory leak tracking. The current scheme uses assembly markers and linker script entries to consolidate the free routine function pointers in the RELRO segment and to be freed buffers in BSS. This patch changes it to use specific free functions for libc_freeres_ptrs buffers and call the function pointer array directly with call_function_static_weak. It allows the removal of both the internal macros and the linker script sections. Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-02-17iconv: Remove _STRING_ARCH_unaligned usageAdhemerval Zanella
Use the put/get macros and __builtin_bswap32 instead. This allows the unaligned routines to be removed; the compiler will generate unaligned accesses if the ABI allows it. Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2023-02-17iconv: Remove _STRING_ARCH_unaligned usage for get/set macrosAdhemerval Zanella
And use a packed structure instead. The compiler generates optimized unaligned code if the architecture supports it. Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
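The packed-structure technique can be sketched as follows. This is an illustration under assumptions, not the macros from the glibc sources: the names `unaligned_u32`, `get32`, and `put32` are hypothetical, and the `packed` and `may_alias` attributes are GCC/Clang extensions.

```c
#include <stdint.h>

/* A packed wrapper has alignment 1, so the compiler must not assume the
   member is naturally aligned.  may_alias keeps the access valid under
   strict aliasing when the pointer comes from a byte buffer.  */
struct unaligned_u32
{
  uint32_t value;
} __attribute__ ((packed, may_alias));

/* Load a 32-bit value from a possibly unaligned address.  */
static inline uint32_t
get32 (const void *addr)
{
  return ((const struct unaligned_u32 *) addr)->value;
}

/* Store a 32-bit value to a possibly unaligned address.  */
static inline void
put32 (void *addr, uint32_t v)
{
  ((struct unaligned_u32 *) addr)->value = v;
}
```

On targets that permit unaligned access the compiler can emit a single load or store; elsewhere it falls back to byte-wise access, so no per-architecture switch is needed in the source.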
2023-02-06Replace rawmemchr (s, '\0') with strchrWilco Dijkstra
Almost all uses of rawmemchr find the end of a string. Since most targets use a generic implementation, replacing it with strchr is better since that is optimized by compilers into strlen (s) + s. Also fix the generic rawmemchr implementation to use a cast to unsigned char in the if statement. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
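A small sketch of the replacement idiom; the wrapper name `string_end` is hypothetical. Standard C guarantees that searching for `'\0'` with `strchr` returns a pointer to the string's terminator.

```c
#include <string.h>

/* Return a pointer to the terminating NUL of S.  Compilers can turn
   strchr (s, '\0') into s + strlen (s), as the commit message notes.  */
const char *
string_end (const char *s)
{
  return strchr (s, '\0');
}
```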
2023-01-06Update copyright dates not handled by scripts/update-copyrightsJoseph Myers
I've updated copyright dates in glibc for 2023. This is the patch for the changes not generated by scripts/update-copyrights and subsequent build / regeneration of generated files.
2023-01-06Update copyright dates with scripts/update-copyrightsJoseph Myers
2022-10-18Use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sourcesFlorian Weimer
In the future, this will result in a compilation failure if the macros are unexpectedly undefined (due to header inclusion ordering or header inclusion missing altogether). Assembler sources are more difficult to convert. In many cases, they are hand-optimized for the mangling and no-mangling variants, which is why they are not converted. sysdeps/s390/s390-32/__longjmp.c and sysdeps/s390/s390-64/__longjmp.c are special: These are C sources, but most of the implementation is in assembler, so the PTR_DEMANGLE macro has to be undefined in some cases, to match the assembler style. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2022-10-18Introduce <pointer_guard.h>, extracted from <sysdep.h>Florian Weimer
This allows us to define a generic no-op version of PTR_MANGLE and PTR_DEMANGLE. In the future, we can use PTR_MANGLE and PTR_DEMANGLE unconditionally in C sources, avoiding an unintended loss of hardening due to missing include files or unlucky header inclusion ordering. In i386 and x86_64, we can avoid a <tls.h> dependency in the C code by using the computed constant from <tcb-offsets.h>. <sysdep.h> no longer includes these definitions, so there is no cyclic dependency anymore when computing the <tcb-offsets.h> constants. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2022-09-22Use '%z' instead of '%Z' on printf functionsAdhemerval Zanella Netto
The Z modifier is a nonstandard synonym for z (that predates z itself), and the compiler might issue a warning for an invalid conversion specifier. Reviewed-by: Florian Weimer <fweimer@redhat.com>
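For reference, a short sketch of the standard form; the helper name `format_size` is hypothetical. The `z` length modifier was standardized in C99 and is the portable way to print a `size_t`.

```c
#include <stddef.h>
#include <stdio.h>

/* Format N into BUF using the standard C99 'z' length modifier.
   Returns the number of characters snprintf would have written.  */
int
format_size (char *buf, size_t buflen, size_t n)
{
  return snprintf (buf, buflen, "%zu", n);
}
```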
2022-09-20gconv: Use 64-bit interfaces in gconv_parseconfdir (bug 29583)Florian Weimer
It's possible that inode numbers are outside the 32-bit range. The existing code only handles the in-libc case correctly, and still uses the legacy interfaces when building iconv. Suggested-by: Helge Deller <deller@gmx.de>
2022-06-14Avoid -Wstringop-overflow= warning in iconv module.Stefan Liebler
On s390x when compiling with GCC 12, I get this warning:

    utf8-utf16-z9.c:
    ../iconv/loop.c: In function ‘__from_utf8_loop_etf3eh_single’:
    ../iconv/loop.c:445:22: error: writing 1 byte into a region of size 0 [-Werror=stringop-overflow=]
      445 | bytebuf[inlen++] = *inptr++;
          | ~~~~~~~~~~~~~~~~~^~~~~~~~~~
    ../iconv/loop.c:381:17: note: at offset 4 into destination object ‘bytebuf’ of size 4
      381 | unsigned char bytebuf[MAX_NEEDED_INPUT];
          | ^~~~~~~
    ../iconv/loop.c:445:22: error: writing 1 byte into a region of size 0 [-Werror=stringop-overflow=]
      445 | bytebuf[inlen++] = *inptr++;
          | ~~~~~~~~~~~~~~~~~^~~~~~~~~~
    ../iconv/loop.c:381:17: note: at offset 5 into destination object ‘bytebuf’ of size 4
      381 | unsigned char bytebuf[MAX_NEEDED_INPUT];
          | ^~~~~~~

This patch tells the compiler that inend is always behind inptr, which avoids the warning. Note that the SINGLE function is only used to implement the mb*towc*() or wc*tomb*() functions. Those functions use inptr and inend pointing to a variable on stack, compute the inend pointer or explicitly check the arguments, which always leads to inptr < inend.

Special notes for backporters (according to Siddhesh Poyarekar): If someone wants to backport this patch to release branches, they should also backport the following wcrtomb change. Otherwise the assumptions made by this patch do not hold.

    commit 9bcd12d223a8990254b65e2dada54faa5d2742f3
    Author: Siddhesh Poyarekar <siddhesh@sourceware.org>
    Date: Fri May 13 19:10:15 2022 +0530
    wcrtomb: Make behavior POSIX compliant

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2022-06-01iconv: Use 64 bit stat for gconv_parseconfdir (BZ# 29213)Adhemerval Zanella
The issue is only when used within libc.so (iconvconfig already builds with _TIME_SIZE=64). This is a missing spot initially from 52a5fe70a2c77935. Checked on i686-linux-gnu.
2022-04-13Replace {u}int_fast{16|32} with {u}int32_tNoah Goldstein
On 32-bit machines this has no effect. On 64-bit machines {u}int_fast{16|32} are set as {u}int64_t, which is often not ideal. Particularly on x86_64, this change both saves code size and may save instruction cost. Full xcheck passes on x86_64.
2022-03-14associate a deallocator for iconv_openSteve Grubb
This patch associates iconv_close as a deallocator for iconv_open. This required moving the iconv_close declaration above iconv_open. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
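The mechanism behind this kind of annotation can be sketched with GCC's `malloc (deallocator)` attribute form (available from GCC 11). This is an illustration, not the glibc header: the `handle` type, `handle_open`, `handle_close`, and the `ATTR_DEALLOC` macro are all hypothetical names. It also shows why declaration order matters: the deallocator must be declared before the attribute can name it.

```c
#include <stdlib.h>

struct handle
{
  int id;
};

/* The deallocator has to be declared first so the attribute below can
   refer to it.  */
void handle_close (struct handle *h);

#if defined __GNUC__ && !defined __clang__ && __GNUC__ >= 11
# define ATTR_DEALLOC(fn) __attribute__ ((malloc (fn)))
#else
# define ATTR_DEALLOC(fn) /* Not supported: expands to nothing.  */
#endif

/* Tell the compiler that results of handle_open must be released with
   handle_close, enabling mismatched-deallocation diagnostics.  */
struct handle *handle_open (int id) ATTR_DEALLOC (handle_close);

struct handle *
handle_open (int id)
{
  struct handle *h = malloc (sizeof *h);
  if (h != NULL)
    h->id = id;
  return h;
}

void
handle_close (struct handle *h)
{
  free (h);
}
```

With the attribute in place, a compiler that supports it can warn when a pointer returned by `handle_open` is passed to a different deallocator such as plain `free`.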
2022-02-25build: Properly generate .d dependency files [BZ #28922]H.J. Lu
1. Also generate .d dependency files for $(tests-container) and $(tests-printers).
2. elf: Add tst-auditmod17.os to extra-test-objs.
3. iconv: Add tst-gconv-init-failure-mod.os to extra-test-objs.
4. malloc: Rename extra-tests-objs to extra-test-objs.
5. linux: Add tst-sysconf-iov_max-uapi.o to extra-test-objs.
6. x86_64: Add tst-x86_64mod-1.o, tst-platformmod-2.o, test-libmvec.o, test-libmvec-avx.o, test-libmvec-avx2.o and test-libmvec-avx512f.o to extra-test-objs.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-01-01Update copyright dates not handled by scripts/update-copyrights.Paul Eggert
I've updated copyright dates in glibc for 2022. This is the patch for the changes not generated by scripts/update-copyrights and subsequent build / regeneration of generated files. As well as the usual annual updates, mainly dates in --version output (minus csu/version.c which previously had to be handled manually but is now successfully updated by update-copyrights), there is a small change to the copyright notice in NEWS which should let NEWS get updated automatically next year. Please remember to include 2022 in the dates for any new files added in future (which means updating any existing uncommitted patches you have that add new files to use the new copyright dates in them).
2022-01-01Update copyright dates with scripts/update-copyrightsPaul Eggert
I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted of lines saying "FOO: warning: copyright statement not found" for each of 7061 files FOO. I then removed trailing white space from math/tgmath.h, support/tst-support-open-dev-null-range.c, and sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following obscure pre-commit check failure diagnostics from Savannah. I don't know why I run into these diagnostics whereas others evidently do not. remote: *** 912-#endif remote: *** 913: remote: *** 914- remote: *** error: lines with trailing whitespace found ... remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
2021-10-21iconv: Use TIMEOUTFACTOR for iconv test timeoutStafford Horne
Currently the timeout for each iconv test is hard coded to 3 seconds. On my OpenRISC test platform this is too slow and the test fails with a HANG error. This change uses the available TIMEOUTFACTOR to compute the timeout. The default value is still 3. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2021-09-13iconvconfig: Fix behaviour with --prefix [BZ #28199]Siddhesh Poyarekar
The consolidation of configuration parsing broke behaviour with --prefix, where the prefix bled into the modules cache. Accept a prefix which, when non-NULL, is prepended to the path when looking for configuration files but only the original directory is added to the modules cache. This has no effect on the codegen of gconv_conf since it passes NULL. Reported-by: Patrick McCarty <patrick.mccarty@intel.com> Reported-by: Michael Hudson-Doyle <michael.hudson@canonical.com> Reviewed-by: Andreas Schwab <schwab@linux-m68k.org>
2021-09-06Add generic C.UTF-8 locale (Bug 17318)Carlos O'Donell
We add a new C.UTF-8 locale. This locale is not builtin to glibc, but is provided as a distinct locale. The locale provides full support for UTF-8 and this includes full code point sorting via STRCMP-based collation (strcmp or wcscmp). The collation uses a new keyword 'codepoint_collation' which drops all collation rules and generates an empty zero rules collation to enable STRCMP usage in collation. This ensures that we get full code point sorting for C.UTF-8 with a minimal 1406 bytes of overhead (LC_COLLATE structure information and ASCII collating tables). The new locale is added to SUPPORTED. Minimal test data for specific code points (minus those not supported by collate-test) is provided in C.UTF-8.in, and this verifies code point sorting is working reasonably across the range. The locale was tested manually with the full set of code points without failure. The locale is harmonized with locales already shipping in various downstream distributions. A new tst-iconv9 test is added which verifies the C.UTF-8 locale is generally usable. Testing for fnmatch, regexec, and regcomp is provided by extending bug-regex1, bug-regex19, bug-regex4, bug-regex6, transbug, tst-fnmatch, tst-regcomp-truncated, and tst-regex to use C.UTF-8. Tested on x86_64 or i686 without regression. Reviewed-by: Florian Weimer <fweimer@redhat.com>
2021-09-03Remove "Contributed by" linesSiddhesh Poyarekar
We stopped adding "Contributed by" or similar lines in sources in 2012 in favour of git logs and keeping the Contributors section of the glibc manual up to date. Removing these lines makes the license header a bit more consistent across files and also removes the possibility of error in attribution when license blocks or files are copied across since the contributed-by lines don't actually reflect reality in those cases. Move all "Contributed by" and similar lines (Written by, Test by, etc.) into a new file CONTRIBUTED-BY to retain record of these contributions. These contributors are also mentioned in manual/contrib.texi, so we just maintain this additional record as a courtesy to the earlier developers. The following scripts were used to filter a list of files to edit in place and to clean up the CONTRIBUTED-BY file respectively. These were not added to the glibc sources because they're not expected to be of any use in future given that this is a one time task: https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-08-23Fix iconv build with GCC mainlineJoseph Myers
Current GCC mainline produces -Wstringop-overflow errors building some iconv converters, as discussed at <https://gcc.gnu.org/pipermail/gcc/2021-July/236943.html>. Add an __builtin_unreachable call as suggested so that GCC can see the case that would involve a buffer overflow is unreachable; because the unreachability depends on valid conversion state being passed into the function from previous conversion steps, it's not something the compiler can reasonably deduce on its own. Tested with build-many-glibcs.py that, together with <https://sourceware.org/pipermail/libc-alpha/2021-August/130244.html>, it restores the glibc build for powerpc-linux-gnu.
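A minimal sketch of the `__builtin_unreachable` technique, assuming a caller-side invariant similar to the one described; it is not the code from iconv/loop.c. The names `MAX_PENDING` and `store_byte` are hypothetical, and `__builtin_unreachable` is a GCC/Clang builtin.

```c
#include <stddef.h>

#define MAX_PENDING 4

/* Append BYTE to BYTEBUF, which currently holds COUNT bytes, and return
   the new count.  Callers are assumed to guarantee COUNT < MAX_PENDING
   because of state kept from earlier steps, which the compiler cannot
   see.  Marking the other case unreachable lets the compiler drop the
   out-of-bounds path instead of warning about it.  Violating the
   invariant is undefined behavior, so this is only appropriate when the
   invariant really holds.  */
size_t
store_byte (unsigned char bytebuf[MAX_PENDING], size_t count,
            unsigned char byte)
{
  if (count >= MAX_PENDING)
    __builtin_unreachable ();

  bytebuf[count++] = byte;
  return count;
}
```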
2021-08-03iconv_charmap: Close output file when doneSiddhesh Poyarekar
Reviewed-by: Arjun Shankar <arjun@redhat.com>
2021-08-03gconv_parseconfdir: Fix memory leakSiddhesh Poyarekar
The allocated `conf` would leak if we have to skip over the file due to the underlying filesystem not supporting dt_type. Reviewed-by: Arjun Shankar <arjun@redhat.com>
2021-07-07libio: Replace internal _IO_getdelim symbol with __getdelimFlorian Weimer
__getdelim is exported, _IO_getdelim is not. Add a hidden prototype for __getdelim. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2021-07-02iconvconfig: Use the public feof_unlockedSiddhesh Poyarekar
Build of iconvconfig failed with CFLAGS=-Os since __feof_unlocked is not a public symbol. Replace with feof_unlocked (defined to __feof_unlocked when IS_IN (libc)) to fix this. Reported-by: Szabolcs Nagy <szabolcs.nagy@arm.com> Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2021-06-28iconvconfig: Fix multiple issuesSiddhesh Poyarekar
It was noticed on big-endian systems that msgfmt would fail with the following error:

    msgfmt: gconv_builtin.c:70: __gconv_get_builtin_trans: Assertion `cnt < sizeof (map) / sizeof (map[0])' failed.
    Aborted (core dumped)

This is only seen on installed systems because it was due to a corrupted gconv-modules.cache. iconvconfig had the following issues (it was specifically freeing fulldir that caused this issue, but other cleanups are also needed) that this patch fixes.
- Add prefix only if dir starts with '/'
- Use asprintf instead of mempcpy so that the directory string is NULL terminated
- Make a copy of the directory reference in new_module so that fulldir can be freed within the same scope in handle_dir.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2021-06-23Handle DT_UNKNOWN in gconv-modules.dSiddhesh Poyarekar
On filesystems that do not support dt_type, a regular file shows up as DT_UNKNOWN. Fall back to using lstat64 to read file properties in such cases. Reviewed-by: DJ Delorie <dj@redhat.com>
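The fallback described above can be sketched like this. It is an illustration only: the helper name `entry_is_regular` and the fixed 4096-byte path buffer are assumptions, and plain `lstat` stands in for the `lstat64` call that the commit message names.

```c
#define _DEFAULT_SOURCE
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>

/* Return 1 if ENT, found while reading directory DIR, is a regular file.
   Trust d_type when the filesystem fills it in; when it reports
   DT_UNKNOWN, fall back to lstat on the full path.  */
int
entry_is_regular (const char *dir, const struct dirent *ent)
{
  if (ent->d_type == DT_REG)
    return 1;
  if (ent->d_type != DT_UNKNOWN)
    return 0;

  char path[4096];   /* Illustrative fixed size, an assumption.  */
  int n = snprintf (path, sizeof path, "%s/%s", dir, ent->d_name);
  if (n < 0 || (size_t) n >= sizeof path)
    return 0;

  struct stat st;
  return lstat (path, &st) == 0 && S_ISREG (st.st_mode);
}
```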
2021-06-23iconvconfig: Use common gconv module parsing functionSiddhesh Poyarekar
Drop local copy of gconv file parsing and use the one in gconv_parseconfdir.h instead. Now there is a single implementation of configuration file parsing. Reviewed-by: DJ Delorie <dj@redhat.com>
2021-06-23gconv_conf: Split out configuration file processingSiddhesh Poyarekar
Split configuration file processing into a separate header file and include it. Macroize all calls that need to go through internal interfaces so that iconvconfig can also use them. Reviewed-by: DJ Delorie <dj@redhat.com>
2021-06-23gconv_conf: Remove unused variablesSiddhesh Poyarekar
The modules and nmodules parameters passed to add_modules, add_alias, etc. are not used and are hence unnecessary. Remove them so that their signatures match the functions in iconvconfig. Reviewed-by: DJ Delorie <dj@redhat.com> Reviewed-by: Andreas Schwab <schwab@linux-m68k.org>
2021-06-23iconv: Remove alloca use in gconv-modules configuration parsingSiddhesh Poyarekar
The alloca sizes ought to be constrained to PATH_MAX, but replace them with dynamic allocation to be safe. A static PATH_MAX array would have worked too but Hurd does not have PATH_MAX and the code path is not hot enough to micro-optimise this allocation. Revisit if any of those realities change. Reviewed-by: DJ Delorie <dj@redhat.com>
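As a sketch of the general idea (not the glibc change itself), a path can be built in heap memory sized from its components, so that neither `alloca` nor a `PATH_MAX` array is needed. The helper name `join_path` is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated "DIR/NAME" string, or NULL if allocation
   fails.  The caller frees the result.  The size is computed from the
   inputs, so no PATH_MAX constant is required.  */
char *
join_path (const char *dir, const char *name)
{
  size_t len = strlen (dir) + 1 + strlen (name) + 1;
  char *path = malloc (len);
  if (path == NULL)
    return NULL;

  snprintf (path, len, "%s/%s", dir, name);
  return path;
}
```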
2021-06-22Use 64 bit time_t stat internallyAdhemerval Zanella
For the legacy ABI which supports 32-bit time_t, it calls the 64-bit time_t variants directly, since the LFS symbols call the 64-bit time_t ones internally. Checked on i686-linux-gnu and x86_64-linux-gnu. Reviewed-by: Lukasz Majewski <lukma@denx.de>
2021-06-09gconv_conf: Read configuration files in gconv-modules.dSiddhesh Poyarekar
Read configuration files with names ending in .conf in GCONV_PATH/gconv-modules.d to mirror configuration flexibility in iconvconfig into the iconv program and function. Reviewed-by: DJ Delorie <dj@redhat.com>
2021-06-09iconvconfig: Read configuration from gconv-modules.d subdirectorySiddhesh Poyarekar
In addition to GCONV_PATH/gconv-modules, also read module configuration from *.conf files in GCONV_PATH/gconv-modules.d. This allows a single gconv directory to have multiple sets of gconv modules but at the same time, a single modules cache. With this feature, one could separate the glibc supported gconv modules into a minimal essential set (ISO-8859-*, UTF, etc.) from the remaining modules. In future, these could be further segregated into langpack-associated sets with their own gconv-modules.d/someconfig.conf. Reviewed-by: DJ Delorie <dj@redhat.com>
2021-06-09iconvconfig: Make file handling more general purposeSiddhesh Poyarekar
Split out configuration file handling code from handle_dir into its own function so that it can be reused for multiple configuration files. Reviewed-by: DJ Delorie <dj@redhat.com>
2021-05-18charmap_conversion: Free conversion table on exitSiddhesh Poyarekar
The conversion table is allocated using xcalloc but never freed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2021-05-07Run $(objpfx)iconvconfig with $(run-program-prefix) [BZ #27477]H.J. Lu
When glibc is configured with --enable-hardcoded-path-in-tests, "make xcheck" failed with

    ...
    env GCONV_PATH=/export/build/gnu/tools-build/glibc-cet-gitlab/build-x86_64-linux/iconvdata LOCPATH=/export/build/gnu/tools-build/glibc-cet-gitlab/build-x86_64-linux/localedata LC_ALL=C /export/build/gnu/tools-build/glibc-cet-gitlab/build-x86_64-linux/iconv/iconvconfig --output=$tmp --nostdlib /usr/lib64/gconv;
    ...
    /export/build/gnu/tools-build/glibc-cet-gitlab/build-x86_64-linux/iconv/iconvconfig: /lib64/libc.so.6: version `GLIBC_2.34' not found (required by /export/build/gnu/tools-build/glibc-cet-gitlab/build-x86_64-linux/iconv/iconvconfig)
    ...
    FAIL: iconv/test-iconvconfig

Since $(objpfx)iconvconfig is an installed program, run it with $(run-program-prefix).
2021-01-02Update copyright dates not handled by scripts/update-copyrights.Paul Eggert
I've updated copyright dates in glibc for 2021. This is the patch for the changes not generated by scripts/update-copyrights and subsequent build / regeneration of generated files. As well as the usual annual updates, mainly dates in --version output (minus csu/version.c which previously had to be handled manually but is now successfully updated by update-copyrights), there is a small change to the copyright notice in NEWS which should let NEWS get updated automatically next year. Please remember to include 2021 in the dates for any new files added in future (which means updating any existing uncommitted patches you have that add new files to use the new copyright dates in them).
2021-01-02Update copyright dates with scripts/update-copyrightsPaul Eggert
I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted of lines saying "FOO: warning: copyright statement not found" for each of 6694 files FOO. I then removed trailing white space from benchtests/bench-pthread-locks.c and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this diagnostic from Savannah: remote: *** pre-commit check failed ... remote: *** error: lines with trailing whitespace found remote: error: hook declined to update refs/heads/master
2020-12-21iconv add iconv_close before the function returned with bad value.liqingqing
Add iconv_close before the function returns with a bad value.
2020-12-21iconv: use iconv_close after iconv_openliqingqing
2020-12-11treewide: fix incorrect spelling of indices in commentsDmitry V. Levin
Replace 'indeces' with 'indices', the most annoying of these typos were those found in elf.h which is a public header file copied to other projects.
2020-12-07iconv: Fix incorrect UCS4 inner loop bounds (BZ#26923)Michael Colavita
Previously, in UCS4 conversion routines we limit the number of characters we examine to the minimum of the number of characters in the input and the number of characters in the output. This is not the correct behavior when __GCONV_IGNORE_ERRORS is set, as we do not consume an output character when we skip a code unit. Instead, track the input and output pointers and terminate the loop when either reaches its limit. This resolves assertion failures when resetting the input buffer in a step of iconv, which assumes that the input will be fully consumed given sufficient output space.
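The loop-bound change can be illustrated with a simplified sketch; it is not the glibc conversion routine. The function name `copy_valid_ucs4` is hypothetical, and invalid units are always skipped here, standing in for the case where `__GCONV_IGNORE_ERRORS` is set.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy code points from IN to OUT, skipping values above 0x7fffffff.
   The loop is bounded by both pointers rather than by
   min (inlen, outlen): a skipped unit consumes input but no output, so
   a count-based bound could stop before the input is exhausted even
   though output space remains.  Stores the number of input units
   consumed in *CONSUMED and returns the number of units written.  */
size_t
copy_valid_ucs4 (const uint32_t *in, size_t inlen,
                 uint32_t *out, size_t outlen, size_t *consumed)
{
  const uint32_t *inptr = in;
  const uint32_t *inend = in + inlen;
  uint32_t *outptr = out;
  uint32_t *outend = out + outlen;

  while (inptr != inend && outptr != outend)
    {
      uint32_t ch = *inptr++;
      if (ch > 0x7fffffff)
        continue;   /* Invalid unit: input advances, output does not.  */
      *outptr++ = ch;
    }

  *consumed = (size_t) (inptr - in);
  return (size_t) (outptr - out);
}
```

With three input units, one of them invalid, and room for two outputs, a bound of min(3, 2) = 2 iterations would leave the last input unit unread; the pointer-based bound consumes all three.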