Age | Commit message | Author |
|
SSE registers are used for passing parameters and must be preserved
during runtime relocations. Inside ld.so this is enforced through the
tests in tst-xmmymm.sh. But the malloc routines used after startup
come from libc.so and can be arbitrarily complex. It's overkill
to save the SSE registers all the time because of that. These calls
are rare. Instead we save them on demand. The new infrastructure
put in place in this patch makes this possible and efficient.
|
|
The test now takes the callgraph into account. Only code called
during runtime relocation is affected by the limitation. We now
determine the affected object files as closely as possible from
the outside. This allowed us to remove some of the specializations
for some of the string functions as they are only used in other
code paths.
|
|
This patch introduces a test to make sure no function modifies the
xmm/ymm registers, with the exception of the auditing functions.
The test is probably too pessimistic: all code linked into ld.so
is checked. Perhaps at some point only the callgraph starting from
_dl_fixup and _dl_profile_fixup will be checked, and we can start using
faster SSE-using functions in parts of ld.so.
|
|
getaddrinfo didn't update the status variable in that round of the
loop if no callback was used.
|
|
The file contained some code which was never used. Don't compile it
in.
|
|
Ever since the /usr/include/linux headers got cleaned up this isn't
necessary. By now everybody should have these cleanups.
|
|
When multiarch is enabled we have this information stored. Use it.
|
|
The most recent AP 485 describes a few more cache descriptors for
L3 caches with 24-way associativity.
|
|
There will be more than one function which, in multiarch mode, wants
to use SSSE3. We should not test in each of them for Atoms with
slow SSSE3. Instead, disable the SSSE3 bit in the startup code for
such machines.
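
The idea above can be sketched roughly as follows. This is only an
illustration, not the glibc code: struct cpu_info, its slow_ssse3 field,
init_cpu_features and use_ssse3 are hypothetical names. The one hard
fact used is that CPUID leaf 1 reports SSSE3 in bit 9 of ECX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CPUID leaf 1 reports SSSE3 support in ECX bit 9.  */
#define FEATURE_SSSE3 (UINT32_C (1) << 9)

/* Hypothetical cache of the CPUID data collected at startup.  */
struct cpu_info
{
  uint32_t ecx;        /* Copy of CPUID.1:ECX.  */
  int slow_ssse3;      /* Nonzero for models where SSSE3 is slow.  */
};

/* Runs once at startup: hide SSSE3 on machines where it is slow,
   so that no individual function has to repeat the model check.  */
void
init_cpu_features (struct cpu_info *cpu)
{
  assert (cpu != NULL);
  if (cpu->slow_ssse3)
    cpu->ecx &= ~FEATURE_SSSE3;
}

/* Every multiarch selector then needs only this single bit test.  */
int
use_ssse3 (const struct cpu_info *cpu)
{
  return (cpu->ecx & FEATURE_SSSE3) != 0;
}
```

With this arrangement a selector for strcpy, strlen, or any later
function tests one bit and needs no knowledge of specific CPU models.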
|
|
This patch implements SSE4.2 strstr/strcasestr, using the
Knuth-Morris-Pratt string searching algorithm.
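
For readers unfamiliar with Knuth-Morris-Pratt, here is a minimal
scalar sketch of the algorithm. It is not the SSE4.2 implementation
from the patch and uses no vector instructions; kmp_strstr and
kmp_build_table are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* fail[i] is the length of the longest proper prefix of
   needle[0..i] that is also a suffix of needle[0..i].  */
static void
kmp_build_table (const char *needle, size_t n, size_t *fail)
{
  size_t k = 0;
  fail[0] = 0;
  for (size_t i = 1; i < n; ++i)
    {
      while (k > 0 && needle[i] != needle[k])
        k = fail[k - 1];
      if (needle[i] == needle[k])
        ++k;
      fail[i] = k;
    }
}

/* Return a pointer to the first occurrence of NEEDLE in HAYSTACK,
   or NULL if there is none.  Simplification of this sketch: NULL is
   also returned when the table cannot be allocated.  */
const char *
kmp_strstr (const char *haystack, const char *needle)
{
  assert (haystack != NULL && needle != NULL);
  size_t n = strlen (needle);
  if (n == 0)
    return haystack;

  size_t *fail = malloc (n * sizeof *fail);
  if (fail == NULL)
    return NULL;
  kmp_build_table (needle, n, fail);

  const char *result = NULL;
  size_t k = 0;                 /* Characters of NEEDLE matched so far.  */
  for (const char *p = haystack; *p != '\0'; ++p)
    {
      /* On a mismatch fall back in the table; P never moves backwards.  */
      while (k > 0 && *p != needle[k])
        k = fail[k - 1];
      if (*p == needle[k])
        ++k;
      if (k == n)
        {
          result = p - n + 1;
          break;
        }
    }
  free (fail);
  return result;
}
```

The haystack is scanned strictly left to right, so the search takes
time linear in the combined length of both strings.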
|
|
The prototype for _dl_higher_prime_number was missing. While at it,
the function is now marked with internal_function.
|
|
The patch mainly reduces the code size but also avoids some jumps.
|
|
Don't use AVX instructions too often.
|
|
The original AVX patch used a function pointer to handle the difference
between machines with and without AVX support. This is insecure. A
well-placed memory exploit could lead to redirection of the execution.
Using a variable and several tests is a bit slower but cannot be
exploited in this way.
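
A rough sketch of the difference, under stated assumptions: all names
here (avx_state, hypothetical_cpu_has_avx, choose_save_path) are made up
for illustration, and the real code performs the test in the assembler
trampoline. The point is that an overwritten integer can only select
one of two fixed code paths, while an overwritten function pointer
could send execution to an arbitrary address.

```c
#include <assert.h>

enum save_kind { SAVE_SSE, SAVE_AVX };

/* 0 = not probed yet, 1 = AVX usable, -1 = AVX not usable.  */
static int avx_state;

/* Stand-in for the real CPUID-based probe (assumption of this sketch).  */
static int hypothetical_cpu_has_avx;

/* Decide which register-save path to take.  No indirect call is
   involved: whatever value AVX_STATE holds, control can only reach
   one of the two branches below.  */
enum save_kind
choose_save_path (void)
{
  if (avx_state == 0)
    avx_state = hypothetical_cpu_has_avx ? 1 : -1;
  assert (avx_state != 0);
  return avx_state > 0 ? SAVE_AVX : SAVE_SSE;
}
```

The cost is one extra load and compare per call, which matches the
"a bit slower" trade-off described above.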
|
|
Some symbols have to be identified process-wide by their name. This is
particularly important for some C++ features (e.g., class local static data
and static variables in inline functions). This cannot completely be
implemented with ELF functionality so far. The STB_GNU_UNIQUE binding
helps by ensuring the dynamic linker will always use the same definition for
all symbols with the same name and this binding.
|
|
Nothing uses these wrong values yet, but it fixes a warning due to
conflicting definitions in <asm/cputable.h>.
|
|
Some of the new multi-arch string functions for x86-64 were
not aligned to 16-byte boundaries, possibly creating unnecessary
cache line misses and delays.
|
|
This patch adds SSSE3 strcpy/stpcpy. I got up to a 4x speedup on Core 2
and Core i7. I disabled it on Atom since the SSSE3 version is slower for
shorter (<64 bytes) data.
|
|
The ____longjmp_chk implementation didn't load from memory the
right way.
|
|
If libcap is available, use it to drop privileges in pt_chown before
starting the work to change the permissions and ownership of the
slave device.
|
|
|