path: root/sysdeps/x86_64/multiarch/init-arch.c
Age  Commit message  Author
2015-08-13  Add _dl_x86_cpu_features to rtld_global  H.J. Lu
This patch adds _dl_x86_cpu_features to rtld_global in x86 ld.so and initializes it early, before __libc_start_main is called, so that cpu_features is always available when it is used and we can avoid calling __init_cpu_features in IFUNC selectors.
* sysdeps/i386/dl-machine.h: Include <cpu-features.c>.
  (dl_platform_init): Call init_cpu_features.
* sysdeps/i386/dl-procinfo.c (_dl_x86_cpu_features): New.
* sysdeps/i386/i686/cacheinfo.c (DISABLE_PREFERRED_MEMORY_INSTRUCTION): Removed.
* sysdeps/i386/i686/multiarch/Makefile (aux): Remove init-arch.
* sysdeps/i386/i686/multiarch/Versions: Removed.
* sysdeps/i386/i686/multiarch/ifunc-defines.sym (KIND_OFFSET): Removed.
* sysdeps/i386/ldsodefs.h: Include <cpu-features.h>.
* sysdeps/unix/sysv/linux/x86/Makefile (libpthread-sysdep_routines): Remove init-arch.
* sysdeps/unix/sysv/linux/x86_64/dl-procinfo.c: Include <sysdeps/x86_64/dl-procinfo.c> instead of <sysdeps/generic/dl-procinfo.c>.
* sysdeps/x86/Makefile [$(subdir) == csu] (gen-as-const-headers): Add cpu-features-offsets.sym and rtld-global-offsets.sym.
  [$(subdir) == elf] (sysdep-dl-routines): Add dl-get-cpu-features.
  [$(subdir) == elf] (tests): Add tst-get-cpu-features.
  [$(subdir) == elf] (tests-static): Add tst-get-cpu-features-static.
* sysdeps/x86/Versions: New file.
* sysdeps/x86/cpu-features-offsets.sym: Likewise.
* sysdeps/x86/cpu-features.c: Likewise.
* sysdeps/x86/cpu-features.h: Likewise.
* sysdeps/x86/dl-get-cpu-features.c: Likewise.
* sysdeps/x86/libc-start.c: Likewise.
* sysdeps/x86/rtld-global-offsets.sym: Likewise.
* sysdeps/x86/tst-get-cpu-features-static.c: Likewise.
* sysdeps/x86/tst-get-cpu-features.c: Likewise.
* sysdeps/x86_64/dl-procinfo.c: Likewise.
* sysdeps/x86_64/cacheinfo.c (__cpuid_count): Removed. Assume USE_MULTIARCH is defined and don't check it.
  (is_intel): Replace __cpu_features with GLRO(dl_x86_cpu_features).
  (is_amd): Likewise.
  (max_cpuid): Likewise.
  (intel_check_word): Likewise.
  (__cache_sysconf): Don't call __init_cpu_features.
  (__x86_preferred_memory_instruction): Removed.
  (init_cacheinfo): Don't call __init_cpu_features. Replace __cpu_features with GLRO(dl_x86_cpu_features).
* sysdeps/x86_64/dl-machine.h: Include <cpu-features.c>.
  (dl_platform_init): Call init_cpu_features.
* sysdeps/x86_64/ldsodefs.h: Include <cpu-features.h>.
* sysdeps/x86_64/multiarch/Makefile (aux): Remove init-arch.
* sysdeps/x86_64/multiarch/Versions: Removed.
* sysdeps/x86_64/multiarch/cacheinfo.c: Likewise.
* sysdeps/x86_64/multiarch/init-arch.c: Likewise.
* sysdeps/x86_64/multiarch/ifunc-defines.sym (KIND_OFFSET): Removed.
* sysdeps/x86_64/multiarch/init-arch.h: Rewrite.
2015-06-08  This patch adds detection of availability for AVX512F and AVX512DQ ISAs.  Andrew Senkevich
* sysdeps/x86_64/multiarch/init-arch.h (bit_AVX512F_Usable, bit_AVX512DQ_Usable, bit_Opmask_state, bit_ZMM0_15_state, bit_ZMM16_31_state): New macro.
* sysdeps/x86_64/multiarch/init-arch.c (__init_cpu_features): Check and set bit_AVX512F_Usable, bit_AVX512DQ_Usable.
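The "usable" check named in this entry can be sketched roughly as follows. This is an illustrative reading, not the code from init-arch.c: the bit positions are the architectural ones (CPUID leaf 1 ECX, CPUID leaf 7 EBX, and XCR0), but every macro and function name below is hypothetical, and the register values are passed in as arguments so the logic can be exercised without executing CPUID or XGETBV.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions; the names are hypothetical stand-ins,
   not the macros glibc defines.  */
#define CPUID1_ECX_OSXSAVE   (1u << 27)   /* OS has enabled XSAVE/XGETBV */
#define CPUID7_EBX_AVX512F   (1u << 16)
#define CPUID7_EBX_AVX512DQ  (1u << 17)
#define XCR0_SSE_STATE       (1ull << 1)  /* XMM registers */
#define XCR0_AVX_STATE       (1ull << 2)  /* upper halves of YMM */
#define XCR0_OPMASK_STATE    (1ull << 5)  /* k0-k7 */
#define XCR0_ZMM0_15_STATE   (1ull << 6)  /* upper halves of ZMM0-15 */
#define XCR0_ZMM16_31_STATE  (1ull << 7)  /* ZMM16-31 */

/* True when the OS saves and restores every register state that
   AVX-512 code touches.  */
static bool os_saves_avx512_state(uint32_t cpuid1_ecx, uint64_t xcr0)
{
  const uint64_t need = XCR0_SSE_STATE | XCR0_AVX_STATE | XCR0_OPMASK_STATE
                        | XCR0_ZMM0_15_STATE | XCR0_ZMM16_31_STATE;
  if (!(cpuid1_ecx & CPUID1_ECX_OSXSAVE))
    return false;               /* XCR0 is not meaningful without OSXSAVE */
  return (xcr0 & need) == need;
}

/* AVX512F is usable only if the CPU reports it and the OS saves the state.  */
static bool avx512f_usable(uint32_t cpuid1_ecx, uint32_t cpuid7_ebx,
                           uint64_t xcr0)
{
  return (cpuid7_ebx & CPUID7_EBX_AVX512F) != 0
         && os_saves_avx512_state(cpuid1_ecx, xcr0);
}

/* This sketch treats AVX512DQ as usable only on top of usable AVX512F.  */
static bool avx512dq_usable(uint32_t cpuid1_ecx, uint32_t cpuid7_ebx,
                            uint64_t xcr0)
{
  return (cpuid7_ebx & CPUID7_EBX_AVX512DQ) != 0
         && avx512f_usable(cpuid1_ecx, cpuid7_ebx, xcr0);
}
```

A caller would obtain cpuid1_ecx and cpuid7_ebx from CPUID and xcr0 from XGETBV with ECX = 0; keeping those reads outside the predicate is a choice made here for testability.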
2015-01-30  Use AVX unaligned memcpy only if AVX2 is available  H.J. Lu
memcpy with unaligned 256-bit AVX register loads/stores is slow on older processors like Sandy Bridge. This patch adds bit_AVX_Fast_Unaligned_Load and sets it only when AVX2 is available. [BZ #17801]
* sysdeps/x86_64/multiarch/init-arch.c (__init_cpu_features): Set the bit_AVX_Fast_Unaligned_Load bit for AVX2.
* sysdeps/x86_64/multiarch/init-arch.h (bit_AVX_Fast_Unaligned_Load): New.
  (index_AVX_Fast_Unaligned_Load): Likewise.
  (HAS_AVX_FAST_UNALIGNED_LOAD): Likewise.
* sysdeps/x86_64/multiarch/memcpy.S (__new_memcpy): Check the bit_AVX_Fast_Unaligned_Load bit instead of the bit_AVX_Usable bit.
* sysdeps/x86_64/multiarch/memcpy_chk.S (__memcpy_chk): Likewise.
* sysdeps/x86_64/multiarch/mempcpy.S (__mempcpy): Likewise.
* sysdeps/x86_64/multiarch/mempcpy_chk.S (__mempcpy_chk): Likewise.
* sysdeps/x86_64/multiarch/memmove.c (__libc_memmove): Replace HAS_AVX with HAS_AVX_FAST_UNALIGNED_LOAD.
* sysdeps/x86_64/multiarch/memmove_chk.c (__memmove_chk): Likewise.
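The change in selection logic can be illustrated with a small, portable stand-in for an IFUNC selector. It is only a sketch: the flag names, derive_features, select_memcpy, and both memcpy variants are hypothetical, the variants simply forward to the C library memcpy, and the real selectors consider more cases than the two shown here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical feature flags standing in for the bits kept by init-arch.  */
enum {
  FEATURE_AVX_USABLE              = 1u << 0,
  FEATURE_AVX2_USABLE             = 1u << 1,
  FEATURE_AVX_FAST_UNALIGNED_LOAD = 1u << 2
};

typedef void *(*memcpy_fn)(void *, const void *, size_t);

/* Stand-ins for the real implementations; both forward to memcpy.  */
static void *memcpy_generic(void *dst, const void *src, size_t n)
{
  return memcpy(dst, src, n);
}

static void *memcpy_avx_unaligned(void *dst, const void *src, size_t n)
{
  return memcpy(dst, src, n);
}

/* Startup step: mark unaligned AVX loads as fast only when AVX2 is usable,
   so AVX-only parts such as Sandy Bridge do not get the flag.  */
static uint32_t derive_features(uint32_t features)
{
  if (features & FEATURE_AVX2_USABLE)
    features |= FEATURE_AVX_FAST_UNALIGNED_LOAD;
  return features;
}

/* Selector: key on the derived flag rather than on plain AVX usability.  */
static memcpy_fn select_memcpy(uint32_t features)
{
  return (features & FEATURE_AVX_FAST_UNALIGNED_LOAD)
         ? memcpy_avx_unaligned
         : memcpy_generic;
}
```

With these definitions, a processor reporting only FEATURE_AVX_USABLE falls back to memcpy_generic, while one that also reports FEATURE_AVX2_USABLE gets memcpy_avx_unaligned.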
2015-01-23  Also treat model numbers 0x5a/0x5d as Silvermont  H.J. Lu
2015-01-23  Treat model numbers 0x4a/0x4d as Silvermont  H.J. Lu
* sysdeps/x86_64/multiarch/init-arch.c (__init_cpu_features): Treat model numbers 0x4a/0x4d as Intel Silvermont architecture.
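As a rough sketch of what such a model check involves, the snippet below decodes family and model from the EAX value returned by CPUID leaf 1 and tests the model against the four numbers named in the two entries above. The decoding follows the documented CPUID layout; struct cpu_id, decode_family_model, and is_silvermont are hypothetical names, and the list covers only the models mentioned here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the decoded identification fields.  */
struct cpu_id {
  unsigned family;
  unsigned model;
};

/* Decode family and model from CPUID leaf 1 EAX.  The extended model
   (bits 16-19) forms the high nibble of the model for families 0x06
   and 0x0f; the extended family (bits 20-27) is added for family 0x0f.  */
static struct cpu_id decode_family_model(uint32_t eax)
{
  struct cpu_id id;
  unsigned ext_model = (eax >> 12) & 0xf0;

  id.family = (eax >> 8) & 0x0f;
  id.model = (eax >> 4) & 0x0f;
  if (id.family == 0x0f) {
    id.family += (eax >> 20) & 0xff;
    id.model += ext_model;
  } else if (id.family == 0x06) {
    id.model += ext_model;
  }
  return id;
}

/* Only the model numbers named in the log entries above are listed.  */
static bool is_silvermont(struct cpu_id id)
{
  if (id.family != 0x06)
    return false;
  switch (id.model) {
  case 0x4a:
  case 0x4d:
  case 0x5a:
  case 0x5d:
    return true;
  default:
    return false;
  }
}
```

For example, an EAX value of 0x000406D8 decodes to family 0x06, model 0x4d, which this check classifies as Silvermont.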
2015-01-02  Update copyright dates with scripts/update-copyrights.  Joseph Myers
2014-04-17  Detect if AVX2 is usable  Sihai Yao
This patch checks and sets bit_AVX2_Usable in __cpu_features.feature.
* sysdeps/x86_64/multiarch/ifunc-defines.sym (COMMON_CPUID_INDEX_7): New.
* sysdeps/x86_64/multiarch/init-arch.c (__init_cpu_features): Check and set bit_AVX2_Usable.
* sysdeps/x86_64/multiarch/init-arch.h (bit_AVX2_Usable): New macro.
  (bit_AVX2): Likewise.
  (index_AVX2_Usable): Likewise.
  (CPUID_AVX2): Likewise.
  (HAS_AVX2): Likewise.
2014-01-01  Update copyright notices with scripts/update-copyrights  Allan McRae
2013-06-28  Skip SSE4.2 versions on Intel Silvermont  Liubov Dmitrieva
SSE2/SSSE3 versions are faster than SSE4.2 versions on Intel Silvermont.
2013-06-14  Set fast unaligned load flag for new Intel microarchitecture  Liubov Dmitrieva
I have a small patch for new Intel Silvermont machines. http://newsroom.intel.com/community/intel_newsroom/blog/2013/05/06/intel-launches-low-power-high-performance-silvermont-microarchitecture I checked this on my machine and saw that the unaligned versions of strcpy, ... are faster than the ssse3 versions.
2013-03-11  Remove Prefer_SSE_for_memop on x64  Ondrej Bilka
2013-01-03  Add HAS_RTM  H.J. Lu
2013-01-02  Update copyright notices with scripts/update-copyrights.  Joseph Myers
2012-10-02  Define HAS_FMA with bit_FMA_Usable  H.J. Lu
2012-05-17  BZ#14059: Fix AVX and FMA4 detection.  Carlos O'Donell
Fix AVX and FMA4 detection by following the guidelines set out by Intel and AMD for detecting these features.
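The vendor guidelines amount to checking both that the CPU advertises the feature and that the OS has enabled saving of YMM state. A minimal sketch under those assumptions follows; the bit positions are the architectural ones, while the macro and function names are hypothetical and are not taken from the patch. Register values are passed in so the logic does not depend on the machine it runs on.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions; names are hypothetical.  */
#define CPUID1_ECX_OSXSAVE     (1u << 27)  /* CPUID leaf 1, ECX */
#define CPUID1_ECX_AVX         (1u << 28)  /* CPUID leaf 1, ECX */
#define CPUID_EXT1_ECX_FMA4    (1u << 16)  /* CPUID leaf 0x80000001, ECX */
#define XCR0_SSE_STATE         (1ull << 1)
#define XCR0_AVX_STATE         (1ull << 2)

/* AVX is usable only if the CPU reports AVX, the OS has set OSXSAVE,
   and XCR0 shows that both XMM and YMM state are saved and restored.  */
static bool avx_usable(uint32_t cpuid1_ecx, uint64_t xcr0)
{
  const uint64_t need = XCR0_SSE_STATE | XCR0_AVX_STATE;

  if (!(cpuid1_ecx & CPUID1_ECX_OSXSAVE))
    return false;               /* XGETBV must not be trusted or executed */
  if (!(cpuid1_ecx & CPUID1_ECX_AVX))
    return false;
  return (xcr0 & need) == need;
}

/* FMA4 operates on YMM registers, so this sketch requires usable AVX
   in addition to the FMA4 CPUID bit.  */
static bool fma4_usable(uint32_t cpuid1_ecx, uint32_t cpuid_ext1_ecx,
                        uint64_t xcr0)
{
  return (cpuid_ext1_ecx & CPUID_EXT1_ECX_FMA4) != 0
         && avx_usable(cpuid1_ecx, xcr0);
}
```

The point of the ordering is that the CPUID feature bit alone is not sufficient: on a kernel that does not save YMM state, the bit is set but the instructions must not be used.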
2012-02-09  Replace FSF snail mail address with URLs.  Paul Eggert
2012-01-26  Really fix AVX tests  Ulrich Drepper
There is no problem with strcmp; it doesn't use the YMM registers. The math routines might, since gcc may generate such code. Introduce bit_YMM_Usable and use it in the math routines.
2012-01-26  Reset bit_AVX in __cpu_features if OS support is missing  Ulrich Drepper
2011-10-21  Fix compilation problems in x86-64 init-arch  Ulrich Drepper
2011-10-20  Check for FMA4 support and generate appropriate fma functions  Ulrich Drepper
2011-07-19  Improve 64 bit strcat functions with SSE2/SSSE3  Liubov Dmitrieva
2011-06-24  Optimized st{r,p}{,n}cpy for SSE2/SSSE3 on x86-32  H.J. Lu
2011-06-03  Assume Intel Core i3/i5/i7 processor if AVX is available  H.J. Lu
2011-03-04  Enable SSE2 memset for AMD's upcoming Orochi processor.  Harsha Jagasia
This patch enables SSE2 memset for AMD's upcoming Orochi processor. This patch also fixes the following bug: for misaligned blocks larger than 144 bytes, memset branches into the integer code path depending on the value of the misalignment, even if the startup code chose the SSE2 code path up front, when multiarch is enabled.
2010-11-12  Support Intel processor model 6 and model 0x2.  H.J. Lu
2010-11-08  Use IFUNC on x86-64 memset  H.J. Lu
2010-08-25  Unroll 32bit SSE strlen and handle slow bsf  H.J. Lu
2010-06-30  Improve 64bit memcpy/memmove for Atom, Core 2 and Core i7  H.J. Lu
This patch includes optimized 64bit memcpy/memmove for Atom, Core 2 and Core i7. It improves memcpy by up to 3X on Atom, up to 4X on Core 2 and up to 1X on Core i7. It also improves memmove by up to 3X on Atom, up to 4X on Core 2 and up to 2X on Core i7.
2010-05-27  Incorrect x86 CPU family and model check.  H.J. Lu
2010-04-04  Fix concurrent handling of __cpu_features.  Ulrich Drepper
2010-01-12  Optimize 32bit memset/memcpy with SSE2/SSSE3.  H.J. Lu
2009-10-06  Clean up x86 multiarch HAS_FOO macros.  Roland McGrath
2009-08-28  Remove ENABLE_SSSE3_ON_ATOM.  H.J. Lu
It turns out that SSSE3 isn't slow on Atom. The problem is bsf. This patch removes ENABLE_SSSE3_ON_ATOM.
2009-07-31  Support multiarch for i686.  H.J. Lu
This patch adds multiarch support when configured for i686. I modified some x86-64 functions to support 32bit. I will contribute 32bit SSE string and memory functions later.
2009-07-29  Prepare use of IFUNC functions outside libc.so.  Ulrich Drepper
We use a callback function into libc.so to get access to the data structure with the information and have special versions of the test macros which automatically use this function.
2009-07-23  Perform test for Atom x86-64 in central place and handle it.  Ulrich Drepper
There will be more than one function which, in multiarch mode, wants to use SSSE3. We should not test in each of them for Atoms with slow SSSE3. Instead, disable the SSSE3 bit in the startup code for such machines.
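The approach described in this entry, masking the feature once at startup instead of testing in every selector, might look roughly like the sketch below. SSSE3 is bit 9 of CPUID leaf 1 ECX; the Atom model number 0x1c is an assumption made for illustration and does not appear in this log, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID1_ECX_SSSE3  (1u << 9)   /* CPUID leaf 1, ECX */
#define ASSUMED_ATOM_MODEL 0x1c       /* assumption: not stated in this log */

/* Startup step: return the ECX value to store in the shared feature data,
   with SSSE3 hidden on the affected family 6 model.  */
static uint32_t startup_cpuid1_ecx(unsigned family, unsigned model,
                                   uint32_t ecx)
{
  if (family == 0x06 && model == ASSUMED_ATOM_MODEL)
    ecx &= ~CPUID1_ECX_SSSE3;
  return ecx;
}

/* Every selector then tests only the stored bit and needs no
   processor-specific knowledge of its own.  */
static bool has_ssse3(uint32_t stored_ecx)
{
  return (stored_ecx & CPUID1_ECX_SSSE3) != 0;
}
```

Centralizing the quirk means a later change of policy touches one place only, which is what the 2009-08-28 entry above ended up doing when it removed the Atom special case.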
2009-06-30  Fix little checkin problem in last patch.  Ulrich Drepper
2009-06-30  Determine and store processor family and model on x86-64.  H.J. Lu
2009-05-31  Simplify CPUID value handling.  Ulrich Drepper
So far, Intel and AMD use exactly the same bits with the same meanings in CPUID index 1. Simplify the code. Should an architecture come along which doesn't use the same semantics, it must use a different index value than COMMON_CPUID_INDEX_1.
2009-03-13  * config.h.in (USE_MULTIARCH): Define.  Ulrich Drepper
* configure.in: Handle --enable-multi-arch.
* elf/dl-runtime.c (_dl_fixup): Handle STT_GNU_IFUNC.
  (_dl_fixup_profile): Likewise.
* elf/do-lookup.c (dl_lookup_x): Likewise.
* sysdeps/x86_64/dl-machine.h: Handle STT_GNU_IFUNC.
* elf/elf.h (STT_GNU_IFUNC): Define.
* include/libc-symbols.h (libc_ifunc): Define.
* sysdeps/x86_64/cacheinfo.c: If USE_MULTIARCH is defined, use the framework in init-arch.h to get CPUID values.
* sysdeps/x86_64/multiarch/Makefile: New file.
* sysdeps/x86_64/multiarch/init-arch.c: New file.
* sysdeps/x86_64/multiarch/init-arch.h: New file.
* sysdeps/x86_64/multiarch/sched_cpucount.c: New file.
* config.make.in (experimental-malloc): Define.
* configure.in: Handle --enable-experimental-malloc.
* malloc/Makefile: Handle experimental-malloc flag.
* malloc/malloc.c: Implement PER_THREAD and ATOMIC_FASTBINS features.
* malloc/arena.c: Likewise.
* malloc/hooks.c: Likewise.
* malloc/malloc.h: Define M_ARENA_TEST and M_ARENA_MAX.