glibc

MayShao-oc c19457aec6 x86_64: Optimize large size copy in memmove-ssse3	2024-06-30 06:26:43 -07:00

This patch optimizes large size copy using normal store when src > dst and overlap. Make it the same as the logic in memmove-vec-unaligned-erms.S.

Current memmove-ssse3 uses '__x86_shared_cache_size_half' as the non-temporal threshold; this patch updates that value to '__x86_shared_non_temporal_threshold'. Currently, the __x86_shared_non_temporal_threshold is CPU-specific, and different CPUs will have different values based on the related nt-benchmark results. However, in memmove-ssse3, the non-temporal threshold uses '__x86_shared_cache_size_half', which sounds unreasonable.

The performance is not changed drastically, although it shows overall improvements without any major regressions or gains.

Results on Zhaoxin KX-7000:
bench-memcpy geometric_mean(N=20) New / Original: 0.999
bench-memcpy-random geometric_mean(N=20) New / Original: 0.999
bench-memcpy-large geometric_mean(N=20) New / Original: 0.978
bench-memmove geometric_mean(N=20) New / Original: 1.000
bench-memmove-large geometric_mean(N=20) New / Original: 0.962

Results on Intel Core i5-6600K:
bench-memcpy geometric_mean(N=20) New / Original: 1.001
bench-memcpy-random geometric_mean(N=20) New / Original: 0.999
bench-memcpy-large geometric_mean(N=20) New / Original: 1.001
bench-memmove geometric_mean(N=20) New / Original: 0.995
bench-memmove-large geometric_mean(N=20) New / Original: 0.936

Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
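
The store-selection rule described in the commit message can be sketched in C roughly as follows. This is an illustration only, not the assembly in memmove-ssse3.S: the enum, the helper names, and the placeholder threshold value are hypothetical, and the variable `non_temporal_threshold` merely stands in for glibc's internal `__x86_shared_non_temporal_threshold`. The commit message names only the overlapping case with src > dst; treating the dst > src overlap the same way, and using `<` for the threshold comparison, are assumptions of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for __x86_shared_non_temporal_threshold.
   The value is an arbitrary placeholder, not a real tuned threshold. */
static size_t non_temporal_threshold = (size_t) 1 << 20;

/* Hypothetical labels for the two kinds of stores discussed above. */
enum copy_strategy
{
  COPY_NORMAL_STORES,
  COPY_NON_TEMPORAL_STORES
};

/* True if [src, src + n) and [dst, dst + n) share at least one byte. */
static bool
regions_overlap (const void *dst, const void *src, size_t n)
{
  uintptr_t d = (uintptr_t) dst;
  uintptr_t s = (uintptr_t) src;
  uintptr_t distance = d > s ? d - s : s - d;
  return distance < n;
}

/* Pick a store type for a copy of n bytes.  */
static enum copy_strategy
choose_copy_strategy (const void *dst, const void *src, size_t n)
{
  /* Below the threshold, cached (normal) stores are used.  */
  if (n < non_temporal_threshold)
    return COPY_NORMAL_STORES;

  /* Overlapping regions use normal stores.  The commit message names
     the src > dst case; handling dst > src identically is an
     assumption of this sketch.  */
  if (regions_overlap (dst, src, n))
    return COPY_NORMAL_STORES;

  /* Large, non-overlapping copies use non-temporal stores.  */
  return COPY_NON_TEMPORAL_STORES;
}
```

With the placeholder threshold lowered for demonstration, a 128-byte copy whose source starts 16 bytes above the destination would select normal stores, while the same size copied between disjoint regions would select non-temporal stores.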
aarch64	Aarch64: Add new memset for Qualcomm's oryon-1 core	2024-06-30 13:47:17 +02:00
alpha	elf: Remove HWCAP_IMPORTANT	2024-06-18 10:45:36 +02:00
arc	Convert to autoconf 2.72 (vanilla release, no distribution patches)	2024-06-17 21:15:28 +02:00
arm	arm: Avoid UB in elf_machine_rel()	2024-06-26 12:45:43 +02:00
csky	elf: Remove HWCAP_IMPORTANT	2024-06-18 10:45:36 +02:00
generic	elf: Remove HWCAP_IMPORTANT	2024-06-18 10:45:36 +02:00
gnu	login: Use unsigned 32-bit types for seconds-since-epoch	2024-04-19 14:38:17 +02:00
hppa	Update hppa libm-test-ulps	2024-06-23 13:51:25 -04:00
htl	…
hurd	…
i386	i386: Update ulps	2024-06-20 19:00:48 +02:00
ieee754	Convert to autoconf 2.72 (vanilla release, no distribution patches)	2024-06-17 21:15:28 +02:00
loongarch	LoongArch: Fix tst-gnu2-tls2 test case	2024-06-26 12:02:07 +08:00
m68k	Implement C23 logp1	2024-06-17 13:47:09 +00:00
mach	Convert to autoconf 2.72 (vanilla release, no distribution patches)	2024-06-17 21:15:28 +02:00
microblaze	Implement C23 logp1	2024-06-17 13:47:09 +00:00
mips	Revert "MIPSr6/math: Use builtin fma and fmaf"	2024-06-25 01:02:58 +02:00
nios2	Convert to autoconf 2.72 (vanilla release, no distribution patches)	2024-06-17 21:15:28 +02:00
nptl	Always define __USE_TIME_BITS64 when 64 bit time_t is used	2024-04-02 15:28:36 -03:00
or1k	Implement C23 logp1	2024-06-17 13:47:09 +00:00
posix	posix: Sync tempname with gnulib	2024-04-10 14:53:39 -03:00
powerpc	powerpc: Update ulps	2024-06-20 12:15:31 +02:00
pthread	Add crt1-2.0.o for glibc 2.0 compatibility tests	2024-05-06 07:49:40 -07:00
riscv	RISC-V: Update ulps	2024-06-20 23:46:32 +02:00
s390	s390x: Capture grep output in static PIE check	2024-06-20 14:34:06 +02:00
sh	Implement C23 logp1	2024-06-17 13:47:09 +00:00
sparc	sparc: Regenerate ULPs	2024-06-19 14:58:32 +02:00
unix	posix: Fix pidfd_spawn/pidfd_spawnp leak if execve fails (BZ 31695)	2024-06-25 12:11:48 -03:00
wordsize-32	…
wordsize-64	…
x86	x86: Set preferred CPU features on the KH-40000 and KX-7000 Zhaoxin processors	2024-06-30 06:26:43 -07:00
x86_64	x86_64: Optimize large size copy in memmove-ssse3	2024-06-30 06:26:43 -07:00