glibc/sysdeps
Noah Goldstein 26b2478322 x86: Reduce code size of mem{move|pcpy|cpy}-ssse3
The goal is to remove most SSSE3 functions as SSE4, AVX2, and EVEX are
generally preferable. memcpy/memmove is one exception where avoiding
unaligned loads with `palignr` is important for some targets.
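
As a rough illustration of the `palignr` idea mentioned above, the following portable C sketch models what the instruction computes: it joins two adjacent 16-byte blocks and extracts 16 bytes starting at a byte offset, so a misaligned source can be copied using only block-aligned reads. The names `alignr16` and `copy_shifted` are hypothetical and are not taken from glibc. The real implementation is hand-written assembly, and this is only a model of the technique under those assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of `palignr`: treat hi:lo as one 32-byte value and
   extract the 16 bytes that start `shift` bytes into it (shift in 0..16). */
static void alignr16(uint8_t out[16], const uint8_t lo[16],
                     const uint8_t hi[16], unsigned shift)
{
    uint8_t both[32];
    memcpy(both, lo, 16);        /* low half: the earlier block */
    memcpy(both + 16, hi, 16);   /* high half: the following block */
    memcpy(out, both + shift, 16);
}

/* Copy nblocks * 16 bytes starting at aligned_src + shift, while only ever
   reading whole 16-byte blocks that begin at multiples of 16 from
   aligned_src.  Reads nblocks + 1 source blocks in total. */
static void copy_shifted(uint8_t *dst, const uint8_t *aligned_src,
                         unsigned shift, size_t nblocks)
{
    const uint8_t *lo = aligned_src;
    for (size_t i = 0; i < nblocks; i++) {
        const uint8_t *hi = lo + 16;
        alignr16(dst + 16 * i, lo, hi, shift);
        lo = hi;                 /* the high block becomes the next low block */
    }
}
```

In this model each source block is loaded once per role (as `hi`, then as `lo`), which mirrors how an assembly loop can keep the previous block in a register and combine it with the next aligned load.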

This commit replaces memmove-ssse3 with a better-optimized version that
has a lower code footprint. It also aliases memcpy to memmove.
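
Aliasing memcpy to memmove is safe because memmove already handles every input that memcpy accepts (memcpy only adds the requirement that the buffers do not overlap). A minimal C sketch of the idea follows, assuming GCC or Clang on an ELF target where `__attribute__((alias))` is available. The names `demo_memmove` and `demo_memcpy` are hypothetical, and the actual commit does this in assembly rather than in C.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical overlap-safe byte copy standing in for memmove. */
void *demo_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if ((uintptr_t)d < (uintptr_t)s) {
        /* Destination is below the source: copy forward. */
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    } else if ((uintptr_t)d > (uintptr_t)s) {
        /* Destination is above the source: copy backward so that
           overlapping bytes are read before they are overwritten. */
        for (size_t i = n; i > 0; i--)
            d[i - 1] = s[i - 1];
    }
    return dst;
}

/* Second symbol name for the same code: no separate memcpy body is emitted.
   Assumes compiler and object-format support for the alias attribute. */
void *demo_memcpy(void *dst, const void *src, size_t n)
    __attribute__((alias("demo_memmove")));
```

Both names resolve to the same address, so the memcpy entry point costs no additional code, which is where the second part of the size saving reported in this commit comes from.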

Aside from this function all other SSSE3 functions should be safe to
remove.

Performance does not change drastically, although it shows an overall
improvement with no major regressions or gains.

bench-memcpy geometric_mean(N=50) New / Original: 0.957

bench-memcpy-random geometric_mean(N=50) New / Original: 0.912

bench-memcpy-large geometric_mean(N=50) New / Original: 0.892

Benchmarks were run on Zhaoxin KX-6840@2000MHz. See attached numbers
for all results.

More importantly, this saves 7246 bytes of code size in memmove and an
additional 10741 bytes by reusing memmove code for memcpy (17987 bytes
saved in total). It also saves an additional 896 bytes of rodata for the
jump table entries.
2022-04-14 23:21:42 -05:00
aarch64 elf: Fix runtime linker auditing on aarch64 (BZ #26643) 2022-02-01 14:49:46 -03:00
alpha alpha: Remove fcopysign{f} implementation 2022-04-07 14:56:26 -03:00
arc Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
arm Remove -z combreloc and HAVE_Z_COMBRELOC 2022-04-04 17:19:07 -07:00
csky Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
generic Remove _dl_skip_args_internal declaration 2022-04-12 14:42:26 +01:00
gnu Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
hppa Remove -z combreloc and HAVE_Z_COMBRELOC 2022-04-04 17:19:07 -07:00
htl htl: Fix initializing the key lock 2022-02-14 19:29:02 +01:00
hurd hurd: Fix pthread_kill on exiting/ted thread 2022-01-15 15:11:54 +01:00
i386 x86: Remove fcopysign{f} implementation 2022-04-07 12:17:15 -03:00
ia64 ia64: Remove fcopysign{f} implementation 2022-04-07 12:27:00 -03:00
ieee754 math: Use builtin for ldbl-96 copysign 2022-04-07 14:54:14 -03:00
m68k Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
mach hurd: Define ELIBEXEC 2022-04-12 22:16:40 +02:00
microblaze Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
mips Replace {u}int_fast{16|32} with {u}int32_t 2022-04-13 21:23:04 -05:00
nios2 Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
nptl nptl: Handle spurious EINTR when thread cancellation is disabled (BZ#29029) 2022-04-14 12:48:31 -03:00
or1k elf: Remove prelink support 2022-02-10 09:16:12 -03:00
posix gmon: Remove unused sprofil.c functions 2022-03-23 14:29:25 -03:00
powerpc powerpc64: Set up thread register for _dl_relocate_static_pie 2022-04-10 08:33:40 +09:30
pthread nptl: Handle spurious EINTR when thread cancellation is disabled (BZ#29029) 2022-04-14 12:48:31 -03:00
riscv manual: Avoid name collision in libm ULP table [BZ #28956] 2022-04-11 11:46:10 -04:00
s390 S390: Add new s390 platform z16. 2022-04-14 10:37:45 +02:00
sh elf: Remove prelink support 2022-02-10 09:16:12 -03:00
sparc sparc64: Remove fcopysign{f} implementation 2022-04-07 15:11:56 -03:00
unix Replace {u}int_fast{16|32} with {u}int32_t 2022-04-13 21:23:04 -05:00
wordsize-32 Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wordsize-64 Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
x86 x86: Fix fallback for wcsncmp_avx2 in strcmp-avx2.S [BZ #28896] 2022-03-25 11:46:13 -05:00
x86_64 x86: Reduce code size of mem{move|pcpy|cpy}-ssse3 2022-04-14 23:21:42 -05:00