glibc/sysdeps/x86_64/multiarch
Noah Goldstein 7cbc03d030 x86: Remove memcmp-sse4.S
Code didn't actually use any sse4 instructions since `ptest` was
removed in:

commit 2f9062d717
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date:   Wed Nov 10 16:18:56 2021 -0600

    x86: Shrink memcmp-sse4.S code size

The new memcmp-sse2 implementation is also faster.

geometric_mean(N=20) of page cross cases SSE2 / SSE4: 0.905
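
As a minimal sketch of how a summary figure like the one above can be computed, the snippet below takes the geometric mean of a list of New/Old time ratios. The 20 page cross measurements are not listed in this message, so the sample input reuses the three Size = 1 ratios quoted further down; the function name and the input list are illustrative only and are not part of the glibc benchmark tooling.

```python
import math


def geometric_mean(ratios):
    """Geometric mean of positive time ratios, computed through logarithms."""
    if not ratios:
        raise ValueError("need at least one ratio")
    if any(r <= 0 for r in ratios):
        raise ValueError("ratios must be positive")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Illustrative input only: the Size = 1 ratios quoted in this message,
# not the N=20 page cross data set (which is not reproduced here).
sample_ratios = [1.2, 1.197, 1.2]
print(f"{geometric_mean(sample_ratios):.3f}")
```

The geometric mean is the usual choice for ratios because a 2x speedup (0.5) and a 2x slowdown (2.0) average out to exactly 1.0, which an arithmetic mean would not give.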

Note that preferring SSE2 causes two regressions, at Size = 1 and
Size = 65.

Size = 1:
size, align0, align1, ret, New Time/Old Time
   1,      1,      1,   0,               1.2
   1,      1,      1,   1,             1.197
   1,      1,      1,  -1,               1.2

This is intentional. Based on profiles of GCC11 and Python3, Size == 1
is significantly less hot than sizes [4, 8] (which are given a hotter
path).

Python3 Size = 1        -> 13.64%
Python3 Size = [4, 8]   -> 60.92%

GCC11   Size = 1        ->  1.29%
GCC11   Size = [4, 8]   -> 33.86%

size, align0, align1, ret, New Time/Old Time
   4,      4,      4,   0,             0.622
   4,      4,      4,   1,             0.797
   4,      4,      4,  -1,             0.805
   5,      5,      5,   0,             0.623
   5,      5,      5,   1,             0.777
   5,      5,      5,  -1,             0.802
   6,      6,      6,   0,             0.625
   6,      6,      6,   1,             0.813
   6,      6,      6,  -1,             0.788
   7,      7,      7,   0,             0.625
   7,      7,      7,   1,             0.799
   7,      7,      7,  -1,             0.795
   8,      8,      8,   0,             0.625
   8,      8,      8,   1,             0.848
   8,      8,      8,  -1,             0.914
   9,      9,      9,   0,             0.625
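
The trade-off can be made concrete with some rough arithmetic. The sketch below blends the quoted ratios using the profile percentages as weights. It rests on assumptions that are not stated in this message: that the percentages are shares of memcmp calls, that every call costs about the same before the change, that all other sizes are unchanged (ratio 1.0), and that the worst ratio in each table (1.2 for Size = 1, 0.914 for sizes 4 to 8) applies to the whole bucket. All names are hypothetical.

```python
# Profile shares in percent, as quoted in this message.
PROFILES = {
    "Python3": {"size_1": 13.64, "size_4_8": 60.92},
    "GCC11": {"size_1": 1.29, "size_4_8": 33.86},
}

RATIO_SIZE_1 = 1.2      # worst New/Old ratio in the Size = 1 table
RATIO_SIZE_4_8 = 0.914  # worst New/Old ratio among the rows for sizes 4..8


def blended_ratio(profile, ratio_1=RATIO_SIZE_1, ratio_4_8=RATIO_SIZE_4_8,
                  ratio_other=1.0):
    """Share-weighted New/Old ratio; sizes outside both buckets are assumed unchanged."""
    share_1 = profile["size_1"] / 100.0
    share_4_8 = profile["size_4_8"] / 100.0
    share_other = 1.0 - share_1 - share_4_8
    return share_1 * ratio_1 + share_4_8 * ratio_4_8 + share_other * ratio_other


for name, profile in PROFILES.items():
    print(f"{name}: {blended_ratio(profile):.3f}")
```

Under these assumptions the blended ratio stays below 1.0 for both profiles, even with the pessimistic choice of ratios.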

Size = 65:
size, align0, align1, ret, New Time/Old Time
  65,      0,      0,   0,             1.103
  65,      0,      0,   1,             1.216
  65,      0,      0,  -1,             1.227
  65,     65,      0,   0,             1.091
  65,      0,     65,   1,              1.19
  65,     65,     65,  -1,             1.215

This is because A) the checks in range [65, 96] are now unrolled 2x
and B) smaller sizes <= 16 are now given a hotter path. By
contrast the SSE4 version has a branch for Size = 80. The unrolled
version gets better performance for returns which need both
comparisons.

size, align0, align1, ret, New Time/Old Time
 128,      4,      8,   0,             0.858
 128,      4,      8,   1,             0.879
 128,      4,      8,  -1,             0.888
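
The difference in control flow between "one branch per check" and "unrolled 2x" can be modelled as below. This is only a behavioural model and not the glibc assembly: the function names are hypothetical, the 16-byte chunk size is assumed from the width of an SSE2 register, and Python cannot show the branch cost itself. It shows only that both shapes return the same result while the unrolled one takes a single combined decision.

```python
CHUNK = 16  # assumed width of one SSE2 vector, in bytes


def _first_diff(a, b, start, end):
    """Signed difference at the first mismatching byte in [start, end), else 0."""
    for i in range(start, end):
        if a[i] != b[i]:
            return a[i] - b[i]
    return 0


def compare_branch_per_chunk(a, b):
    """Check each 16-byte chunk in turn, branching before moving to the next one."""
    for start in (0, CHUNK):
        if a[start:start + CHUNK] != b[start:start + CHUNK]:  # one branch per chunk
            return _first_diff(a, b, start, start + CHUNK)
    return 0


def compare_unrolled_2x(a, b):
    """Evaluate both chunk comparisons first, then take a single combined branch."""
    eq0 = a[0:CHUNK] == b[0:CHUNK]
    eq1 = a[CHUNK:2 * CHUNK] == b[CHUNK:2 * CHUNK]
    if eq0 & eq1:  # single check covering both chunks
        return 0
    return _first_diff(a, b, 0, 2 * CHUNK)


same = bytes(range(32))
late_diff = bytes(range(31)) + b"\xff"  # differs only in the second chunk
print(compare_branch_per_chunk(same, late_diff) == compare_unrolled_2x(same, late_diff))
```

When the mismatch sits in the second chunk, the first variant has already paid for one branch before it reaches the relevant data, which matches the case the message describes as needing both comparisons.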

In addition, outside of microbenchmark environments, where branches
are not fully predictable, the branch will have a real cost.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-04-15 13:08:42 -05:00
Makefile x86: Remove memcmp-sse4.S 2022-04-15 13:08:42 -05:00
bzero.c x86-64: Optimize bzero 2022-02-08 15:58:56 -08:00
ifunc-avx2.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
ifunc-evex.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
ifunc-impl-list.c x86: Remove memcmp-sse4.S 2022-04-15 13:08:42 -05:00
ifunc-memcmp.h x86: Remove memcmp-sse4.S 2022-04-15 13:08:42 -05:00
ifunc-memcmpeq.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
ifunc-memmove.h x86: Remove mem{move|cpy}-ssse3-back 2022-04-14 23:21:42 -05:00
ifunc-memset.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
ifunc-sse4_2.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
ifunc-strcasecmp.h x86: Remove str{n}{case}cmp-ssse3 2022-04-14 23:21:41 -05:00
ifunc-strcpy.h x86: Remove str{n}cat-ssse3 2022-04-14 23:21:41 -05:00
ifunc-wcslen.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
ifunc-wmemset.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memchr-avx2-rtm.S
memchr-avx2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memchr-evex-rtm.S
memchr-evex.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memchr-sse2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memchr.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memcmp-avx2-movbe-rtm.S
memcmp-avx2-movbe.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memcmp-evex-movbe.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memcmp-sse2.S x86: Optimize memcmp SSE2 in memcmp.S 2022-04-15 13:08:35 -05:00
memcmp.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memcmpeq-avx2-rtm.S
memcmpeq-avx2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memcmpeq-evex.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memcmpeq-sse2.S x86: Optimize memcmp SSE2 in memcmp.S 2022-04-15 13:08:35 -05:00
memcmpeq.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memcpy.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memcpy_chk-nonshared.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memcpy_chk.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memmove-avx-unaligned-erms-rtm.S
memmove-avx-unaligned-erms.S
memmove-avx512-no-vzeroupper.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memmove-avx512-unaligned-erms.S
memmove-evex-unaligned-erms.S
memmove-sse2-unaligned-erms.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memmove-ssse3.S x86: Reduce code size of mem{move|pcpy|cpy}-ssse3 2022-04-14 23:21:42 -05:00
memmove-vec-unaligned-erms.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memmove.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memmove_chk-nonshared.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memmove_chk.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
mempcpy.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
mempcpy_chk-nonshared.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
mempcpy_chk.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memrchr-avx2-rtm.S
memrchr-avx2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memrchr-evex.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memrchr-sse2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memrchr.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memset-avx2-unaligned-erms-rtm.S x86-64: Optimize bzero 2022-02-08 15:58:56 -08:00
memset-avx2-unaligned-erms.S x86-64: Optimize bzero 2022-02-08 15:58:56 -08:00
memset-avx512-no-vzeroupper.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memset-avx512-unaligned-erms.S x86-64: Optimize bzero 2022-02-08 15:58:56 -08:00
memset-evex-unaligned-erms.S x86-64: Optimize bzero 2022-02-08 15:58:56 -08:00
memset-sse2-unaligned-erms.S x86-64: Remove bzero weak alias in SS2 memset 2022-02-14 10:16:02 -08:00
memset-vec-unaligned-erms.S x86: Set .text section in memset-vec-unaligned-erms 2022-02-12 04:25:19 -06:00
memset.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memset_chk-nonshared.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
memset_chk.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
rawmemchr-avx2-rtm.S
rawmemchr-avx2.S
rawmemchr-evex-rtm.S
rawmemchr-evex.S
rawmemchr-sse2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
rawmemchr.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
stpcpy-avx2-rtm.S
stpcpy-avx2.S
stpcpy-evex.S
stpcpy-sse2-unaligned.S
stpcpy-sse2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
stpcpy.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
stpncpy-avx2-rtm.S
stpncpy-avx2.S
stpncpy-c.c
stpncpy-evex.S
stpncpy-sse2-unaligned.S
stpncpy.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strcasecmp.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strcasecmp_l-avx2-rtm.S x86: Add AVX2 optimized str{n}casecmp 2022-03-25 13:16:43 -05:00
strcasecmp_l-avx2.S x86: Add AVX2 optimized str{n}casecmp 2022-03-25 13:16:43 -05:00
strcasecmp_l-evex.S x86: Add EVEX optimized str{n}casecmp 2022-03-25 13:16:50 -05:00
strcasecmp_l-sse2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strcasecmp_l-sse4_2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strcasecmp_l.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strcat-avx2-rtm.S
strcat-avx2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strcat-evex.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strcat-sse2-unaligned.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strcat-sse2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strcat.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strchr-avx2-rtm.S
strchr-avx2.S x86: Code cleanup in strchr-avx2 and comment justifying branch 2022-03-25 11:46:13 -05:00
strchr-evex.S x86: Code cleanup in strchr-evex and comment justifying branch 2022-03-25 11:46:13 -05:00
strchr-sse2-no-bsf.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strchr-sse2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strchr.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strchrnul-avx2-rtm.S
strchrnul-avx2.S
strchrnul-evex.S
strchrnul-sse2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strchrnul.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strcmp-avx2-rtm.S
strcmp-avx2.S x86: Add AVX2 optimized str{n}casecmp 2022-03-25 13:16:43 -05:00
strcmp-evex.S x86: Add EVEX optimized str{n}casecmp 2022-03-25 13:16:50 -05:00
strcmp-sse2-unaligned.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strcmp-sse2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strcmp-sse4_2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strcmp-sse42.S x86: Remove AVX str{n}casecmp 2022-03-25 13:16:51 -05:00
strcmp.c x86: Remove str{n}{case}cmp-ssse3 2022-04-14 23:21:41 -05:00
strcpy-avx2-rtm.S
strcpy-avx2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strcpy-evex.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strcpy-sse2-unaligned.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strcpy-sse2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strcpy.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strcspn-c.c x86: Optimize strcspn and strpbrk in strcspn-c.c 2022-03-25 11:46:13 -05:00
strcspn-sse2.c x86: Remove strcspn-sse2.S and use the generic implementation 2022-03-25 11:46:13 -05:00
strcspn.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strlen-avx2-rtm.S
strlen-avx2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strlen-evex.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strlen-sse2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strlen-vec.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strlen.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strncase.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strncase_l-avx2-rtm.S x86: Add AVX2 optimized str{n}casecmp 2022-03-25 13:16:43 -05:00
strncase_l-avx2.S x86: Add AVX2 optimized str{n}casecmp 2022-03-25 13:16:43 -05:00
strncase_l-evex.S x86: Add EVEX optimized str{n}casecmp 2022-03-25 13:16:50 -05:00
strncase_l-sse2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strncase_l-sse4_2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strncase_l.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strncat-avx2-rtm.S
strncat-avx2.S
strncat-c.c
strncat-evex.S
strncat-sse2-unaligned.S
strncat.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strncmp-avx2-rtm.S x86: Fallback {str|wcs}cmp RTM in the ncmp overflow case [BZ #28896] 2022-02-17 15:43:05 -06:00
strncmp-avx2.S x86: Fallback {str|wcs}cmp RTM in the ncmp overflow case [BZ #28896] 2022-02-17 15:43:05 -06:00
strncmp-evex.S
strncmp-sse2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strncmp-sse4_2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strncmp.c x86: Remove str{n}{case}cmp-ssse3 2022-04-14 23:21:41 -05:00
strncpy-avx2-rtm.S
strncpy-avx2.S
strncpy-c.c
strncpy-evex.S
strncpy-sse2-unaligned.S
strncpy.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strnlen-avx2-rtm.S
strnlen-avx2.S
strnlen-evex.S
strnlen-sse2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strnlen.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strpbrk-c.c
strpbrk-sse2.c x86: Remove strpbrk-sse2.S and use the generic implementation 2022-03-25 11:46:13 -05:00
strpbrk.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strrchr-avx2-rtm.S
strrchr-avx2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strrchr-evex.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strrchr-sse2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strrchr.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strspn-c.c x86: Optimize strspn in strspn-c.c 2022-03-25 11:46:13 -05:00
strspn-sse2.c x86: Remove strspn-sse2.S and use the generic implementation 2022-03-25 11:46:13 -05:00
strspn.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strstr-sse2-unaligned.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
strstr.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
varshift.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
varshift.h Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wcschr-avx2-rtm.S
wcschr-avx2.S
wcschr-evex.S
wcschr-sse2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wcschr.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wcscmp-avx2-rtm.S
wcscmp-avx2.S
wcscmp-evex.S
wcscmp-sse2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wcscmp.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wcscpy-c.c
wcscpy-ssse3.S x86: Small improvements for wcscpy-ssse3 2022-03-28 15:00:03 -05:00
wcscpy.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wcslen-avx2-rtm.S
wcslen-avx2.S
wcslen-evex.S
wcslen-sse2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wcslen-sse4_1.S
wcslen.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wcsncmp-avx2-rtm.S x86: Fallback {str|wcs}cmp RTM in the ncmp overflow case [BZ #28896] 2022-02-17 15:43:05 -06:00
wcsncmp-avx2.S x86: Fallback {str|wcs}cmp RTM in the ncmp overflow case [BZ #28896] 2022-02-17 15:43:05 -06:00
wcsncmp-evex.S
wcsncmp-sse2.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wcsncmp.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wcsnlen-avx2-rtm.S
wcsnlen-avx2.S
wcsnlen-c.c
wcsnlen-evex.S
wcsnlen-sse4_1.S
wcsnlen.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wcsrchr-avx2-rtm.S
wcsrchr-avx2.S
wcsrchr-evex.S
wcsrchr-sse2.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wcsrchr.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wmemchr-avx2-rtm.S
wmemchr-avx2.S
wmemchr-evex-rtm.S
wmemchr-evex.S
wmemchr-sse2.S
wmemchr.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wmemcmp-avx2-movbe-rtm.S
wmemcmp-avx2-movbe.S
wmemcmp-evex-movbe.S
wmemcmp-sse2.S x86: Optimize memcmp SSE2 in memcmp.S 2022-04-15 13:08:35 -05:00
wmemcmp-sse4.S
wmemcmp.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wmemset.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wmemset_chk-nonshared.S Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00
wmemset_chk.c Update copyright dates with scripts/update-copyrights 2022-01-01 11:40:24 -08:00