Noah Goldstein 7cbc03d030 x86: Remove memcmp-sse4.S
Code didn't actually use any sse4 instructions since `ptest` was
removed in:

commit 2f9062d717
Author: Noah Goldstein <goldstein.w.n@gmail.com>
Date:   Wed Nov 10 16:18:56 2021 -0600

    x86: Shrink memcmp-sse4.S code size

The new memcmp-sse2 implementation is also faster.

geometric_mean(N=20) of page cross cases SSE2 / SSE4: 0.905

Note there are two regressions preferring SSE2 for Size = 1 and Size =
65.

Size = 1:
size, align0, align1, ret, New Time/Old Time
   1,      1,      1,   0,               1.2
   1,      1,      1,   1,             1.197
   1,      1,      1,  -1,               1.2

This is intentional. Based on profiles of GCC11 and Python3, Size == 1
is significantly less hot than sizes [4, 8] (which are given a hotter
path).

Python3 Size = 1        -> 13.64%
Python3 Size = [4, 8]   -> 60.92%

GCC11   Size = 1        ->  1.29%
GCC11   Size = [4, 8]   -> 33.86%

size, align0, align1, ret, New Time/Old Time
   4,      4,      4,   0,             0.622
   4,      4,      4,   1,             0.797
   4,      4,      4,  -1,             0.805
   5,      5,      5,   0,             0.623
   5,      5,      5,   1,             0.777
   5,      5,      5,  -1,             0.802
   6,      6,      6,   0,             0.625
   6,      6,      6,   1,             0.813
   6,      6,      6,  -1,             0.788
   7,      7,      7,   0,             0.625
   7,      7,      7,   1,             0.799
   7,      7,      7,  -1,             0.795
   8,      8,      8,   0,             0.625
   8,      8,      8,   1,             0.848
   8,      8,      8,  -1,             0.914
   9,      9,      9,   0,             0.625

Size = 65:
size, align0, align1, ret, New Time/Old Time
  65,      0,      0,   0,             1.103
  65,      0,      0,   1,             1.216
  65,      0,      0,  -1,             1.227
  65,     65,      0,   0,             1.091
  65,      0,     65,   1,              1.19
  65,     65,     65,  -1,             1.215

This is because A) the checks in range [65, 96] are now unrolled 2x
and B) smaller values <= 16 are now given a hotter path. By
contrast the SSE4 version has a branch for Size = 80. The unrolled
version gets better performance for returns which need both
comparisons.

size, align0, align1, ret, New Time/Old Time
 128,      4,      8,   0,             0.858
 128,      4,      8,   1,             0.879
 128,      4,      8,  -1,             0.888

As well, outside of microbenchmarks, in environments that are not
fully predictable, the branch will have a real cost.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-04-15 13:08:42 -05:00
ChangeLog.old
argp
assert
benchtests benchtests: Use json-lib in bench-strncasecmp.c 2022-03-25 11:46:13 -05:00
bits
catgets
conform
crypt crypt: Remove unused variable on cert test 2022-03-31 09:00:54 -03:00
csu
ctype
debug debug: Improve fdelt_chk error message 2022-03-28 19:10:30 +05:30
dirent
dlfcn
elf S390: Add new s390 platform z16. 2022-04-14 10:37:45 +02:00
gmon
gnulib
grp
gshadow
hesiod
htl
hurd Replace {u}int_fast{16|32} with {u}int32_t 2022-04-13 21:23:04 -05:00
iconv Replace {u}int_fast{16|32} with {u}int32_t 2022-04-13 21:23:04 -05:00
iconvdata Replace {u}int_fast{16|32} with {u}int32_t 2022-04-13 21:23:04 -05:00
include
inet
intl
io
libio
locale Replace {u}int_fast{16|32} with {u}int32_t 2022-04-13 21:23:04 -05:00
localedata Add rif_MA locale [BZ #27781] 2022-04-07 14:59:41 +02:00
login
mach
malloc
manual nptl: Handle spurious EINTR when thread cancellation is disabled (BZ#29029) 2022-04-14 12:48:31 -03:00
math
mathvec
misc misc: Use 64 bit time_t interfaces on syslog 2022-04-15 10:41:54 -03:00
nis
nptl nptl: Handle spurious EINTR when thread cancellation is disabled (BZ#29029) 2022-04-14 12:48:31 -03:00
nptl_db
nscd
nss hurd: Fix arbitrary error code 2022-04-12 22:15:48 +02:00
po
posix Replace {u}int_fast{16|32} with {u}int32_t 2022-04-13 21:23:04 -05:00
pwd
resolv Replace {u}int_fast{16|32} with {u}int32_t 2022-04-13 21:23:04 -05:00
resource
rt
scripts
setjmp
shadow
signal
socket
soft-fp
stdio-common stdio: Split __get_errname definition from errlist.c 2022-04-15 09:37:57 -03:00
stdlib stdlib: Reflow and sort most variable assignments 2022-04-13 16:52:39 -03:00
string Replace {u}int_fast{16|32} with {u}int32_t 2022-04-13 21:23:04 -05:00
sunrpc
support support: Add xmkfifo 2022-04-15 09:59:33 -03:00
sysdeps x86: Remove memcmp-sse4.S 2022-04-15 13:08:42 -05:00
sysvipc
termios
time
timezone
wcsmbs
wctype
.clang-format Add .clang-format style file 2022-04-11 10:51:03 -05:00
.gitattributes
.gitignore
CONTRIBUTED-BY
COPYING
COPYING.LIB
INSTALL
LICENSES
MAINTAINERS
Makeconfig Remove -z combreloc and HAVE_Z_COMBRELOC 2022-04-04 17:19:07 -07:00
Makefile
Makefile.help
Makefile.in
Makerules
NEWS NEWS: Move PLT tracking slowdown to glibc 2.35. 2022-04-12 13:26:10 -04:00
README
Rules
SHARED-FILES
abi-tags
aclocal.m4
config.h.in Remove -z combreloc and HAVE_Z_COMBRELOC 2022-04-04 17:19:07 -07:00
config.make.in Remove -z combreloc and HAVE_Z_COMBRELOC 2022-04-04 17:19:07 -07:00
configure Remove -z combreloc and HAVE_Z_COMBRELOC 2022-04-04 17:19:07 -07:00
configure.ac Remove -z combreloc and HAVE_Z_COMBRELOC 2022-04-04 17:19:07 -07:00
extra-lib.mk
gen-locales.mk
libc-abis
libof-iterator.mk
o-iterator.mk
shlib-versions
test-skeleton.c
version.h

README

This directory contains the sources of the GNU C Library.
See the file "version.h" for what release version you have.

The GNU C Library is the standard system C library for all GNU systems,
and is an important part of what makes up a GNU system.  It provides the
system API for all programs written in C and C-compatible languages such
as C++ and Objective C; the runtime facilities of other programming
languages use the C library to access the underlying operating system.

In GNU/Linux systems, the C library works with the Linux kernel to
implement the operating system behavior seen by user applications.
In GNU/Hurd systems, it works with a microkernel and Hurd servers.

The GNU C Library implements much of the POSIX.1 functionality in the
GNU/Hurd system, using configurations i[4567]86-*-gnu.

When working with Linux kernels, this version of the GNU C Library
requires Linux kernel version 3.2 or later.

Also note that the shared version of the libgcc_s library must be
installed for the pthread library to work correctly.

The GNU C Library supports these configurations for using Linux kernels:

	aarch64*-*-linux-gnu
	alpha*-*-linux-gnu
	arc*-*-linux-gnu
	arm-*-linux-gnueabi
	csky-*-linux-gnuabiv2
	hppa-*-linux-gnu
	i[4567]86-*-linux-gnu
	x86_64-*-linux-gnu	Can build either x86_64 or x32
	ia64-*-linux-gnu
	m68k-*-linux-gnu
	microblaze*-*-linux-gnu
	mips-*-linux-gnu
	mips64-*-linux-gnu
	or1k-*-linux-gnu
	powerpc-*-linux-gnu	Hardware or software floating point, BE only.
	powerpc64*-*-linux-gnu	Big-endian and little-endian.
	s390-*-linux-gnu
	s390x-*-linux-gnu
	riscv32-*-linux-gnu
	riscv64-*-linux-gnu
	sh[34]-*-linux-gnu
	sparc*-*-linux-gnu
	sparc64*-*-linux-gnu

If you are interested in doing a port, please contact the glibc
maintainers; see https://www.gnu.org/software/libc/ for more
information.

See the file INSTALL to find out how to configure, build, and install
the GNU C Library.  You might also consider reading the WWW pages for
the C library at https://www.gnu.org/software/libc/.

The GNU C Library is (almost) completely documented by the Texinfo manual
found in the `manual/' subdirectory.  The manual is still being updated
and contains some known errors and omissions; we regret that we do not
have the resources to work on the manual as much as we would like.  For
corrections to the manual, please file a bug in the `manual' component,
following the bug-reporting instructions below.  Please be sure to check
the manual in the current development sources to see if your problem has
already been corrected.

Please see https://www.gnu.org/software/libc/bugs.html for bug reporting
information.  We are now using the Bugzilla system to track all bug reports.
This web page gives detailed information on how to report bugs properly.

The GNU C Library is free software.  See the file COPYING.LIB for copying
conditions, and LICENSES for notices about a few contributions that require
these additional notices to be distributed.  License copyright years may be
listed using range notation, e.g., 1996-2015, indicating that every year in
the range, inclusive, is a copyrightable year that would otherwise be listed
individually.