glibc/sysdeps
H.J. Lu 7e681561a3 x86-64: Compile branred.c with -mprefer-vector-width=128 [BZ #24603]
When compiled with -O3 and AVX, GCC 8 and 9 optimize some loops in
sysdeps/ieee754/dbl-64/branred.c with 256-bit vector instructions,
which leads to store-forwarding stalls:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579

There is no easy fix in the compiler.  This patch limits the vector width
to 128 bits to work around this issue.  It improves the performance of sin
and cos by more than 40% on Skylake when compiled with -O3 -march=skylake.

Tested with GCC 7/8/9 on x86-64.

	[BZ #24603]
	* sysdeps/x86_64/configure.ac: Check if -mprefer-vector-width=128
	works.
	* sysdeps/x86_64/configure: Regenerated.
	* sysdeps/x86_64/fpu/Makefile (CFLAGS-branred.c): New.  Set
	to -mprefer-vector-width=128 if supported.
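
A minimal sketch of what the Makefile change described above could look
like, assuming the configure check exports its result as a make variable.
The variable name config-cflags-mprefer-vector-width is an assumption made
for illustration and is not taken from the commit text; CFLAGS-branred.c
and the flag itself are named in the ChangeLog entry.

```make
# Sketch only: the variable name below is assumed, not confirmed by the
# commit message.  It stands for the result of the configure check that
# tests whether the compiler accepts -mprefer-vector-width=128.
ifeq ($(config-cflags-mprefer-vector-width),yes)
# Per-file flag: only branred.c is limited to 128-bit vectors, so the
# rest of the math library keeps the compiler's default vector width.
CFLAGS-branred.c = -mprefer-vector-width=128
endif
```

Scoping the flag to a single source file keeps the workaround narrow:
compilers that do not accept the option leave CFLAGS-branred.c unset and
build branred.c exactly as before.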
2019-07-24 14:48:43 -07:00
aarch64
alpha
arm
csky
generic
gnu
hppa
htl
hurd
i386
ia64
ieee754
init_array
m68k
mach
microblaze
mips
nios2
nptl nptl: Remove unnecessary forwarding of pthread_cond_clockwait from libc 2019-07-18 11:24:33 -03:00
posix
powerpc
pthread
riscv
s390
sh
sparc
unix Linux: Use in-tree copy of SO_ constants for !__USE_MISC [BZ #24532] 2019-07-24 10:59:34 +02:00
wordsize-32
wordsize-64
x86
x86_64 x86-64: Compile branred.c with -mprefer-vector-width=128 [BZ #24603] 2019-07-24 14:48:43 -07:00