glibc

Commit Graph

Author	SHA1	Message	Date
Adhemerval Zanella	907089ba36	linux: Handle EINVAL as unsupported on tst-pidfd_getinfo Some kernels return EINVAL for ioctl (PIDFD_GET_INFO) on pidfd descriptors. Checked on aarch64-linux-gnu with Linux 6.12. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2025-11-21 13:13:26 -03:00
Adhemerval Zanella	8d26bed1eb	Enable --enable-fortify-source with clang clang generates internal calls for some _chk symbol, so add internal aliases for them, and stub some with rtld-stubbed-symbols to avoid ld.so linker issues. Reviewed-by: Sam James <sam@gentoo.org>	2025-11-21 13:13:11 -03:00
Adhemerval Zanella	42f07a44ef	math: Remove ldbl-96 fma implementation It is worse than the ldbl-64 version on recent x86 hardware. With Zen3 and gcc-15: ldbl-96 removal reciprocal-throughput master patched improvement x86_64 1176.2200 289.4640 4.06x i686 1476.0600 636.8660 2.32x latency master patched improvement x86_64 1176.2200 293.7360 4.00x i686 1480.0700 658.4160 2.25x Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-11-21 13:13:02 -03:00
Samuel Thibault	ff92750112	htl: Move pthread_atfork compatibility symbol to libc There is no new symbol version because of the compatibility symbol status.	2025-11-21 00:29:44 +01:00
gfleury	b36a126f7d	htl: move pthread_spin_{destroy, lock, init, trylock, unlock} and remove _pthread_spin_lock, into libc. Message-ID: <20251120085647.326643-1-gfleury@disroot.org>	2025-11-21 00:29:44 +01:00
Samuel Thibault	951bb5c458	hurd: Add missing free_sized and free_aligned_sized `56549264d1` ("malloc: add free_sized and free_aligned_sized from C23") missed adding them.	2025-11-20 17:41:45 +01:00
Adhemerval Zanella	92186652d8	math: Sync atanh from CORE-MATH The CORE-MATH commit 703d7487 fixes some issues for RNDZ: Failure: Test: atanh_towardzero (0x5.96200b978b69cp-4) Result: is: 3.6447730550366463e-01 0x1.753989ed16faap-2 should be: 3.6447730550366458e-01 0x1.753989ed16fa9p-2 difference: 5.5511151231257827e-17 0x1.0000000000000p-54 ulp : 1.0000 max.ulp : 0.0000 Maximal error of `atanh_towardzero' is : 1 ulp accepted: 0 ulp Checked on x86_64-linux-gnu, x86_64-linux-gnu-v3, aarch64-linux-gnu, and i686-linux-gnu.	2025-11-19 15:21:44 -03:00
Justin King	56549264d1	malloc: add free_sized and free_aligned_sized from C23 Signed-off-by: Justin King <jcking@google.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2025-11-19 13:47:53 -03:00
Adhemerval Zanella	4567204feb	math: Sync acosh from CORE-MATH The CORE-MATH commit 6736002f fixes some issues for RNDZ: Failure: Test: acosh_towardzero (0x1.08000c1e79fp+0) Result: is: 2.4935636091994373e-01 0x1.feae8c399b18cp-3 should be: 2.4935636091994370e-01 0x1.feae8c399b18bp-3 difference: 2.7755575615628913e-17 0x1.0000000000000p-55 ulp : 1.0000 max.ulp : 0.0000 Failure: Test: acosh_towardzero (0x1.080016353964ep+0) Result: is: 2.4935874767710369e-01 0x1.feafcc91f518ep-3 should be: 2.4935874767710367e-01 0x1.feafcc91f518dp-3 difference: 2.7755575615628913e-17 0x1.0000000000000p-55 ulp : 1.0000 max.ulp : 0.0000 Maximal error of `acosh_towardzero' is : 1 ulp accepted: 0 ulp This only happens when the ISA supports fma, such as x86_64-v3, aarch64, or powerpc. Checked on x86_64-linux-gnu, x86_64-linux-gnu-v3, aarch64-linux-gnu, and i686-linux-gnu.	2025-11-19 12:58:56 -03:00
H. Peter Anvin	40a751b004	linux/termios: test the kernel-side termios canonicalization Verify that the kernel side of the termios interface gets the various speed fields set according to our current canonicalization policy. [ v2.1: fix formatting - Adhemerval Netto ] [ v4: fix typo in patch description - Dan Horák ] Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> (v2.1) Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2025-11-19 07:54:51 +08:00
Dylan Fleming	fd1d642ef8	AArch64: Remove WANT_SIMD_EXCEPT from aarch64 AdvSIMD math routines Remove legacy code for supporting an old Arm Optimised Routines deprecated feature for throwing SIMD Exceptions. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2025-11-18 15:51:15 +00:00
Pierre Blanchard	bb6519de1e	AArch64: Fix and improve SVE pow(f) special cases powf: Update scalar special case function to best use new interface. pow: Make specialcase NOINLINE to prevent str/ldr leaking in fast path. Remove dependency in sv_call2, as new callback impl is not a performance gain. Replace with vectorised specialcase since structure of scalar routine is fairly simple. Throughput gain of about 5-10% on V1 for large values and 25% for subnormal `x`. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-11-18 15:51:15 +00:00
Pierre Blanchard	e889160273	AArch64: fix SVE tanpi(f) [BZ #33642 ] Fixed svld1rq using incorrect predicates (BZ #33642). Next to no performance variations (tested on V1). Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-11-18 15:51:15 +00:00
gfleury	d989840693	htl: move pthread_hurd_cond_timedwait_np, pthread_hurd_cond_wait_np into libc. Message-ID: <20251118125044.1160780-3-gfleury@disroot.org>	2025-11-18 15:01:35 +01:00
gfleury	bb3524a879	htl: move pthread_getname_np/setname_np into libc. Message-ID: <20251118125044.1160780-2-gfleury@disroot.org>	2025-11-18 15:01:35 +01:00
Adhemerval Zanella	8c66b742cf	Add new AArch64 HWCAP3 definitions from Linux 6.17 to bits/hwcap.h Linux 7c7f55039b8d6 added HWCAP3_MTE_FAR and f620372209bfe added HWCAP3_MTE_STORE_ONLY.	2025-11-18 10:35:32 -03:00
Stefan Liebler	b9579342c6	Remove support for lock elision. The support for lock elision was already deprecated with glibc 2.42: commit `77438db8cf` "Mark support for lock elision as deprecated." See also discussions: https://sourceware.org/pipermail/libc-alpha/2025-July/168492.html This patch removes the architecture specific support for lock elision for x86, powerpc and s390 by removing the elision-conf.h, elision-conf.c, elision-lock.c, elision-timed.c, elision-unlock.c, elide.h, htm.h/hle.h files. Those generic files are also removed. The architecture specific structures are adjusted and the elision fields are marked as unused. See struct_mutex.h files. Furthermore in struct_rwlock.h, the leftover __rwelision was also removed. Those were originally removed with commit `0377a7fde6` "nptl: Remove rwlock elision definitions" and by chance reintroduced with commit `7df8af43ad` "nptl: Add struct_rwlock.h" The common code (e.g. the pthread_mutex-files) are changed back to the time before lock elision was introduced with the x86-support: - commit `1cdbe57948` "Add the low level infrastructure for pthreads lock elision with TSX" - commit `b023e4ca99` "Add new internal mutex type flags for elision." - commit `68cc29355f` "Add minimal test suite changes for elision enabled kernels" - commit `e8c659d74e` "Add elision to pthread_mutex_{try,timed,un}lock" - commit `49186d21ef` "Disable elision for any pthread_mutexattr_settype call" - commit `1717da59ae` "Add a configure option to enable lock elision and disable by default" Elision is removed also from the tunables, the initialization part, the pretty-printers and the manual. Some extra handling in the testsuite is removed as well as the full tst-mutex10 testcase, which tested a race while enabling lock elision. I've also searched the code for "elision", "elide", "transaction" and e.g. cleaned some comments. I've run the testsuite on x86_64 and s390x and run the build-many-glibcs.py script. Thanks to Sachin Monga, this patch is also tested on powerpc. A NEWS entry also mentions the removal. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-11-18 14:21:13 +01:00
H. Peter Anvin	6463953fec	linux/termios: factor out the kernel interface from termios_internal.h Factor out the internal kernel interface from termios_internal.h, so that it can be used in test code without causing breakage due to glibc internals used in headers. [ v3: fix Alpha build breakage ] Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2025-11-18 12:05:20 +08:00
H. Peter Anvin	8d999a6993	linux/termios: clear k_termios.c_cflag & CIBAUD for non-split speed [BZ 33340] After getting more experience with the various broken direct-to-ioctl termios2 hacks using Fedora 43 beta, I have found a fair number of cases where the software would fail to set, or clear CIBAUD for non-split-speed operation. Thus it seems it will help improve compatibility to clear the kernel-side version of c_cflag & CIBAUD (having the same meaning to the Linux kernel as the speed 0 has for cfsetibaud(), i.e. force the input speed to equal the output speed) for non-split-speed operation, rather than having it explicitly equal the output speed in CBAUD. When writing the code that went into glibc 2.42 I had considered this issue, and had to make an educated guess which way would be more likely to break fewer things. Unfortunately, it appears I guessed wrong. A third option would be to always set CIBAUD to __BOTHER, even for the standard baud rates. However, that is an even bigger departure from legacy behavior, whereas this variant mostly preserves current behavior in terms of under what conditions buggy utilities will continue to work. This change is in tcsetattr() rather than ___termios2_canonicalize_speeds(), as it should not be run for tcgetattr(); that would break split speed support for the legacy interface versions of cfgetispeed() and cfsetispeed(). [ v2: fixed comment style ] Resolves: BZ #33340 Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2025-11-18 12:04:55 +08:00
Adhemerval Zanella	1abdb38135	math: Handle fabsf128 !__USE_EXTERN_INLINES Work around the clang limitation wrt inline function and attribute definition, where it does not allow to 'add' new attribute if a function is already defined: clang on x86_64 fails to build s_fabsf128.c with: ../sysdeps/ieee754/float128/../ldbl-128/s_fabsl.c:32:1: error: attribute declaration must precede definition [-Werror,-Wignored-attributes] 32 \| libm_alias_ldouble (__fabs, fabs) \| ^ ../sysdeps/generic/libm-alias-ldouble.h:63:38: note: expanded from macro 'libm_alias_ldouble' 63 \| #define libm_alias_ldouble(from, to) libm_alias_ldouble_r (from, to, ) \| ^ ../sysdeps/ieee754/float128/float128_private.h:133:43: note: expanded from macro 'libm_alias_ldouble_r' 133 \| #define libm_alias_ldouble_r(from, to, r) libm_alias_float128_r (from, to, r) \| ^ ../sysdeps/ieee754/float128/s_fabsf128.c:5:3: note: expanded from macro 'libm_alias_float128_r' 5 \| static_weak_alias (from ## f128 ## r, to ## f128 ## r); \ \| ^ ./../include/libc-symbols.h:166:46: note: expanded from macro 'static_weak_alias' 166 \| # define static_weak_alias(name, aliasname) weak_alias (name, aliasname) \| ^ ./../include/libc-symbols.h:154:38: note: expanded from macro 'weak_alias' 154 \| # define weak_alias(name, aliasname) _weak_alias (name, aliasname) \| ^ ./../include/libc-symbols.h:156:52: note: expanded from macro '_weak_alias' 156 \| extern __typeof (name) aliasname __attribute__ ((weak, alias (#name))) \ \| ^ ../include/math.h:134:1: note: previous definition is here 134 \| fabsf128 (_Float128 x) If compiler does not support __USE_EXTERN_INLINES we need to route fabsf128 call to an internal symbol.	2025-11-17 11:17:07 -03:00
Adhemerval Zanella	53ad1eae0f	x86: Fix strstr ifunc on clang Work around the clang limitation wrt inline function and attribute definition, where it does not allow to 'add' new attribute if a function is already defined: Building with clang triggers multiple issues on how ifunc macros are used: ../sysdeps/x86_64/multiarch/strstr.c:38:54: error: attribute declaration must precede definition [-Werror,-Wignored-attributes] 38 \| extern __typeof (__redirect_strstr) __strstr_generic attribute_hidden; \| ^ ./../include/libc-symbols.h:356:43: note: expanded from macro 'attribute_hidden' 356 \| # define attribute_hidden __attribute__ ((visibility ("hidden"))) \| ^ ../string/strstr.c:76:1: note: previous definition is here 76 \| STRSTR (const char *haystack, const char *needle) \| ^ ../sysdeps/x86_64/multiarch/strstr.c:27:16: note: expanded from macro 'STRSTR' 27 \| #define STRSTR __strstr_generic \| ^ ../sysdeps/x86_64/multiarch/strstr.c:65:43: error: redefinition of '__libc_strstr' 65 \| libc_ifunc_redirected (__redirect_strstr, __libc_strstr, IFUNC_SELECTOR ()); \| ^ And ../sysdeps/x86_64/multiarch/strstr.c:65:43: error: redefinition of '__libc_strstr' 65 \| libc_ifunc_redirected (__redirect_strstr, __libc_strstr, IFUNC_SELECTOR ()); \| ^ ../sysdeps/x86_64/multiarch/strstr.c:59:13: note: previous definition is here 59 \| libc_ifunc (__libc_strstr, \| ^ Refactor to use an auxiliary function like other selection (for instance, x86_64/multiarch/strcmp.c).	2025-11-17 11:17:07 -03:00
Adhemerval Zanella	edd4dc7dc8	x86: Use -mavx instead of -msse2avx clang supports -msse2avx from version 19 and onwards, but it should be gated as an option to assembler (either with -Wa or -Xassembler). The -DSSE2AVX option was used because there were asm statements with SSE-only instructions which was fixed by commit `ff8be6152b`. Now we can simply use -mavx. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2025-11-17 11:17:07 -03:00
Adhemerval Zanella	13cfd77bf5	math: Don't redirect inlined builtin math functions When we want to inline builtin math functions, like truncf, for extern float truncf (float __x) __attribute__ ((__nothrow__ )) __attribute__ ((__const__)); extern float __truncf (float __x) __attribute__ ((__nothrow__ )) __attribute__ ((__const__)); float (truncf) (float) asm ("__truncf"); compiler may redirect truncf calls to __truncf, instead of inlining it (for instance, clang). The USE_TRUNCF_BUILTIN is 1 to indicate that truncf should be inlined. In this case, we don't want the truncf redirection: 1. For each math function which may be inlined, we define #if USE_TRUNCF_BUILTIN # define NO_truncf_BUILTIN inline_truncf #else # define NO_truncf_BUILTIN truncf #endif in <math-use-builtins.h>. 2. Include <math-use-builtins.h> in include/math.h. 3. Change MATH_REDIRECT to #define MATH_REDIRECT(FUNC, PREFIX, ARGS) \ float (NO_ ## FUNC ## f ## _BUILTIN) (ARGS (float)) \ asm (PREFIX #FUNC "f"); With this change If USE_TRUNCF_BUILTIN is 0, we get float (truncf) (float) asm ("__truncf"); truncf will be redirected to __truncf. And for USE_TRUNCF_BUILTIN 1, we get: float (inline_truncf) (float) asm ("__truncf"); In both cases either truncf will be inlined or the internal alias (__truncf) will be called. It is not required for all math-use-builtin symbol, only the one defined in math.h. It also allows to remove all the math-use-builtin inclusion, since it is now implicitly included by math.h. For MIPS, some math-use-builtin headers include sysdep.h and this in turn includes a lot of extra headers that do not allow ldbl-128 code to override alias definition (math.h will include some stdlib.h definition). The math-use-builtin only requires the __mips_isa_rev, so move the definition to sgidefs.h. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2025-11-17 11:17:07 -03:00
Florian Weimer	c6f151839b	Reference COPYING.LIB in <sframe.h> copyright header Commit `3360913c37` ("elf: Add SFrame stack tracing") added this file with an inconsistent copyright header. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2025-11-17 11:15:13 +01:00
Samuel Thibault	5b6ee0e0ba	htl: move pthread_create into libc This is notably needed for the main thread structure to be always initialized so that some pthread functions can work from the main thread without other threads, e.g. pthread_cancel.	2025-11-17 00:38:37 +00:00
Samuel Thibault	c7d699b55b	htl: Add missing include For IS_IN.	2025-11-16 11:53:46 +01:00
Samuel Thibault	a064213785	loongarch: Remove TLS_TCB_ALIGN This reverts a part of `9f18265a8e` ("Remove TLS_TCB_ALIGN and TLS_INIT_TCB_ALIGN"), as loongarch uses this macro internally.	2025-11-16 11:27:47 +01:00
Samuel Thibault	ce61fcf702	hurd: Fix restoring SSE state on signal mach_port_mod_refs() needs to avoid using SSE&MMX for __sigreturn2 to be able to use it without thrashing SSE&MMX.	2025-11-16 01:33:02 +01:00
Samuel Thibault	9f18265a8e	Remove TLS_TCB_ALIGN and TLS_INIT_TCB_ALIGN This is the rest of `627f5ede70` ("Remove TLS_TCB_ALIGN and TLS_INIT_TCB_ALIGN"), for loongarch and or1k which missed it.	2025-11-15 22:01:07 +01:00
Osama Abdelkader	4f18501498	math: Optimize frexpl (intel96) with fast path for normal numbers Add fast path optimization for frexpl (80-bit x87 extended precision) using a single unsigned comparison to identify normal floating-point numbers and return immediately via arithmetic on the exponent field. The implementation uses arithmetic operations (se - ex ) to adjust the exponent directly, which is simpler than bit masking. For subnormals, the traditional multiply-based normalization is retained as it handles the split word format more reliably. The zero/infinity/NaN check groups these special cases together for better branch prediction. Benchmark results on Intel Core i9-13900H (13th Gen): Baseline: 25.543 ns/op Optimized: 25.531 ns/op Speedup: 1.00x (neutral) Zero: 17.774 ns/op Denormal: 23.900 ns/op Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com> Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-11-14 19:52:38 +00:00
Adhemerval Zanella	7fec8a5de6	Revert __HAVE_64B_ATOMICS configure check The `53807741fb` added a configure check for 64-bit atomic operations that were not previously enabled on some 32-bit ABIs. However, the NPTL semaphore code casts a sem_t to a new_sem and issues a 64-bit atomic operation for __HAVE_64B_ATOMICS. Since sem_t has 32-bit alignment on 32-bit architectures, this prevents the use of 64-bit atomics even if the ABI supports them. Assume 64-bit atomic support from __WORDSIZE, which maps to how glibc defines it before the broken change. Also rename __HAVE_64B_ATOMICS to USE_64B_ATOMICS to define better the flag meaning. Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-11-14 14:05:20 -03:00
Carlos O'Donell	5bdf3c9092	x86: Increase allowable TSX abort rate to 6%. In pre-commit CI on an E5-2698 v4 we sometimes see ~5% aborts. Set the trip point to 6%. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2025-11-14 08:18:36 -05:00
Samuel Thibault	91fb9914d8	htl: Remove errno and herrno from libpthread libc already has them.	2025-11-13 23:45:42 +01:00
Samuel Thibault	23b8e6ae4f	htl: Drop pthread-functions infrastructure All previously forwarded functions are now called directly (either via local call in libc, or through a __export).	2025-11-13 23:23:13 +01:00
Samuel Thibault	f6a60e9867	htl: move {,_IO_}f{,un,try}lockfile implementation into libc	2025-11-13 23:01:07 +01:00
Adhemerval Zanella	c6908c4e24	linux: Add mseal to mips32 nofpu abilist It was missing from `3d52fd274e`.	2025-11-13 15:32:26 -03:00
Florian Weimer	2254e871f4	hppa: Consistently reference LGPL in copyright header The file was added with a GPL reference (but LGPL statement) in commit `0d6bed7150` ("hppa: Add ____longjmp_check C implementation."). Reviewed-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Collin Funk <collin.funk1@gmail.com>	2025-11-13 09:46:15 +01:00
Joseph Myers	1f79bc4838	Change fromfp functions to return floating types following C23 (bug 28327) As discussed in bug 28327, C23 changed the fromfp functions to return floating types instead of intmax_t / uintmax_t. (Although the motivation in N2548 was reducing the use of intmax_t in library interfaces, the new version does have the advantage of being able to specify arbitrary integer widths for e.g. assigning the result to a _BitInt, as well as being able to indicate an error case in-band with a NaN return.) As with other such changes from interfaces introduced in TS 18661, implement the new types as a replacement for the old ones, with the old functions remaining as compat symbols but not supported as an API. The test generator used for many of the tests is updated to handle both versions of the functions. Tested for x86_64 and x86, and with build-many-glibcs.py. Also tested tgmath tests for x86_64 with GCC 7 to make sure that the modified case for older compilers in <tgmath.h> does work. Also tested for powerpc64le to cover the ldbl-128ibm implementation and the other things that are handled differently for that configuration. The new tests fail for ibm128, but all the failures relate to incorrect signs of zero results and turn out to arise from bugs in the underlying roundl, ceill, truncl and floorl implementations that I've reported in bug 33623, rather than indicating any bug in the actual new implementation of the functions for that format. So given fixes for those functions (which shouldn't be hard, and of course should add to the tests for those functions rather than relying only on indirect testing via fromfp), the fromfp tests should start passing for ibm128 as well.	2025-11-13 00:04:21 +00:00
Wilco Dijkstra	989e538224	math: Remove float_t and double_t [BZ #33563 ] Remove uses of float_t and double_t. This is not useful on modern machines, and does not help given GCC defaults to -fexcess-precision=fast. One use of double_t remains to allow forcing the precision to double on targets where FLT_EVAL_METHOD=2. This fixes BZ #33563 on i486-pc-linux-gnu. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2025-11-12 19:33:23 +00:00
Wilco Dijkstra	3b7bb7b2f2	math: Remove ldbl-128/s_fma.c Remove ldbl-128/s_fma.c - it makes no sense to use emulated float128 operations to emulate FMA. Benchmarking shows dbl-64/s_fma.c is about twice as fast. Remove redundant dbl-64/s_fma.c includes in targets that were trying to work around this issue. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2025-11-12 18:57:29 +00:00
Adhemerval Zanella	3d52fd274e	linux: Add mseal syscall support It has been added on Linux 6.10 (8be7258aad44b5e25977a98db136f677fa6f4370) as a way to block operations such as mapping, moving to another location, shrinking the size, expanding the size, or modifying it to a pre-existing memory mapping. Although the system only works on 64-bit CPUs, the entrypoint was added for all ABIs (since the kernel might eventually implement it for additional ones and/or the ABI can execute on a 64-bit kernel). Checked on x86_64-linux-gnu and aarch64-linux-gnu. Reviewed-by: Collin Funk <collin.funk1@gmail.com>	2025-11-12 15:27:28 -03:00
Yury Khrustalev	a9c426bcca	aarch64: fix includes in SME tests Use the correct include for the SIGCHLD macro: signal.h Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-11-12 13:45:52 +00:00
Xi Ruoyao	2f5e68dea9	LoongArch: Call elf_ifunc_invoke for R_LARCH_IRELATIVE in elf_machine_rela When R_LARCH_IRELATIVE is resolved by apply_irel, the ifunc resolver is called via elf_ifunc_invoke so it can read HWCAP from the __ifunc_arg_t argument. But when R_LARCH_IRELATIVE is resolved by elf_machine_rela (it will happen if we dlopen() a shared object containing R_LARCH_IRELATIVE), the ifunc resolver is invoked directly with no or different argument. This causes a segfault if the resolver uses the __ifunc_arg_t. Although the LoongArch psABI does not specify this argument, it is more convenient to have it IMO, and per Hyrum's rule there may be objects in the wild which already rely on this argument (they just didn't blow up because they are not dlopen()ed yet). So make the behavior handling R_LARCH_IRELATIVE of elf_machine_rela same as apply_irel. This fixes BZ #33610. Signed-off-by: Xi Ruoyao <xry111@xry111.site>	2025-11-12 09:12:48 +08:00
Samuel Thibault	f851a74346	hurd: Drop remnants of cthreads These have not been used in GNU/Hurd for a very long time now.	2025-11-12 01:11:11 +01:00
H.J. Lu	71d9f47b5a	x86-64: Fix a typo in fesetenv.c [BZ #33619 ] Fix a typo in commit `427c25278d` Author: Adhemerval Zanella <adhemerval.zanella@linaro.org> Date: Fri Oct 31 17:00:46 2025 -0300 x86: Adapt "%v" usage on clang to emit VEX enconding @@ -103,8 +104,8 @@ __fesetenv (const fenv_t *envp) temp.__mxcsr = envp->__mxcsr; } - __asm__ ("fldenv %0\n" - "%vldmxcsr %1" : : "m" (temp), "m" (temp.__mxcsr)); + asm volatile ("fldenv %0" : "=m" (temp)); + ldmxcsr_inline_asm (&temp.__mxcsr); /* Success. */ return 0; "temp" is input not output. This fixes BZ #33619. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Collin Funk <collin.funk1@gmail.com>	2025-11-11 17:06:34 +08:00
Xie jiamei	1707b23382	Set Prefer_No_AVX512 flag for hygon platform Benchmarks indicate evex can be more profitable on Hygon hardware than AVX512. So add Prefer_No_AVX512 to make it run with evex. Change-Id: Icc59492f71fde7a783a8bd315714ffd6f7ecaf29 Signed-off-by: Li jing <lijing@hygon.cn> Signed-off-by: Xie jiamei <xiejiamei@hygon.cn>	2025-11-11 10:47:26 +08:00
Osama Abdelkader	e52d9542cd	math: Optimize frexpl (binary128) with fast path for normal numbers Add fast path optimization for frexpl (128-bit IEEE quad precision) using a single unsigned comparison to identify normal floating-point numbers and return immediately via arithmetic on the exponent field. The implementation uses arithmetic operations hx = hx - (ex << 48) to adjust the exponent in place, which is simpler and more efficient than bit masking. For subnormals, the traditional multiply-based normalization is retained for reliability with the split 64-bit word format. The zero/infinity/NaN check groups these special cases together for better branch prediction. This optimization provides the same algorithmic improvements as the other frexp variants while maintaining correctness for all edge cases. Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com> Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-11-10 08:58:19 -03:00
Osama Abdelkader	e05476b5c8	math: Optimize frexp (binary64) with fast path for normal numbers Add fast path optimization for frexp using a single unsigned comparison to identify normal floating-point numbers and return immediately via arithmetic on the bit representation. The implementation uses asuint64()/asdouble() from math_config.h and arithmetic operations to adjust the exponent, which generates better code than bit masking on ARM and RISC-V architectures. For subnormals, stdc_leading_zeros provides faster normalization than the traditional multiply approach. The zero/infinity/NaN check is simplified to (int64_t)(ix << 1) <= 0, which is more efficient than separate comparisons. Benchmark results on Intel Core i9-13900H (13th Gen): Baseline: 6.778 ns/op Optimized: 4.007 ns/op Speedup: 1.69x (40.9% faster) Zero: 3.580 ns/op (fast path) Denormal: 6.096 ns/op (slower, rare case) Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2025-11-10 08:58:18 -03:00
Osama Abdelkader	4d2582150e	math: Optimize frexpf (binary32) with fast path for normal numbers Add fast path optimization for frexpf using a single unsigned comparison to identify normal floating-point numbers and return immediately via arithmetic on the bit representation. The implementation uses asuint()/asfloat() from math_config.h and arithmetic operations to adjust the exponent, which generates better code than bit masking on ARM and RISC-V architectures. For subnormals, stdc_leading_zeros provides faster normalization than the traditional multiply approach. The zero/infinity/NaN check is simplified to (int32_t)(hx << 1) <= 0, which is more efficient than separate comparisons. Benchmark results on Intel Core i9-13900H (13th Gen): Baseline: 5.858 ns/op Optimized: 4.003 ns/op Speedup: 1.46x (31.7% faster) Zero: 3.580 ns/op (fast path) Denormal: 5.597 ns/op (slower, rare case) Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2025-11-10 08:58:18 -03:00
Adhemerval Zanella	b983c854e6	math: Sync acosh from CORE-MATH The c9abdf80 fix handles some cases for RNDZ. Checked on x86_64-linux-gnu.	2025-11-10 08:58:14 -03:00
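
The frexp (binary64) fast path described in commit e05476b5c8 can be sketched roughly as follows. This is a minimal illustration written from the commit message alone, not the glibc source: the function name `sketch_frexp` is hypothetical, the `asuint64`/`asdouble` helpers are local stand-ins for the ones in glibc's internal math_config.h, and `__builtin_clzll` (a GCC/Clang builtin) is assumed in place of `stdc_leading_zeros`. Unlike a production implementation, it returns NaN inputs unchanged.

```c
#include <stdint.h>
#include <string.h>

/* Local bit-cast helpers (stand-ins for glibc's internal ones).  */
static inline uint64_t
asuint64 (double x)
{
  uint64_t u;
  memcpy (&u, &x, sizeof u);
  return u;
}

static inline double
asdouble (uint64_t u)
{
  double x;
  memcpy (&x, &u, sizeof x);
  return x;
}

/* Hypothetical sketch: split X into a fraction in [0.5, 1) and a
   power of two stored in *EPTR.  */
double
sketch_frexp (double x, int *eptr)
{
  uint64_t ix = asuint64 (x);
  uint32_t ex = (ix >> 52) & 0x7ff;	/* Biased exponent.  */

  /* Fast path: one unsigned comparison accepts exactly the normal
     numbers, i.e. biased exponents 1 .. 0x7fe.  */
  if (ex - 1 < 0x7fe)
    {
      *eptr = (int) ex - 1022;
      /* Rewrite the exponent field to 1022 by arithmetic; the sign
	 and mantissa bits are left untouched.  */
      return asdouble (ix - ((uint64_t) ex << 52)
		       + (UINT64_C (1022) << 52));
    }

  /* Zero, infinity and NaN: with the sign bit shifted out, zero gives
     0 and an all-ones exponent gives a negative signed value.  */
  if ((int64_t) (ix << 1) <= 0)
    {
      *eptr = 0;
      return x;
    }

  /* Subnormal: count leading zeros to move the top mantissa bit to
     bit 52, then place the result in [0.5, 1).  */
  uint64_t mant = ix & UINT64_C (0x000fffffffffffff);
  int shift = __builtin_clzll (mant) - 11;
  mant <<= shift;
  *eptr = -1021 - shift;
  return asdouble ((ix & UINT64_C (0x8000000000000000))
		   | (UINT64_C (1022) << 52)
		   | (mant & UINT64_C (0x000fffffffffffff)));
}
```

The zero/infinity/NaN test is only valid after the normal-number check, since large normal values also become negative once shifted left by one.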


17277 Commits