glibc

Commit Graph

Author	SHA1	Message	Date
Carlos O'Donell	15808c77b3	ppc64le: Revert "powerpc: Optimized strcmp for power10" (CVE-2025-5702) This reverts commit `3367d8e180` Reason for revert: Power10 strcmp clobbers non-volatile vector registers (Bug 33056) Tested on ppc64le without regression.	2025-06-16 18:02:58 -04:00
Carlos O'Donell	a7877bb668	ppc64le: Revert "powerpc : Add optimized memchr for POWER10" (Bug 33059) This reverts commit `b9182c793c` Reason for revert: Power10 memchr clobbers v20 vector register (Bug 33059) This is not a security issue, unlike CVE-2025-5745 and CVE-2025-5702. Tested on ppc64le without regression.	2025-06-16 18:02:58 -04:00
Carlos O'Donell	c22de63588	ppc64le: Revert "powerpc: Fix performance issues of strcmp power10" (CVE-2025-5702) This reverts commit `90bcc8721e` This change is in the chain of the final revert that fixes the CVE i.e. `3367d8e180` Reason for revert: Power10 strcmp clobbers non-volatile vector registers (Bug 33056) Tested on ppc64le with no regressions.	2025-06-16 18:02:58 -04:00
Carlos O'Donell	63c60101ce	ppc64le: Revert "powerpc: Optimized strncmp for power10" (CVE-2025-5745) This reverts commit `23f0d81608` Reason for revert: Power10 strncmp clobbers non-volatile vector registers (Bug 33060) Tested on ppc64le with no regressions.	2025-06-16 18:02:58 -04:00
Cupertino Miranda	cde5caa4bb	malloc: add testing for large tcache support This patch adds large tcache support tests by re-executing malloc tests using the tunable: glibc.malloc.tcache_max=1048576 Test names are postfixed with "largetcache". Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-06-16 12:54:32 +00:00
Cupertino Miranda	cbfd798810	malloc: add tcache support for large chunk caching The existing tcache implementation in glibc focuses on caching small allocations, limiting the cached allocation size to 1KB. This patch changes the tcache implementation to allow caching chunks of any size. The implementation adds extra bins (linked lists) which store chunks with different ranges of allocation sizes. Bin selection is done in powers of 2, and chunks are inserted in increasing size order within the bin. The last bin contains all other allocation sizes. By default this patch preserves the existing behavior, limiting caches to 1KB chunks, but it now allows the maximum size of cached chunks to be raised with the tunable glibc.malloc.tcache_max. It also now checks whether a chunk was mmapped, in which case __libc_free will not add it to tcache. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-06-16 12:05:22 +00:00
H.J. Lu	5b7c8d1cd4	Always check lockf64 return value On x86-64, when GCC 14.2.1 is used to build: commit `f3c82fc1b4` Author: Radko Krkos <krkos@mail.muni.cz> Date: Sat Jun 14 11:07:40 2025 +0200 io: Mark lockf() __wur [BZ #32800] In commit `0476597b28` flock() was marked __wur in posix/unistd.h, but not in io/fcntl.h, the declarations must match. Reviewed-by: Florian Weimer <fweimer@redhat.com> I got programs/locarchive.c: In function ‘open_archive’: programs/locarchive.c:641:18: error: ignoring return value of ‘lockf64’ declared with attribute ‘warn_unused_result’ [-Werror=unused-result] 641 \| (void) lockf64 (fd, F_ULOCK, sizeof (struct locarhead)); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ programs/locarchive.c:653:14: error: ignoring return value of ‘lockf64’ declared with attribute ‘warn_unused_result’ [-Werror=unused-result] 653 \| (void) lockf64 (fd, F_ULOCK, sizeof (struct locarhead)); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ programs/locarchive.c:660:14: error: ignoring return value of ‘lockf64’ declared with attribute ‘warn_unused_result’ [-Werror=unused-result] 660 \| (void) lockf64 (fd, F_ULOCK, sizeof (struct locarhead)); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ programs/locarchive.c:679:14: error: ignoring return value of ‘lockf64’ declared with attribute ‘warn_unused_result’ [-Werror=unused-result] 679 \| (void) lockf64 (fd, F_ULOCK, sizeof (struct locarhead)); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Update locarchive.c to always check lockf64 return value. This fixes BZ #33089. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Florian Weimer <fweimer@redhat.com>	2025-06-16 14:48:45 +08:00
H.J. Lu	81467d4b61	elf: Add optimization barrier for __ehdr_start and _end rtld.c has extern const ElfW(Ehdr) __ehdr_start attribute_hidden; ... _dl_rtld_map.l_map_start = (ElfW(Addr)) &__ehdr_start; _dl_rtld_map.l_map_end = (ElfW(Addr)) _end; As https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120653 shows, compiler may generate run-time relocation on __ehdr_start with movq .LC0(%rip), %xmm0 ... .section .data.rel.ro.local,"aw" .align 8 .LC0: .quad __ehdr_start This won't work before run-time relocation is finished in rtld.c. Add optimization barrier to prevent run-time relocations against __ehdr_start and _end. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Sam James <sam@gentoo.org>	2025-06-16 08:43:40 +08:00
gfleury	27360ab9ea	htl: move pthread_key_*, pthread_get/setspecific Signed-off-by: gfleury <gfleury@disroot.org> Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Message-ID: <20250613184440.1660335-1-gfleury@disroot.org>	2025-06-15 21:21:12 +02:00
H.J. Lu	90cf97bb9d	elf: Remove the unused _etext declaration Since commit `53df2ce688` Author: Florian Weimer <fweimer@redhat.com> Date: Fri Sep 8 13:02:06 2023 +0200 elf: Remove unused l_text_end field from struct link_map removed the only reference to _etext, also remove the unused _etext declaration. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>	2025-06-15 12:42:24 +08:00
Radko Krkos	f3c82fc1b4	io: Mark lockf() __wur [BZ #32800] In commit `0476597b28` flock() was marked __wur in posix/unistd.h, but not in io/fcntl.h, the declarations must match. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2025-06-14 11:57:46 +02:00
Adhemerval Zanella	1d828b9ddc	benchtests: Improve modf benchtest It adds four ranges, which is how the generic implementation handles normal numbers: 1. Random inputs in the range [0.0, 1.0]; 2. Random inputs in the range [1.0, (double)(UINT64_C(1) << 52))]; 3. Random inputs in the range [(double)(UINT64_C(1) << 52), DBL_MAX]; 4. Random integral inputs in the range [0.0, (double)(UINT64_C(1) << 52)]. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-06-13 11:30:12 -03:00
Adhemerval Zanella	619fd4e37b	benchtests: Add modff benchtest It adds four ranges, which is how the generic implementation handles normal numbers: 1. Random inputs in the range [0.0, 1.0]; 2. Random inputs in the range [1.0, (float)(1U << 23)]; 3. Random inputs in the range [(float)(1U << 23), FLT_MAX]; 4. Random integral inputs in the range [0.0, (float)(1U << 23)]. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-06-13 11:29:39 -03:00
Mark Harris	8af8beb1c4	riscv: Correct __riscv_hwprobe function prototype [BZ #32932 ] The third argument to __riscv_hwprobe is the size in bytes of the cpu bitmask pointed to by the fourth argument, however in the access attribute (read_only, 4, 3) it is used as an element count (i.e., the number of unsigned longs that make up the bitmask), resulting in a false compiler warning: $ gcc -c hwprobe1.c hwprobe1.c: In function 'main': hwprobe1.c:15:11: warning: '__riscv_hwprobe' reading 1024 bytes from a region of size 128 [-Wstringop-overread] 15 \| ret = __riscv_hwprobe (pairs, 1, sizeof(cpus), cpus, 0); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ hwprobe1.c:9:23: note: source object 'cpus' of size 128 9 \| unsigned long int cpus[16]; \| ^~~~ In file included from hwprobe1.c:1: /usr/include/riscv64-linux-gnu/sys/hwprobe.h:66:12: note: in a call to function '__riscv_hwprobe' declared with attribute 'access (read_only, 4, 3)' 66 \| extern int __riscv_hwprobe (struct riscv_hwprobe __pairs, size_t __pair_count, \| ^~~~~~~~~~~~~~~ $ The documentation (https://docs.kernel.org/arch/riscv/hwprobe.html) claims that the cpu bitmask has the type cpu_set_t , which would be consistent with other functions that take a cpu bitmask such as sched_setaffinity and sched_getaffinity. It also uses the name cpusetsize for the third argument, which is much more accurate than cpu_count since it is a size in bytes and not a cpu count. The (read_only, 4, 3) access attribute in the glibc prototype claims that the cpu bitmask is only read, however when flags is RISCV_HWPROBE_WHICH_CPUS it is both read and written. Therefore, in the glibc prototype the type of the fourth argument is changed to cpu_set_t * to match the documentation, the name of the third argument is changed to cpusetsize as in the documentation, and the incorrect access attribute that applies to these arguments is removed. 
Almost all existing callers pass a null pointer for the fourth argument, however a transparent union is introduced for compatibility with callers that cast a pointer to the old argument type, and a macro is introduced allowing callers the ability to distinguish between the old and new prototype when needed. The access attributes are being specified with __fortified_attr_access, however this macro is for fortified functions; the regular __attr_access macro is for non-fortified functions such as this one. Using the incorrect macro results in no access checks at fortify level 3, because it is assumed that the fortified function will be doing the checking. It is changed to use the correct macro so that the access checks will work regardless of fortify level. Also because __riscv_hwprobe is not a cancellation point, __THROW is added, consistent with similar functions. (However, it is omitted from the typedef because GCC does not accept it there.) The __wur (warn_unused_result) attribute is helpful for functions that cannot be used safely without checking the result, however code such as the following does not require the result to be checked and should not produce a warning: struct riscv_hwprobe pair = { RISCV_HWPROBE_KEY_IMA_EXT_0, 0 }; __riscv_hwprobe (&pair, 1, 0, NULL, 0); if (pair.value & RISCV_HWPROBE_EXT_ZBB) ... Therefore this attribute is omitted. The comment claiming that the second argument to the ifunc selector is a pointer to the vDSO function is corrected. It is a pointer to the regular glibc function (which returns errors as positive values), not the vDSO function (which returns errors as negative values). Fixes commit `426d0e1aa8` ("riscv: Add Linux hwprobe syscall support"). Fixes: BZ #32932 Signed-off-by: Mark Harris <mark.hsj@gmail.com> Signed-off-by: Mark Harris <mark.hsj@gmail.com> Reviewed-by: Palmer Dabbelt <palmer@dabbelt.com> Acked-by: Palmer Dabbelt <palmer@dabbelt.com>	2025-06-13 11:25:12 -03:00
Sergey Kolosov	daab2a6d19	resolv: Add test for getaddrinfo returning FQDN in ai_canonname Test for BZ #15218. This test verifies that getaddrinfo returns a fully-qualified domain name in the ai_canonname field when AI_CANONNAME is set and search domains apply. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2025-06-10 15:10:31 +02:00
Yury Khrustalev	b15ed85c86	aarch64: fix typo in sysdeps/aarch64/Makefile	2025-06-10 10:48:07 +01:00
Siddhesh Poyarekar	f8f73249d9	Advisory text for CVE-2025-5745 The fix is not available yet, so this only records the first vulnerable commit. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2025-06-09 13:07:26 -04:00
Siddhesh Poyarekar	62cb3ee57d	Advisory text for CVE-2025-5702 The fix is not available yet, so this only records the first vulnerable commit. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>	2025-06-09 13:07:26 -04:00
Samuel Thibault	5fdc693d95	hurd: Make __getrandom_early_init call __mach_init `25d37948c9` ("malloc: Improve malloc initialization") moved calling malloc initialization earlier, within _dl_sysdep_start's call to dl_main, before __mach_init is called by _dl_init_first. But malloc initialization uses getrandom, which needs to make RPCs. This adds __getrandom_early_init on hurd to express that getrandom needs __mach_init too. This also adds a guard to avoid making it create several task and host ports. Fixes: `25d37948c9` ("malloc: Improve malloc initialization") Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>	2025-06-09 08:34:06 +00:00
H.J. Lu	0a027674a1	x86: Avoid GLRO(dl_x86_cpu_features) In init_cpu_features, replace GLRO(dl_x86_cpu_features) with cpu_features to avoid an extra load. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Florian Weimer <fweimer@redhat.com>	2025-06-09 13:03:13 +08:00
Maciej W. Rozycki	62fba6d980	manual: Add a comparative example of 'clock_nanosleep' use Add an illustrative example of how to express 'nanosleep' in terms of 'clock_nanosleep'.	2025-06-06 18:14:34 +01:00
Wilco Dijkstra	09795c5612	AArch64: Fix build error with GCC 12.1/12.2 Early versions of GCC 12 didn't support -mtune=neoverse-v2, so use -mtune=neoverse-v1 instead. Reported-by: Yury Khrustalev <yury.khrustalev@arm.com>	2025-06-06 13:22:27 +00:00
Maciej W. Rozycki	7a751ce39c	Linux: Drop obsolete kernel support with `if_nameindex' and `if_nametoindex' Support for the SIOCGIFINDEX ioctl(2) Linux ABI (0x8933 command, called SIOGIFINDEX in the API originally) was added with kernel version 2.1.14 for AF_INET6 sockets, followed by general support with version 2.1.22. The Linux API was then updated by adding the current SIOCGIFINDEX name with kernel version 2.1.68, back in Nov 1997. All these kernel versions are well below our current default required minimum of 3.2.0, let alone some platform higher version requirements. Drop support for the absence of the SIOCGIFINDEX ioctl(2) in the API or ABI, by removing arrangements for the ENOSYS error condition. Discard the indirection from '__if_nameindex' to 'if_nameindex_netlink' and adjust the implementation of '__if_nametoindex' accordingly for a better code flow.	2025-06-05 19:04:46 +01:00
Yury Khrustalev	fcd6a8b5c5	aarch64: add __ifunc_hwcap function to be used in ifunc resolvers Add a new helper function __ifunc_hwcap() as a portable way to access HWCAP elements via the parameter(s) passed to an ifunc resolver checking the _IFUNC_ARG_HWCAP bit in the first parameter and size of the buffer in the second parameter. Note that 0 is returned when the requested element is not available or does not correspond to a valid AT_HWCAP{,2,...} value. Also add relevant tests. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2025-06-05 14:38:51 +01:00
Yury Khrustalev	ea14d04e9a	aarch64: add support for hwcap3,4 Add basic support for hwcap3 and hwcap4 in dynamic loader and ifunc resolvers. Describe new backward-compatible prototype for GNU indirect function resolvers that use a pointer to uint64_t array instead of a pointer to the __ifunc_arg_t struct. This patch also adds macro _IFUNC_HWCAP_MAX to specify current number of hwcap elements. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2025-06-05 14:38:03 +01:00
Arjun Shankar	25f1d94576	manual: Document futimens and utimensat Document futimens and utimensat. Also document the EINVAL error condition for futimes. It is inherited by futimens and utimensat as well. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2025-06-04 20:17:04 +02:00
Arjun Shankar	75b725717f	manual: Document unlinkat Reviewed-by: Florian Weimer <fweimer@redhat.com>	2025-06-04 20:17:04 +02:00
Arjun Shankar	60f86c9cd0	manual: Document renameat Reviewed-by: Florian Weimer <fweimer@redhat.com>	2025-06-04 20:17:04 +02:00
Arjun Shankar	49766eb1a5	manual: Document mkdirat Reviewed-by: Florian Weimer <fweimer@redhat.com>	2025-06-04 20:17:04 +02:00
Arjun Shankar	941157dbcd	manual: Document faccessat Reviewed-by: Florian Weimer <fweimer@redhat.com>	2025-06-04 20:17:04 +02:00
Arjun Shankar	3b21166c4d	manual: Expand Descriptor-Relative Access section Improve the clarity of the paragraphs describing common flags and add a list of common error conditions for descriptor-relative functions. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2025-06-04 20:17:04 +02:00
Florian Weimer	2fca4b624b	Makefile: Avoid $(objpfx)/ in makefiles If paths with both $(objpfx)/ and $(objpfx) (which already includes a trailing slash) appear during the build, this can trigger unexpected rebuilds, or incorrect concurrent rebuilds.	2025-06-04 17:44:19 +02:00
Maciej W. Rozycki	140b20e971	manual: Document error codes missing for 'inet_pton' Add documentation for EAFNOSUPPORT error code returned, and the possible return values on non-success. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2025-06-04 16:27:20 +01:00
Maciej W. Rozycki	5a9020eeb2	manual: Document error codes missing for 'if_nametoindex' Add documentation for ENODEV error code returned and refer to 'socket' for further possible codes from the underlying function call. While changing the text clarify the description by mentioning 'ifname'. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2025-06-04 16:27:20 +01:00
Maciej W. Rozycki	46acdf46cc	manual: Document error codes missing for 'if_indextoname' Add documentation for ENXIO error code returned and refer to 'socket' for further possible codes from the underlying function call. While changing the text clarify the description by mentioning 'ifname' and replace @code tags with @var ones where referring to a function parameter. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2025-06-04 16:27:20 +01:00
Cœur	e885fd43db	posix: fix building regex when _LIBC isn't defined Reviewed-by: Florian Weimer <fweimer@redhat.com>	2025-06-04 13:55:23 +02:00
Collin Funk	5b45674869	localedata: Use the name North Macedonia. The name "the former Yugoslav Republic of Macedonia" is no longer in use since the signing of the Prespa Agreement [1][2]. This resolved the country's naming dispute with Greece and changed the name to "North Macedonia". The name field of this locale/iso-3166.def is not used, so this does not affect binaries. [1] https://en.wikipedia.org/wiki/Prespa_Agreement [2] https://treaties.un.org/Pages/showDetails.aspx?objid=0800000280544ac1 Signed-off-by: Collin Funk <collin.funk1@gmail.com> Reviewed-by: Florian Weimer <fweimer@redhat.com>	2025-06-04 12:01:55 +02:00
Wilco Dijkstra	7e10e30e64	malloc: Count tcache entries downwards Currently tcache requires 2 global variable accesses to determine whether a block can be added to the tcache. Change the counts array to 'num_slots' to indicate the number of entries that could be added. If 'num_slots' reaches zero, no more blocks can be added. If the entries pointer is not NULL, at least one block is available for allocation. Now each tcache bin can support a different maximum number of entries, and they can be individually switched on or off (a zero initialized num_slots+entry means the tcache bin is not available for free or malloc). Reviewed-by: DJ Delorie <dj@redhat.com>	2025-06-03 17:16:39 +00:00
Adhemerval Zanella	404526ee2e	sparc: Fix argument passing to __libc_start_main (BZ 32981) sparc start.S does not provide the final argument for __libc_start_main, which is the highest stack address used to update the __libc_stack_end. This fixes elf/tst-execstack-prog-static-tunable on sparc64. On sparcv9 this does not happen because the kernel puts an auxv value, which turns out to point to a value in the stack itself. Checked on sparc64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2025-06-03 09:11:46 -03:00
Collin Funk	d475e5bf4f	localedata: Refer to Eswatini instead of Swaziland. The name was changed in 2018 [1]. The name is not used in locale/programs/ld-address.c so this does not change any binaries or data. [1] https://www.un.org/en/about-us/member-states/eswatini Signed-off-by: Collin Funk <collin.funk1@gmail.com> Reviewed-by: Florian Weimer <fweimer@redhat.com>	2025-06-03 10:53:12 +02:00
наб	6945ce4a6f	sigaction: don't sign-extend sa_flags Before: rt_sigaction(SIGBUS, {sa_handler=0x55abb9960139, sa_mask=[], sa_flags=SA_RESTORER\|SA_RESETHAND\|SA_SIGINFO\|0xffffffff00000000, sa_restorer=0x7fb1b2a82050}, NULL, 8) = 0 After: rt_sigaction(SIGBUS, {sa_handler=0x7f6a70dce139, sa_mask=[], sa_flags=SA_RESTORER\|SA_RESETHAND\|SA_SIGINFO, sa_restorer=0x7f6a70c28f60}, NULL, 8) = 0 Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Reviewed-by: Florian Weimer <fweimer@redhat.com>	2025-06-03 10:53:12 +02:00
Collin Funk	b2970d5e5b	stdio-common: Add nonnull attribute to stdio_ext.h functions. * stdio-common/stdio_ext.h (__fbufsize, __freading, __fwriting) (__freadable, __fwritable, __flbf, __fpurge, __fpending, __fsetlocking): Add __nonnull ((1)) to these functions since they access the FP without checking if it is NULL. Signed-off-by: Collin Funk <collin.funk1@gmail.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2025-06-02 13:32:19 -03:00
Adhemerval Zanella	e529bfe8de	elf: Fix UB on _dl_map_object_from_fd On 32-bit architecture ubsan triggers: UBSAN: Undefined behaviour in dl-load.c:1345:54 pointer index expression with base 0x00612508 overflowed to 0xf7c3a508 Use explicit uintptr_t operation instead. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2025-06-02 13:32:19 -03:00
Adhemerval Zanella	1642570563	argp: Fix shift bug From gnulib commits 06094e390b0 and 88033d3779362a. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2025-06-02 13:32:19 -03:00
Adhemerval Zanella	7c00a20397	math: Remove i386 ilogb/ilogbf/llogb/llogbf The new float and double implementation does not require an extra function call and error handling uses math_err function, which results in better performance on i386 as well. With gcc-14 on AMD Ryzen 9 5900X, master shows: $ ./benchtests/bench-ilogb "ilogb": { "subnormal": { "duration": 3.68863e+09, "iterations": 1.72228e+08, "max": 89.2995, "min": 21.016, "mean": 21.4171 }, "normal": { "duration": 3.68878e+09, "iterations": 1.72948e+08, "max": 78.6065, "min": 21.127, "mean": 21.3288 } } $ ./benchtests/bench-ilogbf "ilogbf": { "subnormal": { "duration": 3.68835e+09, "iterations": 1.66716e+08, "max": 46.953, "min": 21.793, "mean": 22.1236 }, "normal": { "duration": 3.68784e+09, "iterations": 1.66168e+08, "max": 46.9715, "min": 21.904, "mean": 22.1935 } } While with this patch: $ ./benchtests/bench-ilogb "ilogb": { "subnormal": { "duration": 3.68134e+09, "iterations": 4.17516e+08, "max": 32.5045, "min": 8.3245, "mean": 8.81723 }, "normal": { "duration": 3.6677e+09, "iterations": 6.79468e+08, "max": 50.9305, "min": 5.3465, "mean": 5.3979 } } $ ./benchtests/bench-ilogbf "ilogbf": { "subnormal": { "duration": 3.67553e+09, "iterations": 5.11032e+08, "max": 35.927, "min": 7.0485, "mean": 7.19237 }, "normal": { "duration": 3.66877e+09, "iterations": 6.556e+08, "max": 26.3625, "min": 5.5315, "mean": 5.59605 } } Checked on i686-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-06-02 13:32:19 -03:00
Adhemerval Zanella	39775f00b1	math: Optimize float ilogb/llogb It removes the wrapper by moving the error/EDOM handling to an out-of-line implementation (__math_invalidf_i/__math_invalidf_li). Also, __glibc_unlikely is used on the error cases since it helps code generation on recent gcc. With gcc-14 on aarch64 the code now builds to: 0000000000000000 <__ilogbf>: 0: 1e260000 fmov w0, s0 4: d3577801 ubfx x1, x0, #23, #8 8: 340000e1 cbz w1, 24 <__ilogbf+0x24> c: 5101fc20 sub w0, w1, #0x7f 10: 7103fc3f cmp w1, #0xff 14: 54000040 b.eq 1c <__ilogbf+0x1c> // b.none 18: d65f03c0 ret 1c: 12b00000 mov w0, #0x7fffffff // #2147483647 20: 14000000 b 0 <__math_invalidf_i> 24: 53175800 lsl w0, w0, #9 28: 340000a0 cbz w0, 3c <__ilogbf+0x3c> 2c: 5ac01000 clz w0, w0 30: 12800fc1 mov w1, #0xffffff81 // #-127 34: 4b000020 sub w0, w1, w0 38: d65f03c0 ret 3c: 320107e0 mov w0, #0x80000001 // #-2147483647 40: 14000000 b 0 <__math_invalidf_i> Some ABIs require additional adjustments: * i386 and m68k need to use the template version, since both provide __ieee754_ilogb implementations. * loongarch uses a custom implementation as well. * powerpc64le also has a custom implementation for POWER9, which is also used for the float and float128 versions. The generic e_ilogb.c implementation is moved on powerpc to keep the current code as-is. Checked on aarch64-linux-gnu and x86_64-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-06-02 13:32:19 -03:00
Adhemerval Zanella	afe09d44f3	math: Remove UB and optimize double ilogbf The subnormal exponent calculation invokes UB by left shifting the signed exponent to find the first leading bit. The patch reimplements ilogb using the math_config.h macros and uses the new stdbit.h function to simplify the subnormal handling. On aarch64 it generates better code: * master: 0000000000000000 <__ieee754_ilogbf>: 0: 1e260000 fmov w0, s0 4: 12007801 and w1, w0, #0x7fffffff 8: 72091c1f tst w0, #0x7f800000 c: 54000141 b.ne 34 <__ieee754_ilogbf+0x34> // b.any 10: 34000201 cbz w1, 50 <__ieee754_ilogbf+0x50> 14: 53185c21 lsl w1, w1, #8 18: 12800fa0 mov w0, #0xffffff82 // #-126 1c: d503201f nop 20: 531f7821 lsl w1, w1, #1 24: 51000400 sub w0, w0, #0x1 28: 7100003f cmp w1, #0x0 2c: 54ffffac b.gt 20 <__ieee754_ilogbf+0x20> 30: d65f03c0 ret 34: 13177c20 asr w0, w1, #23 38: 12b01002 mov w2, #0x7f7fffff // #2139095039 3c: 5101fc00 sub w0, w0, #0x7f 40: 6b02003f cmp w1, w2 44: 12b00001 mov w1, #0x7fffffff // #2147483647 48: 1a819000 csel w0, w0, w1, ls // ls = plast 4c: d65f03c0 ret 50: 320107e0 mov w0, #0x80000001 // #-2147483647 54: d65f03c0 ret * patch: 0000000000000000 <__ieee754_ilogbf>: 0: 1e260001 fmov w1, s0 4: d3577820 ubfx x0, x1, #23, #8 8: 350000e0 cbnz w0, 24 <__ieee754_ilogbf+0x24> c: 53175821 lsl w1, w1, #9 10: 34000141 cbz w1, 38 <__ieee754_ilogbf+0x38> 14: 5ac01021 clz w1, w1 18: 12800fc0 mov w0, #0xffffff81 // #-127 1c: 4b010000 sub w0, w0, w1 20: d65f03c0 ret 24: 7103fc1f cmp w0, #0xff 28: 5101fc00 sub w0, w0, #0x7f 2c: 12b00001 mov w1, #0x7fffffff // #2147483647 30: 1a811000 csel w0, w0, w1, ne // ne = any 34: d65f03c0 ret 38: 320107e0 mov w0, #0x80000001 // #-2147483647 3c: d65f03c0 ret Other architectures with support for stdc_leading_zeros and/or __builtin_clzll should have similar improvements. Checked on aarch64-linux-gnu and x86_64-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-06-02 13:32:19 -03:00
Adhemerval Zanella	c4be334400	math: Optimize double ilogb/llogb It removes the wrapper by moving the error/EDOM handling to an out-of-line implementation (__math_invalid_i/__math_invalid_li). Also, __glibc_unlikely is used on the error cases since it helps code generation on recent gcc. With gcc-14 on aarch64 the code now builds to: 0000000000000000 <__ilogb>: 0: 9e660000 fmov x0, d0 4: d374f801 ubfx x1, x0, #52, #11 8: 340000e1 cbz w1, 24 <__ilogb+0x24> c: 510ffc20 sub w0, w1, #0x3ff 10: 711ffc3f cmp w1, #0x7ff 14: 54000040 b.eq 1c <__ilogb+0x1c> // b.none 18: d65f03c0 ret 1c: 12b00000 mov w0, #0x7fffffff // #2147483647 20: 14000000 b 0 <__math_invalid_i> 24: d374cc00 lsl x0, x0, #12 28: b40000a0 cbz x0, 3c <__ilogb+0x3c> 2c: dac01000 clz x0, x0 30: 12807fc1 mov w1, #0xfffffc01 // #-1023 34: 4b000020 sub w0, w1, w0 38: d65f03c0 ret 3c: 320107e0 mov w0, #0x80000001 // #-2147483647 40: 14000000 b 0 <__math_invalid_i> Some ABIs require additional adjustments: * i386 and m68k need to use the template version, since both provide __ieee754_ilogb implementations. * loongarch uses a custom implementation as well. * powerpc64le also has a custom implementation for POWER9, which is also used for the float and float128 versions. The generic e_ilogb.c implementation is moved on powerpc to keep the current code as-is. Checked on aarch64-linux-gnu and x86_64-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-06-02 13:32:19 -03:00
Adhemerval Zanella	eb1e9194fa	math: Remove UB and optimize double ilogb The subnormal exponent calculation invokes UB by left shifting the signed exponent to find the first leading bit. The implementation also uses 32-bit operations, which generates suboptimal code on 64-bit architectures. The patch reimplements ilogb using the math_config.h macros and uses the new stdbit function to simplify the subnormal handling. On aarch64 it generates better code: * master: 0000000000000000 <__ieee754_ilogb>: 0: 9e660000 fmov x0, d0 4: d360fc02 lsr x2, x0, #32 8: d360f801 ubfx x1, x0, #32, #31 c: f26c285f tst x2, #0x7ff00000 10: 540001a1 b.ne 44 <__ieee754_ilogb+0x44> // b.any 14: 2a000022 orr w2, w1, w0 18: 34000322 cbz w2, 7c <__ieee754_ilogb+0x7c> 1c: 35000221 cbnz w1, 60 <__ieee754_ilogb+0x60> 20: 2a0003e1 mov w1, w0 24: 7100001f cmp w0, #0x0 28: 12808240 mov w0, #0xfffffbed // #-1043 2c: 540000ad b.le 40 <__ieee754_ilogb+0x40> 30: 531f7821 lsl w1, w1, #1 34: 51000400 sub w0, w0, #0x1 38: 7100003f cmp w1, #0x0 3c: 54ffffac b.gt 30 <__ieee754_ilogb+0x30> 40: d65f03c0 ret 44: 13147c20 asr w0, w1, #20 48: 12b00202 mov w2, #0x7fefffff // #2146435071 4c: 510ffc00 sub w0, w0, #0x3ff 50: 6b02003f cmp w1, w2 54: 12b00001 mov w1, #0x7fffffff // #2147483647 58: 1a819000 csel w0, w0, w1, ls // ls = plast 5c: d65f03c0 ret 60: 53155021 lsl w1, w1, #11 64: 12807fa0 mov w0, #0xfffffc02 // #-1022 68: 531f7821 lsl w1, w1, #1 6c: 51000400 sub w0, w0, #0x1 70: 7100003f cmp w1, #0x0 74: 54ffffac b.gt 68 <__ieee754_ilogb+0x68> 78: d65f03c0 ret 7c: 320107e0 mov w0, #0x80000001 // #-2147483647 80: d65f03c0 ret * patch: 0000000000000000 <__ieee754_ilogb>: 0: 9e660001 fmov x1, d0 4: d374f820 ubfx x0, x1, #52, #11 8: 350000e0 cbnz w0, 24 <__ieee754_ilogb+0x24> c: d374cc21 lsl x1, x1, #12 10: b4000141 cbz x1, 38 <__ieee754_ilogb+0x38> 14: dac01021 clz x1, x1 18: 12807fc0 mov w0, #0xfffffc01 // #-1023 1c: 4b010000 sub w0, w0, w1 20: d65f03c0 ret 24: 711ffc1f cmp w0, #0x7ff 28: 510ffc00 sub w0, w0, #0x3ff 2c: 12b00001 mov w1, #0x7fffffff // #2147483647 30: 1a811000 csel w0, w0, w1, ne // ne = any 34: d65f03c0 ret 38: 320107e0 mov w0, #0x80000001 // #-2147483647 3c: d65f03c0 ret Other architectures with support for stdc_leading_zeros and/or __builtin_clzll should have similar improvements. Checked on aarch64-linux-gnu and x86_64-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-06-02 13:32:19 -03:00
Arjun Shankar	591283a689	manual: Correct return value description of 'clock_nanosleep' Commit `1a3d8f2201` incorrectly described 'clock_nanosleep' as having the same return values as 'nanosleep'. Fix this, clarifying that 'clock_nanosleep' returns a positive error number upon failure instead of setting 'errno'. Also clarify that 'nanosleep' returns '-1' upon error. Fixes: `1a3d8f2201` Reported-by: Mark Harris <mark.hsj@gmail.com> Reviewed-by: Mark Harris <mark.hsj@gmail.com>	2025-06-02 16:06:11 +02:00
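
The return-value difference described in the 'clock_nanosleep' entry above can be illustrated with a minimal C sketch. The helper names sleep_with_nanosleep and sleep_with_clock_nanosleep are hypothetical, introduced only for this illustration; they are not part of glibc or of the commit. Both helpers normalize the result to 0 on success or an error number on failure, which makes the two conventions easy to compare.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: sleep with nanosleep.
   nanosleep returns -1 on failure and stores the cause in errno,
   so the error number has to be fetched from errno.  */
int
sleep_with_nanosleep (const struct timespec *req)
{
  assert (req != NULL);
  if (nanosleep (req, NULL) == -1)
    return errno;
  return 0;
}

/* Hypothetical helper: sleep with clock_nanosleep.
   clock_nanosleep returns the positive error number directly and
   does not use errno to report failure.  */
int
sleep_with_clock_nanosleep (const struct timespec *req)
{
  assert (req != NULL);
  /* CLOCK_REALTIME with flags 0 requests a relative sleep,
     which is the closest match to what nanosleep does.  */
  return clock_nanosleep (CLOCK_REALTIME, 0, req, NULL);
}
```

Passing a timespec whose tv_nsec is 1000000000 or more should make both helpers report EINVAL, obtained through errno in the first case and directly from the return value in the second.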

42475 Commits, All Branches