glibc

Commit Graph

Author	SHA1	Message	Date
Paul Eggert	2642002380	Update copyright dates with scripts/update-copyrights	2025-01-01 11:22:09 -08:00
Joe Ramsay	2d82d781a5	AArch64: Remove SVE erf and erfc tables By using a combination of mask-and-add instead of the shift-based index calculation the routines can share the same table as other variants with no performance degradation. The tables change name because of other changes in downstream AOR. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2024-11-01 16:10:41 +00:00
Joe Ramsay	0fed0b250f	aarch64/fpu: Add vector variants of pow Plus a small amount of moving includes around in order to be able to remove duplicate definition of asuint64. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-05-21 14:38:49 +01:00
Joe Ramsay	87cb1dfcd6	aarch64/fpu: Add vector variants of erfc Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-04-04 10:33:24 +01:00
Joe Ramsay	bdb5705b7b	aarch64/fpu: Add vector variants of cosh Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-04-04 10:32:52 +01:00
Joe Ramsay	cb5d84f1f8	aarch64/fpu: Add vector variants of erf Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2024-04-04 10:32:48 +01:00
Paul Eggert	dff8da6b3e	Update copyright dates with scripts/update-copyrights	2024-01-01 10:53:40 -08:00
Joe Ramsay	b07038c5d3	aarch64: Add vector implementations of atan2 routines	2023-11-10 17:07:43 +00:00
Joe Ramsay	067a34156c	aarch64: Add vector implementations of log10 routines A table is also added, which is shared between AdvSIMD and SVE log10.	2023-10-23 15:00:45 +01:00
Joe Ramsay	a8e3ab3074	aarch64: Add vector implementations of log2 routines A table is also added, which is shared between AdvSIMD and SVE log2.	2023-10-23 15:00:45 +01:00
Joe Ramsay	5a4b6f8e4b	aarch64: Optimise vecmath logs * Transpose table layout for improved memory access * Use half-vector special comparisons for AdvSIMD * Improve register use near special-case branches - Due to the presence of a function call, return value would get mov-d out of x0 in order to facilitate PCS. By moving the final computation after the branch this can be avoided Also change SVE routines to use overloaded intrinsics for readability.	2023-10-05 16:54:16 +01:00
Joe Ramsay	4a9392ffc2	aarch64: Add vector implementations of exp routines Optimised implementations for single and double precision, Advanced SIMD and SVE, copied from Arm Optimized Routines. As previously, data tables are used via a barrier to prevent overly aggressive constant inlining. Special-case handlers are marked NOINLINE to avoid incurring the penalty of switching call standards unnecessarily. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2023-06-30 09:04:26 +01:00
Joe Ramsay	78c01a5cbe	aarch64: Add vector implementations of log routines Optimised implementations for single and double precision, Advanced SIMD and SVE, copied from Arm Optimized Routines. Log lookup table added as HIDDEN symbol to allow it to be shared between AdvSIMD and SVE variants. As previously, data tables are used via a barrier to prevent overly aggressive constant inlining. Special-case handlers are marked NOINLINE to avoid incurring the penalty of switching call standards unnecessarily. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2023-06-30 09:04:22 +01:00
Joe Ramsay	aed39a3aa3	aarch64: Add vector implementations of cos routines Replace the loop-over-scalar placeholder routines with optimised implementations from Arm Optimized Routines (AOR). Also add some headers containing utilities for aarch64 libmvec routines, and update libm-test-ulps. Data tables for new routines are used via a pointer with a barrier on it, in order to prevent overly aggressive constant inlining in GCC. This allows a single adrp, combined with offset loads, to be used for every constant in the table. Special-case handlers are marked NOINLINE in order to confine the save/restore overhead of switching from vector to normal calling standard. This way we only incur the extra memory access in the exceptional cases. NOINLINE definitions have been moved to math_private.h in order to reduce duplication. AOR exposes a config option, WANT_SIMD_EXCEPT, to enable selective masking (and later fixing up) of invalid lanes, in order to trigger fp exceptions correctly (AdvSIMD only). This is tested and maintained in AOR, however it is configured off at source level here for performance reasons. We keep the WANT_SIMD_EXCEPT blocks in routine sources to greatly simplify the upstreaming process from AOR to glibc. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2023-06-30 09:04:10 +01:00

14 Commits
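
Several of the messages above describe the same two conventions: data tables are read through a pointer with a barrier on it, so that GCC does not inline each constant separately, and special-case handlers are marked NOINLINE so the cost of leaving the fast path is paid only in exceptional cases. The following is a minimal scalar sketch of that shape, not code from glibc or Arm Optimized Routines. Every name in it (`ptr_barrier`, `struct data`, `poly`, `special_case`, `toy_exp`) is hypothetical and chosen for illustration, the polynomial is a crude cubic, and the fallback is a toy. The real routines are AdvSIMD and SVE code that use the vector calling standard. The sketch assumes GNU C extensions (statement expressions, `__typeof`, inline asm) as supported by GCC and Clang.

```c
#include <math.h> /* INFINITY macro only; no libm functions are called.  */

/* All names below are hypothetical; this is not glibc or AOR source.  */
#define NOINLINE __attribute__ ((noinline))

/* Empty asm that makes the pointer value opaque to the optimiser, so the
   table is addressed through one base pointer (GNU C extension).  */
#define ptr_barrier(ptr)              \
  ({                                  \
    __typeof (ptr) ptr_ = (ptr);      \
    __asm__ ("" : "+r"(ptr_));        \
    ptr_;                             \
  })

static const struct data
{
  double c0, c1, c2, c3; /* Cubic Taylor coefficients of exp.  */
  double bound;          /* Fast path is used for |x| <= bound.  */
} data = { 1.0, 1.0, 0.5, 1.0 / 6.0, 0.5 };

/* Horner evaluation using constants loaded from the table.  */
static inline double
poly (double x, const struct data *d)
{
  return d->c0 + x * (d->c1 + x * (d->c2 + x * d->c3));
}

/* Rare path, kept out of line so the fast path stays small.  Toy fallback:
   halve x until it is in range, evaluate, then square the result back.  */
static double NOINLINE
special_case (double x, const struct data *d)
{
  if (x != x)
    return x; /* NaN propagates.  */
  if (x > 1024.0)
    return INFINITY;
  if (x < -1024.0)
    return 0.0;
  int squarings = 0;
  while (x > d->bound || x < -d->bound)
    {
      x *= 0.5;
      squarings++;
    }
  double y = poly (x, d);
  while (squarings-- > 0)
    y *= y;
  return y;
}

/* Low-accuracy exp approximation: fast path inline, everything else
   (large magnitude, NaN) routed to the out-of-line handler.  */
double
toy_exp (double x)
{
  const struct data *d = ptr_barrier (&data);
  if (!(x <= d->bound && x >= -d->bound))
    return special_case (x, d);
  return poly (x, d);
}
```

Calling `toy_exp (0.25)` stays on the fast path, while `toy_exp (2.0)` or a NaN input takes the out-of-line handler. Whether the barrier actually changes the generated code depends on the compiler, target, and optimisation level, so the effect is best confirmed by inspecting the disassembly.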