On Mon, Jul 31, 2023 at 11:40 AM Richard Biener wrote:
>
> On Sun, 30 Jul 2023, Uros Bizjak wrote:
>
> > Also introduce -m[no-]mmxfp-with-sse option to disable trapping V2SF
> > named patterns in order to avoid generation of partial vector V4SFmode
> > trapping instructions.
> >
> > The new option is enabled by default, because even with sanitization,
> > a small but consistent speed up of 2 to 3% with Polyhedron capacita
> > benchmark can be achieved vs. scalar code.
> >
> > Using -fno-trapping-math improves Polyhedron capacita runtime 8 to 9%
> > vs. scalar code. This is what clang does by default, as it defaults
> > to -fno-trapping-math.
>
> I like the new option, note you lack invoke.texi documentation where
> I'd also elaborate a bit on the interaction with -fno-trapping-math
> and the possible performance impact when NaNs or denormals leak
> into the upper halves and cross-reference -mdaz-ftz.

The attached doc patch is the invoke.texi entry for the -mmmxfp-with-sse
option. It is written in a way to also cover half-float vectors.

WDYT?

Uros.