Hi all,

I'd like to submit this patch introducing an Arm MTE compatible
strchrnul implementation.

A performance comparison of the strchrnul benchmark run on Cortex-A72,
Cortex-A53 and Neoverse N1 follows.

| length | alignment | perf-uplift A72 | perf-uplift A53 | perf-uplift N1 |
|--------+-----------+-----------------+-----------------+----------------|
|     32 |         0 |           1.16x |           1.07x |          1.35x |
|     32 |         1 |           1.25x |           1.16x |          1.15x |
|     64 |         0 |           1.26x |           0.97x |          1.20x |
|     64 |         2 |           1.35x |           1.04x |          1.30x |
|    128 |         0 |           1.12x |           0.84x |          1.22x |
|    128 |         3 |           1.25x |           0.87x |          1.30x |
|    256 |         0 |           1.14x |           0.84x |          1.16x |
|    256 |         4 |           1.24x |           0.81x |          1.16x |
|    512 |         0 |           1.15x |           0.80x |          1.13x |
|    512 |         5 |           1.17x |           0.81x |          1.14x |
|   1024 |         0 |           1.14x |           0.78x |          1.08x |
|   1024 |         6 |           1.03x |           0.78x |          1.10x |
|   2048 |         0 |           1.12x |           0.76x |          1.08x |
|   2048 |         7 |           1.14x |           0.77x |          1.09x |
|     64 |         1 |           1.35x |           1.04x |          1.37x |
|     64 |         1 |           1.36x |           1.04x |          1.37x |
|     64 |         2 |           1.36x |           1.04x |          1.37x |
|     64 |         2 |           1.37x |           1.04x |          1.38x |
|     64 |         3 |           1.38x |           1.04x |          1.36x |
|     64 |         3 |           1.40x |           1.04x |          1.36x |
|     64 |         4 |           1.41x |           1.04x |          1.36x |
|     64 |         4 |           1.36x |           1.04x |          1.36x |
|     64 |         5 |           1.34x |           1.04x |          1.40x |
|     64 |         5 |           1.35x |           1.04x |          1.36x |
|     64 |         6 |           1.34x |           1.04x |          1.37x |
|     64 |         6 |           1.41x |           1.04x |          1.37x |
|     64 |         7 |           1.39x |           1.04x |          1.36x |
|     64 |         7 |           1.34x |           1.04x |          1.37x |
|      0 |         0 |           1.18x |           1.63x |          1.66x |
|      0 |         0 |           1.18x |           1.63x |          1.66x |
|      1 |         0 |           1.18x |           1.63x |          1.66x |
|      1 |         0 |           1.18x |           1.63x |          1.67x |
|      2 |         0 |           1.18x |           1.63x |          1.66x |
|      2 |         0 |           1.18x |           1.63x |          1.65x |
|      3 |         0 |           1.18x |           1.63x |          1.66x |
|      3 |         0 |           1.18x |           1.63x |          1.66x |
|      4 |         0 |           1.18x |           1.63x |          1.65x |
|      4 |         0 |           1.18x |           1.63x |          1.66x |
|      5 |         0 |           1.18x |           1.63x |          1.66x |
|      5 |         0 |           1.18x |           1.63x |          1.66x |
|      6 |         0 |           1.18x |           1.63x |          1.66x |
|      6 |         0 |           1.18x |           1.63x |          1.66x |
|      7 |         0 |           1.18x |           1.63x |          1.66x |
|      7 |         0 |           1.18x |           1.63x |          1.64x |
|      8 |         0 |           1.18x |           1.63x |          1.66x |
|      8 |         0 |           1.18x |           1.63x |          1.66x |
|      9 |         0 |           1.18x |           1.63x |          1.65x |
|      9 |         0 |           1.18x |           1.63x |          1.66x |
|     10 |         0 |           1.18x |           1.63x |          1.66x |
|     10 |         0 |           1.18x |           1.63x |          1.66x |
|     11 |         0 |           1.18x |           1.63x |          1.64x |
|     11 |         0 |           1.18x |           1.63x |          1.63x |
|     12 |         0 |           1.18x |           1.63x |          1.63x |
|     12 |         0 |           1.18x |           1.63x |          1.66x |
|     13 |         0 |           1.18x |           1.63x |          1.63x |
|     13 |         0 |           1.18x |           1.63x |          1.63x |
|     14 |         0 |           1.18x |           1.63x |          1.63x |
|     14 |         0 |           1.18x |           1.63x |          1.22x |
|     15 |         0 |           1.19x |           1.63x |          1.22x |
|     15 |         0 |           1.18x |           1.63x |          1.63x |
|     16 |         0 |           1.03x |           0.96x |          1.15x |
|     16 |         0 |           1.03x |           0.96x |          1.13x |
|     17 |         0 |           1.03x |           0.96x |          0.98x |
|     17 |         0 |           1.03x |           0.96x |          0.98x |
|     18 |         0 |           1.03x |           0.96x |          0.98x |
|     18 |         0 |           1.03x |           0.96x |          0.98x |
|     19 |         0 |           1.04x |           0.96x |          0.98x |
|     19 |         0 |           1.04x |           0.96x |          0.98x |
|     20 |         0 |           1.04x |           0.96x |          1.00x |
|     20 |         0 |           1.03x |           0.96x |          0.99x |
|     21 |         0 |           1.04x |           0.96x |          0.99x |
|     21 |         0 |           1.03x |           0.96x |          1.14x |
|     22 |         0 |           1.04x |           0.96x |          1.14x |
|     22 |         0 |           1.03x |           0.96x |          1.14x |
|     23 |         0 |           1.03x |           0.96x |          1.13x |
|     23 |         0 |           1.03x |           0.96x |          1.15x |
|     24 |         0 |           1.04x |           0.96x |          1.13x |
|     24 |         0 |           1.04x |           0.95x |          1.13x |
|     25 |         0 |           1.03x |           0.96x |          1.15x |
|     25 |         0 |           1.04x |           0.96x |          1.12x |
|     26 |         0 |           1.04x |           0.96x |          1.13x |
|     26 |         0 |           1.02x |           0.96x |          1.13x |
|     27 |         0 |           1.04x |           0.96x |          1.13x |
|     27 |         0 |           1.03x |           0.96x |          1.13x |
|     28 |         0 |           1.03x |           0.96x |          0.98x |
|     28 |         0 |           1.04x |           0.96x |          1.05x |
|     29 |         0 |           1.02x |           0.96x |          1.00x |
|     29 |         0 |           1.03x |           0.96x |          1.00x |
|     30 |         0 |           1.04x |           0.96x |          1.00x |
|     30 |         0 |           1.04x |           0.96x |          1.00x |
|     31 |         0 |           1.04x |           0.96x |          0.99x |
|     31 |         0 |           1.03x |           0.96x |          0.99x |
|     32 |         0 |           1.09x |           1.07x |          1.09x |
|     32 |         1 |           1.25x |           1.15x |          1.38x |
|     64 |         0 |           1.27x |           0.98x |          1.20x |
|     64 |         2 |           1.41x |           1.04x |          1.30x |
|    128 |         0 |           1.15x |           0.84x |          1.22x |
|    128 |         3 |           1.23x |           0.87x |          1.30x |
|    256 |         0 |           1.16x |           0.84x |          1.16x |
|    256 |         4 |           1.23x |           0.81x |          1.17x |
|    512 |         0 |           1.14x |           0.80x |          1.12x |
|    512 |         5 |           1.18x |           0.81x |          1.14x |
|   1024 |         0 |           1.16x |           0.78x |          1.09x |
|   1024 |         6 |           1.03x |           0.78x |          1.11x |
|   2048 |         0 |           1.14x |           0.76x |          1.08x |
|   2048 |         7 |           1.14x |           0.77x |          1.09x |
|     64 |         1 |           1.40x |           1.04x |          1.37x |
|     64 |         1 |           1.40x |           1.04x |          1.37x |
|     64 |         2 |           1.35x |           1.04x |          1.37x |
|     64 |         2 |           1.38x |           1.04x |          1.37x |
|     64 |         3 |           1.36x |           1.04x |          1.37x |
|     64 |         3 |           1.34x |           1.04x |          1.37x |
|     64 |         4 |           1.41x |           1.04x |          1.37x |
|     64 |         4 |           1.38x |           1.04x |          1.37x |
|     64 |         5 |           1.36x |           1.04x |          1.37x |
|     64 |         5 |           1.36x |           1.04x |          1.37x |
|     64 |         6 |           1.35x |           1.04x |          1.37x |
|     64 |         6 |           1.40x |           1.04x |          1.37x |
|     64 |         7 |           1.35x |           1.04x |          1.37x |
|     64 |         7 |           1.40x |           1.04x |          1.37x |
|      0 |         0 |           1.19x |           1.63x |          1.66x |
|      0 |         0 |           1.19x |           1.63x |          1.66x |
|      1 |         0 |           1.19x |           1.63x |          1.66x |
|      1 |         0 |           1.19x |           1.63x |          1.66x |
|      2 |         0 |           1.18x |           1.63x |          1.63x |
|      2 |         0 |           1.18x |           1.63x |          1.66x |
|      3 |         0 |           1.18x |           1.63x |          1.66x |
|      3 |         0 |           1.20x |           1.63x |          1.63x |
|      4 |         0 |           1.18x |           1.63x |          1.63x |
|      4 |         0 |           1.18x |           1.63x |          1.66x |
|      5 |         0 |           1.18x |           1.63x |          1.66x |
|      5 |         0 |           1.18x |           1.63x |          1.66x |
|      6 |         0 |           1.18x |           1.63x |          1.66x |
|      6 |         0 |           1.18x |           1.63x |          1.66x |
|      7 |         0 |           1.18x |           1.63x |          1.66x |
|      7 |         0 |           1.18x |           1.63x |          1.66x |
|      8 |         0 |           1.18x |           1.63x |          1.25x |
|      8 |         0 |           1.18x |           1.63x |          1.66x |
|      9 |         0 |           1.18x |           1.63x |          1.66x |
|      9 |         0 |           1.18x |           1.63x |          1.66x |
|     10 |         0 |           1.18x |           1.63x |          1.66x |
|     10 |         0 |           1.18x |           1.63x |          1.66x |
|     11 |         0 |           1.18x |           1.63x |          1.66x |
|     11 |         0 |           1.18x |           1.63x |          1.66x |
|     12 |         0 |           1.18x |           1.63x |          1.66x |
|     12 |         0 |           1.19x |           1.63x |          1.66x |
|     13 |         0 |           1.18x |           1.63x |          1.66x |
|     13 |         0 |           1.18x |           1.63x |          1.66x |
|     14 |         0 |           1.19x |           1.63x |          1.66x |
|     14 |         0 |           1.19x |           1.63x |          1.66x |
|     15 |         0 |           1.18x |           1.63x |          1.66x |
|     15 |         0 |           1.18x |           1.63x |          1.66x |
|     16 |         0 |           1.03x |           0.96x |          1.00x |
|     16 |         0 |           1.03x |           0.96x |          1.00x |
|     17 |         0 |           1.03x |           0.96x |          1.00x |
|     17 |         0 |           1.03x |           0.96x |          1.15x |
|     18 |         0 |           1.03x |           0.96x |          1.14x |
|     18 |         0 |           1.04x |           0.96x |          1.15x |
|     19 |         0 |           1.04x |           0.96x |          1.15x |
|     19 |         0 |           1.04x |           0.96x |          1.15x |
|     20 |         0 |           1.04x |           0.96x |          1.15x |
|     20 |         0 |           1.03x |           0.96x |          1.15x |
|     21 |         0 |           1.04x |           0.96x |          1.15x |
|     21 |         0 |           1.03x |           0.96x |          1.15x |
|     22 |         0 |           1.02x |           0.96x |          1.15x |
|     22 |         0 |           1.03x |           0.96x |          1.15x |
|     23 |         0 |           1.03x |           0.96x |          1.15x |
|     23 |         0 |           1.03x |           0.96x |          1.15x |
|     24 |         0 |           1.03x |           0.96x |          1.00x |
|     24 |         0 |           1.02x |           0.96x |          1.00x |
|     25 |         0 |           1.04x |           0.96x |          1.00x |
|     25 |         0 |           1.03x |           0.96x |          1.16x |
|     26 |         0 |           1.04x |           0.96x |          1.15x |
|     26 |         0 |           1.03x |           0.96x |          1.15x |
|     27 |         0 |           1.04x |           0.96x |          1.00x |
|     27 |         0 |           1.03x |           0.96x |          1.00x |
|     28 |         0 |           1.04x |           0.96x |          1.00x |
|     28 |         0 |           1.04x |           0.96x |          1.00x |
|     29 |         0 |           1.03x |           0.96x |          1.00x |
|     29 |         0 |           1.04x |           0.96x |          1.15x |
|     30 |         0 |           1.04x |           0.96x |          1.15x |
|     30 |         0 |           1.03x |           0.95x |          1.00x |
|     31 |         0 |           1.03x |           0.96x |          1.00x |
|     31 |         0 |           1.04x |           0.96x |          1.00x |

This patch passes the GLIBC tests.

Regards

  Andrea

8< --- 8< --- 8<
Introduce an Arm MTE compatible strchrnul implementation.

Benchmarking on Cortex-A72, Cortex-A53 and Neoverse N1 does not show
performance regressions.

Co-authored-by: Wilco Dijkstra