public inbox for glibc-bugs@sourceware.org
* [Bug libc/27437] New: [aarch64]memcpy_simd has performance regression with larger size on Neoverse N1
From: xuchunmei at linux dot alibaba.com @ 2021-02-19 2:56 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=27437
Bug ID: 27437
Summary: [aarch64]memcpy_simd has performance regression with
larger size on Neoverse N1
Product: glibc
Version: 2.32
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: libc
Assignee: unassigned at sourceware dot org
Reporter: xuchunmei at linux dot alibaba.com
CC: drepper.fsp at gmail dot com
Target Milestone: ---
My test platform is a Neoverse N1 with 8 vCPUs and 32G of memory.
One test environment uses glibc 2.28 and the other uses glibc 2.32.
I use the perf-bench-mem memcpy test case; the test command is:
perf bench mem memcpy -l 100000 -s 1MB -f default
The following compares the results for glibc 2.28 and glibc 2.32. The first
column is the memcpy length under test, and the data columns are the copy
throughput reported by perf-bench-mem.
length  glibc2.28  glibc2.32  change
1KB     40.974632  41.072926  1
2KB     42.864724  42.769414  -1%
4KB     43.652475  43.713758  1
8KB     44.136496  44.119306  1
16KB    44.216839  44.275858  1
32KB    43.860959  44.387913  1%
64KB    42.098147  44.104689  4%
128KB   41.403627  39.714452  -4%
256KB   43.682267  40.190337  -8%
512KB   44.157858  37.020873  -16%
1MB     44.398972  16.413157  -63%
2MB     44.401274  13.739617  -69%
When the test size is 128KB or larger, glibc 2.32 copies more slowly.
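The change column can be recomputed from the two throughput columns. A minimal sketch in Python, with the values copied from the table above; the names throughput, percent_change and changes are illustrative and not part of perf or glibc:

```python
# Throughput figures copied from the table above: (glibc 2.28, glibc 2.32).
throughput = {
    "1KB": (40.974632, 41.072926),
    "2KB": (42.864724, 42.769414),
    "4KB": (43.652475, 43.713758),
    "8KB": (44.136496, 44.119306),
    "16KB": (44.216839, 44.275858),
    "32KB": (43.860959, 44.387913),
    "64KB": (42.098147, 44.104689),
    "128KB": (41.403627, 39.714452),
    "256KB": (43.682267, 40.190337),
    "512KB": (44.157858, 37.020873),
    "1MB": (44.398972, 16.413157),
    "2MB": (44.401274, 13.739617),
}


def percent_change(old, new):
    """Relative change of new versus old, in percent."""
    return (new - old) / old * 100.0


changes = {
    length: percent_change(old, new)
    for length, (old, new) in throughput.items()
}

for length, change in changes.items():
    print(f"{length:>6} {change:+6.1f}%")
```

On these numbers every size from 128KB upward comes out negative, while the sizes below that stay within about 5% in either direction.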
I used perf record to find the hot functions:
glibc2.32:
+ 99.93% mem-memcpy libc-2.32.so [.] __GI___memcpy_simd
0.01% perf ld-2.32.so [.] do_lookup_x
0.01% mem-memcpy [kernel.kallsyms] [k] zap_pte_range
0.00% perf ld-2.32.so [.] strcmp
0.00% perf ld-2.32.so [.] _dl_relocate_object
glibc2.28:
+ 99.48% mem-memcpy libc-2.28.so [.] __memcpy_generic
0.18% perf ld-2.28.so [.] do_lookup_x
0.09% perf ld-2.28.so [.] _dl_relocate_object
0.04% perf ld-2.28.so [.] _dl_lookup_symbol_x
and the per-instruction detail in glibc2.32:
│ d8: ldr q3, [x1]
0.02 │ and x14, x1, #0xf
│ and x1, x1, #0xfffffffffffffff0
│ sub x3, x0, x14
│ add x2, x2, x14
│ ldp q0, q1, [x1, #16]
0.00 │ str q3, [x0]
│ ldp q2, q3, [x1, #48]
0.02 │ subs x2, x2, #0x90
│ ↓ b.ls 120
0.16 │100: stp q0, q1, [x3, #16]
5.40 │ ldp q0, q1, [x1, #80]
10.92 │ stp q2, q3, [x3, #48]
4.93 │ ldp q2, q3, [x1, #112]
77.29 │ add x1, x1, #0x40
0.44 │ add x3, x3, #0x40
0.01 │ subs x2, x2, #0x40
0.81 │ ↑ b.hi 100
│120: ldp q4, q5, [x4, #-64]
│ stp q0, q1, [x3, #16]
│ ldp q0, q1, [x4, #-32]
│ stp q2, q3, [x3, #48]
│ stp q4, q5, [x5, #-64]
│ stp q0, q1, [x5, #-32]
│ ← ret
--
You are receiving this mail because:
You are on the CC list for the bug.
From: xuchunmei at linux dot alibaba.com @ 2021-02-19 2:57 UTC (permalink / raw)
xuchunmei <xuchunmei at linux dot alibaba.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC   |        |wdijkstr at arm dot com,
     |        |xuchunmei at linux dot alibaba.com
From: carlos at redhat dot com @ 2021-02-19 3:16 UTC (permalink / raw)
Carlos O'Donell <carlos at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC   |        |carlos at redhat dot com
--- Comment #1 from Carlos O'Donell <carlos at redhat dot com> ---
What results do you get from bench-memcpy-random (i.e. make bench) on your
system?
From: xuchunmei at linux dot alibaba.com @ 2021-02-19 3:42 UTC (permalink / raw)
--- Comment #2 from xuchunmei <xuchunmei at linux dot alibaba.com> ---
(In reply to Carlos O'Donell from comment #1)
> What results do you get from bench-memcpy-random (i.e. make bench) on your
> system?
# ./bench-memcpy-random
{
"timing_type": "hp_timing",
"functions": {
"memcpy": {
"bench-variant": "random",
"ifuncs": ["__memcpy_thunderx", "__memcpy_thunderx2", "__memcpy_falkor",
"__memcpy_simd", "__memcpy_generic"],
"results": [
{
"max-size": 4096,
"timings": [61793.7, 59328.8, 56071.7, 50435.7, 53163.3]
},
{
"max-size": 8192,
"timings": [62629.7, 58642.9, 55397.9, 49791.9, 52634.6]
},
{
"max-size": 16384,
"timings": [63192.2, 58967, 55733.3, 49763.7, 53064.4]
},
{
"max-size": 32768,
"timings": [63471.5, 59236.6, 56408.2, 51509.8, 54014.6]
},
{
"max-size": 65536,
"timings": [65745.4, 60589.8, 57791.2, 52921.2, 57637.8]
},
{
"max-size": 131072,
"timings": [68051.3, 62946.6, 60451.2, 56379.2, 60693]
},
{
"max-size": 262144,
"timings": [74675.4, 69991.7, 67861.6, 63699, 67316.2]
},
{
"max-size": 524288,
"timings": [94101.2, 91320.1, 89655.6, 84932.3, 87520.9]
}]
}
}
}
From: wdijkstr at arm dot com @ 2021-02-19 11:27 UTC (permalink / raw)
--- Comment #3 from Wilco <wdijkstr at arm dot com> ---
So bench-memcpy-random shows __memcpy_simd is fastest by a good margin for
small cases. Do you see similar differences between __memcpy_generic and
__memcpy_simd in bench-memcpy-walk or bench-memcpy-large?
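The "fastest by a good margin" reading can be checked mechanically. A sketch in Python, assuming the JSON layout shown in comment #2 and that a lower timing means a faster implementation; fastest_per_row is an illustrative helper, not part of the glibc benchtests, and only three of the eight result rows are copied here:

```python
import json

# Trimmed copy of the bench-memcpy-random output from comment #2
# (three of the eight result rows, values unchanged).
bench_json = """
{
 "timing_type": "hp_timing",
 "functions": {
  "memcpy": {
   "bench-variant": "random",
   "ifuncs": ["__memcpy_thunderx", "__memcpy_thunderx2", "__memcpy_falkor",
              "__memcpy_simd", "__memcpy_generic"],
   "results": [
    {"max-size": 4096, "timings": [61793.7, 59328.8, 56071.7, 50435.7, 53163.3]},
    {"max-size": 65536, "timings": [65745.4, 60589.8, 57791.2, 52921.2, 57637.8]},
    {"max-size": 524288, "timings": [94101.2, 91320.1, 89655.6, 84932.3, 87520.9]}
   ]
  }
 }
}
"""


def fastest_per_row(doc, function="memcpy"):
    """Return (max-size, fastest ifunc, margin over runner-up in percent) per row."""
    entry = doc["functions"][function]
    names = entry["ifuncs"]
    rows = []
    for result in entry["results"]:
        # Pair each timing with its ifunc name and sort ascending (lower is faster).
        ranked = sorted(zip(result["timings"], names))
        (best_time, best_name), (second_time, _) = ranked[0], ranked[1]
        margin = (second_time - best_time) / second_time * 100.0
        rows.append((result["max-size"], best_name, margin))
    return rows


rows = fastest_per_row(json.loads(bench_json))
for max_size, name, margin in rows:
    print(f"max-size {max_size:>7}: {name} fastest, {margin:.1f}% ahead of the runner-up")
```

The same function can be pointed at the full output file by replacing bench_json with the file contents.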
From: xuchunmei at linux dot alibaba.com @ 2021-02-19 11:54 UTC (permalink / raw)
--- Comment #4 from xuchunmei <xuchunmei at linux dot alibaba.com> ---
(In reply to Wilco from comment #3)
> So bench-memcpy-random shows __memcpy_simd is fastest by a good margin for
> small cases. Do you see similar differences between __memcpy_generic and
> __memcpy_simd in bench-memcpy-walk or bench-memcpy-large?
# ./bench-memcpy-walk
{
"timing_type": "hp_timing",
"functions": {
"memcpy": {
"bench-variant": "walk",
"ifuncs": ["__memcpy_thunderx", "__memcpy_thunderx2", "__memcpy_falkor",
"__memcpy_simd", "__memcpy_generic"],
"results": [
{
"length": 128,
"timings": [33.76, 34.9569, 9.54915, 9.57585, 9.47256]
},
{
"length": 129,
"timings": [35.2716, 31.849, 30.5517, 29.9549, 31.5533]
},
{
"length": 256,
"timings": [62.829, 60.8114, 57.7517, 56.2223, 57.1628]
},
{
"length": 257,
"timings": [49.6077, 47.399, 47.2415, 46.5737, 47.5711]
},
{
"length": 512,
"timings": [113.586, 113.427, 115.86, 116.922, 116.778]
},
{
"length": 513,
"timings": [106.793, 101.588, 94.7251, 88.1578, 88.2201]
},
{
"length": 1024,
"timings": [121.122, 122.055, 128.414, 122.723, 123.736]
},
{
"length": 1025,
"timings": [210.901, 195.7, 195.425, 181.864, 172.566]
},
{
"length": 2048,
"timings": [218.448, 224.068, 223.655, 230.854, 216.989]
},
{
"length": 2049,
"timings": [321.531, 329.909, 289.185, 285.851, 281.323]
},
{
"length": 4096,
"timings": [374.179, 401.979, 384.495, 392.112, 381.073]
},
{
"length": 4097,
"timings": [450.306, 510.441, 414.745, 401.545, 401.715]
},
{
"length": 8192,
"timings": [667.217, 673.253, 677.579, 694.132, 674.405]
},
{
"length": 8193,
"timings": [679.844, 768.811, 610.683, 591.152, 591.464]
},
{
"length": 16384,
"timings": [1236.02, 1206.93, 1261.11, 1287.68, 1255.33]
},
{
"length": 16385,
"timings": [1102.34, 1254.92, 1071.11, 1054.39, 1053.36]
},
{
"length": 32768,
"timings": [2275.35, 2294.88, 2424.76, 2472.89, 2404.96]
},
{
"length": 32769,
"timings": [2328.63, 2305.86, 2109.21, 2002.92, 2024.76]
},
{
"length": 65536,
"timings": [4437.26, 4435.93, 4803.28, 4880.86, 4743.9]
},
{
"length": 65537,
"timings": [4355.46, 4326.86, 4179.13, 4072.96, 4076.16]
},
{
"length": 131072,
"timings": [8670.91, 8735.29, 9394.05, 9515.15, 9383.43]
},
{
"length": 131073,
"timings": [8454.82, 9398.74, 8723.57, 8726.39, 8669.98]
},
{
"length": 262144,
"timings": [17410.9, 17450.9, 18792.5, 18928.1, 18682.2]
},
{
"length": 262145,
"timings": [16825.3, 16689.3, 17449.6, 17574.3, 17360.3]
},
{
"length": 524288,
"timings": [34310.5, 34433.7, 37218.7, 37446.8, 37060.6]
},
{
"length": 524289,
"timings": [33399.9, 33441.8, 34781.2, 34548.7, 34354.3]
},
{
"length": 1048576,
"timings": [68204.8, 68134.8, 74178.7, 74529.9, 73581.2]
},
{
"length": 1048577,
"timings": [64777.2, 65532.3, 69548.6, 69252.3, 68627.3]
},
{
"length": 2097152,
"timings": [134797, 135150, 146707, 147932, 146395]
},
{
"length": 2097153,
"timings": [131402, 130510, 141845, 142190, 141580]
},
{
"length": 4194304,
"timings": [268444, 269134, 292860, 295185, 293134]
},
{
"length": 4194305,
"timings": [265649, 265709, 287879, 289754, 288134]
},
{
"length": 8388608,
"timings": [534478, 538868, 585879, 589639, 584779]
},
{
"length": 8388609,
"timings": [533418, 535318, 581869, 587639, 580609]
},
{
"length": 16777216,
"timings": [1.07644e+06, 1.07876e+06, 1.17434e+06, 1.18022e+06,
1.17142e+06]
},
{
"length": 16777217,
"timings": [1.07702e+06, 1.07438e+06, 1.1722e+06, 1.18212e+06,
1.17316e+06]
},
{
"length": 33554432,
"timings": [2.14187e+06, 2.15503e+06, 2.35112e+06, 2.38628e+06,
2.34704e+06]
},
{
"length": 33554433,
"timings": [2.16367e+06, 2.15555e+06, 2.35476e+06, 2.36968e+06,
2.35604e+06]
}]
}
}
}
From: xuchunmei at linux dot alibaba.com @ 2021-02-19 11:55 UTC (permalink / raw)
--- Comment #5 from xuchunmei <xuchunmei at linux dot alibaba.com> ---
# ./bench-memcpy-large
{
"timing_type": "hp_timing",
"functions": {
"memcpy": {
"bench-variant": "large",
"ifuncs": ["__memcpy_thunderx", "__memcpy_thunderx2", "__memcpy_falkor",
"__memcpy_simd", "__memcpy_generic"],
"results": [
{
"length": 65543,
"align1": 0,
"align2": 0,
"timings": [12775.2, 1480, 1425.06, 1420, 1425]
},
{
"length": 65551,
"align1": 0,
"align2": 3,
"timings": [2550.06, 1752.5, 1527.56, 1487.5, 2330.06]
},
{
"length": 65567,
"align1": 3,
"align2": 0,
"timings": [2250, 1747.56, 1452.5, 1432.56, 2150]
},
{
"length": 65599,
"align1": 3,
"align2": 5,
"timings": [2367.56, 1740, 1532.56, 1482.5, 2347.56]
},
{
"length": 131079,
"align1": 0,
"align2": 0,
"timings": [4535.06, 3045.06, 2805.06, 2977.56, 2892.5]
},
{
"length": 131087,
"align1": 0,
"align2": 3,
"timings": [6550.06, 3522.5, 3127.56, 3110.06, 5510.06]
},
{
"length": 131103,
"align1": 3,
"align2": 0,
"timings": [6780.12, 3455, 2832.5, 3102.56, 5442.56]
},
{
"length": 131135,
"align1": 3,
"align2": 5,
"timings": [6570.06, 3485.06, 3132.56, 3857.56, 5500.12]
},
{
"length": 262151,
"align1": 0,
"align2": 0,
"timings": [9050.12, 6047.62, 5560.06, 5900.12, 6667.62]
},
{
"length": 262159,
"align1": 0,
"align2": 3,
"timings": [13042.7, 6915.12, 6185.12, 6100.06, 10887.7]
},
{
"length": 262175,
"align1": 3,
"align2": 0,
"timings": [13512.7, 6900.12, 7135.06, 6115.06, 10787.7]
},
{
"length": 262207,
"align1": 3,
"align2": 5,
"timings": [13045.2, 6882.56, 6152.62, 6100.06, 10880.2]
},
{
"length": 524295,
"align1": 0,
"align2": 0,
"timings": [19175.2, 12147.7, 11115.1, 11762.7, 12040.2]
},
{
"length": 524303,
"align1": 0,
"align2": 3,
"timings": [26097.9, 13782.7, 12357.7, 12100.2, 21637.9]
},
{
"length": 524319,
"align1": 3,
"align2": 0,
"timings": [27090.4, 13725.2, 11530.1, 12185.2, 22050.3]
},
{
"length": 524351,
"align1": 3,
"align2": 5,
"timings": [26545.4, 13832.8, 12300.2, 12102.7, 21635.3]
},
{
"length": 1048583,
"align1": 0,
"align2": 0,
"timings": [56493.4, 51520.8, 55768.4, 57578.4, 55663.4]
},
{
"length": 1048591,
"align1": 0,
"align2": 3,
"timings": [63180.9, 55333.4, 52825.8, 52888.4, 67321]
},
{
"length": 1048607,
"align1": 3,
"align2": 0,
"timings": [64918.6, 54993.3, 56908.4, 58403.4, 68601.1]
},
{
"length": 1048639,
"align1": 3,
"align2": 5,
"timings": [62773.5, 54600.8, 55968.4, 54903.4, 64518.5]
},
{
"length": 2097159,
"align1": 0,
"align2": 0,
"timings": [137770, 140780, 153962, 153740, 153040]
},
{
"length": 2097167,
"align1": 0,
"align2": 3,
"timings": [145682, 146480, 150777, 150967, 160282]
},
{
"length": 2097183,
"align1": 3,
"align2": 0,
"timings": [149220, 144587, 156512, 155420, 164575]
},
{
"length": 2097215,
"align1": 3,
"align2": 5,
"timings": [142237, 139987, 147570, 148227, 158155]
},
{
"length": 4194311,
"align1": 0,
"align2": 0,
"timings": [305932, 297210, 320072, 322650, 316517]
},
{
"length": 4194319,
"align1": 0,
"align2": 3,
"timings": [297642, 299982, 313495, 316667, 330998]
},
{
"length": 4194335,
"align1": 3,
"align2": 0,
"timings": [304000, 299292, 317437, 320967, 333400]
},
{
"length": 4194367,
"align1": 3,
"align2": 5,
"timings": [299717, 297707, 317915, 318242, 331165]
},
{
"length": 8388615,
"align1": 0,
"align2": 0,
"timings": [630660, 604037, 649978, 655200, 646123]
},
{
"length": 8388623,
"align1": 0,
"align2": 3,
"timings": [622982, 614902, 642418, 646813, 670370]
},
{
"length": 8388639,
"align1": 3,
"align2": 0,
"timings": [626285, 615827, 646858, 650653, 669793]
},
{
"length": 8388671,
"align1": 3,
"align2": 5,
"timings": [629460, 622300, 647858, 656640, 696578]
},
{
"length": 16777223,
"align1": 0,
"align2": 0,
"timings": [1.29937e+06, 1.25757e+06, 1.29938e+06, 1.29798e+06,
1.28434e+06]
},
{
"length": 16777231,
"align1": 0,
"align2": 3,
"timings": [1.24592e+06, 1.23578e+06, 1.3021e+06, 1.29313e+06,
1.33601e+06]
},
{
"length": 16777247,
"align1": 3,
"align2": 0,
"timings": [1.27559e+06, 1.24098e+06, 1.29053e+06, 1.29575e+06,
1.33914e+06]
},
{
"length": 16777279,
"align1": 3,
"align2": 5,
"timings": [1.26273e+06, 1.23506e+06, 1.28375e+06, 1.29168e+06,
1.3573e+06]
},
{
"length": 33554439,
"align1": 0,
"align2": 0,
"timings": [2.63771e+06, 2.50596e+06, 2.6239e+06, 2.6192e+06, 2.62283e+06]
},
{
"length": 33554447,
"align1": 0,
"align2": 3,
"timings": [2.59987e+06, 2.50243e+06, 2.60401e+06, 2.62009e+06,
2.68767e+06]
},
{
"length": 33554463,
"align1": 3,
"align2": 0,
"timings": [2.61623e+06, 2.49325e+06, 2.7087e+06, 2.78022e+06,
2.84058e+06]
},
{
"length": 33554495,
"align1": 3,
"align2": 5,
"timings": [2.70683e+06, 2.50238e+06, 2.59609e+06, 2.60011e+06,
2.67857e+06]
}]
}
}
}
From: xuchunmei at linux dot alibaba.com @ 2021-02-19 11:59 UTC (permalink / raw)
--- Comment #6 from xuchunmei <xuchunmei at linux dot alibaba.com> ---
The bench-memcpy-walk results show that when the length is larger than 1024,
__memcpy_simd seems a little slower than __memcpy_generic.
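To put a number on "a little slower", the two columns can be compared row by row. A small Python sketch using a subset of rows copied from the bench-memcpy-walk output in comment #4; the names walk_rows and slowdown_percent are illustrative:

```python
# (length, __memcpy_simd timing, __memcpy_generic timing), copied from the
# bench-memcpy-walk output in comment #4; only a subset of rows is listed.
walk_rows = [
    (1024, 122.723, 123.736),
    (1025, 181.864, 172.566),
    (2048, 230.854, 216.989),
    (4096, 392.112, 381.073),
    (65536, 4880.86, 4743.9),
    (1048576, 74529.9, 73581.2),
    (33554432, 2.38628e+06, 2.34704e+06),
]


def slowdown_percent(simd, generic):
    """How much slower simd is than generic, in percent (negative means faster)."""
    return (simd / generic - 1.0) * 100.0


slowdowns = {
    length: slowdown_percent(simd, generic)
    for length, simd, generic in walk_rows
}

for length, pct in slowdowns.items():
    print(f"{length:>9}: {pct:+.1f}%")
```

For the rows listed here the gap stays below 7%.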
From: wdijkstr at arm dot com @ 2021-02-19 13:12 UTC (permalink / raw)
--- Comment #7 from Wilco <wdijkstr at arm dot com> ---
(In reply to xuchunmei from comment #6)
> The bench-memcpy-walk results show that when the length is larger than 1024,
> __memcpy_simd seems a little slower than __memcpy_generic.
Yes but the difference is small, and in bench-memcpy-large __memcpy_simd wins
by a huge margin on the unaligned cases.
Since none of these reproduce what you are seeing, would it be possible to
create a small testcase that demonstrates the issue you are seeing in
perf-bench-mem?
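A standalone test case could look roughly like the following. This is only a sketch under several assumptions: it is written in Python and calls the C library's memcpy through ctypes, which adds per-call overhead, so it is only meaningful for large sizes; the helper names load_memcpy and throughput_gb_per_s are illustrative; and it is not claimed to match how perf bench mem measures.

```python
import ctypes
import time


def load_memcpy():
    """Look up memcpy in the running process; fall back to ctypes.memmove."""
    try:
        libc = ctypes.CDLL(None)  # POSIX: handle to the symbols already loaded
        func = libc.memcpy
    except (OSError, AttributeError, TypeError):
        return ctypes.memmove  # assumption: acceptable stand-in where lookup fails
    func.argtypes = [ctypes.c_void_p, ctypes.c_void_p, ctypes.c_size_t]
    func.restype = ctypes.c_void_p
    return func


memcpy = load_memcpy()


def throughput_gb_per_s(size, loops):
    """Copy `size` bytes `loops` times and return the rate in GB/s (1e9 bytes)."""
    src = ctypes.create_string_buffer(size)
    dst = ctypes.create_string_buffer(size)
    ctypes.memset(src, 0x5A, size)  # touch the source so its pages are really backed
    memcpy(dst, src, size)          # warm-up copy, also faults in the destination
    start = time.perf_counter()
    for _ in range(loops):
        memcpy(dst, src, size)
    elapsed = time.perf_counter() - start
    assert dst.raw == src.raw       # sanity check that the copy happened
    return size * loops / elapsed / 1e9


if __name__ == "__main__":
    for size_kb in (64, 128, 256, 512, 1024, 2048):
        rate = throughput_gb_per_s(size_kb * 1024, loops=500)
        print(f"{size_kb:>5}KB {rate:8.2f} GB/s")
```

Running the same script in both environments keeps everything except the C library and the surrounding system constant, which helps separate a memcpy difference from other differences between the two setups.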
From: xuchunmei at linux dot alibaba.com @ 2021-02-19 14:28 UTC (permalink / raw)
--- Comment #8 from xuchunmei <xuchunmei at linux dot alibaba.com> ---
(In reply to Wilco from comment #7)
> (In reply to xuchunmei from comment #6)
> > The bench-memcpy-walk results show that when the length is larger than 1024,
> > __memcpy_simd seems a little slower than __memcpy_generic.
>
> Yes but the difference is small, and in bench-memcpy-large __memcpy_simd
> wins by a huge margin on the unaligned cases.
>
> Since none of these reproduce what you are seeing, would it be possible to
> create a small testcase that demonstrates the issue you are seeing in
> perf-bench-mem?
Sorry for the confusion. The performance regression is not caused by
__memcpy_simd; it comes from differences between my test environments, which
differ in more than just glibc. I will check again.
From: xuchunmei at linux dot alibaba.com @ 2021-02-19 14:33 UTC (permalink / raw)
xuchunmei <xuchunmei at linux dot alibaba.com> changed:
What       |Removed     |Added
----------------------------------------------------------------------------
Status     |UNCONFIRMED |RESOLVED
Resolution |---         |NOTABUG
--- Comment #9 from xuchunmei <xuchunmei at linux dot alibaba.com> ---
Closing, since the bench-memcpy-random and bench-memcpy-large results show that
__memcpy_simd has no regression, and my test environments differ in more than
just glibc.
I will check in detail.