public inbox for glibc-bugs@sourceware.org
* [Bug libc/27437] New: [aarch64]memcpy_simd has performance regression with larger size on Neoverse N1
From: xuchunmei at linux dot alibaba.com @ 2021-02-19 2:56 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=27437

            Bug ID: 27437
           Summary: [aarch64]memcpy_simd has performance regression with larger
                    size on Neoverse N1
           Product: glibc
           Version: 2.32
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: libc
          Assignee: unassigned at sourceware dot org
          Reporter: xuchunmei at linux dot alibaba.com
                CC: drepper.fsp at gmail dot com
  Target Milestone: ---

My test platform is Neoverse N1 with 8 vCPUs and 32G memory.
One test env is glibc 2.28 and the other is glibc 2.32.
I use the performance testcase perf-bench-mem with memcpy; the test command is:

    perf bench mem memcpy -l 100000 -s 1MB -f default

The following compares the results of glibc 2.28 and glibc 2.32. The first
column is the memcpy length under test, and the data is the copy throughput
reported by perf-bench-mem.

length  glibc2.28   glibc2.32
1KB     40.974632   41.072926   1
2KB     42.864724   42.769414   -1%
4KB     43.652475   43.713758   1
8KB     44.136496   44.119306   1
16KB    44.216839   44.275858   1
32KB    43.860959   44.387913   1%
64KB    42.098147   44.104689   4%
128KB   41.403627   39.714452   -4%
256KB   43.682267   40.190337   -8%
512KB   44.157858   37.020873   -16%
1MB     44.398972   16.413157   -63%
2MB     44.401274   13.739617   -69%

When the test size is larger than 128KB, glibc 2.32 is slower to copy.

I used perf record to find the hot function:

glibc2.32:
+   99.93%  mem-memcpy  libc-2.32.so       [.] __GI___memcpy_simd
     0.01%  perf        ld-2.32.so         [.] do_lookup_x
     0.01%  mem-memcpy  [kernel.kallsyms]  [k] zap_pte_range
     0.00%  perf        ld-2.32.so         [.] strcmp
     0.00%  perf        ld-2.32.so         [.] _dl_relocate_object

glibc2.28:
+   99.48%  mem-memcpy  libc-2.28.so       [.] __memcpy_generic
     0.18%  perf        ld-2.28.so         [.] do_lookup_x
     0.09%  perf        ld-2.28.so         [.] _dl_relocate_object
     0.04%  perf        ld-2.28.so         [.] _dl_lookup_symbol_x

and the annotated detail for glibc 2.32:

       │ d8:   ldr    q3, [x1]
  0.02 │       and    x14, x1, #0xf
       │       and    x1, x1, #0xfffffffffffffff0
       │       sub    x3, x0, x14
       │       add    x2, x2, x14
       │       ldp    q0, q1, [x1, #16]
  0.00 │       str    q3, [x0]
       │       ldp    q2, q3, [x1, #48]
  0.02 │       subs   x2, x2, #0x90
       │     ↓ b.ls   120
  0.16 │100:   stp    q0, q1, [x3, #16]
  5.40 │       ldp    q0, q1, [x1, #80]
 10.92 │       stp    q2, q3, [x3, #48]
  4.93 │       ldp    q2, q3, [x1, #112]
 77.29 │       add    x1, x1, #0x40
  0.44 │       add    x3, x3, #0x40
  0.01 │       subs   x2, x2, #0x40
  0.81 │     ↑ b.hi   100
       │120:   ldp    q4, q5, [x4, #-64]
       │       stp    q0, q1, [x3, #16]
       │       ldp    q0, q1, [x4, #-32]
       │       stp    q2, q3, [x3, #48]
       │       stp    q4, q5, [x5, #-64]
       │       stp    q0, q1, [x5, #-32]
       │     ← ret

-- 
You are receiving this mail because:
You are on the CC list for the bug.
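The percentage column in the throughput table of the report can be recomputed from the two throughput columns. The following is only a minimal sketch, assuming the change is defined as (glibc2.32 - glibc2.28) / glibc2.28; the function name relative_change is hypothetical, and the sample values are copied from three rows of the table.

```python
def relative_change(old: float, new: float) -> float:
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100.0


# Throughput pairs (glibc 2.28, glibc 2.32) copied from the report's table.
samples = {
    "512KB": (44.157858, 37.020873),
    "1MB": (44.398972, 16.413157),
    "2MB": (44.401274, 13.739617),
}

for size, (old, new) in samples.items():
    print(f"{size}: {relative_change(old, new):+.0f}%")
```

With these inputs the rounded results agree with the -16%, -63% and -69% entries in the report.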
* [Bug libc/27437] [aarch64]memcpy_simd has performance regression with larger size on Neoverse N1
From: xuchunmei at linux dot alibaba.com @ 2021-02-19 2:57 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=27437

xuchunmei <xuchunmei at linux dot alibaba.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wdijkstr at arm dot com,
                   |                            |xuchunmei at linux dot alibaba.com
* [Bug libc/27437] [aarch64]memcpy_simd has performance regression with larger size on Neoverse N1
From: carlos at redhat dot com @ 2021-02-19 3:16 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=27437

Carlos O'Donell <carlos at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |carlos at redhat dot com

--- Comment #1 from Carlos O'Donell <carlos at redhat dot com> ---
What results do you get from bench-memcpy-random i.e. make bench; on your
system?
* [Bug libc/27437] [aarch64]memcpy_simd has performance regression with larger size on Neoverse N1
From: xuchunmei at linux dot alibaba.com @ 2021-02-19 3:42 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=27437

--- Comment #2 from xuchunmei <xuchunmei at linux dot alibaba.com> ---
(In reply to Carlos O'Donell from comment #1)
> What results do you get from bench-memcpy-random i.e. make bench; on your
> system?

# ./bench-memcpy-random
{
 "timing_type": "hp_timing",
 "functions": {
  "memcpy": {
   "bench-variant": "random",
   "ifuncs": ["__memcpy_thunderx", "__memcpy_thunderx2", "__memcpy_falkor", "__memcpy_simd", "__memcpy_generic"],
   "results": [
    { "max-size": 4096, "timings": [61793.7, 59328.8, 56071.7, 50435.7, 53163.3] },
    { "max-size": 8192, "timings": [62629.7, 58642.9, 55397.9, 49791.9, 52634.6] },
    { "max-size": 16384, "timings": [63192.2, 58967, 55733.3, 49763.7, 53064.4] },
    { "max-size": 32768, "timings": [63471.5, 59236.6, 56408.2, 51509.8, 54014.6] },
    { "max-size": 65536, "timings": [65745.4, 60589.8, 57791.2, 52921.2, 57637.8] },
    { "max-size": 131072, "timings": [68051.3, 62946.6, 60451.2, 56379.2, 60693] },
    { "max-size": 262144, "timings": [74675.4, 69991.7, 67861.6, 63699, 67316.2] },
    { "max-size": 524288, "timings": [94101.2, 91320.1, 89655.6, 84932.3, 87520.9] }]
  }
 }
}
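The bench-memcpy-random output in comment #2 lists one timing per ifunc variant, in the order given by "ifuncs". A minimal sketch of how the fastest variant per row could be picked out follows; the helper name fastest_per_row is hypothetical, and the embedded sample holds only the first and last rows copied from that output. Lower timings are assumed to be better.

```python
import json


def fastest_per_row(bench: dict, func: str = "memcpy") -> list:
    """Return (max-size, fastest ifunc name) for every result row."""
    entry = bench["functions"][func]
    names = entry["ifuncs"]
    winners = []
    for row in entry["results"]:
        timings = row["timings"]
        # Index of the smallest timing, i.e. the fastest variant in this row.
        best = min(range(len(timings)), key=timings.__getitem__)
        winners.append((row["max-size"], names[best]))
    return winners


# Two rows copied from the bench-memcpy-random output in comment #2.
sample = json.loads("""
{
 "timing_type": "hp_timing",
 "functions": {
  "memcpy": {
   "bench-variant": "random",
   "ifuncs": ["__memcpy_thunderx", "__memcpy_thunderx2", "__memcpy_falkor",
              "__memcpy_simd", "__memcpy_generic"],
   "results": [
    {"max-size": 4096, "timings": [61793.7, 59328.8, 56071.7, 50435.7, 53163.3]},
    {"max-size": 524288, "timings": [94101.2, 91320.1, 89655.6, 84932.3, 87520.9]}
   ]
  }
 }
}
""")

for max_size, name in fastest_per_row(sample):
    print(max_size, name)
```

For both sample rows the smallest timing belongs to __memcpy_simd.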
* [Bug libc/27437] [aarch64]memcpy_simd has performance regression with larger size on Neoverse N1
From: wdijkstr at arm dot com @ 2021-02-19 11:27 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=27437

--- Comment #3 from Wilco <wdijkstr at arm dot com> ---
So bench-memcpy-random shows __memcpy_simd is fastest by a good margin for
small cases. Do you see similar differences between __memcpy_generic and
__memcpy_simd in bench-memcpy-walk or bench-memcpy-large?
* [Bug libc/27437] [aarch64]memcpy_simd has performance regression with larger size on Neoverse N1
From: xuchunmei at linux dot alibaba.com @ 2021-02-19 11:54 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=27437

--- Comment #4 from xuchunmei <xuchunmei at linux dot alibaba.com> ---
(In reply to Wilco from comment #3)
> So bench-memcpy-random shows __memcpy_simd is fastest by a good margin for
> small cases. Do you see similar differences between __memcpy_generic and
> __memcpy_simd in bench-memcpy-walk or bench-memcpy-large?

# ./bench-memcpy-walk
{
 "timing_type": "hp_timing",
 "functions": {
  "memcpy": {
   "bench-variant": "walk",
   "ifuncs": ["__memcpy_thunderx", "__memcpy_thunderx2", "__memcpy_falkor", "__memcpy_simd", "__memcpy_generic"],
   "results": [
    { "length": 128, "timings": [33.76, 34.9569, 9.54915, 9.57585, 9.47256] },
    { "length": 129, "timings": [35.2716, 31.849, 30.5517, 29.9549, 31.5533] },
    { "length": 256, "timings": [62.829, 60.8114, 57.7517, 56.2223, 57.1628] },
    { "length": 257, "timings": [49.6077, 47.399, 47.2415, 46.5737, 47.5711] },
    { "length": 512, "timings": [113.586, 113.427, 115.86, 116.922, 116.778] },
    { "length": 513, "timings": [106.793, 101.588, 94.7251, 88.1578, 88.2201] },
    { "length": 1024, "timings": [121.122, 122.055, 128.414, 122.723, 123.736] },
    { "length": 1025, "timings": [210.901, 195.7, 195.425, 181.864, 172.566] },
    { "length": 2048, "timings": [218.448, 224.068, 223.655, 230.854, 216.989] },
    { "length": 2049, "timings": [321.531, 329.909, 289.185, 285.851, 281.323] },
    { "length": 4096, "timings": [374.179, 401.979, 384.495, 392.112, 381.073] },
    { "length": 4097, "timings": [450.306, 510.441, 414.745, 401.545, 401.715] },
    { "length": 8192, "timings": [667.217, 673.253, 677.579, 694.132, 674.405] },
    { "length": 8193, "timings": [679.844, 768.811, 610.683, 591.152, 591.464] },
    { "length": 16384, "timings": [1236.02, 1206.93, 1261.11, 1287.68, 1255.33] },
    { "length": 16385, "timings": [1102.34, 1254.92, 1071.11, 1054.39, 1053.36] },
    { "length": 32768, "timings": [2275.35, 2294.88, 2424.76, 2472.89, 2404.96] },
    { "length": 32769, "timings": [2328.63, 2305.86, 2109.21, 2002.92, 2024.76] },
    { "length": 65536, "timings": [4437.26, 4435.93, 4803.28, 4880.86, 4743.9] },
    { "length": 65537, "timings": [4355.46, 4326.86, 4179.13, 4072.96, 4076.16] },
    { "length": 131072, "timings": [8670.91, 8735.29, 9394.05, 9515.15, 9383.43] },
    { "length": 131073, "timings": [8454.82, 9398.74, 8723.57, 8726.39, 8669.98] },
    { "length": 262144, "timings": [17410.9, 17450.9, 18792.5, 18928.1, 18682.2] },
    { "length": 262145, "timings": [16825.3, 16689.3, 17449.6, 17574.3, 17360.3] },
    { "length": 524288, "timings": [34310.5, 34433.7, 37218.7, 37446.8, 37060.6] },
    { "length": 524289, "timings": [33399.9, 33441.8, 34781.2, 34548.7, 34354.3] },
    { "length": 1048576, "timings": [68204.8, 68134.8, 74178.7, 74529.9, 73581.2] },
    { "length": 1048577, "timings": [64777.2, 65532.3, 69548.6, 69252.3, 68627.3] },
    { "length": 2097152, "timings": [134797, 135150, 146707, 147932, 146395] },
    { "length": 2097153, "timings": [131402, 130510, 141845, 142190, 141580] },
    { "length": 4194304, "timings": [268444, 269134, 292860, 295185, 293134] },
    { "length": 4194305, "timings": [265649, 265709, 287879, 289754, 288134] },
    { "length": 8388608, "timings": [534478, 538868, 585879, 589639, 584779] },
    { "length": 8388609, "timings": [533418, 535318, 581869, 587639, 580609] },
    { "length": 16777216, "timings": [1.07644e+06, 1.07876e+06, 1.17434e+06, 1.18022e+06, 1.17142e+06] },
    { "length": 16777217, "timings": [1.07702e+06, 1.07438e+06, 1.1722e+06, 1.18212e+06, 1.17316e+06] },
    { "length": 33554432, "timings": [2.14187e+06, 2.15503e+06, 2.35112e+06, 2.38628e+06, 2.34704e+06] },
    { "length": 33554433, "timings": [2.16367e+06, 2.15555e+06, 2.35476e+06, 2.36968e+06, 2.35604e+06] }]
  }
 }
}
* [Bug libc/27437] [aarch64]memcpy_simd has performance regression with larger size on Neoverse N1
From: xuchunmei at linux dot alibaba.com @ 2021-02-19 11:55 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=27437

--- Comment #5 from xuchunmei <xuchunmei at linux dot alibaba.com> ---
# ./bench-memcpy-large
{
 "timing_type": "hp_timing",
 "functions": {
  "memcpy": {
   "bench-variant": "large",
   "ifuncs": ["__memcpy_thunderx", "__memcpy_thunderx2", "__memcpy_falkor", "__memcpy_simd", "__memcpy_generic"],
   "results": [
    { "length": 65543, "align1": 0, "align2": 0, "timings": [12775.2, 1480, 1425.06, 1420, 1425] },
    { "length": 65551, "align1": 0, "align2": 3, "timings": [2550.06, 1752.5, 1527.56, 1487.5, 2330.06] },
    { "length": 65567, "align1": 3, "align2": 0, "timings": [2250, 1747.56, 1452.5, 1432.56, 2150] },
    { "length": 65599, "align1": 3, "align2": 5, "timings": [2367.56, 1740, 1532.56, 1482.5, 2347.56] },
    { "length": 131079, "align1": 0, "align2": 0, "timings": [4535.06, 3045.06, 2805.06, 2977.56, 2892.5] },
    { "length": 131087, "align1": 0, "align2": 3, "timings": [6550.06, 3522.5, 3127.56, 3110.06, 5510.06] },
    { "length": 131103, "align1": 3, "align2": 0, "timings": [6780.12, 3455, 2832.5, 3102.56, 5442.56] },
    { "length": 131135, "align1": 3, "align2": 5, "timings": [6570.06, 3485.06, 3132.56, 3857.56, 5500.12] },
    { "length": 262151, "align1": 0, "align2": 0, "timings": [9050.12, 6047.62, 5560.06, 5900.12, 6667.62] },
    { "length": 262159, "align1": 0, "align2": 3, "timings": [13042.7, 6915.12, 6185.12, 6100.06, 10887.7] },
    { "length": 262175, "align1": 3, "align2": 0, "timings": [13512.7, 6900.12, 7135.06, 6115.06, 10787.7] },
    { "length": 262207, "align1": 3, "align2": 5, "timings": [13045.2, 6882.56, 6152.62, 6100.06, 10880.2] },
    { "length": 524295, "align1": 0, "align2": 0, "timings": [19175.2, 12147.7, 11115.1, 11762.7, 12040.2] },
    { "length": 524303, "align1": 0, "align2": 3, "timings": [26097.9, 13782.7, 12357.7, 12100.2, 21637.9] },
    { "length": 524319, "align1": 3, "align2": 0, "timings": [27090.4, 13725.2, 11530.1, 12185.2, 22050.3] },
    { "length": 524351, "align1": 3, "align2": 5, "timings": [26545.4, 13832.8, 12300.2, 12102.7, 21635.3] },
    { "length": 1048583, "align1": 0, "align2": 0, "timings": [56493.4, 51520.8, 55768.4, 57578.4, 55663.4] },
    { "length": 1048591, "align1": 0, "align2": 3, "timings": [63180.9, 55333.4, 52825.8, 52888.4, 67321] },
    { "length": 1048607, "align1": 3, "align2": 0, "timings": [64918.6, 54993.3, 56908.4, 58403.4, 68601.1] },
    { "length": 1048639, "align1": 3, "align2": 5, "timings": [62773.5, 54600.8, 55968.4, 54903.4, 64518.5] },
    { "length": 2097159, "align1": 0, "align2": 0, "timings": [137770, 140780, 153962, 153740, 153040] },
    { "length": 2097167, "align1": 0, "align2": 3, "timings": [145682, 146480, 150777, 150967, 160282] },
    { "length": 2097183, "align1": 3, "align2": 0, "timings": [149220, 144587, 156512, 155420, 164575] },
    { "length": 2097215, "align1": 3, "align2": 5, "timings": [142237, 139987, 147570, 148227, 158155] },
    { "length": 4194311, "align1": 0, "align2": 0, "timings": [305932, 297210, 320072, 322650, 316517] },
    { "length": 4194319, "align1": 0, "align2": 3, "timings": [297642, 299982, 313495, 316667, 330998] },
    { "length": 4194335, "align1": 3, "align2": 0, "timings": [304000, 299292, 317437, 320967, 333400] },
    { "length": 4194367, "align1": 3, "align2": 5, "timings": [299717, 297707, 317915, 318242, 331165] },
    { "length": 8388615, "align1": 0, "align2": 0, "timings": [630660, 604037, 649978, 655200, 646123] },
    { "length": 8388623, "align1": 0, "align2": 3, "timings": [622982, 614902, 642418, 646813, 670370] },
    { "length": 8388639, "align1": 3, "align2": 0, "timings": [626285, 615827, 646858, 650653, 669793] },
    { "length": 8388671, "align1": 3, "align2": 5, "timings": [629460, 622300, 647858, 656640, 696578] },
    { "length": 16777223, "align1": 0, "align2": 0, "timings": [1.29937e+06, 1.25757e+06, 1.29938e+06, 1.29798e+06, 1.28434e+06] },
    { "length": 16777231, "align1": 0, "align2": 3, "timings": [1.24592e+06, 1.23578e+06, 1.3021e+06, 1.29313e+06, 1.33601e+06] },
    { "length": 16777247, "align1": 3, "align2": 0, "timings": [1.27559e+06, 1.24098e+06, 1.29053e+06, 1.29575e+06, 1.33914e+06] },
    { "length": 16777279, "align1": 3, "align2": 5, "timings": [1.26273e+06, 1.23506e+06, 1.28375e+06, 1.29168e+06, 1.3573e+06] },
    { "length": 33554439, "align1": 0, "align2": 0, "timings": [2.63771e+06, 2.50596e+06, 2.6239e+06, 2.6192e+06, 2.62283e+06] },
    { "length": 33554447, "align1": 0, "align2": 3, "timings": [2.59987e+06, 2.50243e+06, 2.60401e+06, 2.62009e+06, 2.68767e+06] },
    { "length": 33554463, "align1": 3, "align2": 0, "timings": [2.61623e+06, 2.49325e+06, 2.7087e+06, 2.78022e+06, 2.84058e+06] },
    { "length": 33554495, "align1": 3, "align2": 5, "timings": [2.70683e+06, 2.50238e+06, 2.59609e+06, 2.60011e+06, 2.67857e+06] }]
  }
 }
}
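The bench-memcpy-large rows in comment #5 carry "align1" and "align2" fields, so __memcpy_generic and __memcpy_simd can be compared separately for aligned and unaligned copies. The following is a sketch only: the helper name generic_over_simd is hypothetical, the grouping rule (both alignments zero counts as aligned) is an assumption, and the sample rows are the four entries around 256 KiB copied from that output.

```python
IFUNCS = ["__memcpy_thunderx", "__memcpy_thunderx2", "__memcpy_falkor",
          "__memcpy_simd", "__memcpy_generic"]

# Rows copied from the bench-memcpy-large output in comment #5.
ROWS = [
    {"length": 262151, "align1": 0, "align2": 0,
     "timings": [9050.12, 6047.62, 5560.06, 5900.12, 6667.62]},
    {"length": 262159, "align1": 0, "align2": 3,
     "timings": [13042.7, 6915.12, 6185.12, 6100.06, 10887.7]},
    {"length": 262175, "align1": 3, "align2": 0,
     "timings": [13512.7, 6900.12, 7135.06, 6115.06, 10787.7]},
    {"length": 262207, "align1": 3, "align2": 5,
     "timings": [13045.2, 6882.56, 6152.62, 6100.06, 10880.2]},
]


def generic_over_simd(rows, names):
    """Return (aligned, unaligned) lists of generic/simd timing ratios.

    A ratio above 1.0 means __memcpy_generic took longer than __memcpy_simd.
    """
    i_simd = names.index("__memcpy_simd")
    i_generic = names.index("__memcpy_generic")
    aligned, unaligned = [], []
    for row in rows:
        ratio = row["timings"][i_generic] / row["timings"][i_simd]
        is_aligned = row["align1"] == 0 and row["align2"] == 0
        (aligned if is_aligned else unaligned).append(ratio)
    return aligned, unaligned


aligned, unaligned = generic_over_simd(ROWS, IFUNCS)
print("aligned:  ", [f"{r:.2f}" for r in aligned])
print("unaligned:", [f"{r:.2f}" for r in unaligned])
```

For these four rows every unaligned ratio is well above the aligned one; the same helper can be run over the full result list to see how the gap changes with length.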
* [Bug libc/27437] [aarch64]memcpy_simd has performance regression with larger size on Neoverse N1
From: xuchunmei at linux dot alibaba.com @ 2021-02-19 11:59 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=27437

--- Comment #6 from xuchunmei <xuchunmei at linux dot alibaba.com> ---
bench-memcpy-walk results show that when length is larger than 1024,
__memcpy_simd seems a little slower than __memcpy_generic.
* [Bug libc/27437] [aarch64]memcpy_simd has performance regression with larger size on Neoverse N1
From: wdijkstr at arm dot com @ 2021-02-19 13:12 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=27437

--- Comment #7 from Wilco <wdijkstr at arm dot com> ---
(In reply to xuchunmei from comment #6)
> bench-memcpy-walk results show that when length is larger than 1024,
> __memcpy_simd seems a little slower than __memcpy_generic.

Yes but the difference is small, and in bench-memcpy-large __memcpy_simd
wins by a huge margin on the unaligned cases.

Since none of these reproduce what you are seeing, would it be possible to
create a small testcase that demonstrates the issue you are seeing in
perf-bench-mem?
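As an illustration of the kind of small testcase requested in comment #7, the sketch below times repeated memcpy calls of a fixed size through Python's ctypes. It is not the perf-bench-mem harness and not a testcase posted in this thread: the function name memcpy_throughput, the loop count and the size list are all hypothetical, it assumes a POSIX system where the C library can be loaded via ctypes, and the per-call ctypes overhead makes it unsuitable for small sizes.

```python
import ctypes
import ctypes.util
import time

# Load the C library and declare memcpy's prototype (assumes a POSIX libc).
_libc = ctypes.CDLL(ctypes.util.find_library("c"))
_libc.memcpy.argtypes = [ctypes.c_void_p, ctypes.c_void_p, ctypes.c_size_t]
_libc.memcpy.restype = ctypes.c_void_p


def memcpy_throughput(size: int, loops: int = 200) -> float:
    """Copy `size` bytes `loops` times and return the rate in GiB/s."""
    src = (ctypes.c_char * size).from_buffer_copy(b"\xa5" * size)
    dst = (ctypes.c_char * size)()
    _libc.memcpy(dst, src, size)  # warm-up so page faults are not timed
    start = time.perf_counter()
    for _ in range(loops):
        _libc.memcpy(dst, src, size)
    elapsed = time.perf_counter() - start
    assert dst.raw == src.raw  # sanity check that the copy happened
    return size * loops / elapsed / 2**30


if __name__ == "__main__":
    # Sizes around the point where the report's table starts to diverge.
    for size in (64 << 10, 128 << 10, 512 << 10, 1 << 20, 2 << 20):
        print(f"{size >> 10:5d} KiB: {memcpy_throughput(size):.2f} GiB/s")
```

Running the same script on both environments would exercise whichever memcpy variant each glibc selects, which helps separate library effects from other differences between the machines.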
* [Bug libc/27437] [aarch64]memcpy_simd has performance regression with larger size on Neoverse N1
From: xuchunmei at linux dot alibaba.com @ 2021-02-19 14:28 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=27437

--- Comment #8 from xuchunmei <xuchunmei at linux dot alibaba.com> ---
(In reply to Wilco from comment #7)
> (In reply to xuchunmei from comment #6)
> > bench-memcpy-walk results show that when length is larger than 1024,
> > __memcpy_simd seems a little slower than __memcpy_generic.
>
> Yes but the difference is small, and in bench-memcpy-large __memcpy_simd
> wins by a huge margin on the unaligned cases.
>
> Since none of these reproduce what you are seeing, would it be possible to
> create a small testcase that demonstrates the issue you are seeing in
> perf-bench-mem?

Sorry for the confusion. The performance regression is not caused by
__memcpy_simd; it comes from differences between my test environments, which
differ in more than just glibc. I will check again.
* [Bug libc/27437] [aarch64]memcpy_simd has performance regression with larger size on Neoverse N1
From: xuchunmei at linux dot alibaba.com @ 2021-02-19 14:33 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=27437

xuchunmei <xuchunmei at linux dot alibaba.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |NOTABUG

--- Comment #9 from xuchunmei <xuchunmei at linux dot alibaba.com> ---
Since the bench-memcpy-random and bench-memcpy-large results show that
__memcpy_simd has no regression, and my test environments differ in more than
just glibc, I will check in detail.