This is the patch I'm going to push to the trunk.

On Wed, May 12, 2021 at 3:28 PM Hongtao Liu wrote:
>
> ping
>
> On Fri, Apr 30, 2021 at 12:49 PM Hongtao Liu wrote:
> >
> > Hi:
> >   For v{,p}expand*, when the mask is 0, -1, or has all one bits in its
> > lower part, it can be optimized to a simple mov or mask mov.
> >   Bootstrapped and regtested on x86_64-linux-gnu{-m32,} and
> > x86_64-linux-gnu{-m32\ -march=cascadelake,-m64\ -march=cascadelake}.
> >
> > gcc/ChangeLog:
> >
> >         * config/i386/i386-builtin.def (BDESC): Adjust builtin name.
> >         * config/i386/sse.md (_expand_mask): Rename to ..
> >         (expand_mask): this ..
> >         (*expand_mask): New pre_reload splitter to transform
> >         v{,p}expand* to vmov* when mask is zero, all ones, or has
> >         all ones in its lower part, otherwise still generate v{,p}expand*.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.target/i386/avx512bw-pr100267-1.c: New test.
> >         * gcc.target/i386/avx512bw-pr100267-b-2.c: New test.
> >         * gcc.target/i386/avx512bw-pr100267-d-2.c: New test.
> >         * gcc.target/i386/avx512bw-pr100267-q-2.c: New test.
> >         * gcc.target/i386/avx512bw-pr100267-w-2.c: New test.
> >         * gcc.target/i386/avx512f-pr100267-1.c: New test.
> >         * gcc.target/i386/avx512f-pr100267-pd-2.c: New test.
> >         * gcc.target/i386/avx512f-pr100267-ps-2.c: New test.
> >         * gcc.target/i386/avx512vl-pr100267-1.c: New test.
> >         * gcc.target/i386/avx512vl-pr100267-pd-2.c: New test.
> >         * gcc.target/i386/avx512vl-pr100267-ps-2.c: New test.
> >         * gcc.target/i386/avx512vlbw-pr100267-1.c: New test.
> >         * gcc.target/i386/avx512vlbw-pr100267-b-2.c: New test.
> >         * gcc.target/i386/avx512vlbw-pr100267-d-2.c: New test.
> >         * gcc.target/i386/avx512vlbw-pr100267-q-2.c: New test.
> >         * gcc.target/i386/avx512vlbw-pr100267-w-2.c: New test.
> >
> > --
> > BR,
> > Hongtao
>
> --
> BR,
> Hongtao

--
BR,
Hongtao
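
For readers who want to see why the three mask cases collapse to a move, here is a minimal scalar sketch in portable C. It is not the splitter from sse.md and does not use the real intrinsics. It assumes an 8-element merge-masked expand, and the names model_expand, model_mask_mov, low_ones_mask_p, expand_equals_mask_mov and check_all_masks are hypothetical, made up for this illustration. The model takes consecutive source elements and places them at the set mask positions, keeping the pass-through value elsewhere. With a mask of 0 that is a copy of the pass-through operand, with all ones it is a copy of the source, and with ones only in the low part it matches a plain masked move.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NELT 8  /* assumed element count for this sketch */

/* Scalar model of a merge-masked expand: consecutive elements of SRC
   go to the positions whose mask bit is set; other positions keep
   the value from PASS.  */
void
model_expand (int32_t *dst, const int32_t *pass,
              const int32_t *src, uint8_t mask)
{
  int k = 0;  /* index of the next source element to consume */
  for (int i = 0; i < NELT; i++)
    dst[i] = ((mask >> i) & 1) ? src[k++] : pass[i];
}

/* Scalar model of a merge-masked move: element I comes from SRC[I]
   when mask bit I is set, otherwise from PASS[I].  */
void
model_mask_mov (int32_t *dst, const int32_t *pass,
                const int32_t *src, uint8_t mask)
{
  for (int i = 0; i < NELT; i++)
    dst[i] = ((mask >> i) & 1) ? src[i] : pass[i];
}

/* Return 1 when MASK has the form 0...01...1.  This includes 0 and
   all ones.  */
int
low_ones_mask_p (uint8_t mask)
{
  return (mask & (mask + 1)) == 0;
}

/* Return 1 when both models give the same result for MASK, using
   sample data in which every element is distinct.  */
int
expand_equals_mask_mov (uint8_t mask)
{
  const int32_t src[NELT]  = { 10, 11, 12, 13, 14, 15, 16, 17 };
  const int32_t pass[NELT] = { -1, -2, -3, -4, -5, -6, -7, -8 };
  int32_t a[NELT], b[NELT];

  model_expand (a, pass, src, mask);
  model_mask_mov (b, pass, src, mask);
  return memcmp (a, b, sizeof a) == 0;
}

/* For the sample data above, the two models agree exactly on the
   masks accepted by low_ones_mask_p.  */
void
check_all_masks (void)
{
  for (unsigned m = 0; m < 256; m++)
    assert (expand_equals_mask_mov ((uint8_t) m)
            == low_ones_mask_p ((uint8_t) m));
}
```

Calling check_all_masks () from a small main walks all 256 mask values. In this model a mask such as 0x06 still needs a real expand, because source element 0 lands in position 1.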