On 15/02/18 07:25, Uros Bizjak wrote: > On Wed, Feb 14, 2018 at 7:04 PM, Kyrill Tkachov > wrote: >> On 13/02/18 16:45, Jeff Law wrote: >>> On 02/09/2018 07:50 AM, Kyrill Tkachov wrote: >>>> Hi Uros, >>>> >>>> On 08/02/18 22:54, Uros Bizjak wrote: >>>>> On Thu, Feb 8, 2018 at 6:11 PM, Kyrill Tkachov >>>>> wrote: >>>>>> Hi all, >>>>>> >>>>>> This patch fixes some fallout in the i386 testsuite that occurs after >>>>>> the >>>>>> simplification in patch [1/3] [1]. >>>>>> The gcc.target/i386/extract-2.c FAILs because it expects to match: >>>>>> (set (reg:CC 17 flags) >>>>>> (compare:CC (subreg:QI (zero_extract:SI (reg:HI 98) >>>>>> (const_int 8 [0x8]) >>>>>> (const_int 8 [0x8])) 0) >>>>>> (const_int 4 [0x4]))) >>>>>> >>>>>> which is the *cmpqi_ext_2 pattern in i386.md but with the new >>>>>> simplification >>>>>> the combine/simplify-rtx >>>>>> machinery produces: >>>>>> (set (reg:CC 17 flags) >>>>>> (compare:CC (subreg:QI (zero_extract:HI (reg:HI 98) >>>>>> (const_int 8 [0x8]) >>>>>> (const_int 8 [0x8])) 0) >>>>>> (const_int 4 [0x4]))) >>>>>> >>>>>> Notice that the zero_extract now has HImode like the register source >>>>>> rather >>>>>> than SImode. >>>>>> The existing *cmpqi_ext_ patterns however explicitly demand an >>>>>> SImode on >>>>>> the zero_extract. >>>>>> I'm not overly familiar with the i386 port but I think that's too >>>>>> restrictive. >>>>>> The RTL documentation says: >>>>>> For (zero_extract:m loc size pos) "The mode m is the same as the mode >>>>>> that >>>>>> would be used for loc if it were a register." >>>>>> I'm not sure if that means that the mode of the zero_extract and the >>>>>> source >>>>>> register must always match (as is the >>>>>> case after patch [1/3]) but in any case it shouldn't matter >>>>>> semantically >>>>>> since we're taking a QImode subreg of the whole >>>>>> thing anyway. >>>>>> >>>>>> So the proposed solution in this patch is to allow HI, SI and DImode >>>>>> zero_extracts in these patterns as these are the >>>>>> modes that the ext_register_operand predicate accepts, so that the >>>>>> patterns >>>>>> can match the new form above. >>>>>> >>>>>> With this patch the aforementioned test passes again and bootstrap and >>>>>> testing on x86_64-unknown-linux-gnu shows >>>>>> no regressions. >>>>>> >>>>>> Is this ok for trunk if the first patch is accepted? >>>>> Huh, there are many other zero-extract patterns besides cmpqi_ext_* >>>>> with QImode subreg of SImode zero_extract in i386.md, used to access >>>>> high QImode register of HImode pair. A quick grep shows these that >>>>> have _ext_ in their name: >>>>> >>>>> (define_insn "*cmpqi_ext_1" >>>>> (define_insn "*cmpqi_ext_2" >>>>> (define_expand "cmpqi_ext_3" >>>>> (define_insn "*cmpqi_ext_3" >>>>> (define_insn "*cmpqi_ext_4" >>>>> (define_insn "addqi_ext_1" >>>>> (define_insn "*addqi_ext_2" >>>>> (define_expand "testqi_ext_1_ccno" >>>>> (define_insn "*testqi_ext_1" >>>>> (define_insn "*testqi_ext_2" >>>>> (define_insn_and_split "*testqi_ext_3" >>>>> (define_insn "andqi_ext_1" >>>>> (define_insn "*andqi_ext_1_cc" >>>>> (define_insn "*andqi_ext_2" >>>>> (define_insn "*qi_ext_1" >>>>> (define_insn "*qi_ext_2" >>>>> (define_expand "xorqi_ext_1_cc" >>>>> (define_insn "*xorqi_ext_1_cc" >>>>> >>>>> There are also relevant splitters and peephole2 patterns. >>>> I see. Another approach I've looked at is removing the mode specifier >>>> from >>>> the zero_extract in these patterns. This means that they can be of any >>>> mode >>>> so they will match all of these modes without creating new patterns >>>> through >>>> iterators. That also works for the testcase and passes bootstrap and >>>> testing >>>> however there is the snag that the define_insns that don't start with a >>>> "*" >>>> are used to generate RTL through the gen_* mechanism and in that context >>>> the absence of a mode on the zero_extract would mean a VOIDmode >>>> zero_extract >>>> would be created, which I'm fairly sure is not good. So in my >>>> experiments I left >>>> those patterns alone (with an explicit SI on the zero_extract). >>>> >>>>> IIRC, SImode zero_extract was enough to catch all high-register uses. >>>>> There will be a pattern explosion if we want to handle all other >>>>> integer modes here. However, I'm not a RTL expert, so someone will >>>>> have to say what is the correct RTX form here. >>>> Jeff, Richard, could you please give us some guidance on this issue? >>>> Sorry for the trouble. >>>> >>> I don't think any of the patterns above are known to the generic code. >>> So you just have to check the x86 backend to see their precise uses in a >>> generator (ie gen_cmpqi_ext_3) and verify those do not allow a VOIDmode >>> (or any other undesirable mode) to slip through. >>> >>> Jeff >> >> Thanks Jeff, I did have a look. I think we want to maintain the SImode on >> the >> RTL that gets created through these named expanders, as generating a >> VOIDmode >> zero_extract is not valid. So my patch leaves those intact. >> The patch removes the mode from the zero_extract RTXes in patterns that are >> only ever going to get matched (as opposed to generated). That is the ones >> that >> start with "*" in their name. >> This should allow them to match any of the zero_extract modes that >> might get generated by the midend. >> >> Bootstrapped and tested on x86_64-unknown-linux-gnu. >> Is this a preferable approach? > We are trying to avoid VOIDmode RTXes as much as possible (to tighten > pattern matching to avoid various surpsrises). Looking at these > instructions, I think that using SWI248 in insn and peephole2 patterns > should be OK. > > So, if it survives regression tests, the patch is OK with SWI248 mode > instead of void mode. > > Thanks, > Uros. Thanks Uros, here's my proposed patch. If we're adding the mode iterator we also have to adjust the name of the patterns I believe, so this patch does that too. Bootstrapped and tested on x86_64-unknown-linux-gnu. I will commit it if the prerequisite patch at https://gcc.gnu.org/ml/gcc-patches/2018-02/msg00443.html is approved. Thank you for your guidance, Kyrill 2018-02-16 Kyrylo Tkachov PR target/84164 * config/i386/i386.md (*cmpqi_ext_1): Rename to... (*cmpqi_ext_1): ... This. Use SWI248 for zero_extract mode. (*cmpqi_ext_2): Rename to... (*cmpqi_ext_2): ... This. Use SWI248 for zero_extract mode. (*cmpqi_ext_3): Rename to... (*cmpqi_ext_3): ... This. Use SWI248 for zero_extract mode. (*cmpqi_ext_4): Rename to... (*cmpqi_ext_4): ... This. Use SWI248 for zero_extract mode. (*extzvqi_mem_rex64): Rename to... (*extzvqi_mem_rex64): ... This. Use SWI248 for zero_extract mode. (*extzvqi): Rename to.. (*extzvqi): ... This. Use SWI248 for zero_extract mode. (QImode zero_extract peephole): Use SWI248 for zero_extract mode. (*addqi_ext_2): Rename to... (*addqi_ext_2): ... This. Use SWI248 for zero_extract mode. (*testqi_ext_1): Rename to... (*testqi_ext_1): ... This. Use SWI248 for zero_extract mode. (*testqi_ext_2): Rename to... (*testqi_ext_2): ... This. Use SWI248 for zero_extract mode. (*andqi_ext_1_cc): Rename to... (*andqi_ext_1_cc): ... This. Use SWI248 for zero_extract mode. (*andqi_ext_2): Rename to... (*andqi_ext_2): ... This. Use SWI248 for zero_extract mode. (*qi_ext_1): Rename to... (*qi_ext_1): ... This. Use SWI248 for zero_extract mode. (*qi_ext_2): Rename to... (*qi_ext_2): ... This. Use SWI248 for zero_extract mode. (*xorqi_ext_1_cc): Rename to... (*xorqi_ext_1_cc): ... This. Use SWI248 for zero_extract mode. (AND QImode subreg of zero_extract peephole): Use SWI248 for zero_extract mode.