From: "juzhe.zhong at rivai dot ai"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c/111401] New: Middle-end: Missed optimization of MASK_LEN_FOLD_LEFT_PLUS
Date: Wed, 13 Sep 2023 09:31:51 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111401

            Bug ID: 111401
           Summary: Middle-end: Missed optimization of MASK_LEN_FOLD_LEFT_PLUS
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: juzhe.zhong at rivai dot ai
  Target Milestone: ---

Here is a case where I think the loop vectorizer misses an optimization:

https://godbolt.org/z/x5sjdenhM

double
foo2 (double *__restrict a,
      double init,
      int *__restrict cond,
      int n)
{
  for (int i = 0; i < n; i++)
    if (cond[i])
      init += a[i];
  return init;
}

It generates the following GIMPLE IR:

  _60 = .SELECT_VL (ivtmp_58, 4);
  ...
  vect__ifc__35.14_56 = .VCOND_MASK (mask__23.10_50, vect__8.13_54, { 0.0, 0.0, 0.0, 0.0 });
  _36 = .MASK_LEN_FOLD_LEFT_PLUS (init_20, vect__ifc__35.14_56, { -1, -1, -1, -1 }, _60, 0);

The mask operand of MASK_LEN_FOLD_LEFT_PLUS is the dummy all-ones mask
{ -1, -1, ..., -1 }, while the real condition is applied separately by
VCOND_MASK, which replaces the inactive lanes with 0.0.

I think we should forward the mask of VCOND_MASK (mask__23.10_50) into
MASK_LEN_FOLD_LEFT_PLUS. Then the VCOND_MASK can be eliminated.

I don't know where the best place to do this optimization is. Should it be
in match.pd, or in the loop vectorizer code?

Thanks.
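
For illustration only, here is a minimal scalar sketch in C of the two forms discussed above. It is not GCC code: the function names fold_select_then_add and fold_masked are hypothetical, and the arrays v and mask stand in for vect__8.13_54 and mask__23.10_50, with len playing the role of _60. It assumes the fold adds the first len lanes in order.

```c
#include <stddef.h>

/* Hypothetical model of the IR as currently generated:
   VCOND_MASK replaces inactive lanes with 0.0, then the in-order
   fold adds every one of the first LEN lanes (dummy all-ones mask).  */
static double
fold_select_then_add (double init, const double *v,
                      const int *mask, size_t len)
{
  double acc = init;
  for (size_t i = 0; i < len; i++)
    {
      double lane = mask[i] ? v[i] : 0.0; /* models .VCOND_MASK */
      acc += lane;                        /* fold with all-ones mask */
    }
  return acc;
}

/* Hypothetical model of the proposed form: the condition mask is
   forwarded into the fold, so inactive lanes are skipped and no
   separate select is needed.  */
static double
fold_masked (double init, const double *v,
             const int *mask, size_t len)
{
  double acc = init;
  for (size_t i = 0; i < len; i++)
    if (mask[i])
      acc += v[i];
  return acc;
}
```

Both functions add the active lanes in the same order, so they should agree whenever adding 0.0 leaves the accumulator unchanged. One caveat: under IEEE 754 round-to-nearest, -0.0 + 0.0 is +0.0, so the two forms can differ in the sign of a zero result when signed zeros are honored.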