From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
From: Prathamesh Kulkarni
Date: Thu, 23 Nov 2023 17:06:24 +0530
Subject: [aarch64] PR111702 - ICE in insert_regs after interleave+zip1 vector initialization patch
To: Richard Sandiford, gcc Patches
Content-Type: multipart/mixed; boundary="00000000000001ed0a060ad041b1"

--00000000000001ed0a060ad041b1
Content-Type: text/plain; charset="UTF-8"

Hi Richard,
For the test-case mentioned in PR111702, compiling with -O2 -frounding-math
-fstack-protector-all results in the following ICE during the cse2 pass:

test.c: In function 'foo':
test.c:119:1: internal compiler error: in insert_regs, at cse.cc:1120
  119 | }
      | ^
0xb7ebb0 insert_regs
        ../../gcc/gcc/cse.cc:1120
0x1f95134 merge_equiv_classes
        ../../gcc/gcc/cse.cc:1764
0x1f9b9ab cse_insn
        ../../gcc/gcc/cse.cc:4793
0x1f9fe30 cse_extended_basic_block
        ../../gcc/gcc/cse.cc:6577
0x1f9fe30 cse_main
        ../../gcc/gcc/cse.cc:6722
0x1fa0984 rest_of_handle_cse2
        ../../gcc/gcc/cse.cc:7620
0x1fa0984 execute
        ../../gcc/gcc/cse.cc:7675

This happens only with interleave+zip1 vector initialization with
-frounding-math -fstack-protector-all, while it compiles OK without
-fstack-protector-all. Also, it compiles OK with fallback sequence
code-gen (with or without -fstack-protector-all).
Unfortunately, I haven't been able to reduce the test-case further :/

From the test-case, it seems only the vector initializer for type J
uses the interleave+zip1 approach, while the rest of the vector
initializers use the fallback sequence.

J is defined as:
typedef _Float16 __attribute__((__vector_size__ (16))) J;
and the initializer is:
(J) { 11654, 4801, 5535, 9743, 61680}

interleave+zip1 sequence for above initializer J:
mode = V8HF

vals: (parallel:V8HF [
        (reg:HF 642)
        (reg:HF 645)
        (reg:HF 648)
        (reg:HF 651)
        (reg:HF 654)
        (const_double:HF 0.0 [0x0.0p+0]) repeated x3
    ])

target: (reg:V8HF 641)
seq:
(insn 1058 0 1059 (set (reg:V4HF 657)
        (const_vector:V4HF [
                (const_double:HF 0.0 [0x0.0p+0]) repeated x4
            ])) "test.c":81:8 -1
     (nil))

(insn 1059 1058 1060 (set (reg:V4HF 657)
        (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 642))
            (reg:V4HF 657)
            (const_int 1 [0x1]))) "test.c":81:8 -1
     (nil))

(insn 1060 1059 1061 (set (reg:V4HF 657)
        (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 648))
            (reg:V4HF 657)
            (const_int 2 [0x2]))) "test.c":81:8 -1
     (nil))

(insn 1061 1060 1062 (set (reg:V4HF 657)
        (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 654))
            (reg:V4HF 657)
            (const_int 4 [0x4]))) "test.c":81:8 -1
     (nil))

(insn 1062 1061 1063 (set (reg:V4HF 658)
        (const_vector:V4HF [
                (const_double:HF 0.0 [0x0.0p+0]) repeated x4
            ])) "test.c":81:8 -1
     (nil))

(insn 1063 1062 1064 (set (reg:V4HF 658)
        (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 645))
            (reg:V4HF 658)
            (const_int 1 [0x1]))) "test.c":81:8 -1
     (nil))

(insn 1064 1063 1065 (set (reg:V4HF 658)
        (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 651))
            (reg:V4HF 658)
            (const_int 2 [0x2]))) "test.c":81:8 -1
     (nil))

(insn 1065 1064 0 (set (reg:V8HF 641)
        (unspec:V8HF [
                (subreg:V8HF (reg:V4HF 657) 0)
                (subreg:V8HF (reg:V4HF 658) 0)
            ] UNSPEC_ZIP1)) "test.c":81:8 -1
     (nil))

It seems to me that the above sequence correctly initializes the
vector into r641?
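
As a cross-check of the lane order, here is a small standalone C++ model
of the sequence above. It is only a sketch and not GCC code: the names
zip1_model and build_interleaved are made up for illustration, plain ints
stand in for the HF values, and the upper half of each subreg:V8HF operand
is modelled as zero because zip1 does not read it.

```cpp
#include <array>
#include <cstddef>

// Toy model of ZIP1 on 8-lane vectors: interleave the low four lanes
// of `a` and `b`; the high four lanes of both inputs are ignored.
template <typename T>
std::array<T, 8> zip1_model(const std::array<T, 8>& a,
                            const std::array<T, 8>& b) {
    std::array<T, 8> out{};
    for (std::size_t i = 0; i < 4; ++i) {
        out[2 * i] = a[i];      // even lanes come from the first operand
        out[2 * i + 1] = b[i];  // odd lanes come from the second operand
    }
    return out;
}

// Mirrors insns 1058-1065: even-indexed initializer elements go to r657,
// odd-indexed ones to r658, the remaining lanes are zero, then zip1.
std::array<int, 8> build_interleaved(int r642, int r645, int r648,
                                     int r651, int r654) {
    std::array<int, 8> r657 = {r642, r648, r654, 0, 0, 0, 0, 0};
    std::array<int, 8> r658 = {r645, r651, 0, 0, 0, 0, 0, 0};
    return zip1_model(r657, r658);
}
```

Calling build_interleaved (1, 2, 3, 4, 5) in this model yields the five
inputs in their original order followed by three zeros, which is the lane
order the initializer asks for.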
insns 1058-1061 construct r657 = { r642, r648, r654, 0 }
insns 1062-1064 construct r658 = { r645, r651, 0, 0 }
and zip1 will create r641 = { r642, r645, r648, r651, r654, 0, 0, 0 }

For the above test, it seems that with the interleave+zip1 approach and
-fstack-protector-all, in the cse pass, there are two separate equivalence
classes created for (const_int 1), that need to be merged in cse_insn:

   if (elt->first_same_value != src_eqv_elt->first_same_value)
     {
       /* The REG_EQUAL is indicating that two formerly distinct
          classes are now equivalent.  So merge them.  */
       merge_equiv_classes (elt, src_eqv_elt);

elt equivalence chain:
Equivalence chain for (subreg:QI (reg:V16QI 671) 0):
(subreg:QI (reg:V16QI 671) 0)
(const_int 1 [0x1])

src_eqv_elt equivalence chain:
Equivalence chain for (const_int 1 [0x1]):
(reg:QI 34 v2)
(reg:QI 32 v0)
(reg:QI 34 v2)
(const_int 1 [0x1])
(vec_select:QI (reg:V16QI 671)
    (parallel [
            (const_int 1 [0x1])
        ]))
(vec_select:QI (reg:V16QI 32 v0)
    (parallel [
            (const_int 1 [0x1])
        ]))
(vec_select:QI (reg:V16QI 33 v1)
    (parallel [
            (const_int 2 [0x2])
        ]))
(vec_select:QI (reg:V16QI 33 v1)
    (parallel [
            (const_int 1 [0x1])
        ]))

The issue is that merge_equiv_classes doesn't seem to deal correctly with
multiple occurrences of the same register in class2 (src_eqv_elt), which
has two occurrences of (reg:QI 34 v2).

In merge_equiv_classes, on the first iteration, it will remove (reg:QI 34)
from reg_eqv_table by calling delete_reg_equiv (34), and in insert_regs
it will create an entry for (reg:QI 34) in qty_table with a new quantity
number, and create a new equivalence in reg_eqv_table.

When we again come across (reg:QI 34) in class2, it will unconditionally
remove the register from reg_eqv_table, thus making REG_QTY (34) = -35,
even though (reg:QI 34) is now present in the class1 chain.

Then in insert_regs, we have:
x: (reg:QI 34 v2)
classp:
(subreg:QI (reg:V16QI 671) 0)
(reg:QI 34 v2)
(const_int 1 [0x1])

And while iterating over elements in classp, we end up with
regno == c_regno == 34.
However, as mentioned above, merge_equiv_classes has deleted the entry for
(reg:QI 34) from reg_eqv_table, so its quantity is no longer valid, and we
end up hitting the following assert:

    gcc_assert (REGNO_QTY_VALID_P (c_regno));

I am not sure, though, why this is triggered only with the interleave+zip1
approach with -fstack-protector-all.

The attached (untested) patch is a workaround for the above issue.
In merge_equiv_classes, while iterating over elements in class2, it simply
checks whether the element is a reg that is already inserted in class1 with
the same mode, and avoids deleting it from reg_eqv_table in that case.

This avoids hitting the assert, and the following is the result of merging
the above two classes:

Equivalence chain for (subreg:QI (reg:V16QI 671) 0):
(subreg:QI (reg:V16QI 671) 0)
(reg:QI 34 v2)
(reg:QI 32 v0)
(reg:QI 34 v2)
(const_int 1 [0x1])
(const_int 1 [0x1])
(vec_select:QI (reg:V16QI 671)
    (parallel [
            (const_int 1 [0x1])
        ]))
(vec_select:QI (reg:V16QI 33 v1)
    (parallel [
            (const_int 1 [0x1])
        ]))
(vec_select:QI (reg:V16QI 33 v1)
    (parallel [
            (const_int 2 [0x2])
        ]))
(vec_select:QI (reg:V16QI 32 v0)
    (parallel [
            (const_int 1 [0x1])
        ]))

This seems to be OK (?), but I am not sure if this patch is in the right
direction, and the extra walk over class1 for every register in class2 is
also not efficient.
Could you please suggest how to proceed?
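
To make the failure mode easier to follow, here is a heavily simplified
standalone C++ model of the merge. It is only a sketch under stated
assumptions, not the cse.cc code: ToyCse and merge_regs are made-up names,
modes and non-register expressions are ignored, and quantities are plain
counters. The only behaviour borrowed from the description above is that
deleting a register's equivalence sets its quantity to -regno - 1, and
that insert_regs asserts the registers already in the class have a valid
quantity.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for the register/quantity bookkeeping in cse.
struct ToyCse {
    std::vector<int> reg_qty;  // reg_qty[r] < 0 means "no valid quantity"
    int next_qty = 0;

    explicit ToyCse(int nregs) : reg_qty(nregs) {
        for (int r = 0; r < nregs; ++r) reg_qty[r] = -r - 1;
    }
    bool qty_valid(int r) const { return reg_qty[r] >= 0; }
    void delete_reg_equiv(int r) { reg_qty[r] = -r - 1; }
};

// Move the registers of class2 into class1, one at a time.
// With guard == false every register is deleted unconditionally (current
// behaviour as described above); with guard == true the delete is skipped
// when the register is already in class1 (the idea of the workaround).
// Returns false where the assert in insert_regs would fire.
bool merge_regs(ToyCse& t, std::vector<int>& class1,
                const std::vector<int>& class2, bool guard) {
    for (int reg : class2) {
        bool already_in_class1 =
            std::find(class1.begin(), class1.end(), reg) != class1.end();
        if (!(guard && already_in_class1))
            t.delete_reg_equiv(reg);

        // Stand-in for insert_regs: registers already in the class must
        // still have a valid quantity, which the new register then shares.
        int shared_qty = -1;
        for (int c : class1) {
            if (!t.qty_valid(c))
                return false;  // gcc_assert (REGNO_QTY_VALID_P (c_regno))
            shared_qty = t.reg_qty[c];
        }
        if (!t.qty_valid(reg))
            t.reg_qty[reg] = shared_qty >= 0 ? shared_qty : t.next_qty++;
        class1.push_back(reg);
    }
    return true;
}
```

With class2 holding registers 34, 32, 34 and guard == false, the third
step invalidates register 34 while it already sits in class1, so the model
reports the assert. With guard == true the same input merges cleanly and
class1 ends up containing register 34 twice, like the merged chain above.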

Thanks,
Prathamesh

--00000000000001ed0a060ad041b1
Content-Type: text/plain; charset="US-ASCII"; name="gnu-1002-1.txt"
Content-Disposition: attachment; filename="gnu-1002-1.txt"
Content-Transfer-Encoding: base64
Content-ID:
X-Attachment-Id: f_lpb40tu40

ZGlmZiAtLWdpdCBhL2djYy9jc2UuY2MgYi9nY2MvY3NlLmNjCmluZGV4IGY5NjAzZmRmZDQzLi4x
ZTIwYmU0NTdjNCAxMDA2NDQKLS0tIGEvZ2NjL2NzZS5jYworKysgYi9nY2MvY3NlLmNjCkBAIC0x
NzQ3LDcgKzE3NDcsMTYgQEAgbWVyZ2VfZXF1aXZfY2xhc3NlcyAoc3RydWN0IHRhYmxlX2VsdCAq
Y2xhc3MxLCBzdHJ1Y3QgdGFibGVfZWx0ICpjbGFzczIpCiAJICBpZiAoUkVHX1AgKGV4cCkpCiAJ
ICAgIHsKIAkgICAgICBuZWVkX3JlaGFzaCA9IFJFR05PX1FUWV9WQUxJRF9QIChSRUdOTyAoZXhw
KSk7Ci0JICAgICAgZGVsZXRlX3JlZ19lcXVpdiAoUkVHTk8gKGV4cCkpOworCisJICAgICAgLyog
SWYgcmVnIGlzIGFscmVhZHkgaW5zZXJ0ZWQgaW50byBjbGFzczEgYW5kIGhhcyBhIHZhbGlkIG5l
dworCQkgcXVhbnRpdHksIGF2b2lkIGRlbGV0aW5nIGl0IGZyb20gcmVnX2Vxdl90YWJsZS4gICov
CisJICAgICAgdGFibGVfZWx0ICplOworCSAgICAgIGZvciAoZSA9IGNsYXNzMS0+Zmlyc3Rfc2Ft
ZV92YWx1ZTsgZTsgZSA9IGUtPm5leHRfc2FtZV92YWx1ZSkKKwkJaWYgKFJFR19QIChlLT5leHAp
ICYmIFJFR05PIChlLT5leHApID09IFJFR05PIChleHApCisJCSAgICAmJiBlLT5tb2RlID09IG1v
ZGUpCisJCSAgYnJlYWs7CisJICAgICAgaWYgKGUgPT0gTlVMTCkKKwkJZGVsZXRlX3JlZ19lcXVp
diAoUkVHTk8gKGV4cCkpOwogCSAgICB9CiAKIAkgIGlmIChSRUdfUCAoZXhwKSAmJiBSRUdOTyAo
ZXhwKSA+PSBGSVJTVF9QU0VVRE9fUkVHSVNURVIpCg==
--00000000000001ed0a060ad041b1--