From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 19550 invoked by alias); 8 Aug 2011 17:12:14 -0000
Received: (qmail 19534 invoked by uid 22791); 8 Aug 2011 17:12:12 -0000
X-SWARE-Spam-Status: No, hits=-2.0 required=5.0 tests=AWL,BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,TW_DC,TW_DD,TW_DQ,TW_OV,TW_PX,TW_VD,TW_ZJ
X-Spam-Check-By: sourceware.org
Received: from mail-yx0-f175.google.com (HELO mail-yx0-f175.google.com) (209.85.213.175) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Mon, 08 Aug 2011 17:11:57 +0000
Received: by yxi19 with SMTP id 19so3072442yxi.20 for ; Mon, 08 Aug 2011 10:11:56 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.142.60.16 with SMTP id i16mr6026738wfa.343.1312823512631; Mon, 08 Aug 2011 10:11:52 -0700 (PDT)
Received: by 10.142.118.41 with HTTP; Mon, 8 Aug 2011 10:11:37 -0700 (PDT)
In-Reply-To: <201108081530.p78FUAgM029764@d06av02.portsmouth.uk.ibm.com>
References: <201108081530.p78FUAgM029764@d06av02.portsmouth.uk.ibm.com>
Date: Mon, 08 Aug 2011 17:12:00 -0000
Message-ID:
Subject: Re: [RFC PATCH, i386]: Allow zero_extended addresses (+ problems with reload and offsetable address, "o" constraint)
From: Uros Bizjak
To: Ulrich Weigand
Cc: gcc-patches@gcc.gnu.org, GCC Development , "H.J.
Lu"
Content-Type: multipart/mixed; boundary=00504502bf461f8a8f04aa018b8b
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-owner@gcc.gnu.org
X-SW-Source: 2011-08/txt/msg00169.txt.bz2

--00504502bf461f8a8f04aa018b8b
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Content-length: 2068

On Mon, Aug 8, 2011 at 5:30 PM, Ulrich Weigand wrote:
> Uros Bizjak wrote:
>
>> Although, it would be nice for reload to subsequently fix CSE'd
>> non-offsetable memory by copying address to temporary reg (*as said in
>> the documentation*), we could simply require an XMM temporary for
>> TImode reloads to/from integer registers, and this fixes ICE for x32.
>
> Moves are special as far as reload is concerned.  If there is already
> a move instruction present *before* reload, it will get fixed up
> according to its constraints as any other instruction.
>
> However, reload will *introduce* new moves as part of its operation,
> and those will *not* themselves get reloaded.  Instead, reload simply
> assumes that every plain move will just succeed without requiring
> any reload; if this is not true, the target *must* provide a
> secondary reload for this move.
>
> (Note that the secondary reload could also work by reloading the
> target address into a temporary; that's up to the target to
> implement.)

Whoa, indeed.

With the attached patch, which reloads the memory address instead of
going through an XMM register, the code for the testcase improves from:

test:
.LFB0:
        .cfi_startproc
        pushq   %rbx
        .cfi_def_cfa_offset 16
        .cfi_offset 3, -16
        sall    $4, %esi
        addl    %edi, %esi
        movdqa  (%esi), %xmm0
        movdqa  %xmm0, -16(%rsp)
        movq    -16(%rsp), %rcx
        movq    -8(%rsp), %rbx
        addq    $1, %rcx
        adcq    $0, %rbx
        movq    %rcx, -16(%rsp)
        sall    $4, %edx
        movq    %rbx, -8(%rsp)
        movdqa  -16(%rsp), %xmm0
        movdqa  %xmm0, (%esi)
        pxor    %xmm0, %xmm0
        movdqa  %xmm0, (%edx,%esi)
        popq    %rbx
        .cfi_def_cfa_offset 8
        ret

to:

test:
.LFB0:
        .cfi_startproc
        sall    $4, %esi
        pushq   %rbx
        .cfi_def_cfa_offset 16
        .cfi_offset 3, -16
        addl    %edi, %esi
        pxor    %xmm0, %xmm0
        mov     %esi, %eax
        movq    (%rax), %rcx
        movq    8(%rax), %rbx
        addq    $1, %rcx
        adcq    $0, %rbx
        sall    $4, %edx
        movq    %rcx, (%rax)
        movq    %rbx, 8(%rax)
        movdqa  %xmm0, (%edx,%esi)
        popq    %rbx
        .cfi_def_cfa_offset 8
        ret

H.J., can you please test the attached patch? This optimization won't
trigger on x86_64 anymore.

Thanks,
Uros.

--00504502bf461f8a8f04aa018b8b
Content-Type: text/plain; charset=US-ASCII; name="z.diff.txt"
Content-Disposition: attachment; filename="z.diff.txt"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_gr3p7d4x0
Content-length: 3604

SW5kZXg6IGNvbmZpZy9pMzg2L2kzODYubWQKPT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PQotLS0gY29uZmlnL2kzODYvaTM4Ni5tZAkocmV2aXNpb24gMTc3NTY1 KQorKysgY29uZmlnL2kzODYvaTM4Ni5tZAkod29ya2luZyBjb3B5KQpAQCAt MjA3Myw2ICsyMDczLDQwIEBACiAgICAgICAgKGNvbnN0X3N0cmluZyAib3Jp ZyIpKSkKICAgIChzZXRfYXR0ciAibW9kZSIgIlNJLERJLERJLERJLFNJLERJ LERJLERJLERJLERJLFRJLERJLFRJLERJLERJLERJLERJLERJIildKQogCis7 OyBSZWxvYWQgcGF0dGVybnMgdG8gc3VwcG9ydCBtdWx0aS13b3JkIGxvYWQv c3RvcmUKKzs7IHdpdGggbm9uLW9mZnNldGFibGUgYWRkcmVzcy4KKyhkZWZp bmVfZXhwYW5kICJyZWxvYWRfbm9mZl9zdG9yZSIKKyAgWyhwYXJhbGxlbCBb KG1hdGNoX29wZXJhbmQgMCAibWVtb3J5X29wZXJhbmQiICI9bSIpCisgICAg ICAgICAgICAgIChtYXRjaF9vcGVyYW5kIDEgInJlZ2lzdGVyX29wZXJhbmQi ICJyIikKKyAgICAgICAgICAgICAgKG1hdGNoX29wZXJhbmQ6REkgMiAicmVn
aXN0ZXJfb3BlcmFuZCIgIj0mciIpXSldCisgICJUQVJHRVRfNjRCSVQiCit7 CisgIHJ0eCBtZW0gPSBvcGVyYW5kc1swXTsKKyAgcnR4IGFkZHIgPSBYRVhQ IChtZW0sIDApOworCisgIGVtaXRfbW92ZV9pbnNuIChvcGVyYW5kc1syXSwg YWRkcik7CisgIG1lbSA9IHJlcGxhY2VfZXF1aXZfYWRkcmVzc19udiAobWVt LCBvcGVyYW5kc1syXSk7CisKKyAgZW1pdF9pbnNuIChnZW5fcnR4X1NFVCAo Vk9JRG1vZGUsIG1lbSwgb3BlcmFuZHNbMV0pKTsKKyAgRE9ORTsKK30pCisK KyhkZWZpbmVfZXhwYW5kICJyZWxvYWRfbm9mZl9sb2FkIgorICBbKHBhcmFs bGVsIFsobWF0Y2hfb3BlcmFuZCAwICJyZWdpc3Rlcl9vcGVyYW5kIiAiPXIi KQorICAgICAgICAgICAgICAobWF0Y2hfb3BlcmFuZCAxICJtZW1vcnlfb3Bl cmFuZCIgIm0iKQorICAgICAgICAgICAgICAobWF0Y2hfb3BlcmFuZDpESSAy ICJyZWdpc3Rlcl9vcGVyYW5kIiAiPXIiKV0pXQorICAiVEFSR0VUXzY0QklU IgoreworICBydHggbWVtID0gb3BlcmFuZHNbMV07CisgIHJ0eCBhZGRyID0g WEVYUCAobWVtLCAwKTsKKworICBlbWl0X21vdmVfaW5zbiAob3BlcmFuZHNb Ml0sIGFkZHIpOworICBtZW0gPSByZXBsYWNlX2VxdWl2X2FkZHJlc3NfbnYg KG1lbSwgb3BlcmFuZHNbMl0pOworCisgIGVtaXRfaW5zbiAoZ2VuX3J0eF9T RVQgKFZPSURtb2RlLCBvcGVyYW5kc1swXSwgbWVtKSk7CisgIERPTkU7Cit9 KQorCiA7OyBDb252ZXJ0IGltcG9zc2libGUgc3RvcmVzIG9mIGltbWVkaWF0 ZSB0byBleGlzdGluZyBpbnN0cnVjdGlvbnMuCiA7OyBGaXJzdCB0cnkgdG8g Z2V0IHNjcmF0Y2ggcmVnaXN0ZXIgYW5kIGdvIHRocm91Z2ggaXQuICBJbiBj YXNlIHRoaXMKIDs7IGZhaWxzLCBtb3ZlIGJ5IDMyYml0IHBhcnRzLgpJbmRl eDogY29uZmlnL2kzODYvaTM4Ni5jCj09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0K LS0tIGNvbmZpZy9pMzg2L2kzODYuYwkocmV2aXNpb24gMTc3NTY2KQorKysg Y29uZmlnL2kzODYvaTM4Ni5jCSh3b3JraW5nIGNvcHkpCkBAIC0yODI0NSwx OCArMjgyNDUsMjUgQEAKIAogc3RhdGljIHJlZ19jbGFzc190CiBpeDg2X3Nl Y29uZGFyeV9yZWxvYWQgKGJvb2wgaW5fcCwgcnR4IHgsIHJlZ19jbGFzc190 IHJjbGFzcywKLQkJICAgICAgIGVudW0gbWFjaGluZV9tb2RlIG1vZGUsCi0J CSAgICAgICBzZWNvbmRhcnlfcmVsb2FkX2luZm8gKnNyaSBBVFRSSUJVVEVf VU5VU0VEKQorCQkgICAgICAgZW51bSBtYWNoaW5lX21vZGUgbW9kZSwgc2Vj b25kYXJ5X3JlbG9hZF9pbmZvICpzcmkpCiB7CiAgIC8qIERvdWJsZS13b3Jk IHNwaWxscyBmcm9tIGdlbmVyYWwgcmVnaXN0ZXJzIHRvIG5vbi1vZmZzZXR0 YWJsZSBtZW1vcnkKLSAgICAgcmVmZXJlbmNlcyAoemVyby1leHRlbmRlZCBh 
ZGRyZXNzZXMpIGdvIHRocm91Z2ggWE1NIHJlZ2lzdGVyLiAgKi8KKyAgICAg cmVmZXJlbmNlcyAoemVyby1leHRlbmRlZCBhZGRyZXNzZXMpIHJlcXVpcmUg c3BlY2lhbCBoYW5kbGluZy4gICovCiAgIGlmIChUQVJHRVRfNjRCSVQKICAg ICAgICYmIE1FTV9QICh4KQogICAgICAgJiYgR0VUX01PREVfU0laRSAobW9k ZSkgPiBVTklUU19QRVJfV09SRAogICAgICAgJiYgcmNsYXNzID09IEdFTkVS QUxfUkVHUwogICAgICAgJiYgIW9mZnNldHRhYmxlX21lbXJlZl9wICh4KSkK LSAgICByZXR1cm4gU1NFX1JFR1M7CisgICAgeworICAgICAgc3JpLT5pY29k ZSA9IChpbl9wCisJCSAgICA/IENPREVfRk9SX3JlbG9hZF9ub2ZmX2xvYWQK KwkJICAgIDogQ09ERV9GT1JfcmVsb2FkX25vZmZfc3RvcmUpOworICAgICAg LyogQWRkIHRoZSBjb3N0IG9mIG1vdmUgdG8gYSB0ZW1wb3JhcnkuICAqLwor ICAgICAgc3JpLT5leHRyYV9jb3N0ID0gMTsKIAorICAgICAgcmV0dXJuIE5P X1JFR1M7CisgICAgfQorCiAgIC8qIFFJbW9kZSBzcGlsbHMgZnJvbSBub24t UUkgcmVnaXN0ZXJzIHJlcXVpcmUKICAgICAgaW50ZXJtZWRpYXRlIHJlZ2lz dGVyIG9uIDMyYml0IHRhcmdldHMuICAqLwogICBpZiAoIVRBUkdFVF82NEJJ VAo= --00504502bf461f8a8f04aa018b8b--
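
Because z.diff.txt is base64-encoded, the change it makes to the secondary reload hook is not visible in the message text. The sketch below is a toy model in plain C, not GCC code, of the decision the patched ix86_secondary_reload appears to make. Every type and name in it (reload_request, reload_decision, decide_before_patch, decide_after_patch, the MODEL_* enumerators) is a hypothetical stand-in invented for illustration. Only the condition and the two outcomes are taken from the attached diff: an SSE intermediate register before the patch, and an address-reloading pattern with an extra cost of 1 after it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types and names are hypothetical, not GCC's. */
enum model_reg_class { MODEL_NO_REGS, MODEL_GENERAL_REGS, MODEL_SSE_REGS };
enum model_pattern { MODEL_NO_PATTERN, MODEL_NOFF_LOAD, MODEL_NOFF_STORE };

struct reload_request {
    bool target_64bit;            /* stands in for TARGET_64BIT */
    bool is_mem;                  /* stands in for MEM_P (x) */
    size_t mode_size;             /* stands in for GET_MODE_SIZE (mode) */
    size_t word_size;             /* stands in for UNITS_PER_WORD */
    enum model_reg_class rclass;  /* class of the register being moved */
    bool offsettable;             /* stands in for offsettable_memref_p (x) */
    bool in_p;                    /* true: memory -> register (a load) */
};

struct reload_decision {
    enum model_reg_class intermediate; /* class of intermediate register */
    enum model_pattern pattern;        /* special reload pattern, if any */
    int extra_cost;                    /* additional cost of the reload */
};

/* The condition shared by the old and new code paths: a multi-word move
   between general registers and memory whose address cannot take an
   added offset. */
static bool is_multiword_noff_access(const struct reload_request *req)
{
    assert(req != NULL);
    return req->target_64bit
           && req->is_mem
           && req->mode_size > req->word_size
           && req->rclass == MODEL_GENERAL_REGS
           && !req->offsettable;
}

/* Before the patch: route the value through an SSE (XMM) register. */
struct reload_decision decide_before_patch(const struct reload_request *req)
{
    struct reload_decision d = { MODEL_NO_REGS, MODEL_NO_PATTERN, 0 };
    if (is_multiword_noff_access(req))
        d.intermediate = MODEL_SSE_REGS;
    return d;
}

/* After the patch: no intermediate class; instead select a pattern that
   copies the address into a temporary register, at an extra cost of 1. */
struct reload_decision decide_after_patch(const struct reload_request *req)
{
    struct reload_decision d = { MODEL_NO_REGS, MODEL_NO_PATTERN, 0 };
    if (is_multiword_noff_access(req)) {
        d.pattern = req->in_p ? MODEL_NOFF_LOAD : MODEL_NOFF_STORE;
        d.extra_cost = 1;
    }
    return d;
}
```

For a 16-byte (TImode-sized) load on a target with 8-byte words and a non-offsettable address, decide_before_patch reports MODEL_SSE_REGS, while decide_after_patch reports MODEL_NOFF_LOAD with extra_cost 1. That corresponds to the difference between the two assembly listings in the message: the second one copies the address into %eax and then uses plain movq at (%rax) and 8(%rax).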