From: Richard Sandiford
To: "H.J. Lu"
Cc: gcc-patches@gcc.gnu.org, law@redhat.com, rguenther@suse.de
Subject: Re: PATCH: middle-end/58981: movmem/setmem use mode wider than Pmode for size
Date: Mon, 04 Nov 2013 11:18:00 -0000
In-Reply-To: <20131104103216.GA13798@lucon.org> (H. J. Lu's message of "Mon, 4 Nov 2013 02:32:16 -0800")
References: <20131104103216.GA13798@lucon.org>
Message-ID: <87fvrcjuac.fsf@sandifor-thinkpad.stglab.manchester.uk.ibm.com>

"H.J.
Lu" writes:
> emit_block_move_via_movmem and set_storage_via_setmem have
>
>   for (mode = GET_CLASS_NARROWEST_MODE (MODE_INT); mode != VOIDmode;
>        mode = GET_MODE_WIDER_MODE (mode))
>     {
>       enum insn_code code = direct_optab_handler (movmem_optab, mode);
>
>       if (code != CODE_FOR_nothing
>           /* We don't need MODE to be narrower than BITS_PER_HOST_WIDE_INT
>              here because if SIZE is less than the mode mask, as it is
>              returned by the macro, it will definitely be less than the
>              actual mode mask.  */
>           && ((CONST_INT_P (size)
>                && ((unsigned HOST_WIDE_INT) INTVAL (size)
>                    <= (GET_MODE_MASK (mode) >> 1)))
>               || GET_MODE_BITSIZE (mode) >= BITS_PER_WORD))
>         {
>
> Backend may assume mode of size in movmem and setmem expanders is no
> widder than Pmode since size is within the Pmode address space. X86
> backend expand_set_or_movmem_prologue_epilogue_by_misaligned has
>
>   rtx saveddest = *destptr;
>
>   gcc_assert (desired_align <= size);
>   /* Align destptr up, place it to new register.  */
>   *destptr = expand_simple_binop (GET_MODE (*destptr), PLUS, *destptr,
>                                   GEN_INT (prolog_size),
>                                   NULL_RTX, 1, OPTAB_DIRECT);
>   *destptr = expand_simple_binop (GET_MODE (*destptr), AND, *destptr,
>                                   GEN_INT (-desired_align),
>                                   *destptr, 1, OPTAB_DIRECT);
>   /* See how many bytes we skipped.  */
>   saveddest = expand_simple_binop (GET_MODE (*destptr), MINUS, saveddest,
>                                    *destptr,
>                                    saveddest, 1, OPTAB_DIRECT);
>   /* Adjust srcptr and count.  */
>   if (!issetmem)
>     *srcptr = expand_simple_binop (GET_MODE (*srcptr), MINUS, *srcptr, saveddest,
>                                    *srcptr, 1, OPTAB_DIRECT);
>   *count = expand_simple_binop (GET_MODE (*count), PLUS, *count,
>                                 saveddest, *count, 1, OPTAB_DIRECT);
>
> saveddest is a negative number in Pmode and *count is in word_mode. For
> x32, when Pmode is SImode and word_mode is DImode, saveddest + *count
> leads to overflow. We could fix it by using mode of saveddest to compute
> saveddest + *count. But it leads to extra conversions and other backends
> may run into the same problem. A better fix is to limit mode of size in
> movmem and setmem expanders to Pmode. It generates better and correct
> memcpy and memset for x32.
>
> There is also a typo in comments. It should be BITS_PER_WORD, not
> BITS_PER_HOST_WIDE_INT.

I don't think it's a typo. It's explaining why we don't have to worry about:

    GET_MODE_BITSIZE (mode) > BITS_PER_HOST_WIDE_INT

in the CONST_INT_P test (because in that case the GET_MODE_MASK macro
will be an all-1 HOST_WIDE_INT, even though that's narrower than the
real mask).

I don't think the current comment covers the BITS_PER_WORD test at all.
AIUI it's there because the pattern is defined as taking a length of
at most word_mode, so we should stop once we reach it.

FWIW, I agree Pmode makes more sense on face value. But shouldn't
we replace the BITS_PER_WORD test instead of adding to it? Having both
would only make a difference for Pmode > word_mode targets, which might
be able to handle full Pmode lengths.

Either way, the md.texi documentation should be updated too.

Thanks,
Richard
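
For reference, the point about the CONST_INT_P test can be sketched
outside GCC. This is a minimal standalone model, not GCC code: the names
hwi, HWI_BITS, mode_mask and const_size_fits are hypothetical stand-ins,
and HOST_WIDE_INT is assumed to be 64 bits. It shows that a mask which
saturates at an all-ones HOST_WIDE_INT makes the size check stricter than
the real mask would, never more permissive.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for GCC's unsigned HOST_WIDE_INT; assumed to be
   64 bits wide in this sketch.  */
typedef uint64_t hwi;
#define HWI_BITS 64

/* Hypothetical model of GET_MODE_MASK for a mode of BITS bits: the low
   BITS bits set, saturating at an all-ones hwi once the mode is at
   least as wide as hwi.  */
static hwi
mode_mask (unsigned bits)
{
  assert (bits > 0);
  return bits < HWI_BITS ? ((hwi) 1 << bits) - 1 : ~(hwi) 0;
}

/* Model of the CONST_INT_P arm of the test: accept SIZE only if it fits
   in the non-negative half of the (possibly saturated) mode mask.  */
static int
const_size_fits (hwi size, unsigned bits)
{
  return size <= (mode_mask (bits) >> 1);
}
```

Under these assumptions const_size_fits (127, 8) accepts and
const_size_fits (128, 8) rejects. For a 128-bit mode the mask saturates,
so const_size_fits ((hwi) 1 << 63, 128) rejects a size that the real
128-bit mask would accept, which is the conservative direction.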