Message-Id: <201207311817.q6VIHH8e000866@d06av02.portsmouth.uk.ibm.com>
Subject: Re: [PATCH 0/2] Convert s390 to atomic optabs, v2
To: rth@redhat.com (Richard Henderson)
Date: Tue, 31 Jul 2012 18:36:00 -0000
From: "Ulrich Weigand"
Cc: gcc-patches@gcc.gnu.org, rguenther@suse.de
In-Reply-To:
 <1343687574-3244-1-git-send-email-rth@redhat.com> from "Richard Henderson" at Jul 30, 2012 03:32:52 PM
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
x-cbid: 12073118-1948-0000-0000-0000028C5F79
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-patches-owner@gcc.gnu.org
X-SW-Source: 2012-07/txt/msg01584.txt.bz2

Richard Henderson wrote:

> I've had a go at generating better code in the HQImode CAS
> loop for aligned memory, but I don't know that I'd call it
> the most efficient thing ever.

Thanks for having a look at this!

> (3) Support for IC, and ICM via the insv pattern is lacking.
>     I've added a tiny bit of support here, in the form of using
>     the existing strict_low_part patterns, but most definitely we
>     could do better.

This doesn't look correct:

+  /* Emit a strict_low_part pattern if possible.  */
+  if (bitpos == 0 && GET_MODE_BITSIZE (smode) == bitsize)

With bitpos == 0 we need to insert into the *high* part, not the
low part on a big-endian platform.

This probably causes this incorrect code below:

        icm     %r5,3,0(%r12)

We'd need icm mask 12, not 3, to load into the two upper bytes.

[ This is also probably causing the testing failures I'm seeing
with the patch as-is.  I haven't looked into them in detail yet. ]

> (4) The *sethighpartsi and *sethighpartdi_64 patterns ought to be
>     more different.  As is, we can't insert into bits 48-56 of a
>     DImode quantity, because we don't generate ICM for DImode,
>     only ICMH.
>
> (5) Missing support for RISBGZ in the form of an extv/z expander.
>     The existing *extv/z splitters probably ought to be conditionalized
>     on !Z10.
>
> (6) The strict_low_part patterns should allow registers for at
>     least Z10.  The SImode strict_low_part can use LR everywhere.
>
> (7) RISBGZ could be used for a 3-address constant lshrsi3 before
>     srlk is available.

Good points, agreed with all of that.
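
As an aside on the ICM mask remark above, here is a minimal sketch of how
a byte-aligned field position maps to the 4-bit ICM mask.  It assumes
big-endian bit numbering (bit 0 is the most significant bit of a 32-bit
register) and that the leftmost mask bit selects the leftmost register
byte.  The helper name icm_mask_be is hypothetical; it is not part of the
patch or of GCC.

```c
#include <assert.h>

/* Hypothetical helper, for illustration only: compute the 4-bit ICM mask
   that selects a byte-aligned field of BITSIZE bits starting at bit
   BITPOS of a 32-bit register.  BITPOS uses big-endian numbering, so
   BITPOS == 0 names the most significant bit.  */
unsigned
icm_mask_be (unsigned bitpos, unsigned bitsize)
{
  /* Only whole bytes inside one 32-bit register are handled here.  */
  assert (bitpos % 8 == 0 && bitsize % 8 == 0);
  assert (bitsize > 0 && bitpos + bitsize <= 32);

  unsigned nbytes = bitsize / 8;
  unsigned first = bitpos / 8;          /* 0 = most significant byte */
  unsigned mask = (1u << nbytes) - 1;   /* NBYTES low mask bits set */

  /* Shift left so the top set mask bit lines up with byte FIRST.  */
  return mask << (4 - first - nbytes);
}
```

Under these assumptions icm_mask_be (0, 16) gives 12 and
icm_mask_be (16, 16) gives 3, which is the mask 12 versus mask 3
distinction made above.
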
None of those points ought to be a prerequisite for the atomic patch,
of course ...

>   * Given that we're having to zap the mask in %r1 for the second
>     compare anyway, I wonder if RISBG is really beneficial over OR.
>     Is RISBG (or ICM for that matter) any faster (or even smaller)?

Just a plain OR is preferable to a RISBG.  I guess the point of the
RISBG is that you can avoid the extra shift ...  Now, if that
shift can be moved ahead of the loop, that may not be all that big
of a win.  On the other hand, these loops hopefully don't loop very
often if we don't have a lot of contention ...

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com
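
To illustrate the kind of loop discussed above, here is a rough C sketch
of a halfword compare-and-swap built on a word-sized one, with the shift
and mask computed once ahead of the loop and the replacement merged in
with plain AND/OR.  It uses GCC's __atomic builtins.  The function name
cas_halfword and its interface are hypothetical, and this is not the
code the s390 backend generates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only: compare-and-swap the
   16-bit half IDX of the aligned 32-bit object at WORD, where IDX 0 is
   the upper half of the word's value and IDX 1 the lower half.  On
   failure *EXPECTED is updated to the value that was found.  */
bool
cas_halfword (uint32_t *word, unsigned idx, uint16_t *expected,
              uint16_t desired)
{
  assert (idx <= 1);

  /* Shift and mask are loop-invariant, so compute them up front.  */
  unsigned shift = (1 - idx) * 16;
  uint32_t mask = (uint32_t) 0xffff << shift;
  uint32_t repl = (uint32_t) desired << shift;

  uint32_t old = __atomic_load_n (word, __ATOMIC_RELAXED);
  for (;;)
    {
      uint16_t cur = (uint16_t) ((old & mask) >> shift);
      if (cur != *expected)
        {
          *expected = cur;
          return false;
        }

      /* Merge the new halfword into the untouched neighbour bits.  */
      uint32_t merged = (old & ~mask) | repl;
      if (__atomic_compare_exchange_n (word, &old, merged, false,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST))
        return true;

      /* OLD now holds the current word.  Loop again: either our half
         changed (handled above) or only the neighbour half did.  */
    }
}
```

Without contention the word-sized compare-and-swap succeeds on the first
pass, so the loop body runs once.
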