Message-ID: <4F69022E.6070400@mentor.com>
Date: Tue, 20 Mar 2012 22:19:00 -0000
From: Wade Farnsworth
To: Josh Stone
CC: Mark Wielaard
Subject: Re: [PATCH] PR13475: Fix ARM SDT_V3 operand parsing
References: <4F58C7E0.70008@mentor.com> <4F59AE29.7020903@redhat.com>
In-Reply-To: <4F59AE29.7020903@redhat.com>

Josh Stone wrote:
> On 03/08/2012 06:53 AM, Wade Farnsworth wrote:
>> * Allow for whitespace in ARM operands containing []'s
>
> IIRC, this is sort of a latent issue on all archs. While the compiler
> tends not to add spaces on e.g. x86, SDT probes in hand-written asm
> could very well have spaces in the arguments. The only real separator
> we have is the '@' in each SIZE@LOCATION. I think Roland chose '@'
> specifically for the belief that it wouldn't ever appear in the actual
> location asm string.
>
> I'm not sure if regex matching would make this easier or not, but it's
> something like: ([+-]?\d+@[^@]+?)(\s+[+-]?\d+@[^@]+?)*
>
> Or just iteratively, find('@'), rfind(' '), split, repeat.
>
> The actual handling for the location will always be arch specific, of
> course, like dealing with ARM's [].
>

It appears that operand parsing on ARM fails with V2 and V1 probes as
well, which, as I understand it, don't have the SIZE@ notation. So your
suggestion won't work in the generic case.

With that in mind, it may be beneficial to identify where the
non-delimiting whitespace can occur. On ARM, I only see non-delimiting
whitespace after a comma. I don't believe that any architecture could
have a comma as the last character of a token (correct me if I'm
wrong), so it would be simple enough to implement a version of
tokenize() that treats a ", " sequence as non-delimiting.

Does this sound like a reasonable approach? Would there be any other
such exceptions that we should be detecting?

Thanks,

Wade
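
A minimal sketch of the comma-aware tokenizing idea described above, written in Python for brevity rather than as a patch against the real tokenize(). The function name tokenize_operands and the sample operand strings are hypothetical, and the sketch relies on the assumption stated above that no token ends with a comma. It also drops the whitespace after a comma instead of preserving it, which is a simplification and not something the thread settles.

```python
def tokenize_operands(arg_string):
    """Split an SDT argument string on whitespace, except after a comma.

    Hypothetical helper: whitespace that directly follows a comma is
    treated as part of the current operand (and dropped), so an ARM
    operand such as "[r3, #4]" stays in one token.
    """
    tokens = []
    current = ""
    for ch in arg_string:
        if ch.isspace():
            if current.endswith(","):
                # Non-delimiting whitespace: still inside an operand.
                continue
            if current:
                # Delimiting whitespace: close the current token.
                tokens.append(current)
                current = ""
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens


# Sample inputs are made up for illustration.
v3_args = tokenize_operands("4@[r3, #4] -4@r0")   # with SIZE@ prefixes
v1_args = tokenize_operands("[sp, #8] r1")        # without SIZE@ prefixes
```

Because the rule looks only at the comma and not at '@', the same function covers argument strings with and without the SIZE@ prefix. Whitespace in any other position, for example in hand-written asm, would still split a token.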