public inbox for gcc@gcc.gnu.org
* Testsuite regular expression question
@ 2009-10-26 23:35 Steve Ellcey
  2009-10-27 10:38 ` Kaveh R. GHAZI
  2009-10-27 12:03 ` Andreas Schwab
  0 siblings, 2 replies; 8+ messages in thread
From: Steve Ellcey @ 2009-10-26 23:35 UTC (permalink / raw)
  To: gcc; +Cc: janis187


I am looking at a failure of gcc.dg/debug/dwarf2/inline2.c on IA64
HP-UX.  The problem I have is with the assembler scan:

/* { dg-final { scan-assembler-times "byte.*?0x3.*? DW_AT_inline" 3 } } */

IA64 HP-UX is using 'data1' instead of 'byte' in the output.  Now that
should be easy to fix and if I change the string to:

/* { dg-final { scan-assembler-times "data1.*?0x3.*? DW_AT_inline" 3 } } */

I do get this test to pass.  But other systems are using 'byte' so of
course I need to allow for either byte or data1.  I have tried:

/* { dg-final { scan-assembler-times "(byte|data1).*?0x3.*? DW_AT_inline" 3 } } */

/* { dg-final { scan-assembler-times "(byte\|data1).*?0x3.*? DW_AT_inline" 3 } } */

/* { dg-final { scan-assembler-times "\(byte\|data1\).*?0x3.*? DW_AT_inline" 3 } } */

And various other escapes, parentheses, etc., but cannot get any of them
to work.  Does anyone know what this RE should look like to allow 'byte'
or 'data1' in the string?  It seems to break as soon as I introduce
parentheses anywhere in the RE.

Steve Ellcey
sje@cup.hp.com


* Re: Testsuite regular expression question
  2009-10-26 23:35 Testsuite regular expression question Steve Ellcey
@ 2009-10-27 10:38 ` Kaveh R. GHAZI
  2009-10-27 17:00   ` Steve Ellcey
  2009-10-27 17:16   ` Steve Ellcey
  2009-10-27 12:03 ` Andreas Schwab
  1 sibling, 2 replies; 8+ messages in thread
From: Kaveh R. GHAZI @ 2009-10-27 10:38 UTC (permalink / raw)
  To: Steve Ellcey; +Cc: gcc, janis187

On Mon, 26 Oct 2009, Steve Ellcey wrote:

> I have tried:
> /* { dg-final { scan-assembler-times "(byte|data1).*?0x3.*? DW_AT_inline" 3 } } */
> /* { dg-final { scan-assembler-times "(byte\|data1).*?0x3.*? DW_AT_inline" 3 } } */
> /* { dg-final { scan-assembler-times "\(byte\|data1\).*?0x3.*? DW_AT_inline" 3 } } */
>
> And various other escapes, parentheses, etc., but cannot get any of them
> to work.  Does anyone know what this RE should look like to allow 'byte'
> or 'data1' in the string?  It seems to break as soon as I introduce
> parentheses anywhere in the RE.

I think you need two backslashes escaping the paren and maybe one before
the pipe.  I'm not sure about the \| vs | though, so give the
double-escaped paren a try both ways.  You can see some examples via:

	find gcc/testsuite | xargs grep 'scan-assembler-times.*[(|]'

		--Kaveh


* Re: Testsuite regular expression question
  2009-10-26 23:35 Testsuite regular expression question Steve Ellcey
  2009-10-27 10:38 ` Kaveh R. GHAZI
@ 2009-10-27 12:03 ` Andreas Schwab
  1 sibling, 0 replies; 8+ messages in thread
From: Andreas Schwab @ 2009-10-27 12:03 UTC (permalink / raw)
  To: sje; +Cc: gcc, janis187

Steve Ellcey <sje@cup.hp.com> writes:

> I do get this test to pass.  But other systems are using 'byte' so of
> course I need to allow for either byte or data1.  I have tried:
>
> /* { dg-final { scan-assembler-times "(byte|data1).*?0x3.*? DW_AT_inline" 3 } } */
>
> /* { dg-final { scan-assembler-times "(byte\|data1).*?0x3.*? DW_AT_inline" 3 } } */
>
> /* { dg-final { scan-assembler-times "\(byte\|data1\).*?0x3.*? DW_AT_inline" 3 } } */

All three should behave exactly the same, since "\(" is identical to
"(" as a Tcl string.

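A minimal sketch of that point, runnable in a plain tclsh.  The variable
names are only for illustration; the last pattern is the doubled-backslash
spelling suggested earlier in the thread, which (as far as I can tell)
reaches the regexp engine as literal parentheses rather than a group.

```tcl
# The three spellings from the question, as Tcl double-quoted strings.
set a "(byte|data1).*?0x3.*? DW_AT_inline"
set b "(byte\|data1).*?0x3.*? DW_AT_inline"
set c "\(byte\|data1\).*?0x3.*? DW_AT_inline"

# A backslash before an ordinary character yields just that character,
# so the strings are already identical before any regexp is compiled.
puts [string equal $a $b]
puts [string equal $a $c]

# Doubling the backslashes keeps one backslash in the string, and the
# regexp engine then treats "\(" and "\|" as literal characters.
set e "\\(byte\\|data1\\)"
puts [string equal $e {\(byte\|data1\)}]
puts [regexp -- $e "byte"]            ;# no literal "(" in the input
puts [regexp -- $e "(byte|data1)"]    ;# matches the literal text only
```

So neither extra escaping nor dropping it changes what the first three
patterns mean; the difference has to come from somewhere else.
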
Andreas.

-- 
Andreas Schwab, schwab@redhat.com
GPG Key fingerprint = D4E8 DBE3 3813 BB5D FA84  5EC7 45C6 250E 6F00 984E
"And now for something completely different."


* Re: Testsuite regular expression question
  2009-10-27 10:38 ` Kaveh R. GHAZI
@ 2009-10-27 17:00   ` Steve Ellcey
  2009-10-27 17:13     ` Andreas Schwab
  2009-10-27 17:16   ` Steve Ellcey
  1 sibling, 1 reply; 8+ messages in thread
From: Steve Ellcey @ 2009-10-27 17:00 UTC (permalink / raw)
  To: Kaveh R. GHAZI; +Cc: gcc, janis187

On Tue, 2009-10-27 at 02:09 -0400, Kaveh R. GHAZI wrote:
> On Mon, 26 Oct 2009, Steve Ellcey wrote:
> 
> > I have tried:
> > /* { dg-final { scan-assembler-times "(byte|data1).*?0x3.*? DW_AT_inline" 3 } } */
> > /* { dg-final { scan-assembler-times "(byte\|data1).*?0x3.*? DW_AT_inline" 3 } } */
> > /* { dg-final { scan-assembler-times "\(byte\|data1\).*?0x3.*? DW_AT_inline" 3 } } */
> >
> > And various other escapes, parentheses, etc., but cannot get any of them
> > to work.  Does anyone know what this RE should look like to allow 'byte'
> > or 'data1' in the string?  It seems to break as soon as I introduce
> > parentheses anywhere in the RE.
> 
> I think you need two backslashes escaping the paren and maybe one before
> the pipe.  I'm not sure about the \| vs | though, so give the
> double-escaped paren a try both ways.  You can see some examples via:
> 
> 	find gcc/testsuite | xargs grep 'scan-assembler-times.*[(|]'
> 
> 		--Kaveh

I tried two backslashes in front of the parens and either one or no
backslashes in front of the pipe.  None of them work.  I did a find
looking for examples and that didn't help.  The most common usage I
found was no backslashes before the parens and one backslash before the
pipe.  But those examples didn't have any .*? expressions in them and I
don't know if that is messing things up somehow because that doesn't
work for me in this case.

Steve Ellcey
sje@cup.hp.com


* Re: Testsuite regular expression question
  2009-10-27 17:00   ` Steve Ellcey
@ 2009-10-27 17:13     ` Andreas Schwab
  2009-10-27 17:22       ` Steve Ellcey
  0 siblings, 1 reply; 8+ messages in thread
From: Andreas Schwab @ 2009-10-27 17:13 UTC (permalink / raw)
  To: sje; +Cc: Kaveh R. GHAZI, gcc, janis187

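The regexp should not use .* in the first place, because "." also
matches the newline, and you need to use the non-capturing variant of
the grouping operator.  With a capturing group, regexp -inline returns
the captured submatch as an extra list element for every match (compare
the last two commands below).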

$ tclsh
% set fd [open "~/src/gcc/gcc/gcc/testsuite/gcc.dg/debug/dwarf2/inline2.s" r]
file3
% set text [read $fd]
[...]
% regexp -inline -all -- "byte.*?0x3.*? DW_AT_inline" $text
{byte	0x3	# Start new file
	.uleb128 0x0	# Included from line number 0
	.uleb128 0x1	# file inline2.c
	.byte	0x1	# Define macro
	.uleb128 0x0	# At line number 0
	.ascii "__STDC__ 1\0"	# The macro
	.byte	0x1	# Define macro
	.uleb128 0x0	# At line number 0
	.ascii "__STDC_HOSTED__ 1\0"	# The macro
	.byte	0x1	# Define macro
	.uleb128 0x0	# At line number 0
	.ascii "__GNUC__ 4\0"	# The macro
	.byte	0x1	# Define macro
	.uleb128 0x0	# At line number 0
[...]
% regexp -inline -all -- "byte\[^\n\]*0x3\[^\n\]* DW_AT_inline" $text
{byte	0x3	# DW_AT_inline} {byte	0x3	# DW_AT_inline} {byte	0x3	# DW_AT_inline}
% regexp -inline -all -- "(byte|data1)\[^\n\]*0x3\[^\n\]* DW_AT_inline" $text
{byte	0x3	# DW_AT_inline} byte {byte	0x3	# DW_AT_inline} byte {byte	0x3	# DW_AT_inline} byte
% regexp -inline -all -- "(?:byte|data1)\[^\n\]*0x3\[^\n\]* DW_AT_inline" $text
{byte	0x3	# DW_AT_inline} {byte	0x3	# DW_AT_inline} {byte	0x3	# DW_AT_inline}
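
The same effect can be reproduced without the .s file.  This is only a
sketch: the sample text and the count_elements helper are made up for
illustration, and it assumes the match count is taken as the length of
the list that "regexp -inline -all" returns, as in the session above.

```tcl
# Hypothetical stand-in for the assembler output: three DW_AT_inline
# lines plus one unrelated .byte line.
set text ".byte\t0x3\t# DW_AT_inline\n.byte\t0x1\t# other\n.byte\t0x3\t# DW_AT_inline\n.byte\t0x3\t# DW_AT_inline\n"

# Hypothetical helper: number of list elements returned by regexp.
proc count_elements {pattern text} {
    return [llength [regexp -inline -all -- $pattern $text]]
}

set capturing    "(byte|data1)\[^\n\]*0x3\[^\n\]* DW_AT_inline"
set noncapturing "(?:byte|data1)\[^\n\]*0x3\[^\n\]* DW_AT_inline"

# Each match of the capturing form contributes the match plus its submatch.
puts [count_elements $capturing $text]      ;# 6
puts [count_elements $noncapturing $text]   ;# 3

# "." matches newline by default, so a greedy ".*" runs across lines.
set m [lindex [regexp -inline -- "byte.*DW_AT_inline" $text] 0]
puts [expr {[string first "\n" $m] >= 0}]   ;# 1
```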

Andreas.

-- 
Andreas Schwab, schwab@redhat.com
GPG Key fingerprint = D4E8 DBE3 3813 BB5D FA84  5EC7 45C6 250E 6F00 984E
"And now for something completely different."


* Re: Testsuite regular expression question
  2009-10-27 10:38 ` Kaveh R. GHAZI
  2009-10-27 17:00   ` Steve Ellcey
@ 2009-10-27 17:16   ` Steve Ellcey
  1 sibling, 0 replies; 8+ messages in thread
From: Steve Ellcey @ 2009-10-27 17:16 UTC (permalink / raw)
  To: Kaveh R. GHAZI, Andreas Schwab; +Cc: gcc, janis187


So it looks like the problem isn't in the pattern matching, it is in the
counting.

If I use this:

/* { dg-final { scan-assembler-times  "data1.*?0x3.*? DW_AT_inline" 3 } } */

Everything works.  If I change it to:

/* { dg-final { scan-assembler-times  "(data1|byte).*?0x3.*? DW_AT_inline" 3 } } */

Then it fails.  But if I change the count from 3 to 2, like this:

/* { dg-final { scan-assembler-times  "(data1|byte).*?0x3.*? DW_AT_inline" 2 } } */

Then it works again.

Now, there are 3 instances of the string in the output file.  I know
that because the first string that I use (with no pipe in it) works, and
from looking at the assembler file by hand.  But for some reason, when I
add the parens to the expression, scan-assembler-times thinks there are
only two instances.  There are 3 when I run it by hand, and all 3 lines
are exactly identical.  This must be a bug in the scan-assembler-times
code somewhere.

Steve Ellcey
sje@cup.hp.com


* Re: Testsuite regular expression question
  2009-10-27 17:13     ` Andreas Schwab
@ 2009-10-27 17:22       ` Steve Ellcey
  2009-10-27 20:18         ` Andreas Schwab
  0 siblings, 1 reply; 8+ messages in thread
From: Steve Ellcey @ 2009-10-27 17:22 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Kaveh R. GHAZI, gcc, janis187

On Tue, 2009-10-27 at 17:59 +0100, Andreas Schwab wrote:
> The regexp should not use .* in the first place, because "." also
> matches the newline, and you need to use the non-capturing variant of
> the grouping operator.

I am not sure what the non-capturing variant of the grouping operator
is.  Is that the '?:' ?   I have never seen that used in a
scan-assembler anywhere else.  But this patch does seem to work for me:

-/* { dg-final { scan-assembler-times "byte.*?0x3.*? DW_AT_inline" 3 } } */
+/* { dg-final { scan-assembler-times  "(?:byte|data1)\[^\n\]*0x3\[^\n\]* DW_AT_inline" 3 } } */

I'll let it run tonight on my other platforms to make sure it works on
more than IA64 HP-UX before submitting it.
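
As a quick sanity check in tclsh, here is a sketch that runs the new
pattern over both directive spellings.  The two sample texts are
hypothetical excerpts (real assembler output may be laid out
differently), and the count is simply the list length from
"regexp -inline -all".

```tcl
set pattern "(?:byte|data1)\[^\n\]*0x3\[^\n\]* DW_AT_inline"

# Hypothetical excerpts, one per directive spelling.
set byte_style  "\t.byte\t0x3\t# DW_AT_inline\n\t.byte\t0x3\t# DW_AT_inline\n\t.byte\t0x3\t# DW_AT_inline\n"
set data1_style "\tdata1\t0x3\t// DW_AT_inline\n\tdata1\t0x3\t// DW_AT_inline\n\tdata1\t0x3\t// DW_AT_inline\n"

# Count the matches for each sample and remember them by name.
foreach {name text} [list byte $byte_style data1 $data1_style] {
    set counts($name) [llength [regexp -inline -all -- $pattern $text]]
    puts [format "%s: %d" $name $counts($name)]
}
```

Both samples should report 3, since the non-capturing group adds no
extra list elements and \[^\n\]* cannot run past the end of a line.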

Steve Ellcey
sje@cup.hp.com


* Re: Testsuite regular expression question
  2009-10-27 17:22       ` Steve Ellcey
@ 2009-10-27 20:18         ` Andreas Schwab
  0 siblings, 0 replies; 8+ messages in thread
From: Andreas Schwab @ 2009-10-27 20:18 UTC (permalink / raw)
  To: sje; +Cc: Kaveh R. GHAZI, gcc, janis187

Steve Ellcey <sje@cup.hp.com> writes:

> I am not sure what the non-capturing variant of the grouping operator
> is. Is that the '?:' ?

Yes, see re_syntax(n).

> I have never seen that used in a scan-assembler anywhere else.

None of them use grouping in the first place (except for a few by accident).

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

