public inbox for libc-alpha@sourceware.org
* ffix required
From: Alejandro Colomar @ 2022-05-24 21:02 UTC
  To: Florian Weimer, GNU C Library


Hi Florian,

I was testing my regex-based program for finding C source code, and 
accidentally found some incorrectly indented braces in glibc that broke 
my regex.

In the output below, see how it matches two consecutive structures as 
one (the first one incorrectly), because the regex looks for opening and 
closing braces at the same indentation level.

It may be good to check if this happens elsewhere too.
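
One way to look for other occurrences is a small line-based scan. The
following is only a rough sketch, not part of grepc: the function name
mismatched_braces is hypothetical, and it assumes that an opening brace
is the last character on its line and a closing brace is the first
non-blank character on its line. It ignores comments, strings, and
one-line "{ ... }" initializers, so it can report false positives.

```python
def mismatched_braces(source):
    """Return (open_line, close_line) pairs whose brace indentation differs.

    Heuristic sketch: a line ending in '{' opens a block at that line's
    indentation; a line starting with '}' closes the most recent block.
    """
    stack = []       # (line number, leading whitespace) of each unmatched '{'
    mismatches = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        indent = line[: len(line) - len(line.lstrip())]
        # Handle '}' first so that lines like '} else {' pop, then push.
        if stripped.startswith("}") and stack:
            open_line, open_indent = stack.pop()
            if open_indent != indent:
                mismatches.append((open_line, lineno))
        if stripped.endswith("{"):
            stack.append((lineno, indent))
    return mismatches


if __name__ == "__main__":
    import sys

    # Usage: python3 check_braces.py FILE...
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as f:
            for open_line, close_line in mismatched_braces(f.read()):
                print(f"{path}:{open_line}: '{{' does not line up "
                      f"with '}}' on line {close_line}")
```

Comparing the leading whitespace as strings (rather than as column
counts) also flags a block opened with spaces and closed with a tab,
which may or may not be wanted.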

Cheers,

Alex


$ grepc ucontext_t ./sysdeps/unix/sysv/linux/x86/sys/ucontext.h
./sysdeps/unix/sysv/linux/x86/sys/ucontext.h:133:
typedef struct
   {
     gregset_t __ctx(gregs);
     /* Note that fpregs is a pointer.  */
     fpregset_t __ctx(fpregs);
     __extension__ unsigned long long __reserved1 [8];
} mcontext_t;

/* Userlevel context.  */
typedef struct ucontext_t
   {
     unsigned long int __ctx(uc_flags);
     struct ucontext_t *uc_link;
     stack_t uc_stack;
     mcontext_t uc_mcontext;
     sigset_t uc_sigmask;
     struct _libc_fpstate __fpregs_mem;
     __extension__ unsigned long long int __ssp[4];
   } ucontext_t;


./sysdeps/unix/sysv/linux/x86/sys/ucontext.h:247:
typedef struct ucontext_t
   {
     unsigned long int __ctx(uc_flags);
     struct ucontext_t *uc_link;
     stack_t uc_stack;
     mcontext_t uc_mcontext;
     sigset_t uc_sigmask;
     struct _libc_fpstate __fpregs_mem;
     unsigned long int __ssp[4];
   } ucontext_t;


-- 
Alejandro Colomar
<http://www.alejandro-colomar.es/>


* Re: ffix required
From: Florian Weimer @ 2022-05-25 12:01 UTC
  To: Alejandro Colomar; +Cc: GNU C Library

* Alejandro Colomar:

> $ grepc ucontext_t ./sysdeps/unix/sysv/linux/x86/sys/ucontext.h
> ./sysdeps/unix/sysv/linux/x86/sys/ucontext.h:133:
> typedef struct
>   {
>     gregset_t __ctx(gregs);
>     /* Note that fpregs is a pointer.  */
>     fpregset_t __ctx(fpregs);
>     __extension__ unsigned long long __reserved1 [8];
> } mcontext_t;

What would you do if this turned into this?

typedef struct {
    gregset_t __ctx(gregs);
    /* Note that fpregs is a pointer.  */
    fpregset_t __ctx(fpregs);
    __extension__ unsigned long long __reserved1 [8];
} mcontext_t;

It's probably not how we'd fix this (it's not really GNU style), but
other projects might use that.

Thanks,
Florian



* Re: ffix required
From: Alejandro Colomar @ 2022-05-25 13:30 UTC
  To: Florian Weimer; +Cc: GNU C Library


Hi, Florian!

On 5/25/22 14:01, Florian Weimer wrote:
> * Alejandro Colomar:
> 
>> $ grepc ucontext_t ./sysdeps/unix/sysv/linux/x86/sys/ucontext.h
>> ./sysdeps/unix/sysv/linux/x86/sys/ucontext.h:133:
>> typedef struct
>>    {
>>      gregset_t __ctx(gregs);
>>      /* Note that fpregs is a pointer.  */
>>      fpregset_t __ctx(fpregs);
>>      __extension__ unsigned long long __reserved1 [8];
>> } mcontext_t;
> 
> What would you do if this turned into this?
> 
> typedef struct {
>      gregset_t __ctx(gregs);
>      /* Note that fpregs is a pointer.  */
>      fpregset_t __ctx(fpregs);
>      __extension__ unsigned long long __reserved1 [8];
> } mcontext_t;

That is covered.  The regex is:

'(?s)^[ \t]*typedef\s+(struct|union|enum)\b(?:(?!\W'"$1"'\W)([\w \t[\]]|::))*\n*([ \t]*){(?:(?!^\3?}).)*?^\3}\s*'"$1"'(\[[\w\(,\)]\])*;'

with the relevant part being '\n*([ \t]*){(?:(?!^\3?}).)*?^\3}'

'\3', which refers to the group '([ \t]*)', is empty when the opening 
brace is on the same line as '(struct|union|enum)'.  So, if the opening 
brace is on that line, the closing brace must have no leading whitespace 
for this regex to match.
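
To illustrate the backreference, here is a minimal sketch in Python. It
is not the grepc regex: it is a simplified pattern that only handles
'typedef struct', and the function name find_typedef_struct is
hypothetical. It assumes Python's re module with DOTALL and MULTILINE
set, so that '.' crosses newlines and '^' matches at every line start.
Group 1 plays the role that '\3' plays above.

```python
import re


def find_typedef_struct(source, name):
    """Return the text of 'typedef struct ... name;' or None.

    Simplified sketch: the closing brace must start a line and carry
    exactly the indentation captured before the opening brace.
    """
    pattern = (
        r"^[ \t]*typedef\s+struct\b[^\n{;]*\n*"  # header and optional tag
        r"([ \t]*)\{"                             # group 1: indentation of '{'
        r"(?:(?!^\1?\}).)*?"                      # body: stop before a line-initial '}'
        r"^\1\}\s*" + re.escape(name) + r";"      # '}' with the same indentation
    )
    m = re.search(pattern, source, re.S | re.M)
    return m.group(0) if m else None


if __name__ == "__main__":
    gnu_style = "typedef struct\n  {\n    int a;\n  } good_t;\n"
    same_line = "typedef struct {\n    int a;\n} kr_t;\n"
    print(find_typedef_struct(gnu_style, "good_t"))
    print(find_typedef_struct(same_line, "kr_t"))
```

With the brace on the 'typedef struct' line, group 1 captures nothing,
so only a closing brace in column zero can end the match; an indented
closing brace makes the function return None.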

You can find the full program source code here, if you're 
curious/interested (it's a small sh(1) script):

<http://www.alejandro-colomar.es/src/alx/alx/grepc.git/tree/bin/grepc>
<git://www.alejandro-colomar.es/src/alx/alx/grepc.git>

> 
> It's probably not how we'd fix this (it's not really GNU style), but
> other projects might use that.

The regex isn't perfect, and probably there's no perfect regex for 
searching C code, but I support "sane" coding standards (or I attempt to).

The only widely used (although luckily not so much these days) style 
that I'm aware of and that my regexes can't find is K&R-style function 
definitions, i.e.:

foo(x, y)
     int x;
     long y;
{
     return x;
}

But I can live with it.


Cheers,

Alex


-- 
Alejandro Colomar
<http://www.alejandro-colomar.es/>
