Proposal: format GDB Python files with black

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

* Proposal: format GDB Python files with black
@ 2021-04-26 15:55 Simon Marchi
  2021-04-26 16:21 ` Andrew Burgess
                   ` (5 more replies)
  0 siblings, 6 replies; 28+ messages in thread
From: Simon Marchi @ 2021-04-26 15:55 UTC (permalink / raw)
  To: gdb-patches

Hi all,

A few people I talked to about this and I have good experience with the
tool black to auto-format Python code.  It is simple to use, fast and
reliable (if it detects that it changed the meaning of the program by
mistake, internal-errors out).  I don't think I need to spell out the
advantage of using such a tool, but just in case: it removes all the
overhead of thinking about formatting when writing or reviewing code.

My suggestion would be to format our code with black all at once.  The
typical counter-argument to this is "it will be harder to use
git-blame".  I don't think this is true, because you need to be able to
skip over commits anyway, and it's trivial to skip over a commit when
git-blaming using an interactive tool.  But it's also possible to tell
git to ignore certain commits when git-blaming [2], so we can do that.

I think the output of black is quite readable.  When it does weird
stuff, it's generally because the lines / expressions are two long
anyway, and deserve to be split in multiple lines / expressions.  Here's
a branch that shows how it would look like:

  https://sourceware.org/git/?p=binutils-gdb.git;a=shortlog;h=refs/heads/users/simark/black

If the feedback is overall positive, I'll send a more format patch.

Simon

[1] https://github.com/psf/black
[2] https://github.com/psf/black#migrating-your-code-style-without-ruining-git-blame

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Proposal: format GDB Python files with black
  2021-04-26 15:55 Proposal: format GDB Python files with black Simon Marchi
@ 2021-04-26 16:21 ` Andrew Burgess
  2021-04-26 17:25   ` Simon Marchi
  2021-04-26 17:42   ` Tom Tromey
  2021-04-26 17:34 ` Paul Koning
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 28+ messages in thread
From: Andrew Burgess @ 2021-04-26 16:21 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches

* Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> [2021-04-26 11:55:21 -0400]:

> Hi all,
> 
> A few people I talked to about this and I have good experience with the
> tool black to auto-format Python code.  It is simple to use, fast and
> reliable (if it detects that it changed the meaning of the program by
> mistake, internal-errors out).  I don't think I need to spell out the
> advantage of using such a tool, but just in case: it removes all the
> overhead of thinking about formatting when writing or reviewing code.
> 
> My suggestion would be to format our code with black all at once.  The
> typical counter-argument to this is "it will be harder to use
> git-blame".  I don't think this is true, because you need to be able to
> skip over commits anyway, and it's trivial to skip over a commit when
> git-blaming using an interactive tool.  But it's also possible to tell
> git to ignore certain commits when git-blaming [2], so we can do that.
> 
> I think the output of black is quite readable.  When it does weird
> stuff, it's generally because the lines / expressions are two long
> anyway, and deserve to be split in multiple lines / expressions.  Here's
> a branch that shows how it would look like:

In general, the idea of auto-formatting the code sounds great, but I
would like to understand more about how the process would work.

Right now, if I make a change I edit a file, git add, git commit, and
finally, git push.  At what point does the auto-formatting take place?
Do I need to manually run the tool after editing the file?  Is it
done, well, automatically at some point?  And if it is done
automatically, then at what point in the process does this happen, and
how does this relate to code review?

Thanks,
Andrew

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Proposal: format GDB Python files with black
  2021-04-26 16:21 ` Andrew Burgess
@ 2021-04-26 17:25   ` Simon Marchi
  2021-04-26 17:50     ` Andrew Burgess
  2021-04-26 17:42   ` Tom Tromey
  1 sibling, 1 reply; 28+ messages in thread
From: Simon Marchi @ 2021-04-26 17:25 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gdb-patches

On 2021-04-26 12:21 p.m., Andrew Burgess wrote:
> * Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> [2021-04-26 11:55:21 -0400]:
> 
>> Hi all,
>>
>> A few people I talked to about this and I have good experience with the
>> tool black to auto-format Python code.  It is simple to use, fast and
>> reliable (if it detects that it changed the meaning of the program by
>> mistake, internal-errors out).  I don't think I need to spell out the
>> advantage of using such a tool, but just in case: it removes all the
>> overhead of thinking about formatting when writing or reviewing code.
>>
>> My suggestion would be to format our code with black all at once.  The
>> typical counter-argument to this is "it will be harder to use
>> git-blame".  I don't think this is true, because you need to be able to
>> skip over commits anyway, and it's trivial to skip over a commit when
>> git-blaming using an interactive tool.  But it's also possible to tell
>> git to ignore certain commits when git-blaming [2], so we can do that.
>>
>> I think the output of black is quite readable.  When it does weird
>> stuff, it's generally because the lines / expressions are two long
>> anyway, and deserve to be split in multiple lines / expressions.  Here's
>> a branch that shows how it would look like:
> 
> In general, the idea of auto-formatting the code sounds great, but I
> would like to understand more about how the process would work.
> 
> Right now, if I make a change I edit a file, git add, git commit, and
> finally, git push.  At what point does the auto-formatting take place?
> Do I need to manually run the tool after editing the file?  Is it
> done, well, automatically at some point?  And if it is done
> automatically, then at what point in the process does this happen, and
> how does this relate to code review?

Those are good questions.

At the simplest level, the individual making changes is responsible for
the tool at any time they want, as long as it's before sending the patch
and / or pushing to master.  So it would be pretty much like today,
where we are responsible for formatting the source code correctly,
except that it would now be tool-assisted instead of done mentally.

How to run the tool is left to each individual:

1. By hand on the command line
2. From the editor [1]
3. Using a git-commit hook

If we want to be a bit more sophisticated, we could check-in a git
commit hook that runs black (if available) to check your formatting as
you are committing, if your commit includes changes to Python files.
That doesn't prevents formatting errors from slipping in the repo, but
it's already a good start (and formatting errors slipping in the repo
are not the end of the world).

We don't really have CI right now (especially pre-merge testing), but if
we had, checking the formatting of Python files could be a validation
step.

We do however, have scripts that run server side on push, so it would be
possible to run black server-side to reject commits that would introduce
formatting errors.

Simon

[1] https://black.readthedocs.io/en/stable/editor_integration.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Proposal: format GDB Python files with black
  2021-04-26 15:55 Proposal: format GDB Python files with black Simon Marchi
  2021-04-26 16:21 ` Andrew Burgess
@ 2021-04-26 17:34 ` Paul Koning
  2021-04-26 17:44   ` Simon Marchi
  2021-04-26 17:39 ` Tom Tromey
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 28+ messages in thread
From: Paul Koning @ 2021-04-26 17:34 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches

> On Apr 26, 2021, at 11:55 AM, Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> wrote:
> 
> Hi all,
> 
> A few people I talked to about this and I have good experience with the
> tool black to auto-format Python code.  It is simple to use, fast and
> reliable (if it detects that it changed the meaning of the program by
> mistake, internal-errors out).  I don't think I need to spell out the
> advantage of using such a tool, but just in case: it removes all the
> overhead of thinking about formatting when writing or reviewing code.

In the case of Python, formatting is a core part of writing the code.  Apart from that, I don't see much need for this, what with Emacs editing mechanisms.

I looked at the sample you mentioned.  It's reasonable enough except for the bizarre way of formatting a long parenthesized condition in an if statement (near the start of FrameDecorator.py).  So while I can't see any reason to use this, it seems ok as a tool that others might use if they like to do so.

I would object to having this mandatory.

	paul

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Proposal: format GDB Python files with black
  2021-04-26 15:55 Proposal: format GDB Python files with black Simon Marchi
  2021-04-26 16:21 ` Andrew Burgess
  2021-04-26 17:34 ` Paul Koning
@ 2021-04-26 17:39 ` Tom Tromey
  2021-04-30 16:26   ` Joel Brobecker
  2021-04-26 22:40 ` Lancelot SIX
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 28+ messages in thread
From: Tom Tromey @ 2021-04-26 17:39 UTC (permalink / raw)
  To: Simon Marchi via Gdb-patches

>>>>> "Simon" == Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:

Simon> A few people I talked to about this and I have good experience with the
Simon> tool black to auto-format Python code.

Yeah, +1 from me.  We use it internally here and it has been fine.  And,
it eliminates an entire review step, which is good.

Tom

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Proposal: format GDB Python files with black
  2021-04-26 16:21 ` Andrew Burgess
  2021-04-26 17:25   ` Simon Marchi
@ 2021-04-26 17:42   ` Tom Tromey
  1 sibling, 0 replies; 28+ messages in thread
From: Tom Tromey @ 2021-04-26 17:42 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: Simon Marchi, gdb-patches

>>>>> "Andrew" == Andrew Burgess <andrew.burgess@embecosm.com> writes:

Andrew> Right now, if I make a change I edit a file, git add, git commit, and
Andrew> finally, git push.  At what point does the auto-formatting take place?
Andrew> Do I need to manually run the tool after editing the file?  Is it
Andrew> done, well, automatically at some point?  And if it is done
Andrew> automatically, then at what point in the process does this happen, and
Andrew> how does this relate to code review?

Internally, we do it via the Python 'pre-commit' program.  Then if I
"git commit" in the repository, the formatter is run.
It's easy to run manually as well though.

For gdb I think it would relate in the same way that testing is done:
you just indicate that you did it.

Tom

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Proposal: format GDB Python files with black
  2021-04-26 17:34 ` Paul Koning
@ 2021-04-26 17:44   ` Simon Marchi
  2021-04-26 17:48     ` Simon Marchi
  0 siblings, 1 reply; 28+ messages in thread
From: Simon Marchi @ 2021-04-26 17:44 UTC (permalink / raw)
  To: Paul Koning; +Cc: gdb-patches

On 2021-04-26 1:34 p.m., Paul Koning wrote:> 
> 
>> On Apr 26, 2021, at 11:55 AM, Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> wrote:
>>
>> Hi all,
>>
>> A few people I talked to about this and I have good experience with the
>> tool black to auto-format Python code.  It is simple to use, fast and
>> reliable (if it detects that it changed the meaning of the program by
>> mistake, internal-errors out).  I don't think I need to spell out the
>> advantage of using such a tool, but just in case: it removes all the
>> overhead of thinking about formatting when writing or reviewing code.
> 
> In the case of Python, formatting is a core part of writing the code.  Apart from that, I don't see much need for this, what with Emacs editing mechanisms.
> 
> I looked at the sample you mentioned.  It's reasonable enough except for the bizarre way of formatting a long parenthesized condition in an if statement (near the start of FrameDecorator.py).  So while I can't see any reason to use this, it seems ok as a tool that others might use if they like to do so.
> 
> I would object to having this mandatory.

The thing is that it's difficult for some people to use an auto-format
tool if not everybody use it.  You would always get some spurious
formatting changes, so you would need to hand-edit the patch after that
to avoid adding those.  You don't really save time.  What would save me
(and everybody) some cycles is something we could just run mindlessly.

For clang-format, there is git-clang-format that formats just the lines
touched by a commit, perhaps there's something similar with black.  That
sounds like a good idea, but in my experience you just end up with a
weird mix of styles that is even worse, and it still ends up
reformatting more than what you really changed.

Simon

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Proposal: format GDB Python files with black
  2021-04-26 17:44   ` Simon Marchi
@ 2021-04-26 17:48     ` Simon Marchi
  0 siblings, 0 replies; 28+ messages in thread
From: Simon Marchi @ 2021-04-26 17:48 UTC (permalink / raw)
  To: Paul Koning; +Cc: gdb-patches

On 2021-04-26 1:44 p.m., Simon Marchi via Gdb-patches wrote:> On 2021-04-26 1:34 p.m., Paul Koning wrote:> 
>>
>>> On Apr 26, 2021, at 11:55 AM, Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> wrote:
>>>
>>> Hi all,
>>>
>>> A few people I talked to about this and I have good experience with the
>>> tool black to auto-format Python code.  It is simple to use, fast and
>>> reliable (if it detects that it changed the meaning of the program by
>>> mistake, internal-errors out).  I don't think I need to spell out the
>>> advantage of using such a tool, but just in case: it removes all the
>>> overhead of thinking about formatting when writing or reviewing code.
>>
>> In the case of Python, formatting is a core part of writing the code.  Apart from that, I don't see much need for this, what with Emacs editing mechanisms.
>>
>> I looked at the sample you mentioned.  It's reasonable enough except for the bizarre way of formatting a long parenthesized condition in an if statement (near the start of FrameDecorator.py).  So while I can't see any reason to use this, it seems ok as a tool that others might use if they like to do so.
>>
>> I would object to having this mandatory.
> 
> The thing is that it's difficult for some people to use an auto-format
> tool if not everybody use it.  You would always get some spurious
> formatting changes, so you would need to hand-edit the patch after that
> to avoid adding those.  You don't really save time.  What would save me
> (and everybody) some cycles is something we could just run mindlessly.
> 
> For clang-format, there is git-clang-format that formats just the lines
> touched by a commit, perhaps there's something similar with black.  That
> sounds like a good idea, but in my experience you just end up with a
> weird mix of styles that is even worse, and it still ends up
> reformatting more than what you really changed.

To be clear, I wouldn't really mind if some contributors didn't use it
(for the same reason that it's not the end of the world if formatting
errors are introduced).  If I detect that a file I want to modify is
not black-compliant, I'll just make a preparatory commit "Format foo.py
with black" and make my patch on top of that.  But I would appreciate if
we could do a first pass of formatting all the code, to have a good
baseline.

Simon

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Proposal: format GDB Python files with black
  2021-04-26 17:25   ` Simon Marchi
@ 2021-04-26 17:50     ` Andrew Burgess
  2021-04-26 18:08       ` Simon Marchi
  0 siblings, 1 reply; 28+ messages in thread
From: Andrew Burgess @ 2021-04-26 17:50 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches

* Simon Marchi <simon.marchi@polymtl.ca> [2021-04-26 13:25:37 -0400]:

> On 2021-04-26 12:21 p.m., Andrew Burgess wrote:
> > * Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> [2021-04-26 11:55:21 -0400]:
> > 
> >> Hi all,
> >>
> >> A few people I talked to about this and I have good experience with the
> >> tool black to auto-format Python code.  It is simple to use, fast and
> >> reliable (if it detects that it changed the meaning of the program by
> >> mistake, internal-errors out).  I don't think I need to spell out the
> >> advantage of using such a tool, but just in case: it removes all the
> >> overhead of thinking about formatting when writing or reviewing code.
> >>
> >> My suggestion would be to format our code with black all at once.  The
> >> typical counter-argument to this is "it will be harder to use
> >> git-blame".  I don't think this is true, because you need to be able to
> >> skip over commits anyway, and it's trivial to skip over a commit when
> >> git-blaming using an interactive tool.  But it's also possible to tell
> >> git to ignore certain commits when git-blaming [2], so we can do that.
> >>
> >> I think the output of black is quite readable.  When it does weird
> >> stuff, it's generally because the lines / expressions are two long
> >> anyway, and deserve to be split in multiple lines / expressions.  Here's
> >> a branch that shows how it would look like:
> > 
> > In general, the idea of auto-formatting the code sounds great, but I
> > would like to understand more about how the process would work.
> > 
> > Right now, if I make a change I edit a file, git add, git commit, and
> > finally, git push.  At what point does the auto-formatting take place?
> > Do I need to manually run the tool after editing the file?  Is it
> > done, well, automatically at some point?  And if it is done
> > automatically, then at what point in the process does this happen, and
> > how does this relate to code review?
> 
> Those are good questions.
> 
> At the simplest level, the individual making changes is responsible for
> the tool at any time they want, as long as it's before sending the patch
> and / or pushing to master.  So it would be pretty much like today,
> where we are responsible for formatting the source code correctly,
> except that it would now be tool-assisted instead of done mentally.
> 
> How to run the tool is left to each individual:
> 
> 1. By hand on the command line
> 2. From the editor [1]
> 3. Using a git-commit hook

So without going more sophisticated as you describe below we'd still
be relying on reviewers to either (with enough experience) "just know"
that the code is black formatted, or apply the patch, run black, and
see if the formatting changes.

Then of course, there's the questions about what happens when
different users are running different versions of black - assuming
different versions might format things slightly differently.

Another concern I'd have is that unrelated "formatting" changes might
creep in.

So, I edit a .py file and either don't run black (naughty), or use
version X of black.

Much later, I edit a different part of the same file, now either I run
black, or run version Y of black.  My original edit is reformatted
slightly differently.

My assumption is that this change (the second patch) should be
rejected at review, as the reformatting change (of the earlier code)
is unrelated to the work of the second patch.  But I suspect some folk
might find it frustrating, if on one had we say run black, but on the
other hand we say you need to exclude unrelated chunks.

I think, in the heterogeneous environments we all develop in,
situations like the above will crop up, so it would be important to
know what the expectations would be in such a case.

> 
> If we want to be a bit more sophisticated, we could check-in a git
> commit hook that runs black (if available) to check your formatting as
> you are committing, if your commit includes changes to Python files.
> That doesn't prevents formatting errors from slipping in the repo, but
> it's already a good start (and formatting errors slipping in the repo
> are not the end of the world).

Again, this sort of thing works great in a corporate environment where
you can guarantee that everyone is (or has no excuse not to be)
running the exact same version of every tool.

My concern would be that such a strategy would invite unintended
changes to creep in.

> 
> We don't really have CI right now (especially pre-merge testing), but if
> we had, checking the formatting of Python files could be a validation
> step.

Agreed.

> We do however, have scripts that run server side on push, so it would be
> possible to run black server-side to reject commits that would introduce
> formatting errors.

I guess this would solve a lot of the problems I'm concerned about.
Once we've "corrected" the formatting of all .py code then nobody
could merge bad code.

Thanks for all the info,

Andrew

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Proposal: format GDB Python files with black
  2021-04-26 17:50     ` Andrew Burgess
@ 2021-04-26 18:08       ` Simon Marchi
  2021-04-27  7:54         ` Andrew Burgess
  0 siblings, 1 reply; 28+ messages in thread
From: Simon Marchi @ 2021-04-26 18:08 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gdb-patches

On 2021-04-26 1:50 p.m., Andrew Burgess wrote:
> So without going more sophisticated as you describe below we'd still
> be relying on reviewers to either (with enough experience) "just know"
> that the code is black formatted, or apply the patch, run black, and
> see if the formatting changes.

The use of black would be documented somewhere (in the repo, in the
wiki, or both).  Nevertheless, I don't expect all contributors to know
about it and do it.

I also don't expect reviewers to be able to spot code that isn't
black-compliant.  If you think of checking it or asking about it in a
review, great.  Otherwise, I don't think we should be too afraid of
errors slipping in, since they're trivial to fix.

But to be frank, as a reviewer today, I don't really know what to check
in the Python code formatting.  Our wiki:

  https://sourceware.org/gdb/wiki/Internals%20GDB-Python-Coding-Standards

says that we follow PEP8.  I have no idea how to format PEP8 code, so
as a result I'm not really checking the formatting of Python code today.

> Then of course, there's the questions about what happens when
> different users are running different versions of black - assuming
> different versions might format things slightly differently.

Indeed, I should have mentioned this: we would define a version to use,
just like we do for autoconf/automake.

The goal of black is that the output won't change just for fun, but it
could change due to fixed bugs.

> Another concern I'd have is that unrelated "formatting" changes might
> creep in.
> 
> So, I edit a .py file and either don't run black (naughty), or use
> version X of black.
> 
> Much later, I edit a different part of the same file, now either I run
> black, or run version Y of black.  My original edit is reformatted
> slightly differently.
> 
> My assumption is that this change (the second patch) should be
> rejected at review, as the reformatting change (of the earlier code)
> is unrelated to the work of the second patch.  But I suspect some folk
> might find it frustrating, if on one had we say run black, but on the
> other hand we say you need to exclude unrelated chunks.

This is why I think it's easier to start from a clean state, by
formatting everything in one shot in the beginning.

If the issue you describe happens while I am writing a patch, I would
notice the spurious diff and push an obvious "Format foo.py with
black" commit and rebase my WIP patch on top.

If the issue you describe happens while I am revieweing a patch, I would
probably also push an obvious re-formatting commit (or ask the person to
do so, if they have push access).  The offending hunk will go away when
they rebase, which is necessary prior to pushing.

> I think, in the heterogeneous environments we all develop in,
> situations like the above will crop up, so it would be important to
> know what the expectations would be in such a case.

Indeed.  I hope I was able to show that these situations can be handled
easily and are not a showstopper.

>>
>> If we want to be a bit more sophisticated, we could check-in a git
>> commit hook that runs black (if available) to check your formatting as
>> you are committing, if your commit includes changes to Python files.
>> That doesn't prevents formatting errors from slipping in the repo, but
>> it's already a good start (and formatting errors slipping in the repo
>> are not the end of the world).
> 
> Again, this sort of thing works great in a corporate environment where
> you can guarantee that everyone is (or has no excuse not to be)
> running the exact same version of every tool.

Even in corporate environment, you can't assume that!

> My concern would be that such a strategy would invite unintended
> changes to creep in.
> 
>>
>> We don't really have CI right now (especially pre-merge testing), but if
>> we had, checking the formatting of Python files could be a validation
>> step.
> 
> Agreed.
> 
>> We do however, have scripts that run server side on push, so it would be
>> possible to run black server-side to reject commits that would introduce
>> formatting errors.
> 
> I guess this would solve a lot of the problems I'm concerned about.
> Once we've "corrected" the formatting of all .py code then nobody
> could merge bad code.

Thanks,

Simon

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Proposal: format GDB Python files with black
  2021-04-26 15:55 Proposal: format GDB Python files with black Simon Marchi
                   ` (2 preceding siblings ...)
  2021-04-26 17:39 ` Tom Tromey
@ 2021-04-26 22:40 ` Lancelot SIX
  2021-04-30 17:04   ` Tom Tromey
  2021-04-30 17:21 ` Luis Machado
  2021-05-08 16:00 ` Tom Tromey
  5 siblings, 1 reply; 28+ messages in thread
From: Lancelot SIX @ 2021-04-26 22:40 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches

Hi

I have been using black on a few project and it is overall an enjoyable
experience.  It is also quite nice to have for new contributors with
limited experience with a particular coding style in place.  It gets
this out of the way.

As this has been mentioned earlier in the discussion[1], black output
might change slightly between versions.  I sometimes had some continuous
integration build fail because the version on the development
environment and the one running `black --check` differ.  The differences
are minor but this is  worth keeping in mind if “false positive” are an
issue, for example if it rejects commits in server-side scripts as
mentioned in [2].  That being said, specifying which version to use
solves the problem easily.  Another option can be to use 'black' to
help format the code and 'pycodestyle'[3] to do the automated checks,
but I am not sure this is the way you want to go to, introducing more
and more tools.

I’d also like to point out that by default, 'black' will use 88 char
long lines[4], which is not compliant with the PEP8[5], which is, as you
pointed out[1], referenced in GDB's wiki[6].  Either the tool's
configuration or the standard can be adjusted, depending on the
maintainers preferences.

For what it is worth, I really think black is a great tool that can
simplify development workflow.

Lancelot.

[1] https://sourceware.org/pipermail/gdb-patches/2021-April/178209.html
[2] https://sourceware.org/pipermail/gdb-patches/2021-April/178200.html
[3] https://pypi.org/project/pycodestyle/
[4] https://black.readthedocs.io/en/stable/the_black_code_style.html#line-length
[5] https://www.python.org/dev/peps/pep-0008/#maximum-line-length
[6] https://sourceware.org/gdb/wiki/Internals%20GDB-Python-Coding-Standards

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Proposal: format GDB Python files with black
  2021-04-26 18:08       ` Simon Marchi
@ 2021-04-27  7:54         ` Andrew Burgess
  2021-04-27 13:21           ` Simon Marchi
  0 siblings, 1 reply; 28+ messages in thread
From: Andrew Burgess @ 2021-04-27  7:54 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches

* Simon Marchi <simon.marchi@polymtl.ca> [2021-04-26 14:08:24 -0400]:

> On 2021-04-26 1:50 p.m., Andrew Burgess wrote:
> > So without going more sophisticated as you describe below we'd still
> > be relying on reviewers to either (with enough experience) "just know"
> > that the code is black formatted, or apply the patch, run black, and
> > see if the formatting changes.
> 
> The use of black would be documented somewhere (in the repo, in the
> wiki, or both).  Nevertheless, I don't expect all contributors to know
> about it and do it.
> 
> I also don't expect reviewers to be able to spot code that isn't
> black-compliant.  If you think of checking it or asking about it in a
> review, great.  Otherwise, I don't think we should be too afraid of
> errors slipping in, since they're trivial to fix.
> 
> But to be frank, as a reviewer today, I don't really know what to check
> in the Python code formatting.  Our wiki:
> 
>   https://sourceware.org/gdb/wiki/Internals%20GDB-Python-Coding-Standards
> 
> says that we follow PEP8.  I have no idea how to format PEP8 code, so
> as a result I'm not really checking the formatting of Python code today.
> 
> > Then of course, there's the questions about what happens when
> > different users are running different versions of black - assuming
> > different versions might format things slightly differently.
> 
> Indeed, I should have mentioned this: we would define a version to use,
> just like we do for autoconf/automake.
> 
> The goal of black is that the output won't change just for fun, but it
> could change due to fixed bugs.
> 
> > Another concern I'd have is that unrelated "formatting" changes might
> > creep in.
> > 
> > So, I edit a .py file and either don't run black (naughty), or use
> > version X of black.
> > 
> > Much later, I edit a different part of the same file, now either I run
> > black, or run version Y of black.  My original edit is reformatted
> > slightly differently.
> > 
> > My assumption is that this change (the second patch) should be
> > rejected at review, as the reformatting change (of the earlier code)
> > is unrelated to the work of the second patch.  But I suspect some folk
> > might find it frustrating, if on one had we say run black, but on the
> > other hand we say you need to exclude unrelated chunks.
> 
> This is why I think it's easier to start from a clean state, by
> formatting everything in one shot in the beginning.
> 
> If the issue you describe happens while I am writing a patch, I would
> notice the spurious diff and push an obvious "Format foo.py with
> black" commit and rebase my WIP patch on top.
> 
> If the issue you describe happens while I am revieweing a patch, I would
> probably also push an obvious re-formatting commit (or ask the person to
> do so, if they have push access).  The offending hunk will go away when
> they rebase, which is necessary prior to pushing.
> 
> > I think, in the heterogeneous environments we all develop in,
> > situations like the above will crop up, so it would be important to
> > know what the expectations would be in such a case.
> 
> Indeed.  I hope I was able to show that these situations can be handled
> easily and are not a showstopper.

Thanks for taking the time to reply.

I have no objections to adopting the use of black, my only request
would be that we formally make it a policy that "rogue" re-formatting
hunks, as we discussed above, should be fixed in a separate commit,
and not included in random patches.

For me, one of the great things about working on GDB is the generally
good quality of the commits, and I feel that if commits start
including extra reformatting this would be a step backwards.

> 
> >>
> >> If we want to be a bit more sophisticated, we could check-in a git
> >> commit hook that runs black (if available) to check your formatting as
> >> you are committing, if your commit includes changes to Python files.
> >> That doesn't prevents formatting errors from slipping in the repo, but
> >> it's already a good start (and formatting errors slipping in the repo
> >> are not the end of the world).
> > 
> > Again, this sort of thing works great in a corporate environment where
> > you can guarantee that everyone is (or has no excuse not to be)
> > running the exact same version of every tool.
> 
> Even in corporate environment, you can't assume that!

I disagree, but I suspect the problem is I didn't explain myself well
enough.

Never mind, given the above I think you've answered my questions.

Thanks for your time,
Andrew


> 
> > My concern would be that such a strategy would invite unintended
> > changes to creep in.
> > 
> >>
> >> We don't really have CI right now (especially pre-merge testing), but if
> >> we had, checking the formatting of Python files could be a validation
> >> step.
> > 
> > Agreed.
> > 
> >> We do however, have scripts that run server side on push, so it would be
> >> possible to run black server-side to reject commits that would introduce
> >> formatting errors.
> > 
> > I guess this would solve a lot of the problems I'm concerned about.
> > Once we've "corrected" the formatting of all .py code then nobody
> > could merge bad code.
> 
> Thanks,
> 
> Simon

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Proposal: format GDB Python files with black
  2021-04-27  7:54         ` Andrew Burgess
@ 2021-04-27 13:21           ` Simon Marchi
  0 siblings, 0 replies; 28+ messages in thread
From: Simon Marchi @ 2021-04-27 13:21 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gdb-patches

> I have no objections to adopting the use of black, my only request
> would be that we formally make it a policy that "rogue" re-formatting
> hunks, as we discussed above, should be fixed in a separate commit,
> and not included in random patches.
> 
> For me, one of the great things about working on GDB is the generally
> good quality of the commits, and I feel that if commits start
> including extra reformatting this would be a step backwards.

I agree.  I thought it was kind of obvious, because it's a continuation
of what we do today: if somebody includes an unrelated formatting change
in a patch, you'll tell them about it.  But it's true that the risk of
it happening with an automated is perhaps greater, as people will run
the tool and not carefully check the output.  I happen to carefully
check the diff of my patches before sending them, but maybe not everyone
does that.

So I'll make sure to include that in the "rules to follow" for
re-formatting.

> Never mind, given the above I think you've answered my questions.
> 
> Thanks for your time,

Thanks for the questions, you raised good points I overlooked.

Simon

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Proposal: format GDB Python files with black
  2021-04-26 17:39 ` Tom Tromey
@ 2021-04-30 16:26   ` Joel Brobecker
  0 siblings, 0 replies; 28+ messages in thread
From: Joel Brobecker @ 2021-04-30 16:26 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Simon Marchi via Gdb-patches

> Simon> A few people I talked to about this and I have good experience with the
> Simon> tool black to auto-format Python code.
> 
> Yeah, +1 from me.  We use it internally here and it has been fine.  And,
> it eliminates an entire review step, which is good.

Same for me. And I also recomment the "pre-commit" python program
to manage the pre-commit hooks. Both Tom and I should be able to
help with that, if you like.

-- 
Joel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Proposal: format GDB Python files with black
  2021-04-26 22:40 ` Lancelot SIX
@ 2021-04-30 17:04   ` Tom Tromey
  2021-04-30 17:14     ` Simon Marchi
  0 siblings, 1 reply; 28+ messages in thread
From: Tom Tromey @ 2021-04-30 17:04 UTC (permalink / raw)
  To: Lancelot SIX via Gdb-patches; +Cc: Simon Marchi, Lancelot SIX

>>>>> "Lancelot" == Lancelot SIX via Gdb-patches <gdb-patches@sourceware.org> writes:

Lancelot> Another option can be to use 'black' to
Lancelot> help format the code and 'pycodestyle'[3] to do the automated checks,
Lancelot> but I am not sure this is the way you want to go to, introducing more
Lancelot> and more tools.

Doing some kind of checking like this would be good as well.
We've used pycodestyle internally a bit, but recently (-ish) moved to
flake8.  I don't know why we made the switch or which static checker is
preferable, though.

Lancelot> I’d also like to point out that by default, 'black' will use 88 char
Lancelot> long lines[4], which is not compliant with the PEP8[5], which is, as you
Lancelot> pointed out[1], referenced in GDB's wiki[6].  Either the tool's
Lancelot> configuration or the standard can be adjusted, depending on the
Lancelot> maintainers preferences.

We only really chose PEP8 because it existed; not out of some deeper
liking for it or anything like that.  So personally I'm fine with just
switching to a tool.

Tom

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Proposal: format GDB Python files with black
  2021-04-30 17:04   ` Tom Tromey
@ 2021-04-30 17:14     ` Simon Marchi
  2021-05-01  6:42       ` Joel Brobecker
  0 siblings, 1 reply; 28+ messages in thread
From: Simon Marchi @ 2021-04-30 17:14 UTC (permalink / raw)
  To: Tom Tromey, Lancelot SIX via Gdb-patches; +Cc: Lancelot SIX



On 2021-04-30 1:04 p.m., Tom Tromey wrote:
>>>>>> "Lancelot" == Lancelot SIX via Gdb-patches <gdb-patches@sourceware.org> writes:
> 
> Lancelot> Another option can be to use 'black' to
> Lancelot> help format the code and 'pycodestyle'[3] to do the automated checks,
> Lancelot> but I am not sure this is the way you want to go to, introducing more
> Lancelot> and more tools.
> 
> Doing some kind of checking like this would be good as well.
> We've used pycodestyle internally a bit, but recently (-ish) moved to
> flake8.  I don't know why we made the switch or which static checker is
> preferable, though.


I have used flake8 for a long time, I'm happy with it.  It does find
some real bugs, like a variable being referenced without being assigned,
things like that, so it has real value.

> Lancelot> I’d also like to point out that by default, 'black' will use 88 char
> Lancelot> long lines[4], which is not compliant with the PEP8[5], which is, as you
> Lancelot> pointed out[1], referenced in GDB's wiki[6].  Either the tool's
> Lancelot> configuration or the standard can be adjusted, depending on the
> Lancelot> maintainers preferences.
> 
> We only really chose PEP8 because it existed; not out of some deeper
> liking for it or anything like that.  So personally I'm fine with just
> switching to a tool.

Although PEP8 specifies more than just formatting, like naming
convention and things like that.  So for the wiki, which mentions PEP8
currently, I plan on saying something like "Use black for the
formatting.  For the rest, follow PEP8.".

Let's just use black's default line length.  I certainly don't want to
start a discussion on line length, and I don't see any reason to deviate
from the default.

Simon

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Proposal: format GDB Python files with black
  2021-04-26 15:55 Proposal: format GDB Python files with black Simon Marchi
                   ` (3 preceding siblings ...)
  2021-04-26 22:40 ` Lancelot SIX
@ 2021-04-30 17:21 ` Luis Machado
  2021-05-08 16:00 ` Tom Tromey
  5 siblings, 0 replies; 28+ messages in thread
From: Luis Machado @ 2021-04-30 17:21 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches

On 4/26/21 12:55 PM, Simon Marchi via Gdb-patches wrote:
> Hi all,
> 
> A few people I talked to about this and I have good experience with the
> tool black to auto-format Python code.  It is simple to use, fast and
> reliable (if it detects that it changed the meaning of the program by
> mistake, internal-errors out).  I don't think I need to spell out the
> advantage of using such a tool, but just in case: it removes all the
> overhead of thinking about formatting when writing or reviewing code.
> 
> My suggestion would be to format our code with black all at once.  The
> typical counter-argument to this is "it will be harder to use
> git-blame".  I don't think this is true, because you need to be able to
> skip over commits anyway, and it's trivial to skip over a commit when
> git-blaming using an interactive tool.  But it's also possible to tell
> git to ignore certain commits when git-blaming [2], so we can do that.
> 
> I think the output of black is quite readable.  When it does weird
> stuff, it's generally because the lines / expressions are two long
> anyway, and deserve to be split in multiple lines / expressions.  Here's
> a branch that shows how it would look like:
> 
>    https://sourceware.org/git/?p=binutils-gdb.git;a=shortlog;h=refs/heads/users/simark/black
> 
> If the feedback is overall positive, I'll send a more format patch.
> 
> Simon
> 
> [1] https://github.com/psf/black
> [2] https://github.com/psf/black#migrating-your-code-style-without-ruining-git-blame
> 

FTR, I like this too.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Proposal: format GDB Python files with black
  2021-04-30 17:14     ` Simon Marchi
@ 2021-05-01  6:42       ` Joel Brobecker
  0 siblings, 0 replies; 28+ messages in thread
From: Joel Brobecker @ 2021-05-01  6:42 UTC (permalink / raw)
  To: Simon Marchi via Gdb-patches; +Cc: Tom Tromey, Lancelot SIX

> > Doing some kind of checking like this would be good as well.
> > We've used pycodestyle internally a bit, but recently (-ish) moved to
> > flake8.  I don't know why we made the switch or which static checker is
> > preferable, though.

flake8 does more, as it is a wrapper around pyflakes and pycodestyle.

> 
> I have used flake8 for a long time, I'm happy with it.  It does find
> some real bugs, like a variable being referenced without being assigned,
> things like that, so it has real value.
> 
> > Lancelot> I’d also like to point out that by default, 'black' will use 88 char
> > Lancelot> long lines[4], which is not compliant with the PEP8[5], which is, as you
> > Lancelot> pointed out[1], referenced in GDB's wiki[6].  Either the tool's
> > Lancelot> configuration or the standard can be adjusted, depending on the
> > Lancelot> maintainers preferences.
> > 
> > We only really chose PEP8 because it existed; not out of some deeper
> > liking for it or anything like that.  So personally I'm fine with just
> > switching to a tool.
> 
> Although PEP8 specifies more than just formatting, like naming
> convention and things like that.  So for the wiki, which mentions PEP8
> currently, I plan on saying something like "Use black for the
> formatting.  For the rest, follow PEP8.".
> 
> Let's just use black's default line length.  I certainly don't want to
> start a discussion on line length, and I don't see any reason to deviate
> from the default.

Agreed.

One other thing that I thought about is perhaps we can look at
writing a server-side hook that would be run at commit time which
would run those checks before accepting them. The git-hooks we are
using provide support for adding extra checks via the
"hooks.commit-extra-checker" option. I'm not 100% about how easy
or difficult it would be to write the actual script that would
reproduce the issue, though.

-- 
Joel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Proposal: format GDB Python files with black
  2021-04-26 15:55 Proposal: format GDB Python files with black Simon Marchi
                   ` (4 preceding siblings ...)
  2021-04-30 17:21 ` Luis Machado
@ 2021-05-08 16:00 ` Tom Tromey
  2021-05-11  2:55   ` Simon Marchi
  5 siblings, 1 reply; 28+ messages in thread
From: Tom Tromey @ 2021-05-08 16:00 UTC (permalink / raw)
  To: Simon Marchi via Gdb-patches

>>>>> "Simon" == Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:

Simon> A few people I talked to about this and I have good experience
Simon> with the tool black to auto-format Python code.

Inspired by this, I took another look at clang-format.  If we could use
this, I think it would be a decent productivity improvement, because it
would mean contributors would no longer need to worry about the minutiae
of the GNU style.  We could also stop reviewing this.

My github has my current hacks, on the t/clang-format branch.  This
branch just has .clang-format, I didn't reformat the sources.  I'm using
clang-format 10.0.1.

I managed to tweak the penalty values in the format to avoid the
bin-packing problem I was seeing previously.  I haven't extensively
tested this (I usually only reformat gdb_bfd.c and look at it, since it
exercised several bugs before...), so I recommend that others play with
it a bit.

I did see a few issues, though perhaps they are minor.

1. It removes all the Control-L page breaks from the source.
   I know these often confuse newcomers, so maybe it's fine.
   I feel that Someone out there will cheer this :)

2. clang-format can't handle GNU-style goto label out-denting.
   There's a clang-format bug on this topic
   https://bugs.llvm.org/show_bug.cgi?id=24118

3. In gdb if we have a large 'if' condition that consists of expression
   chained with "&&" or "||", we often put each sub-condition on its own
   line.  clang-format packs them instead, and I didn't see a way to
   counteract this.

4. Gettext invocations like _("text") have a space added, like _ ("text").
   I didn't see a way to change this.

Another possible problem is that, unlike with 'black', ensuring that
everyone has the same version is a pain.

Maybe we could live with these?  Let me know what you think.
I can continue my experiments if this seems worthwhile.

Tom

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Proposal: format GDB Python files with black
  2021-05-08 16:00 ` Tom Tromey
@ 2021-05-11  2:55   ` Simon Marchi
  2021-05-11  2:57     ` Using clang-format Simon Marchi
  2021-05-11 11:38     ` Proposal: format GDB Python files with black Luis Machado
  0 siblings, 2 replies; 28+ messages in thread
From: Simon Marchi @ 2021-05-11  2:55 UTC (permalink / raw)
  To: Tom Tromey, Simon Marchi via Gdb-patches

Changing the subject, so this gets more visibility.

On 2021-05-08 12:00 p.m., Tom Tromey wrote:
>>>>>> "Simon" == Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:
> 
> Simon> A few people I talked to about this and I have good experience
> Simon> with the tool black to auto-format Python code.
> 
> Inspired by this, I took another look at clang-format.  If we could use
> this, I think it would be a decent productivity improvement, because it
> would mean contributors would no longer need to worry about the minutiae
> of the GNU style.  We could also stop reviewing this.

I would love this.

> My github has my current hacks, on the t/clang-format branch.  This
> branch just has .clang-format, I didn't reformat the sources.  I'm using
> clang-format 10.0.1.
> 
> I managed to tweak the penalty values in the format to avoid the
> bin-packing problem I was seeing previously.  I haven't extensively
> tested this (I usually only reformat gdb_bfd.c and look at it, since it
> exercised several bugs before...), so I recommend that others play with
> it a bit.

I formatted the whole code base and checked a few files in Meld.  In
general, I would say that it does more good (fixing little mistakes,
finding split lines that could actually be one line) than harm.  I
didn't see anything outrageously badly formatted (and if there is, we
always have the option to disable formatting for a section of a file).

There are a lot of changes that are neutral, where the before and after
would be equally good.

And of course, some changes where it does things not exactly like we
would, like:

  gcc_type
  compile_cplus_instance::convert_reference_base
    (gcc_type base, enum gcc_cp_ref_qualifiers rquals)

which becomes:

  gcc_type
  compile_cplus_instance::convert_reference_base (
    gcc_type base, enum gcc_cp_ref_qualifiers rquals)

But really, I don't think these should be blockers.  It would take
someone some really bad faith to say that the above changes how they
read the code.

Is the following what you don't like?  In gdbsupport/common-debug.h,
before:

		debug_prefixed_printf (m_module, m_func, "%s: <%s debugging was not enabled on entry>",
				       m_end_prefix, m_module);

after:

		debug_prefixed_printf (
		  m_module, m_func,
		  "%s: <%s debugging was not enabled on entry>", m_end_prefix,
		  m_module);

Note: the current line is too long, my bad.

I have only seen this when we make clang-format's life difficult by
using long names or long string literals.  If we don't like how it
formats some piece of code, we always have the choice of tweaking it to
make its like easier.  It can be a good motivation to try to reduce
indentation levels by splitting the code into functions, for example.

I'm surprised you didn't use "UseTab: Always", that gives pretty much
our scheme of tabs + spaces for indentation.  By default it uses just
spaces (I wouldn't mind using either, but since the goal is to get
something close to our current style).

I would use "IndentPPDirectives: AfterHash".  Unlike what we use
currently, it will indent with $IndentWidth columns, whereas we
typically we indent these with a single column when we do it manually.
But, I think I prefer 2 columns actually.  You can try formatting
gdbsupport/gdb_locale.h for a good example.

Speaking of which, in order to be usable from gdbsupport and gdbserver,
.clang-format would need to be in the top-level directory.  Or maybe it
could be in gdbsupport, and gdb/gdbserver symlink to it.

We have a number of places that put the namespace brace on the same
line:

  namespace foo {

We could keep that by using "BreakBeforeBraces: Custom".  But I don't
think it's worth the trouble.  It adds one extra line for the opening
brace, but it's typically not in a region of the file that you care much
about.

I noticed this:

  btrace_block (CORE_ADDR begin, CORE_ADDR end) : begin (begin), end (end)
  {
    /* Nothing.  */
  }

The initializer list is put on the same line as the constructor's
prototype.  If there's a way to tell it that we don't want that, it
would be closer to our style.

For AlignEscapedNewlines, I'd probably choose Right or DontAlign.  The
reason being that if you modify the longest line of a macro (make that
line longer or shorter), all the macros line will change as a result.
So it will be harder to spot what changed when reading the hunk in the
diff.  I have a slight preference for Right, because it puts the
backslashes out of sight, thus improving readability.  It's the kind of
thing that is quite annoying to maintain by hand, but it's not if a tool
does it.

I would suggest setting SplitEmptyFunction, SplitEmptyRecord and
SplitEmptyNamespace to false.  We already do it for empty methods.

Enough for now.

> I did see a few issues, though perhaps they are minor.
> 
> 1. It removes all the Control-L page breaks from the source.
>    I know these often confuse newcomers, so maybe it's fine.
>    I feel that Someone out there will cheer this :)

Who dat?

> 
> 2. clang-format can't handle GNU-style goto label out-denting.
>    There's a clang-format bug on this topic
>    https://bugs.llvm.org/show_bug.cgi?id=24118

Probably not a big deal for us, as we don't use goto much?

> 
> 3. In gdb if we have a large 'if' condition that consists of expression
>    chained with "&&" or "||", we often put each sub-condition on its own
>    line.  clang-format packs them instead, and I didn't see a way to
>    counteract this.

I hate too long conditions anyway.  If the condition spans too many
lines or has deep parenthesis nesting, it would deserve to be refactored
anyway.  Like, put that code in a function and just have:

  if (my_new_function (...))

With a descriptive name for the function, the code ends up more readable.

> 4. Gettext invocations like _("text") have a space added, like _ ("text").
>    I didn't see a way to change this.

A very little inconvenience IMO, compared to not having to think about
formatting.  We could live with this, and file a bug at the same time.
There could be an option to add exceptions to SpaceBeforeParens.

> Another possible problem is that, unlike with 'black', ensuring that
> everyone has the same version is a pain.

Indeed.  It's perhaps easy for me to say, because I get to choose what
Linux distro and version I work on (so I opt for something recent), but
I would still lean towards just following whatever the current latest
stable version is.  There might be new options in the latest stable
version we want to use, it would be nice not to have to wait years
before we can use them.  And I suppose there's a not too painful way to
get it for all the major distros out there (and for Windows and macOS,
there are binary releases).  And you can always run it in Docker or
something.

But we would probably encourage people to use git-clang-format, which
only reformats the lines you touched.  So, it wouldn't be a big problem
for people to use different versions, as it wouldn't create spurious
changes elsewhere in the file.  The differences in formatting would be
in lines you touch anyway.

Speaking of git-clang-format: from my experience, it is not super cool
when the code base isn't already formatted with the target formatting
style.  It causes discrepancies in style, because you have a mix of new
style (whatever lines were touched) and old style (whatever lines were
not touched) which can be annoying and distracting.  Or, it causes
slightly bigger diffs than you would get otherwise.  For example, I
ran git-clang-format on this commit here:

    https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=10e578d7e00d

... to see what would have been the resulting patch had the author used
git-clang-format.  This would have been the patch:

    https://pastebin.com/pxrZtnDU

If you check the changes in mi/mi-cmd-break.c, you can see that
git-clang-format re-formatted the whole enum and array initializer.
Which is nice in itself, the resulting formatting is nice.  But, the
patch would be a bit harder to read.

The other option is to start by doing a big re-format and then tell
people to use git-clang-format.  This way, there might be some small
incoherences here and there due to different versions, but I think
they'll be insignificant.

With the Python code, we did a massive re-format.  I don't think it did
/ will cause a big inconvenience, because there aren't many patches to
the Python code.  But for the C++ code, it might be different, as there
are more patches in everyone's pipeline that touch that code.

Another question is: would we re-format the C/C++ code in the testsuite?
It might require a few adjustments to test cases (like there was one for
black), but I would be fine with that.  My concern is more if there are
some tests where the formatting of the code was specifically important.
We could disable formatting for them, but the difficulty is to find
them.

Simon

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Using clang-format
  2021-05-11  2:55   ` Simon Marchi
@ 2021-05-11  2:57     ` Simon Marchi
  2021-05-11 13:31       ` Marco Barisione
  2021-05-11 20:40       ` Tom Tromey
  2021-05-11 11:38     ` Proposal: format GDB Python files with black Luis Machado
  1 sibling, 2 replies; 28+ messages in thread
From: Simon Marchi @ 2021-05-11  2:57 UTC (permalink / raw)
  To: Tom Tromey, Simon Marchi via Gdb-patches

On 2021-05-10 10:55 p.m., Simon Marchi via Gdb-patches wrote:
> Changing the subject, so this gets more visibility.

Arf, I forgot to do that before sending.  There, done.

> On 2021-05-08 12:00 p.m., Tom Tromey wrote:
>>>>>>> "Simon" == Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:
>>
>> Simon> A few people I talked to about this and I have good experience
>> Simon> with the tool black to auto-format Python code.
>>
>> Inspired by this, I took another look at clang-format.  If we could use
>> this, I think it would be a decent productivity improvement, because it
>> would mean contributors would no longer need to worry about the minutiae
>> of the GNU style.  We could also stop reviewing this.
> 
> I would love this.
> 
>> My github has my current hacks, on the t/clang-format branch.  This
>> branch just has .clang-format, I didn't reformat the sources.  I'm using
>> clang-format 10.0.1.
>>
>> I managed to tweak the penalty values in the format to avoid the
>> bin-packing problem I was seeing previously.  I haven't extensively
>> tested this (I usually only reformat gdb_bfd.c and look at it, since it
>> exercised several bugs before...), so I recommend that others play with
>> it a bit.
> 
> I formatted the whole code base and checked a few files in Meld.  In
> general, I would say that it does more good (fixing little mistakes,
> finding split lines that could actually be one line) than harm.  I
> didn't see anything outrageously badly formatted (and if there is, we
> always have the option to disable formatting for a section of a file).
> 
> There are a lot of changes that are neutral, where the before and after
> would be equally good.
> 
> And of course, some changes where it does things not exactly like we
> would, like:
> 
>   gcc_type
>   compile_cplus_instance::convert_reference_base
>     (gcc_type base, enum gcc_cp_ref_qualifiers rquals)
> 
> which becomes:
> 
>   gcc_type
>   compile_cplus_instance::convert_reference_base (
>     gcc_type base, enum gcc_cp_ref_qualifiers rquals)
> 
> But really, I don't think these should be blockers.  It would take
> someone some really bad faith to say that the above changes how they
> read the code.
> 
> Is the following what you don't like?  In gdbsupport/common-debug.h,
> before:
> 
> 		debug_prefixed_printf (m_module, m_func, "%s: <%s debugging was not enabled on entry>",
> 				       m_end_prefix, m_module);
> 
> after:
> 
> 		debug_prefixed_printf (
> 		  m_module, m_func,
> 		  "%s: <%s debugging was not enabled on entry>", m_end_prefix,
> 		  m_module);
> 
> Note: the current line is too long, my bad.
> 
> I have only seen this when we make clang-format's life difficult by
> using long names or long string literals.  If we don't like how it
> formats some piece of code, we always have the choice of tweaking it to
> make its like easier.  It can be a good motivation to try to reduce
> indentation levels by splitting the code into functions, for example.
> 
> I'm surprised you didn't use "UseTab: Always", that gives pretty much
> our scheme of tabs + spaces for indentation.  By default it uses just
> spaces (I wouldn't mind using either, but since the goal is to get
> something close to our current style).
> 
> I would use "IndentPPDirectives: AfterHash".  Unlike what we use
> currently, it will indent with $IndentWidth columns, whereas we
> typically we indent these with a single column when we do it manually.
> But, I think I prefer 2 columns actually.  You can try formatting
> gdbsupport/gdb_locale.h for a good example.
> 
> Speaking of which, in order to be usable from gdbsupport and gdbserver,
> .clang-format would need to be in the top-level directory.  Or maybe it
> could be in gdbsupport, and gdb/gdbserver symlink to it.
> 
> We have a number of places that put the namespace brace on the same
> line:
> 
>   namespace foo {
> 
> We could keep that by using "BreakBeforeBraces: Custom".  But I don't
> think it's worth the trouble.  It adds one extra line for the opening
> brace, but it's typically not in a region of the file that you care much
> about.
> 
> I noticed this:
> 
>   btrace_block (CORE_ADDR begin, CORE_ADDR end) : begin (begin), end (end)
>   {
>     /* Nothing.  */
>   }
> 
> The initializer list is put on the same line as the constructor's
> prototype.  If there's a way to tell it that we don't want that, it
> would be closer to our style.
> 
> For AlignEscapedNewlines, I'd probably choose Right or DontAlign.  The
> reason being that if you modify the longest line of a macro (make that
> line longer or shorter), all the macros line will change as a result.
> So it will be harder to spot what changed when reading the hunk in the
> diff.  I have a slight preference for Right, because it puts the
> backslashes out of sight, thus improving readability.  It's the kind of
> thing that is quite annoying to maintain by hand, but it's not if a tool
> does it.
> 
> I would suggest setting SplitEmptyFunction, SplitEmptyRecord and
> SplitEmptyNamespace to false.  We already do it for empty methods.
> 
> Enough for now.
> 
>> I did see a few issues, though perhaps they are minor.
>>
>> 1. It removes all the Control-L page breaks from the source.
>>    I know these often confuse newcomers, so maybe it's fine.
>>    I feel that Someone out there will cheer this :)
> 
> Who dat?
> 
>>
>> 2. clang-format can't handle GNU-style goto label out-denting.
>>    There's a clang-format bug on this topic
>>    https://bugs.llvm.org/show_bug.cgi?id=24118
> 
> Probably not a big deal for us, as we don't use goto much?
> 
>>
>> 3. In gdb if we have a large 'if' condition that consists of expression
>>    chained with "&&" or "||", we often put each sub-condition on its own
>>    line.  clang-format packs them instead, and I didn't see a way to
>>    counteract this.
> 
> I hate too long conditions anyway.  If the condition spans too many
> lines or has deep parenthesis nesting, it would deserve to be refactored
> anyway.  Like, put that code in a function and just have:
> 
>   if (my_new_function (...))
> 
> With a descriptive name for the function, the code ends up more readable.
> 
>> 4. Gettext invocations like _("text") have a space added, like _ ("text").
>>    I didn't see a way to change this.
> 
> A very little inconvenience IMO, compared to not having to think about
> formatting.  We could live with this, and file a bug at the same time.
> There could be an option to add exceptions to SpaceBeforeParens.
> 
>> Another possible problem is that, unlike with 'black', ensuring that
>> everyone has the same version is a pain.
> 
> Indeed.  It's perhaps easy for me to say, because I get to choose what
> Linux distro and version I work on (so I opt for something recent), but
> I would still lean towards just following whatever the current latest
> stable version is.  There might be new options in the latest stable
> version we want to use, it would be nice not to have to wait years
> before we can use them.  And I suppose there's a not too painful way to
> get it for all the major distros out there (and for Windows and macOS,
> there are binary releases).  And you can always run it in Docker or
> something.
> 
> But we would probably encourage people to use git-clang-format, which
> only reformats the lines you touched.  So, it wouldn't be a big problem
> for people to use different versions, as it wouldn't create spurious
> changes elsewhere in the file.  The differences in formatting would be
> in lines you touch anyway.
> 
> Speaking of git-clang-format: from my experience, it is not super cool
> when the code base isn't already formatted with the target formatting
> style.  It causes discrepancies in style, because you have a mix of new
> style (whatever lines were touched) and old style (whatever lines were
> not touched) which can be annoying and distracting.  Or, it causes
> slightly bigger diffs than you would get otherwise.  For example, I
> ran git-clang-format on this commit here:
> 
>     https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=10e578d7e00d
> 
> ... to see what would have been the resulting patch had the author used
> git-clang-format.  This would have been the patch:
> 
>     https://pastebin.com/pxrZtnDU
> 
> If you check the changes in mi/mi-cmd-break.c, you can see that
> git-clang-format re-formatted the whole enum and array initializer.
> Which is nice in itself, the resulting formatting is nice.  But, the
> patch would be a bit harder to read.
> 
> The other option is to start by doing a big re-format and then tell
> people to use git-clang-format.  This way, there might be some small
> incoherences here and there due to different versions, but I think
> they'll be insignificant.
> 
> With the Python code, we did a massive re-format.  I don't think it did
> / will cause a big inconvenience, because there aren't many patches to
> the Python code.  But for the C++ code, it might be different, as there
> are more patches in everyone's pipeline that touch that code.
> 
> Another question is: would we re-format the C/C++ code in the testsuite?
> It might require a few adjustments to test cases (like there was one for
> black), but I would be fine with that.  My concern is more if there are
> some tests where the formatting of the code was specifically important.
> We could disable formatting for them, but the difficulty is to find
> them.
> 
> Simon
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Proposal: format GDB Python files with black
  2021-05-11  2:55   ` Simon Marchi
  2021-05-11  2:57     ` Using clang-format Simon Marchi
@ 2021-05-11 11:38     ` Luis Machado
  2021-05-11 13:49       ` Simon Marchi
  1 sibling, 1 reply; 28+ messages in thread
From: Luis Machado @ 2021-05-11 11:38 UTC (permalink / raw)
  To: Simon Marchi, Tom Tromey, Simon Marchi via Gdb-patches

On 5/10/21 11:55 PM, Simon Marchi via Gdb-patches wrote:
> Changing the subject, so this gets more visibility.
> 
> On 2021-05-08 12:00 p.m., Tom Tromey wrote:
>>>>>>> "Simon" == Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:
>>
>> Simon> A few people I talked to about this and I have good experience
>> Simon> with the tool black to auto-format Python code.
>>
>> Inspired by this, I took another look at clang-format.  If we could use
>> this, I think it would be a decent productivity improvement, because it
>> would mean contributors would no longer need to worry about the minutiae
>> of the GNU style.  We could also stop reviewing this.
> 
> I would love this.
> 

I'm fine with using a tool to autoformat things. I wouldn't mind getting 
used to a slightly different formatting scheme and I feel it would save 
me quite a bit of time by not having to bother with little formatting 
details.

>> My github has my current hacks, on the t/clang-format branch.  This
>> branch just has .clang-format, I didn't reformat the sources.  I'm using
>> clang-format 10.0.1.
>>
>> I managed to tweak the penalty values in the format to avoid the
>> bin-packing problem I was seeing previously.  I haven't extensively
>> tested this (I usually only reformat gdb_bfd.c and look at it, since it
>> exercised several bugs before...), so I recommend that others play with
>> it a bit.
> 
> I formatted the whole code base and checked a few files in Meld.  In
> general, I would say that it does more good (fixing little mistakes,
> finding split lines that could actually be one line) than harm.  I
> didn't see anything outrageously badly formatted (and if there is, we
> always have the option to disable formatting for a section of a file).
> 
> There are a lot of changes that are neutral, where the before and after
> would be equally good.
> 
> And of course, some changes where it does things not exactly like we
> would, like:
> 
>    gcc_type
>    compile_cplus_instance::convert_reference_base
>      (gcc_type base, enum gcc_cp_ref_qualifiers rquals)
> 
> which becomes:
> 
>    gcc_type
>    compile_cplus_instance::convert_reference_base (
>      gcc_type base, enum gcc_cp_ref_qualifiers rquals)
> 
> But really, I don't think these should be blockers.  It would take
> someone some really bad faith to say that the above changes how they
> read the code.
> 
> Is the following what you don't like?  In gdbsupport/common-debug.h,
> before:
> 
> 		debug_prefixed_printf (m_module, m_func, "%s: <%s debugging was not enabled on entry>",
> 				       m_end_prefix, m_module);
> 
> after:
> 
> 		debug_prefixed_printf (
> 		  m_module, m_func,
> 		  "%s: <%s debugging was not enabled on entry>", m_end_prefix,
> 		  m_module);
> 
> Note: the current line is too long, my bad.
> 
> I have only seen this when we make clang-format's life difficult by
> using long names or long string literals.  If we don't like how it
> formats some piece of code, we always have the choice of tweaking it to
> make its like easier.  It can be a good motivation to try to reduce
> indentation levels by splitting the code into functions, for example.
> 
> I'm surprised you didn't use "UseTab: Always", that gives pretty much
> our scheme of tabs + spaces for indentation.  By default it uses just
> spaces (I wouldn't mind using either, but since the goal is to get
> something close to our current style).
> 
> I would use "IndentPPDirectives: AfterHash".  Unlike what we use
> currently, it will indent with $IndentWidth columns, whereas we
> typically we indent these with a single column when we do it manually.
> But, I think I prefer 2 columns actually.  You can try formatting
> gdbsupport/gdb_locale.h for a good example.
> 
> Speaking of which, in order to be usable from gdbsupport and gdbserver,
> .clang-format would need to be in the top-level directory.  Or maybe it
> could be in gdbsupport, and gdb/gdbserver symlink to it.
> 
> We have a number of places that put the namespace brace on the same
> line:
> 
>    namespace foo {
> 
> We could keep that by using "BreakBeforeBraces: Custom".  But I don't
> think it's worth the trouble.  It adds one extra line for the opening
> brace, but it's typically not in a region of the file that you care much
> about.
> 
> I noticed this:
> 
>    btrace_block (CORE_ADDR begin, CORE_ADDR end) : begin (begin), end (end)
>    {
>      /* Nothing.  */
>    }
> 
> The initializer list is put on the same line as the constructor's
> prototype.  If there's a way to tell it that we don't want that, it
> would be closer to our style.
> 
> For AlignEscapedNewlines, I'd probably choose Right or DontAlign.  The
> reason being that if you modify the longest line of a macro (make that
> line longer or shorter), all the macros line will change as a result.
> So it will be harder to spot what changed when reading the hunk in the
> diff.  I have a slight preference for Right, because it puts the
> backslashes out of sight, thus improving readability.  It's the kind of
> thing that is quite annoying to maintain by hand, but it's not if a tool
> does it.
> 
> I would suggest setting SplitEmptyFunction, SplitEmptyRecord and
> SplitEmptyNamespace to false.  We already do it for empty methods.
> 
> Enough for now.
> 
>> I did see a few issues, though perhaps they are minor.
>>
>> 1. It removes all the Control-L page breaks from the source.
>>     I know these often confuse newcomers, so maybe it's fine.
>>     I feel that Someone out there will cheer this :)
> 
> Who dat?
> 
>>
>> 2. clang-format can't handle GNU-style goto label out-denting.
>>     There's a clang-format bug on this topic
>>     https://bugs.llvm.org/show_bug.cgi?id=24118
> 
> Probably not a big deal for us, as we don't use goto much?
> 
>>
>> 3. In gdb if we have a large 'if' condition that consists of expression
>>     chained with "&&" or "||", we often put each sub-condition on its own
>>     line.  clang-format packs them instead, and I didn't see a way to
>>     counteract this.
> 
> I hate too long conditions anyway.  If the condition spans too many
> lines or has deep parenthesis nesting, it would deserve to be refactored
> anyway.  Like, put that code in a function and just have:
> 
>    if (my_new_function (...))
> 
> With a descriptive name for the function, the code ends up more readable.
> 
>> 4. Gettext invocations like _("text") have a space added, like _ ("text").
>>     I didn't see a way to change this.
> 
> A very little inconvenience IMO, compared to not having to think about
> formatting.  We could live with this, and file a bug at the same time.
> There could be an option to add exceptions to SpaceBeforeParens.
> 
>> Another possible problem is that, unlike with 'black', ensuring that
>> everyone has the same version is a pain.
> 
> Indeed.  It's perhaps easy for me to say, because I get to choose what
> Linux distro and version I work on (so I opt for something recent), but
> I would still lean towards just following whatever the current latest
> stable version is.  There might be new options in the latest stable
> version we want to use, it would be nice not to have to wait years
> before we can use them.  And I suppose there's a not too painful way to
> get it for all the major distros out there (and for Windows and macOS,
> there are binary releases).  And you can always run it in Docker or
> something.

I suppose we could have scripts to automate this sort of task. Something 
that checks for the latest stable release and downloads it? A helper to 
make things easier and more consistent.

Can we do this server-side and take the burden off of the developers to 
have to do this as an additional step before pushing changes?

> 
> But we would probably encourage people to use git-clang-format, which
> only reformats the lines you touched.  So, it wouldn't be a big problem
> for people to use different versions, as it wouldn't create spurious
> changes elsewhere in the file.  The differences in formatting would be
> in lines you touch anyway. >
> Speaking of git-clang-format: from my experience, it is not super cool
> when the code base isn't already formatted with the target formatting
> style.  It causes discrepancies in style, because you have a mix of new
> style (whatever lines were touched) and old style (whatever lines were
> not touched) which can be annoying and distracting.  Or, it causes
> slightly bigger diffs than you would get otherwise.  For example, I
> ran git-clang-format on this commit here:
> 
>      https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=10e578d7e00d
> 
> ... to see what would have been the resulting patch had the author used
> git-clang-format.  This would have been the patch:
> 
>      https://pastebin.com/pxrZtnDU
> 
> If you check the changes in mi/mi-cmd-break.c, you can see that
> git-clang-format re-formatted the whole enum and array initializer.
> Which is nice in itself, the resulting formatting is nice.  But, the
> patch would be a bit harder to read.
> 
> The other option is to start by doing a big re-format and then tell
> people to use git-clang-format.  This way, there might be some small
> incoherences here and there due to different versions, but I think
> they'll be insignificant.
> 
> With the Python code, we did a massive re-format.  I don't think it did
> / will cause a big inconvenience, because there aren't many patches to
> the Python code.  But for the C++ code, it might be different, as there
> are more patches in everyone's pipeline that touch that code.
> 
> Another question is: would we re-format the C/C++ code in the testsuite?
> It might require a few adjustments to test cases (like there was one for
> black), but I would be fine with that.  My concern is more if there are
> some tests where the formatting of the code was specifically important.
> We could disable formatting for them, but the difficulty is to find
> them.

I think reformatting the test sources has the potential to break things 
on testcases with hardcoded assumptions. I wouldn't touch them at first.

> 
> Simon
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Using clang-format
  2021-05-11  2:57     ` Using clang-format Simon Marchi
@ 2021-05-11 13:31       ` Marco Barisione
  2021-05-11 13:44         ` Simon Marchi
  2021-05-11 20:40       ` Tom Tromey
  1 sibling, 1 reply; 28+ messages in thread
From: Marco Barisione @ 2021-05-11 13:31 UTC (permalink / raw)
  To: Simon Marchi; +Cc: Tom Tromey, GDB patches mailing list

On 11 May 2021, at 03:57, Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> wrote:
> 
> On 2021-05-10 10:55 p.m., Simon Marchi via Gdb-patches wrote:
>> On 2021-05-08 12:00 p.m., Tom Tromey wrote:
>>>>>>>> "Simon" == Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:
>>> 
>>> Simon> A few people I talked to about this and I have good experience
>>> Simon> with the tool black to auto-format Python code.
>>> 
>>> Inspired by this, I took another look at clang-format.  If we could use
>>> this, I think it would be a decent productivity improvement, because it
>>> would mean contributors would no longer need to worry about the minutiae
>>> of the GNU style.  We could also stop reviewing this.
>> 
>> I would love this.

As an occasional contributor I’d love this.
Getting the correct style right is complicated if you spend most of your
time developing other projects with other coding styles.
It’s even worse for cases that are not trivial as GDB’s codebase is
occasionally inconsistent so there’s nowhere where to take inspiration
from.

>> I noticed this:
>> 
>>  btrace_block (CORE_ADDR begin, CORE_ADDR end) : begin (begin), end (end)
>>  {
>>    /* Nothing.  */
>>  }
>> 
>> The initializer list is put on the same line as the constructor's
>> prototype.  If there's a way to tell it that we don't want that, it
>> would be closer to our style.

You can use BreakConstructorInitializers.

>>> Another possible problem is that, unlike with 'black', ensuring that
>>> everyone has the same version is a pain.
>> 
>> Indeed.  It's perhaps easy for me to say, because I get to choose what
>> Linux distro and version I work on (so I opt for something recent), but
>> I would still lean towards just following whatever the current latest
>> stable version is.  There might be new options in the latest stable
>> version we want to use, it would be nice not to have to wait years
>> before we can use them.  And I suppose there's a not too painful way to
>> get it for all the major distros out there (and for Windows and macOS,
>> there are binary releases).  And you can always run it in Docker or
>> something.
>> 
>> But we would probably encourage people to use git-clang-format, which
>> only reformats the lines you touched.  So, it wouldn't be a big problem
>> for people to use different versions, as it wouldn't create spurious
>> changes elsewhere in the file.  The differences in formatting would be
>> in lines you touch anyway.
>> 
>> Speaking of git-clang-format: from my experience, it is not super cool
>> when the code base isn't already formatted with the target formatting
>> style.  It causes discrepancies in style, because you have a mix of new
>> style (whatever lines were touched) and old style (whatever lines were
>> not touched) which can be annoying and distracting.  Or, it causes
>> slightly bigger diffs than you would get otherwise.  For example, I
>> ran git-clang-format on this commit here:
>> 
>>    https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=10e578d7e00d
>> 
>> ... to see what would have been the resulting patch had the author used
>> git-clang-format.  This would have been the patch:
>> 
>>    https://pastebin.com/pxrZtnDU
>> 
>> If you check the changes in mi/mi-cmd-break.c, you can see that
>> git-clang-format re-formatted the whole enum and array initializer.
>> Which is nice in itself, the resulting formatting is nice.  But, the
>> patch would be a bit harder to read.
>> 
>> The other option is to start by doing a big re-format and then tell
>> people to use git-clang-format.  This way, there might be some small
>> incoherences here and there due to different versions, but I think
>> they'll be insignificant.
>> 
>> With the Python code, we did a massive re-format.  I don't think it did
>> / will cause a big inconvenience, because there aren't many patches to
>> the Python code.  But for the C++ code, it might be different, as there
>> are more patches in everyone's pipeline that touch that code.

Personally, I’d be a bit worried about dealing with difficult conflicts
manually, but there are solutions for that:

1. Rebase your branch to the point just before reformatting:
     $ git rebase <commit before clang-format commit ID>

2. Rebase on top of the format changes but preferring your local
   changes:

     $ git rebase --strategy=recursive \
           --strategy-option=theirs <clang-format commit ID>

3. Finally reformat all of your code you just committed and amend
   your previous commits.


At work we decided not to reformat the whole C/C++ code base (while
we did for Python).  Instead, we use some git hooks I wrote which
reformat only what you are committing.  See
https://github.com/barisione/clang-format-hooks.

While I wrote these scripts and would be happy for others to use
them, I think that (in GDB’s case) it would be easier to just
reformat everything and deal with the short-term pain. 


-- 
Marco Barisione


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Using clang-format
  2021-05-11 13:31       ` Marco Barisione
@ 2021-05-11 13:44         ` Simon Marchi
  0 siblings, 0 replies; 28+ messages in thread
From: Simon Marchi @ 2021-05-11 13:44 UTC (permalink / raw)
  To: Marco Barisione; +Cc: Tom Tromey, GDB patches mailing list

> As an occasional contributor I’d love this.
> Getting the correct style right is complicated if you spend most of your
> time developing other projects with other coding styles.
> It’s even worse for cases that are not trivial as GDB’s codebase is
> occasionally inconsistent so there’s nowhere where to take inspiration
> from.

Thanks for that feedback.

> 
>>> I noticed this:
>>>
>>>  btrace_block (CORE_ADDR begin, CORE_ADDR end) : begin (begin), end (end)
>>>  {
>>>    /* Nothing.  */
>>>  }
>>>
>>> The initializer list is put on the same line as the constructor's
>>> prototype.  If there's a way to tell it that we don't want that, it
>>> would be closer to our style.
> 
> You can use BreakConstructorInitializers.

Tom's config has

  BreakConstructorInitializers: BeforeColon

But it still does it.

> Personally, I’d be a bit worried about dealing with difficult conflicts
> manually, but there are solutions for that:
> 
> 1. Rebase your branch to the point just before reformatting:
>      $ git rebase <commit before clang-format commit ID>
> 
> 2. Rebase on top of the format changes but preferring your local
>    changes:
> 
>      $ git rebase --strategy=recursive \
>            --strategy-option=theirs <clang-format commit ID>
> 
> 3. Finally reformat all of your code you just committed and amend
>    your previous commits.

I think that's a good option.  It's a bit of manual steps, but:

1. It's O(1), it's as quick to do this with a small patch and a big
   patch.
2. It's temporary, eventually all patches that were in the pipeline at
   the time of the re-format will be submitted or dropped.

> 
> 
> At work we decided not to reformat the whole C/C++ code base (while
> we did for Python).  Instead, we use some git hooks I wrote which
> reformat only what you are committing.  See
> https://github.com/barisione/clang-format-hooks.
> 
> While I wrote these scripts and would be happy for others to use
> them, I think that (in GDB’s case) it would be easier to just
> reformat everything and deal with the short-term pain. 

Agreed.

Simon

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Proposal: format GDB Python files with black
  2021-05-11 11:38     ` Proposal: format GDB Python files with black Luis Machado
@ 2021-05-11 13:49       ` Simon Marchi
  2021-05-11 14:23         ` Luis Machado
  0 siblings, 1 reply; 28+ messages in thread
From: Simon Marchi @ 2021-05-11 13:49 UTC (permalink / raw)
  To: Luis Machado, Tom Tromey, Simon Marchi via Gdb-patches

On 2021-05-11 7:38 a.m., Luis Machado wrote:
>> Indeed.  It's perhaps easy for me to say, because I get to choose what
>> Linux distro and version I work on (so I opt for something recent), but
>> I would still lean towards just following whatever the current latest
>> stable version is.  There might be new options in the latest stable
>> version we want to use, it would be nice not to have to wait years
>> before we can use them.  And I suppose there's a not too painful way to
>> get it for all the major distros out there (and for Windows and macOS,
>> there are binary releases).  And you can always run it in Docker or
>> something.
> 
> I suppose we could have scripts to automate this sort of task. Something that checks for the latest stable release and downloads it? A helper to make things easier and more consistent.

Download from where?

> 
> Can we do this server-side and take the burden off of the developers to have to do this as an additional step before pushing changes?

We can check server-side if the formatting is good and reject the commit
if not, but I don't think we can modify the commit being pushed (I don't
think it would be a good idea if that was possible).

But just like black (it's on my todo list to check that), we can have a
commit hook that runs clang-format (at least just to warn you, "Hey, you
code needs to be formatted" if clang-format would change something).

Simon

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Proposal: format GDB Python files with black
  2021-05-11 13:49       ` Simon Marchi
@ 2021-05-11 14:23         ` Luis Machado
  0 siblings, 0 replies; 28+ messages in thread
From: Luis Machado @ 2021-05-11 14:23 UTC (permalink / raw)
  To: Simon Marchi, Tom Tromey, Simon Marchi via Gdb-patches

On 5/11/21 10:49 AM, Simon Marchi wrote:
> On 2021-05-11 7:38 a.m., Luis Machado wrote:
>>> Indeed.  It's perhaps easy for me to say, because I get to choose what
>>> Linux distro and version I work on (so I opt for something recent), but
>>> I would still lean towards just following whatever the current latest
>>> stable version is.  There might be new options in the latest stable
>>> version we want to use, it would be nice not to have to wait years
>>> before we can use them.  And I suppose there's a not too painful way to
>>> get it for all the major distros out there (and for Windows and macOS,
>>> there are binary releases).  And you can always run it in Docker or
>>> something.
>>
>> I suppose we could have scripts to automate this sort of task. Something that checks for the latest stable release and downloads it? A helper to make things easier and more consistent.
> 
> Download from where?
> 

I guess this works better for black since you can fetch it from a repo 
and use a particular version.

Can we do something similar for clang-format without having to build the 
entire LLVM codebase?

>>
>> Can we do this server-side and take the burden off of the developers to have to do this as an additional step before pushing changes?
> 
> We can check server-side if the formatting is good and reject the commit
> if not, but I don't think we can modify the commit being pushed (I don't
> think it would be a good idea if that was possible).

I see. Maybe a client-side hook that formats things during a local 
commit will work better then.

That way we can send a local commit for review and it will be properly 
formatted already.

> 
> But just like black (it's on my todo list to check that), we can have a
> commit hook that runs clang-format (at least just to warn you, "Hey, you
> code needs to be formatted" if clang-format would change something).

I guess that would as well.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Using clang-format
  2021-05-11  2:57     ` Using clang-format Simon Marchi
  2021-05-11 13:31       ` Marco Barisione
@ 2021-05-11 20:40       ` Tom Tromey
  2021-05-13 17:13         ` Simon Marchi
  1 sibling, 1 reply; 28+ messages in thread
From: Tom Tromey @ 2021-05-11 20:40 UTC (permalink / raw)
  To: Simon Marchi; +Cc: Tom Tromey, Simon Marchi via Gdb-patches

>> gcc_type
>> compile_cplus_instance::convert_reference_base (
>> gcc_type base, enum gcc_cp_ref_qualifiers rquals)
>> 
>> But really, I don't think these should be blockers.  It would take
>> someone some really bad faith to say that the above changes how they
>> read the code.

I don't agree, mainly because I dislike that style, and I feel like I'm
generally above-board.

This weird dangling paren was an issue for calls, previously, though,
and I fixed it by tweaking one of the penalty settings.  So maybe that
can be salvaged.

>> I have only seen this when we make clang-format's life difficult by
>> using long names or long string literals.  If we don't like how it
>> formats some piece of code, we always have the choice of tweaking it to
>> make its like easier.

Yeah.  Also it's not like the current code is pristine.

>> I'm surprised you didn't use "UseTab: Always"

I routinely forget whether gdb uses tabs or not and rely on Emacs to
DTRT.

>> I would use "IndentPPDirectives: AfterHash".  Unlike what we use
>> currently, it will indent with $IndentWidth columns, whereas we
>> typically we indent these with a single column when we do it manually.
>> But, I think I prefer 2 columns actually.  You can try formatting
>> gdbsupport/gdb_locale.h for a good example.

My impression was that normally gdb didn't do this kind of indentation,
but in this particular case, I don't really have any opinion and I'm
happy with whatever setting everyone else enjoys.

>> Speaking of which, in order to be usable from gdbsupport and gdbserver,
>> .clang-format would need to be in the top-level directory.  Or maybe it
>> could be in gdbsupport, and gdb/gdbserver symlink to it.

Yeah, I was wondering about this as well.

I don't think we should put it at top-level, though, as people may think
it applies to the whole tree, which it wouldn't.

>> btrace_block (CORE_ADDR begin, CORE_ADDR end) : begin (begin), end (end)
>> {
>> /* Nothing.  */
>> }
>> 
>> The initializer list is put on the same line as the constructor's
>> prototype.  If there's a way to tell it that we don't want that, it
>> would be closer to our style.

I think clang-format doesn't offer this level of control here.

>> For AlignEscapedNewlines, I'd probably choose Right or DontAlign.  The
>> reason being that if you modify the longest line of a macro (make that
>> line longer or shorter), all the macros line will change as a result.
>> So it will be harder to spot what changed when reading the hunk in the
>> diff.  I have a slight preference for Right,

Sounds good.

>>> 1. It removes all the Control-L page breaks from the source.
>>> I know these often confuse newcomers, so maybe it's fine.
>>> I feel that Someone out there will cheer this :)
>> 
>> Who dat?

I thought Pedro was in the anti-control-L camp?
Anyway, no biggie, I like them but I probably won't notice their
absence.

>>> Another possible problem is that, unlike with 'black', ensuring that
>>> everyone has the same version is a pain.

>> With the Python code, we did a massive re-format.  I don't think it did
>> / will cause a big inconvenience, because there aren't many patches to
>> the Python code.  But for the C++ code, it might be different, as there
>> are more patches in everyone's pipeline that touch that code.

If our reformatting commit is purely automated, we can write a short
script using git filter-branch so that anybody with a patch to rebase
can first rebase to the commit just before the reformat, then run the
script to rebase-and-reformat.  This would ease the pain somewhat.

>> Another question is: would we re-format the C/C++ code in the testsuite?
>> It might require a few adjustments to test cases (like there was one for
>> black), but I would be fine with that.  My concern is more if there are
>> some tests where the formatting of the code was specifically important.
>> We could disable formatting for them, but the difficulty is to find
>> them.

Personally I would say that we shouldn't touch the tests at all, and
just not worry about the formatting there.  That's been my usual review
policy, as I think it's fine, and maybe even mildly beneficial in some
theoretical way, for the tests to have a variety of styles.  Also
normally they are just throwaway code anyway.

Tom

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Using clang-format
  2021-05-11 20:40       ` Tom Tromey
@ 2021-05-13 17:13         ` Simon Marchi
  0 siblings, 0 replies; 28+ messages in thread
From: Simon Marchi @ 2021-05-13 17:13 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Simon Marchi via Gdb-patches

On 2021-05-11 4:40 p.m., Tom Tromey wrote:
>>> gcc_type
>>> compile_cplus_instance::convert_reference_base (
>>> gcc_type base, enum gcc_cp_ref_qualifiers rquals)
>>>
>>> But really, I don't think these should be blockers.  It would take
>>> someone some really bad faith to say that the above changes how they
>>> read the code.
> 
> I don't agree, mainly because I dislike that style, and I feel like I'm
> generally above-board.

Ok, and sorry about my original formulation, when re-reading it it
sounds a bit strong.

Simon

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2021-05-13 17:13 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-26 15:55 Proposal: format GDB Python files with black Simon Marchi
2021-04-26 16:21 ` Andrew Burgess
2021-04-26 17:25   ` Simon Marchi
2021-04-26 17:50     ` Andrew Burgess
2021-04-26 18:08       ` Simon Marchi
2021-04-27  7:54         ` Andrew Burgess
2021-04-27 13:21           ` Simon Marchi
2021-04-26 17:42   ` Tom Tromey
2021-04-26 17:34 ` Paul Koning
2021-04-26 17:44   ` Simon Marchi
2021-04-26 17:48     ` Simon Marchi
2021-04-26 17:39 ` Tom Tromey
2021-04-30 16:26   ` Joel Brobecker
2021-04-26 22:40 ` Lancelot SIX
2021-04-30 17:04   ` Tom Tromey
2021-04-30 17:14     ` Simon Marchi
2021-05-01  6:42       ` Joel Brobecker
2021-04-30 17:21 ` Luis Machado
2021-05-08 16:00 ` Tom Tromey
2021-05-11  2:55   ` Simon Marchi
2021-05-11  2:57     ` Using clang-format Simon Marchi
2021-05-11 13:31       ` Marco Barisione
2021-05-11 13:44         ` Simon Marchi
2021-05-11 20:40       ` Tom Tromey
2021-05-13 17:13         ` Simon Marchi
2021-05-11 11:38     ` Proposal: format GDB Python files with black Luis Machado
2021-05-11 13:49       ` Simon Marchi
2021-05-11 14:23         ` Luis Machado

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).