public inbox for gcc-bugs@sourceware.org
* [Bug ada/81114] GNAT mishandles filenames with UTF8 chars on case-insensitive filesystems
       [not found] <bug-81114-4@http.gcc.gnu.org/bugzilla/>
@ 2023-09-10 10:07 ` egallager at gcc dot gnu.org
  2023-09-17 18:37 ` simon at pushface dot org
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 6+ messages in thread
From: egallager at gcc dot gnu.org @ 2023-09-10 10:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81114

Eric Gallager <egallager at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |egallager at gcc dot gnu.org
             Status|SUSPENDED                   |UNCONFIRMED
     Ever confirmed|1                           |0

--- Comment #5 from Eric Gallager <egallager at gcc dot gnu.org> ---
(In reply to simon from comment #4)
> (In reply to Eric Botcazou from comment #2)
> 
> When I said in comment 1 
> 
> >I have to say that, great as it would be to have this fixed, the changes 
> >required would be extensive, and I can’t see that anyone would think it 
> >worth the trouble.
> 
> I meant that coping with macOS’s HFS+ behaviour w.r.t. NFC vs NFD was 
> something it’d be unreasonable to spend effort on fixing.
> 
> The main point of this PR is that you can’t use extended characters in 
> unit names on case-insensitive filesystems, *which includes Windows*. 
> Fixing that problem (I can see it might mean introducing a new adaint.c 
> interface "is filesystem UTF8?") would be a good thing. Can the compiler 
> use iconv? or Ada.Wide_Characters.Handling,
> Ada.Strings.UTF_Encoding.[Wide_]Strings?
> 
> The awkwardness discussed in comment 1 isn’t really a problem except 
> when compiling the offending unit from the command line; when compiled 
> as part of the closure by gnatmake there’s no problem, I guess gnatmake 
> reads the unit name in NFC and gets the file name in NFD from the file 
> system.
> 
> I think there _is_ a problem in gprbuild but of course that’s nothing 
> to do with GCC.
> 
> Please can this PR be reopened?

Well, it was never closed in the first place, just marked as SUSPENDED, but I
can put it back to UNCONFIRMED, I guess...


* [Bug ada/81114] GNAT mishandles filenames with UTF8 chars on case-insensitive filesystems
From: simon at pushface dot org @ 2023-09-17 18:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81114

--- Comment #6 from simon at pushface dot org ---
(In reply to simon from comment #1)
> Further:
> 
> $ GNAT_FILE_NAME_CASE_SENSITIVE=1 gnatmake -c p*.ads
> gcc -c páck3.ads
> páck3.ads:1:10: warning: file name does not match unit name, should be
> "páck3.ads"
> 
> The reason for this apparently-bizarre message is[1] that macOS takes 
> the composed form (lowercase a acute) and converts it under the hood 
> to what HFS+ insists on, the fully decomposed form (lowercase a, combining 
> acute); thus the names are actually different even though they _look_ 
> the same.

This behaviour (I think it was an error) was fixed by darwin 19. Opening by a
name with the composed form now correctly finds the file named with the fully
decomposed form.
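
As an illustration of the composed/decomposed distinction described above, here is a minimal Python sketch using only the standard unicodedata module. It is not taken from GNAT or from the bug report; the helper name same_name is hypothetical and the file name is the one from the warning quoted above.

```python
import unicodedata

# Composed form (NFC): "a acute" is the single code point U+00E1.
composed = "p\u00e1ck3.ads"
# Fully decomposed form (NFD): "a" followed by combining acute U+0301.
decomposed = unicodedata.normalize("NFD", composed)


def same_name(a: str, b: str) -> bool:
    """Hypothetical helper: compare two names after normalising both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The two strings render identically but differ code point by code point,
# so a plain string comparison reports a mismatch.
print(composed == decomposed)
print(composed.encode("utf-8"))
print(decomposed.encode("utf-8"))

# Normalising both sides first makes them compare equal.
print(same_name(composed, decomposed))
```

A plain comparison of the unit-derived name against the name returned by the file system would fail in the same way, which is consistent with the warning showing two names that look the same.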


* [Bug ada/81114] GNAT mishandles filenames with UTF8 chars on case-insensitive filesystems
From: egallager at gcc dot gnu.org @ 2023-10-01 17:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81114

--- Comment #7 from Eric Gallager <egallager at gcc dot gnu.org> ---
(In reply to simon from comment #6)
> (In reply to simon from comment #1)
> > Further:
> > 
> > $ GNAT_FILE_NAME_CASE_SENSITIVE=1 gnatmake -c p*.ads
> > gcc -c páck3.ads
> > páck3.ads:1:10: warning: file name does not match unit name, should be
> > "páck3.ads"
> > 
> > The reason for this apparently-bizarre message is[1] that macOS takes 
> > the composed form (lowercase a acute) and converts it under the hood 
> > to what HFS+ insists on, the fully decomposed form (lowercase a, combining 
> > acute); thus the names are actually different even though they _look_ 
> > the same.
> 
> This behaviour (I think it was an error) was fixed by darwin 19. Opening by
> a name with the composed form now correctly finds the file named with the
> fully decomposed form.

OK, so do we still want to fix it for older darwin versions, or...?


* [Bug ada/81114] GNAT mishandles filenames with UTF8 chars on case-insensitive filesystems
From: simon at pushface dot org @ 2023-10-18 16:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81114

--- Comment #8 from simon at pushface dot org ---
I think I’d forgotten that compiling páck3.ads on its own, rather than as 
part of the closure, was the way to demonstrate this problem. It was NOT 
fixed in darwin19 (it’s still present in darwin23).

For interest, I made a C file which #includes a header with an a-acute in 
its name; the C file uses the composed a-acute, but the header’s file name
(as shown by ls) uses the combining a-acute. Compiles without complaint.
Attachment c-demo.zip.
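
For comparison, one way a tool could tolerate the mismatch itself is to match directory entries after normalising both sides. The following is only a Python sketch under that assumption; it does not describe how the C preprocessor or GNAT actually resolves names, and find_by_normalized_name is a hypothetical helper.

```python
import os
import unicodedata
from typing import Optional


def find_by_normalized_name(directory: str, wanted: str) -> Optional[str]:
    """Return the entry in `directory` whose NFC form equals the NFC form
    of `wanted`, or None if there is no such entry.

    The returned string is the name as the file system reports it, which
    may be in decomposed form even when `wanted` is composed.
    """
    target = unicodedata.normalize("NFC", wanted)
    for entry in os.listdir(directory):
        if unicodedata.normalize("NFC", entry) == target:
            return entry
    return None
```

Called with the composed name, this returns the on-disk name even when the directory listing reports it in decomposed form.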

On third thoughts, this should probably go back to SUSPENDED. When I looked
into it, it seemed to involve quite deep parts of the compiler, which
probably means that the Ada maintainers would be resistant (especially
since AdaCore don’t support macOS).


* [Bug ada/81114] GNAT mishandles filenames with UTF8 chars on case-insensitive filesystems
From: simon at pushface dot org @ 2023-10-18 16:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81114

--- Comment #9 from simon at pushface dot org ---
Created attachment 56140
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56140&action=edit
C demonstrator

As noted in comment 8, the C compiler has no problem finding a file
whose name uses the combining form when the #include directive uses
the composed form.


* [Bug ada/81114] GNAT mishandles filenames with UTF8 chars on case-insensitive filesystems
From: ebotcazou at gcc dot gnu.org @ 2023-10-18 16:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81114

Eric Botcazou <ebotcazou at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2017-07-04 00:00:00         |2023-10-18
             Status|UNCONFIRMED                 |SUSPENDED
     Ever confirmed|0                           |1
