public inbox for cygwin-patches@cygwin.com
From: Christian Franke <Christian.Franke@t-online.de>
To: cygwin-patches@cygwin.com
Subject: Re: [PATCH] Cygwin: Add /dev/disk/by-label and /dev/disk/by-uuid symlinks
Date: Tue, 21 Nov 2023 19:31:40 +0100
Message-ID: <7d24b7f1-0dae-ad23-6bde-3502716edbad@t-online.de>
In-Reply-To: <ZVzLnADL0i2X3orL@calimero.vinschen.de>

[-- Attachment #1: Type: text/plain, Size: 2175 bytes --]

Corinna Vinschen wrote:
> Hi Christian,
>
> Looks good, but I just realized that I was already wondering about the
> sanitization and forgot to talk about it:
>
> On Nov 21 12:24, Christian Franke wrote:
>> diff --git a/winsup/cygwin/fhandler/dev_disk.cc b/winsup/cygwin/fhandler/dev_disk.cc
>> index c5d72816f..d12ac52fa 100644
>> --- a/winsup/cygwin/fhandler/dev_disk.cc
>> +++ b/winsup/cygwin/fhandler/dev_disk.cc
>> @@ -64,10 +64,12 @@ sanitize_label_string (WCHAR *s)
>>     /* Linux does not skip leading spaces. */
>>     return sanitize_string (s, L'\0', L' ', L'_', [] (WCHAR c) -> bool
>>       {
>> -      /* Labels may contain characters not allowed in filenames.
>> -	 Linux replaces spaces with \x20 which is not an option here. */
>> +      /* Labels may contain characters not allowed in filenames.  Also
> Apart from slash and backslash, we don't have this problem in Cygwin,
> usually.  Even control characters are no problem.  All chars not allowed
> in filenames are just transposed into the Unicode private use area, as
> per strfuncs.cc, line 20ff on the way to storage, and back when reading
> the names from storage.  Thus, and especially in a virtual filesystem
> like /proc, there's no reason to avoid these characters.

Thanks for the clarification.


>
>> +         replace '#' to avoid that duplicate markers introduce new
>> +	 duplicates.  Linux replaces spaces with \x20 which is not an
>> +	 option here. */
>>         return !((0 <= c && c <= L' ') || c == L':' || c == L'/' || c == L'\\'
>> -	      || c == L'"');
>> +	      || c == L'#' || c == L'"');
> If you really want to avoid chars not allowed in DOS filenames, the
> list seems incomplete, missing '<', '>', '?', '*', '|'.
>
> But as I said, there's really no reason for that.  I simply reduced the
> above expression to
>
>    return !(c == L'/' || c == L'\\' || c == L'#');
>
> and created a disk label
>
>    test"foo*bar?baz:"
>
> It works nicely, including stuff like
>
>    $ ls *\"*
>    $ ls *\**
>
> So, I can push it as is, or we just allow everything and the kitchen sink
> as per the reduced filter expression.  What do you prefer?

The latter - patch attached.

Christian

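For illustration only, here is a minimal stand-alone sketch of what the
reduced filter accepts and rejects.  It is not Cygwin's sanitize_string;
label_char_allowed and replace_rejected are hypothetical names, and the
sketch assumes that rejected characters are replaced with '_', as the L'_'
argument in the call suggests.

```cpp
#include <cassert>
#include <string>

// Reduced filter from the thread: only '/', '\\' and '#' are rejected.
// '#' is kept out of labels so it can serve as the duplicate marker.
static bool
label_char_allowed (wchar_t c)
{
  return !(c == L'/' || c == L'\\' || c == L'#');
}

// Hypothetical helper, NOT Cygwin's sanitize_string: replace every
// rejected character with '_' (assumed replacement character).
static std::wstring
replace_rejected (std::wstring s)
{
  for (wchar_t &c : s)
    if (!label_char_allowed (c))
      c = L'_';
  return s;
}
```

With this filter, the test label from the thread, test"foo*bar?baz:", passes
through unchanged, while a label such as a/b\c#d would become a_b_c_d.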

[-- Attachment #2: 0001-Cygwin-dev-disk-Append-N-if-the-same-name-appears-mo.patch --]
[-- Type: text/plain, Size: 4724 bytes --]

From ecc54356adbe7768bd5fd5561c78c67cd5725183 Mon Sep 17 00:00:00 2001
From: Christian Franke <christian.franke@t-online.de>
Date: Tue, 21 Nov 2023 19:28:02 +0100
Subject: [PATCH] Cygwin: /dev/disk: Append '#N' if the same name appears more
 than once

No longer drop ranges of identical link names.  Append '#0, #1, ...'
to each name instead.  Enhance charset allowed in label names.
No longer ignore null volume serial numbers.

Signed-off-by: Christian Franke <christian.franke@t-online.de>
---
 winsup/cygwin/fhandler/dev_disk.cc | 54 ++++++++++++++++++------------
 1 file changed, 33 insertions(+), 21 deletions(-)

diff --git a/winsup/cygwin/fhandler/dev_disk.cc b/winsup/cygwin/fhandler/dev_disk.cc
index c5d72816f..29af9de95 100644
--- a/winsup/cygwin/fhandler/dev_disk.cc
+++ b/winsup/cygwin/fhandler/dev_disk.cc
@@ -64,10 +64,11 @@ sanitize_label_string (WCHAR *s)
   /* Linux does not skip leading spaces. */
   return sanitize_string (s, L'\0', L' ', L'_', [] (WCHAR c) -> bool
     {
-      /* Labels may contain characters not allowed in filenames.
-	 Linux replaces spaces with \x20 which is not an option here. */
-      return !((0 <= c && c <= L' ') || c == L':' || c == L'/' || c == L'\\'
-	      || c == L'"');
+      /* Labels may contain characters not allowed in filenames.  Also
+         replace '#' to avoid that duplicate markers introduce new
+	 duplicates.  Linux replaces spaces with \x20 which is not an
+	 option here. */
+      return !(c == L'/' || c == L'\\' || c == L'#');
     }
   );
 }
@@ -304,8 +305,7 @@ partition_to_label_or_uuid(bool uuid, const UNICODE_STRING *drive_uname,
   const NTFS_VOLUME_DATA_BUFFER *nvdb =
     reinterpret_cast<const NTFS_VOLUME_DATA_BUFFER *>(ioctl_buf);
   if (uuid && DeviceIoControl (volhdl, FSCTL_GET_NTFS_VOLUME_DATA, nullptr, 0,
-			       ioctl_buf, NT_MAX_PATH, &bytes_read, nullptr)
-      && nvdb->VolumeSerialNumber.QuadPart)
+			       ioctl_buf, NT_MAX_PATH, &bytes_read, nullptr))
     {
       /* Print without any separator as on Linux. */
       __small_sprintf (name, "%016X", nvdb->VolumeSerialNumber.QuadPart);
@@ -327,13 +327,9 @@ partition_to_label_or_uuid(bool uuid, const UNICODE_STRING *drive_uname,
   FILE_FS_VOLUME_INFORMATION *ffvi =
     reinterpret_cast<FILE_FS_VOLUME_INFORMATION *>(ioctl_buf);
   if (uuid)
-    {
-      if (!ffvi->VolumeSerialNumber)
-	return false;
-      /* Print with separator as on Linux. */
-      __small_sprintf (name, "%04x-%04x", ffvi->VolumeSerialNumber >> 16,
-		       ffvi->VolumeSerialNumber & 0xffff);
-    }
+    /* Print with separator as on Linux. */
+    __small_sprintf (name, "%04x-%04x", ffvi->VolumeSerialNumber >> 16,
+		     ffvi->VolumeSerialNumber & 0xffff);
   else
     {
       /* Label is not null terminated. */
@@ -361,6 +357,20 @@ by_id_compare_name (const void *a, const void *b)
   return strcmp (ap->name, bp->name);
 }
 
+static int
+by_id_compare_name_drive_part (const void *a, const void *b)
+{
+  const by_id_entry *ap = reinterpret_cast<const by_id_entry *>(a);
+  const by_id_entry *bp = reinterpret_cast<const by_id_entry *>(b);
+  int cmp = strcmp (ap->name, bp->name);
+  if (cmp)
+    return cmp;
+  cmp = ap->drive - bp->drive;
+  if (cmp)
+    return cmp;
+  return ap->part - bp->part;
+}
+
 static by_id_entry *
 by_id_realloc (by_id_entry *p, size_t n)
 {
@@ -610,8 +620,9 @@ get_by_id_table (by_id_entry * &table, fhandler_dev_disk::dev_disk_location loc)
   if (!table)
     return (errno_set ? -1 : 0);
 
-  /* Sort by name and remove duplicates. */
-  qsort (table, table_size, sizeof (*table), by_id_compare_name);
+  /* Sort by {name, drive, part} to ensure stable sort order. */
+  qsort (table, table_size, sizeof (*table), by_id_compare_name_drive_part);
+  /* Mark duplicate names. */
   for (unsigned i = 0; i < table_size; i++)
     {
       unsigned j = i + 1;
@@ -619,12 +630,13 @@ get_by_id_table (by_id_entry * &table, fhandler_dev_disk::dev_disk_location loc)
 	j++;
       if (j == i + 1)
 	continue;
-      /* Duplicate(s) found, remove all entries with this name. */
-      debug_printf ("removing duplicates %d-%d: '%s'", i, j - 1, table[i].name);
-      if (j < table_size)
-	memmove (table + i, table + j, (table_size - j) * sizeof (*table));
-      table_size -= j - i;
-      i--;
+      /* Duplicate(s) found, append "#N" to all entries.  This never
+	 introduces new duplicates because '#' never occurs in the
+	 original names. */
+      debug_printf ("mark duplicates %u-%u of '%s'", i, j - 1, table[i].name);
+      size_t len = strlen (table[i].name);
+      for (unsigned k = i; k < j; k++)
+	__small_sprintf (table[k].name + len, "#%u", k - i);
     }
 
   debug_printf ("table_size: %d", table_size);
-- 
2.42.1

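The duplicate marking in the patch can be sketched outside of Cygwin roughly
as follows.  This is an approximation under stated assumptions, not the
actual implementation: the entry struct and mark_duplicates are hypothetical
stand-ins for by_id_entry and the loop in get_by_id_table, and std::sort with
std::string replaces qsort and __small_sprintf.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for by_id_entry; only the fields used here.
struct entry
{
  std::string name;
  int drive;
  int part;
};

// Sort by {name, drive, part}, then append "#0", "#1", ... to every
// entry in a run of identical names.  Unique names stay untouched.
static void
mark_duplicates (std::vector<entry> &table)
{
  std::sort (table.begin (), table.end (),
	     [] (const entry &a, const entry &b)
	     {
	       return std::tie (a.name, a.drive, a.part)
		      < std::tie (b.name, b.drive, b.part);
	     });

  for (std::size_t i = 0; i < table.size (); )
    {
      // Find the end of the run of entries sharing table[i].name.
      std::size_t j = i + 1;
      while (j < table.size () && table[j].name == table[i].name)
	j++;
      if (j > i + 1)
	for (std::size_t k = i; k < j; k++)
	  table[k].name += "#" + std::to_string (k - i);
      i = j;
    }
}
```

Given two volumes labeled DATA on drives 0 and 1 plus one volume labeled
BOOT, the sketch yields BOOT, DATA#0 (drive 0) and DATA#1 (drive 1).  Since
the label filter replaces '#', an original name should never already end in
a "#N" suffix, so the added suffixes cannot collide with existing names.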
