From: Christian Franke <Christian.Franke@t-online.de>
To: cygwin-patches@cygwin.com
Subject: Re: [PATCH] Cygwin: Add /dev/disk/by-label and /dev/disk/by-uuid symlinks
Date: Tue, 21 Nov 2023 19:31:40 +0100
Message-ID: <7d24b7f1-0dae-ad23-6bde-3502716edbad@t-online.de>
In-Reply-To: <ZVzLnADL0i2X3orL@calimero.vinschen.de>
[-- Attachment #1: Type: text/plain, Size: 2175 bytes --]
Corinna Vinschen wrote:
> Hi Christian,
>
> Looks good, but I just realized that I was already wondering about the
> sanitization and forgot to talk about it:
>
> On Nov 21 12:24, Christian Franke wrote:
>> diff --git a/winsup/cygwin/fhandler/dev_disk.cc b/winsup/cygwin/fhandler/dev_disk.cc
>> index c5d72816f..d12ac52fa 100644
>> --- a/winsup/cygwin/fhandler/dev_disk.cc
>> +++ b/winsup/cygwin/fhandler/dev_disk.cc
>> @@ -64,10 +64,12 @@ sanitize_label_string (WCHAR *s)
>> /* Linux does not skip leading spaces. */
>> return sanitize_string (s, L'\0', L' ', L'_', [] (WCHAR c) -> bool
>> {
>> - /* Labels may contain characters not allowed in filenames.
>> - Linux replaces spaces with \x20 which is not an option here. */
>> + /* Labels may contain characters not allowed in filenames. Also
> Apart from slash and backslash, we don't have this problem in Cygwin,
> usually. Even control characters are no problem. All chars not allowed
> in filenames are just transposed into the Unicode private use area, as
> per strfuncs.cc, line 20ff on the way to storage, and back when reading
> the names from storage. This, and especially in a virtual filesystem
> like /proc, there's no reason to avoid these characters.
Thanks for the clarification.
>
>> + replace '#' to avoid that duplicate markers introduce new
>> + duplicates. Linux replaces spaces with \x20 which is not an
>> + option here. */
>> return !((0 <= c && c <= L' ') || c == L':' || c == L'/' || c == L'\\'
>> - || c == L'"');
>> + || c == L'#' || c == L'"');
> If you really want to avoid chars not allowed in DOS filenames, the
> list seems incomplete, missing '<', '>', '?', '*', '|'.
>
> But as I said, there's really no reason for that. I simply reduced the
> above expression to
>
> return !(c == L'/' || c == L'\\' || c == L'#');
>
> and created a disk label
>
> test"foo*bar?baz:"
>
> It works nicely, including stuff like
>
> $ ls *\"*
> $ ls *\**
>
> So, I can push it as is, or we just allow everything and the kitchen sink
> as per the reduced filter expression. What do you prefer?
The latter - patch attached.
Christian
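For illustration, the reduced filter discussed above can be tried out in isolation. This is only a minimal sketch, not the code from dev_disk.cc: the names label_char_ok and sanitize_label_sketch are hypothetical, and the trimming of trailing spaces and the '_' replacement are assumptions inferred from the arguments passed to sanitize_string.

```cpp
#include <string>

// Hypothetical stand-in for the reduced filter: only '/', '\\' and '#'
// are rejected, every other character is kept as is.
static bool
label_char_ok (wchar_t c)
{
  return !(c == L'/' || c == L'\\' || c == L'#');
}

// Assumed behaviour (not taken from dev_disk.cc): strip trailing spaces,
// keep leading spaces, and replace each rejected character with '_'.
static std::wstring
sanitize_label_sketch (std::wstring s)
{
  while (!s.empty () && s.back () == L' ')
    s.pop_back ();
  for (wchar_t &c : s)
    if (!label_char_ok (c))
      c = L'_';
  return s;
}
```

Under these assumptions a label such as test"foo*bar?baz: passes through unchanged, while a/b\c#d becomes a_b_c_d.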
[-- Attachment #2: 0001-Cygwin-dev-disk-Append-N-if-the-same-name-appears-mo.patch --]
[-- Type: text/plain, Size: 4724 bytes --]
From ecc54356adbe7768bd5fd5561c78c67cd5725183 Mon Sep 17 00:00:00 2001
From: Christian Franke <christian.franke@t-online.de>
Date: Tue, 21 Nov 2023 19:28:02 +0100
Subject: [PATCH] Cygwin: /dev/disk: Append '#N' if the same name appears more
than once
No longer drop ranges of identical link names. Append '#0, #1, ...'
to each name instead. Enhance charset allowed in label names.
No longer ignore null volume serial numbers.
Signed-off-by: Christian Franke <christian.franke@t-online.de>
---
winsup/cygwin/fhandler/dev_disk.cc | 54 ++++++++++++++++++------------
1 file changed, 33 insertions(+), 21 deletions(-)
diff --git a/winsup/cygwin/fhandler/dev_disk.cc b/winsup/cygwin/fhandler/dev_disk.cc
index c5d72816f..29af9de95 100644
--- a/winsup/cygwin/fhandler/dev_disk.cc
+++ b/winsup/cygwin/fhandler/dev_disk.cc
@@ -64,10 +64,11 @@ sanitize_label_string (WCHAR *s)
/* Linux does not skip leading spaces. */
return sanitize_string (s, L'\0', L' ', L'_', [] (WCHAR c) -> bool
{
- /* Labels may contain characters not allowed in filenames.
- Linux replaces spaces with \x20 which is not an option here. */
- return !((0 <= c && c <= L' ') || c == L':' || c == L'/' || c == L'\\'
- || c == L'"');
+ /* Labels may contain characters not allowed in filenames. Also
+ replace '#' to avoid that duplicate markers introduce new
+ duplicates. Linux replaces spaces with \x20 which is not an
+ option here. */
+ return !(c == L'/' || c == L'\\' || c == L'#');
}
);
}
@@ -304,8 +305,7 @@ partition_to_label_or_uuid(bool uuid, const UNICODE_STRING *drive_uname,
const NTFS_VOLUME_DATA_BUFFER *nvdb =
reinterpret_cast<const NTFS_VOLUME_DATA_BUFFER *>(ioctl_buf);
if (uuid && DeviceIoControl (volhdl, FSCTL_GET_NTFS_VOLUME_DATA, nullptr, 0,
- ioctl_buf, NT_MAX_PATH, &bytes_read, nullptr)
- && nvdb->VolumeSerialNumber.QuadPart)
+ ioctl_buf, NT_MAX_PATH, &bytes_read, nullptr))
{
/* Print without any separator as on Linux. */
__small_sprintf (name, "%016X", nvdb->VolumeSerialNumber.QuadPart);
@@ -327,13 +327,9 @@ partition_to_label_or_uuid(bool uuid, const UNICODE_STRING *drive_uname,
FILE_FS_VOLUME_INFORMATION *ffvi =
reinterpret_cast<FILE_FS_VOLUME_INFORMATION *>(ioctl_buf);
if (uuid)
- {
- if (!ffvi->VolumeSerialNumber)
- return false;
- /* Print with separator as on Linux. */
- __small_sprintf (name, "%04x-%04x", ffvi->VolumeSerialNumber >> 16,
- ffvi->VolumeSerialNumber & 0xffff);
- }
+ /* Print with separator as on Linux. */
+ __small_sprintf (name, "%04x-%04x", ffvi->VolumeSerialNumber >> 16,
+ ffvi->VolumeSerialNumber & 0xffff);
else
{
/* Label is not null terminated. */
@@ -361,6 +357,20 @@ by_id_compare_name (const void *a, const void *b)
return strcmp (ap->name, bp->name);
}
+static int
+by_id_compare_name_drive_part (const void *a, const void *b)
+{
+ const by_id_entry *ap = reinterpret_cast<const by_id_entry *>(a);
+ const by_id_entry *bp = reinterpret_cast<const by_id_entry *>(b);
+ int cmp = strcmp (ap->name, bp->name);
+ if (cmp)
+ return cmp;
+ cmp = ap->drive - bp->drive;
+ if (cmp)
+ return cmp;
+ return ap->part - bp->part;
+}
+
static by_id_entry *
by_id_realloc (by_id_entry *p, size_t n)
{
@@ -610,8 +620,9 @@ get_by_id_table (by_id_entry * &table, fhandler_dev_disk::dev_disk_location loc)
if (!table)
return (errno_set ? -1 : 0);
- /* Sort by name and remove duplicates. */
- qsort (table, table_size, sizeof (*table), by_id_compare_name);
+ /* Sort by {name, drive, part} to ensure stable sort order. */
+ qsort (table, table_size, sizeof (*table), by_id_compare_name_drive_part);
+ /* Mark duplicate names. */
for (unsigned i = 0; i < table_size; i++)
{
unsigned j = i + 1;
@@ -619,12 +630,13 @@ get_by_id_table (by_id_entry * &table, fhandler_dev_disk::dev_disk_location loc)
j++;
if (j == i + 1)
continue;
- /* Duplicate(s) found, remove all entries with this name. */
- debug_printf ("removing duplicates %d-%d: '%s'", i, j - 1, table[i].name);
- if (j < table_size)
- memmove (table + i, table + j, (table_size - j) * sizeof (*table));
- table_size -= j - i;
- i--;
+ /* Duplicate(s) found, append "#N" to all entries. This never
+ introduces new duplicates because '#' never occurs in the
+ original names. */
+ debug_printf ("mark duplicates %u-%u of '%s'", i, j - 1, table[i].name);
+ size_t len = strlen (table[i].name);
+ for (unsigned k = i; k < j; k++)
+ __small_sprintf (table[k].name + len, "#%u", k - i);
}
debug_printf ("table_size: %d", table_size);
--
2.42.1
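
The duplicate handling in the patch above can also be sketched outside of Cygwin. The following is a simplified model under stated assumptions, not the patch itself: entry_sketch and mark_duplicates_sketch are hypothetical names, std::string and std::sort stand in for the fixed-size name buffer and qsort, and it relies on '#' never occurring in the sanitized input names.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <tuple>
#include <vector>

// Simplified stand-in for by_id_entry (assumption: only these three
// fields matter for ordering and marking).
struct entry_sketch
{
  std::string name;
  int drive;
  int part;
};

// Sort by {name, drive, part} so that equal names always end up in the
// same order, then append "#0", "#1", ... to every run of equal names.
static void
mark_duplicates_sketch (std::vector<entry_sketch> &table)
{
  std::sort (table.begin (), table.end (),
	     [] (const entry_sketch &a, const entry_sketch &b)
	     {
	       return std::tie (a.name, a.drive, a.part)
		      < std::tie (b.name, b.drive, b.part);
	     });
  for (std::size_t i = 0; i < table.size (); )
    {
      // Find the end of the run of entries sharing table[i].name.
      std::size_t j = i + 1;
      while (j < table.size () && table[j].name == table[i].name)
	j++;
      // Unique names stay untouched; runs get a "#N" suffix.
      if (j > i + 1)
	for (std::size_t k = i; k < j; k++)
	  table[k].name += "#" + std::to_string (k - i);
      i = j;
    }
}
```

Because the secondary sort keys are drive and partition number, the suffix assigned to a given volume does not depend on the order in which the entries were collected.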