public inbox for cygwin-patches@cygwin.com
From: Thomas Wolff <towo@towo.net>
To: cygwin-patches@cygwin.com
Subject: Re: /dev/clipboard pasting with small read() buffer
Date: Thu, 16 Aug 2012 14:21:00 -0000
Message-ID: <502D0199.6040203@towo.net>
In-Reply-To: <20120816123033.GH17546@calimero.vinschen.de>

[-- Attachment #1: Type: text/plain, Size: 2305 bytes --]

On 16.08.2012 14:30, Corinna Vinschen wrote:
> On Aug 16 14:11, Thomas Wolff wrote:
>> Hi Corinna,
>>
>> On 16.08.2012 11:33, Corinna Vinschen wrote:
>>> Hi Thomas,
>>>
>>> thanks for the patch.   I have a few minor nits:
>>>
>>> On Aug 14 22:56, Thomas Wolff wrote:
>>> ... 
>>>> +	  char cprabuf [8 + 1];	/* need this length for surrogates */
>>>> +	  if (len < 8)
>>>> +	    {
>>>> +	      _ptr = cprabuf;
>>>> +	      _len = 8;
>>>> +	    }
>>> 8?  Why 8?  The size appears to be rather artificial.  The code should
>>> use MB_CUR_MAX instead.
>> MB_CUR_MAX does not work because its value is 1 at this point
> So what about MB_LEN_MAX then?  There's no problem using a multiplier,
> but a symbolic constant is always better than a numerical constant.
I've now used _MB_LEN_MAX from newlib.h rather than MB_LEN_MAX from 
limits.h (note the "_" distinction :) ), because the latter, according 
to the comment preceding its definition, may be turned into a dynamic 
function in the future and could then have the same problem as 
MB_CUR_MAX.
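
To illustrate the distinction, here is a minimal sketch in standard C (not the Cygwin code itself). It assumes only <limits.h>, <locale.h> and <stdlib.h>; the helper names cur_max_in and encode_one are hypothetical. MB_CUR_MAX is evaluated at run time and depends on the current locale, so it may be as small as 1, whereas MB_LEN_MAX is a compile-time constant that can size an array.

```c
#include <limits.h>
#include <locale.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: switch LC_CTYPE to the named locale and report
   MB_CUR_MAX there.  Returns 0 if the locale is unavailable.
   Note: this changes the process-wide locale. */
static size_t cur_max_in(const char *name)
{
  if (!setlocale(LC_CTYPE, name))
    return 0;
  return MB_CUR_MAX;   /* run-time value, depends on the locale */
}

/* Hypothetical helper: convert one wide character into a buffer whose
   size is fixed at compile time by MB_LEN_MAX.  Returns the number of
   bytes written, or -1 if wc cannot be represented. */
static int encode_one(wchar_t wc, char out[MB_LEN_MAX])
{
  return wctomb(out, wc);
}
```

Sizing the local buffer with the constant keeps it large enough for any encoding, regardless of which locale is active when the read happens.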

About the surrogates problem, I think I've found a solution:
I've added an explicit test to the count-down loop so that a surrogate 
pair is never split between two conversions; this seems to work now.
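
The idea can be sketched outside of Cygwin as follows. This is only an illustration under stated assumptions: utf8_len, fit_chunk and sample are hypothetical names, the byte count is computed by hand for UTF-8 instead of calling sys_wcstombs, and a lone surrogate is assumed to take 3 bytes. The end index is exclusive, so the last unit of the chunk is buf[end - 1].

```c
#include <stddef.h>
#include <stdint.h>

/* Sample UTF-16 text: 'a', U+1F600 (as a surrogate pair), 'b'. */
static const uint16_t sample[] = { 'a', 0xD83D, 0xDE00, 'b' };

/* UTF-8 bytes needed for the UTF-16 units in [start, end).
   A complete surrogate pair counts as 4 bytes; a lone surrogate is
   counted as 3 bytes (an assumption of this sketch). */
static size_t utf8_len(const uint16_t *buf, size_t start, size_t end)
{
  size_t n = 0;
  for (size_t i = start; i < end; i++)
    {
      uint16_t u = buf[i];
      if (u < 0x80)
        n += 1;
      else if (u < 0x800)
        n += 2;
      else if ((u & 0xFC00) == 0xD800 && i + 1 < end
               && (buf[i + 1] & 0xFC00) == 0xDC00)
        {
          n += 4;
          i++;          /* skip the low surrogate */
        }
      else
        n += 3;
    }
  return n;
}

/* Count the end index down until the chunk fits into max_bytes and
   does not end in a high surrogate.  A chunk consisting of a single
   unit is accepted even if that unit is a high surrogate, so that an
   unpaired surrogate cannot stall progress. */
static size_t fit_chunk(const uint16_t *buf, size_t start, size_t end,
                        size_t max_bytes)
{
  while (end > start
         && (utf8_len(buf, start, end) > max_bytes
             || (end - start > 1 && (buf[end - 1] & 0xFC00) == 0xD800)))
    --end;
  return end;
}
```

With the sample above and room for only 4 bytes, fit_chunk(sample, 0, 4, 4) stops after 'a' instead of cutting the pair in half; the next call starting at index 1 then delivers the whole pair.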

>>>> +	      /* If using read-ahead buffer, copy to class read-ahead buffer
>>>> +	         and deliver first byte. */
>>>> +	      if (_ptr == cprabuf)
>>>> +		{
>>>> +		  puts_readahead (cprabuf, ret);
>>>> +		  * (char *) ptr = get_readahead ();
>>>> +		  ret = 1;
>>> (*) Ok, that works, but wouldn't it be more efficient to do that in
>>> a tiny loop along the lines of
>>>
>>> 		  int x;
>>> 		  ret = 0;
>>>                    while (ret < len && (x = get_readahead ()) >= 0)
>>> 		    ptr++ = x;
>>> 		    ret++;
>>>
>>> ?
>> I can add it if you prefer; I just didn't think it's worth the
>> effort and concerning efficiency, after that prior trial-and-error
>> count-down-loop...
> Yeah, that's a valid point.  But maybe we shouldn't make it slower
> than necessary?  If you have a good idea how to avoid the other
> loop, don't hesitate to submit a patch.
Added the loop to use up the caller's buffer.
About avoiding the trial-and-error loop, I think that would require 
digging into sys_mbstowcs (which doesn't even seem to behave as documented).
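
For readers unfamiliar with the read-ahead mechanism, the loop that uses up the caller's buffer can be pictured with this self-contained sketch. The struct and the functions ra_put, ra_get and ra_drain are hypothetical stand-ins for the class read-ahead buffer and its puts_readahead/get_readahead methods, not the real fhandler code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the handler's read-ahead buffer. */
struct readahead
{
  unsigned char data[16];
  size_t head;   /* index of the next byte to deliver */
  size_t tail;   /* one past the last stored byte */
};

/* Replace the buffer contents with n bytes from s (clamped to capacity). */
static void ra_put(struct readahead *ra, const char *s, size_t n)
{
  if (n > sizeof ra->data)
    n = sizeof ra->data;
  memcpy(ra->data, s, n);
  ra->head = 0;
  ra->tail = n;
}

/* Return the next stored byte, or -1 if the buffer is empty. */
static int ra_get(struct readahead *ra)
{
  return ra->head < ra->tail ? ra->data[ra->head++] : -1;
}

/* Copy up to len bytes from the read-ahead buffer into the caller's
   buffer and return how many were delivered.  The length check comes
   first so that no byte is consumed when the caller's buffer is full. */
static size_t ra_drain(struct readahead *ra, void *ptr, size_t len)
{
  char *out = (char *) ptr;
  size_t ret = 0;
  int ch;
  while (ret < len && (ch = ra_get(ra)) >= 0)
    {
      *out++ = (char) ch;
      ret++;
    }
  return ret;
}
```

A 4-byte UTF-8 character stored with ra_put is then handed out over successive calls, for example two bytes at a time to a caller whose buffer holds only two bytes.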

------
Thomas

[-- Attachment #2: clipboard-small-buffer.patch.3 --]
[-- Type: text/plain, Size: 2985 bytes --]

--- sav/fhandler_clipboard.cc	2012-07-08 02:36:47.000000000 +0200
+++ ./fhandler_clipboard.cc	2012-08-16 16:08:23.782692300 +0200
@@ -222,6 +222,7 @@ fhandler_dev_clipboard::read (void *ptr,
   UINT formatlist[2];
   int format;
   LPVOID cb_data;
+  int rach;
 
   if (!OpenClipboard (NULL))
     {
@@ -243,12 +244,24 @@ fhandler_dev_clipboard::read (void *ptr,
       cygcb_t *clipbuf = (cygcb_t *) cb_data;
 
       if (pos < clipbuf->len)
-      	{
+	{
 	  ret = ((len > (clipbuf->len - pos)) ? (clipbuf->len - pos) : len);
 	  memcpy (ptr, clipbuf->data + pos , ret);
 	  pos += ret;
 	}
     }
+  else if ((rach = get_readahead ()) >= 0)
+    {
+      /* Deliver from read-ahead buffer. */
+      char * out_ptr = (char *) ptr;
+      * out_ptr++ = rach;
+      ret = 1;
+      while (ret < len && (rach = get_readahead ()) >= 0)
+	{
+	  * out_ptr++ = rach;
+	  ret++;
+	}
+    }
   else
     {
       wchar_t *buf = (wchar_t *) cb_data;
@@ -256,25 +269,54 @@ fhandler_dev_clipboard::read (void *ptr,
       size_t glen = GlobalSize (hglb) / sizeof (WCHAR) - 1;
       if (pos < glen)
 	{
+	  /* If caller's buffer is too small to hold at least one 
+	     max-size character, redirect algorithm to local 
+	     read-ahead buffer, finally fill class read-ahead buffer 
+	     with result and feed caller from there. */
+	  char * conv_ptr = (char *) ptr;
+	  size_t conv_len = len;
+#define cprabuf_len _MB_LEN_MAX	/* newlib's max MB_CUR_MAX of all encodings */
+	  char cprabuf [cprabuf_len];
+	  if (len < cprabuf_len)
+	    {
+	      conv_ptr = cprabuf;
+	      conv_len = cprabuf_len;
+	    }
+
 	  /* Comparing apples and oranges here, but the below loop could become
 	     extremly slow otherwise.  We rather return a few bytes less than
 	     possible instead of being even more slow than usual... */
-	  if (glen > pos + len)
-	    glen = pos + len;
+	  if (glen > pos + conv_len)
+	    glen = pos + conv_len;
 	  /* This loop is necessary because the number of bytes returned by
 	     sys_wcstombs does not indicate the number of wide chars used for
 	     it, so we could potentially drop wide chars. */
 	  while ((ret = sys_wcstombs (NULL, 0, buf + pos, glen - pos))
 		  != (size_t) -1
-		 && ret > len)
+		 && (ret > conv_len 
+			/* Skip separated high surrogate: */
+		     || ((buf [glen - 1] & 0xFC00) == 0xD800 && glen - pos > 1)))
 	     --glen;
 	  if (ret == (size_t) -1)
 	    ret = 0;
 	  else
 	    {
-	      ret = sys_wcstombs ((char *) ptr, (size_t) -1,
+	      ret = sys_wcstombs ((char *) conv_ptr, (size_t) -1,
 				  buf + pos, glen - pos);
 	      pos = glen;
+	      /* If using read-ahead buffer, copy to class read-ahead buffer
+	         and deliver first byte. */
+	      if (conv_ptr == cprabuf)
+		{
+		  puts_readahead (cprabuf, ret);
+		  char * out_ptr = (char *) ptr;
+		  ret = 0;
+		  while (ret < len && (rach = get_readahead ()) >= 0)
+		    {
+		      * out_ptr++ = rach;
+		      ret++;
+		    }
+		}
 	    }
 	}
     }


Thread overview: 10+ messages
2012-08-14 20:56 Thomas Wolff
2012-08-16  9:34 ` Corinna Vinschen
2012-08-16 12:12   ` Thomas Wolff
2012-08-16 12:31     ` Corinna Vinschen
2012-08-16 14:21       ` Thomas Wolff [this message]
2012-08-16 15:24         ` Eric Blake
2012-08-16 16:23           ` Corinna Vinschen
2012-08-17  8:44             ` Thomas Wolff
2012-08-17  9:23               ` Corinna Vinschen
2012-08-17 13:05                 ` Thomas Wolff
