On 16.08.2012 14:30, Corinna Vinschen wrote: > On Aug 16 14:11, Thomas Wolff wrote: >> Hi Corinna, >> >> On 16.08.2012 11:33, Corinna Vinschen wrote: >>> Hi Thomas, >>> >>> thanks for the patch. I have a few minor nits: >>> >>> On Aug 14 22:56, Thomas Wolff wrote: >>> ... >>>> + char cprabuf [8 + 1]; /* need this length for surrogates */ >>>> + if (len < 8) >>>> + { >>>> + _ptr = cprabuf; >>>> + _len = 8; >>>> + } >>> 8? Why 8? The size appears to be rather artificial. The code should >>> use MB_CUR_MAX instead. >> MB_CUR_MAX does not work because its value is 1 at this point > So what about MB_LEN_MAX then? There's no problem using a multiplier, > but a symbolic constant is always better than a numerical constant. I've now used _MB_LEN_MAX from newlib.h, rather than MB_LEN_MAX from limits.h (note the "_" distinction :) ), because the latter, by its preceding comment, reserves the option to be changed into a dynamic function in the future, which could then possibly have the same problems as MB_CUR_MAX. About the surrogates problem, I think I've found a solution: I've added an explicit test to avoid processing of split surrogate pairs (to that loop...); this seems to work now. >>>> + /* If using read-ahead buffer, copy to class read-ahead buffer >>>> + and deliver first byte. */ >>>> + if (_ptr == cprabuf) >>>> + { >>>> + puts_readahead (cprabuf, ret); >>>> + * (char *) ptr = get_readahead (); >>>> + ret = 1; >>> (*) Ok, that works, but wouldn't it be more efficient to do that in >>> a tiny loop along the lines of >>> >>> int x; >>> ret = 0; >>> while (ret < len && (x = get_readahead ()) >= 0) >>> ptr++ = x; >>> ret++; >>> >>> ? >> I can add it if you prefer; I just didn't think it's worth the >> effort and concerning efficiency, after that prior trial-and-error >> count-down-loop... > Yeah, that's a valid point. But maybe we shouldn't make it slower > than necessary? If you have a good idea how to avoid the other > loop, don't hesitate to submit a patch. Added the loop to use up the caller's buffer. About avoiding the trial-and-error loop, I think that would require digging into sys_mbstowcs (which doesn't even seem to behave as documented). ------ Thomas