[Bese-devel] Re: character issues. aka: http is a binary protocol, get over it.

Pascal Bourguignon pjb at informatimago.com
Thu Dec 15 20:32:14 UTC 2005


Maciek Pasternacki writes:
> What is wrong with my solution of treating the stream as an
> iso-8859-1-encoded string (which is completely equivalent to a
> binary stream), and recoding it when I expect the text to be in
> another charset?

For one thing, on most CL implementations, characters take more
(much more) than one byte of space each, even when they are all
iso-8859-1 characters.  For another, you convert three times when
only one conversion, or none, is needed:
             octet->iso-8859-1->octet->actual encoding 
instead of:  octet->actual encoding  ; for text 
or just:     octet                   ; for binary data
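
To make the three routes above concrete, here is a minimal sketch.  It
is written in Python purely for illustration (the thread itself is
about Common Lisp), and the payload, the choice of UTF-8 as the
"actual encoding", and all variable names are assumptions, not
anything taken from the code under discussion.

```python
# Hypothetical payload: the UTF-8 octets of a Polish word, standing in
# for bytes read off the wire.  UTF-8 as the "actual encoding" is an
# assumption made for this example.
raw = "za\u017c\u00f3\u0142\u0107".encode("utf-8")

# Route 1: go through iso-8859-1 text first, recode afterwards.
as_latin1 = raw.decode("iso-8859-1")             # octet -> iso-8859-1
back_to_octets = as_latin1.encode("iso-8859-1")  # iso-8859-1 -> octet
text_via_latin1 = back_to_octets.decode("utf-8") # octet -> actual encoding

# Route 2: decode the octets once, straight to the actual encoding.
text_direct = raw.decode("utf-8")                # octet -> actual encoding

# Route 3: binary data needs no conversion at all.
binary = raw

# Both text routes end up with the same string; route 1 only adds
# two conversions that cancel each other out.
same_result = text_via_latin1 == text_direct
```

The iso-8859-1 round trip is lossless, which is why the first route
works at all; the point is only that its first two steps undo each
other before the one conversion that matters.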


> On one hand it's a kind of hack; OTOH we work along the RFCs with
> Latin-1 text, as the RFCs state (and which is easier to debug than
> parsing byte arrays), and after parsing, after all protocol-related
> work, we re-encode the Latin-1 text to the encoding we expect (or
> decode it to byte arrays).  All encoding issues are handled where
> they won't make trouble, and only once they actually become
> relevant.  Likewise when encoding a reply to send out: the app works
> with Unicode text, and once it is encoded in any way, it is recoded
> to transparent Latin-1 so as not to bother the RFC-related code.

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
You're always typing.
Well, let's see you ignore my
sitting on your hands.


