From: Eugene Zaikonnikov
Subject: Reading from a mixed bytestream
Date: 
Message-ID: <37E265C4.EEFE195D@cit.org.by>
Suppose we have a bytestream which contains both signed and unsigned
8-bit bytes mixed together. The question is: how can the bytes be read
from the stream correctly with minimal overhead? The only thing that has
come to my mind so far is to open a stream of element-type
(unsigned-byte 8) and to call something like this on the bytes we know
to be signed:

(defun yield-signed (b)
  (if (zerop (logand #b10000000 b))
      b
      (- (+ 257 (lognot b)))))

Could it be done in a more elegant way? I'm not advocating C, but there
it could be done by a simple type cast.

Thanks,
   Eugene.

P.S. The function above compiles to 140 machine instructions in the ACL
Trial edition for Linux.

From: Roger Corman
Subject: Re: Reading from a mixed bytestream
Date: 
Message-ID: <37e281e9.851846690@nntp.best.com>
On Fri, 17 Sep 1999 19:01:08 +0300, Eugene Zaikonnikov
<······@cit.org.by> wrote:

>Suppose we have a bytestream which contains both signed and unsigned
>8-bit bytes mixed. The question is: how to read bytes from the stream
>correctly with minimal overhead? The only thing that came to my mind so
>far is to open a stream of element-type unsigned-byte and to call
>something like this on bytes we know to be signed:
>
>(defun yield-signed (b)
>	(if (zerop (logand #b10000000 b))
>	    b
>	  (- (+ 257 (lognot  b)))))
>
>Could it be done in a more elegant way? I'm not advocating C, but there
>it could be done by a simple type cast.

How about:

(defun yield-signed (b) (if (> b 127) (- b 256) b))
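
As a quick sanity check, here is a minimal sketch that compares this version with the original lognot-based one over all 256 octet values. The names yield-signed-lognot and versions-agree-p are hypothetical, introduced only so that both definitions can coexist in one image.

```lisp
;; The original formulation from the first post, under a hypothetical
;; name so it does not clash with the shorter version.
(defun yield-signed-lognot (b)
  (if (zerop (logand #b10000000 b))
      b
      (- (+ 257 (lognot b)))))

;; The comparison-based version suggested in this reply.
(defun yield-signed (b)
  (if (> b 127) (- b 256) b))

;; Hypothetical helper: true when both versions agree on every octet.
(defun versions-agree-p ()
  (loop for b from 0 to 255
        always (= (yield-signed b) (yield-signed-lognot b))))
```

Both reduce to the same arithmetic: for b above 127, -(257 + (-b - 1)) is just b - 256.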


Roger Corman
From: Eugene Zaikonnikov
Subject: Re: Reading from a mixed bytestream
Date: 
Message-ID: <37E35BD2.38E5CAA3@cit.org.by>
Roger Corman wrote:
> 
> >
> >(defun yield-signed (b)
> >       (if (zerop (logand #b10000000 b))
> >           b
> >         (- (+ 257 (lognot  b)))))
> >
> >Could it be done in a more elegant way? I'm not advocating C, but there
> >it could be done by a simple type cast.
> 
> How about:
> 
> (defun yield-signed (b) (if (> b 127) (- b 256) b))
> 
Yup, it's better. I must admit that it is more elegant and efficient:
with optimizations on it takes 18 instructions against 52 for my code.
But what I really asked was whether I am doomed to do bit-bashing. It
seemed so, but I was not sure.

--
 Eugene.
From: Marco Antoniotti
Subject: Re: Reading from a mixed bytestream
Date: 
Message-ID: <lwwvtkr5ar.fsf@copernico.parades.rm.cnr.it>
Eugene Zaikonnikov <······@cit.org.by> writes:

> Roger Corman wrote:
> > 
> > >
> > >(defun yield-signed (b)
> > >       (if (zerop (logand #b10000000 b))
> > >           b
> > >         (- (+ 257 (lognot  b)))))
> > >
> > >Could it be done in a more elegant way? I'm not advocating C, but there
> > >it could be done by a simple type cast.
> > 
> > How about:
> > 
> > (defun yield-signed (b) (if (> b 127) (- b 256) b))
> > 
> Yup, it's better. I should admit that it is more elegant and efficient:
> it takes 18 instructions with optimizations on against 52 of my code.
> But what I really asked was if I am doomed to do bit-bashing? It seemed so,
> but I was not sure.

Looks to me that you are doomed to do bit bashing no matter what
language you use....

Cheers


-- 
Marco Antoniotti ===========================================
PARADES, Via San Pantaleo 66, I-00186 Rome, ITALY
tel. +39 - 06 68 10 03 17, fax. +39 - 06 68 80 79 26
http://www.parades.rm.cnr.it/~marcoxa
From: Roger Corman
Subject: Re: Reading from a mixed bytestream
Date: 
Message-ID: <37e91800.1282550530@nntp.best.com>
On 21 Sep 1999 11:18:36 +0200, Marco Antoniotti
<·······@copernico.parades.rm.cnr.it> wrote:

>
>Eugene Zaikonnikov <······@cit.org.by> writes:
>
>> Roger Corman wrote:
>> > 
>> > >
>> > >(defun yield-signed (b)
>> > >       (if (zerop (logand #b10000000 b))
>> > >           b
>> > >         (- (+ 257 (lognot  b)))))
>> > >
>> > >Could it be done in a more elegant way? I'm not advocating C, but there
>> > >it could be done by a simple type cast.
>> > 
>> > How about:
>> > 
>> > (defun yield-signed (b) (if (> b 127) (- b 256) b))
>> > 
>> Yup, it's better. I should admit that it is more elegant and efficient:
>> it takes 18 instructions with optimizations on against 52 of my code.
>> But what I really asked was if I am doomed to do bit-bashing? It seemed so,
>> but I was not sure.
>
>Looks to me that you are doomed to do bit bashing no matter what
>language you use....
>

Well, I think Eugene's point was that in C or C++ the compiler can
just reinterpret the unsigned byte as a signed byte without modifying
the data ("bit bashing"). This is true and of course there is no
overhead at run time. Lisp represents the signed and unsigned bytes
differently (i.e. 255 and -1) and so you will always have to do some
runtime bit bashing to turn one into the other. However, it is quite
inexpensive, only a few instructions if inlined.
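
A minimal sketch of what "inlined" could look like, assuming the stream was opened with :element-type '(unsigned-byte 8). The read-signed-byte helper is a hypothetical name used for illustration, and how few instructions result depends on the implementation and its optimization settings.

```lisp
;; Ask the compiler to inline the conversion at each call site.
(declaim (inline yield-signed))
(defun yield-signed (b)
  (declare (type (unsigned-byte 8) b)
           (optimize (speed 3)))
  ;; 0..127 stay as they are; 128..255 map to -128..-1.
  (the (signed-byte 8) (if (> b 127) (- b 256) b)))

;; Hypothetical helper: read one octet from STREAM (assumed to have
;; element type (unsigned-byte 8)) and reinterpret it as signed.
;; Like READ-BYTE itself, it signals an error at end of file.
(defun read-signed-byte (stream)
  (yield-signed (read-byte stream)))
```

Bytes known to be unsigned would be read with plain read-byte, and the signed ones with the helper.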

Roger Corman
From: Marco Antoniotti
Subject: Re: Reading from a mixed bytestream
Date: 
Message-ID: <lwso46gdmj.fsf@copernico.parades.rm.cnr.it>
·····@xippix.com (Roger Corman) writes:

> >Looks to me that you are doomed to do bit bashing no matter what
> >language you use....
> >
> 
> Well, I think Eugene's point was that in C or C++ the compiler can
> just reinterpret the unsigned byte as a signed byte without modifying
> the data ("bit bashing"). This is true and of course there is no
> overhead at run time. Lisp represents the signed and unsigned bytes
> differently (i.e. 255 and -1) and so you will always have to do some
> runtime bit bashing to turn one into the other. However, it is quite
> inexpensive, only a few instructions if inlined.

All right, got it.

Marco

From: Eugene Zaikonnikov
Subject: Re: Reading from a mixed bytestream
Date: 
Message-ID: <938091823.142328@lxms.cit.org.by>
Roger Corman <·····@xippix.com> wrote in message
························@nntp.best.com...
> Well, I think Eugene's point was that in C or C++ the compiler can
> just reinterpret the unsigned byte as a signed byte without modifying
> the data ("bit bashing"). This is true and of course there is no
> overhead at run time. Lisp represents the signed and unsigned bytes
> differently (i.e. 255 and -1) and so you will always have to do some
> runtime bit bashing to turn one into the other. However, it is quite
> inexpensive, only a few instructions if inlined.

Exactly. I fully understand that the overhead of a few simple instructions
is nothing in the context of file operations, and that the absence of a
transparent byte 'conversion' in CL is the price it pays for abstract numeric
types. But since it usually takes a cleaner construction to express an idiom
in Lisp than in C, my first thought was "perhaps I've missed something
obvious". In such situations I usually just have to look in the HyperSpec to
find what I need, but this time that did not help, hence my original post.

Cheers,
    Eugene.
From: Pekka P. Pirinen
Subject: Re: Reading from a mixed bytestream
Date: 
Message-ID: <ixemfo5oc1.fsf@gaspode.cam.harlequin.co.uk>
·····@xippix.com (Roger Corman) writes:
> On 21 Sep 1999 11:18:36 +0200, Marco Antoniotti
> <·······@copernico.parades.rm.cnr.it> wrote:
> >Eugene Zaikonnikov <······@cit.org.by> writes:
> >> Roger Corman wrote:
> >> > (defun yield-signed (b) (if (> b 127) (- b 256) b))
> >> > 
> >> Yup, it's better. I should admit that it is more elegant and efficient:
> >> it takes 18 instructions with optimizations on against 52 of my code.
> >> But what I really asked was if I am doomed to do bit-bashing? It seemed so,
> >> but I was not sure.
> >
> >Looks to me that you are doomed to do bit bashing no matter what
> >language you use....
> 
> Well, I think Eugene's point was that in C or C++ the compiler can
> just reinterpret the unsigned byte as a signed byte without modifying
> the data ("bit bashing"). This is true and of course there is no
> overhead at run time.

Yes, although if you perform any arithmetic on that signed byte (and
if you don't, why would the signedness matter?), the C compiler might
well sign-extend it to a word first, and not necessarily in an
efficient way.  Casting all your constants to the right type helps.

> Lisp represents the signed and unsigned bytes
> differently (i.e. 255 and -1) and so you will always have to do some
> runtime bit bashing to turn one into the other.

Yes, and Lisp lacks a native way to express sign-extension, so it is
probably more efficient to do it the other way 'round: read all the
bytes as signed and coerce to unsigned by taking the low byte.

CL-USER 19> (declaim (optimize (safety 0) (speed 3) (debug 0)))
nil
CL-USER 20> (disassemble (defun yield-unsigned (x)
                           (declare (fixnum x))
                           (the fixnum (logand x 255))))
.L00:   mov     1, %g7
4       and     %o0, 3fc, %o0
8       jmp     [%o7 + 8]
C       ld      [%o4 + 6], %o4
4
nil

Four instructions, and three of those are function call overhead.
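
For reference, a portable sketch of the same idea with a tighter type declaration, assuming the stream is opened with :element-type '(signed-byte 8). It uses ldb, which extracts the same low byte as (logand x 255). The second function, sign-extend-octet, is a hypothetical name for the opposite direction, written without a branch.

```lisp
;; Signed -> unsigned: keep the low 8 bits of the two's-complement value.
(defun yield-unsigned (x)
  (declare (type (signed-byte 8) x))
  (ldb (byte 8 0) x))

;; Unsigned -> signed without a conditional: flipping the top bit and
;; subtracting 128 has the same effect as sign-extending the octet.
(defun sign-extend-octet (b)
  (declare (type (unsigned-byte 8) b))
  (- (logxor b #x80) #x80))
```

The two functions are inverses of each other over the 8-bit range, so either stream element type can serve as the base representation.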
-- 
Pekka P. Pirinen
Adaptive Memory Management Group, Harlequin Limited
If it's spam, it's a scam.  Don't do business with Net abusers.