From: Alan Manuel K. Gloria
Subject: Bits from a stream
Date: 
Message-ID: <1142858230.903896.135640@j33g2000cwa.googlegroups.com>
I'm currently trying to figure out how bits work in CL.  From what I've
dredged through the Hyperspec, bits are integers/unsigned bytes of
width 1.  I would like to know how I should handle them - do I handle
them as if they were ordinary integers, just having a significantly
reduced range?  Googling "bits common lisp" and "bit common lisp"
doesn't seem to give me useful information ("32 bit and 64 bit versions
of Common Lisp"...).

Basically I need to do some compression on some data I'm storing - the
way I keep my data is rather amenable to compression, but compressing
it requires access to individual bits.  I would like to store and
retrieve data from a file, so I will need access to bits in a stream.

So, first I'd like to know if there already exists such a function.
Basically I have a bit of lisp-pseudocode for such a hypothetical
function:
(let
    (
        (bitn 0)
        (byte 0))
    (defun read-bit-reset ()
        (setf bitn (setf byte 0)) )
    (defun read-bit (stream)
        (when (zerop bitn)
            (setf bitn 8)
            (setf byte (read-byte stream)) )
        (prog1 (logical-and byte 1) #|erk.  what's the logical
(bitwise) and?|#
            (setf byte (right-shift byte 1)) #|la la la la la... what's
shift?|#
            (decf bitn) )))

Okay, so maybe I'll dredge through the hyperspec and figure out what
names CL gives to bitwise or/and and bit shifting.  But first, is there
already such a function?  Thanks!

From: sross
Subject: Re: Bits from a stream
Date: 
Message-ID: <1142860317.613478.262750@u72g2000cwu.googlegroups.com>
Take a look at logand
http://www.lispworks.com/documentation/HyperSpec/Body/f_logand.htm
and ash http://www.lispworks.com/documentation/HyperSpec/Body/f_ash.htm

Cheers,
  Sean.
From: Alan Manuel K. Gloria
Subject: Re: Bits from a stream
Date: 
Message-ID: <1142862026.629170.255270@g10g2000cwb.googlegroups.com>
Thanks!  I wasn't able to find ash at all, but I did find ldb.  This is
what I ended up writing:

(let
	(
		(bitn 8)
		(byt 0) )
	(defun read-bit-reset ()
		(setf bitn 8) )
	(defun read-bit (stream)
		(when (eq 8 bitn)
			(setf bitn 0)
			(setf byt (read-byte stream)) )
		(prog1
			(ldb (byte 1 bitn) byt)
			(incf bitn) )))

The hyperspec mentions a function 'dpb.  Can I assume that the (setf
ldb) is dpb?

Also, am I right in assuming that a bit is just a really really small
integer?  Basically, any worries I have about bits would have to be
deferred until I decide I need to 'declare types?
From: Pascal Bourguignon
Subject: Re: Bits from a stream
Date: 
Message-ID: <87slpd5i3w.fsf@thalassa.informatimago.com>
"Alan Manuel K. Gloria" <········@gmail.com> writes:

> Thanks!  I wasn't able to find ash at all, but I did find ldb.  This is
> what I ended up writing:
>
> (let
> 	(
> 		(bitn 8)
> 		(byt 0) )
> 	(defun read-bit-reset ()
> 		(setf bitn 8) )
> 	(defun read-bit (stream)
> 		(when (eq 8 bitn)
> 			(setf bitn 0)
> 			(setf byt (read-byte stream)) )
> 		(prog1
> 			(ldb (byte 1 bitn) byt)
> 			(incf bitn) )))

(eq 8 8) may return NIL.
(eq 1000000000 1000000000) DO return NIL on most implementations.

Use = for numbers.

Otherwise, your READ-BIT function is wrong.  You've not proved it,
it's clear, but did you even try it?  



> The hyperspec mentions a function 'dpb.  Can I assume that the (setf
> ldb) is dpb?

Not exactly. DPB is a pure function.  SETF modifies the place.
But otherwise, yes:
    (setf (ldb (byte w o) x) v) === (setf x (dpb v (byte w o) x))



> Also, am I right in assuming that a bit is just a really really small
> integer?  

Any doubt?  Ask it!  


[314]> (type-of 1)
BIT
[315]> (subtypep (type-of 1) 'integer)
T ;
T
[316]> (subtypep (type-of 1) 'fixnum)
T ;
T


-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

PLEASE NOTE: Some quantum physics theories suggest that when the
consumer is not directly observing this product, it may cease to
exist or will exist only in a vague and undetermined state.
From: Zach Beane
Subject: Re: Bits from a stream
Date: 
Message-ID: <m3oe012qi4.fsf@unnamed.xach.com>
"Alan Manuel K. Gloria" <········@gmail.com> writes:

> I'm currently trying to figure out how bits work in CL.

You can access the bits of an integer with functions like LDB and
LOGAND. You can produce integers with modified bits with functions
like DPB and LOGIOR. You can perform bitwise shifts with ASH. You can
test individual bits with LOGBITP or groups of bits with LOGTEST.

There are many more functions for accessing and operating on the bits
of an integer. See the numbers dictionary for a list:

   http://www.lispworks.com/documentation/HyperSpec/Body/c_number.htm

> Basically I need to do some compression on some data I'm storing - the
> way I keep my data is rather amenable to compression, but compressing
> it requires access to individual bits.  I would like to store and
> retrieve data from a file, so I will need access to bits in a stream.

I have written a few bitstream functions for a Flash file
reader/writer. They are available here:

   http://www.xach.com/lisp/cl-flash/bitio.lisp

It will allow you to both read and write variable-length contiguous
bits across octet boundaries in an (UNSIGNED-BYTE 8) stream. There
isn't much to it.

> Okay, so maybe I'll dredge through the hyperspec and figure out what
> names CL gives to bitwise or/and and bit shifting.

This is a very good idea in general.

Zach
From: Alan Manuel K. Gloria
Subject: Re: Bits from a stream
Date: 
Message-ID: <1142864860.784393.192310@z34g2000cwc.googlegroups.com>
Thanks!  I'm looking at your bitio.lisp file now.
From: Peter Seibel
Subject: Re: Bits from a stream
Date: 
Message-ID: <m2hd5t2ptr.fsf@gigamonkeys.com>
Zach Beane <····@xach.com> writes:

> "Alan Manuel K. Gloria" <········@gmail.com> writes:
>
>> I'm currently trying to figure out how bits work in CL.
>
> You can access the bits of an integer with functions like LDB and
> LOGAND. You can produce integers with modified bits with functions
> like DPB and LOGIOR. You can perform bitwise shifts with ASH. You can
> test individual bits with LOGBITP or groups of bits with LOGTEST.
>
> There are many more functions for accessing and operating on the bits
> of an integer. See the numbers dictionary for a list:
>
>    http://www.lispworks.com/documentation/HyperSpec/Body/c_number.htm
>
>> Basically I need to do some compression on some data I'm storing - the
>> way I keep my data is rather amenable to compression, but compressing
>> it requires access to individual bits.  I would like to store and
>> retrieve data from a file, so I will need access to bits in a stream.
>
> I have written a few bitstream functions for a Flash file
> reader/writer. They are available here:
>
>    http://www.xach.com/lisp/cl-flash/bitio.lisp
>
> It will allow you to both read and write variable-length contiguous
> bits across octet boundaries in an (UNSIGNED-BYTE 8) stream. There
> isn't much to it.

You might also want to check out Chapter 24, "Parsing Binary Files" of
my book wherein I discuss how to build a library for parsing various
kinds of binary files.

  <http://www.gigamonkeys.com/book/practical-parsing-binary-files.html>

-Peter

-- 
Peter Seibel           * ·····@gigamonkeys.com
Gigamonkeys Consulting * http://www.gigamonkeys.com/
Practical Common Lisp  * http://www.gigamonkeys.com/book/
From: Alan Manuel K. Gloria
Subject: Re: Bits from a stream
Date: 
Message-ID: <1142863545.016318.50780@t31g2000cwb.googlegroups.com>
Thanks pascal!  Basically I'm parsing by octet bytes now, and I'm
twiddling bits (I need a variable-length encoding to take full
advantage of compression, so I need it as a stream of bits).

My write-bit routine is now:
(let
	(
		(bitn 0)
		(byt 0) )
	(defun write-bit-flush (stream)
		(write-byte byt stream)
		(setf bitn 0)
		t)
	(defun write-bit (bit stream)
		(setf (ldb (byte 1 bitn)) bit)
		(when (eq 8 (incf bitn))
			(write-byte byt stream)
			(setf bitn 0))
		bit ))
I've tested read-bit way above (which, btw, is little-endian, it
returns the lowest bit first), but write-bit here hasn't been tested
yet.  I'm using a var-length encoding to store data from a hash table,
and the hash is big - it would be generated by taking character n-grams
from text documents, and my goal is to store the hash table in less
space than the original file.
From: Alan Manuel K. Gloria
Subject: Re: Bits from a stream
Date: 
Message-ID: <1142864333.039229.116580@g10g2000cwb.googlegroups.com>
> Otherwise, your READ-BIT function is wrong.  You've not proved it,
> it's clear, but did you even try it?
That's weird.  I've tried it.  It seems to work....
F:\LispTest>type foo.txt
7n
F:\LispTest>clisp
[1]> (load "bitstream.lisp")
;; Loading file bitstream.lisp ...
;; Loaded file bitstream.lisp
T
[2]> (setf f (open "foo.txt" :direction :input :element-type
'unsigned-byte))
#<INPUT BUFFERED FILE-STREAM (UNSIGNED-BYTE 8) #P"foo.txt">
[3]> (read-bit f)
1
[4]> (read-bit f)
1
[5]> (read-bit f)
1
[6]> (read-bit f)
0
[7]> (read-bit f)
1
[8]> (read-bit f)
1
[9]> (read-bit f)
0
[10]> (read-bit f)
0
[11]> (read-bit f)
0
[12]> (read-bit f)
1
[13]> (read-bit f)
1
[14]> (read-bit f)
1
[15]> (read-bit f)
0
[16]> (read-bit f)
1
[17]> (read-bit f)
1
[18]> (read-bit f)
0

7=>37h, 00110111
n=>6eh, 01101110

stream read (in sequence):
1 1 1 0 _ 1 1 0 0 _ 0 1 1 1 _ 0 1 1 0
which was exactly what I was expecting...

Hmm, lemme change test benches... (btw I've now followed your
recommendation and changed 'eq to '=)
From: Pascal Bourguignon
Subject: Re: Bits from a stream
Date: 
Message-ID: <87oe015dzq.fsf@thalassa.informatimago.com>
"Alan Manuel K. Gloria" <········@gmail.com> writes:

>> Otherwise, your READ-BIT function is wrong.  You've not proved it,
>> it's clear, but did you even try it?
> That's weird.  I've tried it.  It seems to work....


All right, my fault, I expected something else. 
Indeed, your routine works.

[323]> (with-open-file (out "/tmp/bits" :direction :output
                             :element-type '(unsigned-byte 8))
              (write-sequence '(#xf0 #xaa #xc0) out))
(240 170 192)
[324]> (with-open-file (in "/tmp/bits" :element-type '(unsigned-byte 8))
            (loop repeat 24 do (princ (read-bit in))))
000011110101010100000011
NIL

I was expecting: 111100001010101011000000
(Most significant bits in the bytes first).


-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCS d? s++:++ a+ C+++ UL++++ P--- L+++ E+++ W++ N+++ o-- K- w--- 
O- M++ V PS PE++ Y++ PGP t+ 5+ X++ R !tv b+++ DI++++ D++ 
G e+++ h+ r-- z? 
------END GEEK CODE BLOCK------
From: Alan Manuel K. Gloria
Subject: Re: Bits from a stream
Date: 
Message-ID: <1142931016.556942.201730@t31g2000cwb.googlegroups.com>
@Peter:
Ah, I see.

Anyway, after opening a file, it would be best to first execute
(read-bit-reset) to reset the reader's closure.

In fact, I would probably have also written some sort of
(open-input-bit-stream) that does the open with (unsigned-byte 8) and
then does a (read-bit-reset), which I could then encapsulate out by
creating a new closure instead of resetting it - pretty much what
Zach's library does using objects.

@Zach:
Is there some sort of license to the code you posted a pointer to?  I
assume it's sort of public domain?
From: Zach Beane
Subject: Re: Bits from a stream
Date: 
Message-ID: <m3bqw02dxi.fsf@unnamed.xach.com>
"Alan Manuel K. Gloria" <········@gmail.com> writes:

> @Zach:
> Is there some sort of license to the code you posted a pointer to?  

No.

> I assume it's sort of public domain?

Why would you assume that?

Zach
From: Pascal Bourguignon
Subject: Re: Bits from a stream
Date: 
Message-ID: <8764m96xp6.fsf@thalassa.informatimago.com>
"Alan Manuel K. Gloria" <········@gmail.com> writes:
> Basically I need to do some compression on some data I'm storing - the
> way I keep my data is rather amenable to compression, but compressing
> it requires access to individual bits.  I would like to store and
> retrieve data from a file, so I will need access to bits in a stream.

You should note that the format of bit files is implementation
dependant.  (All files actually).  The problem is this: if you write a
file with 10 bits, FILE-LENGTH specifies that:

   For a binary file, the length is measured in units of the element
   type of the stream.

therefore your lisp implementation must save that information
somewhere, since a 10-bit file will need the same 2 octets, as a 9-bit
file or a 16-bit file.  Therefore, a non-standard header will have to
be written to the bit file.  Moreover, the order in which the bits are
stored into bytes or words is implementation dependant too.



With some luck, files of characters and files of (unsigned-byte 8) (on
octet-based machines), will have the same format with all CL
implementations.  So if you want to read and write a binary file with
a given format, your best chances are with an element-type of
(unsigned-byte 8).  

This means that you'll have to store and retrieve the bits from small
integers yourself, with functions such LOGAND, LOGOR, LDB, DPB, ASH,
etc.

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

HANDLE WITH EXTREME CARE: This product contains minute electrically
charged particles moving at velocities in excess of five hundred
million miles per hour.