I have been looking through the packages on the cliki compression
page:
http://www.cliki.net/Compression
There is lots of good code there to compress and decompress, but I am
also looking for a wrapper that makes simple operations easy (as in
Ruby), such as reading a gzipped text file and passing my own function
to process each line. Or something like (with-zip-file ...)
Something that works with the iterate package would be especially
good, but I am looking for something that wraps reading compressed
text files in a line or two of code (because this is a common
operation for me).
Anyway, if anyone has a nice wrapper, please share it :-)
Thanks,
Mark
PS. I thought of asking this group because I saw a great snippet
posted by Louis Oliveira that I now use frequently:
  (iter (for line in-file file using #'read-line)
    ;; process 'line'
    )
I find that I learn a lot from reading other people's code - fun way
to learn.
Also, the text files that I process are often very large, so using
streams and not reading everything into memory is best.
Off topic, but for Franz Lisp users: I have found that using
excl:read-line-into with a reusable buffer helps a lot. It cuts way
down on consing when processing large files.
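For implementations without excl:read-line-into, the buffer-reuse idea can be approximated portably. The following is only a sketch: read-line-into-buffer is a hypothetical name, not the Allegro function, and its argument order and return convention are my own assumptions.

```lisp
;; Hypothetical portable helper, NOT excl:read-line-into.
;; BUFFER must be an adjustable string with a fill pointer, for example
;; (make-array 256 :element-type 'character :adjustable t :fill-pointer 0).
(defun read-line-into-buffer (stream buffer)
  "Read one line from STREAM into BUFFER, reusing its storage.
Returns BUFFER, or NIL at end of file when no characters were read."
  (setf (fill-pointer buffer) 0)
  (loop for char = (read-char stream nil nil)
        do (cond ((null char)
                  ;; End of file: return a final unterminated line if any.
                  (return (if (plusp (fill-pointer buffer)) buffer nil)))
                 ((char= char #\Newline)
                  (return buffer))
                 (t
                  (vector-push-extend char buffer)))))
```

The returned string is the same object on every call, so anything that has to outlive the next call needs to be copied out first (for example with copy-seq).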
On 2007-08-15, Mark Watson <···········@gmail.com> wrote:
> Lots of good code to compress and decompress, but I am also looking
> for a wrapper that will let me do simply (as in Ruby) operations like
> read a GZIPed text file, passing my own function to process each line.
> Or something like (with-zip-file ...)
>
> Something that works with the iterate package would be especially
> good, but I am looking for something that wraps reading compressed
> text files in a line or two of code (because this is a common
> operation for me).
Put the following code into the package that Franz's inflate routines
are in (for example the ZIP package, which includes inflate.cl).
Use it like this:
(with-open-file (s "passwd.gz" :element-type '(unsigned-byte 8))
  (zip::skip-gzip-header s)
  (let ((r (flexi-streams:make-flexi-stream (zip::make-inflate-stream s))))
    (loop for line = (read-line r nil)
          while line
          do (print line))))
"root:x:0:0:root:/root:/bin/bash"
"daemon:x:1:1:daemon:/usr/sbin:/bin/sh"
"bin:x:2:2:bin:/bin:/bin/sh"
"sys:x:3:3:sys:/dev:/bin/sh"
...
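To show what the zip::skip-gzip-header call above has to get past before the deflate data starts, here is a sketch of the gzip member header layout as I understand it from RFC 1952. This is not the ZIP package's implementation: gzip-header-length is a hypothetical helper that works on an octet vector instead of a stream, and it does no checking for truncated input.

```lisp
;; Hypothetical helper illustrating the RFC 1952 header layout.
(defun gzip-header-length (octets)
  "Return the index in OCTETS at which the deflate data starts.
OCTETS is a vector of bytes holding the beginning of a gzip member."
  (unless (and (>= (length octets) 10)
               (= (aref octets 0) #x1f)
               (= (aref octets 1) #x8b))
    (error "Not a gzip stream"))
  (let ((flags (aref octets 3))
        (pos 10))              ; fixed part: magic, CM, FLG, MTIME, XFL, OS
    (flet ((skip-zero-terminated ()
             ;; Advance POS past a NUL-terminated field.
             (setf pos (1+ (position 0 octets :start pos)))))
      (when (logtest flags #x04)   ; FEXTRA: 2-byte little-endian length + data
        (incf pos (+ 2 (aref octets pos) (ash (aref octets (1+ pos)) 8))))
      (when (logtest flags #x08)   ; FNAME: original file name
        (skip-zero-terminated))
      (when (logtest flags #x10)   ; FCOMMENT: file comment
        (skip-zero-terminated))
      (when (logtest flags #x02)   ; FHCRC: 2-byte header checksum
        (incf pos 2)))
    pos))
```

For a header with no optional fields this returns 10, which is why simple gzip readers often get away with skipping a fixed ten bytes.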
;;;; (c) David Lichteblau, X11-style license

(in-package :zip)

(defclass inflate-stream
    (trivial-gray-stream-mixin fundamental-binary-input-stream)
  ((current-stream :initform nil)
   (current-vectors :initform nil)
   (br :initarg :br)
   (buffer :initarg :buffer)
   (end :initform 0)))

(defmethod initialize-instance :after ((stream inflate-stream) &key)
  (refill stream))

(defun make-inflate-stream (source)
  (let ((buf (make-array (* 32 1024) :element-type '(unsigned-byte 8))))
    (make-instance 'inflate-stream :br (new-bit-reader source) :buffer buf)))

(defun refill (stream)
  (with-slots (current-stream current-vectors br buffer end) stream
    (unless current-vectors
      (unless end
        (setf current-stream nil)
        (return-from refill nil))
      (setf current-vectors
            (nreverse
             (let ((vectors '()))
               (flet ((op (buf end)
                        (push (subseq buf 0 end) vectors)))
                 (setq end (process-deflate-block br #'op buffer end)))
               vectors))))
    (setf current-stream
          (flexi-streams:make-in-memory-input-stream (pop current-vectors)))))

(defmethod stream-element-type ((stream inflate-stream))
  '(unsigned-byte 8))

(defmethod stream-read-byte ((stream inflate-stream))
  (with-slots (current-stream) stream
    (or (read-byte current-stream nil)
        (if (refill stream)
            (read-byte current-stream nil :eof)
            :eof))))

(defmethod stream-listen ((stream inflate-stream))
  (with-slots (current-stream) stream
    (or (listen current-stream)
        (if (refill stream)
            (listen current-stream)
            nil))))

(defmethod stream-read-sequence
    ((stream inflate-stream) sequence start end &key)
  (with-slots (current-stream) stream
    ;; READ-SEQUENCE takes the sequence first, then the stream.
    (let ((index (read-sequence sequence current-stream
                                :start start
                                :end end)))
      (loop while (and (< index end) (refill stream)) do
        (setf index (read-sequence sequence current-stream
                                   :start index
                                   :end end)))
      ;; Return the index of the first element that was not updated.
      index)))
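
To get closer to the "line or two of code" asked for at the top of the thread, the read-line loop can be folded into a small helper that works on any character input stream. This is only a sketch: map-lines and do-lines are hypothetical names that are not part of the ZIP package or of the code above.

```lisp
;; Hypothetical helpers; they only need a character input stream.
(defun map-lines (function stream)
  "Call FUNCTION on each line read from STREAM. Returns the number of lines."
  (loop for line = (read-line stream nil nil)
        while line
        do (funcall function line)
        count t))

(defmacro do-lines ((var stream) &body body)
  "Evaluate BODY with VAR bound to each line of STREAM in turn."
  `(map-lines (lambda (,var) ,@body) ,stream))

;; Assumed use with the inflate stream from this post (requires the ZIP
;; package with the code above loaded, plus FLEXI-STREAMS):
;;
;; (with-open-file (s "passwd.gz" :element-type '(unsigned-byte 8))
;;   (zip::skip-gzip-header s)
;;   (do-lines (line (flexi-streams:make-flexi-stream
;;                    (zip::make-inflate-stream s)))
;;     (print line)))
```

Since the flexi stream is an ordinary character stream, iterate's in-stream driver with using #'read-line should presumably work on it as well, though I have not verified that combination.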
Thanks David!
This is helpful, and a time saver for me.
Best regards,
Mark