Critique my code: using Lisp to parse lots of "ps -elf" files in AIX

From: Larry Clapp
Subject: Critique my code: using Lisp to parse lots of "ps -elf" files in AIX
Date: Sat, 07 May 2005 16:57:47 +0000
Message-ID: <slrnd7psr5.uqf.larry@theclapp.ddts.net>

Greetings, all,

I seek critiques and recommendations on the linked code (see below
under "Requests").  It's my first use of Lisp at work, and maybe my
third use "in anger", as the saying goes.

The situation: at work we had a machine crash due to running out of
memory and swap.  Another project (under the same manager) had a ksh
script you could run as a cronjob to gather "ps -elf" snapshots and
write them to files, and another ksh script to analyze the output.  My
team lead suggested that our manager might not trust anything but the
original scripts, since the other project had apparently had a lot of
success using them to diagnose some of their own memory problems, so I
installed the first script in a cronjob that ran every minute, and
then took a look at the second, analytic script.

What a disaster.  I don't remember exactly, but I think it took about
an hour to process about an hour's worth of data (i.e. 60 data files).
Nothing like using <filename><nnn> to simulate arrays to slow down
your script.  People that don't know that awk(1) has more features
than "{ print $2 }" should not write shell scripts.  

So I threw my team lead's advice out the window and decided to rewrite
it in Lisp, using clisp (clisp.cons.org).

The basic idea is to read each file (a snapshot of "ps -elf", as
mentioned), keep track of the memory used by each process, and then
print out the memory deltas per process once it's read all the files.

Example output stanza for a particular process:

  680002, ran for 111 units, -ksh 
  First appeared in reading_3May13:10:00
  (524 (stable for 11 units) 8 (stable for 98 units))

It lists the PID, how long it ran, the command line, what data file it
first appeared in, and then a list of numbers, starting with the
initial memory usage, followed by a list of changes in memory usage.
This example shows that a ksh process started circa 5/3 @ 1:10pm, with
524 kb in use, stayed the same for 11 minutes, grew by 8 kb, stayed
the same for 98 more minutes, and then exited.

Design goals:

  - correctness (obviously :)
  - No use of external libraries -- not enough time to download them
    and get them working (or so it seemed at the time).  E.g. the
    script might benefit from a regular expression library, but at the
    time I didn't want to bother.
  - Reasonable speed.  My current version can process an hour's data
    in 3-10 seconds (depending on the size of the data readings, i.e.
    the number of processes running).  I consider this reasonable and
    "fast enough", actually, but I wouldn't mind making it even
    faster.

Non-goals:

  Portability.  I don't care if it'll run or not under other Lisp
  implementations.  E.g. I don't know or care if my use of DIRECTORY
  will work correctly under other Lisps; if I need to run it under
  something else I'll download Peter's portable pathname library.
  (Actually, I've already done this, but haven't integrated it into
  this code.)

Design:

  Have an EQUAL hash table keyed on (pid . command-line), with a list
  of memory readings.  Read all data files.  Post-process the hash
  table to output information formatted as above.

Possible changes not yet implemented:

  Instead of storing each memory reading, just store the difference
  between the current reading and the previous reading.

Requests:

  Please see http://www.theclapp.org/process-mem-readings.lisp and/or
  http://www.theclapp.org/process-mem-readings.lisp.html.

  How can I make it
    - faster?
    - more readable?
    - more useful / functional?
    - Lispier?  (most of my background is in C and Perl)

Commentary:

  - I developed this script using neither SLIME nor my own VILisp
    tool.  You can get a great deal done with just a few utility
    functions, plain ol' cut-and-paste, and clisp's command history
    function.  (I have since installed xemacs, but not SLIME yet.)
  - I really liked the ability to load a bunch of data into memory,
    and then step-wise refine the code to post-process it, without
    having to spend three minutes reloading the data each time I
    wanted to generate output.  I could not have developed this code
    as quickly in any other language.  (On further thought, this last
    comment probably isn't true.  If I'd run my tests against a subset
    of the data files, I probably could have done it as quickly in
    Perl.  Oh well.  :)  Also, someone else might have done it as fast
    in something else, using e.g. Python or Scheme or any other
    language with a REPL, but I don't know any of them and didn't have
    them already installed on our development box anyway.)
  - I wrote a short ksh script to do a similar analysis, and comparing
    it to the output of the Lisp script revealed a subtle bug in the
    shell version that I don't think I'd have noticed otherwise.

Thank you for any assistance or advice you can provide.

-- Larry Clapp

Re: Critique my code: using Lisp to parse lots of "ps -elf" files in AIX ············@gmail.com
- Re: Critique my code: using Lisp to parse lots of "ps -elf" files in AIX Larry Clapp
  - Re: Critique my code: using Lisp to parse lots of "ps -elf" files in AIX Rob Warnock
    - Re: Critique my code: using Lisp to parse lots of "ps -elf" files in AIX Lars Brinkhoff
Re: Critique my code: using Lisp to parse lots of "ps -elf" files in AIX Paolo Amoroso
Re: Critique my code: using Lisp to parse lots of "ps -elf" files in AIX Pascal Bourguignon
- Re: Critique my code: using Lisp to parse lots of "ps -elf" files in AIX Larry Clapp
Re: Critique my code: using Lisp to parse lots of "ps -elf" files in AIX Lars Brinkhoff

From: ············@gmail.com
Subject: Re: Critique my code: using Lisp to parse lots of "ps -elf" files in AIX
Date: Sat, 07 May 2005 19:24:27 +0000
Message-ID: <1115493867.021804.89200@f14g2000cwb.googlegroups.com>

Hi Larry,

two questions:

1) why don't you write documentation strings for your functions? It
would me so much easier to read the code with them in place.

2) why do you need to limit the number of files processed by n, even if
the number of files is greater?

David

From: Larry Clapp
Subject: Re: Critique my code: using Lisp to parse lots of "ps -elf" files in AIX
Date: Sat, 07 May 2005 20:53:47 +0000
Message-ID: <slrnd7qamr.uqf.larry@theclapp.ddts.net>

In article <·······················@f14g2000cwb.googlegroups.com>,
············@gmail.com wrote:
> two questions:
> 
> 1) why don't you write documentation strings for your functions? It
> would me so much easier to read the code with them in place.

I tend to forget that I can.  I'll try to remember that, especially
the next time I post my code for public review!  :)

> 2) why do you need to limit the number of files processed by n, even
> if the number of files is greater?

Running from the REPL for quick debugging it goes faster if I can say
"do the first 10" or 100 or 1000 instead of "do them all".

Thanks for your comments!

-- Larry

From: Rob Warnock
Subject: Re: Critique my code: using Lisp to parse lots of "ps -elf" files in AIX
Date: Mon, 09 May 2005 02:52:39 +0000
Message-ID: <mdKdnZG5r4_qTePfRVn-iA@speakeasy.net>

Larry Clapp  <·····@theclapp.org> wrote:
+---------------
| ············@gmail.com wrote:
| > 2) why do you need to limit the number of files processed by n, even
| > if the number of files is greater?
| 
| Running from the REPL for quick debugging it goes faster if I can say
| "do the first 10" or 100 or 1000 instead of "do them all".
+---------------

In that case I would suggest adding a keyword argument (say) :LIMIT
that would limit the number, which by default would be unlimited.
So (DO-STUFF) would do them all, and (DO-STUFF :LIMIT 10) would do
only the first 10. Or if it's too inconvenient to pass that down the
call chain, make it a global (special) variable, say, MY-APP:*FILES-LIMIT*,
initially bound to NIL (meaning unlimited) but which you can bind
or SETF to a small number when working in the REPL. [See the standard
CL variables *PRINT-LEVEL* & *PRINT-LENGTH* for other examples of this.]


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607

From: Lars Brinkhoff
Subject: Re: Critique my code: using Lisp to parse lots of "ps -elf" files in AIX
Date: Mon, 09 May 2005 09:03:40 +0000
Message-ID: <85br7kvkib.fsf@junk.nocrew.org>

····@rpw3.org (Rob Warnock) writes:
> In that case I would suggest adding a keyword argument (say) :LIMIT
> that would limit the number, which by default would be unlimited.

Or :COUNT, to mimic some sequence functions.

From: Paolo Amoroso
Subject: Re: Critique my code: using Lisp to parse lots of "ps -elf" files in AIX
Date: Sat, 07 May 2005 19:11:18 +0000
Message-ID: <8764xuua09.fsf@plato.moon.paoloamoroso.it>

Larry Clapp <·····@theclapp.org> writes:

> I seek critiques and recommendations on the linked code (see below
[...]
>   Please see http://www.theclapp.org/process-mem-readings.lisp and/or
>   http://www.theclapp.org/process-mem-readings.lisp.html.
>
>   How can I make it
[...]
>     - more readable?

You may add a shorts comments at the top telling what the program
does.  Part of your original message would be fine.


Paolo
-- 
Why Lisp? http://lisp.tech.coop/RtL%20Highlight%20Film
Recommended Common Lisp libraries/tools (see also http://clrfi.alu.org):
- ASDF/ASDF-INSTALL: system building/installation
- CL-PPCRE: regular expressions
- UFFI: Foreign Function Interface

From: Pascal Bourguignon
Subject: Re: Critique my code: using Lisp to parse lots of "ps -elf" files in AIX
Date: Sun, 08 May 2005 02:56:52 +0000
Message-ID: <87is1uxw5n.fsf@thalassa.informatimago.com>

Larry Clapp <·····@theclapp.org> writes:
>   Please see http://www.theclapp.org/process-mem-readings.lisp and/or
>   http://www.theclapp.org/process-mem-readings.lisp.html.
>
>   How can I make it
>     - faster?
>     - more readable?
>     - more useful / functional?
>     - Lispier?  (most of my background is in C and Perl)


;;--------------------

I would order the functions and the _macros_ as in Pascal: used one
first.  When you compile a file, (or when you use an implementation
that compiles automatically at the REPL or when loading), it's better
to have the macro defined before they're used, otherwise the compiler
sees only function calls and doesn't expand them.

Even with clisp:

[103]> (compile-file "/home/pjb/src/lisp/encours/process-mem-readings.lisp")
;; Compiling file /home/pjb/src/lisp/encours/process-mem-readings.lisp ...
;; Wrote file /home/pjb/src/lisp/encours/process-mem-readings.fas
The following functions were used but not defined:
 PRINT-WIDEST REDIRECT
0 errors, 0 warnings
#P"/home/pjb/src/lisp/encours/process-mem-readings.fas" ;
NIL ;
NIL

[ What saved you is that you  loaded the file before calling (c) ].

;;--------------------

Instead of:

(defun starts-with (cmd string)
  (eql 0 (search string cmd)))	; use eql because search can return nil


I have this function:

(DEFUN PREFIX-P (STRING PREFIX)
  "
RETURN:  Whether PREFIX is a prefix of the STRING.
"
  (AND (<= (LENGTH PREFIX) (LENGTH STRING))
       (STRING= STRING PREFIX :END1 (LENGTH PREFIX))))


It's slightly larger but when prefix is a prefix of string, it's
O(length(prefix)), not O(length(string)) and when it's not, it's O(1)
instead of O(length(string)).

;;--------------------

(defun get-differences (mem-readings)
  (let* ((last nil)
	 (difs (loop with list = nil
		     for (this next) on mem-readings
		     while next
		     do (push (- this next) list)
		     finally 
		     (setf last this)
		     (return list))))
    (cons last difs)))



Here's an idiom to easily compute differences between the consecutive
elements of list. MAPCAR stops as soon as the shortest list is reached.

(defun differencial (list) 
   "
RETURN: A list of the difference between each two consecutive elements of list.
"
   (mapcar (function -) list (cdr list)))


Now, in a bottom-up way-of-life, The specifications of get-differences
may be questionable.  Do you really need to reverse it?  Why is the
last mem-reading needed (vs. the first)?  We could use directly
DIFFERENCIAL.  Anyway, now get-differences is simpler this way:


(defun get-differences (mem-readings)
    "
RETURN: A list whose FIRST is the last element of mem-readings, and 
        whose REST is the reversed differenical of mem-readings.
"
    (cons (car (last mem-readings))  (nreverse (differencial mem-readings))))

;;--------------------

I don't like very much these procedures:

(defun show-difs (difs) 
  (loop for cons on difs
	do (find-run cons)
	finally (return difs)))

;;; "run" as in "run length encoding"
(defun find-run (cons)
  (let ((run (count-run-and-compress cons)))
    (when (plusp run)
      (setf (car cons)
	    (list "stable for" (1+ run) "units")))))

(defun count-run-and-compress (cons)
  (let ((car (car cons)))
    (loop while (and (cdr cons) (= 0 car (cadr cons)))
	  count 1
	  do (setf (cdr cons) (cddr cons)))))

because they're modifying the lists they work on.  I'd rather write a
pure function such as:

(defun rle-compress (list)
  (loop for curr = list then (nthcdr pos curr)
        for pos = (or (position-if (lambda (item) (not (eql (car curr) item))) 
                                   curr)
                      (length curr))
        when (plusp pos)
        collect (if (< pos 2)
                  (car curr)
                  `(stable for ,pos units at ,(car curr)))
        while curr))

[168]> (setf list (differencial d))
(0 0 -1 -1 0 -1 0 0 0 -1 -1 -1 0 -2)
[169]> (rle-compress list)
((STABLE FOR 2 UNITS AT 0) (STABLE FOR 2 UNITS AT -1) 0 -1
 (STABLE FOR 3 UNITS AT 0) (STABLE FOR 3 UNITS AT -1) 0 -2)
[170]> list
(0 0 -1 -1 0 -1 0 0 0 -1 -1 -1 0 -2)

instead of:

[171]> (setf list (get-differences d))
(9 -2 0 -1 -1 -1 0 0 0 -1 0 -1 -1 0 0)
[172]> (show-difs list)
(9 -2 0 -1 -1 -1 ("stable for" 3 "units") -1 0 -1 -1 ("stable for" 2 "units"))
[173]> list
(9 -2 0 -1 -1 -1 ("stable for" 3 "units") -1 0 -1 -1 ("stable for" 2 "units"))

;;--------------------

(defun get-fields3 (line)
  (apply #'vector
	 (loop with len = (length line)
	       for (start end) on (word-boundaries line) by #'cddr
	       collect (subseq line start (or end len)))))

To avoid consing the list coerced to a vector [ (apply #'vector list)
can be written as: (coerce list 'vector)  too], I'd allocate the
resulting vector first and fill it directly:

(defun get-fields3 (line)
  (loop with boundaries = (word-boundaries line)
        with result = (make-array (truncate (length boundaries) 2))
        with len = (length line)
        for i from 0
        for (start end) on boundaries by (function cddr)
        do (setf (aref result i) (subseq line start (or end len)))
        finally (return result)))

;;--------------------

(> (length fields) 10)

It might be a good habit to only use < or <=.

(< 10 (length fields))

;;--------------------

fields as array.

A quick-and-dirty trick is to use defstruct with (:type list):

(defstruct (ps  (:type list))
  F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD)


Then, you can write

(defun get-fields3 (line)
  (loop with len = (length line)
        for post in '(identity identity identity parse-integer parse-integer
                      identity parse-integer parse-integer identity
                      parse-integer identity parse-time identity parse-command)
        for i from 0
        for (start end) on (word-boundaries line) by (function cddr)
        collect (funcall post (subseq line start (or end len)))))

(defun join (delim fields)
  (format nil (format nil  "~~{~~A~~^~A~~}" delim) fields))

(let ((record (get-fields3 line)))
   (print (list (ps-pid record) (ps-cmd record) (subseq record 14)))
   (princ (join ";" record)))


But if you still prefer vectors, you can also use:

(defstruct (ps  (:type vector))
  F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD)

(let ((record (get-fields3 line)))
   (print (list (ps-pid record) (ps-cmd record) (subseq record 14)))
   (princ (join ";" record)))


;;--------------------

Since you don't mind portability, you may use extension packages
included in clisp such as REGEXP. This ease the splitting of the lines:

(defparameter *ps-line-regexp*
  (let ((field " *\\([^ ]\\+\\)") (wchan " \\(......\\)") (command " \\(.*\\)"))
    (format nil "~{~A~}~A~{~A~}~A"
            (make-list 10 :initial-element field)
            wchan
            (make-list 3 :initial-element field)
            command)))
;; Here you can also compile the regexp, see clisp implementation notes.


(defstruct (ps  (:type list))
  F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD)

(mapcar
 (lambda (line)
   (let ((record
          (mapcar (lambda (match) (regexp:match-string line match))
                  (cdr (multiple-value-list
                        (regexp:match *ps-line-regexp* line))))))
     (list (ps-wchan record) (ps-cmd record))))
 '("1 S larissa  22192     1  0  69   0 - 52970 unix_s 01:46 pts/1    00:00:04 /usr/l"
   "0 R pjb      29571 24970  0  74   0 -   908        04:45 pts/10   00:00:00 ps -el"))

--> 

(("unix_s" "/usr/l") ("      " "ps -el"))


-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
Wanna go outside.
Oh, no! Help! I got outside!
Let me back inside!

From: Larry Clapp
Subject: Re: Critique my code: using Lisp to parse lots of "ps -elf" files in AIX
Date: Tue, 10 May 2005 01:29:46 +0000
Message-ID: <slrnd803j2.uqf.larry@theclapp.ddts.net>

In article <··············@thalassa.informatimago.com>, Pascal Bourguignon wrote:
> Larry Clapp <·····@theclapp.org> writes:
>>   Please see http://www.theclapp.org/process-mem-readings.lisp and/or
>>   http://www.theclapp.org/process-mem-readings.lisp.html.
>>
>>   How can I make it
>>     - faster?
>>     - more readable?
>>     - more useful / functional?
>>     - Lispier?  (most of my background is in C and Perl)
[lots of good advice]

Thanks for the tips, Pascal!

-- Larry

From: Lars Brinkhoff
Subject: Re: Critique my code: using Lisp to parse lots of "ps -elf" files in AIX
Date: Tue, 10 May 2005 12:09:10 +0000
Message-ID: <8564xrth95.fsf@junk.nocrew.org>

Larry Clapp <·····@theclapp.org> writes:
> I seek critiques and recommendations on the linked code (see below
> under "Requests").

I think making your macros follow the usual conventions would make
your code easier to read.  For example, I was initially confused by
the calls to redirect, because they look like function calls.  The
macro could be called with-redirection (or with-standard-output-to),
and perhaps the body should be on a separate line, e.g.:

  (with-redirection (format nil "main.out.~A" which)
    (process-files which n))