From: ·········@gmail.com
Subject: translating imperative pseudocode to functional style (repeat from comp.lang.functional)
Date: 
Message-ID: <1123550890.284040.249760@f14g2000cwb.googlegroups.com>
hi,
  i'm a complete newbie and would be grateful to anyone who answers
this question.  (and if i ever publish i will acknowledge your help)

i have the following pseudocode algorithm for finding the largest set
of points which are not neighbors (called the greedy hobohm algorithm
in bioinformatics):

0.  initialize an S x S matrix of bools, representing the neighbor
relationships of the S objects.  (where for objects A and B, A--A and
A--B implies B--A)

1.  eliminate the object (i) with the most neighbors. (delete row i and
column i with the most elements TRUE)
2.  repeat 1 until the matrix is the identity matrix.

my question is:  can this pseudocode, involving a mutable data
structure, be translated into a purely functional pseudocode?  (perhaps
not even using a data structure at all?)

as a side note, since the matrix is huge in practice (S ~= 100,000) but
also sparse, i use a map<int, set<int> > in my c++ implementation to
represent it.  but, i noticed that ocaml has non-mutable map and set.

thanks in advance for any insight you may have.  (recommended reading,
etc)

Henry Bigelow

From: Raffael Cavallaro
Subject: Re: translating imperative pseudocode to functional style (repeat from comp.lang.functional)
Date: 
Message-ID: <2005080902190816807%raffaelcavallaro@pasdespamsilvousplaitmaccom>
On 2005-08-08 22:02:46 -0400, ·········@gmail.com said:

> my question is:  can this pseudocode, involving a mutable data
> structure, be translated into a purely functional pseudocode?  (perhaps
> not even using a data structure at all?)

There's no need to mutate the original matrix to arrive at your 
solution. In other words, your pseudocode algorithm is not the only 
solution to the problem. Just iterate through half of the matrix and 
collect a list (or other data structure) of the indices with no true 
elements (i.e., no neighbors).

(defun collect-no-neighbors (neighbor-matrix)
  (let ((size (first (array-dimensions neighbor-matrix))))
  (loop for row from 0 to (1- size)
        when (every #'null (loop for column from 0 to (1- size) collect
                                              ;; this knocks out the 
diagonal identity elements
                                              (cond ((= row column) nil)
                                                    ((< row column) 
(aref neighbor-matrix row column))
                                                    ;; switch indices 
so we only look at half the matrix
                                                    (t  (aref 
neighbor-matrix column row)))))
        collect row)))


example:

CL-USER 21 > (setf test-mat #2a( (t   nil nil t   nil nil)
                                 (nil t   nil nil nil t  )
                                 (nil nil t   nil nil nil)
                                 (t   nil nil t   t   nil)
                                 (nil nil nil t   t   nil)
                                 (nil t   nil nil nil t  )))

CL-USER 22 > (collect-no-neighbors test-mat)
(2)
From: Raffael Cavallaro
Subject: Re: translating imperative pseudocode to functional style (repeat from comp.lang.functional)
Date: 
Message-ID: <2005080902285875249%raffaelcavallaro@pasdespamsilvousplaitmaccom>
On 2005-08-09 02:19:08 -0400, Raffael Cavallaro 
<················@pas-d'espam-s'il-vous-plait-mac.com> said:

> On 2005-08-08 22:02:46 -0400, ·········@gmail.com said:
> 
>> my question is:  can this pseudocode, involving a mutable data
>> structure, be translated into a purely functional pseudocode?  (perhaps
>> not even using a data structure at all?)
> 
> There's no need to mutate the original matrix to arrive at your 
> solution. In other words, your pseudocode algorithm is not the only 
> solution to the problem. Just iterate through half of the matrix and 
> collect a list (or other data structure) of the indices with no true 
> elements (i.e., no neighbors).
> 
> (defun collect-no-neighbors (neighbor-matrix)
>   (let ((size (first (array-dimensions neighbor-matrix))))
>   (loop for row from 0 to (1- size)
>         when (every #'null (loop for column from 0 to (1- size) collect
>                                               ;; this knocks out the 
> diagonal identity elements
>                                               (cond ((= row column) nil)
>                                                     ((< row column) 
> (aref neighbor-matrix row column))
>                                                     ;; switch indices 
> so we only look at half the matrix
>                                                     (t  (aref 
> neighbor-matrix column row)))))
>         collect row)))
> 
> 
> example:
> 
> CL-USER 21 > (setf test-mat #2a( (t   nil nil t   nil nil)
>                                  (nil t   nil nil nil t  )
>                                  (nil nil t   nil nil nil)
>                                  (t   nil nil t   t   nil)
>                                  (nil nil nil t   t   nil)
>                                  (nil t   nil nil nil t  )))
> 
> CL-USER 22 > (collect-no-neighbors test-mat)
> (2)

oops! Realized this looks at all of the elements anyway - serves me 
right for posting at 2:00 am :(
From: Raffael Cavallaro
Subject: Re: translating imperative pseudocode to functional style (repeat from comp.lang.functional)
Date: 
Message-ID: <2005080910154550073%raffaelcavallaro@pasdespamsilvousplaitmaccom>
On 2005-08-09 02:28:58 -0400, Raffael Cavallaro 
<················@pas-d'espam-s'il-vous-plait-mac.com> said:

> oops! Realized this looks at all of the elements anyway - serves me 
> right for posting at 2:00 am :(


Here's a version that only looks at one triangle of the square matrix. 
Amazing how actually sleeping clears the mind ;^)

(defun collect-no-neighbors (neighbor-matrix)
  (let* ((size (first (array-dimensions neighbor-matrix)))
        ;;assume to start that all elements have no neighbors
        (result-vector (make-array (list size) :element-type 'bit 
:initial-element 0)))
    (loop for row from 0 to (1- size)
          ;;this ensures that we only look at half the matrix
          ;;and skip the identity diagonal as well
          do
          (loop for column from (1+ row) to (1- size)
                when (not (null (aref neighbor-matrix row column)))
                ;;this will usually write the same entry more than once
                ;;but we're assuming that read/test/write is more costly
                ;;than simply writing again
                do
                  (setf (aref result-vector row) 1)
                  (setf (aref result-vector column) 1)))
    (loop for result-index from 0 to (1- size) when
          (zerop (aref result-vector result-index))
          collect result-index)))
From: Geoffrey Summerhayes
Subject: Re: translating imperative pseudocode to functional style (repeat from comp.lang.functional)
Date: 
Message-ID: <LO5Ke.61$yH2.46136@news20.bellglobal.com>
"Raffael Cavallaro" <················@pas-d'espam-s'il-vous-plait-mac.com> 
wrote in message 
·········································@pasdespamsilvousplaitmaccom...
> On 2005-08-09 02:28:58 -0400, Raffael Cavallaro 
> <················@pas-d'espam-s'il-vous-plait-mac.com> said:
>
>> oops! Realized this looks at all of the elements anyway - serves me right 
>> for posting at 2:00 am :(
>
>
> Here's a version that only looks at one triangle of the square matrix. 
> Amazing how actually sleeping clears the mind ;^)
>
> (defun collect-no-neighbors (neighbor-matrix)

...

Solves the wrong problem.

(collect-no-neighbors #2A(( t  nil nil nil)
                          (nil  t   t  nil)
                          (nil  t   t   t )
                          (nil nil  t   t)))

gives (0), the ops algorithm would generate (0 1 3).

--
Geoff 
From: hank_rb
Subject: Re: translating imperative pseudocode to functional style (repeat from comp.lang.functional)
Date: 
Message-ID: <1123619170.359078.108080@g44g2000cwa.googlegroups.com>
Thanks both of you for your replies.  I am unfamiliar with lisp, but I
see that the program has loops, which make it not purely functional,
right?  I am really trying to get a gist of a purely functional style
expression of the procedural algorithm i described.  To, clarify,
though, here is an example sequence of matrices that it would generate
(in procedural style):

 ABCDEF
At--t--
B-t-ttt
C--t--t
Dtt-t--
E-t--t-
F-tt--t

 ACDEF
At-t--
C-t--t
Dt-t--
E---t-
F-t--t

 ACEF
At---
C-t-t
E--t-
F-t-t

 AEF
At--
E-t-
F--t

Note that in the second matrix, you have a choice between A,C,D, or F
to eliminate.  I arbitrarily chose D.  In the third matrix, we have a
choice between C or F and I chose to eliminate C.

In any case, if anyone can specify some set of functions such that the
final composition of them has a signature f : Initial matrix -> Final
matrix, this is really what I'm looking for.

My background is several years in c++, so I would benefit from some
book which presents side-by-side algorithms expressed procedurally, and
the same algorithm expressed functionally.  Does anyone have any
recommended reading they might suggest?  I saw three titles which
looked promising:

Functional Programming: Practice and Theory
Introduction to Functional Programming
Algorithms: A Functional Programming Approach

Thanks a lot!

Sincerely,

Henry Bigelow
From: Joe Marshall
Subject: Re: translating imperative pseudocode to functional style (repeat from comp.lang.functional)
Date: 
Message-ID: <oe86j0vg.fsf@ccs.neu.edu>
"hank_rb" <·········@gmail.com> writes:

> Thanks both of you for your replies.  I am unfamiliar with lisp, but I
> see that the program has loops, which make it not purely functional,
> right?  

Loops can be functional.
From: Geoffrey Summerhayes
Subject: Re: translating imperative pseudocode to functional style (repeat from comp.lang.functional)
Date: 
Message-ID: <dNbKe.7423$6d4.967673@news20.bellglobal.com>
"hank_rb" <·········@gmail.com> wrote in message ·····························@g44g2000cwa.googlegroups.com...
>
> In any case, if anyone can specify some set of functions such that the
> final composition of them has a signature f : Initial matrix -> Final
> matrix, this is really what I'm looking for.
>

Well, if the matrix is sparse enough, we can ignore the diagonal and
just concentrate on the upper triangle. No need to store (a b) and (b a)

Maybe as

(defvar matrix '((0 . 1)(2 . 4)...(100 . 101)))

then an unoptimised but short

(defmacro def-sparse-matrix-fun (name fn)
  `(defun ,name (x matrix)
     (,fn (lambda (vertex)
            (or (= x (car vertex))
                (= x (cdr vertex))))
          matrix)))

(def-sparse-matrix-fun remove-column-row remove-if)

(def-sparse-matrix-fun count-occurrences count-if)

(defun inversion-of-answer (matrix size)
  ;; returns the rows removed rather than what's left
  (let ((removed '()))
    (loop
      (let ((max 0)
            (place 0))
        (dotimes (x size)
          (when (> (count-occurrences x matrix) max)
            (setf max (count-occurrences x matrix)
                  place x)))
        (when (zerop max) (return removed))
        (setf matrix (remove-column-row place matrix))
        (push place removed)))))

Then

CL-USER 17 > (inversion-of-answer '((1 . 2)(2 . 3)) 4)
(2)

--
Geoff 
From: Geoffrey Summerhayes
Subject: Re: translating imperative pseudocode to functional style (repeat from comp.lang.functional)
Date: 
Message-ID: <qccKe.7437$6d4.970930@news20.bellglobal.com>
"Geoffrey Summerhayes" <·······@NhOoStPmAaMil.com> wrote in message ··························@news20.bellglobal.com...
>
> (defun inversion-of-answer (matrix size)
>  ;; returns the rows removed rather than what's left
>  (let ((removed '()))
>    (loop
>      (let ((max 0)
>            (place 0))
>        (dotimes (x size)
>          (when (> (count-occurrences x matrix) max)
>            (setf max (count-occurrences x matrix)
>                  place x)))
>        (when (zerop max) (return removed))
>        (setf matrix (remove-column-row place matrix))
>        (push place removed)))))
>
>

I've got the "After Post Blues"... all untested

(defun inversion-of-answer-2 (matrix size)
  (loop while matrix collect
        (let ((max 0)
              (place 0))
          (dotimes (x size)
            (when (> (count-occurrences x matrix) max)
              (setf max (count-occurrences x matrix)
                    place x)))
          (setf matrix (remove-column-row place matrix))
          place)))

;; recursive

(defun inversion-of-answer-3 (matrix size &optional removed)
  (if (null matrix) removed
    (labels ((maximum-row (x max place)
               (let ((current  (count-occurences x matrix)))
                 (if (= x size) place
                   (if (> current max)
                       (maximum-row (1+ x) current x)
                     (maximum-row (1+ x) max place))))))
    (let ((place (maximum-row 0 0 0)))
      (inversion-of-answer-3
       (remove-column-row place matrix)
       size
       (cons place removed))))))

--
Geoff 
From: Raffael Cavallaro
Subject: Re: translating imperative pseudocode to functional style (repeat from comp.lang.functional)
Date: 
Message-ID: <2005080919082843658%raffaelcavallaro@pasdespamsilvousplaitmaccom>
On 2005-08-09 13:33:44 -0400, "Geoffrey Summerhayes" 
<·············@hotmail.com> said:

> Solves the wrong problem.
> 
> (collect-no-neighbors #2A(( t  nil nil nil)
>                           (nil  t   t  nil)
>                           (nil  t   t   t )
>                           (nil nil  t   t)))
> 
> gives (0), the ops algorithm would generate (0 1 3).

Sorry - I misunderstood the OP's problem statement. In general it's 
hard to do something functionally if the result we're interested in 
depends on a global state change. Seems like this is indeed difficult 
to do functionally without lots of copying of sub-matrices. If we go 
that route the consing costs could get prohibitive for very large 
arrays.
From: Raffael Cavallaro
Subject: Re: translating imperative pseudocode to functional style (repeat from comp.lang.functional)
Date: 
Message-ID: <200508092052328930%raffaelcavallaro@pasdespamsilvousplaitmaccom>
On 2005-08-09 19:08:28 -0400, Raffael Cavallaro 
<················@pas-d'espam-s'il-vous-plait-mac.com> said:

> On 2005-08-09 13:33:44 -0400, "Geoffrey Summerhayes" 
> <·············@hotmail.com> said:
> 
>> Solves the wrong problem.
>> 
>> (collect-no-neighbors #2A(( t  nil nil nil)
>>                           (nil  t   t  nil)
>>                           (nil  t   t   t )
>>                           (nil nil  t   t)))
>> 
>> gives (0), the ops algorithm would generate (0 1 3).
> 
> Sorry - I misunderstood the OP's problem statement. In general it's 
> hard to do something functionally if the result we're interested in 
> depends on a global state change. Seems like this is indeed difficult 
> to do functionally without lots of copying of sub-matrices. If we go 
> that route the consing costs could get prohibitive for very large 
> arrays.

Fwiw, here's a recursive functional version, but it copies a lot. Note 
that it if two or more rows/columns are tied for most neighbors it 
defaults to removing the row/column of lower rank (i.e., row/column 0 
rather than 1 if 0 and 1 are tied for most neighbors, etc.).

(defun label-matrix (input-matrix)
  "returns a new matrix with an extra row and column - 0 - which
are filled with labels of the input matrix row and column numbers"
  (let* ((size (first (array-dimensions input-matrix)))
        ;;create a new matrix with an extra row and column for labels
        (labeled-matrix (make-array (list (1+ size) (1+ size)))))
     ;;label the rows from 0 to (1- size) in column 0
    (loop for row-index from 1 to size do (setf (aref labeled-matrix 
row-index 0) (1- row-index)))
    ;;do the same for the columns in row 0
    (loop for column-index from 1 to size do (setf (aref labeled-matrix 
0 column-index) (1- column-index)))
    ;;fill the interior with input-matrix
    (loop for row from 1 to size do (loop for column from 1 to size do 
(setf (aref labeled-matrix row column) (aref input-matrix (1- row) (1- 
column)))))
    labeled-matrix))

(defun remove-row-col (n input-matrix)
  "returns a new labeled matrix with both dimensions one less
than input-matrix without row-col n"
  (let* ((size (first (array-dimensions input-matrix)))
        ;;make a new matrix with one fewer row & column
        (result-matrix (make-array (list (1- size) (1- size)))))
    (loop for row from 0 to (1- size) do
          (loop for column from 0 to (1- size )
                ;;don't set row-column n - we're losing it
                when (not (or (= row n) (= column n)))
                ;;if the current row-column is greater than n
                ;;we need to adjust the array indices we setf down 1
                ;;to compensate for the row we're losing
                do (let ((result-row (if (> row n) (1- row) row))
                         (result-column (if (> column n) (1- column) column)))
                     (setf (aref result-matrix result-row 
result-column) (aref input-matrix row column)))))
  result-matrix))

(defun most-neighbors (labeled-matrix)
  "takes a labeled marix and returns the row-column number with
the most neighbors. Ignores row-column 0 as this is used for the labels"
  (let* ((size (first (array-dimensions labeled-matrix)))
        (max 0)
        (max-row-column nil)
        (current-nbr-count 0))
    (loop for row from 1 to (1- size) do
          (setf current-nbr-count 0)
          (loop for column from  1 to (1- size)
                when (and (not (null (aref labeled-matrix row column))) 
(not (= row column)))
                do (incf current-nbr-count)
                ;;only change the winner if its neighbor count
                ;;is greater than the current leader - this means
                ;;we lose lower ranked rows-columns if tied
                finally (if (> current-nbr-count max)
                          (progn (setf max current-nbr-count)
                            (setf max-row-column row)))))
    max-row-column))

(defun identity-matrix-p (labeled-matrix)
  "returns t if labeled-matrix is an identity matrix"
  (null (most-neighbors labeled-matrix)))


(defun gh-worker (labeled-matrix)
  "takes a neighbor matrix with rows and columns labeled
and recursively applies itself until it returns the labeled identity-matrix
the original labeled-matrix is unchanged"
  (if (identity-matrix-p labeled-matrix) labeled-matrix
    (gh-worker (remove-row-col (most-neighbors labeled-matrix) 
labeled-matrix))))

(defun greedy-hobohm (neighbor-matrix)
  "calls label-matrix and passes the newly created labeled
copy to gh-worker"
  (gh-worker (label-matrix neighbor-matrix)))


and a test:

CL-USER 115 > (greedy-hobohm #2A(( t  nil nil nil)
                          (nil  t   t  nil)
                          (nil  t   t   t )
                          (nil nil  t   t)))

#2A((NIL 0 1 3) (0 T NIL NIL) (1 NIL T NIL) (3 NIL NIL T))
From: Louis Theran
Subject: Re: translating imperative pseudocode to functional style (repeat from comp.lang.functional)
Date: 
Message-ID: <1123677365.701785.242430@g44g2000cwa.googlegroups.com>
·········@gmail.com wrote:

> i have the following pseudocode algorithm for finding the largest set
> of points which are not neighbors (called the greedy hobohm algorithm
> in bioinformatics):

Here's a functional implementation of your algorithm.  The graph is
represented as an adjacency list.  For example the input in:

  ? (greedy-independent-set '((1) (2 3) (3 2  4) (4 3)))
  (1 2 4)

is the graph with the edge set {23, 34}.

It's worth noting that this can give a very bad approximation of the
maximum independent set in the graph.

Also, is there any particular reason why you want a functional
implementation of this algorithm?  There are some relatively obvious
ways to save space by using mutable data structures.

(defun vertex-degree (va)
  (length (cdr va)))

(defun empty-graph-p (graph)
  (notany 'cdr graph))

(defun graph-vertices (graph)
  (mapcar 'car graph))

(defun max-degree (l r)
  (if (>= (vertex-degree l) (vertex-degree r)) l r))

(defun largest-degree-vertex (graph)
  (car (reduce 'max-degree graph)))

(defun truncate-vertex (v graph)
  (flet ((remove-v (adj)
           (remove v adj)))
    (let ((g (remove v graph :key 'car)))
      (mapcar #'remove-v g))))

(defun greedy-independent-set (graph)
  (cond ((empty-graph-p graph)
         (graph-vertices graph))
        (t
         (let ((big (largest-degree-vertex graph)))
           (greedy-independent-set (truncate-vertex big graph))))))
From: hank_rb
Subject: Re: translating imperative pseudocode to functional style (repeat from comp.lang.functional)
Date: 
Message-ID: <1123698899.561767.307560@z14g2000cwz.googlegroups.com>
Hi Louis,
  Thanks for your reply.  I didn't know that this greedy algorithm
didn't find such a set.  I hope it isn't too poor of an approximation!
I see from MathWorld, the M.I.S. problem is NP-complete.  Here in the
lab, we use the 'greedy-hobohm' algorithm, which I've written in c++
using a highly optimized code (and mutable structure) because we
sometimes have to filter a set of 150,000 proteins.


>Also, is there any particular reason why you want a functional
>implementation of this algorithm?  There are some relatively obvious
>ways to save space by using mutable data structures.

I am new to OCaml and FP, and wanted to bootstrap my way by applying it
to a problem I already am familiar with in procedural programming.
And, to see if it is anywhere near as fast as a procedural c++
implementation.  I saw the 'computer language shootout' and that OCaml
was nearly as fast as c++, sometimes faster, but these are all toy
problems.

Thanks for the code.  Do you know of any that actually solves the
maximal independent set problem, that would be practical for such a
large set (of 150,000, albeit somewhat sparse, graph)?

Henry
From: Louis Theran
Subject: Re: translating imperative pseudocode to functional style (repeat from comp.lang.functional)
Date: 
Message-ID: <1123728689.842541.163320@g44g2000cwa.googlegroups.com>
hank_rb wrote:
> Hi Louis,
>   Thanks for your reply.  I didn't know that this greedy algorithm
> didn't find such a set.  I hope it isn't too poor of an approximation!

  [ ... ]

> Do you know of any that actually solves the
> maximal independent set problem, that would be practical for such a
> large set (of 150,000, albeit somewhat sparse, graph)?

In general, it can be quite bad, since Independent Set is one of the
hardest NP-complete problems to approximate.  Since it is NP-complete,
we don't know of any exact algorithms that are much better than brute
force.

>From a computer science perspective, what you would want to do in a
case like this is try to identify some properties (e.g., small degree)
that your inputs are likely to have and then try and apply some
existing approximation results.  One starting point might be

  http://www.nada.kth.se/~viggo/wwwcompendium/node34.html

even if it's not entirely up to date.

^L
From: Raffael Cavallaro
Subject: Re: translating imperative pseudocode to functional style (repeat from comp.lang.functional)
Date: 
Message-ID: <2005081023441111272%raffaelcavallaro@pasdespamsilvousplaitmaccom>
On 2005-08-10 15:01:22 -0400, "hank_rb" <·········@gmail.com> said:

> Do you know of any that actually solves the
> maximal independent set problem, that would be practical for such a
> large set (of 150,000, albeit somewhat sparse, graph)?

Do you mean maximal or maximum?

Maximal is easier:
 (from <http://en.wikipedia.org/wiki/Independent_Set_problem> or 
<http://www.answers.com/topic/maximum-independent-set-problem>)

"A much easier problem to solve is that of finding a maximal 
independent set, which is an independent set not contained in any other 
independent set. We begin with a single vertex. We find a vertex not 
adjacent to it and add it, then find a vertex adjacent to neither of 
these and add it, and so on until we can find no more vertices to add. 
At this time the set is maximal independent."


maximum is "known to be an NP-complete problem."


Google gives these papers on the Maximum Independent Set Problem which 
present algorithms that may be useful:

<http://ie.tamu.edu/people/faculty/butenko/papers/critical_orl2.pdf>

<http://www.misojiro.t.u-tokyo.ac.jp/~tomomi/TRs/JCDCG4.pdf>

<http://eccc.uni-trier.de/eccc-reports/2005/TR05-050/Paper.pdf>
From: hank_rb
Subject: Re: translating imperative pseudocode to functional style (repeat from comp.lang.functional)
Date: 
Message-ID: <1123737854.905296.123500@f14g2000cwb.googlegroups.com>
Raffael,
  Thanks for the papers.  If efficient enough, I might use the
algorithm in the first to solve the problem.  So, I had never heard of
this problem until yesterday, but took the definition from MathWorld,
in which the two terms are synonyms, and correspond to the definition
of 'maximum', i.e. the largest set such that no neighbor relationships
exist in the set.  This is the one i'm interested in.

Thanks again for your help!

Henry