From: Alex Mizrahi
Subject: LSA (latent semantic analysis) in Lisp, or at least SVD
Date: 
Message-ID: <4550ae27$0$49203$14726298@news.sunsite.dk>
hello

i'm looking for an LSA (latent semantic analysis) implementation to play with. 
if someone knows of a working one in Common Lisp, please point me to it.

i'd also like to see anything else of that kind -- something that can 
cluster documents, infer tags, etc.

failing that, i'd like to see numeric SVD code (something able to eat sparse 
matrices is preferred).

With best regards, Alex 'killer_storm' Mizrahi. 

From: Vebjorn Ljosa
Subject: Re: LSA (latent semantic analysis) in Lisp, or at least SVD
Date: 
Message-ID: <1162921485.601019.104190@e3g2000cwe.googlegroups.com>
Alex Mizrahi wrote:
> or i'd like to see numeric SVD code (something able to eat sparse matrices
> is preferred).

A Google search found the following SVD implementations (or FFI
interfaces):

 * Matlisp, <URL:http://matlisp.sourceforge.net/>
 * CL-Mathstats,
<URL:http://common-lisp.net/project/tinaa/documentation/cl-mathstats-asdf-system/index.html>
 * OBVIUS,
<URL:http://vismod.media.mit.edu/pub/obvius/obvius-2.2.tar.Z>

I haven't tried any of them.  After you try them out, it would be great
if you could report back here with your experiences.

Vebjorn
From: Alex Mizrahi
Subject: Re: LSA (latent semantic analysis) in Lisp, or at least SVD
Date: 
Message-ID: <4550ec84$0$49202$14726298@news.sunsite.dk>
(message (Hello 'Vebjorn)
(you :wrote  :on '(7 Nov 2006 09:44:45 -0800))
(

 VL>  * Matlisp, <URL:http://matlisp.sourceforge.net/>

this is only an FFI wrapper

 VL> * OBVIUS,
 VL> <URL:http://vismod.media.mit.edu/pub/obvius/obvius-2.2.tar.Z>

this is only an FFI wrapper

 VL>  * CL-Mathstats,
 VL> <URL:http://common-lisp.net/project/tinaa/documentation/cl-mathstats-asdf-system/index.html>

this is real code. thanks -- i've somehow missed it in my own google 
search :)

)
(With-best-regards '(Alex Mizrahi) :aka 'killer_storm)
"People who lust for the Feel of keys on their fingertips (c) Inity") 
From: rif
Subject: Re: LSA (latent semantic analysis) in Lisp, or at least SVD
Date: 
Message-ID: <wj0bqni9b0e.fsf@five-percent-nation.mit.edu>
"Alex Mizrahi" <········@users.sourceforge.net> writes:

> (message (Hello 'Vebjorn)
> (you :wrote  :on '(7 Nov 2006 09:44:45 -0800))
> (
> 
>  VL>  * Matlisp, <URL:http://matlisp.sourceforge.net/>
> 
> this is only an FFI wrapper
> 
>  VL> * OBVIUS,
>  VL> <URL:http://vismod.media.mit.edu/pub/obvius/obvius-2.2.tar.Z>
> 
> this is only an FFI wrapper
> 
>  VL>  * CL-Mathstats,
>  VL> <URL:http://common-lisp.net/project/tinaa/documentation/cl-mathstats-asdf-system/index.html>
> 
> this is real code. thanks -- i've somehow missed it in my own google 
> search :)
> 
> )
> (With-best-regards '(Alex Mizrahi) :aka 'killer_storm)
> "People who lust for the Feel of keys on their fingertips (c) Inity") 

Why would you want a CL implementation of SVD?  An SVD is pretty hard
to get right, and the people who wrote LAPACK, which Matlisp wraps,
spent a lot of time getting it right.  Alternately, since you're
suggesting you want to deal with sparse matrices, what you probably
want is a wrapper around ARPACK.  Let me know if you make one, since I
could use it too.

Cheers,

rif
From: Bill Atkins
Subject: Re: LSA (latent semantic analysis) in Lisp, or at least SVD
Date: 
Message-ID: <m2slgu6h5u.fsf@royal-purple-24.dynamic2.rpi.edu>
rif <···@mit.edu> writes:

> Why would you want a CL implementation of SVD?  An SVD is pretty hard
> to get right, and the people who wrote LAPACK, which Matlisp wraps,
> spent a lot of time getting it right.  Alternately, since you're
> suggesting you want to deal with sparse matrices, what you probably
> want is a wrapper around ARPACK.  Let me know if you make one, since I
> could use it too.
>
> Cheers,
>
> rif

http://article.gmane.org/gmane.lisp.lispworks.general/5926/match=lapack
From: Alex Mizrahi
Subject: Re: LSA (latent semantic analysis) in Lisp, or at least SVD
Date: 
Message-ID: <4551f205$0$49198$14726298@news.sunsite.dk>
(message (Hello 'rif)
(you :wrote  :on '(07 Nov 2006 16:56:17 -0500))
(

 r> Why would you want a CL implementation of SVD?

i'm now using Armed Bear Common Lisp, whose FFI can easily reach only Java.
so far i'm only experimenting, so i'd rather make an implementation fast than 
make a fast implementation. i can switch to an optimized implementation afterwards.
cl-mathstats' implementation in ABCL computes a 100x100 SVD in 7 seconds. i 
think that's 100 times slower than a native one, but it's quite satisfactory for 
experiments.

 r>  An SVD is pretty hard to get right,

cl-mathstats just copies the "numerical recipes" C code; i hope it's 
correct.

)
(With-best-regards '(Alex Mizrahi) :aka 'killer_storm)
"People who lust for the Feel of keys on their fingertips (c) Inity") 
From: rif
Subject: Re: LSA (latent semantic analysis) in Lisp, or at least SVD
Date: 
Message-ID: <wj0d57yklia.fsf@five-percent-nation.mit.edu>
"Alex Mizrahi" <········@users.sourceforge.net> writes:

> (message (Hello 'rif)
> (you :wrote  :on '(07 Nov 2006 16:56:17 -0500))
> (
> 
>  r> Why would you want a CL implementation of SVD?
> 
> i'm now using Armed Bear Common Lisp, whose FFI can easily reach only Java.
> so far i'm only experimenting, so i'd rather make an implementation fast than 
> make a fast implementation. i can switch to an optimized implementation afterwards.
> cl-mathstats' implementation in ABCL computes a 100x100 SVD in 7 seconds. i 
> think that's 100 times slower than a native one, but it's quite satisfactory for 
> experiments.

There's an automatically generated (f2j) version of LAPACK available
for Java.  Haven't used it, but it might help you.  This and other
Java linear algebra packages (including some that do SVD) are linked from:

http://math.nist.gov/javanumerics/

> 
>  r>  An SVD is pretty hard to get right,
> 
> cl-mathstats just copies the "numerical recipes" C code; i hope it's 
> correct.

I cannot speak for SVD in particular, but "numerical recipes" is
pretty notorious for not getting things right.  If your matrices are
well-conditioned, it's probably not a problem.  If your matrices are
troublesome, it's quite possible the numerical recipes algorithms will
break down sooner than the LAPACK ones.  Of course, with matrix
factorizations, you can at least multiply the matrices back together,
check residuals, etc. to see how well you're doing.
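
To make the residual check concrete, here is a minimal sketch.  It is
written in plain Python only for brevity; nothing in it is specific to
LAPACK, cl-mathstats, or any other package mentioned in this thread, and
the function names are made up for illustration.  It assumes small dense
matrices stored as lists of rows, with U, the singular values s, and V^T
supplied by whatever SVD routine is being tested.

```python
import math

def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def frobenius(m):
    """Frobenius norm: square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in m for x in row))

def svd_relative_residual(a, u, s, vt):
    """Return ||A - U diag(s) V^T||_F / ||A||_F."""
    # Scale column j of U by s[j], then multiply by V^T.
    us = [[u_ij * s[j] for j, u_ij in enumerate(row)] for row in u]
    approx = matmul(us, vt)
    diff = [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, approx)]
    return frobenius(diff) / frobenius(a)

def orthogonality_error(q):
    """Return ||Q^T Q - I||_F, which is zero for orthonormal columns."""
    qt = [list(col) for col in zip(*q)]
    g = matmul(qt, q)
    n = len(g)
    return frobenius([[g[i][j] - (1.0 if i == j else 0.0)
                       for j in range(n)] for i in range(n)])
```

A relative residual near machine precision, together with small
orthogonality errors for U and V, suggests the factorization is usable.
A small residual alone does not prove the individual singular values are
accurate.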

Personally, I am always happier getting my matrix factorizations from
someone who really knew what they were doing and took the time to do
it right.

rif
From: ············@gmail.com
Subject: Re: LSA (latent semantic analysis) in Lisp, or at least SVD
Date: 
Message-ID: <1163061341.212197.248360@b28g2000cwb.googlegroups.com>
Just out of curiosity, when you say you want an SVD, does that mean you
want the full SVD factorization, or just a few singular values and
vectors?  If you want the full factorization, then there's no reason to
use sparse matrices.  If you want just a few singular values and
vectors, then ARPACK is your friend; use f2j to translate it to Java if
you have to.
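
As a rough illustration of why sparsity pays off when only a few
singular triplets are wanted, here is a sketch of plain power iteration
on A^T A.  This is NOT what ARPACK does (ARPACK uses implicitly
restarted Arnoldi/Lanczos methods, which converge much faster and
deliver many triplets at once); it only shows that the matrix is touched
solely through products with A and A^T, each costing time proportional
to the number of nonzeros.  The triple-list storage format and all
function names are assumptions made for the example.

```python
import math

def matvec(entries, x, nrows):
    """y = A x, with A given as (row, col, value) triples."""
    y = [0.0] * nrows
    for i, j, v in entries:
        y[i] += v * x[j]
    return y

def rmatvec(entries, y, ncols):
    """x = A^T y, using the same triple list."""
    x = [0.0] * ncols
    for i, j, v in entries:
        x[j] += v * y[i]
    return x

def norm(x):
    return math.sqrt(sum(t * t for t in x))

def leading_singular_triplet(entries, nrows, ncols, iters=200):
    """Power iteration on A^T A; returns (sigma, u, v) for the largest sigma."""
    # Deterministic start vector; it must not be orthogonal to the
    # leading right singular vector, or the iteration cannot find it.
    v = [1.0 / math.sqrt(ncols)] * ncols
    for _ in range(iters):
        w = rmatvec(entries, matvec(entries, v, nrows), ncols)
        n = norm(w)
        if n == 0.0:
            break  # v lies in the null space of A
        v = [t / n for t in w]
    av = matvec(entries, v, nrows)
    sigma = norm(av)
    u = [t / sigma for t in av] if sigma > 0.0 else av
    return sigma, u, v
```

Convergence depends on the gap between the two largest singular values,
so a fixed iteration count like the one above is only reasonable for a
toy problem.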

Don't modern Javas have an FFI to "native" code?

mfh
From: Robert Dodier
Subject: Re: LSA (latent semantic analysis) in Lisp, or at least SVD
Date: 
Message-ID: <1163084291.498219.43210@f16g2000cwb.googlegroups.com>
············@gmail.com wrote:

> Just out of curiosity, when you say you want an SVD, does that mean you
> want the full SVD factorization, or just a few singular values and
> vectors?  If you want the full factorization, then there's no reason to
> use sparse matrices.  If you want just a few singular values and
> vectors, then ARPACK is your friend; use f2j to translate it to Java if
> you have to.

I second the recommendation for ARPACK.

This being a Lisp newsgroup, I'll mention that f2cl could be used
to translate ARPACK (Fortran) to Lisp. 

Robert Dodier
From: Alex Mizrahi
Subject: Re: LSA (latent semantic analysis) in Lisp, or at least SVD
Date: 
Message-ID: <455335f3$0$49204$14726298@news.sunsite.dk>
(message (Hello ·············@gmail.com)
(you :wrote  :on '(9 Nov 2006 00:35:41 -0800))
(

 mh> Just out of curiosity, when you say you want an SVD, does that mean you
 mh> want the full SVD factorization, or just a few singular values and
 mh> vectors?  If you want the full factorization, then there's no reason to
 mh> use sparse matrices.

yes, i need only the few hundred most significant values/vectors for LSA.
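
for context, the matrix in question is a term-document matrix, which is 
very sparse because each document uses only a small part of the vocabulary. 
a minimal sketch of building one as (term, document, count) triples follows. 
it is in Python only to keep it short, the function name is made up, and it 
assumes whitespace tokenization and raw counts; real LSA setups usually 
apply a weighting such as tf-idf or log-entropy first.

```python
from collections import Counter

def term_document_triples(documents):
    """Return (vocab, triples) for a list of document strings.

    vocab maps each term to a row index; triples are
    (term_index, doc_index, count) for the nonzero entries only.
    """
    vocab = {}
    triples = []
    for d, text in enumerate(documents):
        counts = Counter(text.lower().split())
        # Sort terms so that index assignment is deterministic.
        for term, c in sorted(counts.items()):
            t = vocab.setdefault(term, len(vocab))
            triples.append((t, d, float(c)))
    return vocab, triples
```

only the nonzero counts are stored, so memory grows with the total number 
of distinct (term, document) pairs rather than with terms times documents.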

 mh> If you want just a few singular values and
 mh> vectors, then ARPACK is your friend; use f2j to translate it to Java if
 mh> you have to.

ok, i'll check it

mh> Don't modern Javas have an FFI to "native" code?

yes, it's possible -- but for now i want a simple solution..

)
(With-best-regards '(Alex Mizrahi) :aka 'killer_storm)
"People who lust for the Feel of keys on their fingertips (c) Inity") 
From: ············@gmail.com
Subject: Re: LSA (latent semantic analysis) in Lisp, or at least SVD
Date: 
Message-ID: <1163187007.855967.125850@h54g2000cwb.googlegroups.com>
Alex Mizrahi wrote:
> (message (Hello ·············@gmail.com)
> (you :wrote  :on '(9 Nov 2006 00:35:41 -0800))
> (
>
>  mh> Just out of curiosity, when you say you want an SVD, does that mean you
>  mh> want the full SVD factorization, or just a few singular values and
>  mh> vectors?  If you want the full factorization, then there's no reason to
>  mh> use sparse matrices.
>
> yes, i need only some hundred most significant values/vectors for LSA.)

That means you definitely shouldn't compute a full SVD.  Use ARPACK or
something like it.  I'd also recommend reading "Templates for the
Solution of Algebraic Eigenvalue Problems":

http://www.cs.ucdavis.edu/~bai/ET/contents.html

and in particular:

http://www.cs.utk.edu/~dongarra/etemplates/node200.html

which discusses software availability.  It might be a little
out-of-date so be sure to exercise Google a bit.

mfh