first Lisp program: not sure what to do

From: christaylor
Subject: first Lisp program: not sure what to do
Date: Sat, 05 Jan 2008 08:12:02 +0000
Message-ID: <0cf62616-dde2-4ac6-bd9a-ee5b207f1434@y5g2000hsf.googlegroups.com>

I'm basically a clueless newbie that's just started playing around
with Common Lisp, and I've decided on the first small project to learn
some things. I've got some basic things working, but I just can't get
a big-picture idea of the best way to approach what I want to do.
Sorry if this is really simple- here's an example of what I'm trying
to do, using a contrived list of movies and the stars, genre, etc.

Say I have a series of text files that look something like this:

  BLADE RUNNER
  Main Star: Harrison Ford
  Setting: Future
  Genre: Science Fiction

  CABARET
  Main Star: Liza Minelli
  Setting: 20th Century
  Genre: Musical

  ALL THAT JAZZ
  Main Star: Roy Scheider
  Setting:  20th Century
  Genre: Musical

I have another series of text files that look like this:

  BLADE RUNNER
  Director: Ridley Scott

  CABARET
  Director: Bob Fosse

  ALL THAT JAZZ
  Director: Bob Fosse

I want to read the files in and do something with them, and then print
out something like:

  BLADE RUNNER
  Dir: Ridley Scott
  Main Star: Harrison Ford
  Setting: Future
  Genre: Science Fiction

  CABARET
  Dir: Bob Fosse
  Main Star: Liza Minelli
  Setting: 20th Century
  Genre: Musical
  See also: ALL THAT JAZZ

  ALL THAT JAZZ
  Dir: Bob Fosse
  Main Star: Roy Scheider
  Setting:  20th Century
  Genre: Musical
  See also: CABARET

So basically, I'm reading in two different series of text files,
finding the similarities, and printing out a summary that consolidates
the information in a little easier-to-read format. Reading the files in
and printing them out is no problem, I have that working fine. I want
to show things like places where the same star or director shows up
multiple times. I have some Lisp code that makes a hash table where I
can look up 'BLADE RUNNER' and get back a list that looks like
(("Main Star:" "Harrison Ford") ("Setting:" "Future") ("Genre:"
"Science Fiction")), and I can see where maybe I could use LOOP to
iterate through the elements of the list, but I just can't think of
the best way to read in and keep track of two different hash tables,
one from the first series of text files, and the other from the second
series, where it has the movie title and just the director. Should I
use a hash table of hash tables? Or a-lists and use things like the
ASSOC function? I'd like to make it as simple as possible, and I don't
care about performance- these files will be very small and the lists
will be short. I like the idea of using something very basic like the
list manipulating functions. Am I missing a really obvious feature of
Lisp that would seem to lend itself to keeping track of and indexing
between two trees of lists, so to speak?

From: Kent M Pitman
Subject: Re: first Lisp program: not sure what to do
Date: Sat, 05 Jan 2008 09:59:13 +0000
Message-ID: <u8x34lhji.fsf@nhplace.com>

christaylor <···@imipolex-g.com> writes:

> So basically, I'm reading in two different series of text files,
> finding the similarities, and printing out a summary that consolidates
> the information in little easier-to-read format. Reading the files in
> and printing them out is no problem, I have that working fine. I want
> to show things like places where the same star or director show up
> multiple times. I have some Lisp code that makes a hash table where I
> can look up 'BLADE RUNNER' and get back a list that looks like
> (("Main Star:" "Harrison Ford") ("Setting:" "Future") ("Genre:"
> "Science Fiction")), and I can see where maybe I could use LOOP to
> iterate through the elements of the list, but I just can't think of
> the best way to read in and keep track of two different hash tables,
> one from the first series of text files, and the other from the second
> series, where it has the movie title and just the director. Should I
> use a hash table of hash tables? Or a-lists and use things like the
> ASSOC function? I'd like to make it s simple as possible, and I don't
> care about performance- these files will be very small and the lists
> will be short. I like the idea of using something very basic like the
> list manipulating functions. Am I missing a really obvious feature of
> Lisp that wold seem to lend itself to keeping track of and indexing
> between two trees of lists, so to speak?

(I'm assuming this is not homework.)

I would usually do a few things:

(1) Doing something like this when parsing:
    (intern (string-trim '(#\:) 
               (string-upcase (substitute #\- #\Space "Main Star:")))
            *package*)
    => MAIN-STAR
    will give you a symbol that's easier to index by. You can then use
    pointer lookups rather than string compares.  That's not just efficiency,
    though efficiency isn't bad.  But lots of things will require many 
    fewer arguments if you do it that way.  Use another package besides
    *package* if you like; "KEYWORD" is one possible one. Or you can make
    a special package just for this use.  You can turn the symbol back to
    a pretty name with
    (string-capitalize (substitute #\Space #\- (symbol-name 'main-star)))

(2) You can use a mix of hash tables and properties.
    (defvar *my-mdb* (make-hash-table))
    (setf (getf (gethash 'blade-runner *my-mdb*) 'director) 'ridley-scott)
    (setf (getf (gethash 'blade-runner *my-mdb*) 'main-star) 'harrison-ford)
    etc.
    But you could do this entirely without hash tables using only 
    property lists. Personally, for a simple application like this, I'd
    find this easiest to do.  Just let Lisp be your db.
    (defvar *all-movies* '())
    (defvar *all-actors* '())
    (defun stars-in (movie &rest actors)
      (pushnew movie *all-movies*)
      (dolist (actor actors)
        (pushnew actor *all-actors*)
        (push actor (get movie 'actors))
        (push movie (get actor 'movies))))
    (stars-in 'blade-runner 'harrison-ford 'rutger-hauer 'sean-young)
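To connect this with the "See also" lines in the original question, here is a minimal sketch in the same property-list style. It is only an illustration: FIELD->SYMBOL, DIRECTED-BY and SEE-ALSO are hypothetical helper names that do not appear in the post above, and the sketch assumes the standard readtable case, so that 'main-star reads as MAIN-STAR.

```lisp
;; Sketch only: FIELD->SYMBOL, DIRECTED-BY and SEE-ALSO are hypothetical
;; helper names, chosen for this illustration.

(defun field->symbol (string)
  "Turn a field label such as \"Main Star:\" into the symbol MAIN-STAR."
  (intern (string-trim '(#\:)
                       (string-upcase (substitute #\- #\Space string)))
          *package*))

(defun directed-by (movie director)
  "Record DIRECTOR on MOVIE's plist and MOVIE on DIRECTOR's plist."
  (setf (get movie 'director) director)
  ;; PUSHNEW keeps the list free of duplicates if a file is read twice.
  (pushnew movie (get director 'movies)))

(defun see-also (movie)
  "Return the other movies recorded for MOVIE's director."
  (remove movie (get (get movie 'director) 'movies)))

(directed-by 'blade-runner 'ridley-scott)
(directed-by 'cabaret 'bob-fosse)
(directed-by 'all-that-jazz 'bob-fosse)

(see-also 'cabaret)       ; => (ALL-THAT-JAZZ)
(see-also 'blade-runner)  ; => NIL
```

Because the data lives on symbol property lists, it is global to the Lisp image. That is convenient for a small experiment like this one, but it would need rethinking if two independent databases had to coexist.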

From: smallpond
Subject: Re: first Lisp program: not sure what to do
Date: Sat, 05 Jan 2008 12:36:41 +0000
Message-ID: <9684c2e0-c45a-430a-9a56-466d3a0f7be8@d4g2000prg.googlegroups.com>

On Jan 5, 4:59 am, Kent M Pitman <······@nhplace.com> wrote:
> christaylor <····@imipolex-g.com> writes:
> > So basically, I'm reading in two different series of text files,
> > finding the similarities, and printing out a summary that consolidates
> > the information in little easier-to-read format. Reading the files in
> > and printing them out is no problem, I have that working fine. I want
> > to show things like places where the same star or director show up
> > multiple times. I have some Lisp code that makes a hash table where I
> > can look up 'BLADE RUNNER' and get back a list that looks like
> > (("Main Star:" "Harrison Ford") ("Setting:" "Future") ("Genre:"
> > "Science Fiction")), and I can see where maybe I could use LOOP to
> > iterate through the elements of the list, but I just can't think of
> > the best way to read in and keep track of two different hash tables,
> > one from the first series of text files, and the other from the second
> > series, where it has the movie title and just the director. Should I
> > use a hash table of hash tables? Or a-lists and use things like the
> > ASSOC function? I'd like to make it s simple as possible, and I don't
> > care about performance- these files will be very small and the lists
> > will be short. I like the idea of using something very basic like the
> > list manipulating functions. Am I missing a really obvious feature of
> > Lisp that wold seem to lend itself to keeping track of and indexing
> > between two trees of lists, so to speak?
>
> (I'm assuming this is not homework.)
>
> I would usually do a few things:
>
> (1) Doing something like this when parsing:
>     (intern (string-trim '(#\:)
>                (string-upcase (substitute #\- #\Space "Main Star:")))
>             *package*)
>     => MAIN-STAR
>     will give you a symbol that's easier to index by. You can then use
>     pointer lookups rather than string compares.  That's not just efficiency,
>     though efficiency isn't bad.  But lots of things will require many
>     fewer arguments if you do it that way.  Use another package besides
>     *package* if you like; "KEYWORD" is one possible one. Or you can make
>     a special package just for this use.  You can turn the symbol back to
>     a pretty name with
>     (string-capitalize (substitute #\Space #\- (symbol-name 'main-star)))
>
> (2) You can use a mix of hash tables and properties.
>     (defvar *my-mdb* (make-hash-table)
>     (setf (getf (gethash 'blade-runner *my-mdb*) 'director) 'ridley-scott)
>     (setf (getf (gethash 'blade-runner *my-mdb*) 'main-star) 'harrison-ford)
>     etc.
>     But you could do this entirely without hash tables using only
>     property lists. Personally, for a simple application like this, I'd
>     find this easiest to do.  Just let Lisp be your db.
>     (defvar *all-movies* '())
>     (defvar *all-actors* '())
>     (defun stars-in (movie &rest actors)
>       (pushnew movie *all-movies*)
>       (dolist (actor actors)
>         (pushnew actor *all-actors*)
>         (push actor (get movie 'actors))
>         (push movie (get actor 'movies))))
>     (stars-in 'blade-runner 'harrison-ford 'rutger-hauer 'sean-young)


Of course it's homework.  What clueless newby would have
CABARET and BLADE RUNNER in the same list?

If this were not homework Chris would have to worry about
uniqueness.  Hollywood actor names are unique by decree,
but movie titles are not.  You could combine movie title
and year of release, but even then there might be a few
errors.
--S

From: Ken Tilton
Subject: Re: first Lisp program: not sure what to do
Date: Sat, 05 Jan 2008 13:20:57 +0000
Message-ID: <477f3e02$0$9171$607ed4bc@cv.net>

christaylor wrote:
> I'm basically a clueless newbie that's just started playing around
> with Common Lisp, and I've decided on the first small project to learn
> some things. I've got some basic things working, but I just can't get
> a big-picture idea of the best way to approach what I want to do.
> Sorry if this is really simple- here's an example of what I'm trying
> to do, using a contrived list of movies and the stars, genre, etc.
> 
> Say I have a series of text files that look something like this:
> 
>   BLADE RUNNER
>   Main Star: Harrison Ford
>   Setting: Future
>   Genre: Science Fiction
> 
>   CABARET
>   Main Star: Liza Minelli
>   Setting: 20th Century
>   Genre: Musical
> 
>   ALL THAT JAZZ
>   Main Star: Roy Scheider
>   Setting:  20th Century
>   Genre: Musical
> 
> I have another series of text files that look like this:
> 
>   BLADE RUNNER
>   Director: Ridley Scott
> 
>   CABARET
>   Director: Bob Fosse
> 
>   ALL THAT JAZZ
>   Director: Bob Fosse
> 
> I want to read the files in and do something with them, and then print
> out something like:
> 
>   BLADE RUNNER
>   Dir: Ridley Scott
>   Main Star: Harrison Ford
>   Setting: Future
>   Genre: Science Fiction
> 
>   CABARET
>   Dir: Bob Fosse
>   Main Star: Liza Minelli
>   Setting: 20th Century
>   Genre: Musical
>   See also: ALL THAT JAZZ
> 
>   ALL THAT JAZZ
>   Dir: Bob Fosse
>   Main Star: Roy Scheider
>   Setting:  20th Century
>   Genre: Musical
>   See also: CABARET
> 
> So basically, I'm reading in two different series of text files,
> finding the similarities, and printing out a summary that consolidates
> the information in little easier-to-read format. Reading the files in
> and printing them out is no problem, I have that working fine. I want
> to show things like places where the same star or director show up
> multiple times. I have some Lisp code that makes a hash table where I
> can look up 'BLADE RUNNER' and get back a list that looks like
> (("Main Star:" "Harrison Ford") ("Setting:" "Future") ("Genre:"
> "Science Fiction")), and I can see where maybe I could use LOOP to
> iterate through the elements of the list, but I just can't think of
> the best way to read in and keep track of two different hash tables,
> one from the first series of text files, and the other from the second
> series, where it has the movie title and just the director. Should I
> use a hash table of hash tables? Or a-lists and use things like the
> ASSOC function? I'd like to make it s simple as possible, and I don't
> care about performance- these files will be very small and the lists
> will be short. I like the idea of using something very basic like the
> list manipulating functions. Am I missing a really obvious feature of
> Lisp that wold seem to lend itself to keeping track of and indexing
> between two trees of lists, so to speak?

You seem to have a good grasp of the tools you might apply, so... Just 
Try It. You might do something daft at first, but when you go to use the 
data you will then find you cannot do something you want to do. Then 
change what you did up front until you can. The epiphany will come.

Uh, implicit in this is that, yes, you can do this with no more than 
vanilla Lisp data structures and "library" routines*. Seibel's PCL 
(available on-line) has a lot of examples.

One practical question I had, btw, was that it sounded as if you were 
trying to keep the data separated based on origin. I am not sure that is 
what you were saying, but if so, how come? I would think you would be 
merging the data from differently formatted inputs into one normalized 
database.

kt

* If you want to have a lot of fun, grab the free ACL and use 
AllegroGraph, the solution will just drop out of a tree for you and the 
data will be persistent. k

-- 
http://www.theoryyalgebra.com/

"In the morning, hear the Way;
  in the evening, die content!"
                     -- Confucius

From: Ken Tilton
Subject: Re: first Lisp program: not sure what to do
Date: Sat, 05 Jan 2008 14:01:53 +0000
Message-ID: <477f479b$0$9150$607ed4bc@cv.net>

Ken Tilton wrote:
> 
> 
> christaylor wrote:
> 
>> I'm basically a clueless newbie that's just started playing around
>> with Common Lisp, and I've decided on the first small project to learn
>> some things. I've got some basic things working, but I just can't get
>> a big-picture idea of the best way to approach what I want to do.
>> Sorry if this is really simple- here's an example of what I'm trying
>> to do, using a contrived list of movies and the stars, genre, etc.
>>
>> Say I have a series of text files that look something like this:
>>
>>   BLADE RUNNER
>>   Main Star: Harrison Ford
>>   Setting: Future
>>   Genre: Science Fiction
>>
>>   CABARET
>>   Main Star: Liza Minelli
>>   Setting: 20th Century
>>   Genre: Musical
>>
>>   ALL THAT JAZZ
>>   Main Star: Roy Scheider
>>   Setting:  20th Century
>>   Genre: Musical
>>
>> I have another series of text files that look like this:
>>
>>   BLADE RUNNER
>>   Director: Ridley Scott
>>
>>   CABARET
>>   Director: Bob Fosse
>>
>>   ALL THAT JAZZ
>>   Director: Bob Fosse
>>
>> I want to read the files in and do something with them, and then print
>> out something like:
>>
>>   BLADE RUNNER
>>   Dir: Ridley Scott
>>   Main Star: Harrison Ford
>>   Setting: Future
>>   Genre: Science Fiction
>>
>>   CABARET
>>   Dir: Bob Fosse
>>   Main Star: Liza Minelli
>>   Setting: 20th Century
>>   Genre: Musical
>>   See also: ALL THAT JAZZ
>>
>>   ALL THAT JAZZ
>>   Dir: Bob Fosse
>>   Main Star: Roy Scheider
>>   Setting:  20th Century
>>   Genre: Musical
>>   See also: CABARET
>>
>> So basically, I'm reading in two different series of text files,
>> finding the similarities, and printing out a summary that consolidates
>> the information in little easier-to-read format. Reading the files in
>> and printing them out is no problem, I have that working fine. I want
>> to show things like places where the same star or director show up
>> multiple times. I have some Lisp code that makes a hash table where I
>> can look up 'BLADE RUNNER' and get back a list that looks like
>> (("Main Star:" "Harrison Ford") ("Setting:" "Future") ("Genre:"
>> "Science Fiction")), and I can see where maybe I could use LOOP to
>> iterate through the elements of the list, but I just can't think of
>> the best way to read in and keep track of two different hash tables,
>> one from the first series of text files, and the other from the second
>> series, where it has the movie title and just the director. Should I
>> use a hash table of hash tables? Or a-lists and use things like the
>> ASSOC function? I'd like to make it s simple as possible, and I don't
>> care about performance- these files will be very small and the lists
>> will be short. I like the idea of using something very basic like the
>> list manipulating functions. Am I missing a really obvious feature of
>> Lisp that wold seem to lend itself to keeping track of and indexing
>> between two trees of lists, so to speak?
> 
> 
> You seem to have a good grasp of the tools you might apply, so... Just 
> Try It. You might do something daft at first, but when you go to use the 
> data you will then find you cannot do something you want to do. Then 
> change what you did up front until you can. The epiphany will come.
> 
> Uh, implicit in this is that, yes, you can do this with no more than 
> vanilla Lisp data structures and "library" routines*. Seibel's PCL 
> (available on-line) had a lot of examples.
> 
> One practical question I had, btw, was that it sounded as if you were 
> trying to keep the data separated based on origin. I am not sure that is 
> what you were saying, but if so, how come? I would think you would be 
> merging the data from differently formatted inputs into one normalized 
> database.
> 
> kt
> 
> * If you want to have a lot of fun, grab the free ACL and use 
> AllegroGraph, the solution will just drop out of a tree for you and the 
> data will be persistent. k
> 

ps. And if you /do/ have to retain info about where the data came from, 
RDF uses the "named graph" slot for that by convention. k

-- 
http://www.theoryyalgebra.com/

"In the morning, hear the Way;
  in the evening, die content!"
                     -- Confucius

From: Rainer Joswig
Subject: Re: first Lisp program: not sure what to do
Date: Sat, 05 Jan 2008 11:02:51 +0000
Message-ID: <joswig-4B4B19.12025005012008@news-europe.giganews.com>

In article 
<····································@y5g2000hsf.googlegroups.com>,
 christaylor <···@imipolex-g.com> wrote:

> I'm basically a clueless newbie that's just started playing around
> with Common Lisp, and I've decided on the first small project to learn
> some things. I've got some basic things working, but I just can't get
> a big-picture idea of the best way to approach what I want to do.
> Sorry if this is really simple- here's an example of what I'm trying
> to do, using a contrived list of movies and the stars, genre, etc.
> 
> Say I have a series of text files that look something like this:
> 
>   BLADE RUNNER
>   Main Star: Harrison Ford
>   Setting: Future
>   Genre: Science Fiction
> 
>   CABARET
>   Main Star: Liza Minelli
>   Setting: 20th Century
>   Genre: Musical
> 
>   ALL THAT JAZZ
>   Main Star: Roy Scheider
>   Setting:  20th Century
>   Genre: Musical
> 
> I have another series of text files that look like this:
> 
>   BLADE RUNNER
>   Director: Ridley Scott
> 
>   CABARET
>   Director: Bob Fosse
> 
>   ALL THAT JAZZ
>   Director: Bob Fosse
> 
> I want to read the files in and do something with them, and then print
> out something like:
> 
>   BLADE RUNNER
>   Dir: Ridley Scott
>   Main Star: Harrison Ford
>   Setting: Future
>   Genre: Science Fiction
> 
>   CABARET
>   Dir: Bob Fosse
>   Main Star: Liza Minelli
>   Setting: 20th Century
>   Genre: Musical
>   See also: ALL THAT JAZZ
> 
>   ALL THAT JAZZ
>   Dir: Bob Fosse
>   Main Star: Roy Scheider
>   Setting:  20th Century
>   Genre: Musical
>   See also: CABARET
> 
> So basically, I'm reading in two different series of text files,
> finding the similarities, and printing out a summary that consolidates
> the information in little easier-to-read format. Reading the files in
> and printing them out is no problem, I have that working fine. I want
> to show things like places where the same star or director show up
> multiple times. I have some Lisp code that makes a hash table where I
> can look up 'BLADE RUNNER' and get back a list that looks like
> (("Main Star:" "Harrison Ford") ("Setting:" "Future") ("Genre:"
> "Science Fiction")), and I can see where maybe I could use LOOP to
> iterate through the elements of the list, but I just can't think of
> the best way to read in and keep track of two different hash tables,
> one from the first series of text files, and the other from the second
> series, where it has the movie title and just the director. Should I
> use a hash table of hash tables? Or a-lists and use things like the
> ASSOC function? I'd like to make it s simple as possible, and I don't
> care about performance- these files will be very small and the lists
> will be short. I like the idea of using something very basic like the
> list manipulating functions. Am I missing a really obvious feature of
> Lisp that wold seem to lend itself to keeping track of and indexing
> between two trees of lists, so to speak?

Hey, a very good question. You have some data, a programming
problem and a programming language. Now what? The task
is to find a good data representation that fits your requirements.


You have a basic data type movie:

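(defclass movie ()
  ((director  :accessor movie-director  :initarg :director)
   (main-star :accessor movie-main-star :initarg :main-star)
   (setting   :accessor movie-setting   :initarg :setting)
   (genre     :accessor movie-genre     :initarg :genre)))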

You need functions:

(defparameter *movies-by-title* (make-hash-table ...))

(defun read-movies-1 (stream movies)
"fills the hashtable"
  )

(defun merge-information-from-movies-2 (movies stream)
  "adds information from file 2 to movies")

(defun director->movies (movies director)
  "Returns a list of movies of that director"
; primitive: map over the movies hash-table and find
; all movies with that director
 )

(defmethod movie-see-also ((a-movie movie))
  (director->movies *movies-by-title*
                    (movie-director a-movie)))


(defmethod print-movies-with-see-also ((a-movie movie))
  ; print the slots
  ; print the director->movies results
  )
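The "primitive" lookup described in the comments can be sketched as follows. This is one possible filling-in of the skeleton, not a definitive version: the TITLE slot, the EQUAL hash-table test, and the ADD-MOVIE helper are assumptions made so that the example runs on its own. MOVIE-SEE-ALSO here also removes the movie itself from the result, which the skeleton does not do.

```lisp
;; Sketch only: the TITLE slot, the EQUAL test and ADD-MOVIE are assumptions.

(defclass movie ()
  ((title    :initarg :title    :accessor movie-title)
   (director :initarg :director :accessor movie-director :initform nil)))

(defparameter *movies-by-title* (make-hash-table :test #'equal))

(defun add-movie (title &optional director)
  "Store a MOVIE instance under TITLE, replacing any earlier entry."
  (setf (gethash title *movies-by-title*)
        (make-instance 'movie :title title :director director)))

(defun director->movies (movies director)
  "Return a list of the movies in the hash table MOVIES directed by DIRECTOR."
  (loop for movie being the hash-values of movies
        when (equal (movie-director movie) director)
          collect movie))

(defmethod movie-see-also ((a-movie movie))
  "Return the other movies by A-MOVIE's director, or NIL if none is recorded."
  (let ((director (movie-director a-movie)))
    (when director
      (remove a-movie (director->movies *movies-by-title* director)))))

(add-movie "BLADE RUNNER" "Ridley Scott")
(add-movie "CABARET" "Bob Fosse")
(add-movie "ALL THAT JAZZ" "Bob Fosse")

(mapcar #'movie-title
        (movie-see-also (gethash "CABARET" *movies-by-title*)))
;; => ("ALL THAT JAZZ")
```

Scanning the whole table for every lookup is linear in the number of movies, which should be fine for small files like the ones described.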



; --------------


Above is one possibility. Low level. Programmatic.
You can imagine that you can write a general descriptive
version of that without the domain being hardwired.

Another way would be to write a little relational
database.

A very good example is in LISP, 3rd edition by Winston and Horn.
Chapter 30: Procedure-Writing Programs and Database Interfaces.
The domain there is 'screwdrivers', 'suppliers' and 'customers'.
The task there also is to write a little 'natural language'
query language.


; -------------

Another possibility.

Break up the representation completely into relations.
Take any small Prolog (or similar) implementation in Lisp
and use that.

See PAIP (Peter Norvig, Paradigms of Artificial Intelligence
Programming) for a small Prolog. This allows you
to add relations similar to (tell (director blade-runner ridley-scott))
and also retrieve things with (ask (director ?movies ridley-scott)) .
This TELL and ASK pattern you find very often in such programs.
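
To give a feel for the TELL and ASK pattern without a full Prolog, here is a deliberately tiny sketch. It is not the PAIP implementation: it only matches flat facts against flat patterns, with no unification and no rules. TELL and ASK are ordinary functions here, so their arguments need quoting, and *FACTS*, VARIABLE-P and MATCH are hypothetical names.

```lisp
;; Sketch only: a flat fact list with one-level pattern matching.
;; *FACTS*, VARIABLE-P and MATCH are hypothetical names.

(defvar *facts* '())

(defun tell (fact)
  "Add FACT (a list of symbols) to the database unless already present."
  (pushnew fact *facts* :test #'equal)
  fact)

(defun variable-p (x)
  "True for symbols whose name starts with a question mark, e.g. ?MOVIE."
  (and (symbolp x)
       (plusp (length (symbol-name x)))
       (char= (char (symbol-name x) 0) #\?)))

(defun match (pattern fact)
  "Return (values bindings t) if PATTERN matches FACT, else (values nil nil)."
  (if (/= (length pattern) (length fact))
      (values nil nil)
      (let ((bindings '()))
        (loop for p in pattern
              for f in fact
              do (cond ((variable-p p)
                        (let ((seen (assoc p bindings)))
                          (cond ((null seen)
                                 (push (cons p f) bindings))
                                ;; A repeated variable must match the same value.
                                ((not (equal (cdr seen) f))
                                 (return-from match (values nil nil))))))
                       ((not (equal p f))
                        (return-from match (values nil nil)))))
        (values (reverse bindings) t))))

(defun ask (pattern)
  "Return one binding alist for every stored fact that matches PATTERN."
  (loop for fact in (reverse *facts*)
        for (bindings matched-p) = (multiple-value-list (match pattern fact))
        when matched-p
          collect bindings))

(tell '(director blade-runner ridley-scott))
(tell '(director cabaret bob-fosse))
(tell '(director all-that-jazz bob-fosse))

(ask '(director ?movie bob-fosse))
;; => (((?MOVIE . CABARET)) ((?MOVIE . ALL-THAT-JAZZ)))
```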

-- 
http://lispm.dyndns.org/

From: viper-2
Subject: Re: first Lisp program: not sure what to do
Date: Sat, 05 Jan 2008 16:52:33 +0000
Message-ID: <0f1dccc2-817a-4535-9e07-e0d889e553fa@r60g2000hsc.googlegroups.com>

On Jan 5, 6:02 am, Rainer Joswig <······@lisp.de> wrote:
> In article
> <····································@y5g2000hsf.googlegroups.com>,
>
>
>
>  christaylor <····@imipolex-g.com> wrote:
> > I'm basically a clueless newbie that's just started playing around
> > with Common Lisp, and I've decided on the first small project to learn
> > some things.
...

> Another way would be to write a little relational
> database.
>
> A very good example is in LISP, 3rd edition by Winston and Horn.
> Chapter 30: Procedure-Writing Programs and Database Interfaces.
> The domain there is 'screwdrivers', 'suppliers' and 'customers'.
> The task there also is two write a little 'natural language'
> query language.

I can't tell you how much of a relief it is to know that there are
other lispers who read beyond chapter 3 of any lisp book. ;-)

I wouldn't point Chris, who admits to being a "clueless newbie" to
chapter 30 of Lisp 3rd edition just now though. I would explore some
basic structures for starters.

agt

From: christaylor
Subject: Re: first Lisp program: not sure what to do
Date: Sat, 05 Jan 2008 18:29:55 +0000
Message-ID: <46068e56-2961-4ddf-b4fa-4da0635d2783@s12g2000prg.googlegroups.com>

Awesome- thanks so much for all of your ideas! This isn't homework,
it's just something I'm playing around with for fun.

From: Pascal Bourguignon
Subject: Re: first Lisp program: not sure what to do
Date: Sat, 05 Jan 2008 11:12:54 +0000
Message-ID: <873atc34qx.fsf@thalassa.informatimago.com>

christaylor <···@imipolex-g.com> writes:
> [...]
> So basically, I'm reading in two different series of text files,
> finding the similarities, and printing out a summary that consolidates
> the information in little easier-to-read format. Reading the files in
> and printing them out is no problem, I have that working fine. I want
> to show things like places where the same star or director show up
> multiple times. I have some Lisp code that makes a hash table where I
> can look up 'BLADE RUNNER' and get back a list that looks like
> (("Main Star:" "Harrison Ford") ("Setting:" "Future") ("Genre:"
> "Science Fiction")),  and I can see where maybe I could use LOOP to
> iterate through the elements of the list, but I just can't think of
> the best way to read in and keep track of two different hash tables,
> one from the first series of text files, and the other from the second
> series, where it has the movie title and just the director. Should I
> use a hash table of hash tables? Or a-lists and use things like the
> ASSOC function? I'd like to make it s simple as possible, and I don't
> care about performance- these files will be very small and the lists
> will be short. I like the idea of using something very basic like the
> list manipulating functions. Am I missing a really obvious feature of
> Lisp that wold seem to lend itself to keeping track of and indexing
> between two trees of lists, so to speak?

(defparameter *renames* '(("Director:" "Dir:")))

(let ((movies-1 (load-movie-file "file1"))
      (movies-2 (load-movie-file "file2")))
   (let ((merged (merge-movies movies-1 movies-2)))
      (save-movie-file (rename-fields merged *renames*) "file3")))

I don't see any difference between your files...




Note that how you implement your collections doesn't matter, you
should abstract it away, defining functions such as:

(make-movie list-of-list-of-field-name-and-field-value) -> movie
(movie-field-names movie) -> list-of-field-name
(movie-get-field movie field-name) -> field-value
(map-movie-fields (lambda (field-name field-value) ...) movie)

(make-movie-collection) -> movie-collection
(insert-movie movie-collection movie)
(remove-movie movie-collection movie)
(map-movies (lambda (movie) ...) movie-collection)


load-movie-file, rename-fields, merge-movies and save-movie-file can
be implemented using these abstract functions.  Then you can change
their implementations to use lists, a-lists, hash-tables, whatever.
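
As an illustration of that point, here is one possible backing implementation for part of the interface above, using plain lists of (field-name field-value) pairs. It is a sketch under assumptions: the post does not say how a movie's title is stored, so this version assumes it is kept as a "Title:" field, and the behaviour of MERGE-MOVIES (fields from the second collection are appended, unmatched movies are kept) is a guess at what is wanted.

```lisp
;; Sketch only: movies are lists of (field-name field-value) lists, and the
;; title is assumed to be stored under the field name "Title:".

(defun make-movie (fields)
  "FIELDS is a list of (field-name field-value) lists; it is copied."
  (copy-tree fields))

(defun movie-field-names (movie)
  (mapcar #'first movie))

(defun movie-get-field (movie field-name)
  (second (assoc field-name movie :test #'string=)))

(defun map-movie-fields (function movie)
  "Call FUNCTION with each field name and value; return the results as a list."
  (loop for (name value) in movie
        collect (funcall function name value)))

(defun merge-movies (movies-1 movies-2 &key (key "Title:"))
  "Return a new list of movies, joining entries whose KEY field is STRING=."
  (flet ((lookup (movie movies)
           (find (movie-get-field movie key) movies
                 :key (lambda (m) (movie-get-field m key))
                 :test #'string=)))
    (append
     ;; Movies from the first collection, extended with matching fields.
     (loop for movie in movies-1
           for other = (lookup movie movies-2)
           collect (make-movie
                    (append movie
                            (remove key other :key #'first :test #'string=))))
     ;; Movies that only appear in the second collection.
     (remove-if (lambda (movie) (lookup movie movies-1)) movies-2))))

(let* ((movies-1 (list (make-movie '(("Title:" "CABARET")
                                     ("Main Star:" "Liza Minelli")
                                     ("Genre:" "Musical")))))
       (movies-2 (list (make-movie '(("Title:" "CABARET")
                                     ("Director:" "Bob Fosse")))))
       (merged (merge-movies movies-1 movies-2)))
  (movie-field-names (first merged)))
;; => ("Title:" "Main Star:" "Genre:" "Director:")
```

Callers only go through MAKE-MOVIE, MOVIE-GET-FIELD and friends, so switching the representation to hash tables later should not require touching MERGE-MOVIES' callers.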

-- 
__Pascal_Bourguignon__               _  Software patents are endangering
()  ASCII ribbon against html email (o_ the computer industry all around
/\  1962:DO20I=1.100                //\ the world http://lpf.ai.mit.edu/
    2001:my($f)=`fortune`;          V_/   http://petition.eurolinux.org/

From: Robert Uhl
Subject: Re: first Lisp program: not sure what to do
Date: Sat, 05 Jan 2008 20:17:29 +0000
Message-ID: <m3ejcwrpra.fsf@latakia.dyndns.org>

christaylor <···@imipolex-g.com> writes:

> I'm basically a clueless newbie that's just started playing around
> with Common Lisp, and I've decided on the first small project to learn
> some things.

That's always a good idea.

> Say I have a series of text files that look something like this:
>
>   BLADE RUNNER
>   Main Star: Harrison Ford
>   Setting: Future
>   Genre: Science Fiction
>
>   CABARET
>   Main Star: Liza Minelli
>   Setting: 20th Century
>   Genre: Musical
>
>   ALL THAT JAZZ
>   Main Star: Roy Scheider
>   Setting:  20th Century
>   Genre: Musical

It's not relevant, but I'm curious--do the files have meaningful names,
or are they just randomly-named?

> I have another series of text files that look like this:
>
>   BLADE RUNNER
>   Director: Ridley Scott
>
>   CABARET
>   Director: Bob Fosse
>
>   ALL THAT JAZZ
>   Director: Bob Fosse

Cool so far.

> I want to read the files in and do something with them, and then print
> out something like:
>
>   BLADE RUNNER
>   Dir: Ridley Scott
>   Main Star: Harrison Ford
>   Setting: Future
>   Genre: Science Fiction
>
>   CABARET
>   Dir: Bob Fosse
>   Main Star: Liza Minelli
>   Setting: 20th Century
>   Genre: Musical
>   See also: ALL THAT JAZZ
>
>   ALL THAT JAZZ
>   Dir: Bob Fosse
>   Main Star: Roy Scheider
>   Setting:  20th Century
>   Genre: Musical
>   See also: CABARET

Well, you're doing two things here: first, joining different files
together.  That's cool.  You're also suggesting other films based on
director; that's another problem, but also cool.

> So basically, I'm reading in two different series of text files,
> finding the similarities, and printing out a summary that consolidates
> the information in little easier-to-read format. Reading the files in
> and printing them out is no problem, I have that working fine. I want
> to show things like places where the same star or director show up
> multiple times. I have some Lisp code that makes a hash table where I
> can look up 'BLADE RUNNER' and get back a list that looks like (("Main
> Star:" "Harrison Ford") ("Setting:" "Future") ("Genre:" "Science
> Fiction")), and I can see where maybe I could use LOOP to iterate
> through the elements of the list, but I just can't think of the best
> way to read in and keep track of two different hash tables, one from
> the first series of text files, and the other from the second series,
> where it has the movie title and just the director.

Why use two hash tables, one for each file set?  It seems to me that
each film might be represented as a plist (or hash table, if you need
O(1) access time):

  (defvar *films*
          '((:title     "Blade Runner"
             :director  "Ridley Scott"
             :main-star "Harrison Ford"
             :setting   "Future"
             :genre     "Science Fiction")

            (:title     "Cabaret"
             :director  "Bob Fosse"
             :main-star "Liza Minelli"
             :setting   "20th Century"
             :genre     "Musical")

            (:title     "All That Jazz"
             :director  "Bob Fosse"
             :main-star "Roy Scheider"
             :setting   "20th Century" 
             :genre     "Musical")))

To list all the films you have, you could do:

  (mapcar (lambda (film) (getf film :title)) *films*)

  >>> ("Blade Runner" "Cabaret" "All That Jazz")

To search for a film, you could do:

  (find "All That Jazz" *films*
        :key (lambda (film) (getf film :title))
        :test #'string-equal)
  >>> (:title "All That Jazz" :director "Bob Fosse" :main-star "Roy Scheider"
       :setting "20th Century" :genre "Musical")

You could define a function SUGGEST-FILMS like so:

  (defun suggest-films (film)
    "Return all films in *FILMS* which have the same director or
main star as FILM."
    (remove film
            (remove-if-not (lambda (x)
                             (or (string-equal (getf x :director)
                                               (getf film :director))
                                 (string-equal (getf x :main-star)
                                               (getf film :main-star))))
                           *films*)
            :test #'equalp))

  (suggest-films (first *films*)) ; Blade Runner
  >>> nil
  (suggest-films (second *films*)) ; Cabaret
  >>> ((:title "All That Jazz" :director "Bob Fosse" :main-star "Roy Scheider"
        :setting "20th Century" :genre "Musical"))
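
To get from there to the report layout in the original post, something along these lines might do.  It is only a sketch: PRINT-FILM is a made-up name, it assumes the plist keys shown above, and it takes the list of related films as an argument so that it does not depend on any particular suggestion function.

```lisp
(defun print-film (film see-also &optional (stream *standard-output*))
  "Print the plist FILM to STREAM, followed by a See also line naming
the films in the list SEE-ALSO when that list is not empty."
  (format stream "~a~%Dir: ~a~%Main Star: ~a~%Setting: ~a~%Genre: ~a~%"
          (string-upcase (getf film :title))
          (getf film :director)
          (getf film :main-star)
          (getf film :setting)
          (getf film :genre))
  (when see-also
    ;; ~{ ... ~} iterates over the list; ~^ stops before the last comma.
    (format stream "See also: ~{~a~^, ~}~%"
            (mapcar (lambda (other) (string-upcase (getf other :title)))
                    see-also))))
```

Calling (print-film film (suggest-films film)) for each film in *FILMS* would then print the whole summary.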

There are, of course, rather more efficient algorithms and
representations to use--this would all bog down very, very badly on
something the size of IMDB.  The first optimisation would be to
represent a film as a hash table; you could even write a function
MAKE-FILM which will turn a film plist into a hash table:

  (defun make-film (film)
    "Return a hash table holding the key/value pairs of the plist FILM."
    (let ((hash (make-hash-table)))
      ;; Walk the plist two elements at a time: key, then value.
      (loop for (attribute value) on film by #'cddr
            do (setf (gethash attribute hash) value))
      hash))

You could then modify your search and suggest bits accordingly.
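
As a rough idea of what that modification might look like, here is a sketch in which each film is a hash table keyed by the same keywords as the plists.  FIND-FILM/HASH and SUGGEST-FILMS/HASH are hypothetical names chosen to avoid clashing with the plist versions, and both take the list of films as an argument rather than reading a global.

```lisp
(defun find-film/hash (title films)
  "Return the first hash-table film in FILMS whose :TITLE matches
TITLE, ignoring case, or NIL if there is none."
  (find title films
        :key (lambda (film) (gethash :title film))
        :test #'string-equal))

(defun suggest-films/hash (film films)
  "Return the films in FILMS, other than FILM itself, that share
FILM's director or main star."
  (flet ((same-p (attribute other)
           (string-equal (gethash attribute other)
                         (gethash attribute film))))
    (remove-if-not (lambda (other)
                     ;; Hash tables are compared by identity here.
                     (and (not (eq other film))
                          (or (same-p :director other)
                              (same-p :main-star other))))
                   films)))
```

This only changes how one film's attributes are looked up; the search over the list of films is still linear.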

Or you could create a class FILM with the appropriate slots and
accessors, which would make life somewhat easier in some instances:

  (defclass film ()
    ((title
      :accessor title
      :initarg :title)
     (director
      :accessor director
      :initarg :director)
     (main-star
      :accessor main-star
      :initarg :main-star)
     (genre
      :accessor genre
      :initarg :genre)
     (setting
      :accessor setting
      :initarg :setting)))

  (defun make-film (film)
    ;; This replaces the hash-table MAKE-FILM above.  It takes advantage
    ;; of the fact that the plist keys are the same as the initargs
    ;; for FILM.
    (apply #'make-instance 'film film))
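
One place where the accessors make life easier is searching: the accessor function itself can be handed to a general-purpose filter.  A small sketch, where FILMS-BY is a made-up name and the attribute values are assumed to be strings:

```lisp
(defun films-by (accessor value films)
  "Return the elements of FILMS for which ACCESSOR returns a string
equal to VALUE, ignoring case."
  (remove-if-not (lambda (film)
                   (string-equal (funcall accessor film) value))
                 films))
```

With a list of FILM instances this would be called as (films-by #'director "Bob Fosse" films), and the same function serves for #'main-star or #'genre.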

This is all very primitive stuff, of course--there are better methods of
dealing with data that must be searched on multiple keys and so forth.
But for playing around it can be fun.

-- 
Robert Uhl <http://public.xdi.org/=ruhl>
I had always secretly suspected that any country with better than 200
varieties of cheese couldn't be linguistically uniform either, so
having solid evidence thereof came as rather a relief to me.
                                           --Alianora Munro

From: Madhu
Subject: Re: first Lisp program: not sure what to do
Date: Sun, 06 Jan 2008 06:38:25 +0000
Message-ID: <m3r6gvjw66.fsf@robolove.meer.net>

* christaylor Wrote on Sat, 5 Jan 2008 00:12:02 -0800 (PST):

| and I can see where maybe I could use LOOP to iterate through the
| elements of the list, but I just can't think of the best way to read
| in and keep track of two different hash tables, one from the first
| series of text files, and the other from the second series, where it
| has the movie title and just the director. Should I use a hash table
| of hash tables? Or a-lists and use things like the ASSOC function? I'd
| like to make it as simple as possible,

Beyond simple use of lists for hw, look at the man page for the comm(1)
command on UNIX.  `comm' provides a way to think about how to compare 2
collections.  I've been using this in a change detection library I've
been working on, but am wondering how hard it is for others to wrap
their heads around this API:

(COMPARE-TRIES TRIE1 TRIE2 &KEY COMM-12 COMM-23 COMM-13 TEST)

Call COMM-23 with arguments (KEY VAL1) on elements unique to TRIE1.
Call COMM-13 with arguments (KEY VAL2) on elements unique to TRIE2.
Call COMM-12 with arguments (KEY VAL1 VAL2) on elements common to TRIE1
and TRIE2.

TEST defaults to EQUAL and is used for comparing keys

COMPARE-TRIES would work on trees, hashtables, plists.  Again, I'm just
curious if anyone else finds this a reasonable approach to solve your
problem
--
Madhu

| and I don't care about
| performance- these files will be very small and the lists will be
| short. I like the idea of using something very basic like the list
| manipulating functions.

| Am I missing a really obvious feature of Lisp that would seem to lend
| itself to keeping track of and indexing between two trees of lists, so
| to speak?

--
=======================================================================
Open system or closed system, enlightenment or ideology, those are the
questions.	         	    "John C. Mallery" <····@ai.mit.edu>

	    ·······@meer.net http://www.meer.net/~enometh/
	       803 Clayton St #2 San Francisco CA 94117
		    H:415-242-3375 W:408-343-6255

From: Madhu
Subject: Re: first Lisp program: not sure what to do
Date: Sun, 06 Jan 2008 06:53:02 +0000
Message-ID: <m3k5mnjvht.fsf@robolove.meer.net>

* christaylor Wrote on Sat, 5 Jan 2008 00:12:02 -0800 (PST):

| and I can see where maybe I could use LOOP to iterate through the
| elements of the list, but I just can't think of the best way to read
| in and keep track of two different hash tables, one from the first
| series of text files, and the other from the second series, where it
| has the movie title and just the director. Should I use a hash table
| of hash tables? Or a-lists and use things like the ASSOC function? I'd
| like to make it as simple as possible,

Beyond simple use of lists for hw, look at the man page for the comm(1)
command on UNIX.  `comm' provides a way to think about how to compare 2
collections.  I've been using this in a change detection library I've
been working on, but am wondering how hard it is for others to wrap
their heads around this API:

(COMPARE-TRIES TRIE1 TRIE2 &KEY COMM-12 COMM-23 COMM-13 TEST)

Call COMM-23 with arguments (KEY VAL1) on elements unique to TRIE1.
Call COMM-13 with arguments (KEY VAL2) on elements unique to TRIE2.
Call COMM-12 with arguments (KEY VAL1 VAL2) on elements common to TRIE1
and TRIE2.

TEST defaults to EQUAL and is used for comparing keys

COMPARE-TRIES would work on trees, hashtables, plists.  Again, I'm just
curious if anyone else finds this a reasonable approach to solve your
problem
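
To make the shape of that API concrete, here is a minimal sketch of the same idea restricted to hash tables.  It is not the library code: COMPARE-TABLES is a hypothetical name, there is no TEST argument because each hash table already carries its own key test, and both tables are assumed to have been created with a compatible test such as EQUAL.

```lisp
(defun compare-tables (table1 table2 &key comm-12 comm-23 comm-13)
  "Call COMM-23 with (KEY VAL1) for entries only in TABLE1, COMM-13
with (KEY VAL2) for entries only in TABLE2, and COMM-12 with
(KEY VAL1 VAL2) for keys present in both.  Each callback is optional."
  (maphash (lambda (key val1)
             (multiple-value-bind (val2 present-p) (gethash key table2)
               (if present-p
                   (when comm-12 (funcall comm-12 key val1 val2))
                   (when comm-23 (funcall comm-23 key val1)))))
           table1)
  ;; Second pass: whatever TABLE1 does not have is unique to TABLE2.
  (maphash (lambda (key val2)
             (unless (nth-value 1 (gethash key table1))
               (when comm-13 (funcall comm-13 key val2))))
           table2)
  (values))
```

For the films problem, TABLE1 could map titles to the star/setting/genre data and TABLE2 could map titles to directors; COMM-12 would then print the merged entry, and the other two callbacks would report films missing from one series.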

--
Madhu