Should I use LISP for this?

From: Star
Subject: Should I use LISP for this?
Date: Tue, 02 Mar 2004 02:54:35 +0000
Message-ID: <e0a0c75d.0403011854.7b8419c8@posting.google.com>

Hi

I have a program that handle very big structures of graphs. I was
using C++ language a SQL Server to handle that, but the perfomance
that I have is very slow. I have been suggested to use LISP to handle
this, but I haven't used LISP in many years, and before starting
working on this, I would like to know your opinion. Here is the
problem:

I have about 15 tables. Each table has several fields, however each
table
has a common field called 'Code'

ex.

Table Vehicles
--------------
Code
Make
Color

Table People
-------------
Code
Name
Age

Table Addresses
-----------------
Code
City
ZIP
...

Now we want to relate records from one table to another.  For example,
we
want to say 'John has two cars'
In order to do that, I have an external table called Links with source
an
destination fields (I store there the code of each table). For the
example we would have

Source    Dest
----------------
P132      V100
P132      V242

(P132 is a record from the People table with code 132, V100 a vehicle
from
the Vehicle table with code 100,  and V242 a vehicle
with code 242)

So, with all that, we know that John (P132) is related with two
vehicles. If
I want to run the query 'People associated with vehicles'
it would be easy to query that table and get that information.

Now, let's suppose we have this situation:

Source    Dest
----------------
P132      V100
P132      V242
P140      V500
P132      P140

In this example, John is related with Michael (P140) and Michael is
related
with a vehicle (V500).
If I want to run the same query than before 'People associated with
vehicles', because of P132 is related with P140 and P140 is related
with
a V500 this implies that P132 is related with V500.

Now, the query is not so simple, because it's like a graph and I need
to
make a Depth First Search to get what I want. Doing that, I will get
that
P132 is
related with V500 because there is a 'path' to get there.

I have done all this and it works fine. The problem that I have is
that I
need to make that Depth First Search for each people that I have in
the
database
in order to see if I have vehicles associated with it. If I don't have
many
records it's ok, but If I have 100,000 people on the database I would
have
to
make that search 100,000 times just to see if there is a path between
them,
I mean, if there is a way to get from a person to a vehicle.
In this case... this solution doesn't work, because it takes
forever...
I would need to make any kind of query on this structure.. ex.
vehicles associated with vehicles, persons with vehicles, addresses
with persons..

What do you guys suggest I can try? I though that one solution would
be to
have a process running on the server that maintains a table with ALL
the
links that we have in the database and the depth of the link. So, in
our
example we'd have a entry "P132 V500 with depth 2"
Having that table it would be easy to query. Inconvenience: the table
is
going to be HUGE, and also the process has to be smart enough to keep
track
of
deleted links, new links...

I'd appreciate if you guys have any idea or suggestions about this and
how I
can get a better perfomance.

Thanks a lot for reading this whole message!!

Re: Should I use LISP for this? Alan Walker
- Re: Should I use LISP for this? Karl A. Krueger
  - Re: Should I use LISP for this? Alan Walker
- Re: Should I use LISP for this? John Thingstad
  - Re: Should I use LISP for this? Christopher Browne
Re: Should I use LISP for this? Billy O'Connor
Re: Should I use LISP for this? David Steuber
Re: Should I use LISP for this? Ray Fink
Re: Should I use LISP for this? Star

From: Alan Walker
Subject: Re: Should I use LISP for this?
Date: Tue, 02 Mar 2004 03:11:33 +0000
Message-ID: <1047uqarj72am1c@corp.supernews.com>

Seems more like a database problem.  Check out MySQL.  If you want data in
memory, use MySQL Cluster.

The query optimizer will look at your query and determine the best way to
handle it.  You can beat the database using pre-navigated, in-memory
structures, but it will shine for ad-hoc queries.  Also, you can add
secondary indexes on the fly in a database as needs change.  It will also
handle concurrency, updating the structure while people query it.

If you really insit on building in-memory structures, Lisp is a good choice.
C++ will work for simple structures, but will drive you nuts with complex
structures.  GC is also a plus.

Now, using Lisp as a front-end to your database, that's a good idea...

Alan.


"Star" <·······@hotmail.com> wrote in message
·································@posting.google.com...
> Hi
>
> I have a program that handle very big structures of graphs. I was
> using C++ language a SQL Server to handle that, but the perfomance
> that I have is very slow. I have been suggested to use LISP to handle
> this, but I haven't used LISP in many years, and before starting
> working on this, I would like to know your opinion. Here is the
> problem:
>
> I have about 15 tables. Each table has several fields, however each
> table
> has a common field called 'Code'
>
> ex.
>
> Table Vehicles
> --------------
> Code
> Make
> Color
>
> Table People
> -------------
> Code
> Name
> Age
>
> Table Addresses
> -----------------
> Code
> City
> ZIP
> ...
>
> Now we want to relate records from one table to another.  For example,
> we
> want to say 'John has two cars'
> In order to do that, I have an external table called Links with source
> an
> destination fields (I store there the code of each table). For the
> example we would have
>
> Source    Dest
> ----------------
> P132      V100
> P132      V242
>
> (P132 is a record from the People table with code 132, V100 a vehicle
> from
> the Vehicle table with code 100,  and V242 a vehicle
> with code 242)
>
> So, with all that, we know that John (P132) is related with two
> vehicles. If
> I want to run the query 'People associated with vehicles'
> it would be easy to query that table and get that information.
>
> Now, let's suppose we have this situation:
>
> Source    Dest
> ----------------
> P132      V100
> P132      V242
> P140      V500
> P132      P140
>
> In this example, John is related with Michael (P140) and Michael is
> related
> with a vehicle (V500).
> If I want to run the same query than before 'People associated with
> vehicles', because of P132 is related with P140 and P140 is related
> with
> a V500 this implies that P132 is related with V500.
>
> Now, the query is not so simple, because it's like a graph and I need
> to
> make a Depth First Search to get what I want. Doing that, I will get
> that
> P132 is
> related with V500 because there is a 'path' to get there.
>
> I have done all this and it works fine. The problem that I have is
> that I
> need to make that Depth First Search for each people that I have in
> the
> database
> in order to see if I have vehicles associated with it. If I don't have
> many
> records it's ok, but If I have 100,000 people on the database I would
> have
> to
> make that search 100,000 times just to see if there is a path between
> them,
> I mean, if there is a way to get from a person to a vehicle.
> In this case... this solution doesn't work, because it takes
> forever...
> I would need to make any kind of query on this structure.. ex.
> vehicles associated with vehicles, persons with vehicles, addresses
> with persons..
>
> What do you guys suggest I can try? I though that one solution would
> be to
> have a process running on the server that maintains a table with ALL
> the
> links that we have in the database and the depth of the link. So, in
> our
> example we'd have a entry "P132 V500 with depth 2"
> Having that table it would be easy to query. Inconvenience: the table
> is
> going to be HUGE, and also the process has to be smart enough to keep
> track
> of
> deleted links, new links...
>
> I'd appreciate if you guys have any idea or suggestions about this and
> how I
> can get a better perfomance.
>
> Thanks a lot for reading this whole message!!

From: Karl A. Krueger
Subject: Re: Should I use LISP for this?
Date: Tue, 02 Mar 2004 06:16:32 +0000
Message-ID: <c218ru$sk9$1@baldur.whoi.edu>

Alan Walker <·························@charter.net> wrote:
> Seems more like a database problem.  Check out MySQL.  If you want data in
> memory, use MySQL Cluster.

MySQL is actually a pretty bad example of an SQL DBMS.  It does not
implement many of the "hard" parts of the SQL language, like triggers,
views, and stored procedures -- that is, some of the features that allow
the DBMS to do much of your work for you.

For a very rough analogy:  That's kind of like a Lisp with no macros,
and no way of implementing them.

Over time, the MySQL people -have- implemented a lot of those "hard"
features, such as transactions and relational integrity checking, albeit
somewhat more roughly than other open-source SQL DBMSes.  But oddly
enough, before they did so, they were telling their own users things
like "integrity checking is for lousy programmers" and "anything you can
do with transactions you can do in the application."  And those things
just are not true.

To continue the rough analogy:  A "relational" DBMS without relational
integrity checking is like a Lisp without, say, type checking -- one
which will tell you (CAR 'FOO) is 42 via an invalid memory access.  One
without transactions is like a Lisp without UNWIND-PROTECT or any
equivalent.

There are a number of other open-source SQL DBMSes, any of which are
more complete than MySQL.  PostgreSQL is the best-known, and CL-SQL
supports it.  There is also Firebird -- as well as MaxDB, which is also
these days put out by the MySQL company.  Unlike MySQL itself, MaxDB is
a regular SQL DBMS which used to be proprietary (as SapDB).  (Firebird's
origins are also proprietary; PostgreSQL was originally a research
project.)

MySQL is also missing a lot of useful _little_ features that many are
surprised to find aren't there.  It's often considered a useful DBMS for
"Internet-facing" applications ... but it does not have standard data
types for IP addresses or CIDR netblocks, and as a result cannot do real
comparisons on IP addresses.

As a result, many application programmers have had to implement their
own kludgy versions of these, based on long integers or strings, and
reimplementing IP address math in their applications.  My partner has
been working to deploy (not write) a network management application with
a MySQL back-end and has encountered many of these kludges in the code
base.

Meanwhile, I'm using PostgreSQL for a different application of my own
design, and can do things like ask the database questions about IP
addresses:

	SELECT * FROM hosts WHERE ip_addr << '10.128.128.0/25' ;

... where << is the set-member predicate.  (PostgreSQL implicitly casts
the string '10.128.128.0/25' to type 'cidr'.)

For more information on MySQL (and PostgreSQL) odd behaviors:

MySQL Gotchas:
	http://sql-info.de/mysql/gotchas.html

PostgreSQL Gotchas:
	http://sql-info.de/postgresql/postgres-gotchas.html

"Weird stuff in MySQL":
	http://vulcanus.its.tudelft.nl/~acm/got/antimysql.php

-- 
Karl A. Krueger <········@example.edu>
Woods Hole Oceanographic Institution
Email address is spamtrapped.  s/example/whoi/
"Outlook not so good." -- Magic 8-Ball Software Reviews

From: Alan Walker
Subject: Re: Should I use LISP for this?
Date: Tue, 02 Mar 2004 23:41:26 +0000
Message-ID: <104a6saqrge7a6d@corp.supernews.com>

Karl,

We benchmarked one of our large applications on multiple databases and MySQL
matched Oracle, beat TimesTen and blew away PostgreSQL, it wasn't even in
the race.  MySQL does have transactions, we put a few million through it
every day.  Some of the other features you mention are missing.  We didn't
need all the advanced features.  Someone that's trying to do the whole think
in memory in C++ is probably looking for performance and trying to avoid
buying software.  Maybe I'm making too many assumptions here.

That said, my core message was "consider a database".  It's a trade off
between features, performance and the money one has available, you only get
to pick two.

Alan.


"Karl A. Krueger" <········@example.edu> wrote in message
·················@baldur.whoi.edu...
> Alan Walker <·························@charter.net> wrote:
> > Seems more like a database problem.  Check out MySQL.  If you want data
in
> > memory, use MySQL Cluster.
>
>
> MySQL is actually a pretty bad example of an SQL DBMS.  It does not
> implement many of the "hard" parts of the SQL language, like triggers,
> views, and stored procedures -- that is, some of the features that allow
> the DBMS to do much of your work for you.
>
> For a very rough analogy:  That's kind of like a Lisp with no macros,
> and no way of implementing them.
>
>
> Over time, the MySQL people -have- implemented a lot of those "hard"
> features, such as transactions and relational integrity checking, albeit
> somewhat more roughly than other open-source SQL DBMSes.  But oddly
> enough, before they did so, they were telling their own users things
> like "integrity checking is for lousy programmers" and "anything you can
> do with transactions you can do in the application."  And those things
> just are not true.
>
> To continue the rough analogy:  A "relational" DBMS without relational
> integrity checking is like a Lisp without, say, type checking -- one
> which will tell you (CAR 'FOO) is 42 via an invalid memory access.  One
> without transactions is like a Lisp without UNWIND-PROTECT or any
> equivalent.
>
>
> There are a number of other open-source SQL DBMSes, any of which are
> more complete than MySQL.  PostgreSQL is the best-known, and CL-SQL
> supports it.  There is also Firebird -- as well as MaxDB, which is also
> these days put out by the MySQL company.  Unlike MySQL itself, MaxDB is
> a regular SQL DBMS which used to be proprietary (as SapDB).  (Firebird's
> origins are also proprietary; PostgreSQL was originally a research
> project.)
>
>
> MySQL is also missing a lot of useful _little_ features that many are
> surprised to find aren't there.  It's often considered a useful DBMS for
> "Internet-facing" applications ... but it does not have standard data
> types for IP addresses or CIDR netblocks, and as a result cannot do real
> comparisons on IP addresses.
>
> As a result, many application programmers have had to implement their
> own kludgy versions of these, based on long integers or strings, and
> reimplementing IP address math in their applications.  My partner has
> been working to deploy (not write) a network management application with
> a MySQL back-end and has encountered many of these kludges in the code
> base.
>
> Meanwhile, I'm using PostgreSQL for a different application of my own
> design, and can do things like ask the database questions about IP
> addresses:
>
> SELECT * FROM hosts WHERE ip_addr << '10.128.128.0/25' ;
>
> ... where << is the set-member predicate.  (PostgreSQL implicitly casts
> the string '10.128.128.0/25' to type 'cidr'.)
>
>
> For more information on MySQL (and PostgreSQL) odd behaviors:
>
> MySQL Gotchas:
> http://sql-info.de/mysql/gotchas.html
>
> PostgreSQL Gotchas:
> http://sql-info.de/postgresql/postgres-gotchas.html
>
> "Weird stuff in MySQL":
> http://vulcanus.its.tudelft.nl/~acm/got/antimysql.php
>
> --
> Karl A. Krueger <········@example.edu>
> Woods Hole Oceanographic Institution
> Email address is spamtrapped.  s/example/whoi/
> "Outlook not so good." -- Magic 8-Ball Software Reviews

From: John Thingstad
Subject: Re: Should I use LISP for this?
Date: Wed, 03 Mar 2004 11:57:23 +0000
Message-ID: <opr4abhxwmxfnb1n@news.chello.no>

On Mon, 1 Mar 2004 21:11:33 -0600, Alan Walker 
<·························@charter.net> wrote:

MySQL is fast and adequate for his problem.
It uses ISAM (IBM) tables directly.
In it's fastest implementation (it has two) it sacrefises recoverabillity 
for speed.
Nevertheless it should be up to the job and many times faster than, say, 
Oracle.

> Seems more like a database problem.  Check out MySQL.  If you want data 
> in
> memory, use MySQL Cluster.
>
> The query optimizer will look at your query and determine the best way to
> handle it.  You can beat the database using pre-navigated, in-memory
> structures, but it will shine for ad-hoc queries.  Also, you can add
> secondary indexes on the fly in a database as needs change.  It will also
> handle concurrency, updating the structure while people query it.
>
> If you really insit on building in-memory structures, Lisp is a good 
> choice.
> C++ will work for simple structures, but will drive you nuts with complex
> structures.  GC is also a plus.
>
> Now, using Lisp as a front-end to your database, that's a good idea...
>
> Alan.
>
>
> "Star" <·······@hotmail.com> wrote in message
> ·································@posting.google.com...
>> Hi
>>
>> I have a program that handle very big structures of graphs. I was
>> using C++ language a SQL Server to handle that, but the perfomance
>> that I have is very slow. I have been suggested to use LISP to handle
>> this, but I haven't used LISP in many years, and before starting
>> working on this, I would like to know your opinion. Here is the
>> problem:
>>
>> I have about 15 tables. Each table has several fields, however each
>> table
>> has a common field called 'Code'
>>
>> ex.
>>
>> Table Vehicles
>> --------------
>> Code
>> Make
>> Color
>>
>> Table People
>> -------------
>> Code
>> Name
>> Age
>>
>> Table Addresses
>> -----------------
>> Code
>> City
>> ZIP
>> ...
>>
>> Now we want to relate records from one table to another.  For example,
>> we
>> want to say 'John has two cars'
>> In order to do that, I have an external table called Links with source
>> an
>> destination fields (I store there the code of each table). For the
>> example we would have
>>
>> Source    Dest
>> ----------------
>> P132      V100
>> P132      V242
>>
>> (P132 is a record from the People table with code 132, V100 a vehicle
>> from
>> the Vehicle table with code 100,  and V242 a vehicle
>> with code 242)
>>
>> So, with all that, we know that John (P132) is related with two
>> vehicles. If
>> I want to run the query 'People associated with vehicles'
>> it would be easy to query that table and get that information.
>>
>> Now, let's suppose we have this situation:
>>
>> Source    Dest
>> ----------------
>> P132      V100
>> P132      V242
>> P140      V500
>> P132      P140
>>
>> In this example, John is related with Michael (P140) and Michael is
>> related
>> with a vehicle (V500).
>> If I want to run the same query than before 'People associated with
>> vehicles', because of P132 is related with P140 and P140 is related
>> with
>> a V500 this implies that P132 is related with V500.
>>
>> Now, the query is not so simple, because it's like a graph and I need
>> to
>> make a Depth First Search to get what I want. Doing that, I will get
>> that
>> P132 is
>> related with V500 because there is a 'path' to get there.
>>
>> I have done all this and it works fine. The problem that I have is
>> that I
>> need to make that Depth First Search for each people that I have in
>> the
>> database
>> in order to see if I have vehicles associated with it. If I don't have
>> many
>> records it's ok, but If I have 100,000 people on the database I would
>> have
>> to
>> make that search 100,000 times just to see if there is a path between
>> them,
>> I mean, if there is a way to get from a person to a vehicle.
>> In this case... this solution doesn't work, because it takes
>> forever...
>> I would need to make any kind of query on this structure.. ex.
>> vehicles associated with vehicles, persons with vehicles, addresses
>> with persons..
>>
>> What do you guys suggest I can try? I though that one solution would
>> be to
>> have a process running on the server that maintains a table with ALL
>> the
>> links that we have in the database and the depth of the link. So, in
>> our
>> example we'd have a entry "P132 V500 with depth 2"
>> Having that table it would be easy to query. Inconvenience: the table
>> is
>> going to be HUGE, and also the process has to be smart enough to keep
>> track
>> of
>> deleted links, new links...
>>
>> I'd appreciate if you guys have any idea or suggestions about this and
>> how I
>> can get a better perfomance.
>>
>> Thanks a lot for reading this whole message!!
>
>



-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/

From: Christopher Browne
Subject: Re: Should I use LISP for this?
Date: Wed, 03 Mar 2004 13:20:00 +0000
Message-ID: <c24m20$1p59f0$1@ID-125932.news.uni-berlin.de>

Martha Stewart called it a Good Thing when John Thingstad <··············@chello.no> wrote:
> On Mon, 1 Mar 2004 21:11:33 -0600, Alan Walker
> <·························@charter.net> wrote:
>
> MySQL is fast and adequate for his problem.
> It uses ISAM (IBM) tables directly.
> In it's fastest implementation (it has two) it sacrefises
> recoverabillity for speed.
> Nevertheless it should be up to the job and many times faster than,
> say, Oracle.

It's only likely to be faster than anything else if there is only to
be a very few "update" processes, and if it is mostly going to get hit
with read-only queries.

It tends to crumble into deadlock if you have a lot of users doing
concurrent updates.

Furthermore, consider licensing carefully.  According to the vendor,
if you are not licensing your application code under the GPL, then you
are expected to pay for their "commercial" licenses.
-- 
"cbbrowne",·@","cbbrowne.com"
http://cbbrowne.com/info/rdbms.html
Digital Electronic Being Intended for Assassination and Nullification

From: Billy O'Connor
Subject: Re: Should I use LISP for this?
Date: Tue, 02 Mar 2004 02:58:36 +0000
Message-ID: <87znb0q7h6.fsf@dps11.gnuyork.org>

·······@hotmail.com (Star) writes:

This is a database schema problem, I'm snipping most of it.  I
suggest you try one of the SQL newsgroups.

> In order to do that, I have an external table called Links with source

Unless you have a many-to-many relationship, which is unlikely with
car ownership, this extra table is not needed.

> In this example, John is related with Michael (P140) and Michael is
> related
Then you would want a simple link table of many people being related
to many others.

From: David Steuber
Subject: Re: Should I use LISP for this?
Date: Tue, 02 Mar 2004 06:31:56 +0000
Message-ID: <m2ad2zep42.fsf@david-steuber.com>

·······@hotmail.com (Star) writes:

> Now, the query is not so simple, because it's like a graph and I
> need to make a Depth First Search to get what I want. Doing that, I
> will get that P132 is related with V500 because there is a 'path' to
> get there.

Are you sure that a Breadth First Search wouldn't be better?  That
should find the shortest path in a graph structure.

Then there is what others have said about your data...

-- 
Those who do not remember the history of Lisp are doomed to repeat it,
badly.

From: Ray Fink
Subject: Re: Should I use LISP for this?
Date: Tue, 02 Mar 2004 16:40:41 +0000
Message-ID: <uy8qjdwxi.fsf@cableone.THIS_PART_IS_BOGUS.net>

You're doing what's called a "transitive closure" on a relation, and yes it 
gets computationally messy.  

http://www.nist.gov/dads/HTML/transitiveClosure.html has some links that might
be useful.

From: Star
Subject: Re: Should I use LISP for this?
Date: Tue, 02 Mar 2004 20:39:34 +0000
Message-ID: <e0a0c75d.0403021239.c5ce2bc@posting.google.com>

Thanks a lot for you help. I think it's worth to take a look at LISP
for this problem.