From: Alain Picard
Subject: Fast (and I mean fast) DB access from lisp
Date: 
Message-ID: <86u1ldewr0.fsf@gondolin.local.net>
Dear Lispers,

I'm tuning an application which needs real database speed.
Currently, I'm using PostgreSQL (and Oracle, SQL Server)
with an ODBC driver using Lispworks.

The best I seem to be able to achieve with this setup
is 10 "visitors" per second, which translates roughly into
70 db transactions per second.  You'll have to take my word
for it that all the tables/indices are pretty much tuned to
the max (and it's a pretty simple schema, all told).

Recently I decided to see how much I was paying for all the
ODBC + SQL layers, and rewrote part of the app (well, enough
to do realistic benchmark testing) using Berkeley DB, an embedded
DB linked directly into the application, using Paul Foley's excellent
interface code.  Using that, I was able to get a whopping 125 
visitors per second.  Big win!

Unfortunately, Berkeley DB is very expensive to license if you are
unwilling to open source your app, so I will not be able to use it.

The question, finally, is: do I have any reasonable alternatives?
Does anyone else deal with applications where DB speed is essential?
How did you resolve it?

At the moment, I'm toying with the idea of writing glue code to
Faircom's C-tree Plus database, but have no guarantees that this
is a good idea and/or will have the same level of high performance
that I'm looking for.

I eagerly await comments, suggestions, and war stories.  :-)

From: Pierre R. Mai
Subject: Re: Fast (and I mean fast) DB access from lisp
Date: 
Message-ID: <m2znv598pu.fsf@ibook.bln.pmsf.net>
Alain Picard <·······················@optushome.com.au> writes:

> I'm tuning an application which needs real database speed.
> Currently, I'm using PostgreSQL (and Oracle, SQL Server)
> with an ODBC driver using Lispworks.
>
> The best I seem to be able to achieve with this setup
> is 10 "visitors" per second, which translates roughly into
> 70 db transactions per second.  You'll have to take my word
> for it that all the tables/indices are pretty much tuned to
> the max (and it's a pretty simple schema, all told).
>
> Recently I decided to see how much I was paying for all the
> ODBC + SQL layers, and rewrote part of the app (well, enough
> to do realistic benchmark testing) using Berkeley DB, an embedded
> DB linked directly into the application, using Paul Foley's excellent
> interface code.  Using that, I was able to get a whopping 125 
> visitors per second.  Big win!

I would first try a non-ODBC (i.e. a native) interface to an SQL DB.
It is quite likely that the whole ODBC driver chain may be your
biggest bottleneck.  Last time I looked ODBC performance on Unices was
abysmal, and even on Windows the performance was quite variable,
especially with the free ODBC drivers.

If this turns out to improve the situation, you might be able to
switch to CLSQL, possibly in conjunction with UncommonSQL, which
should give you access to the native PostgreSQL client libs.

> The question, finally, is: do I have any reasonable alternatives?
> Does anyone else deal with applications where DB speed is essential?
> How did you resolve it?

That rather depends on what parts of the functionality of an RDBMS you
are really using.  If you need proper ACID transactions,
online-backup, etc., then I'd be highly surprised if you could beat a
properly tuned professional RDBMS system.  However if you can make do
without a number of the features a proper RDBMS gives you, for at
least parts of your database, you can usually achieve high performance
gains.  E.g. if there is a large part of your database that is slow-
or even non-changing, and/or which isn't concurrently accessed for
writes, you really don't need all the transaction mechanisms, and
don't in fact need to keep most of it on slow magnetic media,
especially if at least the most heavily accessed data fits into RAM.

You can also implement your own specially tuned transactional
mechanisms if you don't need all of ACID, e.g. if you can relax
isolation, or handle it efficiently in some application-specific
manner.  Same for atomicity.
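
To sketch the sort of thing I mean (purely illustrative; every name
here is invented): a single-writer redo log in front of an in-memory
table already buys you crash recovery, without the rest of the ACID
machinery.

```lisp
;; Hedged sketch of a "roll your own durability" scheme: one writer,
;; an in-memory table, and an append-only redo log.  No isolation, no
;; rollback -- just enough to rebuild state after a crash.
(defvar *table* (make-hash-table :test 'equal))

(defun log-put (log-stream key value)
  "Record the write durably BEFORE applying it in memory."
  (print (list :put key value) log-stream)
  (force-output log-stream)               ; push it out to the OS now
  (setf (gethash key *table*) value))

(defun replay-log (pathname)
  "Rebuild *TABLE* from the redo log after a crash."
  (with-open-file (in pathname :if-does-not-exist nil)
    (when in
      (loop for entry = (read in nil nil)
            while entry
            do (destructuring-bind (op key value) entry
                 (when (eq op :put)
                   (setf (gethash key *table*) value)))))))
```

You lose the tail of the log on power failure, but the store is never
left irretrievably corrupted: replaying whatever made it to disk gives
a consistent prefix of history.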

That said, I still think you should give RDBMS a second try.

Regs, Pierre.

PS:

> Unfortunately, Berkeley DB is very expensive to license if you are
> unwilling to open source your app, so I will not be able to use it.

If that is really the problem, you might be able to make do with the
last BSD-licensed version of Berkeley DB, IIRC 1.77, or something like
this.  However I'm not sure if this is a wise idea.

-- 
Pierre R. Mai <····@acm.org>                    http://www.pmsf.de/pmai/
 The most likely way for the world to be destroyed, most experts agree,
 is by accident. That's where we come in; we're computer professionals.
 We cause accidents.                           -- Nathaniel Borenstein
From: Alain Picard
Subject: Re: Fast (and I mean fast) DB access from lisp
Date: 
Message-ID: <86wuq5xdop.fsf@gondolin.local.net>
Pierre R. Mai <····@acm.org> writes:

> Alain Picard <·······················@optushome.com.au> writes:
> > I'm tuning an application which needs real database speed.
> > Currently, I'm using PostgreSQL (and Oracle, SQL Server)
> > with an ODBC driver using Lispworks.

> I would first try a non-ODBC (i.e. a native) interface to an SQL DB.
> It is quite likely that the whole ODBC driver chain may be your
> biggest bottleneck.  Last time I looked ODBC performance on Unices was
> abysmal, and even on Windows the performance was quite variable,
> especially with the free ODBC drivers.

Hum.  The only data-point I have to offer there is that Lispworks'
interface to Oracle is native (i.e. they link with liboracleclient.so)
and it doesn't seem any faster.  

My current hypothesis is that it's the to/fro of the commands and
result sets on the sockets, and the planning of the queries, which end
up being the bottleneck.

However, your statement prompted me to look into the libpq API very
closely, and there I discovered something I really wanted: a way to
write my queries in such a way as to never block the Lisp scheduler.
[I have some queries which can take an hour to execute].  So this may
be a win for entirely different reasons.

> That rather depends on what parts of the functionality of an RDBMS you
> are really using.  If you need proper ACID transactions,

I'd be happy to forego transactions altogether; I have two processes,
one which is allowed to do writes, and one which only does reads.

I'm happily willing to lose a whole bunch of inserts if someone yanks
the power line, AS LONG as the DB is not irretrievably corrupted.

I also need to do hot backups.


> online-backup, etc., then I'd be highly surprised if you could beat a
> properly tuned professional RDBMS system.

I guess you're probably right.


> If that is really the problem, you might be able to make do with the
> last BSD-licensed version of Berkeley DB, IIRC 1.77, or something like
> this.  However I'm not sure if this is a wise idea.

Indeed not, I think they still had limitations like 4 billion entries
maximum, which I'm likely to exceed.
From: Ng Pheng Siong
Subject: Re: Fast (and I mean fast) DB access from lisp
Date: 
Message-ID: <akldft$8mt$1@mawar.singnet.com.sg>
According to Alain Picard  <·······················@optushome.com.au>:
> At the moment, I'm toying with the idea of writing glue code to
> Faircom's C-tree Plus database, but have no guarantees that this
> is a good idea and/or will have the same level of high performance
> that I'm looking for.

Look at SQLite, http://www.hwaci.com/sw/sqlite, a public domain "embeddable
SQL database engine". 

Claims to be four times faster than PostgreSQL. 

The core API consists of three functions; the application collects database
output using a callback. (Which appears to rule out CMUCL, alas.)


-- 
Ng Pheng Siong <····@netmemetic.com> * http://www.netmemetic.com
From: Edi Weitz
Subject: Re: Fast (and I mean fast) DB access from lisp
Date: 
Message-ID: <87d6s1br7q.fsf@dyn164.dbdmedia.de>
····@vista.netmemetic.com (Ng Pheng Siong) writes:

> According to Alain Picard  <·······················@optushome.com.au>:
> > At the moment, I'm toying with the idea of writing glue code to
> > Faircom's C-tree Plus database, but have no guarantees that this
> > is a good idea and/or will have the same level of high performance
> > that I'm looking for.
> 
> Look at SQLite, http://www.hwaci.com/sw/sqlite, a public domain
> "embeddable SQL database engine".
> 
> Claims to be four times faster than PostgreSQL. 
> 
> The core API consists of three functions; the application collects
> database output using a callback. (Which appears to rule out CMUCL,
> alas.)

You can use the wrapper sqlite_get_table() instead of sqlite_exec()
and thus avoid the callbacks.

Edi.
From: Ng Pheng Siong
Subject: Re: Fast (and I mean fast) DB access from lisp
Date: 
Message-ID: <ako411$gc1$2@mawar.singnet.com.sg>
According to Edi Weitz  <···@agharta.de>:
> ····@vista.netmemetic.com (Ng Pheng Siong) writes:
> > The core API consists of three functions; the application collects
> > database output using a callback. (Which appears to rule out CMUCL,
> > alas.)
> 
> You can use the wrapper sqlite_get_table() instead of sqlite_exec()
> and thus avoid the callbacks.

You're right. I saw that immediately after I posted. 

Hmmm, USQL with SQLite... 


-- 
Ng Pheng Siong <····@netmemetic.com> * http://www.netmemetic.com
From: Wade Humeniuk
Subject: Re: Fast (and I mean fast) DB access from lisp
Date: 
Message-ID: <j5sb9.20153$rn4.982829@news1.telusplanet.net>
"Alain Picard" <·······················@optushome.com.au> wrote in message
···················@gondolin.local.net...
> The question, finally, is: do I have any reasonable alternatives?
> Does anyone else deal with applications where DB speed is essential?
> How did you resolve it?

If speed is the essential factor you can do away with having an ODBC/SQL database
altogether.  Just have your persistent data always available in your Lisp image, stored in
hash tables or any other data structure you think appropriate.  Just write the data access
routines based on internal storage schemas and tune the hell out of it.  The interprocess
communication between your app and the db is probably the main bottleneck and the simplest
solution is to get rid of it.  Instead of a few hundred transactions per second, think
many thousands.  If the database is not too large add lots of memory.

Of course there are issues of integrity and backing up your persistent data, but there are
a number of alternatives.
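
A toy sketch of the idea (the struct and accessor names are invented,
and make no claims about your actual schema):

```lisp
;; All "persistent" data lives in the image; a lookup is just GETHASH,
;; i.e. nanoseconds, instead of interprocess round trips to a DB server.
(defstruct visitor id name visits)

(defvar *visitors* (make-hash-table :test 'equal)
  "In-image store, keyed on visitor id.")

(defun store-visitor (v)
  (setf (gethash (visitor-id v) *visitors*) v))

(defun find-visitor (id)
  (gethash id *visitors*))
```

Backup then becomes "periodically snapshot the tables to disk", which
you can schedule however you like.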

Wade
From: Mark Dalgarno
Subject: Re: Fast (and I mean fast) DB access from lisp
Date: 
Message-ID: <usn0xoegf.fsf@scientia.com>
Alain Picard <·······················@optushome.com.au> writes:

> The question, finally, is: do I have any reasonable alternatives?
> Does anyone else deal with applications where DB speed is essential?
> How did you resolve it?

This may not be a reasonable alternative in your case but we have a
number of scheduling applications where communication is achieved
through a shared RDBMS.

A typical configuration involves around 10 client PCs connected to a
shared RDBMS (although we have installations with 50+ users
collaboratively scheduling).

Each client PC contains a view of the entire schedule which a user may
modify offline and later commit to the shared RDBMS. We 'solve' the
RDBMS speed problem by only publishing changes to the RDBMS and from
there only publishing changes to each client that requests them. We use
SQL + ODBC in recent versions of this software as we have to support a
wide variety of RDBMSs.

Lisp image sizes get up to around 100Mb which may be pretty small in
comparison to the size of data set you have to deal with so this may
not be an option. We also have to add some extra fields to each
database tuple to manage the change history.

Mark
From: Fernando Rodríguez
Subject: Re: Fast (and I mean fast) DB access from lisp
Date: 
Message-ID: <mfmsmugfkms4t1d8vd4iq1010396qg9jvl@4ax.com>
On 29 Aug 2002 21:21:23 +1000, Alain Picard
<·······················@optushome.com.au> wrote:


>The question, finally, is: do I have any reasonable alternatives?
>Does anyone else deal with applications where DB speed is essential?
>How did you resolve it?

Give up the ODBC approach and use some native API to your db.  I'd suggest
you take a look at MySQL and Valentina (http://www.paradigmasoft.com/). The
former even has a Lisp interface (although I've never used it).

-----------------------
Fernando Rodriguez
From: Christopher Browne
Subject: Re: Fast (and I mean fast) DB access from lisp
Date: 
Message-ID: <akn1be$1jrke7$2@ID-125932.news.dfncis.de>
In an attempt to throw the authorities off his trail, Alain Picard
<·······················@optushome.com.au> transmitted:
> I'm tuning an application which needs real database speed.
> Currently, I'm using PostgreSQL (and Oracle, SQL Server)
> with an ODBC driver using Lispworks.
>
> The best I seem to be able to achieve with this setup is 10
> "visitors" per second, which translates roughly into 70 db
> transactions per second.  You'll have to take my word for it that
> all the tables/indices are pretty much tuned to the max (and it's a
> pretty simple schema, all told).
>
> Recently I decided to see how much I was paying for all the ODBC +
> SQL layers, and rewrote part of the app (well, enough to do
> realistic benchmark testing) using Berkeley DB, and embedded DB
> linked directly into the application, using Paul Foley's excellent
> interface code.  Using that, I was able to get a whopping 125
> visitors per second.  Big win!
>
> Unfortunately, Berkeley DB is very expensive to license if you are
> unwilling to open source your app, so I will not be able to use it.

Look more carefully at that issue; it is _possible_ that the fact that
Paul has "open sourced" his _interface_ may be sufficient to allow the
_interface_ to be used.

I thought the same as you, that it would be necessary to specially
license Berkeley DB in order to use it with applications; the fact
that the ever-paranoid Debian folks _don't_ put it in the "non-free"
section should give reason to pause and wonder.

I made a comment in a Python newsgroup that I was surprised that
Berkeley DB was in such wide use, as it seemed pretty proprietarily
licensed, and was rebuffed with the argument that it was sufficient
that the _Python library_ for interfacing with Berkeley DB was
released as free software.

Obviously, I Am Not A Lawyer; the fact that there is this diversity of
opinion suggests that it's worth looking more closely at the issue...

> The question, finally, is: do I have any reasonable alternatives?
> Does anyone else deal with applications where DB speed is essential?
> How did you resolve it?
>
> At the moment, I'm toying with the idea of writing glue code to
> Faircom's C-tree Plus database, but have no guarantees that this
> is a good idea and/or will have the same level of high performance
> that I'm looking for.
>
> I eagerly await comments, suggestions, and war stories.  :-)

I'd expect that this could be _significantly_ influenced by caching
the data that you wind up reading over and over again.

One bit of code I wrote today was to get zsh to search my PostgreSQL
environment to see what databases are present (via the "psql -l"
command).  

- In theory, I might set it up to run "psql -l" every time I am trying
  to do command completion.

- On the other hand, I don't add databases terribly often.  It makes
  more sense to run it once, when I start zsh, and read cached values.

There is almost certainly a lot of data that will be static and that
may safely be loaded into memory, and whether it's accessed via
ASSOC, AREF, or GETHASH, any of those options is liable to be a
LOT faster than GET-FROM-DATABASE.

If you make access to _some_ of the data fast, that may be sufficient
to make the application fast.

And it may be sufficient to do updates in memory, and have the
"updating the DBMS" part take place in a separate thread.  

If the DBMS gets updated a minute later, perhaps even with some
reorderings of updates to improve efficiency, might that suffice?
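
A hypothetical sketch of that combination (the fetch and flush
functions are imaginary stand-ins for the real, slow DB calls):

```lisp
;; Read-through cache plus a write-behind queue: reads hit memory,
;; writes are applied in memory immediately and drained to the DBMS
;; later (e.g. from a separate thread or a timer).
(defvar *cache* (make-hash-table :test 'equal))
(defvar *pending-writes* '())

(defun cached-get (key fetch-fn)
  "Return the cached value, falling back to FETCH-FN (the slow DB
read) on a miss and caching the result."
  (multiple-value-bind (value found) (gethash key *cache*)
    (if found
        value
        (setf (gethash key *cache*) (funcall fetch-fn key)))))

(defun deferred-put (key value)
  "Update the cache now; queue the DB write for a later flush."
  (setf (gethash key *cache*) value)
  (push (cons key value) *pending-writes*))

(defun flush-pending (write-fn)
  "Drain the queued writes, oldest first, through WRITE-FN."
  (dolist (pair (nreverse (shiftf *pending-writes* '())))
    (funcall write-fn (car pair) (cdr pair))))
```

Whether a minute of write lag is acceptable is, of course, an
application question, not a technical one.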
-- 
(concatenate 'string "aa454" ·@freenet.carleton.ca")
http://www.ntlug.org/~cbbrowne/x.html
Rules of  the Evil  Overlord #153.  "My Legions of  Terror will  be an
equal-opportunity employer. Conversely, when  it is prophesied that no
man  can defeat  me, I  will  keep in  mind the  increasing number  of
non-traditional gender roles." <http://www.eviloverlord.com/>
From: ozan s yigit
Subject: Re: Fast (and I mean fast) DB access from lisp
Date: 
Message-ID: <vi47khzhw27.fsf@blue.cs.yorku.ca>
Christopher Browne <········@acm.org> writes:

> I thought the same as you, that it would be necessary to specially
> license Berkeley DB in order to use it with applications; the fact
> that the ever-paranoid Debian folks _don't_ put it in the "non-free"
> section should give reason to pause and wonder.
> 
> I made a comment in a Python newsgroup that I was surprised that
> Berkeley DB was in such wide use, as it seemed pretty proprietarily
> licensed, and was rebuffed with the argument that it was sufficient
> that the _Python library_ for interfacing with Berkeley DB was
> released as free software.

there has been a recent change in the Berkeley DB license; it is worth
reading it over at www.sleepycat.com/licensing.html as it permits open-source
distributions. also note that there is the pre-sleepycat version (1.85) that
comes with BSD license. it was a very good db library even at that version
[i know a thing or two about it :)] and is quite usable.

oz
---
if you want to have your head in the clouds
you need to keep your feet on the ground. -- terry pratchett
From: Christopher Browne
Subject: Re: Fast (and I mean fast) DB access from lisp
Date: 
Message-ID: <albe8t$1os3h7$1@ID-125932.news.dfncis.de>
The world rejoiced as ozan s yigit <··@blue.cs.yorku.ca> wrote:
> Christopher Browne <········@acm.org> writes:
>
>> I thought the same as you, that it would be necessary to specially
>> license Berkeley DB in order to use it with applications; the fact
>> that the ever-paranoid Debian folks _don't_ put it in the "non-free"
>> section should give reason to pause and wonder.
>> 
>> I made a comment in a Python newsgroup that I was surprised that
>> Berkeley DB was in such wide use, as it seemed pretty proprietarily
>> licensed, and was rebuffed with the argument that it was sufficient
>> that the _Python library_ for interfacing with Berkeley DB was
>> released as free software.
>
> there has been a recent change in the Berkeley DB license; it is
> worth reading it over at www.sleepycat.com/licensing.html as it
> permits open-source distributions. also note that there is the
> pre-sleepycat version (1.85) that comes with BSD license. it was a
> very good db library even at that version [i know a thing or two
> about it :)] and is quite usable.

Suppose we write a module, BERKELEY-DB, which is a 'wrapper' for
Berkeley-DB, and release _that module_ freely; the question is
whether that is sufficient to satisfy the Sleepycat folk.

Thus,  let us assume berkeley-db.lisp is released under the
LGPL/BSDL/Artistic License.

What is a _little_ ambiguous is whether or not having (require
:berkeley-db) in some other program mandates:
 a) Buying licenses from Sleepycat, or
 b) Releasing the code that contains (require :berkeley-db) under
    GPL/LGPL/BSDL/...L
or whether the fact that berkeley-db.lisp was made freely available
was sufficient.

The Python folk seemed to think that the fact that they had a Python
module that was "freely available" was sufficient.  I suspect they were
wrong, and that a) or b) is in fact required...
-- 
(reverse (concatenate 'string ····················@" "454aa"))
http://cbbrowne.com/info/nonrdbms.html
Do you know where your towel is?
From: Alain Picard
Subject: Re: Fast (and I mean fast) DB access from lisp
Date: 
Message-ID: <86heh2yv0l.fsf@gondolin.local.net>
Christopher Browne <········@acm.org> writes:


> Suppose we write a module, BERKELEY-DB, which is a 'wrapper' for
> Berkeley-DB, and release _that module_ freely, the question is of
> whether that is sufficient to satisfy the Sleepycat folk.

Well, I got it from the CEO of Sleepycat that that would NOT
satisfy them, at least not for a "real language" (his words :-)
like lisp.

They consider python "interpreted", and that the only "user"
of berkeley-db in a python app is the python "environment",
which is open sourced.

It doesn't make sense to me, but there you have it.
From: Christopher Browne
Subject: Re: Fast (and I mean fast) DB access from lisp
Date: 
Message-ID: <ald047$1pkag9$1@ID-125932.news.dfncis.de>
Quoth Alain Picard <·······················@optushome.com.au>:
> Christopher Browne <········@acm.org> writes:
>> Suppose we write a module, BERKELEY-DB, which is a 'wrapper' for
>> Berkeley-DB, and release _that module_ freely, the question is of
>> whether that is sufficient to satisfy the Sleepycat folk.
>
> Well, I got it from the CEO of Sleepycat that that would NOT satisfy
> them, at least not for a "real language" (his words :-) like lisp.
>
> They consider python "interpreted", and that the only "user" of
> berkeley-db in a python app is the python "environment", which is
> open sourced.
>
> It doesn't make sense to me, but there you have it.

O...K...  I think I'd need to smoke large quantities of restricted
substances to make that make sense _to me_.

Alternatively, perhaps this is an argument for getting an efficient CL
wrapper for Spread/Twisted/Isect working so that you'd punt DB
requests over via an RPC scheme to a Python instance that would do the
royalty-free DB access...  Believing that THAT makes sense might
require similar quantities of smoked material...

It's certainly been interesting...
-- 
(reverse (concatenate 'string ··········@" "enworbbc"))
http://www.ntlug.org/~cbbrowne/wp.html
Rules of  the Evil Overlord  #226. "I will  have a staff  of competent
detectives  handy. If I  learn that  someone in  a certain  village is
plotting against  me, I will have  them find out who  rather than wipe
out the entire village in a preemptive strike."
<http://www.eviloverlord.com/>
From: Alain Picard
Subject: Re: Fast (and I mean fast) DB access from lisp
Date: 
Message-ID: <863csmy97k.fsf@gondolin.local.net>
ozan s yigit <··@blue.cs.yorku.ca> writes:

> also note that there is the pre-sleepycat version (1.85) that
> comes with BSD license. it was a very good db library even at that version
> [i know a thing or two about it :)] and is quite usable.

The reasons I think it would not be wise to use 1.85
(and please correct me if you know any different):

 * it's now unsupported, so I'd have to assume that cost

 * It has a 4 billion record limit

 * Doesn't support multiple readers (Is this true?)

 * Doesn't have transactions (hence no recovery and hot backup mechanisms)

Off the top of my head.

I've been playing with Faircom's c-tree plus for a few days...
it might finally be what I've been looking for, and the price
is definitely right.
From: Tim Josling
Subject: Re: Fast (and I mean fast) DB access from lisp
Date: 
Message-ID: <3D799A55.772DEFEC@melbpc.org.au>
Alain Picard wrote:
> 
> ozan s yigit <··@blue.cs.yorku.ca> writes:
> 
> The reasons I think it would not be wise to use 1.85
> (and please correct me if you know any different)
> 
>  * It has a 4 billion record limit
> 
> ...

No way I could store my record collection database in 1.85 then!

Tim Josling
From: Dmitri Ivanov
Subject: Re: Fast (and I mean fast) DB access from lisp
Date: 
Message-ID: <akmurr$2rk9$1@news.aha.ru>
Hello,

Alain Picard <·······················@optushome.com.au> wrote in message
···················@gondolin.local.net...
|
| Dear Lispers,
|
| I'm tuning an application which needs real database speed.
| Currently, I'm using PostgreSQL (and Oracle, SQL Server)
| with an ODBC driver using Lispworks.
|...snip...|

I have experimented with the public SQL-ODBC module and Lispworks on
Windows.  For the MS Access driver, walking through a result set appears
about 1.5-2 times slower than with the native DAO library.  I don't see
any performance degradation while executing DML update statements.
--
Sincerely,
Dmitri Ivanov
www.aha.ru/~divanov
From: Frank A. Adrian
Subject: Re: Fast (and I mean fast) DB access from lisp
Date: 
Message-ID: <K1_b9.249$703.83854@news.uswest.net>
Dmitri Ivanov wrote:
> I have experimented with the public SQL-ODBC module and Lispworks on
> Windows.  For the MS Access driver, walking through a result set appears
> about 1.5-2 times slower than with the native DAO library.  I don't see
> any performance degradation while executing DML update statements.

Unfortunately, the DAO library is relatively slow.  This is due to the 
combined slowness of both the drivers (even worse for ODBC) and the 
overhead of the COM layers in the DAO code itself.  Access also has 
different performance characteristics from "real databases", masking some 
of this slowness with its own.  I've found that the only way to get really 
good performance in combined code/DB systems is to go with a native C-layer 
API and transcode the data you want into your language's own representations.

All of this assumes, of course, that you've done everything you can with 
respect to proper optimization of table structures, indexed everything, 
avoided large numbers of joins by denormalization, etc.

faa

P.S.  I know.  I know, the term "real database" above is (semi-)deliberate 
flamebait.  Also, performance is relative -- ODBC, DAO, and Access might be 
fine for you.  Just don't expect to post 250 transactions a second through 
it unless you're very, very lucky and your system is very, very simple.