From: vtail
Subject: CLOS persistence
Date: 
Message-ID: <80e8d5b0-0735-43ef-922b-c51713c2b927@p69g2000hsa.googlegroups.com>
Happy New Year group,

I would appreciate comments on existing CLOS persistence libraries, in
particular elephant vs cl-perec: how reliable/fast each one is, how
large is the user base, how active are maintainers etc. For elephant
in particular, I'm also interested in performance of Postgresql
backend vs BerkeleyDB.

My use case is a multi-user web-based app (I'm porting an existing Perl/
MySQL application), and I would need performance that is not far worse
than that of the existing system.

I'm using SBCL on linux-amd64. I'm aware of AllegroCL/AllegroCache,
but I'm only considering free implementations at this point.

Thank you very much,
Victor.

From: levy
Subject: Re: CLOS persistence
Date: 
Message-ID: <fbb683ed-2e6b-48a7-872c-e701dc6dffad@1g2000hsl.googlegroups.com>
On jan. 2, 08:54, vtail <··············@gmail.com> wrote:
> Happy New Year group,
>
> I would appreciate comments on existing CLOS persistence libraries, in
> particular elephant vs cl-perec: how reliable/fast each one is, how
> large is the user base, how active are maintainers etc. For elephant
> in particular, I'm also interested in performance of Postgresql
> backend vs BerkeleyDB.
>
> My use case is a multi-user web-based app (I'm porting an existing Perl/
> MySQL application), and I would need performance that is not far worse
> than that of the existing system.
>
> I'm using SBCL on linux-amd64. I'm aware of AllegroCL/AllegroCache,
> but I'm only considering free implementations at this point.
>
> Thank you very much,
> Victor.

I am one of the authors of cl-perec, so I can answer your questions
about it.

The good news: We are currently using it in exactly one production
system (with a few thousand users and one hundred persistent classes,
some of which have a million instances) without any issues so far.
There's a quite good test suite in the repository which assures us that
changes don't break things. Of course we are continuously maintaining it,
have some minor developments in mind, and are willing to take users'
needs into consideration too (well, if there were any ;-).

Read performance is not bad at all, because cl-perec gives a quite
good RDBMS mapping for the CLOS classes you define. You can use the
provided query compiler, write your own SQL queries or even mix the
two. Using CLOS is still possible even if you have some hand-optimized
queries, and lazy loading will work from their results. Write
performance is not optimized that much and I know a couple of ways to
improve it, but remember that in the worst case you can always execute
an SQL batch update.

The bad news: AFAICT the user base is practically zero; we have had as
many as three users who were able to set it up and give it a try.
Unfortunately we don't know exactly how far they got with it. Well,
as you probably know, there's no documentation other than a few hundred
test cases.

levy
From: vtail
Subject: Re: CLOS persistence
Date: 
Message-ID: <24464ff4-33e2-4812-b6fb-474c3a3635e2@t1g2000pra.googlegroups.com>
On Jan 2, 6:37 am, levy <················@gmail.com> wrote:
> On jan. 2, 08:54, vtail <··············@gmail.com> wrote:
>
>
>
> > Happy New Year group,
>
> > I would appreciate comments on existing CLOS persistence libraries, in
> > particular elephant vs cl-perec: how reliable/fast each one is, how
> > large is the user base, how active are maintainers etc. For elephant
> > in particular, I'm also interested in performance of Postgresql
> > backend vs BerkeleyDB.
>

> > I'm using SBCL on linux-amd64. I'm aware of AllegroCL/AllegroCache,
> > but I'm only considering free implementations at this point.
>

> I am one of the authors of cl-perec, so I can answer your questions
> about it.
>
> The good news: We are currently using it in exactly one production
> system (with a few thousand users and one hundred persistent classes,
> some of which have a million instances) without any issues so far.
> There's a quite good test suite in the repository which assures us that
> changes don't break things. Of course we are continuously maintaining it,
> have some minor developments in mind, and are willing to take users'
> needs into consideration too (well, if there were any ;-).
>
> Read performance is not bad at all, because cl-perec gives a quite
> good RDBMS mapping for the CLOS classes you define. You can use the
> provided query compiler, write your own SQL queries or even mix the
> two. Using CLOS is still possible even if you have some hand-optimized
> queries, and lazy loading will work from their results. Write
> performance is not optimized that much and I know a couple of ways to
> improve it, but remember that in the worst case you can always execute
> an SQL batch update.
>
> The bad news: AFAICT the user base is practically zero; we have had as
> many as three users who were able to set it up and give it a try.
> Unfortunately we don't know exactly how far they got with it. Well,
> as you probably know, there's no documentation other than a few hundred
> test cases.
>
> levy

Thank you for your answers, Levy.

Congratulations on winning the contract - it is impressive what you
have done under such time constraints!

If it's appropriate to ask - what was the reason for writing cl-perec
in the first place (versus using elephant)? I understand that your
time constraints were pretty tight so you most likely have studied all
the available options before rolling your own thing...

Best Regards,
Victor.
From: levy
Subject: Re: CLOS persistence
Date: 
Message-ID: <4632510b-4f80-429d-85d5-e6d90e948dae@l32g2000hse.googlegroups.com>
On jan. 2, 18:22, vtail <··············@gmail.com> wrote:
> Thank you for your answers, Levy.
>
> Congratulations on winning the contract - it is impressive what you
> have done under such time constraints!
>
> If it's appropriate to ask - what was the reason for writing cl-perec
> in the first place (versus using elephant)? I understand that your
> time constraints were pretty tight so you most likely have studied all
> the available options before rolling your own thing...
>
> Best Regards,
> Victor.

Obviously we did not write cl-perec in those three months but over the
past two years (with many other things in parallel). At that time
elephant was at a very early stage and its RDBMS mapping was, well, at
least hard to explain to people. I don't know if that has changed by
now, but I don't think so. The BDB backend was not an option due to its
licensing and the fact that most people here are happy when they see
their data stored in an RDBMS rather than in X.

There was clsql and allegrocache too and I evaluated both.
Allegrocache was 1.0 at that time and had some very nice features but
we decided not to use it due to its price and not being RDBMS based.
Using lisp in itself is enough to explain and we didn't want to go
into this other issue. As for clsql we had quite a few problems with
it (and it's not really a CLOS persistence layer), enough to roll our
own RDBMS layer called cl-rdbms.

levy
From: John Thingstad
Subject: Re: CLOS persistence
Date: 
Message-ID: <op.t4ctizmeut4oq5@pandora.alfanett.no>
On Thu, 03 Jan 2008 10:42:54 +0100, levy
<················@gmail.com> wrote:

>
> There was clsql and allegrocache too and I evaluated both.
> Allegrocache was 1.0 at that time and had some very nice features but
> we decided not to use it due to its price and not being RDBMS based.
> Using lisp in itself is enough to explain and we didn't want to go
> into this other issue. As for clsql we had quite a few problems with
> it (and it's not really a CLOS persistence layer), enough to roll our
> own RDBMS layer called cl-rdbms.
>

If all you need is persistence, wouldn't a simple file-based solution work
better?
It seems to me that to make use of a database you should have a large
dataset and random access by many simultaneous users. Similarly, it is not
clear to me why you would want a mapping between an RDBMS and CLOS. CLOS
creates a graph (assuming you consider directed graphs, trees and lists
subsets of graphs). An RDBMS creates tables. These are two orthogonal
means of representation. In particular, many-to-many relations are grossly
inefficient in an RDBMS. I have seen examples of using a CLOS mapping to
records using clsql and it struck me as an ugly kludge, so I stuck with
the functional form.
Perhaps you have different experiences?

--------------
John Thingstad
From: ··············@gmail.com
Subject: Re: CLOS persistence
Date: 
Message-ID: <6f6bce55-318c-43b1-922f-5cb03018bc3c@e6g2000prf.googlegroups.com>
> If all you need is persistence wouldn't a simple file based solution work
> better?


maybe because persistence is not all that we need? (hint: google for
ACID)


> Seems to me to make use of a database you should have a large dataset and
> random access by many simultaneous users. Similarly it is not clear to me


we use it as a backend of a webapp with a few thousand users.


> you would want a mapping from a RDBMS and CLOS. CLOS creates a graph.
> (assuming you consider directed graphs, trees and lists a subset of
> graph)  RDBMS creates tables. Two orthogonal means of representation. In


so? it's easier to work with the OO representation and therefore we
created an automated mapper. how much better would it be to do the
mapping by hand? would you write stuff in assembly just because both
lisp and asm are turing complete?


> particular, many-to-many relations are grossly inefficient in an RDBMS. I
> have seen examples of using a CLOS mapping to records using clsql and it
> struck me as an ugly kludge, so I stuck with the functional form.
> Perhaps you have different experiences?


no, all the OO -> SQL mappers are kludges in a way, but we've seen a
few kludges keeping the world in motion... sometimes you must make
compromises to get something useful done.


- attila
From: vtail
Subject: Re: CLOS persistence
Date: 
Message-ID: <ceee2afa-1b04-4fad-be94-5c44709953a2@u10g2000prn.googlegroups.com>
On Jan 3, 6:41 am, "John Thingstad" <·······@online.no> wrote:
> On Thu, 03 Jan 2008 10:42:54 +0100, levy
> <················@gmail.com> wrote:
>
>
>
> > There was clsql and allegrocache too and I evaluated both.
> > Allegrocache was 1.0 at that time and had some very nice features but
> > we decided not to use it due to its price and not being RDBMS based.
> > Using lisp in itself is enough to explain and we didn't want to go
> > into this other issue. As for clsql we had quite a few problems with
> > it (and it's not really a CLOS persistence layer), enough to roll our
> > own RDBMS layer called cl-rdbms.
>
> If all you need is persistence wouldn't a simple file based solution work
> better?

If I didn't need to support multiple threads/transactions, I would indeed
use something like cl-store.
From: vtail
Subject: Re: CLOS persistence
Date: 
Message-ID: <34712cc6-6607-4618-bb3a-523cf8b05ae6@v4g2000hsf.googlegroups.com>
On Jan 3, 3:42 am, levy <················@gmail.com> wrote:

> On jan. 2, 18:22, vtail <··············@gmail.com> wrote:
> > If it's appropriate to ask - what was the reason for writing cl-perec
> > in the first place (versus using elephant)? I understand that your
> > time constraints were pretty tight so you most likely have studied all
> > the available options before rolling your own thing...

> Obviously we did not write cl-perec in those three months but over the
> past two years (with many other things in parallel). At that time
> elephant was at a very early stage and its RDBMS mapping was, well, at
> least hard to explain to people. I don't know if that has changed by
> now, but I don't think so. The BDB backend was not an option due to its
> licensing and the fact that most people here are happy when they see
> their data stored in an RDBMS rather than in X.

Interesting. At this point elephant seems to be easier to use, because
it has some pretty good documentation and a tutorial, and is
asdf-installable. I have nothing against getting a package via darcs -
in fact, distributing your sources via some distributed SCM is
probably the best way to involve the community in development - but I
spent several hours yesterday trying to install all the dependencies
(and it still doesn't run: (asdf:oos 'asdf:load-op 'cl-perec) reports
"The name CL-PEREC does not designate any package" while doing  14:
(COMPILE-FILE #P"/home/victor/.sbcl/site/cl-perec/configuration.lisp")).
On the other hand, I really don't like how elephant misuses the database
backend as a hash table. If cl-perec provides a better mapping between
objects and the database, it's worth serious consideration!

I understand that writing (good) documentation takes a lot of effort
and that reading test cases / sources is usually enough, but having a good
quick tutorial + API documentation simplifies things in a big way. Have you
considered any tools that automatically extract documentation strings
(like doc/manual/docstrings.lisp from the sbcl source tree or such)?

Re: BDB - even after reading the "CLOS persistence" topic from
November, I still don't get what is wrong with BDB as a web-site
backend from the licensing perspective :(.

> There was clsql and allegrocache too and I evaluated both.
> Allegrocache was 1.0 at that time and had some very nice features but
> we decided not to use it due to its price and not being RDBMS based.
> Using lisp in itself is enough to explain and we didn't want to go
> into this other issue. As for clsql we had quite a few problems with
> it (and it's not really a CLOS persistence layer), enough to roll our
> own RDBMS layer called cl-rdbms.

I tried to use CLSQL too, but I didn't like the fact that it's so low
level - just a thin wrapper around SQL - and that one has to manually
arrange for different threads to use different connections etc. - am I
right that cl-rdbms is thread-safe by default?

On the other hand, CLSQL supports sqlite, which is an important
backend IMHO - very easy to install and sometimes faster than
Postgresql due to lower overhead. How hard is it to add an sqlite backend
to cl-rdbms?

Overall, you guys have managed to write an impressive number of
impressive libraries!

Regards,
Victor.
From: Leslie P. Polzer
Subject: Re: CLOS persistence
Date: 
Message-ID: <51e5a1cd-765c-4748-87b0-8e9443820eb8@v29g2000hsf.googlegroups.com>
On Jan 4, 6:13 am, vtail <··············@gmail.com> wrote:

> Re: BDB - even after reading the "CLOS persistence" topic from
> November, I still don't get what is wrong with BDB as a web-site
> backend from the licensing perspective :(.

Redistribution must be accompanied by the source code of the program
using BDB, or else (where "else" is "get a commercial license").

But it's not clear whether the SaaS case is backed by the term.
With the GPL it didn't seem to suffice, hence the Affero GPL.

So either

1) gamble (with good chances since it's hard to see that a web site is
backed by BDB anyway)
2) get a commercial license
3) use another backend
From: levy
Subject: Re: CLOS persistence
Date: 
Message-ID: <2c4d5e37-0f9b-423d-8b89-7cc5bc3d487d@s19g2000prg.googlegroups.com>
On Jan 4, 6:13 am, vtail <··············@gmail.com> wrote:
> Interesting. At this point elephant seems to be easier to use, because
> it has some pretty good documentation and a tutorial, and is
> asdf-installable. I have nothing against getting a package via darcs -
> in fact, distributing your sources via some distributed SCM is
> probably the best way to involve the community in development - but I
> spent several hours yesterday trying to install all the dependencies
> (and it still doesn't run: (asdf:oos 'asdf:load-op 'cl-perec) reports
> "The name CL-PEREC does not designate any package" while doing  14:
> (COMPILE-FILE #P"/home/victor/.sbcl/site/cl-perec/
> configuration.lisp")). On the other hand, I really don't like how it's
Hmm, I don't know what causes that, maybe you could send some more
details to the devel list?

> misusing the database backend as a hash table. If cl-perec provides a
> better mapping between objects and the database, it's worth serious
> consideration!
Cl-perec basically maps each persistent class to a table (which is
created automagically), and some persistent associations (namely the
many-to-many ones) too. The class tables have one (in non-default
modes two) extra column for the oid, one or more columns for
each primitive slot, and the usual foreign key columns for associations.
If a persistent class is abstract and would not have any columns at all,
then it does not get a table, to avoid unnecessary inserts. Persistent
instances are cached within the transaction and their (slot) data is
prefetched and cached upon first access or when querying. Within a
transaction it is guaranteed that two instances will be eq iff their
oid is the same, independently of how you got the two instances (by
querying or navigating, etc.)
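
The eq guarantee described above is essentially an identity map scoped
to one transaction. A minimal sketch in plain Common Lisp follows. It
does not use cl-perec's API: the package, the classes
persistent-instance and transaction, and the function load-instance are
all hypothetical names chosen for illustration, and the database fetch
is replaced by a bare make-instance.

```lisp
(defpackage :identity-map-sketch (:use :cl))
(in-package :identity-map-sketch)

;;; Hypothetical sketch: none of these names come from cl-perec.
(defclass persistent-instance ()
  ((oid :initarg :oid :reader oid)))

(defclass transaction ()
  ;; Maps oid -> instance for the lifetime of one transaction.
  ((instance-cache :initform (make-hash-table :test #'eql)
                   :reader instance-cache)))

(defun load-instance (transaction oid)
  "Return the instance for OID, creating it at most once per TRANSACTION."
  (let ((cache (instance-cache transaction)))
    (or (gethash oid cache)
        (setf (gethash oid cache)
              ;; A real mapper would fetch the row from the RDBMS here.
              (make-instance 'persistent-instance :oid oid)))))
```

With this, two calls to load-instance with the same oid in the same
transaction return the same object, while a different transaction gets
its own instance, so no in-memory state leaks across transactions.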

> I understand that writing (good) documentation takes a lot of effort
> and that reading test cases / sources is usually enough, but having a good
> quick tutorial + API documentation simplifies things in a big way. Have you
> considered any tools that automatically extract documentation strings
> (like doc/manual/docstrings.lisp from the sbcl source tree or such)?
No, not yet, but patches are always welcome! ;-)

> control different threads for using different connections etc. - am I
> right that cl-rdbms is thread-safe by default?
It is, we are using it in an application with more than 50 threads per
node and sometimes threads may even require nested transactions.

> On the other hand, CLSQL supports sqlite, which is an important
> backend IMHO - very easy to install and sometimes faster than
> Postgresql due to lower overhead. How hard is it to add an sqlite backend
> to cl-rdbms?
I think it's quite straightforward but might take a couple of days to
do correctly. Basically you need to subclass the database and
transaction classes, write some generic functions to connect/
disconnect, do some reflection on tables, prepare and execute a
statement and specialize the SQL printer if needed.
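
The shape of such a backend can be sketched as below. This is only an
illustration of the "subclass two classes, add methods" pattern in plain
CLOS: every class and generic function name here is hypothetical and
cl-rdbms's real protocol may look different. The sqlite methods are
stubs that record the SQL instead of talking to a database, and table
reflection and the SQL printer are left out.

```lisp
(defpackage :backend-sketch (:use :cl))
(in-package :backend-sketch)

;;; Hypothetical protocol; cl-rdbms's real names may differ.
(defclass database ()
  ((connection-specification :initarg :connection-specification
                             :reader connection-specification)))

(defclass transaction ()
  ((database :initarg :database :reader database-of)
   (connection :initform nil :accessor connection-of)))

(defgeneric transaction-class-name (database)
  (:documentation "Class of transaction to instantiate for DATABASE."))
(defgeneric connect (transaction))
(defgeneric disconnect (transaction))
(defgeneric execute-command (transaction command)
  (:documentation "Run the SQL string COMMAND and return its result."))

;;; The backend itself: two subclasses plus one method per generic function.
(defclass sqlite-database (database) ())
(defclass sqlite-transaction (transaction)
  ((executed-commands :initform '() :accessor executed-commands-of)))

(defmethod transaction-class-name ((database sqlite-database))
  'sqlite-transaction)

(defmethod connect ((transaction sqlite-transaction))
  ;; Stub: a real backend would open the file through an SQLite binding.
  (setf (connection-of transaction)
        (list :sqlite (connection-specification (database-of transaction)))))

(defmethod disconnect ((transaction sqlite-transaction))
  (setf (connection-of transaction) nil))

(defmethod execute-command ((transaction sqlite-transaction) command)
  ;; Stub: record the SQL instead of sending it to a database.
  (push command (executed-commands-of transaction))
  command)

(defun call-in-transaction (database fn)
  "Open a transaction on DATABASE, call FN with it, and always disconnect."
  (let ((transaction (make-instance (transaction-class-name database)
                                    :database database)))
    (connect transaction)
    (unwind-protect (funcall fn transaction)
      (disconnect transaction))))
```

Generic code such as call-in-transaction never mentions sqlite, which is
why a new backend can stay small: it only supplies the subclasses and
the methods specialized on them.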

If you look at the postgresql backend, it's only 370 LOC (based on
postmodern) which is not that much after all. The oracle backend is
nearly 2000 LOC not counting the generated CFFI interface.

levy
From: Leslie P. Polzer
Subject: Re: CLOS persistence
Date: 
Message-ID: <d273b558-a99f-42e0-bb22-2ad4400f6110@e10g2000prf.googlegroups.com>
On Jan 2, 8:54 am, vtail <··············@gmail.com> wrote:
> Happy New Year group,
>
> I would appreciate comments on existing CLOS persistence libraries, in
> particular elephant vs cl-perec: how reliable/fast each one is,

Elephant has a large test library to ensure reliability.

Elephant is easy to use for CLOS because of its MOP framework.


> how large is the user base,

Elephant is probably the most popular.
By the way, have you looked at cl-prevalence?


> how active are maintainers etc.

A bunch of knowledgeable people are on the list. Ian Eslick, the most
active one, answers questions quickly.
There's a Trac system with a roadmap.


> For elephant
> in particular, I'm also interested in performance of Postgresql
> backend vs BerkeleyDB.
> My use case is a multi-user web-based app (I'm porting an existing Perl/
> MySQL application), and I would need performance that is not far worse
> than that of the existing system.

IIRC the Elephant docs say that performance is best with BDB and
others are slower by a factor of three (check them out to verify
this). The only way to find out whether a given library satisfies your
performance needs is to try it. IMO the only disadvantage of Elephant
is the one cited by levy: Elephant abuses relational databases as flat
key-value-databases. But there's also a native SEXP backend in
development for Elephant.

  Leslie
From: vtail
Subject: Re: CLOS persistence
Date: 
Message-ID: <0a9adb72-6796-40b3-82de-3b98f1c23608@1g2000hsl.googlegroups.com>
On Jan 3, 6:14 am, "Leslie P. Polzer" <·············@gmx.net> wrote:
> On Jan 2, 8:54 am, vtail <··············@gmail.com> wrote:

>
> Elephant is probably the most popular.
> By the way, have you looked at cl-prevalence?

I did - and made a mental note to study its sources some day - but I'm
not comfortable storing my production data with a library that had its
last serious commit almost 2 years ago...

> > For elephant
> > in particular, I'm also interested in performance of Postgresql
> > backend vs BerkeleyDB.
> > My use case is a multi-user web-based app (I'm porting an existing Perl/
> > MySQL application), and I would need performance that is not far worse
> > than that of the existing system.
>
> IIRC the Elephant docs say that performance is best with BDB and
> others are slower by a factor of three (check them out to verify
> this). The only way to find out whether a given library satisfies your
> performance needs is to try it. IMO the only disadvantage of Elephant
> is the one cited by levy: Elephant abuses relational databases as flat
> key-value-databases. But there's also a native SEXP backend in
> development for Elephant.

Thanks for the info. I also noticed that they mention a
significant performance increase with Postgres after switching to
postmodern. I should indeed give it a try and measure the performance
for different backends.

Their object<->database mapping is very simple, indeed. I still have
to find out how to implement a many-to-many relationship in elephant
effectively. Also, I don't like the fact that "Garbage collection is
only supported via an offline migrate interface which will compact the
database by only copying reachable instances", and that every read
of a slot value reads from the database (you "cannot" cache an
object instance in memory).
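
To make the caching concern concrete, here is a minimal sketch of a
read-through, write-through slot cache in plain Common Lisp. It is not
Elephant's API: the hash table *store* stands in for a key-value
backend, and *store-reads*, cached-object, read-slot and write-slot are
hypothetical names used only for this illustration.

```lisp
(defpackage :slot-cache-sketch (:use :cl))
(in-package :slot-cache-sketch)

;;; Hypothetical stand-in for a key-value backend; not Elephant's API.
(defvar *store* (make-hash-table :test #'equal))
(defvar *store-reads* 0 "Number of times the backing store was consulted.")

(defun store-read (oid slot-name)
  (incf *store-reads*)
  (gethash (list oid slot-name) *store*))

(defun store-write (oid slot-name value)
  (setf (gethash (list oid slot-name) *store*) value))

(defclass cached-object ()
  ((oid :initarg :oid :reader oid)
   ;; Slot values already fetched for this in-memory instance.
   (slot-cache :initform (make-hash-table :test #'eq) :reader slot-cache)))

(defun read-slot (object slot-name)
  "Read through the cache: hit the store only on the first access."
  (multiple-value-bind (value foundp)
      (gethash slot-name (slot-cache object))
    (if foundp
        value
        (setf (gethash slot-name (slot-cache object))
              (store-read (oid object) slot-name)))))

(defun write-slot (object slot-name value)
  "Write through: update the store and the cache together."
  (store-write (oid object) slot-name value)
  (setf (gethash slot-name (slot-cache object)) value))
```

A cache like this goes stale as soon as another thread or process
writes the same slot, so in a multi-user setting it would need to be
scoped to a transaction or invalidated on commit.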

Regards,
Victor.
From: Leslie P. Polzer
Subject: Re: CLOS persistence
Date: 
Message-ID: <cced2f14-ccad-4693-9ccd-da2465544395@i29g2000prf.googlegroups.com>
On Jan 4, 6:21 am, vtail <··············@gmail.com> wrote:

> Also, I don't like the fact that "Garbage collection is
> only supported via an offline migrate interface which will compact the
> database by only copying reachable instances", and that every read
> of a slot value reads from the database (you "cannot" cache an
> object instance in memory).

It shouldn't be too hard to add this to Elephant.