From: Harag
Subject: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <bo2bmg$78a$1@ctb-nnrp2.saix.net>
Ok this is a question you must get a lot but from my archive searches on
google their are'nt any awnswers that satisfy or they cause more questions.

Reading and writing to a file in lisp (on windows) I get right with :io and
:overwrite but it seems to be limited to inserting new data and I need to
change data (please dont tell me to write it to a temporary file and then
rename). I must add that I have only played with the string type 'reads' and
'writes'.

Maybe I am just showing my ignorance on the subject of IO in general (it is
a dark area for me I have been spoiled but also restricted by databases) but
surely if you can add you must be able to replace. I can imagine if lisp
only reads from disk at the time of a 'read' and only the requested length
(or does lisp load the wole file into a memory stream on open and then
rewrite the wole file when you close it?) that replacing a data segment with
one of differing length might be a problem, but then what trick is lisp or
the operating system employing to do the inserts that seem to be possible
(is it fracturing the file?).

Lastly writing to a temp file is not a option the files are just to big.

I would really appreciate any advise or comments or a refenrence to a site
or text.

From: Pascal Bourguignon
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <87ism2j49n.fsf@thalassa.informatimago.com>
"Harag" <·········@psychedelic.co.za> writes:

> Ok this is a question you must get a lot but from my archive searches on
> google their are'nt any awnswers that satisfy or they cause more questions.
> 
> Reading and writing to a file in lisp (on windows) I get right with :io and
> :overwrite but it seems to be limited to inserting new data and I need to
> change data (please dont tell me to write it to a temporary file and then
> rename). I must add that I have only played with the string type 'reads' and
> 'writes'.
> 
> Maybe I am just showing my ignorance on the subject of IO in general (it is
> a dark area for me I have been spoiled but also restricted by databases) but
> surely if you can add you must be able to replace. I can imagine if lisp
> only reads from disk at the time of a 'read' and only the requested length
> (or does lisp load the wole file into a memory stream on open and then
> rewrite the wole file when you close it?) that replacing a data segment with
> one of differing length might be a problem, but then what trick is lisp or
> the operating system employing to do the inserts that seem to be possible
> (is it fracturing the file?).
> 
> Lastly writing to a temp file is not a option the files are just to big.
> 
> I would really appreciate any advise or comments or a refenrence to a site
> or text.

As previously indicated, you can  use file-position to move inside the
file and overwrite parts of it.

Of course,  it may pose problems  if you must write  bigger or smaller
data.

unix-like OS provide only a stream of byte metaphore for files.

Older  OS provided  integrated  access methods  with  various kind  of
indexes, fixed or variable length records, etc.

You can implement something similar over the stream of byte.

For example,  a file containing the following  fixed-length (42 bytes)
space-filled lines:

+------------------------------------------+
| 1 Hao Wang, logicien americain.          |
| 2                                        |
| 3 L'algorithme en  question  a  �t�      |
| 4 publi�  en  1960  dans l'IBM Journal,  |
| 5 article intitule "Toward  Mechanical   |
| 6 Mathematics",avec des variantes et une |
| 7 extension au calcul  des  pr�dicats.   |
| 8 Il  s'agit  ici  du  "premier          |
| 9 programme" de Wang, systeme "P".       |
+------------------------------------------+

could contain actually the following records:

header: | first-rec: 004 | last-rec: 010 | free-list: 005 | rec-size: 42   |
   000: | next: 002 | prev: 004 | 2                                        |
   001: | next: 009 | prev: NIL |..........................................|
   002: | next: 003 | prev: 000 | 3 L'algorithme en  question  a  �t�      |
   003: | next: 006 | prev: 002 | 4 publi�  en  1960  dans l'IBM Journal,  |
   004: | next: 000 | prev: NIL | 1 Hao Wang, logicien americain.          |
   005: | next: 001 | prev: NIL |..........................................|
   006: | next: 011 | prev: 003 | 5 article intitule "Toward  Mechanical   |
   007: | next: 008 | prev: 011 | 7 extension au calcul  des  pr�dicats.   |
   008: | next: 010 | prev: 007 | 8 Il  s'agit  ici  du  "premier          |
   009: | next: NIL | prev: NIL |..........................................|
   010: | next: NIL | prev: 008 | 9 programme" de Wang, systeme "P".       |
   011: | next: 007 | prev: 006 | 6 Mathematics",avec des variantes et une |

Then it's really easy to delete  a record or insert a new record: just
seek and rewrite the current, next, previous, and header records.


-- 
__Pascal_Bourguignon__
http://www.informatimago.com/
From: Harag
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <bo31r6$n40$1@ctb-nnrp2.saix.net>
> header: | first-rec: 004 | last-rec: 010 | free-list: 005 | rec-size: 42
|
>    000: | next: 002 | prev: 004 | 2
|
>    001: | next: 009 | prev: NIL
|..........................................|
>    002: | next: 003 | prev: 000 | 3 L'algorithme en  question  a  �t�
|
>    003: | next: 006 | prev: 002 | 4 publi�  en  1960  dans l'IBM Journal,
|
>    004: | next: 000 | prev: NIL | 1 Hao Wang, logicien americain.
|
>    005: | next: 001 | prev: NIL
|..........................................|
>    006: | next: 011 | prev: 003 | 5 article intitule "Toward  Mechanical
|
>    007: | next: 008 | prev: 011 | 7 extension au calcul  des  pr�dicats.
|
>    008: | next: 010 | prev: 007 | 8 Il  s'agit  ici  du  "premier
|
>    009: | next: NIL | prev: NIL
|..........................................|
>    010: | next: NIL | prev: 008 | 9 programme" de Wang, systeme "P".
|
>    011: | next: 007 | prev: 006 | 6 Mathematics",avec des variantes et une
|
>
> Then it's really easy to delete  a record or insert a new record: just
> seek and rewrite the current, next, previous, and header records.

Pascal thats great thanx I have been looking at more complicated formats
like HDF5 but the pure scope of implementing it had me backing off this will
do nicely for the little project I am working on now its even better than
the XML hack I was attempting. Now I can concentrate on getting the app
finished and learning more about lisp other than IO stuff.
From: Harag
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <bo35la$pj7$1@ctb-nnrp2.saix.net>
Ok after complaining about complicated structures I wnet ahead and slapped
this monster together to handle a mini db/tree with multiple levels.

(i did not mash this because its compicated enough I just did 2 recotds)

   000: | next reference: NIL | prev ref: NIL | parent struc  : NIL | prev
ref       : NIL | 1                       |
   001: | next keyword  : NIL | prev ref: NIL | parent struc  : 000 | prev
keyword   : NIL | identity                |
   002: | next summary  : 003 | prev ref: NIL | parent struc  : 000 | prev
summary   : NIL | blah blah blah blah     |
   003: | next summary  : 004 | prev ref: NIL | parent struc  : 000 | prev
summary   : 002 | blah blah blah blah     |
   004: | next summary  : 005 | prev ref: NIL | parent struc  : 000 | prev
summary   : 003 | blah blah blah blah     |
   005: | next summary  : NIL | prev ref: NIL | parent struc  : 000 | prev
summary   : 004 | blah blah blah blah     |
   006: | next author   : NIL | prev ref: NIL | parent struc  : 000 | prev
author    : NIL | Piet Snot               |
   007: | next aut-inits: NIL | prev ref: NIL | parent struc  : 006 | prev
auth-inits: NIL | P                       |
   008: | next book     : 010 | prev ref: NIL | parent struc  : 006 | prev
book      : NIL | IRC and identity        |
   009: | next datepubl : NIL | prev ref: NIL | parent struc  : 008 | prev
datepubl  : NIL | 23 October 1999         |
   010: | next book     : NIL | prev ref: NIL | parent struc  : 006 | prev
book      : 008 | IRC and identity        |
   011: | next datepubl : NIL | prev ref: NIL | parent struc  : 010 | prev
datepubl  : NIL | 23 October 1999         |


   010: | next reference: 001 | prev ref: 000 | parent struc  : NIL | prev
ref       : 010 | 2                       |
   011: | next keyword  : 001 | prev ref: 000 | parent struc  : 010 | prev
identity  : NIL | identity                |
   012: | next summary  : 001 | prev ref: 000 | parent struc  : 010 | prev
summary   : NIL | blah blah blah blah     |
   013: | next summary  : 001 | prev ref: 000 | parent struc  : 010 | prev
summary   : 012 | blah blah blah blah     |
   014: | next summary  : 001 | prev ref: 000 | parent struc  : 010 | prev
summary   : 013 | blah blah blah blah     |
   015: | next summary  : 001 | prev ref: 000 | parent struc  : 010 | prev
summary   : 014 | blah blah blah blah     |
   016: | next author   : 001 | prev ref: 000 | parent struc  : 010 | prev
author    : NIL | Gert Gieter             |
   017: | next aut-inits: 001 | prev ref: 000 | parent struc  : 016 | prev
auth-inits: NIL | P                       |
   018: | next book     : 001 | prev ref: 000 | parent struc  : 016 | prev
book      : NIL | Gerts folly             |
   019: | next datepubl : 001 | prev ref: 000 | parent struc  : 018 | prev
datepubl  : NIL | 17 Nov 2007             |
   020: | next book     : NIL | prev ref: NIL | parent struc  : 019 | prev
book      : 018 | IRC and identity        |
   021: | next datepubl : NIL | prev ref: NIL | parent struc  : 020 | prev
datepubl  : NIL | 23 October 1999         |


do you think this can work pascal ?
From: Pascal Bourguignon
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <878ymy3gcd.fsf@thalassa.informatimago.com>
"Harag" <·········@psychedelic.co.za> writes:

> Ok after complaining about complicated structures I wnet ahead and slapped
> this monster together to handle a mini db/tree with multiple levels.
> 
> (i did not mash this because its compicated enough I just did 2 recotds)
> 
>    000: | next reference: NIL | prev ref: NIL | parent struc  : NIL | prev
> ref       : NIL | 1                       |
>    001: | next keyword  : NIL | prev ref: NIL | parent struc  : 000 | prev
> keyword   : NIL | identity                |
>    002: | next summary  : 003 | prev ref: NIL | parent struc  : 000 | prev
>
> do you think this can work pascal ?
 
Now that you're getting more specific.

The principle to be able to delete  or insert data "in the middle of a
file", is  to cut this file in  records of fixed size  and implement a
higher level data structure on  this set of records.  Some records may
be "deleted" (garbage collection may be implemented with a free list).
The next question is that  of the access mode.  For sequential access,
a  simple linked  or double-linked  list is  enough.  If  you  need to
access records  based on keys, you  can add an index  layer. Here, you
may consider using one of the  db libraries provided in unix (db, dbm,
gdbm), thru UFFI. They essentially  implement a hash table, a map (key
-> value).

You may  consider each  record payload  as a row  in a  database.  The
longer  the record,  the less  impact  the overhead  of record  header
(previous/next pointers)  have.  But the  longer the record,  the more
space is lost when it's deleted  and not reused or when it's not fully
used  (variable  length records  may  be  implemented as  fixed-length
records of maximum size containing a length field).

You could have in each one of the records all your fields:

| ref | keyword | author | auth-inits | book | date publ | summary-------- |

(showing  only  the   payload,  the  previous-record  and  next-record
belonging to the lower level).

Of  course, like  in the  databases, you  may want  to  define several
tables and do joints, but you're reimplementing a database then.

An intermediate  technique is  to use a  record type field;  it's like
having  several tables stored  in one  file. If  the record  sizes are
vastly different, it's not economical space-wise (note the fillers).

| rec-type='author'  | ref | author | auth-inits | filler |
| rec-type='book'    | ref | book | date publ | filler    |
| rec-type='summary' | ref | index | summary --- line --- |

The structure  you're describing is  a degenerate case of  this, using
only one field per record.

If  you use  a  dbm, of  course you  can  use complex  keys, like  the
concatenation of major  key, minor key and record  type, but you could
not do a  "select" on a part of keys returning  several "rows", so you
would have to enumerate the keys to gather all the data.  On the other
hand gdbm,  contrarily to the old dbm  can handle any size  for key or
data, so you  could store big records with all the  data related to an
entity (and  then you'd  have to manage  the content of  your records,
inserting and deleting fields, but you can do that in RAM).



Otherwise, what would be the reason why not use a real RDBMS? mysql or
postgresql are quite nice.

-- 
__Pascal_Bourguignon__
http://www.informatimago.com/
From: Harag
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <bo3hvo$3ck$1@ctb-nnrp2.saix.net>
OK thats a mouth full and it will take me a moment to work through it all
... thanks for the input pascal it is really appreciated

Their are a couple of reasons for not using a DB like mysql (which I use a
lot):

1. This is a lisp learning excercise and I dont have all the paraphanalia
(packages/libs and stuff) to access a db from lisp yet so I thought it would
be a good time to learn this file stuff. I have been programming for close
onto 10 years and have done no more with IO than write out a log file
sequencially (and replace the whole thing next itme). I was started in the
DB world and never could bother with the file stuff untill recently (see
next point).
2. I have a little gripe about relational dms as implemented currently (if
your interested I have summed it up here more or less
http://www.psychedelic.co.za/newrelational/newrelational.htm) so I thought I
would try my hand at one or two concepts and see what its all about.

I never had a tertiary education(or any education for that matter) in
programming so I ask a lot of stupid questions and re-invent the wheel a lot
when it comes to the basic / lowlevel stuff but I'm learning more and more
every day or atleast I hope so. This is why I really appreciate your input.
From: Paolo Amoroso
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <87k76imtms.fsf@plato.moon.paoloamoroso.it>
Harag writes:

> 1. This is a lisp learning excercise and I dont have all the paraphanalia

It would be great if you could take this survey:

  The Road to Lisp Survey
  http://alu.cliki.net/The%20Road%20to%20Lisp%20Survey


Paolo
-- 
Why Lisp? http://alu.cliki.net/RtL%20Highlight%20Film
From: Pascal Bourguignon
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <87sml61o8w.fsf@thalassa.informatimago.com>
"Harag" <·········@psychedelic.co.za> writes:

> 2. I have a little gripe about relational dms as implemented currently (if
> your interested I have summed it up here more or less
> http://www.psychedelic.co.za/newrelational/newrelational.htm) so I thought I
> would try my hand at one or two concepts and see what its all about.

It sounds like you're wanting an OODBS, an Object storage.

There are a few hits on OODB and Lisp:
http://www.google.com/search?as_q=OODB+Lisp

-- 
__Pascal_Bourguignon__
http://www.informatimago.com/
From: Harag
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <bo3up5$c4s$1@ctb-nnrp2.saix.net>
> It sounds like you're wanting an OODBS, an Object storage.

I dont anything about oodbs ... is that what i'm attempting
http://www.psychedelic.co.za/newrelational/attempt.htm (this is draft stuff
so dont laugh to much ;o})?
From: Harag
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <bo3jk6$4e9$1@ctb-nnrp2.saix.net>
> If  you use  a  dbm, of  course you  can  use complex  keys, like  the
> concatenation of major  key, minor key and record  type, but you could
> not do a  "select" on a part of keys returning  several "rows", so you
> would have to enumerate the keys to gather all the data.

So what you are saying is that like in my example where each field occupies
a row using dbm I would not be able to do a type of select because a row
does not make up a whole record?

>On the other hand gdbm,  contrarily to the old dbm  can handle any size
for key or
> data, so you  could store big records with all the  data related to an
> entity (and  then you'd  have to manage  the content of  your records,
> inserting and deleting fields, but you can do that in RAM).

Ok what you are implying here is that dbm had limitations on the size of key
fields and gdbm does not. Thus I could use my structure and dump as much
meta-data with each field as I would like? The select would still be out of
the question though?
From: Pascal Bourguignon
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <87ptga1o4v.fsf@thalassa.informatimago.com>
"Harag" <·········@psychedelic.co.za> writes:

> > If  you use  a  dbm, of  course you  can  use complex  keys, like  the
> > concatenation of major  key, minor key and record  type, but you could
> > not do a  "select" on a part of keys returning  several "rows", so you
> > would have to enumerate the keys to gather all the data.
> 
> So what you are saying is that like in my example where each field occupies
> a row using dbm I would not be able to do a type of select because a row
> does not make up a whole record?

Indeed. The usage pattern of db is  that of the hash table, a mere map
(key ->  value).  In that case, what  you want to do  is identify your
"objects" with  a key, and  use it to  find the object.  You  can skip
from  one to  the other  keeping  the keys  as reference,  or you  can
enumerate the keys  present in the db, like you do  with a hash table.
But  doing a  "select",  that is,  filtering  the objects  on a  given
criteria  is not done  automatically, you  have to  read the  keys and
values and select the ones you're interested in yourself.

 
> >On the other hand gdbm,  contrarily to the old dbm  can handle any size
> for key or
> > data, so you  could store big records with all the  data related to an
> > entity (and  then you'd  have to manage  the content of  your records,
> > inserting and deleting fields, but you can do that in RAM).
> 
> Ok what you are implying here is that dbm had limitations on the size of key
> fields and gdbm does not. Thus I could use my structure and dump as much
> meta-data with each field as I would like? The select would still be out of
> the question though?

Yes.

There  is  a  technique  to  have  several  "indexes"  and  non-unique
keys. Keep in one db file a mapping (id -> object), and create as many
index files you want that are other db files containing a mapping (key
-> (list of  ids)).  Then given  a set of  keys, you first  search the
index  db files,  may  be intersect  the  resulting lists  of id,  and
finally retrieve the objects from the data file.


Here is a prototype in Lisp (not including the I/O aspects) using hash
tables:

    (setq data (make-hash-table))
    (setf (gethash 1 data) 
            '(:author "Wood, Derick" :year 1993
              :title "Data Structures, Algorithms, and Performance"
              :isbn "0-201-52148-2"))
    (setf (gethash 2 data)
            '(:author "Celko, Joe" :year 2000
              :title "SQL for smarties: advanced SQL programming /2nd ed"
              :isbn "1-55860-576-2"))
    (setf (gethash 3 data)
            '(:author "Celko, Joe" :year 1999
              :title "Data & Databases: conseps in practice"
               :isbn "1-55860-432-4"))

So you can retrieve an object using the identifier:
[14]> (gethash 2 data)
(:AUTHOR "Celko, Joe" :YEAR 2000 :TITLE
 "SQL for smarties: advanced SQL programming /2nd ed" :ISBN "1-55860-576-2") ;
T

Now you can build indexes:


    (defun make-index (db keysf &key test)
       "returns a hash table mapping the keys returned by the keysf function
        applied to the values of the db hash table to the keys of the db hash."
      (let ((index (make-hash-table :test test)))
        (maphash (lambda (id value) 
                  (mapc (lambda (key) (pushnew id (gethash key index)))
                        (funcall keysf value)))
                 db)
        index))
                   
[16]> (setq year-index (make-index data (lambda (obj)(list (getf obj :year)))))
#S(HASH-TABLE EQL (1993 1) (2000 2) (1999 3))

[29]> (setq title-keywords-index 
            (make-index data 
                        (lambda (obj)
                            (delete "" 
                                (mapcar (lambda (word) 
                                          (string-trim ",.:/&" word)) 
                                        (split-string (getf obj :title) " "))
                                :test (function string=)))
                        :test (function equalp)))

#S(HASH-TABLE EQUALP ("Performance" 1) ("and" 1) ("Algorithms" 1) 
    ("Structures" 1) ("ed" 2) ("2nd" 2) ("programming" 2) ("advanced" 2) 
    ("smarties" 2) ("for" 2) ("SQL" 2) ("practice" 3) ("in" 3) 
    ("conseps" 3) ("Databases" 3) ("Data" 1 3))

(note the two ID for "Data"). The same for author:

[30]> (setq author-keywords-index 
            (make-index data 
                        (lambda (obj)
                            (delete "" 
                                (mapcar (lambda (word) 
                                          (string-trim ",.:/&" word)) 
                                        (split-string (getf obj :author) " "))
                                :test (function string=)))
                        :test (function equalp)))

#S(HASH-TABLE EQUALP ("Derick" 1) ("Wood" 1) ("Joe" 2 3) ("Celko" 2 3))

(split-string  is not  CL; see  emacs split-string  for an  example of
implementation).

Then you can find your objects by different indexes:

[33]> (gethash "Celko" author-keywords-index)
(2 3) ;
T
[34]> (gethash 1999 year-index)
(3) ;
T
[38]> (intersection (gethash "Celko" author-keywords-index) 
                    (gethash 1999 year-index))
(3)

[39]> (mapcar (lambda (id) (gethash id data)) 
              (gethash "Celko" author-keywords-index))

((:AUTHOR "Celko, Joe" :YEAR 2000 
  :TITLE "SQL for smarties: advanced SQL programming /2nd ed" 
  :ISBN "1-55860-576-2")
 (:AUTHOR "Celko, Joe" :YEAR 1999 
  :TITLE "Data & Databases: conseps in practice" 
  :ISBN  "1-55860-432-4"))

[40]> (mapcar (lambda (id) (gethash id data)) 
              (gethash "Data" title-keywords-index))

((:AUTHOR "Wood, Derick" :YEAR 1993 
  :TITLE "Data Structures, Algorithms, and Performance"
  :ISBN "0-201-52148-2")
 (:AUTHOR "Celko, Joe" :YEAR 1999 
  :TITLE "Data & Databases: conseps in practice" 
  :ISBN  "1-55860-432-4"))


Since  you're  interested  in  the  I/O aspects  in  Lisp,  you  could
implement an equivalent  of gdbm in lisp. It could  have the same high
level kind of API than the hash tables.

Read the chapter about the Streams 
http://www.lisp.org/HyperSpec/Body/chap-21.html
(read-sequence, write-sequence, file-position)
[ use `(array unsigned-byte (,blocksize)) as buffer ].

Add a layer to allocate, read, write, and free blocks.

On this  layer, implement the  storage of records (see  print-object ;
some type of lisp data may be hard or impossible to store or read back
automatically).

If you  want to allow  variable sized keys  and values like  gdbm, you
should take care in the allocation of the records and their reference,
since this  would break  the 1-1 mapping  between the records  and the
blocks. This  mean that you may  need to have a  map (record-number ->
(block-number...)) at this layer.

And finally,  add a layer for  the key->value mapping.   That could be
done merely  storing the hash table mapping  the (key-record-number ->
value-record-number), and then you would  load the whole hash table in
Lisp core  to access the file, or  you may prefer keep  the mapping in
the file.


-- 
__Pascal_Bourguignon__
http://www.informatimago.com/
From: Harag
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <bo3v2f$c9a$1@ctb-nnrp2.saix.net>
> Since  you're  interested  in  the  I/O aspects  in  Lisp,  you  could
> implement an equivalent  of gdbm in lisp. It could  have the same high
> level kind of API than the hash tables.

Ok maybe thats an idea I'll put some stuff together and slap it on my
website and then only put up change notices onto this news group so you or
anybody else (if anybody is interested) can follow the progress...else I
might just get accused of abbusing this newsgroup...(lol)
From: Harag
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <bpe72t$8vj$1@ctb-nnrp2.saix.net>
Well Pascal the hack now exists (very primitive and needs more testing)
...associative-gdbm or something ;o}

...in true tradition(actually lazyness) documentation is a bit on the thin
side...to follow one day

http://www.psychedelic.co.za/newrelational/lacsap.htm
From: Pascal Bourguignon
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <87llqdtde5.fsf@thalassa.informatimago.com>
"Harag" <·········@psychedelic.co.za> writes:

> Well Pascal the hack now exists (very primitive and needs more testing)
> ...associative-gdbm or something ;o}
> 
> ...in true tradition(actually lazyness) documentation is a bit on the thin
> side...to follow one day
> 
> http://www.psychedelic.co.za/newrelational/lacsap.htm

Indeed. You should either use more explicit names than:

(defvar *lid* 1)
(defvar *lrid* 1)

or add documentation strings:

(defvar *lid*  1 "Largest Integer Designating.")   ;; for example...
(defvar *lrid* 1 "Lowest Rotund Ignominious Decimal.")


The same for the functions, instead of writting a comment _before_ (I
hate this), put it as documentation string conveniently placed just
_after_ the defun signature:

(defun print-vpos (vpos stream)
 "This struct is used to store details about an entity's value's
physical position."
  (format stream "(~A ~A \"~A\")"
    ...))

instead of:


;;this struct is used to store details about an entity's value's
;;physical position
(defun print-vpos (vpos stream)
  (format stream "(~A ~A \"~A\")"
    ...))

Oops!  That  comment  did  not  correspond  to  the  adjacent  object!
defstruct too accepts a documentation string.

I would write the following:       rather as:

(equal parent nil)           --->  (null parent)
(equal rid 0)                --->  (= 0 rid)     or:   (zerop rid)



(defun make-tindex (e-db)
  (let ((index (make-hash-table :test (function string=))))

This is not Common-Lisp:
[28]> (load "lacsap")
*** - MAKE-HASH-TABLE: illegal :TEST argument #<SYSTEM-FUNCTION STRING=>

Use (function equal) instead, in hash tables.




So, you're  building an index named  "entity.db" of a  data file named
"value.db".  Unfortunately, since you're  not using fixed size records
in the data file, you're needing functions such as:

(defun defrag-e-values (db newpath)
  (maphash (lambda (key entity)
             (set-e-val entity (get-e-sval entity) newpath))
    ;;; OOPS! set-e-val is not defined, it's set-e-value 
           db)
  newpath)


You've got a  funny strange little toy here.  But  what are you trying
to accomplish?  This looks a little unfocused.

It looks like  you're trying to manipulate in-memory  a data structure
that would be implemented only on disk.

In memory, we don't usually  refer objects with reference numbers thru
hash tables. (At worse, we could do it with indices of an array).  

If you  want to have "entity"  objects, when you create  a new entity,
you get it, and when you  insert the entity in a "database", you still
keep the entity to further manipulate it. It's much easier to do so in
memory  than to  having to  retrieve  the object  from the  "database"
everytime.

Then, the  mapping between the  objects and their reference  number is
done only by the database access layer.


(setf pid (add-e (create-entity "author") *db*))
(setf id (add-e (create-entity "surname" :parent (gethash pid *db*)) *db*))
(set-e-value (gethash id *db*) "Celcko")
(setf id (add-e (create-entity "names" :parent (gethash pid *db*)) *db*))
(set-e-value (gethash id *db*) "Joe")
(setf id (add-e (create-entity "book" :parent (gethash pid *db*)) *db*))
(set-e-value (gethash id *db*) "SQL for smarties: advanced SQL programming /2nd ed")
(setf id (add-e (create-entity "year" :parent (gethash id *db*)) *db*))
(set-e-value (gethash id *db*) "2000")
(setf id (add-e (create-entity "book" :parent (gethash pid *db*)) *db*))
(set-e-value (gethash id *db*) "Data & Databases: conseps in practice")
(setf id (add-e (create-entity "year" :parent (gethash id *db*)) *db*))
(set-e-value (gethash id *db*) "1999")

could be rewritten as:

(let ((author (create-entity "author")))
  (add-e author *db*)
  (let ((surname (create-entity "surname" :parent author)))
    (add-e surname *db*)
    (set-e-value surname "Celcko"))
  (let ((name (create-entity "name" :parent author)))
    (add-e name *db*)
    (set-e-value name "Joe"))
  (let ((book (create-entity "book" :parent author)))
    (add-e book *db*)
    (set-e-value book "SQL for smarties: advanced SQL programming /2nd ed")
    (let ((year (create-entity "book" :parent book)))
      (add-e year *db*)
      (set-e-value year "2000")))
  (let ((book (create-entity "book" :parent author)))
    (add-e book *db*)
    (set-e-value book "Data & Databases: conseps in practice")
    (let ((year (create-entity "book" :parent book)))
      (add-e year *db*)
      (set-e-value year "1999"))))

Then  you could  simplify  it  in two  directions,  depending on  your
needs. Either note the repeatition of the same form

(let ((ENTITY (create-entity ENTITY-NAME)))
  (add-e ENTITY *db*)
  (OTHER-STUFF-WITH-ENTITY-AS-PARENT))

which leads to:


(DEFMACRO WITH-GENSYMS (SYMS &BODY BODY)
  "Replaces given symbols with gensyms. Useful for creating macros.
This version by Paul Graham in On Lisp."
  `(LET ,(MAPCAR #'(LAMBDA (S) `(,S (GENSYM))) SYMS) ,@BODY)
  )

(defvar *with-entity-parent* nil)
(proclaim '(special *with-entity-parent*))

(defmacro with-entity (entity-name entity-value &body body)
    (with-gensyms (entity value)
    `(let ((,entity (create-entity ,entity-name :parent *with-entity-parent*))
           (,value  ,entity-value))
        (add-e ,entity *db*)
        ,@(when value (list `(set-e-value ,entity ,value)))
        ,@body)))
     
and then:

   
(with-entity ("author")
   (with-entity ("surname" "Wood"))
   (with-entity ("name"    "Derick"))
   (with-entity ("book"    "SQL for smarties: advanced SQL programming /2nd ed")
     (with-entity ("year"    "2000")))
   (with-entity ("book"    "Data & Databases: conseps in practice")
     (with-entity ("year"    "1999"))))


But  you could  as well  define  a function  that would  parse a  tree
defining your entity:

(make-entity-from-description
 '(:author nil
   (:surname "Wood")
   (:name    "Derick")
   (:book    "SQL for smarties: advanced SQL programming /2nd ed"
    (:year   "2000"))
   (:book    "Data & Databases: conseps in practice"
    (:year   "1999")))) 


For example:

(defun make-entity-from-description (description &optional parent)
  (let ((entity (create-entity (string (first description)) :parent parent)))
    (add-e entity *db*)
    (when (second description) (set-e-value entity (second description)))
    (map nil (lambda (subentity)
               (make-entity-from-description subentity entity))
         (cddr description))
    entity))




You should read the book of Peter Seibel, in particular this chapter:

  http://www.gigamonkeys.com/book/practical-a-simple-database.html


-- 
__Pascal_Bourguignon__
http://www.informatimago.com/
From: Harag
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <bpfsck$sc1$1@ctb-nnrp2.saix.net>
>
> So, you're  building an index named  "entity.db" of a  data file named
> "value.db".  Unfortunately, since you're  not using fixed size records
> in the data file, you're needing functions such as:
>
> (defun defrag-e-values (db newpath)
>   (maphash (lambda (key entity)
>              (set-e-val entity (get-e-sval entity) newpath))
>     ;;; OOPS! set-e-val is not defined, it's set-e-value
>            db)
>   newpath)

My appologies here I was renaming functions and did not do enough testing
before "publishing"...

>
>
> You've got a  funny strange little toy here.  But  what are you trying
> to accomplish?  This looks a little unfocused.
>
> It looks like  you're trying to manipulate in-memory  a data structure
> that would be implemented only on disk.

Yes any "reasonable" size DB in this format will have to be implemented on
disk with only sub "trees" in memory to manipulate. For now I was just using
the hashtable to try out some ideas.

Maybe a couple of examples of what I'm aiming at in the end will help to
explain where I'm trying to go.

A typical personell record:

Personnel Table:
    Personnel Number     : 001
    Surname                    : Snot
    Name                        : Piet
    Initials                        : P
    Sex                            : Male

Personnel Contacts:
    Personnel Number     : 001
    Type                          : Home
    Phone                        : 123456789

    Personnel Number     : 001
    Type                          : Mobile
    Phone                        : 987654321

Personnel Dependants:
    Personnel Number     : 001
    Relation                     : Spouse
    Surname                    : Snot
    Name                        : Sannie
    Initials                        : S
    Sex                            : Female

    Personnel Number     : 001
    Relation                     : Child
    Surname                    : Snot
    Name                        : Pietie
    Initials                        : P
    Sex                            : Male

Personnel Salary:
    Personnel Number     : 001
    Year                          : 2000
    Annual                       : $10

    Personnel Number     : 001
    Year                          : 2001
    Annual                       : $9

    Personnel Number     : 001
    Year                          : 2002
    Annual                       : $8   (he lives in a third world country
;o})


Personnel Table:
    Personnel Number     : 002
    Surname                    : Snot
    Name                        : Sannie
    Initials                        : S
    Sex                            : Female


Personnel Contacts:
    Personnel Number     : 002
    Type                          : Home
    Phone                        : 123456789

    Personnel Number     : 002
    Type                          : Mobile
    Phone                        : 987654321

Personnel Dependants:
    Personnel Number     : 002
    Relation                     : Spouse
    Surname                    : Snot
    Name                        : Piet
    Initials                        : P
    Sex                            : Male

    Personnel Number     : 002
    Relation                     : Child
    Surname                    : Snot
    Name                        : Pietie
    Initials                        : P
    Sex                            : Male

----OR-----

Personnel:
            Number     : 001
            Surname                    : Snot
            Name                        : Piet
            Initials                        : P
            Sex                            : Male
            Contacts:
                   Home Phone               : 123456789
                   Mobile                        : 987654321
             Dependants:
                    Relation                     : Spouse
                           Surname                    : Snot
                           Name                        : Sannie
                           Initials                        : S
                           Sex                            : Female
                    Relation                     : Child
                           Surname                    : Snot
                           Name                        : Pietie
                           Initials                        : P
                           Sex                            : Male
            Personnel Salary:
                  Year                          : 2000
                          Annual                       : $10
                  Year                          : 2001
                          Annual                       : $9
                  Year                          : 2002
                          Annual                       : $8

Personnel:
            Number     : 002
            Surname                    : Snot
            Name                        : Sannie
            Initials                        : S
            Sex                            : Female
            Contacts:
                   Home Phone               : 123456789
                   Mobile                        : 987654321
             Dependants:
                    Relation                     : Spouse
                           Surname                    : Snot
                           Name                        : Piet
                           Initials                        : P
                           Sex                            : Male
                    Relation                     : Child
                           Surname                    : Snot
                           Name                        : Pietie
                           Initials                        : P
                           Sex                            : Male



---OR--- (and this is what I am aiming at)

EID = ENTITY (INSTANCE) ID
PID = PARENT ENTITY ID

Personnel: (EID 1 PID 0)
            Number     : 001 (EID 2 PID 1)
            Person (EID 3 PID 1)
                   Surname                    : Snot   (EID 4 PID 3)
                   Name                        : Piet    (EID 5 PID 3)
                   Initials                        : P        (EID 6 PID 3)
                    Sex                           : Male  (EID 7 PID 3)
            Contacts (EID 8 PID 1)
                   Home Phone               : 123456789 (EID 9 PID 8)
                   Mobile                        : 987654321 (EID 10 PID 8)
             Dependants (EID 11 PID 1)
                    Relation                     : Spouse (EID 12 PID 11)
                           Person (EID 13 PID 12)
                                  Surname                    : Snot     (EID
14 PID 13)
                                  Name                        : Sannie  (EID
15 PID 13)
                                  Initials                        : S
(EID 16 PID 13)
                                  Sex                            : Female
(EID 17 PID 13)
                    Relation                     : Child (EID 18 PID 11)
                           Person (EID 19 PID 18)
                                  Surname                    : Snot  (EID 20
PID 19)
                                  Name                        : Pietie (EID
21 PID 19)
                                  Initials                        : P
(EID 22 PID 19)
                                  Sex                            : Male
(EID 23 PID 19)
            Personnel Salary (EID 24 PID 1)
                  Year                          : 2000 (EID 25 PID 24)
                          Annual                       : $10 (EID 26 PID 25)
                  Year                          : 2001 (EID 27 PID 24)
                          Annual                       : $9 (EID 28 PID 27)
                  Year                          : 2002 (EID 29 PID 24)
                          Annual                       : $8 (EID 30 PID 29)

Personnel (EID 31 PID 0)
            Number     : (EID 32 PID 31)
            Person (EID 13  PID 31)
                   Surname                    : Snot     (EID 14 PID 13)
                   Name                        : Sannie  (EID 15 PID 13)
                   Initials                        : S          (EID 16 PID
13)
                   Sex                            : Female (EID 17 PID 13)
            Contacts: (EID 8 PID 31)
                   Home Phone               : 123456789 (EID 9 PID 8)
                   Mobile                        : 987654321 (EID 10 PID 8)
             Dependants: (EID 33 PID 31)
                    Relation                     : Spouse (EID 34 PID 33)
                           Person (EID 3 PID 34)
                                  Surname                    : Snot  (EID 4
PID 3)
                                  Name                        : Piet   (EID
5 PID 3)
                                  Initials                        : P
(EID 6 PID 3)
                                  Sex                            : Male (EID
7 PID 3)
                    Relation                     : Child (EID 35 PID 33)
                           Person (EID 19 PID 35)
                                  Surname                    : Snot  (EID 20
PID 19)
                                  Name                        : Pietie (EID
21 PID 19)
                                  Initials                        : P
(EID 22 PID 19)
                                  Sex                            : Male
(EID 23 PID 19)

Note that the indentetion indicates the RELATIONSHIP between the various
data piece or ENTITIES.

What do I get for all this trouble:

1. Although I shoe the enities' data more thanonce the actual data will only
be stored once for the EID.
2. I'm hoping to set up heuristics that USES the EID,PID and ENTITY TYPE
(like Person) to infer new "data / information" like
            PIET and SANNIE are a COUPLE
            PIET and SANNIE are PIETIE's parents
            etc
    *NOTE* This heuristics bit is still vary vague in my head!!
3. If I get an EID I can roll up and down its structure without knowing
about "tables and there field structures and there relationships upfront. I
find this structure much more intuative to work with personnally and I want
to give it a go EVEN though I suspect that speed/processing time will be
slower/more intensive.


>
> In memory, we don't usually  refer objects with reference numbers thru
> hash tables. (At worse, we could do it with indices of an array).

The reason I'm sticking to the hashtable is because after some use there
might be uge gaps in my numbering ranges (EID) and then it would not map
naturally to an array.

>
> If you  want to have "entity"  objects, when you create  a new entity,
> you get it, and when you  insert the entity in a "database", you still
> keep the entity to further manipulate it. It's much easier to do so in
> memory  than to  having to  retrieve  the object  from the  "database"
> everytime.
>

Yes I agree it needs work and I will come back to you on this issue once I
fully understand the path to the macro at the end of your code ;o}....(after
2 weeks of lisp Harag has till not tried his hand at a macro).

Thanks for the link to Peter Seibel's chapter. I also strated out with
propertylists but changed to structures, for some forgotten reason, some
where in the "growth" (can't call it development) of the code.

There is still a lot of areas that needs fine scrutiny but I am trying my
hand at Paul Grahams idea of using lisp as a specification tool.
From: Harag
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <bpgk4c$67e$1@ctb-nnrp2.saix.net>
"Pascal Bourguignon" <····@thalassa.informatimago.com> wrote in message
···················@thalassa.informatimago.com...
> "Harag" <·········@psychedelic.co.za> writes:
>
> > > So, you're  building an index named  "entity.db" of a  data file named
> > > "value.db".  Unfortunately, since you're  not using fixed size records
> > > in the data file, you're needing functions such as:
> > >
> > > (defun defrag-e-values (db newpath)
> > >   (maphash (lambda (key entity)
> > >              (set-e-val entity (get-e-sval entity) newpath))
> > >     ;;; OOPS! set-e-val is not defined, it's set-e-value
> > >            db)
> > >   newpath)
> >
> > My appologies here I was renaming functions and did not do enough
testing
> > before "publishing"...
>
> Never mind.  The point was about the record size.  With fixed record
> sizes, you would not need so much to "defrag" the file, because you
> could easily reuse empty ("deleted") records.

No agrument there but if I remember correctly you said gdbm had variable
length keys and values, not that I have ever seen gdbm or used it, so I was
kind off aiming for something like that.

> > [...]
> > 1. Although I shoe the enities' data more thanonce the actual data will
only
> > be stored once for the EID.
>
> Usually, entities are considered as a whole.  Some function may access
> only some fields of the entities  it processes, but most often we need
> to handle  all the attributes  of an entity  at once.  So,  unless you
> have some very  big fields (let's say: a hires  picture of the person,
> recordings  of  his speaches,  etc,  in  which  case we  would  rather
> consider rather  the big  item as individual  entities related  to the
> person), it would be much more  efficient to keep all the fields of an
> entity in the same record and load and save them at once.
>
> Which does not  prevent you to add relations,  be them hierarchical or
> not, but not  necessarily embedded.  It may be worthwhile  to do it in
> separate files ("tables").
>

All true in a simple static world.

Consider the following senario (I'm going for the abstract here because I
can't think of a "real word" scenario right now).

A
    B
    C
        K
        L
        M
E
    F
    D
    G

The resulting records (in their own tables) would be:
A-BC
C-KLM
E-FDG

***The first letter being the key and the rest the attributes.


If I desided to change the "structure" after some time or because of some
obscure reason to the following

A
    B
        L
        M
    C
        K
E
    F
    D
    G

or had to accommodate new attributes like this

A
    B
    C
        K
        L
        M
        X
    Y
E
    F
    D
        Z
    G

then I think my physical schema should work better.

------------
What am I getting at:



1. Programming and data structure flexibility

        In my schema I would just change a relationships or add a
relationship and not have to bother about the physical storage at all, where
for records I would have to go and move data about.



2. Catering for temporal and/or varying data



        I have no argument with the fact that in my schema I pay in
performance for flexibility. Whether the flexibility is really needed is
arguable. What comes to mind is having to pivot records (I'm thinking
SQL syntax here not programmatically) to work with a "whole entity" because
the "attributes" are variable or optional. Surely in those cases at least,
"records" and "tables" hinder more than they help.



        A typical "real life" scenario would be a payroll. You have many
small "entities" like salary and other pay codes (on average between 100 and
250 numerical entities alone)  that should actually be attributes of a
personnel member, but because they vary in number (and some times shape)
they can not or should not be made attributes. Each little entity being its
own record now kind of levels the playing field when it comes to storing the
physical data does it not?



        The other problem in this scenario is that a lot of the data in a
payroll is temporal (is that the right word?) in nature. An example would be
the hours worked for February by a contract worker is very important this
month but wont feature in calculations next month. This type of
attribute/entity you would not want to tie down in the personnel record even
if you could because it would be a waste. Futher you cant abbon the February
hours completly either because of  reporting issues so next month you have
to create a new record for the hours and so on.



------------



I don't know if all that made any sense to you at all, a lot of it is gut
feel, having had to work on such databases for some years  ;o}
From: Pascal Bourguignon
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <87smkkqd7l.fsf@thalassa.informatimago.com>
"Harag" <·········@psychedelic.co.za> writes:
> No agrument there but if I remember correctly you said gdbm had variable
> length keys and values, not that I have ever seen gdbm or used it, so I was
> kind off aiming for something like that.

Yes, but  the variable-sized is managed inside  fixed-sized. 

Look for  example how malloc  is usually implemented,  with "buckets",
arrays of blocks  of same size. When you want to  allocate x byte, you
get the address of a  memory block that is actually 2^int(1+log2(x-1))
(when x is small enough; when it's much greater than a page size, it's
simply allocated rounding to the next page border).

Actually, it's a compromize betweeen  managing each block sized to the
byte and  having only one-size-fit-all.   If you allocate  exactly the
number of  bytes needed,  when it's freed,  it may  be hard to  find a
second request  for the exact same  number of bytes. That  is, given a
request for y bytes, it's harder  to find a free block with a matching
size,  and to find  an algorithm  that will  minimize the  lost space.
When you have one size for all,  of course you lose a lot more on each
request, with the gain that the algorithm is utterly simple: just give
the first  free block  (but of  course, then there  is a  maximum over
which you can't allocate).


So  you have to  choose between  the complexity  of managing  the free
space and the space lost on  average on each allocation. Of course, if
you don't  delete items (or rarely),  you can forget  the freed space,
and count  on a  defragment feature  to clean up  the garbage.  It all
depends on what your requirements are.

 

-- 
__Pascal_Bourguignon__
http://www.informatimago.com/
From: Harag
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <bpgnsm$7c3$1@ctb-nnrp2.saix.net>
"Pascal Bourguignon" <····@thalassa.informatimago.com> wrote in message
···················@thalassa.informatimago.com...
>
> "Harag" <·········@psychedelic.co.za> writes:
> > No agrument there but if I remember correctly you said gdbm had variable
> > length keys and values, not that I have ever seen gdbm or used it, so I
was
> > kind off aiming for something like that.
>
> Yes, but  the variable-sized is managed inside  fixed-sized.
>

Right I'm with you there. In the "record" senariao are you storing the data
in adjacent blocks or is there some "trick" to have it spread.
From: Pascal Bourguignon
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <87d6boq84z.fsf@thalassa.informatimago.com>
"Harag" <·········@psychedelic.co.za> writes:

> "Pascal Bourguignon" <····@thalassa.informatimago.com> wrote in message
> ···················@thalassa.informatimago.com...
> >
> > "Harag" <·········@psychedelic.co.za> writes:
> > > No agrument there but if I remember correctly you said gdbm had variable
> > > length keys and values, not that I have ever seen gdbm or used it, so I
> was
> > > kind off aiming for something like that.
> >
> > Yes, but  the variable-sized is managed inside  fixed-sized.
> >
> 
> Right I'm with you there. In the "record" senariao are you storing the data
> in adjacent blocks or is there some "trick" to have it spread.

Of course, it works better in  the cases where there is an upper bound
to the size  of your variable-sized stuff, and when  most of our stuff
is all about the same size. In these conditions you can set easily one
fixed  record size  and don't  lose too  much when  the record  is not
completely filled.

Otherwise,  either you  may segment  your data  in subrecords  if that
would help the upper bound and the deviation of the sizes, or you have
to manage  the storage  and you can  effectively spread big  data over
several blocks not  necessarily adjacent, like it's done  for files in
filesystems.




-- 
__Pascal_Bourguignon__
http://www.informatimago.com/
From: Paul Foley
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <m2vfq2z582.fsf@mycroft.actrix.gen.nz>
On 02 Nov 2003 17:06:26 +0100, Pascal Bourguignon wrote:

> access records  based on keys, you  can add an index  layer. Here, you
> may consider using one of the  db libraries provided in unix (db, dbm,
> gdbm), thru UFFI. They essentially  implement a hash table, a map (key
-> value).

FWIW, I have a Berkeley DB interface at
http://users.actrix.co.nz/mycroft/cl.html

-- 
Cogito ergo I'm right and you're wrong.                 -- Blair Houghton

(setq reply-to
  (concatenate 'string "Paul Foley " "<mycroft" '(··@) "actrix.gen.nz>"))
From: Pascal Bourguignon
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <87znfdzfsr.fsf@thalassa.informatimago.com>
Paul Foley <···@below.invalid> writes:

> On 02 Nov 2003 17:06:26 +0100, Pascal Bourguignon wrote:
> 
> > access records  based on keys, you  can add an index  layer. Here, you
> > may consider using one of the  db libraries provided in unix (db, dbm,
> > gdbm), thru UFFI. They essentially  implement a hash table, a map (key
> -> value).
> 
> FWIW, I have a Berkeley DB interface at
> http://users.actrix.co.nz/mycroft/cl.html

Yep. I should have refered the OP to http://www.cliki.org/
where it's referenced.

-- 
__Pascal_Bourguignon__
http://www.informatimago.com/
From: Adam Warner
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <pan.2003.11.02.07.37.27.30350@consulting.net.nz>
Hi Harag,

> Ok this is a question you must get a lot but from my archive searches on
> google their are'nt any awnswers that satisfy or they cause more
> questions.
> 
> Reading and writing to a file in lisp (on windows) I get right with :io
> and :overwrite but it seems to be limited to inserting new data and I
> need to change data (please dont tell me to write it to a temporary file
> and then rename). I must add that I have only played with the string
> type 'reads' and 'writes'.

Sounds like you should be using :if-exists :supersede instead of
:if-exists :overwrite. Check out the documentation for OPEN in the
HyperSpec.

Regards,
Adam
From: Harag
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <bo2e8e$9ec$1@ctb-nnrp2.saix.net>
Thats where I started but superede destroys the data see below:

Hyperspec === The existing file is superseded; that is, a new file with the
same name as the old one is created. If possible, the implementation should
not destroy the old file until the new stream is closed.

CL-USER 45 : 1 > (setf path (make-pathname :name "eish2.iesh"))
#P"eish2.iesh"

CL-USER 46 : 1 > (setf str (open path :direction :io :if-exists :supersede))
#<STREAM::LATIN-1-FILE-STREAM C:\Documents and Settings\Phil\eish2.iesh>

CL-USER 47 : 1 > (read-line str)
Error: End of file while reading stream #<STREAM::LATIN-1-FILE-STREAM
C:\Documents and Settings\Phil\eish2.iesh>.
  1 (abort) Return to level 1.

CL-USER 48 : 2 > (close str)
T

When I check my file after this its empty. Am I doing something stupid??



"Adam Warner" <······@consulting.net.nz> wrote in message
··································@consulting.net.nz...
> Hi Harag,
>
> > Ok this is a question you must get a lot but from my archive searches on
> > google their are'nt any awnswers that satisfy or they cause more
> > questions.
> >
> > Reading and writing to a file in lisp (on windows) I get right with :io
> > and :overwrite but it seems to be limited to inserting new data and I
> > need to change data (please dont tell me to write it to a temporary file
> > and then rename). I must add that I have only played with the string
> > type 'reads' and 'writes'.
>
> Sounds like you should be using :if-exists :supersede instead of
> :if-exists :overwrite. Check out the documentation for OPEN in the
> HyperSpec.
>
> Regards,
> Adam
From: Adam Warner
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <pan.2003.11.02.09.34.23.860142@consulting.net.nz>
Hi Harag,

> Thats where I started but supersede destroys the data see below:

Yes :supersede will replace the original file. So I misunderstood your
intentions.

> Hyperspec === The existing file is superseded; that is, a new file with the
> same name as the old one is created. If possible, the implementation should
> not destroy the old file until the new stream is closed.
> 
> CL-USER 45 : 1 > (setf path (make-pathname :name "eish2.iesh"))
> #P"eish2.iesh"
> 
> CL-USER 46 : 1 > (setf str (open path :direction :io :if-exists :supersede))
> #<STREAM::LATIN-1-FILE-STREAM C:\Documents and Settings\Phil\eish2.iesh>
> 
> CL-USER 47 : 1 > (read-line str)
> Error: End of file while reading stream #<STREAM::LATIN-1-FILE-STREAM
> C:\Documents and Settings\Phil\eish2.iesh>.
>   1 (abort) Return to level 1.
> 
> CL-USER 48 : 2 > (close str)
> T
> 
> When I check my file after this its empty. Am I doing something stupid??

(Top posting)

I think I understand the problem and no you will not be able to insert
data. Let's create a file to operate upon:

(with-open-file (stream "test" :direction :output :if-exists :supersede)
  (format stream "~S ~S ~S~%" 'these 'are 'symbols))

The file "test" contains these characters:
these are symbols

Now let's modify test, first using overwrite:

(with-open-file (stream "test" :direction :io :if-exists :overwrite)
   (prin1 'a-new-symbol stream))

This works as expected, overwriting the start of the file:
a-new-symbolmbols

Then append:
(with-open-file (stream "test" :direction :io :if-exists :append)
  (prin1 'a-new-symbol stream))

Again we have the expected result:
a-new-symbolmbols
a-new-symbol

Let's try some bidirectional operations on the original file:
(with-open-file (stream "test" :direction :io :if-exists :overwrite)
  (print (read stream)) (prin1 'insert stream))

The symbol `these' was read and then the text `insert' written to
the stream.

The resulting output is(!!!):
these are symbols
insert

Which seems very odd to me as the file pointer should have been
positioned after `these'. Everyone, are CMUCL (and SBCL) broken?

Here's the expected result that I get with CLISP:
these insertmbols

There appears to be no way (at least portably) to insert data into the
stream. If you want to do that it appears you will have to have one file
that you :input from and one file that you :output to.

Regards,
Adam
From: Harag
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <bo2oo2$h2r$1@ctb-nnrp2.saix.net>
Thanks for your help Adam.

The result in LispWorks is the same as in CLISP my original file had white
spacesand newlines in it that was hiding the overwriting, doing a simpler
example, converting your example to read-line and write-line, showed up my
error. The other thing to notice/remember is that when it overwrites
'newline' counts for 2 characters ...;o}

Is there any way to reset the position of the 'read' on the stream, I see
you can check the position file-position?

Regards
Harag
From: Henrik Motakef
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <863cd7dn45.fsf@pokey.internal.henrik-motakef.de>
"Harag" <·········@psychedelic.co.za> writes:

> Is there any way to reset the position of the 'read' on the stream, I see
> you can check the position file-position?

From the hyperspec:

| Function FILE-POSITION
| 
| Syntax:
| file-position stream => position
| file-position stream position-spec => success-p
 
| Description:
| 
| Returns or changes the current position within a stream.
| 
| When position-spec is not supplied, file-position returns the current
| file position in the stream, or nil if this cannot be determined.
| 
| When position-spec is supplied, the file position in stream is set to
| that file position (if possible). file-position returns true if the
| repositioning is performed successfully, or false if it is not.
From: Harag
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <bo2qc5$i87$1@ctb-nnrp2.saix.net>
tah

missed that last part in the reading my lisp 'bible' (the book i use) is not
as clear on the issue as the spec ;o}

"Henrik Motakef" <············@henrik-motakef.de> wrote in message
···················@pokey.internal.henrik-motakef.de...
> "Harag" <·········@psychedelic.co.za> writes:
>
> > Is there any way to reset the position of the 'read' on the stream, I
see
> > you can check the position file-position?
>
> From the hyperspec:
>
> | Function FILE-POSITION
> |
> | Syntax:
> | file-position stream => position
> | file-position stream position-spec => success-p
>
> | Description:
> |
> | Returns or changes the current position within a stream.
> |
> | When position-spec is not supplied, file-position returns the current
> | file position in the stream, or nil if this cannot be determined.
> |
> | When position-spec is supplied, the file position in stream is set to
> | that file position (if possible). file-position returns true if the
> | repositioning is performed successfully, or false if it is not.
From: Frode Vatvedt Fjeld
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <2hu15nm1jl.fsf@vserver.cs.uit.no>
"Harag" <·········@psychedelic.co.za> writes:

> Is there any way to reset the position of the 'read' on the stream,
> I see you can check the position file-position?

If you give file-position a second argument, it will try to set the
file pointer to that position.

-- 
Frode Vatvedt Fjeld
From: Adam Warner
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <pan.2003.11.02.11.41.31.700627@consulting.net.nz>
Hi Harag,

> Thanks for your help Adam.
> 
> The result in LispWorks is the same as in CLISP my original file had white
> spaces and newlines in it that was hiding the overwriting, doing a
> simpler example, converting your example to read-line and write-line,
> showed up my error. The other thing to notice/remember is that when it
> overwrites 'newline' counts for 2 characters ...;o}
> 
> Is there any way to reset the position of the 'read' on the stream, I
> see you can check the position file-position?

Usually (file-position <stream name> <integer position>). Refer again to
the HyperSpec definition of file-position and file position designator
(you'll see you can also use the symbols :start and :end).

If "test" contains the string 0123456789 then observe these results:

(with-open-file (stream "test" :direction :io :if-exists :overwrite)
  (print (read-char stream))
  (file-position stream 5)
  (print (read-char stream))
  (write-char #\a stream)
  (file-position stream 6)
  (print (read-char stream)))

Output is:

#\0
#\5
#\a

I read a character from the stream and printed it out. I then set the file
position to 5 and read+printed the character at that position. This
incremented the file position by one so when I wrote the character #\a to
the stream it was at file position 6 and the file position incremented
again to 7. I then reset the file position to 6 so the newly written
character would be read again.

Regards,
Adam

PS: You may have noticed I always use the WITH-OPEN-FILE macro. I've never
had a reason to use OPEN explicitly.
From: Adam Warner
Subject: Re: NEWBIE: Read/write IO (Random Access)/:overwrite- replace?
Date: 
Message-ID: <pan.2003.11.05.23.46.00.955165@consulting.net.nz>
> (with-open-file (stream "test" :direction :output :if-exists :supersede)
>   (format stream "~S ~S ~S~%" 'these 'are 'symbols))
> 
> The file "test" contains these characters:
> these are symbols
> 
> Now let's modify test, first using overwrite:
> 
> (with-open-file (stream "test" :direction :io :if-exists :overwrite)
>    (prin1 'a-new-symbol stream))
> 
> This works as expected, overwriting the start of the file:
> a-new-symbolmbols
> 
> Then append:
> (with-open-file (stream "test" :direction :io :if-exists :append)
>   (prin1 'a-new-symbol stream))
> 
> Again we have the expected result:
> a-new-symbolmbols
> a-new-symbol
> 
> Let's try some bidirectional operations on the original file:
> (with-open-file (stream "test" :direction :io :if-exists :overwrite)
>   (print (read stream)) (prin1 'insert stream))
> 
> The symbol `these' was read and then the text `insert' written to
> the stream.
> 
> The resulting output is(!!!):
> these are symbols
> insert
> 
> Which seems very odd to me as the file pointer should have been
> positioned after `these'.

This is now confirmed fixed in CMUCL CVS (compare revisions 1.72 and 1.74
of code/fd-stream.lisp).

Regards,
Adam