From: Richard M Kreuter
Subject: Why doesn't (open ... :if-exists) support :truncate?
Date: 
Message-ID: <87ir3ky19u.fsf@progn.net>
How come the standard OPEN ... :IF-EXISTS actions don't include an
option for truncating the file when the stream is created?  The LispM
seems to have supported a :TRUNCATE here, implementations have always
had to support direct I/O in order to do :OVERWRITE or :APPEND, and
truncation is the natural way to open files under Unix at least; so I
wonder why there isn't a standard way to ask for this behavior.

Thanks,
RmK

From: Steven M. Haflich
Subject: Re: Why doesn't (open ... :if-exists) support :truncate?
Date: 
Message-ID: <BdK3j.78277$Um6.75251@newssvr12.news.prodigy.net>
Richard M Kreuter wrote:
> How come the standard OPEN ... :IF-EXISTS actions don't include an
> option for truncating the file when the stream is created?  The LispM
> seems to have supported a :TRUNCATE here, implementations have always
> had to support direct I/O in order to do :OVERWRITE or :APPEND, and
> truncation is the natural way to open files under Unix at least; so I
> wonder why there isn't a standard way to ask for this behavior.

Have you considered :supersede ?  It isn't exactly the same as :truncate
(on certain operating systems, or when some processes have the file open)
but the usual behavior will be the same.

Your "direct I/O" reference doesn't really mean anything.  Perhaps you
should explain.

From: Kaz Kylheku
Subject: Re: Why doesn't (open ... :if-exists) support :truncate?
Date: 
Message-ID: <e16e4a98-ea8f-4115-87ee-3b015c0f655b@d27g2000prf.googlegroups.com>
On Nov 29, 6:05 pm, "Steven M. Haflich" <····@alum.mit.edu> wrote:
> Have you considered :supersede ?  It isn't exactly the same as :truncate
> (on certain operating systems, or some processes have the file open) but
> the usual behavior will be the same.

Even if other processes don't have the file open, the effect of
creating a new inode might be undesirable. You might be breaking a
hard link relationship.

Sometimes this is a good thing (it's basically doing copy-on-write),
sometimes it's not a good thing. I used to clone CVS repositories
using hard link farming. It was a good thing that the CVS server would
create a new object when re-writing a ,v history file, providing the
copy-on-write semantics needed to make the space-saving sharing look
invisible: commits in one copy of the repo were not seen in the other.

In other kinds of situations, you may want the same object, for various
good reasons.

If there were a :TRUNCATING-OVERWRITE option, the set of possibilities
covered would be more complete.

Perhaps :NEW-VERSION could be mapped to doing a truncating overwrite
on a non-versioned filesystem. A new version means writing new data
within the same object; only in this case, the old version is lost. :)
There is that extra requirement to have :NEWEST as the version
component of the path, which is a minor annoyance.

From: Richard M Kreuter
Subject: Re: Why doesn't (open ... :if-exists) support :truncate?
Date: 
Message-ID: <87bq9by96b.fsf@progn.net>
"Steven M. Haflich" <···@alum.mit.edu> writes:
> Richard M Kreuter wrote:
>> How come the standard OPEN ... :IF-EXISTS actions don't include an
>> option for truncating the file when the stream is created?
>
> Have you considered :supersede ?  It isn't exactly the same as
> :truncate (on certain operating systems, or some processes have the
> file open) but the usual behavior will be the same.

Well, the CLHS entry for OPEN says that when :IF-EXISTS is :SUPERSEDE,
"a new file with the same name as the old one is created".  ISTM that
there's some difference between truncating a file and replacing one
file with another (using something like POSIX rename(), say):

* On a file system with hard links (Unix and NTFS, say), given a file
  with multiple links, replacing one of the file's names with a new
  file "breaks" one of the links, and the other file names will still
  be associated with the old contents.  Opening the file in a
  truncating way guarantees that all the file's truenames will be
  associated with the new contents.

* With Unix's screwy file permissions, a user can write to a file he
  can't delete, and can delete files he can't write to.  So the ways
  that a process can fail to create a new file or replace one file
  with another are mostly disjoint from the ways that a process can
  fail to open a file for truncation.

* Again under Unix, it's tricky and not always possible to ensure that
  a new file has metadata functionally equivalent to an existing file's.
  For example, it's not usually possible for an unprivileged process
  to change the ownership of a file, and implementations might miss
  some exotic metadata like ACLs.  By contrast, Unix's truncating open
  only affects metadata describing file contents (file length, write
  date).  So software that depends on file metadata stands a chance of
  breaking under file replacing schemes.

As a result, ISTM that not having an explicit way to ask for
truncation leaves implementation-independent Lisp unable to
deliberately interoperate with Unix software in various details.
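
The hard-link point above can be checked at the operating-system level.
What follows is only a sketch in Python rather than Lisp (no portable CL
spelling of a truncating open exists, which is the complaint), and it
assumes a POSIX-like file system where os.link works.  The function name
and file names are made up for illustration.

```python
import os
import tempfile

def compare_truncate_and_replace():
    """Return what a second hard link sees after each update strategy."""
    results = {}
    with tempfile.TemporaryDirectory() as d:
        for strategy in ("truncate", "replace"):
            name = os.path.join(d, strategy + ".txt")
            alias = os.path.join(d, strategy + ".alias")
            with open(name, "w") as f:
                f.write("old")
            os.link(name, alias)  # second name for the same inode

            if strategy == "truncate":
                # Mode "w" opens with O_TRUNC: same inode, contents emptied.
                with open(name, "w") as f:
                    f.write("new")
            else:
                # Write a fresh file, then rename() it over the old name.
                tmp = name + ".tmp"
                with open(tmp, "w") as f:
                    f.write("new")
                os.replace(tmp, name)

            with open(alias) as f:
                alias_contents = f.read()
            same_inode = os.stat(name).st_ino == os.stat(alias).st_ino
            results[strategy] = (alias_contents, same_inode)
    return results
```

With the truncating open the alias reads "new" and both names still share
one inode; with the replacing scheme the alias keeps reading "old" and
the two names now refer to different inodes.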

(Several implementations implement :SUPERSEDE as a truncating open on
Unix.  The conformance of this point seems debatable, but for the
detail that the term "file" is implementation-defined in CL, so that
the implementor can stipulate that their OPEN is conforming.  But this
almost makes things worse: users might assume that :SUPERSEDE is
supposed to mean a truncating open, and their programs will lose on
implementations that do it differently.  So again I think it would be
worthwhile to have a way to ask for truncating explicitly.)

> Your "direct I/O" reference doesn't really mean anything.  Perhaps you
> should explain.

I was trying to come up with a story according to which some Lisp
could do all the OPEN :IF-EXISTS actions, but be unable to do a
truncating open.  The entries for :OVERWRITE and :APPEND say "[o]utput
operations on the stream destructively modify the existing file",
which I take to mean that the existing file is to be scribbled on, and
not replaced by a new file.  I can't think of a way to do this without
at some point doing I/O directly to the existing file.  By contrast,
:NEW-VERSION, :SUPERSEDE, :RENAME, :RENAME-AND-DELETE need not do I/O
to an existing file (unless a file system requires I/O to a file in
order to replace the file with another one, I suppose).

--

From: David Lichteblau
Subject: Re: Why doesn't (open ... :if-exists) support :truncate?
Date: 
Message-ID: <slrnfl0klg.jt0.usenet-2006@radon.home.lichteblau.com>
On 2007-11-30, Richard M Kreuter <·······@progn.net> wrote:
> Well, the CLHS entry for OPEN says that when :IF-EXISTS is :SUPERSEDE,
> "a new file with the same name as the old one is created".  ISTM that
> there's some difference between truncating a file and replacing one
> file with another (using something like POSIX rename(), say):

Indeed.  I was quite stumped when my SBCL heap got corrupted while
writing new files using :SUPERSEDE -- I had those files mmap()ed into
dynamic space... :-)  Of course, it is useful to have an
implementation-specific way to do O_TRUNC integrated into OPEN, but
:SUPERSEDE should not be it.

SBCL's :APPEND has a similar bug.  The spec says "The file pointer is
initially positioned at the end of the file", but SBCL maps it to
O_APPEND instead.


d.

From: Richard M Kreuter
Subject: Re: Why doesn't (open ... :if-exists) support :truncate?
Date: 
Message-ID: <877ijzy1z0.fsf@progn.net>
David Lichteblau <···········@lichteblau.com> writes:

> SBCL's :APPEND has a similar bug.  The spec says "The file pointer is
> initially positioned at the end of the file", but SBCL maps it to
> O_APPEND instead.

Well clearly the file pointer is supposed to be positioned to the end
of the file, but I don't see anything that says the implementation is
forbidden from opening the file with O_APPEND-y semantics, only that
the file is destructively modified.
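
The two readings only differ once the stream is repositioned.  As a
rough illustration at the system-call level (Python's os module used as
a thin wrapper over open(2), lseek(2) and write(2); the helper name is
hypothetical, and POSIX O_APPEND behavior is assumed):

```python
import os
import tempfile

def write_after_rewind(extra_flags):
    """Open at end-of-file, seek back to offset 0, write, return the file."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "log.txt")
        with open(path, "wb") as f:
            f.write(b"abcdef")
        fd = os.open(path, os.O_WRONLY | extra_flags)
        try:
            os.lseek(fd, 0, os.SEEK_END)  # "initially positioned at the end"
            os.lseek(fd, 0, os.SEEK_SET)  # the caller then repositions
            os.write(fd, b"XY")
        finally:
            os.close(fd)
        with open(path, "rb") as f:
            return f.read()
```

Here write_after_rewind(0) yields b"XYcdef", while
write_after_rewind(os.O_APPEND) yields b"abcdefXY": with O_APPEND every
write lands at the end of the file no matter where the file pointer was
moved.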

(Also, if you're interested in being able to do random-access writes
while starting at the end of the file, I would expect this to do the
trick on Unix file systems:

  (with-open-file (stream pathname :direction :output
                                   :if-exists :overwrite)
     ;; Start writing at the current end of the existing file.
     (file-position stream :end)
     ... stuff ...)

This might not work on some other file systems though.  For example,
while the Unix file system ensures that when you lseek to or past the
end of a file and then do some writing, the file will automatically be
resized (possibly sparsely), I think I've read that some older file
systems required you to manually allocate space for files; anyway you
can imagine a file system like that.  Maybe a CL implementation that
talked to such a file system would constrain :OVERWRITE to not resize
the file, but have :APPEND do the resizing implicitly.)
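
For the Unix half of that claim, a small sketch (Python again, helper
name invented for illustration) of seeking past the end of an existing
file and writing there; whether the gap is actually stored sparsely is
up to the file system:

```python
import os
import tempfile

def write_past_end(gap):
    """Write one byte 'gap' bytes beyond the end of a 3-byte file."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "data.bin")
        with open(path, "wb") as f:
            f.write(b"abc")
        # "r+b" opens the existing file for update without truncating it.
        with open(path, "r+b") as f:
            f.seek(0, os.SEEK_END)
            f.seek(gap, os.SEEK_CUR)  # move past the current end
            f.write(b"z")
        with open(path, "rb") as f:
            return f.read()
```

write_past_end(4) returns b"abc\x00\x00\x00\x00z": the file grew on its
own, and the skipped range reads back as zero bytes.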

--

From: Rob Warnock
Subject: Re: Why doesn't (open ... :if-exists) support :truncate?
Date: 
Message-ID: <-KGdnUE5mLW-Rc3anZ2dnUVZ_uOmnZ2d@speakeasy.net>
Richard M Kreuter  <·······@progn.net> wrote:
+---------------
| "Steven M. Haflich" <···@alum.mit.edu> writes:
| > Your "direct I/O" reference doesn't really mean anything.
| > Perhaps you should explain.
| 
| I was trying to come up with a story according to which some Lisp
| could do all the OPEN :IF-EXISTS actions, but be unable to do a
| truncating open.  The entries for :OVERWRITE and :APPEND say "[o]utput
| operations on the stream destructively modify the existing file",
| which I take to mean that the existing file is to be scribbled on, and
| not replaced by a new file.  I can't think of a way to do this without
| at some point doing I/O directly to the existing file.  By contrast,
| :NEW-VERSION, :SUPERSEDE, :RENAME, :RENAME-AND-DELETE need not do I/O
| to an existing file (unless a file system requires I/O to a file in
| order to replace the file with another one, I suppose).
+---------------

Don't use the term "Direct I/O" for this -- that term is already
in standard use in the Unix/Linux/POSIX world [following SGI's
original use of the term in the XFS filesystem on Irix] for file
I/O which goes "directly" from (to) the user program's address
space to (from) the I/O device without going through the operating
system's buffer cache. E.g., from "man 2 open" on FreeBSD:

    O_DIRECT may be used to minimize or eliminate the cache effects
    of reading and writing.  The system will attempt to avoid caching
    the data you read or write.  If it cannot avoid caching the data,
    it will minimize the impact the data has on the cache.  Use of this
    flag can drastically reduce performance if not used with care.

and on Linux:

    O_DIRECT
      Try to minimize cache effects of the I/O to and from this  file.
      In  general  this  will degrade performance, but it is useful in
      special situations, such  as  when  applications  do  their  own
      caching.   File I/O is done directly to/from user space buffers.
      The I/O is synchronous, i.e., at the completion of  the  read(2)
      or  write(2) system call, data is guaranteed to have been trans-
      ferred.  Under Linux 2.4 transfer sizes, and  the  alignment  of
      user buffer and file offset must all be multiples of the logical
      block size of the file system.  Under  Linux  2.6  alignment  to
      512-byte boundaries suffices.
      A  semantically similar interface for block devices is described
      in raw(8).

"Direct I/O" usually implies (but does not guarantee) "zero-copy DMA".


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Steven M. Haflich
Subject: Re: Why doesn't (open ... :if-exists) support :truncate?
Date: 
Message-ID: <VB54j.27678$lD6.1770@newssvr27.news.prodigy.net>
Richard M Kreuter wrote:
> "Steven M. Haflich" <···@alum.mit.edu> writes:
>> Richard M Kreuter wrote:
>>> How come the standard OPEN ... :IF-EXISTS actions don't include an
>>> option for truncating the file when the stream is created?
>> Have you considered :supersede ?  It isn't exactly the same as
>> :truncate (on certain operating systems, or some processes have the
>> file open) but the usual behavior will be the same.
> 
> Well, the CLHS entry for OPEN says that when :IF-EXISTS is :SUPERSEDE,
> "a new file with the same name as the old one is created".  ISTM that
> there's some difference between truncating a file and replacing one
> file with another (using something like POSIX rename(), say):

Yes, that is why I wrote that it was not _exactly_ the same.

I won't argue whether Unix's file permissions are "screwy" (you should
go back half a human lifespan and read some of the early descriptions
behind the design of Unix).  That has little bearing on your literal
question, which I will now proceed to answer:

    Why doesn't (open ... :if-exists) support :truncate?

The ANSI CL standardization effort was conceived as a way of unifying
contemporary (ca 1988) Lisp programming from a number of divergent
dialects and running on a number of very different platforms, from Lisp
machines and Unix boxes to wrist watches.  The filesystem interface (not
to mention the pathname system) was something of a reasonable subset of
possible functionality that would be implementable on most platforms of
interest.  At the time, making the interface capable of every nuance
that any particular operating system could support wasn't perceived as
crucial.  Language portability was considered essential.

If you think :truncate is important, consider whether the on-disk
file system should support a Lisp property list on every inode.  The
Lisp Machine filesystem did, and if all file systems supported this the
world of files would be a stranger but more interesting place.  (Ken 
Thompson: "We have persistent objects, they're called files.")

Any implementation could legally be extended to support :truncate on
operating systems that can implement it.  But I suspect it couldn't be
implemented portably on every OS of interest nearly 20 years ago.
Various implementations have extensions that provide more complete and
more modern control of what most OS file systems now provide, but it
remains difficult even to paper over the operational differences between
*nix and Windows.

The observation is that it is not always possible to make a high-level
interface to low-level capabilities.  This should not be surprising.

From: Richard M Kreuter
Subject: Re: Why doesn't (open ... :if-exists) support :truncate?
Date: 
Message-ID: <87y7cdwt1v.fsf@progn.net>
"Steven M. Haflich" <···@alum.mit.edu> writes:
> Richard M Kreuter wrote:
>    Why doesn't (open ... :if-exists) support :truncate?
>
> The ANSI CL standardization effort was conceived as a way of
> unifying contemporary (ca 1988) Lisp programming... it wasn't
> perceived as a crucial problem making the interface capable of every
> nuance that any particular operating system could support.

Yes, I know.  I wondered, however, whether there was a particular
reason that CL lacked a truncating open (or, indeed, any standard way
to truncate an existing file).  As it turns out, others had the same
question even before ANSI standardization [1], so perhaps file
truncation wasn't too unusual a feature back then.

> Any implementation could legally be extended to support :truncate on
> operating systems that can implement it. 

And yet nobody seems to.  Instead, Allegro, CMUCL, SBCL, Clisp, and ECL
implement :SUPERSEDE as a truncating open on the Linux hosts where
I've checked.  (OpenMCL creates a new file; I don't have access to
other implementations at the moment.)  ISTM slightly weird that so
many implementations do something that's contrary to the
straightforward interpretation of the standard, inasmuch as users
aren't really justified in expecting this behavior from :SUPERSEDE
from the language in ANSI.

> Various implementations have extensions that provide more complete
> and more modern control of what most OS file systems now provide,
> but it remains different even papering over the operational
> differences between *nix and Windows.

Of course.  I'd like to note, however, that while most file system
features are easy enough to take advantage of through something like
an FFI, OPEN is one of Lisp's "high level primitives": it has to both
deal with an operating system and construct a suitable Lisp
file-stream.  So even where calling into C is available, rolling your
own OPEN nevertheless remains implementation-dependent territory, and
some implementations don't have a public interface for making Lisp
streams from things like file descriptors or file handles.

--

[1] I found the following a couple nights ago at

http://www.saildart.org/prog/LSP/html/019982?6740,60510

09-Aug-83  0809	@USC-ECL,@··········@SCRC-TENEX 	File opening, :TRUNCATE    
Received: from USC-ECL by SU-AI with TCP/SMTP; 9 Aug 83  08:09:27 PDT
Received: from MIT-XX by USC-ECL; Tue 9 Aug 83 08:07:50-PDT
Received: from SCRC-BEAGLE by SCRC-SPANIEL with CHAOS; Tue 9-Aug-83 11:04:41-EDT
Date: Tuesday, 9 August 1983, 11:04-EDT
From: Bernard S. Greenberg <BSG at SCRC-TENEX>
Subject: File opening, :TRUNCATE
To: Common-Lisp%SU-AI at USC-ECL
Cc: File-protocol at SCRC-TENEX

Was it ever proposed or rejected that there be a :IF-EXISTS
:TRUNCATE, being like :OVERWRITE, except that the file content
is effectively set to empty before writing starts?  There is
need for such a thing, and it is a natural behavior on many
systems.  

The default :IF-EXISTS of :ERROR is not useful on file systems
that do not have versions (note that a version of :NEWEST
changes the default to :NEW-VERSION).   We propose that the
default :IF-EXISTS be changed to :SUPERSEDE for file systems
that do not have versions.   

Is there any reason why :IF-EXISTS is ignored in :OUTPUT/:IO
instead of generating an error?

From: Mark H.
Subject: Re: Why doesn't (open ... :if-exists) support :truncate?
Date: 
Message-ID: <011c2fe6-e5d7-43f6-9b12-88aedc0ef771@s19g2000prg.googlegroups.com>
The ECL mailing list had a discussion on this topic a couple weeks
back.  I think somebody patched ECL so that :SUPERSEDE has semantics
like a transaction:  it writes to a temp file, and only "commits"
(by shifting hard links around so that the original file's name
links to the new file) when the temp file is closed.  This would
ensure, for example, that the original file would never contain
incomplete data.
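
I haven't looked at the actual ECL patch, so the following is not it,
just a sketch of the general write-to-temp-then-commit idea in Python.
It assumes POSIX rename() semantics via os.replace, and the name
superseding_open is invented for illustration.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def superseding_open(path):
    """Yield a text stream; 'path' changes only if the body finishes cleanly."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename stays on
    # one file system.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as stream:
            yield stream
        os.replace(tmp, path)  # "commit": rename over the old name
    except BaseException:
        os.unlink(tmp)  # abandon the partial file; the original is untouched
        raise
```

Used as "with superseding_open(p) as s: ...", an error raised inside the
body leaves the old contents of p in place, and a clean exit swaps in
the new file in one step.  Note that this is a replacing scheme, so it
has the hard-link and metadata consequences discussed earlier in the
thread.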

mfh