From: Richard M Kreuter
Subject: Can DIRECTORY return wild pathnames?
Date: 
Message-ID: <87odnlmw2f.fsf@progn.net>
Can the elements of the list returned by DIRECTORY be wild pathnames?

Thanks,
RmK

From: Barry Margolin
Subject: Re: Can DIRECTORY return wild pathnames?
Date: 
Message-ID: <barmar-A9A5B7.22114722022007@comcast.dca.giganews.com>
In article <··············@progn.net>,
 Richard M Kreuter <·······@progn.net> wrote:

> Can the elements of the list returned by DIRECTORY be wild pathnames?

I don't think we thought to explicitly disallow it, but I can't see what 
sense it could make to do so.

-- 
Barry Margolin, ······@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
From: Richard M Kreuter
Subject: Re: Can DIRECTORY return wild pathnames?
Date: 
Message-ID: <87hctdmq7g.fsf@progn.net>
Barry Margolin <······@alum.mit.edu> writes:
>  Richard M Kreuter <·······@progn.net> wrote:
>
>> Can the elements of the list returned by DIRECTORY be wild pathnames?
>
> I don't think we thought to explicitly disallow it, but I can't see what 
> sense it could make to do so.

Do you mean that you can't see what sense it would make to explicitly
disallow it, or that you can't see what sense it would make to return
wild pathnames?  (I'm sorry, but I seem to be parsing that sentence
both ways.)

Thanks,
RmK
From: Barry Margolin
Subject: Re: Can DIRECTORY return wild pathnames?
Date: 
Message-ID: <barmar-D26217.22364523022007@comcast.dca.giganews.com>
In article <··············@progn.net>,
 Richard M Kreuter <·······@progn.net> wrote:

> Barry Margolin <······@alum.mit.edu> writes:
> >  Richard M Kreuter <·······@progn.net> wrote:
> >
> >> Can the elements of the list returned by DIRECTORY be wild pathnames?
> >
> > I don't think we thought to explicitly disallow it, but I can't see what 
> > sense it could make to do so.
> 
> Do you mean that you can't see what sense it would make to explicitly
> disallow it, or that you can't see what sense it would make to return
> wild pathnames?  (I'm sorry, but I seem to be parsing that sentence
> both ways.)

I meant I can't see the sense in returning wild pathnames.  The point of 
DIRECTORY is to take a wild pathname and enumerate all the files that 
match it.

The specification says that the returned list of pathnames are the 
truenames of the matching files.  Is there any way in which a wild 
pathname could conceivably be the truename of a specific file?

From: Rob Warnock
Subject: Re: Can DIRECTORY return wild pathnames?
Date: 
Message-ID: <ZJ-dnfYxJ4gQhX3YnZ2dnUVZ_qunnZ2d@speakeasy.net>
Barry Margolin  <······@alum.mit.edu> wrote:
+---------------
| Is there any way in which a wild pathname could
| conceivably be the truename of a specific file?
+---------------

Try this with your favorite shell & CL:

    $ echo "harmless, really" >badnamesortof
    $ echo "bad file, BAD!" >badname\*
    $ ls -l badname*
    -rw-r--r--  1 rpw3  rpw3  15 Feb 24 02:30 badname*
    -rw-r--r--  1 rpw3  rpw3  17 Feb 24 02:36 badnamesortof
    $ cmu
    cmu> (pathname "badname*")

    #p"badname*"
    cmu> (describe *)

    #p"badname*" is a structure of type PATHNAME.
    HOST: #<COMMON-LISP::UNIX-HOST>.
    DEVICE: NIL.
    DIRECTORY: NIL.
    NAME: #<COMMON-LISP::PATTERN "badname" :MULTI-CHAR-WILD>.
    TYPE: NIL.
    VERSION: :NEWEST.
    cmu> (directory **)

    (#p"/u/rpw3/badname\\*" #p"/u/rpw3/badnamesortof")
    cmu> (directory (car *))

    (#p"/u/rpw3/badname\\*")
    cmu> (truename (car *))

    #p"/u/rpw3/badname\\*"
    cmu> (with-open-file (s *)
           (loop for line = (read-line s nil nil)
                 while line do
                   (format t "~a~%" line)))
    bad file, BAD!
    NIL
    cmu> (describe **)

    #p"/u/rpw3/badname\\*" is a structure of type PATHNAME.
    HOST: #<COMMON-LISP::UNIX-HOST>.
    DEVICE: NIL.
    DIRECTORY: (:ABSOLUTE "u" "rpw3").
    NAME: "badname*".
    TYPE: NIL.
    VERSION: :NEWEST.
    cmu> 

Hmmmm... That's not actually a "wild pathname", is it?

Though note that the *printed* representations of both
the "real" wild pathname and the individual file were the
same, and if one were particularly sloppy in subsequent
manipulations one might get back to a wild pathname by
mistake, e.g.:

    cmu> (pathname (pathname-name ***))

    #p"badname*"
    cmu> (describe *)

    #p"badname*" is a structure of type PATHNAME.
    HOST: #<COMMON-LISP::UNIX-HOST>.
    DEVICE: NIL.
    DIRECTORY: NIL.
    NAME: #<COMMON-LISP::PATTERN "badname" :MULTI-CHAR-WILD>.
    TYPE: NIL.
    VERSION: :NEWEST.
    cmu> 

Oops.


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Richard M Kreuter
Subject: Re: Can DIRECTORY return wild pathnames?
Date: 
Message-ID: <87vehrkwoe.fsf@progn.net>
Barry Margolin <······@alum.mit.edu> writes:

> The specification says that the returned list of pathnames are the
> truenames of the matching files.  Is there any way in which a wild
> pathname could conceivably be the truename of a specific file?

Not on any file system I've ever heard of, no.  I don't see anything
that says that a truename can't be a wild pathname, however.

For my part, I was hoping there'd be some way to infer from the spec
that returning wild pathnames from DIRECTORY would be forbidden.  As
it is, the DIRECTORY in several implementations returns pathnames that
the implementation construes as wild (which prevents you from opening
them), or doesn't construe as wild but nonetheless treats as wild
under certain operators (which is just plain weird) if some files'
names resemble wild namestrings.

--
RmK
From: Barry Margolin
Subject: Re: Can DIRECTORY return wild pathnames?
Date: 
Message-ID: <barmar-D88EB5.21055524022007@comcast.dca.giganews.com>
In article <··············@progn.net>,
 Richard M Kreuter <·······@progn.net> wrote:

> Barry Margolin <······@alum.mit.edu> writes:
> 
> > The specification says that the returned list of pathnames are the
> > truenames of the matching files.  Is there any way in which a wild
> > pathname could conceivably be the truename of a specific file?
> 
> Not on any file system I've ever heard of, no.  I don't see anything
> that says that a truename can't be a wild pathname, however.
> 
> For my part, I was hoping there'd be some way to infer from the spec
> that returning wild pathnames from DIRECTORY would be forbidden.  As
> it is, the DIRECTORY in several implementations returns pathnames that
> the implementation construes as wild (which prevents you from opening
> them), or doesn't construe as wild but nonetheless treats as wild
> under certain operators (which is just plain weird) if some files'
> names resemble wild namestrings.

But it's not returning pathnames with :WILD or :WILD-INFERIORS in it, is 
it?  That's the only thing that the spec considers to be a "wild 
pathname", I believe.

If the file system's native syntax is ambiguous, that's a separate issue.

From: Richard M Kreuter
Subject: Re: Can DIRECTORY return wild pathnames?
Date: 
Message-ID: <87r6seklku.fsf@progn.net>
Barry Margolin <······@alum.mit.edu> writes:
> In article <··············@progn.net>, Richard M Kreuter <·······@progn.net> wrote:
>
>> [T]he DIRECTORY in several implementations returns pathnames that
>> the implementation construes as wild... or... treats as wild... if
>> some files' names resemble wild namestrings.
>
> But it's not returning pathnames with :WILD or :WILD-INFERIORS in
> it, is it?  That's the only thing that the spec considers to be a
> "wild pathname", I believe.

:WILD and :WILD-INFERIORS are the only portable wild component values,
but 19.2.2.3 explains that other things can be wild components.

Further, I think that an argument can be made that wildness is
supposed to be equivalent to "might match more than one pathname", no
matter what the wild components are: the glossary defines a wild
pathname as a pathname that might match several pathnames.  If that
definition is enough to say that the set of wild pathnames is
equivalent to those that "might match more than one pathname", then
the argument is done.

However, if the definition is only a one-way implication from "wild"
to "might match more than one pathname", then since the interface to
the matching relation, PATHNAME-MATCH-P, is supposed to be consistent
with DIRECTORY, and since DIRECTORY is intended to return at most one
pathname when its argument is non-wild, the intent seems to be that a
non-wild pathname is not supposed to match more than one pathname
under the matching relation.

So wild pathnames are supposed to be all and only those pathnames that
can match more than one pathname, no matter what the wild components
are.  At least that's the intent, right?

> If the file system's native syntax is ambiguous, that's a separate
> issue.

What ambiguity are you referring to?

On Unix, the notation for denoting variably-many files lets you
specify a verbatim character by escaping it.  This notation is
unambiguous, and can express any possible filename in a determinate,
non-wild way.  (A quick googling suggests that different but analogous
issues exist on Windows, and that different but analogous escaping may
be available there, in some cases.)  Given that most implementations
on Unix seem to have their namestrings resemble Unix's
variably-many-files notation (more or less), it would be reasonably
principled for implementations to use that notation's escaping for
namestrings.  This way, for instance, a file named "foo*bar" would be
represented in a namestring as "foo\\*bar", and accordingly in a
pathname.
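
As a rough sketch of that escaping step, assuming a backslash escape
and assuming that *, ?, [ and backslash are the characters that need
protecting (both are assumptions, since the set of special characters
is implementation-dependent, and ESCAPE-WILD-CHARS is a hypothetical
helper, not part of any implementation):

```lisp
(defun escape-wild-chars (filename &optional (specials "*?[\\"))
  "Return FILENAME with each character found in SPECIALS preceded by a
backslash.  SPECIALS defaults to an assumed set: * ? [ and backslash."
  (with-output-to-string (out)
    (loop for ch across filename
          do (when (find ch specials)
               (write-char #\\ out))   ; emit the escape character first
             (write-char ch out))))    ; then the character itself

;; The example from the text: a verbatim file name containing a star.
(escape-wild-chars "foo*bar")   ; => "foo\\*bar" as printed by the REPL
```

Whether the escaped string then parses to a non-wild pathname depends
on the implementation's namestring parser, so the result would still
need checking with WILD-PATHNAME-P.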

But maybe you're referring to the ambiguity that comes from having the
same string mean different things to different routines (e.g., open()
and readdir() versus glob()).  That is ugly and unfortunate, but it's
just an implementation issue: if an implementation provides a
namestring notation that's supposed to resemble and denote similarly
to routine A's argument notation and not routine B's argument
notation, shouldn't the implementation be expected to translate
arguments from A's notation into something that appropriately
preserves the semantics when calling B, and vice versa?

Of course I'm not suggesting that CL should specify implementation
details that way.  The trouble is that most implementations don't seem
to notice that such translations are called for if they support fancy
wildcards or namestring syntax for :WILD and :WILD-INFERIORS.
Consequently, these implementations are unable to denote a range of
corner-case filenames usably (i.e., with a pathname you can pass to
OPEN), or else denote these filenames by having arguably-nonconformant
definitions of wild pathname (e.g., by having pathnames for which
DIRECTORY returns several pathnames and which match several non-wild
pathnames under PATHNAME-MATCH-P, but that WILD-PATHNAME-P says aren't
wild and that can be passed to OPEN).

What made me start this thread was a program that was trying to
process files written by a web spider, which spider created some files
with question marks in the name.  So even though the files existed,
would have been openable according to the operating system, and were
listed by the implementation, the implementation couldn't operate on
these files, because of a failure to distinguish verbatim file name
notation and the implementation's namestring notation:

(let ((pathnames (directory ...)))
  (dolist (path pathnames)
    (with-open-file (s path)
       ...)))
!! Error: pathname #P"foo?bar=baz" is wild.

It seemed to me that if it could be inferred that DIRECTORY had to
produce non-wild pathnames, then implementors would be forced to
recognize that different contexts use different notations, and so the
possibility that an implementation could choke on its own directory
listings would be precluded.
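
Until then, about the best a portable program can do is notice the
problem before OPEN signals an error.  A defensive sketch (the
function name is made up, and it only catches pathnames that
WILD-PATHNAME-P admits are wild, not the ones an implementation treats
as wild without saying so):

```lisp
(defun split-directory-listing (pathnames)
  "Return two values: the pathnames that are not wild according to
WILD-PATHNAME-P, and the ones the implementation construes as wild."
  (loop for path in pathnames
        if (wild-pathname-p path)
          collect path into wild
        else
          collect path into openable
        finally (return (values openable wild))))

;; Example with hand-built pathnames standing in for a DIRECTORY result.
(multiple-value-bind (openable wild)
    (split-directory-listing (list (make-pathname :name "index")
                                   (make-pathname :name :wild)))
  (format t "~D openable, ~D wild~%" (length openable) (length wild)))
```

The second value at least gives the program a list of files to report
or to handle by some implementation-specific route, instead of dying
in the middle of the loop.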

--
RmK
From: Vassil Nikolov
Subject: Re: Can DIRECTORY return wild pathnames?
Date: 
Message-ID: <yy8v7iu815cx.fsf@eskimo.com>
On Thu, 22 Feb 2007 21:44:56 -0500, Richard M Kreuter <·······@progn.net> said:

| Can the elements of the list returned by DIRECTORY be wild pathnames?

  I don't know if this is a case that you had in mind, but... here there
  be dragons:

     $ touch "*"  #in an empty directory
     $ clisp -q
    [1]> (wild-pathname-p (first (directory #P"*")))
    T

  ---Vassil.

-- 
Our programs do not have bugs; it is just that the users' expectations
differ from the way they are implemented.