Pathnames (beginner issues)

From: grackle
Subject: Pathnames (beginner issues)
Date: Sat, 16 Dec 2006 19:33:47 +0000
Message-ID: <1166297627.532793.213090@n67g2000cwd.googlegroups.com>

I just learned not to depend on the #P printed version of pathnames.
In the instance that stung me, SBCL on Linux requires that :device must
be nil (as far as I can tell) when accessing the filesystem.
Specifying "" or "/" as the device component creates an unusable
pathname that looks correct when printed.  I ran into this when
debugging some code that specified "/" as the device component when
constructing a Unix pathname.  :device "/" evidently works on some
implementations of Common Lisp on Unix, but not others.

Moral #1:  If a pathname looks good but doesn't work, examine its
components using pathname-host, pathname-device, etc.

Moral #2:  If you want to specify an absolute pathname, it's probably
more portable to use (pathname <string>) than to assemble a pathname
from pieces.  When assembling a new pathname using components from an
existing pathname, make-pathname is the more portable solution.
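
A minimal sketch of Moral #2, assuming Unix-style namestring syntax.
The file names are made up for illustration, and namestring parsing
is implementation-defined, so treat this as one plausible usage
rather than a guarantee:

```lisp
;; Let the implementation parse a complete namestring instead of
;; guessing at :host and :device values by hand.
(defparameter *config* (pathname "/etc/myapp/config.lisp"))

;; Derive a sibling pathname: every component not given explicitly
;; is taken from the existing pathname via :defaults.
(defparameter *backup* (make-pathname :type "bak" :defaults *config*))

;; Inspect the components rather than trusting the #P printout.
(list (pathname-device *backup*)
      (pathname-directory *backup*)
      (pathname-name *backup*)
      (pathname-type *backup*))
```

Only the type is replaced here; the host, device and directory are
whatever the implementation chose when it parsed the string.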

Logical pathnames are puzzling.  They seem to be a mechanism around
which a portability layer for mapping strings to pathnames can be
built, which is fine and dandy, but why in the world was such a thing
standardized?  Was it regarded as indispensable for day-to-day
programming, like regexes are today?

-David

From: Pascal Bourguignon
Subject: Re: Pathnames (beginner issues)
Date: Sat, 16 Dec 2006 20:46:08 +0000
Message-ID: <871wmzh8mn.fsf@thalassa.informatimago.com>

"grackle" <···········@gmail.com> writes:

> I just learned not to depend on the #P printed version of pathnames.
> In the instance that stung me, SBCL on Linux requires that :device must
> be nil (as far as I can tell) when accessing the filesystem.
> Specifying "" or "/" as the device component creates an unusable
> pathname that looks correct when printed.  I ran into this when
> debugging some code that specified "/" as the device component when
> constructing a Unix pathname.  :device "/" evidently works on some
> implementations of Common Lisp on Unix, but not others.
>
> Moral #1:  If a pathname looks good but doesn't work, examine its
> components using pathname-host, pathname-device, etc.

Indeed.  I use:

(defun print-pathname (p)
  (format t "~&~{~{~@(~9A~) : ~S~&~}~}"
          (mapcar (lambda (name field) (list name (funcall field p)))
                  '(host device directory name type version)
                  '(pathname-host pathname-device pathname-directory
                    pathname-name pathname-type pathname-version)))
  p)


(defun compare-pathnames (p1 p2)
  (flet ((compare (name field)
           (unless (equal (funcall field p1) (funcall field p2))
             (format t "~&~A DIFFERENT: ~A /= ~A~%"
                     name (funcall field p1) (funcall field p2)))))
    (compare 'host      (function pathname-host))
    (compare 'device    (function pathname-device))
    (compare 'directory (function pathname-directory))
    (compare 'name      (function pathname-name))
    (compare 'type      (function pathname-type))
    (compare 'version   (function pathname-version))))



> Moral #2:  If you want to specify an absolute pathname, it's probably
> more portable to use (pathname <string>) than to assemble a pathname
> from pieces.  When assembling a new pathname using components from an
> existing pathname, make-pathname is the more portable solution.

Yes.


> Logical pathnames are puzzling.  They seem to be a mechanism around
> which a portability layer for mapping strings to pathnames can be
> built, which is fine and dandy, but why in the world was such a thing
> standardized?  Was it regarded as indispensable for day-to-day
> programming, like regexes are today?

See: LOAD-LOGICAL-PATHNAME-TRANSLATIONS

In theory, you should be able to install the files somewhere, edit a
logical pathname translation file in the right place in your system,
and then you just have to:

(LOAD-LOGICAL-PATHNAME-TRANSLATIONS "MYAPP")
(LOAD "MYAPP:LOADER.LISP")

to have your application loading all the files it needs from whatever
file system it's installed on.

Thanks to these logical pathname translations, you can even gather
pieces from different parts of the file system and put them in a
"virtual" place, so you can select different files just by modifying
the logical pathname translation.
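
A minimal sketch of the same idea without a translations file,
setting the translations directly.  The host name "MYAPP" comes from
the example above; the patterns and the directories /opt/myapp and
/var/lib/myapp are assumptions made up for illustration, and the
exact physical pathname you get back (letter case in particular)
depends on the implementation:

```lisp
;; Map parts of the logical host MYAPP onto different places in the
;; file system.  Relocating the application means editing only these
;; translations, not the code that names "MYAPP:..." files.
(setf (logical-pathname-translations "MYAPP")
      '(("SRC;*.*.*" "/opt/myapp/src/*.*")
        ("**;*.*.*"  "/var/lib/myapp/**/*.*")))

;; Ask where a logical pathname currently points.
(defparameter *loader*
  (translate-logical-pathname "MYAPP:SRC;LOADER.LISP"))
```

The first matching translation wins, so the more specific SRC rule
is listed before the catch-all one.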

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

"Specifications are for the weak and timid!"

From: Harald Hanche-Olsen
Subject: Re: Pathnames (beginner issues)
Date: Sat, 16 Dec 2006 21:24:08 +0000
Message-ID: <pcok60rttzb.fsf@shuttle.math.ntnu.no>

+ "grackle" <···········@gmail.com>:

| I just learned not to depend on the #P printed version of pathnames.
|
| Moral #1:  If a pathname looks good but doesn't work, examine its
| components using pathname-host, pathname-device, etc.

Hmm.  It seems wrong to me to print a pathname using the #P notation
if inputting the same does not produce a functionally equivalent
pathname.  Shouldn't the resulting pathname print along the lines of
#<unprintable pathname ...> instead?

The following seems perverse to me.

cl-user> (make-pathname :device ""
                        :directory '(:absolute "tmp")
                        :name "ClassicScanComplete")
#P"/tmp/ClassicScanComplete"
cl-user> (probe-file *)
nil
cl-user> (probe-file #P"/tmp/ClassicScanComplete")
#P"/tmp/ClassicScanComplete"
cl-user> 

| Logical pathnames are puzzling.  They seem to be a mechanism around
| which a portability layer for mapping strings to pathnames can be
| built, which is fine and dandy, but why in the world was such a
| thing standardized?

Perhaps because filesystems and the conventions for naming files in
these filesystems were vastly more different in those days than they
are today?

-- 
* Harald Hanche-Olsen     <URL:http://www.math.ntnu.no/~hanche/>
- It is undesirable to believe a proposition
  when there is no ground whatsoever for supposing it is true.
  -- Bertrand Russell

From: Pascal Bourguignon
Subject: Re: Pathnames (beginner issues)
Date: Sat, 16 Dec 2006 21:51:53 +0000
Message-ID: <87hcvvfr0m.fsf@thalassa.informatimago.com>

Harald Hanche-Olsen <······@math.ntnu.no> writes:

> + "grackle" <···········@gmail.com>:
>
> | I just learned not to depend on the #P printed version of pathnames.
> |
> | Moral #1:  If a pathname looks good but doesn't work, examine its
> | components using pathname-host, pathname-device, etc.
>
> Hmm.  It seems wrong to me to print a pathname using the #P notation
> if inputting the same does not produce a functionally equivalent
> pathname.  Shouldn't the resulting pathname print along the lines of
> #<unprintable pathname ...> instead?
>
> The following seems perverse to me.

The point is that it's implementation-dependent.  You need to be very
careful to build a pathname that's not implementation-dependent.
Actually, only logical pathnames can be built in an
implementation-independent way.

Notably, in clisp, "" is invalid for a :device.

> cl-user> (make-pathname :device ""
>                         :directory '(:absolute "tmp")
>                         :name "ClassicScanComplete")
> #P"/tmp/ClassicScanComplete"
> cl-user> (probe-file *)
> nil
> cl-user> (probe-file #P"/tmp/ClassicScanComplete")
> #P"/tmp/ClassicScanComplete"
> cl-user> 

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
You never feed me.
Perhaps I'll sleep on your face.
That will sure show you.

From: Harald Hanche-Olsen
Subject: Re: Pathnames (beginner issues)
Date: Sat, 16 Dec 2006 22:26:36 +0000
Message-ID: <pcopsajmq8z.fsf@shuttle.math.ntnu.no>

+ Pascal Bourguignon <···@informatimago.com>:

| Harald Hanche-Olsen <······@math.ntnu.no> writes:
|
|> + "grackle" <···········@gmail.com>:
|>
|> | I just learned not to depend on the #P printed version of pathnames.
|> |
|> | Moral #1:  If a pathname looks good but doesn't work, examine its
|> | components using pathname-host, pathname-device, etc.
|>
|> Hmm.  It seems wrong to me to print a pathname using the #P notation
|> if inputting the same does not produce a functionally equivalent
|> pathname.  Shouldn't the resulting pathname print along the lines of
|> #<unprintable pathname ...> instead?
|>
|> The following seems perverse to me.
|
| The point is that it's implementation-dependent.  You need to be
| very careful to build a pathname that's not
| implementation-dependent.

I know that, but /my/ point is with how pathnames are printed, not
with what happens when you build them.

Your response led me to check the CLHS more carefully.

First, the section on Variable *PRINT-READABLY*

  If *print-readably* is true, some special rules for printing objects
  go into effect. Specifically, printing any object O1 produces a
  printed representation that, when seen by the Lisp reader while the
  standard readtable is in effect, will produce an object O2 that is
  similar to O1.

Thus, since SBCL prints the result of

  (make-pathname :device ""
                 :directory '(:absolute "tmp")
                 :name "ClassicScanComplete")

as #P"/tmp/ClassicScanComplete" even when *print-readably* is true,
then reading in the latter should produce a pathname that is similar
to the result of the former, right?

So we turn next to section 3.2.4.2.2 Definition of Similarity

  Two objects S (in source code) and C (in compiled code) are defined
  to be similar if and only if they are both of one of the types
  listed here (or defined by the implementation) and they both satisfy
  all additional requirements of similarity indicated for that type.

  pathname

    Two pathnames S and C are similar if all corresponding pathname
    components are similar.

But the two pathnames, both of which print in exactly the same way,
have a pathname-device of, respectively, "" and nil, so they are not
similar.

If the result of the above (make-pathname ...) were to print as
#<unprintable pathname ...>, then the OP might have been less
confused, or at least confused at a higher level.  8-)
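
A rough way to test for this kind of mismatch is to parse a
pathname's namestring and compare the components one by one.  The
helper names below are made up for this sketch, and EQUAL on
components is stricter than whatever notion of equivalence a given
implementation uses, so a false result is a hint to look closer, not
proof of a bug:

```lisp
(defun pathname-components (p)
  "Return the six components of pathname P as a list."
  (list (pathname-host p) (pathname-device p) (pathname-directory p)
        (pathname-name p) (pathname-type p) (pathname-version p)))

(defun namestring-round-trips-p (p)
  "True if parsing P's namestring gives back the same components as P.
Returns NIL if P has no namestring or the namestring cannot be parsed."
  (let ((reparsed (ignore-errors (parse-namestring (namestring p)))))
    (and reparsed
         (equal (pathname-components p)
                (pathname-components reparsed)))))

;; The pathname from the transcript above.  Some implementations
;; reject the "" device outright, hence the IGNORE-ERRORS.
(ignore-errors
 (namestring-round-trips-p
  (make-pathname :device ""
                 :directory '(:absolute "tmp")
                 :name "ClassicScanComplete")))
```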

-- 
* Harald Hanche-Olsen     <URL:http://www.math.ntnu.no/~hanche/>
- It is undesirable to believe a proposition
  when there is no ground whatsoever for supposing it is true.
  -- Bertrand Russell

From: Richard M Kreuter
Subject: Re: Pathnames (beginner issues)
Date: Sun, 17 Dec 2006 07:15:02 +0000
Message-ID: <87psajuh6x.fsf@progn.net>

"grackle" <···········@gmail.com> writes:

> I just learned not to depend on the #P printed version of pathnames.
> In the instance that stung me, SBCL on Linux requires that :device must
> be nil (as far as I can tell) when accessing the filesystem.
> Specifying "" or "/" as the device component creates an unusable
> pathname that looks correct when printed.  
> I ran into this when debugging some code that specified "/" as the
> device component when constructing a Unix pathname.  :device "/"
> evidently works on some implementations of Common Lisp on Unix, but
> not others.

This is probably a bug in SBCL: in SBCL, if you can namestring-ify a
pathname, then reading the namestring back in is supposed to give you
a pathname that's EQUAL to the one you started with.  And although
EQUALity for pathname components is implementation-defined, SBCL
evidently doesn't treat nil as equivalent to "/" for devices:

(equal (make-pathname  :device "/") (make-pathname :device nil))
NIL

Perhaps you should contact the SBCL developers.

Note: the print-read roundtrip equality of pathnames is not generally
safe to assume for other implementations, but it's supposed to hold in
SBCL.

> Moral #1:  If a pathname looks good but doesn't work, examine its
> components using pathname-host, pathname-device, etc.

In general, there are 4 sorts of things you can (try to) do with
a pathname:

(1) merge it with another pathname (either as the merge-ee or as the
    defaults),

(2) make a namestring out of it,

(3) open a file with it,

(4) supply it as an argument to a filesystem operation (chapter 20).

Not every pathname can be used for all 4 of these:

* the pathname whose name, type, and version are :wild might be passed
  to cl:directory, but not cl:open.

* the pathname whose directory is (:relative :back) can be used for
  merging, but not namestring-ifying, opening, or any filesystem
  operation (unless that operation implicitly merges it).

There are other cases.  AFAIU, the only thing you can do with all of
them is (1).
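
A small sketch of those two cases, assuming Unix-style namestrings.
The variable names and the /usr/local/lib/ directory are just for
illustration, and the exact directory list produced by the merge is
left to the implementation:

```lisp
;; A fully wild pathname: a plausible argument to DIRECTORY, but not
;; something OPEN can do anything useful with.
(defparameter *all-files*
  (make-pathname :name :wild :type :wild :version :wild))

;; A relative "up one level" pathname: mainly useful as an argument
;; to MERGE-PATHNAMES.
(defparameter *up*
  (make-pathname :directory '(:relative :back)))

;; Merging resolves the relative directory against the defaults.
(defparameter *merged*
  (merge-pathnames *up* (pathname "/usr/local/lib/")))

(pathname-directory *merged*)
```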

> Moral #2:  If you want to specify an absolute pathname, it's probably
> more portable to use (pathname <string>) than to assemble a pathname
> from pieces.  When assembling a new pathname using components from an
> existing pathname, make-pathname is the more portable solution.

Using a string as a pathname designator means that the program will be
subject to implementation-defined namestring parsing.  Using
make-pathname to construct a pathname means that the program will be
subject to implementation-supplied make-pathname behavior [1].  You
may well lose in either case, depending on where you get the strings
from, unless the strings are "uninteresting" (where the definition
of "uninteresting" is implementation-specific).

[1] Some implementations' make-pathname functions act as if they
parsed the strings supplied as arguments, while others construct
pathnames with the strings as verbatim components.

;SBCL
* (wild-pathname-p "*")
T
* (make-pathname :name "*")
#P"\\*"
* (wild-pathname-p *)
NIL

;Allegro
CL-USER(1): (wild-pathname-p "*")
T
CL-USER(2): (make-pathname :name "*")
#P"*"
CL-USER(3): (wild-pathname-p *)
NIL

;OpenMCL
? (wild-pathname-p "*")
0
? (make-pathname :name "*")
#P"*"
? (wild-pathname-p *)
0

;Clisp
[1]> (wild-pathname-p "*")
T
[2]> (make-pathname :name "*")
#P"*"
[3]> (wild-pathname-p *)
T

;ECL
> (wild-pathname-p "*")
T
> (make-pathname :name "*")
#P"*"
> (wild-pathname-p *)
T

Similar results may occur when question marks, dots, or left-brackets
occur in string arguments to make-pathname.
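
If what you actually want is a wild component, one way to sidestep
the question of how "*" is treated is to pass the keyword :wild
instead of a string.  A minimal sketch (the variable name and the
"lisp" type are arbitrary choices for the example):

```lisp
;; :WILD is a standard component value, so no string parsing or
;; escaping is involved in building this pathname.
(defparameter *any-lisp-file*
  (make-pathname :name :wild :type "lisp"))

;; Ask about individual components rather than the whole pathname.
(wild-pathname-p *any-lisp-file* :name)   ; true: the name is :WILD
(wild-pathname-p *any-lisp-file* :type)   ; false: "lisp" has no wildcard
```

This does not help when the string comes from user input and really
does contain a literal "*"; that case remains implementation-specific.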

> Logical pathnames are puzzling.  They seem to be a mechanism around
> which a portability layer for mapping strings to pathnames can be
> built...

Technically the logical pathnames facility maps pathnames to
pathnames.  Any place you see a string supplied to an operator that
takes a pathname designator, the string gets parsed.

> which is fine and dandy, but why in the world was such a thing
> standardized?  

Logical pathnames are meant to facilitate portability of applications
to different deployment environments.  The programmer can more or less
hard-code logical pathname namestrings into the program without
constraining the user; the user can move the program's files around
relatively freely, and then define a set of translations for the
program to use to find the files at runtime.

> Was it regarded as indispensable for day-to-day programming, like
> regexes are today?

Debatable.  How many Unix programs have some sort of ad-hoc "where do
I find my configuration file that tells me where I find my files"
machinery?  How much of that stuff is ever transparent to the
programmer?

--
RmK