From: Bernd Beuster
Subject: List of all files in directory
Date: Sun, 24 Oct 2004 15:21:20 +0200
Message-ID: <clga8g$p0b$1@online.de>
I wonder if there is an easy functional implementation for finding all 
files in a given directory, i.e. the equivalent of the Unix command:

$ find <path> -type f

For example:

$ find . -type f
./a/1
./a/2
./a/3
./a/a/1
./a/a/2
./a/a/3
./a/b/1
./a/c/1
./a/c/2
./a/c/3
./b/1
./b/2
./b/3

(find-files #p"./") ==> (./a/1 ./a/2 ... ./b/3)


Currently I use an imperative version, which is sufficient for the
application I have in mind (a CDDB database), because it has only one
directory level.  The system is CMUCL on Linux.


;; Is there a functional implementation with arbitrary directory depth?
(defun find-files (path)
  "Find all files in PATH. Only one additional directory level is searched."
  (let ((files '()))
    (dolist (d (directory path))
      (setf files (nconc files (directory d))))
    files))

From: Paul Khuong
Subject: Re: List of all files in directory
Date: 
Message-ID: <a828a711.0410241151.2831d3@posting.google.com>
Bernd Beuster <·············@lycos.de> wrote in message news:<············@online.de>...
> I wonder if there is an easy functional implementation for finding all 
> files in a given directory, i.e. the equivalent of the Unix command:
[...]
> Currently I use an imperative version what is sufficient for my 
> application in mind (CDDB database), because it has only one directory 
> level.  System is CMUCL on Linux.
> 
> 
> ;; Is there a functional implementation with arbitrary directory depth?
> (defun find-files (path)
>    "Find all files in PATH. Only one additional directory is allowed."
>    (let ((files '()))
>      (dolist (d (directory path))
>        (setf files (nconc files (directory d))))
>    files))
Paths can have wildcards: http://www.lisp.org/HyperSpec/Body/fun_directory.html
"Function DIRECTORY

Syntax:

directory pathspec &key => pathnames

Arguments and Values:

pathspec---a pathname designator, which may contain wild components.

pathnames---a list of physical pathnames. "
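
As a minimal sketch of what a wild pathspec buys you in the one-level
case: the helper name ONE-LEVEL-PATTERN is made up for illustration, ROOT
is assumed to be a directory-form pathname (trailing slash), and how
DIRECTORY treats a :WILD directory component is implementation-dependent.

```lisp
;; Hypothetical helper, not part of any library: build a wild pathname that
;; matches every entry exactly one directory level below ROOT.
(defun one-level-pattern (root)
  "Return a wild pathname for all entries one directory below ROOT.
ROOT is assumed to be in directory form, e.g. #p\"/some/path/\"."
  (merge-pathnames
   ;; A relative directory is appended to ROOT's directory by MERGE-PATHNAMES.
   (make-pathname :directory '(:relative :wild) :name :wild :type :wild)
   root))

;; Usage (results depend on the implementation and the file system):
;; (directory (one-level-pattern #p"/some/path/"))
```

With a pattern like this, the DOLIST/NCONC loop collapses into a single
DIRECTORY call.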

Paul Khuong
From: Bernd Beuster
Subject: Re: List of all files in directory
Date: 
Message-ID: <clgaj0$p0b$2@online.de>
Bernd Beuster wrote:

> (find-files #p"./") ==> (./a/1 ./a/2 ... ./b/3)

Should be

(find-files #p"./") ==> (#p"./a/1" #p"./a/2" ... #p"./b/3")
From: Peter Seibel
Subject: Re: List of all files in directory
Date: 
Message-ID: <m3acuc2jqx.fsf@javamonkey.com>
Bernd Beuster <·············@lycos.de> writes:

> I wonder if there is an easy functional implementation for finding all
> files in a given directory, i.e. the equivalent of the Unix command:
>
> $ find <path> -type f
>
> For example:
>
> $ find . -type f
> ./a/1
> ./a/2
> ./a/3
> ./a/a/1
> ./a/a/2
> ./a/a/3
> ./a/b/1
> ./a/c/1
> ./a/c/2
> ./a/c/3
> ./b/1
> ./b/2
> ./b/3
>
> (find-files #p"./") ==> (./a/1 ./a/2 ... ./b/3)
>
>
> Currently I use an imperative version what is sufficient for my
> application in mind (CDDB database), because it has only one directory
> level.  System is CMUCL on Linux.
>
>
> ;; Is there a functional implementation with arbitrary directory depth?
> (defun find-files (path)
>    "Find all files in PATH. Only one additional directory is allowed."
>    (let ((files '()))
>      (dolist (d (directory path))
>        (setf files (nconc files (directory d))))
>    files))

You might want to take a look at the portable pathname library I
develop in my upcoming Common Lisp book.

  <http://www.gigamonkeys.com/book/practical-a-portable-pathname-library.html>

Dealing with pathnames/filenames tends to be not only OS specific but
Lisp-implementation specific. The library in that chapter tries to
abstract away as many of those differences as possible.

-Peter

-- 
Peter Seibel                                      ·····@javamonkey.com

         Lisp is the red pill. -- John Fraser, comp.lang.lisp
From: ·········@random-state.net
Subject: Re: List of all files in directory
Date: 
Message-ID: <clgo4i$d5dh0$1@midnight.cs.hut.fi>
Bernd Beuster <·············@lycos.de> wrote:

> I wonder if there is an easy functional implementation for finding all 
> files in a given directory, i.e. the equivalent of the Unix command:

Doing this portably and reliably in pure CL is effectively impossible, but
implementation-specific solutions exist. If you're willing to use a little
UFFI-portable extension library called Osicat[*], it's downright simple.
Quick hack:

(defun find-files (directory &key (test #'identity))
    (mapcan (lambda (pathname)
              (cond ((eq :directory (osicat:file-kind pathname))
                     (find-files pathname :test test))
                    ((funcall test pathname)
                     (list pathname))))
            (osicat:mapdir #'merge-pathnames (pathname directory))))

[*] <http://common-lisp.net/project/osicat/>

Cheers,

 -- Nikodemus                   "Not as clumsy or random as a C++ or Java. 
                             An elegant weapon for a more civilized time."
From: Harald Hanche-Olsen
Subject: Re: List of all files in directory
Date: 
Message-ID: <pco6550uoes.fsf@shuttle.math.ntnu.no>
+ Bernd Beuster <·············@lycos.de>:

| I wonder if there is an easy functional implementation for finding all
| files in a given directory, i.e. the equivalent of the Unix command:
| 
| $ find <path> -type f

Something like

(loop for pathname in
      (directory #P"/some/path/**/*.*")
      when (pathname-name pathname)
      collect pathname)

might be a good start.

-- 
* Harald Hanche-Olsen     <URL:http://www.math.ntnu.no/~hanche/>
- Debating gives most of us much more psychological satisfaction
  than thinking does: but it deprives us of whatever chance there is
  of getting closer to the truth.  -- C.P. Snow
From: Frank Buss
Subject: Re: List of all files in directory
Date: 
Message-ID: <clgit1$pka$1@newsreader2.netcologne.de>
Harald Hanche-Olsen <······@math.ntnu.no> wrote:

> (loop for pathname in
>       (directory #P"/some/path/**/*.*")
>       when (pathname-name pathname)
>       collect pathname)

on Windows with LispWorks this doesn't find files without an extension and 
prints sub-directories, too, not only files.

-- 
Frank Buß, ··@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de
From: Harald Hanche-Olsen
Subject: Re: List of all files in directory
Date: 
Message-ID: <pcok6tgm52w.fsf@shuttle.math.ntnu.no>
+ Frank Buss <··@frank-buss.de>:

| Harald Hanche-Olsen <······@math.ntnu.no> wrote:
| 
| > (loop for pathname in
| >       (directory #P"/some/path/**/*.*")
| >       when (pathname-name pathname)
| >       collect pathname)
| 
| on Windows with LispWorks this doesn't find files without an
| extension and prints sub-directories, too, not only files.

Well, I think it is very hard to do this sort of thing in a truly
portable way.  The OP specified CMUCL on Linux, and there it should
work.  I'm curious, though:  How are pathnames without an extension
represented in LispWorks, not to mention directories?  I dare not
state that what you are reporting is non-conforming, but it sure seems
odd to me.

From: Frank Buss
Subject: Re: List of all files in directory
Date: Sun, 24 Oct 2004 17:24:46 +0000 (UTC)
Message-ID: <clgogu$6kg$1@newsreader2.netcologne.de>
Harald Hanche-Olsen <······@math.ntnu.no> wrote:

> Well, I think it is very hard to do this sort of thing in a truly
> portable way.  The OP specified CMUCL on Linux, and there it should
> work.  I'm curious, though:  How are pathnames without an extension
> represented in LispWorks, not to mention directories?  I dare not
> state that what you are reporting is non-conforming, but it sure seems
> odd to me.

sorry, I was wrong, it finds files without extensions, too, but also 
directories:

(#P"C:/tmp/test/t" #P"C:/tmp/test/dir1/t" #P"C:/tmp/test/dir1/dir2/" 
#P"C:/tmp/test/dir1/" #P"C:/tmp/test/a.txt")

C:\tmp\test>dir /s

 Directory of C:\tmp\test

24.10.2004  19:18    <DIR>          .
24.10.2004  19:18    <DIR>          ..
24.10.2004  19:17                 0 a.txt
24.10.2004  19:18    <DIR>          dir1
24.10.2004  19:17                 0 t
               2 File(s)              0 bytes

 Directory of C:\tmp\test\dir1

24.10.2004  19:18    <DIR>          .
24.10.2004  19:18    <DIR>          ..
24.10.2004  19:18    <DIR>          dir2
24.10.2004  19:17                 0 t
               1 File(s)              0 bytes

 Directory of C:\tmp\test\dir1\dir2

24.10.2004  19:18    <DIR>          .
24.10.2004  19:18    <DIR>          ..
               0 File(s)              0 bytes

From: Edi Weitz
Subject: Re: List of all files in directory
Date: 
Message-ID: <u65505800.fsf@agharta.de>
On Sun, 24 Oct 2004 17:24:46 +0000 (UTC), Frank Buss <··@frank-buss.de> wrote:

> sorry, I was wrong, it finds files without extensions, too, but also
> directories:

Look up LispWorks' extensions to DIRECTORY in their reference manual,
specifically :TEST and :DIRECTORIES.

Cheers,
Edi.

-- 

Lisp is not dead, it just smells funny.

Real email: (replace (subseq ·········@agharta.de" 5) "edi")
From: Harald Hanche-Olsen
Subject: Re: List of all files in directory
Date: 
Message-ID: <pcopt38f1to.fsf@shuttle.math.ntnu.no>
+ Frank Buss <··@frank-buss.de>:

| Harald Hanche-Olsen <······@math.ntnu.no> wrote:
| 
| > I dare not state that what you are reporting is non-conforming,
| > but it sure seems odd to me.
| 
| sorry, I was wrong, it finds files without extensions, too, but also
| directories:
| 
| (#P"C:/tmp/test/t" #P"C:/tmp/test/dir1/t" #P"C:/tmp/test/dir1/dir2/" 
| #P"C:/tmp/test/dir1/" #P"C:/tmp/test/a.txt")

Hmm.  Maybe the implementation uses :unspecific for the filename.
That is probably conforming, too.  Check with

(loop for pathname in
      (directory #P"C:/tmp/test/**/*.*")
      do (format t "~&~S ~S" pathname (pathname-name pathname)))

From: Frank Buss
Subject: Re: List of all files in directory
Date: 
Message-ID: <clgrhg$bas$1@newsreader2.netcologne.de>
Harald Hanche-Olsen <······@math.ntnu.no> wrote:

> Hmm.  Maybe the implementation uses :unspecific for the filename.

yes

(remove-if #'directory-pathname-p (directory #P"/tmp/test/**/"))

this is the directory-pathname-p function from 
http://www.gigamonkeys.com/book/practical-a-portable-pathname-library.html 

(defun directory-pathname-p  (p)
  (flet ((component-present-p (value)
           (and value (not (eql value :unspecific)))))
    (and 
     (not (component-present-p (pathname-name p)))
     (not (component-present-p (pathname-type p)))
     p)))
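
Putting the pieces together, a rough sketch of a portable-ish FIND-FILES
might look like the following. FIND-FILES and FILE-PATHNAME-P are
illustrative names, ROOT is assumed to be a directory-form pathname, and
the whole thing assumes the implementation returns subdirectories in
directory form (as in the LispWorks output above); implementations that
return them in file form would need an extra check.

```lisp
;; Sketch only: the names below are illustrative, not from any library.
(defun component-present-p (value)
  "True if VALUE is a real pathname component (neither NIL nor :UNSPECIFIC)."
  (and value (not (eql value :unspecific))))

(defun file-pathname-p (p)
  "True if P has a name or a type, i.e. it is not in directory form."
  (or (component-present-p (pathname-name p))
      (component-present-p (pathname-type p))))

(defun find-files (root)
  "List every pathname below ROOT that is not in directory form.
Assumes DIRECTORY supports :WILD-INFERIORS on this file system."
  (remove-if-not
   #'file-pathname-p
   (directory
    (merge-pathnames
     (make-pathname :directory '(:relative :wild-inferiors)
                    :name :wild :type :wild)
     root))))

;; Usage: (find-files #p"/tmp/test/")
```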

From: Harald Hanche-Olsen
Subject: Re: List of all files in directory
Date: 
Message-ID: <pco3c036gxc.fsf@shuttle.math.ntnu.no>
+ Frank Buss <··@frank-buss.de>:

| this is the directory-pathname-p function from 
| http://www.gigamonkeys.com/book/practical-a-portable-pathname-library.html 

Damn, I should have known it.  Got too much blood in my caffeine.

From: Bernd Beuster
Subject: Re: List of all files in directory
Date: 
Message-ID: <clgice$6or$1@online.de>
Harald Hanche-Olsen wrote:

> (loop for pathname in
>       (directory #P"/some/path/**/*.*")
>       when (pathname-name pathname)
>       collect pathname)

It works, thank you.  What is the meaning of the double wildcard "**"
that does the trick?
From: David Golden
Subject: Re: List of all files in directory
Date: 
Message-ID: <Z1Red.39952$Z14.14353@news.indigo.ie>
Bernd Beuster wrote:

> Harald Hanche-Olsen wrote:
> 
>> (loop for pathname in
>>       (directory #P"/some/path/**/*.*")
>>       when (pathname-name pathname)
>>       collect pathname)
> It works, thank you.  What is the meaning of the double wildcard "**"
> what does the trick?

The Common Lisp HyperSpec documents ** in _some_ namestrings (it means
"wild-inferiors"):
http://www.lispworks.com/reference/HyperSpec/Body/19_ca.htm
And wild-inferiors are documented here:
http://www.lispworks.com/reference/HyperSpec/Body/19_bbdc.htm

A similar ** is present in at least one unix shell - zsh. zsh also uses
*** for "** and dare to follow symlinks". 

I think, reading the spec (N.B. possibly incorrectly - the slightly
baroque Common Lisp file name stuff is not something I've really got
into), it is only absolutely required that implementations support
wild-inferiors in namestrings that are parsed into logical pathnames¹.
Implementors are free to just signal a file-error if they see ** in a
namestring for a filesystem and hold that the file system underneath
doesn't really support it (and there could well be non-hierarchical
filesystems where the notion just doesn't make sense).  However, I
think implementations on unix or windows probably support it in all
namestrings, even implementation-dependent physical ones, 'cos it's
basically pretty easy to do on conventional unix or windows hierarchical
filesystems, and useful too.


¹ Logical pathnames notionally refer to things that have been merged into
a Common Lisp implementation's internal sort-of-VFS layer via
logical-pathname-translations.
http://www.lispworks.com/reference/HyperSpec/Body/f_logica.htm
From: Bernd Beuster
Subject: Re: List of all files in directory
Date: 
Message-ID: <clgj8f$6or$2@online.de>
Good enough for me for now.

bash-2.05b$ find . -type f
./a/1
./a/2
./a/3
./a/a/1
./a/a/2
./a/a/3
./a/b/1
./a/c/1
./a/c/2
./a/c/3
./b/1
./b/2
./b/3
./cddb.lisp
./.cddb.lisp.swp


* (remove-if-not #'pathname-name (directory #p"./**/"))

(#p"/home/bernd/src/lisp/cddb2/.cddb.lisp.swp"
  #p"/home/bernd/src/lisp/cddb2/a/1" #p"/home/bernd/src/lisp/cddb2/a/2"
  #p"/home/bernd/src/lisp/cddb2/a/3" #p"/home/bernd/src/lisp/cddb2/a/a/1"
  #p"/home/bernd/src/lisp/cddb2/a/a/2" #p"/home/bernd/src/lisp/cddb2/a/a/3"
  #p"/home/bernd/src/lisp/cddb2/a/b/1" #p"/home/bernd/src/lisp/cddb2/a/c/1"
  #p"/home/bernd/src/lisp/cddb2/a/c/2" #p"/home/bernd/src/lisp/cddb2/a/c/3"
  #p"/home/bernd/src/lisp/cddb2/b/1" #p"/home/bernd/src/lisp/cddb2/b/2"
  #p"/home/bernd/src/lisp/cddb2/b/3"
  #p"/home/bernd/src/lisp/cddb2/cddb.lisp")
From: Harald Hanche-Olsen
Subject: Re: List of all files in directory
Date: 
Message-ID: <pcofz44m4uy.fsf@shuttle.math.ntnu.no>
+ Bernd Beuster <·············@lycos.de>:

| Harald Hanche-Olsen wrote:
| 
| > (loop for pathname in
| >       (directory #P"/some/path/**/*.*")
| >       when (pathname-name pathname)
| >       collect pathname)
| It works, thank you.  What is the meaning of the double wildcard "**"
| what does the trick?

It introduces a :WILD-INFERIORS component into the directory part of
the filename.  You need to read section 19.2 in the hyperspec to
understand how CL pathnames are structured.  Note that not all
filesystems support :WILD-INFERIORS.
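
To make that structure visible without relying on namestring parsing, here
is a small sketch that builds the same kind of pattern with MAKE-PATHNAME
and inspects it. The variable name *RECURSIVE-PATTERN* and the directory
names are made up for illustration.

```lisp
;; Illustrative only: build a pattern equivalent in spirit to
;; #P"/some/path/**/*.*" directly from its components.
(defparameter *recursive-pattern*
  (make-pathname :directory '(:absolute "some" "path" :wild-inferiors)
                 :name :wild
                 :type :wild))

;; The directory component is a list whose last element is :WILD-INFERIORS.
(defun ends-in-wild-inferiors-p (pathname)
  "True if the last directory component of PATHNAME is :WILD-INFERIORS."
  (eq :wild-inferiors (car (last (pathname-directory pathname)))))

;; Usage:
;; (ends-in-wild-inferiors-p *recursive-pattern*)
;; (wild-pathname-p *recursive-pattern* :directory)
```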

Using (remove-if-not #'pathname-name ...) as in your followup to
yourself is a good call.  Don't know why I didn't think of it.

From: Edi Weitz
Subject: Re: List of all files in directory
Date: 
Message-ID: <uwtxgm7fg.fsf@agharta.de>
On Sun, 24 Oct 2004 15:21:20 +0200, Bernd Beuster <·············@lycos.de> wrote:

> I wonder if there is an easy functional

Any reason why it must be functional?

> implementation for finding all files in a given directory, i.e. the
> equivalent of the Unix command:
>
> $ find <path> -type f

See also

  <http://www.gigamonkeys.com/book/practical-a-portable-pathname-library.html>

Edi.
