From: Jack Unrue
Subject: CloserLookAtCharacters article at cliki.net
Date: 
Message-ID: <7evfd29iri14vhj27adg2vkvf8jlid0jc3@4ax.com>
I just took a look at the CloserLookAtCharacters article recently
posted to cliki.net, which I understand was authored by Pascal
Bourguignon. It's good info, thanks Pascal!

I did have a quibble with one paragraph, though. Pascal writes:

  'In conclusion, if you want to copy a file of bytes, you should
   use :ELEMENT-TYPE '(UNSIGNED-BYTE 8) (assuming it does what you
   want on your implementation/OS). If you want to copy a text file
   with a specific encoding, you must give the implementation specific
   :EXTERNAL-FORMAT. What you've programmed is only guaranteed to work
   on "well formed" "text" files, i.e. text files produced by the same
   lisp implementation on the same system.'

If one implements a file copy operation, such code should treat every
file as a binary file -- not only to preserve character encodings
but also end-of-line format. Opening a file in text-mode simply to copy
it verbatim introduces the unnecessary risk of some byte sequence getting
translated accidentally.

I think a better word to use in the second sentence is "process" rather
than "copy".
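
For illustration, a byte-wise copy might look like the sketch below. It
assumes the implementation maps (UNSIGNED-BYTE 8) elements straight to
octets on disk, which is the same caveat the article attaches to
:ELEMENT-TYPE. COPY-FILE-BYTES and its BUFFER-SIZE keyword are made-up
names, not something taken from the article.

```lisp
;; Hypothetical helper (not from the article): copy SOURCE to DEST
;; octet by octet, so that no character decoding or end-of-line
;; translation is involved.
(defun copy-file-bytes (source dest &key (buffer-size 4096))
  (with-open-file (in source :direction :input
                             :element-type '(unsigned-byte 8))
    (with-open-file (out dest :direction :output
                              :element-type '(unsigned-byte 8)
                              :if-exists :supersede)
      (let ((buffer (make-array buffer-size
                                :element-type '(unsigned-byte 8))))
        ;; READ-SEQUENCE returns the index of the first element it did
        ;; not fill, so 0 means end of file.
        (loop for end = (read-sequence buffer in)
              while (plusp end)
              do (write-sequence buffer out :end end)))))
  dest)
```

Because both streams carry octets, a CR LF pair or a stray #xFF in the
source should arrive in the destination unchanged.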

-- 
Jack Unrue

From: Jack Unrue
Subject: Re: CloserLookAtCharacters article at cliki.net
Date: 
Message-ID: <cr0gd2tq8frim5ru88i0hu47r4b1o9d1rq@4ax.com>
I wrote:
>
> I just took a look at the CloserLookAtCharacters article recently
> posted to cliki.net, which I understand was authored by Pascal
> Bourguignon. It's good info, thanks Pascal!
>
> [snip]

Also, it would be useful in the article to mention
Unicode Byte Order Marks.
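
As a rough sketch of what that could cover: a BOM is a short byte
signature at the start of a file, so it can be detected by reading the
first few octets in binary mode. SNIFF-BOM is a hypothetical name, and
the keywords it returns are only labels here; they are not meant to be
valid :EXTERNAL-FORMAT values in any particular implementation.

```lisp
;; Hypothetical helper: guess an encoding from a leading byte order
;; mark. Returns a keyword label, or NIL when no BOM is recognised.
(defun sniff-bom (pathname)
  (with-open-file (in pathname :element-type '(unsigned-byte 8))
    (let* ((buffer (make-array 4 :element-type '(unsigned-byte 8)
                                 :initial-element 0))
           (count (read-sequence buffer in)))
      (flet ((starts-with (&rest octets)
               ;; True when the file holds at least as many octets as
               ;; the signature and they all match.
               (and (>= count (length octets))
                    (every #'= octets buffer))))
        ;; UTF-32 LE is tested before UTF-16 LE: both begin FF FE.
        (cond ((starts-with #xEF #xBB #xBF) :utf-8)
              ((starts-with #x00 #x00 #xFE #xFF) :utf-32be)
              ((starts-with #xFF #xFE #x00 #x00) :utf-32le)
              ((starts-with #xFE #xFF) :utf-16be)
              ((starts-with #xFF #xFE) :utf-16le)
              (t nil))))))
```

A file without a BOM yields NIL, which says nothing about its actual
encoding, and a UTF-16 LE file whose first character is U+0000 would be
misread as UTF-32 LE by this check.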

-- 
Jack Unrue

From: Nathan Baum
Subject: Re: CloserLookAtCharacters article at cliki.net
Date: 
Message-ID: <Pine.LNX.4.64.0608080439240.21556@localhost>
On Tue, 8 Aug 2006, Jack Unrue wrote:
>
> I just took a look at the CloserLookAtCharacters article recently
> posted to cliki.net, which I understand was authored by Pascal
> Bourguignon. It's good info, thanks Pascal!
>
> I did have a quibble with one paragraph, though. Pascal writes:
>
>  'In conclusion, if you want to copy a file of bytes, you should
>   use :ELEMENT-TYPE '(UNSIGNED-BYTE 8) (assuming it does what you
>   want on your implementation/OS). If you want to copy a text file
>   with a specific encoding, you must give the implementation specific
>   :EXTERNAL-FORMAT. What you've programmed is only guaranteed to work
>   on "well formed" "text" files, i.e. text files produced by the same
>   lisp implementation on the same system.'
>
> If one implements a file copy operation, such code should treat every
> file as a binary file -- not only to preserve character encodings
> but also end-of-line format. Opening a file in text-mode simply to copy
> it verbatim introduces the unnecessary risk of some byte sequence getting
> translated accidentally.
>
> I think a better word to use in the second sentence is "process" rather
> than "copy".

You know CLiki is a Wiki, right? ;)

> -- 
> Jack Unrue

From: Jack Unrue
Subject: Re: CloserLookAtCharacters article at cliki.net
Date: 
Message-ID: <112gd210cqms5lfkg94cr26ll87ihet542@4ax.com>
On Tue, 8 Aug 2006 04:40:02 +0100, Nathan Baum <···········@btinternet.com> wrote:
>
> You know CLiki is a Wiki, right? ;)

Heh... yeah, I'll add my comments directly.

-- 
Jack Unrue

From: Pascal Bourguignon
Subject: Re: CloserLookAtCharacters article at cliki.net
Date: 
Message-ID: <87slk7tqme.fsf@thalassa.informatimago.com>
Jack Unrue <·······@example.tld> writes:

> I just took a look at the CloserLookAtCharacters article recently
> posted to cliki.net, which I understand was authored by Pascal
> Bourguignon. It's good info, thanks Pascal!
>
> I did have a quibble with one paragraph, though. Pascal writes:
>
>   'In conclusion, if you want to copy a file of bytes, you should
>    use :ELEMENT-TYPE '(UNSIGNED-BYTE 8) (assuming it does what you
>    want on your implementation/OS). If you want to copy a text file
>    with a specific encoding, you must give the implementation specific
>    :EXTERNAL-FORMAT. What you've programmed is only guaranteed to work
>    on "well formed" "text" files, i.e. text files produced by the same
>    lisp implementation on the same system.'
>
> If one implements a file copy operation, such code should treat every
> file as a binary file -- not only to preserve character encodings
> but also end-of-line format. Opening a file in text-mode simply to copy
> it verbatim introduces the unnecessary risk of some byte sequence getting
> translated accidentally.

Indeed. 

What I had in mind is that there are two ways to consider the lisp
system <-> file system interactions:

- either you consider the file system to be primordial, and lisp is
  only there to process existing files as any other programming
  language environment, 

- or you consider the lisp system to be primordial, and the file
  system is only there to provide persistent storage for lisp data.


In the first case, you'd want to do the same as with these other
programming language environments, which means in a C / POSIX
environment, use bytes.

In the second case, we don't consider files not created by the lisp
system, so it's safe to process (including copy) files with whatever
lisp OPEN option (as long as you stay consistent).
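
A minimal sketch of that second case, assuming the source file was
written by the same implementation with the same options.
COPY-FILE-CHARS is a hypothetical name, and apart from :DEFAULT the
accepted :EXTERNAL-FORMAT values are implementation-specific.

```lisp
;; Hypothetical helper: copy a text file character by character,
;; using the same EXTERNAL-FORMAT on both the input and output side.
(defun copy-file-chars (source dest &key (external-format :default))
  (with-open-file (in source :direction :input
                             :element-type 'character
                             :external-format external-format)
    (with-open-file (out dest :direction :output
                              :element-type 'character
                              :external-format external-format
                              :if-exists :supersede)
      ;; READ-CHAR returns NIL at end of file because EOF-ERROR-P is NIL.
      (loop for char = (read-char in nil nil)
            while char
            do (write-char char out))))
  dest)
```

Staying consistent here means both streams are opened the same way; the
copy is only as faithful as the implementation's decoding and encoding
of that format.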


> I think a better word to use in the second sentence is "process" rather
> than "copy".

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

ADVISORY: There is an extremely small but nonzero chance that,
through a process known as "tunneling," this product may
spontaneously disappear from its present location and reappear at
any random place in the universe, including your neighbor's
domicile. The manufacturer will not be responsible for any damages
or inconveniences that may result.

From: Jack Unrue
Subject: Re: CloserLookAtCharacters article at cliki.net
Date: 
Message-ID: <f4chd2l293kutmo4le1q63el1ulvrhn6t5@4ax.com>
On Tue, 08 Aug 2006 13:26:49 +0200, Pascal Bourguignon <···@informatimago.com> wrote:
>
> Indeed. 
>
> What I had in mind is that there are two ways to consider the lisp
> system <-> file system interactions:
>
> - either you consider the file system to be primordial, and lisp is
>   only there to process existing files as any other programming
>   language environment, 
>
> - or you consider the lisp system to be primordial, and the file
>   system is only there to provide persistent storage for lisp data.
>
>
> In the first case, you'd want to do the same as with these other
> programming language environments, which means in a C / POSIX
> environment, use bytes.
>
> In the second case, we don't consider files not created by the lisp
> system, so it's safe to process (including copy) files with whatever
> lisp OPEN option (as long as you stay consistent).

OK, I understand what you're saying.

-- 
Jack Unrue