I just took a look at the CloserLookAtCharacters article recently
posted to cliki.net, which I understand was authored by Pascal
Bourguignon. It's good info, thanks Pascal!
I did have a quibble with one paragraph, though. Pascal writes:
'In conclusion, if you want to copy a file of bytes, you should
use :ELEMENT-TYPE '(UNSIGNED-BYTE 8) (assuming it does what you
want on your implementation/OS). If you want to copy a text file
with a specific encoding, you must give the implementation specific
:EXTERNAL-FORMAT. What you've programmed is only guaranteed to work
on "well formed" "text" files, i.e. text files produced by the same
lisp implementation on the same system.'
If one implements a file copy operation, such code should treat every
file as a binary file -- not only to preserve character encodings
but also end-of-line format. Opening a file in text mode simply to copy
it verbatim introduces the unnecessary risk of some byte sequence getting
translated accidentally.
I think a better word to use in the second sentence is "process" rather
than "copy".
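
To make the byte-oriented copy concrete, here is a minimal sketch. It
assumes, as the article itself cautions, that '(UNSIGNED-BYTE 8) maps to
plain octets on your implementation/OS. COPY-FILE-BYTES and its
BUFFER-SIZE parameter are hypothetical names of mine, not standard
functions.

```lisp
;; Hypothetical helper, not part of the standard: copies a file octet by
;; octet, so no character decoding or end-of-line translation can occur.
(defun copy-file-bytes (from to &key (buffer-size 4096))
  "Copy FROM to TO as raw octets and return TO."
  (with-open-file (in from :element-type '(unsigned-byte 8))
    (with-open-file (out to :element-type '(unsigned-byte 8)
                            :direction :output
                            :if-exists :supersede)
      (let ((buffer (make-array buffer-size :element-type '(unsigned-byte 8))))
        ;; READ-SEQUENCE returns the index of the first element it did
        ;; not fill, so 0 means end of file.
        (loop for end = (read-sequence buffer in)
              while (plusp end)
              do (write-sequence buffer out :end end)))))
  to)
```

Since no :EXTERNAL-FORMAT is involved, the CR/LF pairs and any
multi-byte sequences in the source should arrive unchanged.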
--
Jack Unrue
I wrote:
>
> I just took a look at the CloserLookAtCharacters article recently
> posted to cliki.net, which I understand was authored by Pascal
> Bourguignon. It's good info, thanks Pascal!
>
> [snip]
Also, it would be useful in the article to mention
Unicode Byte Order Marks.
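
As a rough illustration of why they matter, the sketch below peeks at
the first octets of a file and reports which BOM, if any, is present.
DETECT-BOM is a hypothetical name; it only looks for the UTF-8 and
UTF-16 marks and makes no attempt to distinguish UTF-32, so treat it as
a starting point rather than a complete detector.

```lisp
;; Hypothetical helper: inspects at most the first three octets of PATH.
(defun detect-bom (path)
  "Return :UTF-8, :UTF-16BE, :UTF-16LE, or NIL if no such BOM is found."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (let* ((head (make-array 3 :element-type '(unsigned-byte 8)
                               :initial-element 0))
           (n (read-sequence head in)))
      (flet ((starts-with (&rest octets)
               ;; True when the file is long enough and begins with OCTETS.
               (and (>= n (length octets))
                    (every #'= octets head))))
        (cond ((starts-with #xEF #xBB #xBF) :utf-8)
              ((starts-with #xFE #xFF) :utf-16be)
              ((starts-with #xFF #xFE) :utf-16le)
              (t nil))))))
```

The file is opened with '(UNSIGNED-BYTE 8) so that the implementation's
own character decoding cannot consume or alter the mark before we see
it.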
--
Jack Unrue
On Tue, 8 Aug 2006, Jack Unrue wrote:
>
> I just took a look at the CloserLookAtCharacters article recently
> posted to cliki.net, which I understand was authored by Pascal
> Bourguignon. It's good info, thanks Pascal!
>
> I did have a quibble with one paragraph, though. Pascal writes:
>
> 'In conclusion, if you want to copy a file of bytes, you should
> use :ELEMENT-TYPE '(UNSIGNED-BYTE 8) (assuming it does what you
> want on your implementation/OS). If you want to copy a text file
> with a specific encoding, you must give the implementation specific
> :EXTERNAL-FORMAT. What you've programmed is only guaranteed to work
> on "well formed" "text" files, i.e. text files produced by the same
> lisp implementation on the same system.'
>
> If one implements a file copy operation, such code should treat every
> file as a binary file -- not only to preserve character encodings
> but also end-of-line format. Opening a file in text-mode simply to copy
> it verbatim introduces the unnecessary risk of some byte sequence getting
> translated accidentally.
>
> I think a better word to use in the second sentence is "process" rather
> than "copy".
You know CLiki is a Wiki, right? ;)
> --
> Jack Unrue
On Tue, 8 Aug 2006 04:40:02 +0100, Nathan Baum <···········@btinternet.com> wrote:
>
> You know CLiki is a Wiki, right? ;)
heh...yeah I'll add my comments directly.
--
Jack Unrue
Jack Unrue <·······@example.tld> writes:
> I just took a look at the CloserLookAtCharacters article recently
> posted to cliki.net, which I understand was authored by Pascal
> Bourguignon. It's good info, thanks Pascal!
>
> I did have a quibble with one paragraph, though. Pascal writes:
>
> 'In conclusion, if you want to copy a file of bytes, you should
> use :ELEMENT-TYPE '(UNSIGNED-BYTE 8) (assuming it does what you
> want on your implementation/OS). If you want to copy a text file
> with a specific encoding, you must give the implementation specific
> :EXTERNAL-FORMAT. What you've programmed is only guaranteed to work
> on "well formed" "text" files, i.e. text files produced by the same
> lisp implementation on the same system.'
>
> If one implements a file copy operation, such code should treat every
> file as a binary file -- not only to preserve character encodings
> but also end-of-line format. Opening a file in text-mode simply to copy
> it verbatim introduces the unnecessary risk of some byte sequence getting
> translated accidentally.
Indeed.

What I had in mind is that there are two ways to consider the Lisp
system <-> file system interactions:

- either you consider the file system to be primordial, and Lisp is
  only there to process existing files, as any other programming
  language environment would,

- or you consider the Lisp system to be primordial, and the file
  system is only there to provide persistent storage for Lisp data.

In the first case, you'd want to do the same as with these other
programming language environments, which means in a C / POSIX
environment, use bytes.

In the second case, we don't consider files not created by the Lisp
system, so it's safe to process (including copy) files with whatever
Lisp OPEN option (as long as you stay consistent).
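
A minimal sketch of that second case, assuming the file is both written
and read by the same implementation on the same system: the writer and
the reader use the same OPEN options (here simply the defaults), so
whatever encoding and end-of-line convention the implementation picks
is applied symmetrically. SAVE-TEXT and LOAD-TEXT are hypothetical
names used only for illustration.

```lisp
;; Hypothetical helpers: both sides rely on the same default
;; :ELEMENT-TYPE and :EXTERNAL-FORMAT, which is the "stay consistent" part.
(defun save-text (path string)
  "Write STRING to PATH as characters and return PATH."
  (with-open-file (out path :direction :output :if-exists :supersede)
    (write-string string out))
  path)

(defun load-text (path)
  "Read the whole of PATH as characters and return it as a string."
  (with-open-file (in path)
    (with-output-to-string (s)
      (loop for char = (read-char in nil nil)
            while char
            do (write-char char s)))))
```

The round trip is only expected to hold under that assumption; a file
produced elsewhere, or read back with different options, is the first
case again.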
> I think a better word to use in the second sentence is "process" rather
> than "copy".
--
__Pascal Bourguignon__ http://www.informatimago.com/
On Tue, 08 Aug 2006 13:26:49 +0200, Pascal Bourguignon <···@informatimago.com> wrote:
>
> Indeed.
>
> What I had in mind is that there are two ways to consider the lisp
> system <-> file system interactions:
>
> - either you consider the file system to be primordial, and lisp is
> only there to process existing files as any other programming
> language environment,
>
> - or you consider the lisp system to be primordial, and the file
> system is only there to provide persistent storage for lisp data.
>
>
> In the first case, you'd want to do the same as with these other
> programming language environments, which means in a C / POSIX
> environment, use bytes.
>
> In the second case, we don't consider files not created by the lisp
> system, so it's safe to process (including copy) files with whatever
> lisp OPEN option (as long as you stay consistent).
OK, I understand what you're saying.
--
Jack Unrue