From: Mirko
Subject: reading windows file on sbcl & unix
Date: 
Message-ID: <a31da843-17c1-4177-a14e-cebaa37a7234@40g2000prx.googlegroups.com>
Hello,

I am trying to read a windows ascii file on unix & sbcl.  I know of
dos2unix, but I would rather keep the file intact for now.

I tried setting the :external-format keyword to :windows-1257/8, but
the read statement still captures the carriage return (^M).

Any other options?

Thank you,

Mirko

From: Pascal J. Bourguignon
Subject: Re: reading windows file on sbcl & unix
Date: 
Message-ID: <7cfxklbztj.fsf@pbourguignon.anevia.com>
Mirko <·············@gmail.com> writes:

> Hello,
>
> I am trying to read a windows ascii file on unix & sbcl.  I know of
> dos2unix, but I would rather keep the file intact for now.
>
> I tried setting the :external-format keyword to :windows-1257/8, but
> the read statement still captures the carriage return (^M).
>
> Any other options?

Use clisp and

  :external-format #+clisp (ext:make-encoding :charset charset:windows-1257
                                              :line-terminator :dos
                                              :input-error-action #\?
                                              :output-error-action :error)
                   #-clisp :default
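
For the #-clisp branch (for example SBCL, as in the original question),
:default is not expected to remove the carriage returns. A minimal portable
sketch, using only standard functions, is to trim them after READ-LINE. The
names strip-trailing-cr and read-dos-lines are made up for illustration, and
the default external format is an assumption:

```lisp
;; Hypothetical helpers; only standard Common Lisp functions are used.
(defun strip-trailing-cr (line)
  "Return LINE without any trailing carriage return characters."
  (string-right-trim '(#\Return) line))

(defun read-dos-lines (path &key (external-format :default))
  "Read the file at PATH and return its lines as a list, minus trailing CRs."
  (with-open-file (in path :direction :input
                           :external-format external-format)
    (loop for line = (read-line in nil nil)
          while line
          collect (strip-trailing-cr line))))
```

Something like (read-dos-lines "data.txt"), with a file name of your own,
leaves the file on disk untouched, which was the point of avoiding dos2unix.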

-- 
__Pascal Bourguignon__

From: Kaz Kylheku
Subject: Re: reading windows file on sbcl & unix
Date: 
Message-ID: <20090103114523.0@gmail.com>
On 2008-12-18, Pascal J. Bourguignon <···@informatimago.com> wrote:
> Mirko <·············@gmail.com> writes:
>
>> Hello,
>>
>> I am trying to read a windows ascii file on unix & sbcl.  I know of
>> dos2unix, but I would rather keep the file intact for now.
>>
>> I tried setting the :external-format keyword to :windows-1257/8, but
>> the read statement still captures the carriage return (^M).
>>
>> Any other options?
>
> Use clisp and  :external-format #+clisp (ext:make-encoding :charset charset:windows-1257 
>                                                            :line-terminator :dos
>                                                            :input-error-action #\? 
>                                                            :output-error-action :error)
>                                 #-clisp :default

On input, CLISP will munge line terminators as it pleases regardless of
:line-terminator, which affects only output.

Section 13.10 documents a rationale for this:

``Justification. Unicode Newline Guidelines say: "Even if you know which
characters represents NLF on your particular platform, on input and in
interpretation, treat CR, LF, CRLF, and NEL the same. Only on output do you
need to distinguish between them."''

From: Pascal J. Bourguignon
Subject: Re: reading windows file on sbcl & unix
Date: 
Message-ID: <7c3aglbgyb.fsf@pbourguignon.anevia.com>
Kaz Kylheku <········@gmail.com> writes:

> On 2008-12-18, Pascal J. Bourguignon <···@informatimago.com> wrote:
>> Mirko <·············@gmail.com> writes:
>>
>>> Hello,
>>>
>>> I am trying to read a windows ascii file on unix & sbcl.  I know of
>>> dos2unix, but I would rather keep the file intact for now.
>>>
>>> I tried setting the :external-format keyword to :windows-1257/8, but
>>> the read statement still captures the carriage return (^M).
>>>
>>> Any other options?
>>
>> Use clisp and  :external-format #+clisp (ext:make-encoding :charset charset:windows-1257 
>>                                                            :line-terminator :dos
>>                                                            :input-error-action #\? 
>>                                                            :output-error-action :error)
>>                                 #-clisp :default
>
> On input, CLISP will munge line terminators as it pleases regardless of
> :line-terminator, which affects output.
>
> Section 13.10 documents a rationale for this:
>
> ``Justification. Unicode Newline Guidelines say: "Even if you know which
> characters represents NLF on your particular platform, on input and in
> interpretation, treat CR, LF, CRLF, and NEL the same. Only on output do you
> need to distinguish between them."''

Which does not matter at all if the file contains only CR-LF pairs and no
stray CR or LF.
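
Whether a given file really has no stray CR or LF can be checked by reading
it as bytes. A rough sketch follows; the name count-line-terminators is made
up, and it assumes an ASCII-compatible single-byte or UTF-8 encoding, so it
would miscount UTF-16:

```lisp
;; Hypothetical checker: classifies every line terminator in a file.
(defun count-line-terminators (path)
  "Return three values: CR-LF pairs, bare CRs, and bare LFs in the file at PATH."
  (with-open-file (in path :direction :input
                           :element-type '(unsigned-byte 8))
    (let ((crlf 0) (cr 0) (lf 0) (pending-cr nil))
      (loop for byte = (read-byte in nil nil)
            while byte
            do (cond ((= byte 13)            ; CR: decide once we see the next byte
                      (when pending-cr (incf cr))
                      (setf pending-cr t))
                     ((= byte 10)            ; LF: completes a pair or stands alone
                      (if pending-cr (incf crlf) (incf lf))
                      (setf pending-cr nil))
                     (t                      ; any other byte ends a pending bare CR
                      (when pending-cr (incf cr))
                      (setf pending-cr nil))))
      ;; A CR as the very last byte is a bare CR.
      (when pending-cr (incf cr))
      (values crlf cr lf))))
```

If the second and third values are both zero, the file has only CR-LF
terminators.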

-- 
__Pascal Bourguignon__