From: Andy
Subject: NEWBIE Question: How to handle binary files
Date: 
Message-ID: <3CF370F9.88E3AC74@smi.de>
Hi all,
please help me with a problem with a file that was generated by a C
program. 
It starts with a header that was written with fwrite from a structure:

struct header {
  unsigned file_id;
  unsigned entries;		// number of data records
  unsigned process_code;	
  char     layer_label;		// byte with application info
};

The rest of the file contains data records of the type

struct tile {
  unsigned x0;
  unsigned y0;
  unsigned x1;
  unsigned y1;
  char     label;             // byte with tile info
};

I need random access to the data records, for both reading and writing.
Since I only need a few of them at a time, I don't want to read the
whole file.
I also need the information from the header (read only).

My current idea is to use the FFI to call C's fread/fwrite and
to convert everything byte by byte. But that doesn't look very smart to me.

Is there a better solution? Specifically: how do I read and write binary
structures with random access?

Thanks in advance
Best regards
AHz

From: Frode Vatvedt Fjeld
Subject: Re: NEWBIE Question: How to handle binary files
Date: 
Message-ID: <2hg00c5v7z.fsf@vserver.cs.uit.no>
Andy <···@smi.de> writes:

> Hi all, please help me with a problem with a file that was generated
> by a C program.  It starts with a header that was written with
> fwrite from a structure

I have written a library for dealing with octet-based binary
files. Have a look at
<URL:http://www.cs.uit.no/~frodef/sw/binary-types/>.

This library however does not relate to the C language, in the sense
that you have to manually deal with the issue of translating C's
semi-abstract types to a concrete binary structure. In other words,
you must translate "int" to for example "32-bit little-endian", "char"
to "8-bit ASCII character" and so on.

> struct header {
>   unsigned file_id;
>   unsigned entries; // number of data records
>   unsigned process_code;
>   char layer_label; // byte with application info
> };

(bt:define-binary-class header ()
  ((file-id
     :binary-type bt:u32
     ;; ..and whatever defclass slot-options you want.
     :accessor header-file-id)
   (entries
     :binary-type bt:u32)
   (process-code
     :binary-type bt:u32)
   (layer-label
     :binary-type bt:char8)))

  (let ((bt:*endian* :big-endian))
    (bt:read-binary 'header stream))

=> #<HEADER instance>

-- 
Frode Vatvedt Fjeld
From: Andy
Subject: Re: NEWBIE Question: How to handle binary files
Date: 
Message-ID: <3CF3830C.52101535@smi.de>
Perfect. Thank you very much.
Best regards
AHz

From: Jochen Schmidt
Subject: Re: NEWBIE Question: How to handle binary files
Date: 
Message-ID: <acvs1f$bpj$1@rznews2.rrze.uni-erlangen.de>
Andy wrote:

> My current idea is to use the FFI to call C's fread/fwrite and
> to convert everything byte by byte. But that doesn't look very smart to me.

You would not need that. You could calculate the sizes of both structs 
either by hand or automatically by using the FFI and the sizeof() 
information of the C side. For accessing your data you would not need the 
FFI. You open the binary file with the options :direction :io, 
:element-type '(unsigned-byte 8) and :if-exists :overwrite to the 
OPEN function and then you simply use READ-SEQUENCE and WRITE-SEQUENCE with 
appropriately sized arrays (using the above sizeof information). You move 
the file-pointer by using the function FILE-POSITION.
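
The offset arithmetic behind this can be sketched from the C side; the Lisp
route with FILE-POSITION and READ-SEQUENCE would follow the same pattern.
This is only a sketch under stated assumptions: the on-disk layout is taken
to be packed (32-bit unsigned fields, no padding), and the names HEADER_SIZE,
TILE_SIZE, tile_offset, read_tile_bytes and write_tile_bytes are made up for
illustration.

```c
#include <stdio.h>

/* Assumed packed on-disk sizes: 32-bit unsigned fields, no padding. */
enum { HEADER_SIZE = 3 * 4 + 1, TILE_SIZE = 4 * 4 + 1 };

/* Byte offset of tile record `index` (0-based) from the start of the file. */
long tile_offset(unsigned long index)
{
    return (long)(HEADER_SIZE + index * TILE_SIZE);
}

/* Read the raw bytes of one tile record; returns 0 on success, -1 on error. */
int read_tile_bytes(FILE *fp, unsigned long index, unsigned char *buf)
{
    if (fseek(fp, tile_offset(index), SEEK_SET) != 0)
        return -1;
    return fread(buf, 1, TILE_SIZE, fp) == TILE_SIZE ? 0 : -1;
}

/* Overwrite one tile record in place; returns 0 on success, -1 on error. */
int write_tile_bytes(FILE *fp, unsigned long index, const unsigned char *buf)
{
    if (fseek(fp, tile_offset(index), SEEK_SET) != 0)
        return -1;
    return fwrite(buf, 1, TILE_SIZE, fp) == TILE_SIZE ? 0 : -1;
}
```

Only the requested record is touched, so the rest of the file never has to
be read. If the real file contains padding, the two size constants have to
be adjusted to whatever the writing program actually emitted.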

Another option would be to use the "Binary-types" package:

  http://ww.telent.net/cliki/Binary-types

ciao,
Jochen

--
http://www.dataheaven.de
From: Andy
Subject: Re: NEWBIE Question: How to handle binary files
Date: 
Message-ID: <3CF3837D.7D97312F@smi.de>
Jochen Schmidt wrote:
> You would not need that. You could calculate the sizes of both structs
> either by hand or automatically by using the FFI and the sizeof()
Is there a way to calculate the size of Lisp structures? (OK, that's
another newbie question ;-)
> 
> Another option would be to use the "Binary-types" package:
> 
>   http://ww.telent.net/cliki/Binary-types
> 
I just downloaded it. Thanks for the hint.
Best regards
AHz
From: Frode Vatvedt Fjeld
Subject: Re: NEWBIE Question: How to handle binary files
Date: 
Message-ID: <2hy9e44c83.fsf@vserver.cs.uit.no>
Andy <···@smi.de> writes:

> Is there a way to calculate the size of Lisp structures? (OK, that's
> another newbie question ;-)

No. The question indicates you don't quite understand the nature of
lisp types and values. What should be the size of the value "5"? Or
the value #s(foo :bar 5)? These questions make little sense.

However, the job of binary-types is to define a (two-way) mapping
between lisp values and some binary representation of fixed size. So,
continuing my previous example:

  (bt:sizeof 'header)

 => 13.  ; If I remember correctly, 3 ints plus 1 char.

But this says nothing about lisp instances of the class header, only
that (bt:read-binary 'header <stream>) will advance <stream>'s
file-position by 13.

PS: Please don't mail me copies of your follow-up articles.

-- 
Frode Vatvedt Fjeld
From: Andy
Subject: Re: NEWBIE Question: How to handle binary files
Date: 
Message-ID: <3CF38C48.6807AC09@smi.de>
Frode Vatvedt Fjeld wrote:
> 
> Andy <···@smi.de> writes:
> 
> > Is there a way to calculate the size of Lisp structures? (OK, that's
> > another newbie question ;-)
> 
> No. The question indicates you don't quite understand the nature of
> lisp types and values. What should be the size of the value "5"? Or
> the value #s(foo :bar 5)? These questions make little sense.
> 
> However, the job of binary-types is to define a (two-way) mapping
> between lisp values and some binary representation of fixed size. So,
> continuing my previous example:
> 
>   (bt:sizeof 'header)
> 
>  => 13.  ; If I remember correctly, 3 ints plus 1 char.
> 
> But this says nothing about lisp instances of the class header, only
> that (bt:read-binary 'header <stream>) will advance <stream>'s
> file-position by 13.
> 
Yes, I asked the wrong question. What I wanted to know is exactly what
you describe.

Best regards
AHz
From: Bulent Murtezaoglu
Subject: Re: NEWBIE Question: How to handle binary files
Date: 
Message-ID: <87znykw0yb.fsf@nkapi.internal>
>>>>> "Andy" == Andy  <···@smi.de> writes:

    Andy> Hi all, please help me with a problem with a file that was
    Andy> generated by a C program.  It starts with a header that was
    Andy> written with fwrite from a structure [...]

Note that fwrite will dump the struct as it is laid out in memory and in 
general C makes NO guarantees about laying out the structure members 
without padding or holes.  This varies from compiler to compiler, and 
even with the compiler switches used on the same processor/OS.  Now, I 
suspect you are aware of this and the way the example structs are defined 
seems likely to be safe.  I just wanted to make sure that this was pointed 
out.  There are ample examples on the net (start with the C faq and follow
pointers either to the paper literature or regoogle using new keywords) to
work around this.  
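
A quick way to see whether a given compiler pads the example struct is to
compare sizeof with the sum of the member sizes. The sketch below does that
for the tile struct from the original post; the function names are made up
for illustration, and the result is whatever the compiler at hand chooses.

```c
#include <stddef.h>

struct tile {
    unsigned x0;
    unsigned y0;
    unsigned x1;
    unsigned y1;
    char     label;   /* byte with tile info */
};

/* Sum of the member sizes: what one record would occupy with no padding. */
size_t tile_packed_size(void)
{
    return 4 * sizeof(unsigned) + sizeof(char);
}

/* What fwrite(&t, sizeof t, 1, fp) emits, trailing padding included. */
size_t tile_in_memory_size(void)
{
    return sizeof(struct tile);
}

/* Number of padding bytes this compiler added to the struct. */
size_t tile_padding(void)
{
    return tile_in_memory_size() - tile_packed_size();
}
```

With a 4-byte unsigned aligned on 4 bytes, a compiler will commonly report
20 bytes in memory against 17 packed, that is 3 bytes of trailing padding
after label. A nonzero tile_padding() means the file layout differs from the
naive sum of the field sizes.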

cheers,

BM
From: Andy
Subject: Re: NEWBIE Question: How to handle binary files
Date: 
Message-ID: <3CF3D91E.19777313@smi.de>
Bulent Murtezaoglu wrote:
> 
> Note that fwrite will dump the struct as it is laid out in memory and in
> general C makes NO guarantees about laying out the structure members
> without padding or holes.  This will vary from compiler to compiler and
> even with compiler switches used even on the same processor/OS.  
Oh, yes. I had that problem some time ago. However, the structures
are more
complex in practice and are written with a routine that prevents
alignment problems.

> Now, I suspect you are aware of this and the way the example structs are defined
> seems likely to be safe.
Four unsigneds and one char typically do not align cleanly. But this
will only
matter when you write more than one of them as an array with a single
fwrite, and then only
on machines that do not align to byte boundaries (wasn't MIPS one of
those? And the MC68000 does
dword alignment, if I remember right).

Thanks for the hint
Best regards
AHz
From: Johan Kullstam
Subject: Re: NEWBIE Question: How to handle binary files
Date: 
Message-ID: <m2lma3x24o.fsf@euler.axel.nom>
Andy <···@smi.de> writes:

> Bulent Murtezaoglu wrote:
> > 
> > Note that fwrite will dump the struct as it is laid out in memory and in
> > general C makes NO guarantees about laying out the structure members
> > without padding or holes.  This will vary from compiler to compiler and
> > even with compiler switches used even on the same processor/OS.  
> Oh, yes. I had that problem some time ago. However, the structures
> are more
> complex in practice and are written with a routine that prevents
> alignment problems.
> 
> > Now, I suspect you are aware of this and the way the example structs are defined
> > seems likely to be safe.
> Four unsigneds and one char typically do not align cleanly. But this
> will only
> matter when you write more than one of them as an array with a single
> fwrite, and then only
> on machines that do not align to byte boundaries (wasn't MIPS one of
> those? And the MC68000 does
> dword alignment, if I remember right).

since machines which can align on a byte are often faster if things
are on 32bit word boundaries, your C compiler may or may not have
padded the structure.  it depends upon the compiler and perhaps the
switches given to it.  this is just the original point of Bulent
Murtezaoglu.

and then you have the additional joy of "int" being 16, 32 or 64 or
perhaps some other number of bits.  and there is always the endianness
issue.
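
One common way around both problems on the C side is to fix the field width
and the byte order explicitly and move the bytes one at a time, instead of
dumping the struct with fwrite. The sketch below assumes 32-bit fields stored
least significant byte first; that byte order is an assumption, not something
stated about the file in this thread, and the helper names are made up.

```c
#include <stdint.h>

/* Store a 32-bit value as four bytes, least significant byte first. */
void put_u32_le(unsigned char *out, uint32_t v)
{
    out[0] = (unsigned char)(v & 0xFF);
    out[1] = (unsigned char)((v >> 8) & 0xFF);
    out[2] = (unsigned char)((v >> 16) & 0xFF);
    out[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Rebuild a 32-bit value from four bytes, least significant byte first. */
uint32_t get_u32_le(const unsigned char *in)
{
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```

Because the shifts work on values rather than on memory, the result is the
same on little-endian and big-endian hosts, and uint32_t removes the question
of how wide "int" happens to be.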

-- 
Johan KULLSTAM