Reading C float from binary file - how? (Using PCL binary data utils)

From: Frank Goenninger DG1SBG
Subject: Reading C float from binary file - how? (Using PCL binary data utils)
Date: Sat, 20 Oct 2007 17:24:33 +0000
Message-ID: <lzabqdwvfy.fsf@pcsde001.de.goenninger.net>

Hi all:

I am trying to read a complex data structure from a file that has been
written by a C program. I have created a small test program:

#include <stdio.h>
#include <stdlib.h>

int main ( int nArgc, char **ppcArgv )
{
  float fFloat = 0.0;

  char *pcFilename = ppcArgv[ 1 ];
  char *pcFloatStr = ppcArgv[ 2 ];

  FILE *hFile = (FILE *) NULL;

  hFile = fopen( pcFilename, "wb" );  /* "b": binary mode, matters on non-POSIX systems */

  if( hFile == (FILE *) NULL)
  {
    perror( "Cannot open file !" );
    exit( 1 );
  }

  fFloat = atof( pcFloatStr );
  fwrite( &fFloat, sizeof( fFloat ), 1, hFile );

  fclose( hFile );

  return 0;
}

Using it as:

./test-float test.bin 123.456

I generate a file having the following byte sequence (in Hex):

e9 79 42 f6

(This is a little-endian Mac with LispWorks 5 PE.)

I am trying to read this with the define-binary-type/class utils from
Peter Seibel's PCL ... So, how do I interpret the byte sequence? It
doesn't seem to be an IEEE single float format - I modeled this with
define-binary-type but no luck...

How are C floats encoded in binary? 

Thx for any pointers / hints ... !!

Cheers
   Frank

-- 

  Frank Goenninger

  frgo(at)mac(dot)com

  "Don't ask me! I haven't been reading comp.lang.lisp long enough to 
  really know ..."

From: Pillsy
Subject: Re: Reading C float from binary file - how? (Using PCL binary data utils)
Date: Sat, 20 Oct 2007 19:03:51 +0000
Message-ID: <1192907031.625439.222570@t8g2000prg.googlegroups.com>

On Oct 20, 1:24 pm, Frank Goenninger DG1SBG <·············@nomail.org>
wrote:
[...]
> I am trying to read this with the define-binary-type/class utils from
> Peter Seibel's PCL ... So, how do I interpret the byte sequence? It
> doesn't seem to be an IEEE single float format - I modeled this with
> define-binary-type but no luck...

It sure looks like an IEEE single float to me. More to the point, I
used the IEEE single float format[1] to write the following function,
and it passed my desultory testing, under both LispWorks and SBCL on
my Intel MacBook.

It's not integrated into PCL's spiffy binary format parsing set-up,
but it should do the trick. I have no idea as to its efficiency.

Cheers,
Pillsy

(defconstant +float-fraction-scale+ (expt (float 2) -23))

(defun encode-float (bits)
  "Takes a 32-bit unsigned integer holding the bits of an IEEE
single-precision float and returns the number it represents."
  (declare (type (unsigned-byte 32) bits))
  (let ((sign (ldb (byte 1 31) bits))
        (exponent (ldb (byte 8 23) bits))
        (fraction (ldb (byte 23 0) bits)))
    (case exponent
      (0 ; It's a zero or a denormalized float: no implicit leading 1.
       (* (expt -1 sign) (expt (float 2) -126)
          (* +float-fraction-scale+ fraction)))
      (255
       (if (zerop fraction)
           ;; Not sure what the CLy way to handle NaNs and Infinities
           ;; is, not tested anyway.
           'infinity
           'not-a-number))
      (otherwise
       (* (expt -1 sign)
          (expt (float 2) (- exponent 127))
          (+ 1 (* +float-fraction-scale+ fraction)))))))

[1] http://en.wikipedia.org/wiki/IEEE_floating-point_standard
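
For comparison with the C side of the question, here is a rough sketch of
the same field-by-field decoding in C. The name decode_float32 is made up
for this illustration, and the result is returned as a double so that
denormalized values survive the arithmetic. NaN payloads are ignored.

```c
#include <math.h>
#include <stdint.h>

/* Decode an IEEE 754 single-precision bit pattern by hand.
   Hypothetical helper, not part of any library. */
double decode_float32(uint32_t bits)
{
    int      sign     = (bits >> 31) & 0x1;   /* 1 bit */
    int      exponent = (bits >> 23) & 0xFF;  /* 8 bits, biased by 127 */
    uint32_t fraction = bits & 0x7FFFFF;      /* 23 bits */
    double   s        = sign ? -1.0 : 1.0;

    if (exponent == 255)                      /* infinity or NaN */
        return fraction ? NAN : s * INFINITY;
    if (exponent == 0)                        /* zero or denormalized: no hidden 1 */
        return s * ldexp((double)fraction, -126 - 23);
    /* normalized: (1 + fraction / 2^23) * 2^(exponent - 127) */
    return s * ldexp((double)(fraction | 0x800000), exponent - 127 - 23);
}
```

Calling decode_float32(0x42f6e979) should give roughly 123.456, since the
exponent field is 133 (that is, 2^6) and the fraction field is #x76e979.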

From: Frank Goenninger DG1SBG
Subject: Re: Reading C float from binary file - how? (Using PCL binary data utils)
Date: Sat, 20 Oct 2007 21:25:28 +0000
Message-ID: <lz8x5xiilz.fsf@pcsde001.de.goenninger.net>

Pillsy <·········@gmail.com> writes:

> On Oct 20, 1:24 pm, Frank Goenninger DG1SBG <·············@nomail.org>
> wrote:
> [...]
>> I am trying to read this with the define-binary-type/class utils from
>> Peter Seibel's PCL ... So, how do I interpret the byte sequence? It
>> doesn't seem to be an IEEE single float format - I modeled this with
>> define-binary-type but no luck...
>
> It sure looks like an IEEE single float to me. More to the point, I
> used the IEEE single float format[1] to write the following function,
> and it passed my desultory testing, under both LispWorks and SBCL on
> my Intel MacBook.
>
> It's not integrated into PCL's spiffy binary format parsing set-up,
> but it should do the trick. I have no idea as to its efficiency.
>
> Cheers,
> Pillsy
>
> [1] http://en.wikipedia.org/wiki/IEEE_floating-point_standard
>

Thx! Helped me find a bug (I had the wrong return value in my reader
function) and also figure out that the basic read-unsigned-integer
function did not read the bytes correctly from the file (wrong byte
order). I also found this web site:

http://www.rit.edu/~meseec/eecc250-winter99/IEEE-754.html

which also helped. 
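
The byte-order point can be sketched in C as follows. This is only an
illustration: float_from_le_bytes is a made-up name, and it assumes that
float is a 32-bit IEEE type whose byte order in memory matches that of
32-bit integers on the host.

```c
#include <stdint.h>
#include <string.h>

/* Assemble four bytes, stored least-significant byte first (as a
   little-endian machine writes them), into a float.
   Hypothetical helper for illustration. */
float float_from_le_bytes(const unsigned char b[4])
{
    uint32_t bits = (uint32_t)b[0]
                  | ((uint32_t)b[1] << 8)
                  | ((uint32_t)b[2] << 16)
                  | ((uint32_t)b[3] << 24);
    float f;
    memcpy(&f, &bits, sizeof f);  /* reinterpret the bits, no numeric conversion */
    return f;
}
```

Because the shifts spell out the byte order explicitly, the same function
gives the same answer whether the reading machine is big or little endian.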

Thanks again!

  Frank

From: George Neuner
Subject: Re: Reading C float from binary file - how? (Using PCL binary data utils)
Date: Sun, 21 Oct 2007 04:10:18 +0000
Message-ID: <a5flh3d8ur4i9dh3ip3l18ectgtr33tufc@4ax.com>

On Sat, 20 Oct 2007 23:25:28 +0200, Frank Goenninger DG1SBG
<·············@nomail.org> wrote:

>Pillsy <·········@gmail.com> writes:
>
>> On Oct 20, 1:24 pm, Frank Goenninger DG1SBG <·············@nomail.org>
>> wrote:
>> [...]
>>> I am trying to read this with the define-binary-type/class utils from
>>> Peter Seibel's PCL ... So, how do I interpret the byte sequence? It
>>> doesn't seem to be an IEEE single float format - I modeled this with
>>> define-binary-type but no luck...
>>
>> It sure looks like an IEEE single float to me. More to the point, I
>> used the IEEE single float format[1] to write the following function,
>> and it passed my desultory testing, under both LispWorks and SBCL on
>> my Intel MacBook.
>>
>> It's not integrated into PCL's spiffy binary format parsing set-up,
>> but it should do the trick. I have no idea as to its efficiency.
>>
>> Cheers,
>> Pillsy
>>
>> [1] http://en.wikipedia.org/wiki/IEEE_floating-point_standard
>>
>
>Thx! Helped me find a bug (I had the wrong return value in my reader
>function) and also figure out that the basic read-unsigned-integer
>function did not read the bytes correctly from the file (wrong byte
>order). I also found this web site:

This is not much help with an existing file, but for the future...


IEEE does not define any particular byte order so that CPU designers
are free to optimize their chips however they please.  Also, the
popular operating systems don't define any particular external data
representation - they just serialize bytes in order from memory.

It's best not to use binary in the first place, but if you must, the
only way to use it reliably across different CPUs and/or languages is
to agree on a standard external data representation.  You can design
your own, but it's better to use one of the established network
standards because there are libraries for them.

The network standards define a byte order for integer data and
transfer floating point data as an integer of corresponding bit
length.

Ex. Using TCP/IP protocol

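 /* uint32_t (from <stdint.h>) rather than long: long is 64 bits on
    many platforms, while htonl/ntohl work on 32-bit values. */
 union { uint32_t L; float F; } data;
 
 // write
 data.F = ...
 data.L = htonl(data.L);
 write( <fd>, &data, sizeof(data) );

 // read
 read( <fd>, &data, sizeof(data) );
 data.L = ntohl(data.L);
 ... = data.F;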

Some platforms also provide "ntohll" and "htonll" (long long) for
64-bit data, but they are not part of POSIX, so check your platform.
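
A self-contained version of the 32-bit case might look like the sketch
below. The names put_float_net and get_float_net are invented for this
example, it works on a memory buffer instead of a file descriptor, and it
assumes a POSIX system (for <arpa/inet.h>) with a 32-bit IEEE float.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */
#include <stdint.h>
#include <string.h>

/* Store a float into buf in network (big-endian) byte order.
   Hypothetical helper for illustration. */
void put_float_net(unsigned char buf[4], float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);  /* float bits as a 32-bit integer */
    bits = htonl(bits);                  /* host order -> network order */
    memcpy(buf, &bits, sizeof bits);
}

/* Recover a float that was stored with put_float_net. */
float get_float_net(const unsigned char buf[4])
{
    uint32_t bits;
    float value;
    memcpy(&bits, buf, sizeof bits);
    bits = ntohl(bits);                  /* network order -> host order */
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

With this, 123.456 always ends up in the buffer as 42 f6 e9 79, whatever
the byte order of the machine that wrote it.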

If you don't like raw TCP, binary data transfers are also defined by
COM, CORBA and others.


>http://www.rit.edu/~meseec/eecc250-winter99/IEEE-754.html
>
>which also helped. 
>
>Thanks again!
>
>  Frank

George
--
for email reply remove "/" from address

From: Michael Weber
Subject: Re: Reading C float from binary file - how? (Using PCL binary data utils)
Date: Sat, 20 Oct 2007 20:44:37 +0000
Message-ID: <1192913077.573462.251580@q5g2000prf.googlegroups.com>

On Oct 20, 7:24 pm, Frank Goenninger DG1SBG <·············@nomail.org>
wrote:
> e9 79 42 f6

I get (with x86):

% ccc 'float f = 123.456; fwrite(&f, sizeof f, 1, stdout);'|hexdump -C
00000000  79 e9 f6 42                                       |y###|
00000004

CL-USER> (ieee-floats:decode-float32 #x42f6e979)
123.456

http://common-lisp.net/project/ieee-floats/
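
To see where the #x42f6e979 above comes from, one can ask C for the bit
pattern directly. This is a small sketch with a made-up function name
(float_bits), assuming float is a 32-bit IEEE type:

```c
#include <stdint.h>
#include <string.h>

/* Return the 32-bit pattern of a float as an unsigned integer.
   Hypothetical helper for illustration. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);  /* copy the bits, no numeric conversion */
    return bits;
}
```

Printing float_bits(123.456f) with "%08x" should show 42f6e979. A
little-endian machine writes that integer to the file starting with its
least significant byte, which gives the 79 e9 f6 42 in the hexdump.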


Cheers,
Michael

From: Frank Goenninger DG1SBG
Subject: Re: Reading C float from binary file - how? (Using PCL binary data utils)
Date: Mon, 22 Oct 2007 10:27:32 +0000
Message-ID: <lz4pgjigvf.fsf@pcsde001.de.goenninger.net>

Michael Weber <·········@foldr.org> writes:

> On Oct 20, 7:24 pm, Frank Goenninger DG1SBG <·············@nomail.org>
> wrote:
>> e9 79 42 f6
>
> I get (with x86):
>
> % ccc 'float f = 123.456; fwrite(&f, sizeof f, 1, stdout);'|hexdump -C
> 00000000  79 e9 f6 42                                       |y###|
> 00000004
>
> CL-USER> (ieee-floats:decode-float32 #x42f6e979)
> 123.456
>
> http://common-lisp.net/project/ieee-floats/
>
>
> Cheers,
> Michael
>

Thx! Will try to integrate this into the define-binary-... mechanism.

Frank

From: Thomas F. Burdick
Subject: Re: Reading C float from binary file - how? (Using PCL binary data utils)
Date: Sun, 21 Oct 2007 10:40:26 +0000
Message-ID: <1192963226.385247.216970@i13g2000prf.googlegroups.com>

On Oct 20, 7:24 pm, Frank Goenninger DG1SBG <·············@nomail.org>
wrote:
> Hi all:
>
> I am trying to read a complex data structure from a file that has been
> written by a C program.

In general, there are two types of complex binary data structures that
C programs will write: carefully designed formats where the exact
layout of the bytes is defined; and more-or-less structured binary
dumps of how things are represented in memory.  In the former case,
the file will be byte-for-byte the same[*] whether it's written by a C
program on a PPC Mac, Alpha Linux, or x86 Windows.  For these files,
go ahead and read Peter Seibel's chapter in Practical Common Lisp, and
use the ieee-floats library.

However, I suspect you're dealing with the second case.  Really, this
isn't a proper file format, it's a window into the heap of a C
program.  In this case, I prefer to think of the problem as just one
case of interfacing to alien C data structures.  So use your
implementation's FFI facilities; they're designed exactly for this
purpose.  Using SBCL, it's generally a simple matter of defining a few
C structure types, then either mmapping the file or stack-allocating
the structures (with with-alien) and using low-level reads (eg, using
read(2) directly) to fill them in.  Or if you don't want to marshal
the bits between alien structures and more Lispy representations, heap-
allocated alien objects are pretty pleasant to use directly.

[*] That, or there will be an endianness flag at the start of the
file.  But in principle it's the same situation, the bytes are well
defined, you just get two options to make reading/writing faster in
the common case.
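
The endianness-flag idea from the footnote could be sketched in C roughly
like this. Everything here is hypothetical: the marker value ORDER_MARK,
the 8-byte record layout, and the function names are made up for the
example, and float is assumed to be a 32-bit IEEE type.

```c
#include <stdint.h>
#include <string.h>

#define ORDER_MARK 0x01020304u  /* hypothetical marker value */

static uint32_t swap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* Writer: store the marker and the float bits in native order. */
void write_record(unsigned char out[8], float value)
{
    uint32_t mark = ORDER_MARK, bits;
    memcpy(&bits, &value, sizeof bits);
    memcpy(out, &mark, 4);
    memcpy(out + 4, &bits, 4);
}

/* Reader: returns 0 on success, -1 if the marker is not recognized. */
int read_record(const unsigned char in[8], float *value)
{
    uint32_t mark, bits;
    memcpy(&mark, in, 4);
    memcpy(&bits, in + 4, 4);
    if (mark == swap32(ORDER_MARK))
        bits = swap32(bits);     /* written on an opposite-endian machine */
    else if (mark != ORDER_MARK)
        return -1;
    memcpy(value, &bits, sizeof *value);
    return 0;
}
```

The writer never swaps anything, so the common case of reading on the
same kind of machine costs nothing. Only a reader with the opposite byte
order pays for the swap.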

From: George Neuner
Subject: Re: Reading C float from binary file - how? (Using PCL binary data utils)
Date: Sun, 21 Oct 2007 18:15:10 +0000
Message-ID: <gu1nh3thau46qel7d74lqafeneqmnfi9g0@4ax.com>

On Sun, 21 Oct 2007 10:40:26 -0000, "Thomas F. Burdick"
<········@gmail.com> wrote:

>On Oct 20, 7:24 pm, Frank Goenninger DG1SBG <·············@nomail.org>
>wrote:
>> Hi all:
>>
>> I am trying to read a complex data structure from a file that has been
>> written by a C program.
>
>In general, there are two types of complex binary data structures that
>C programs will write: carefully designed formats where the exact
>layout of the bytes is defined; and more-or-less structured binary
>dumps of how things are represented in memory.  In the former case,
>the file will be byte-for-byte the same[*] whether it's written by a C
>program on a PPC Mac, Alpha Linux, or x86 Windows.  For these files,
>go ahead and read Peter Seibel's chapter in Practical Common Lisp, and
>use the ieee-floats library.

[*] you mean the first case.  

In the second case (memory dump) the data format will depend on the
chip: x86 is little endian, 68K is big endian, PPC and Alpha can be
either on a per process basis (some PPC models can do it on a per page
basis within a process).

But note that all of those chips conform to IEEE-754 bit formats
regardless of memory endianness.  There are chips that don't.  There
are even chips that use different endian formats for integer and
floating point.  See below.

>However, I suspect you're dealing with the second case.  Really, this
>isn't a proper file format, it's a window into the heap of a C
>program.  In this case, I prefer to think of the problem as just one
>case of interfacing to alien C data structures.  So use your
>implementation's FFI facilities; they're designed exactly for this
>purpose.  Using SBCL, it's generally a simple matter of defining a few
>C structure types, then either mmapping the file or stack-allocating
>the structures (with with-alien) and using low-level reads (eg, using
>read(2) directly) to fill them in.  Or if you don't want to marshal
>the bits between alien structures and more Lispy representations, heap-
>allocated alien objects are pretty pleasant to use directly.
>
>[*] That, or there will be an endianness flag at the start of the
>file.  But in principle it's the same situation, the bytes are well
>defined, you just get two options to make reading/writing faster in
>the common case.

I once had the misfortune to work with a DSP that had 40-bit floats
and 32-bit integers with opposite endianness (and 48-bit instructions).
Banked 16-bit memory accessible to the DSP as 16/32/40/48, but to the
host computer only as 16/32.  The DSP app was HRT and couldn't afford
much in the way of comm overhead on board, but required 2-way real
time data transfers with the host.  So packed binary data in DSP
native format DMA'd in and out.  It was an experience I don't like to
think too much about.

George
--
for email reply remove "/" from address