Rainer Joswig wrote:
> Hi,
>
> is there a way to determine the byte order
> of the underlying 'machine'? Portable?
>
> Regards,
>
> Rainer Joswig
>
PythonWin 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit
(Intel)] on win32.
Portions Copyright 1994-2006 Mark Hammond - see 'Help/About PythonWin'
for further copyright information.
>>> import sys
>>> sys.byteorder
'little'
>>>
Just read the source and port it to Lisp;)
On Jan 7, 6:09 pm, Pekka Niiranen <··············@pp5.inet.fi> wrote:
> >>> import sys
> >>> sys.byteorder
> 'little'
> >>>
>
> Just read the source and port it to Lisp;)
You are quite right that many libraries are lacking for
Lisp.
Thanks for the reminder, though.
Rainer Joswig <······@lisp.de> writes:
> is there a way to determine the byte order
> of the underlying 'machine'? Portable?
You can obtain the answer to a related question by writing 1 to a
file using an (UNSIGNED-BYTE 16) stream and then reading from the
same file using an (UNSIGNED-BYTE 8) stream. Yes, behavior still
depends on the implementation, so it only gives a partial answer.
(In practice, it is highly probable that all implementations will
work as we'd like them to.)
A trivial first draft (run on x86):
* (defun little-endian-p ()
    (with-open-file (s "ub16" :direction :output :element-type '(unsigned-byte 16))
      (print (stream-element-type s) *trace-output*) ;implementation-dependent [*]
      (write-byte 1 s))
    (with-open-file (s "ub16" :direction :input :element-type '(unsigned-byte 8))
      (print (stream-element-type s) *trace-output*) ;implementation-dependent
      (= 1 (read-byte s))))
little-endian-p
* (little-endian-p)
(unsigned-byte 16)
(unsigned-byte 8)
t
$ od -t x1 ub16
0000000 01 00
0000002
[*] See the last paragraph of the "Exceptional Situations"
subsection of OPEN's specification.
* * *
I am tempted to think that the following is portable in C (it
compiles cleanly with ``gcc -ansi -pedantic'', but that isn't
everything, of course):
==> little-endian-p.c <==
int main () {
int i;
i = 1;
return (*(char *)&i == 0);
}
but I am also tempted to think that I might be missing some
provision of the standard.
Implementing a test that can also detect PDP-endianness is left as
an exercise...
---Vassil.
--
Bound variables, free programmers.
In article <···············@luna.vassil.nikolov.name>,
Vassil Nikolov <···············@pobox.com> wrote:
> Rainer Joswig <······@lisp.de> writes:
> > is there a way to determine the byte order
> > of the underlying 'machine'? Portable?
>
> You can obtain the answer to a related question by writing 1 to a
> file using an (UNSIGNED-BYTE 16) stream and then reading from the
> same file using an (UNSIGNED-BYTE 8) stream. Yes, behavior still
> depends on the implementation, so it only gives a partial answer.
> (In practice, it is highly probable that all implementations will
> work as we'd like them to.)
Thanks for the answer!
> A trivial first draft (run on x86):
>
> * (defun little-endian-p ()
> (with-open-file (s "ub16" :direction :output :element-type '(unsigned-byte 16))
> (print (stream-element-type s) *trace-output*) ;implementation-dependent [*]
> (write-byte 1 s))
> (with-open-file (s "ub16" :direction :input :element-type '(unsigned-byte 8))
> (print (stream-element-type s) *trace-output*) ;implementation-dependent
> (= 1 (read-byte s))))
>
> little-endian-p
> * (little-endian-p)
>
> (unsigned-byte 16)
> (unsigned-byte 8)
> t
That looked good. Then I ran it on a Lisp Machine:
Command: (little-endian-p)
(UNSIGNED-BYTE 16)
Error: File was created with byte size 16; it may not be opened with byte size 8.
For RJNXP:>joswig>ub16.lisp.1
LMFS:OPEN-LOCAL-LMFS-1
Arg 0: #P"RJNXP:>joswig>ub16.lisp.newest"
Arg 1 (LMFS:LOGPATH): #P"RJNXP:>joswig>ub16.lisp.newest"
Arg 2 (LMFS:ACCESS-PATH): #<FS:LOCAL-LMFS-ACCESS-PATH RJNXP using LOCAL-FILE 2050271547>
Rest Arg (LMFS:OPTIONS): (:DIRECTION :INPUT :ELEMENT-TYPE (UNSIGNED-BYTE 8) ...)
s-A, <Resume>: Retry OPEN of RJNXP:>joswig>ub16.lisp.newest
s-B: Retry OPEN using a different pathname
s-C, <Abort>: Return to Lisp Top Level in a TELNET server
s-D: Restart process TELNET terminal
->
Sigh. It will be hardcoded, then.
Btw., it is a little-endian machine, too.
I tried another variant: use an ARRAY of 32 bits and another one
of 4 times 8 bits displaced to it. Then compile the access code
with SAFETY 0 to get rid of the runtime check. That does not
work either on the platforms I'd be interested in...
More embarrassing, the software I was looking at (not written by me, I swear),
named it wrong - %big-endian was really %little-endian.
>
> $ od -t x1 ub16
> 0000000 01 00
> 0000002
>
> [*] See the last paragraph of the "Exceptional Situations"
> subsection of OPEN's specification.
>
> * * *
>
> I am tempted to think that the following is portable in C (it
> compiles cleanly with ``gcc -ansi -pedantic'', but that isn't
> everything, of course):
>
> ==> little-endian-p.c <==
>
> int main () {
> int i;
> i = 1;
> return (*(char *)&i == 0);
> }
>
> but I am also tempted to think that I might be missing some
> provision of the standard.
>
>
> Implementing a test that can also detect PDP-endianness is left as
> an exercise...
>
> ---Vassil.
--
http://lispm.dyndns.org/
Rainer Joswig <······@lisp.de> writes:
> In article <···············@luna.vassil.nikolov.name>,
> Vassil Nikolov <···············@pobox.com> wrote:
> ...
>> (In practice, it is highly probable that all implementations will
>> work as we'd like them to.)
> ...
>> (defun little-endian-p ()
>> (with-open-file (s "ub16" :direction :output :element-type '(unsigned-byte 16))
>> (print (stream-element-type s) *trace-output*) ;implementation-dependent [*]
>> (write-byte 1 s))
>> (with-open-file (s "ub16" :direction :input :element-type '(unsigned-byte 8))
>> (print (stream-element-type s) *trace-output*) ;implementation-dependent
>> (= 1 (read-byte s))))
>> ...
> That looked good. Then I ran it on a Lisp Machine:
>
> Command: (little-endian-p)
> (UNSIGNED-BYTE 16)
> Error: File was created with byte size 16; it may not be opened with byte size 8.
> For RJNXP:>joswig>ub16.lisp.1
Good point. I was in a unix state of mind. "Highly probable" it
isn't.
If there is no way to determine endianness programmatically [*]
_within_ a Lisp Machine, perhaps it can be done extramurally, as it
were, by writing an (UNSIGNED-BYTE 16) value to a network
destination and then getting an (UNSIGNED-BYTE 8) value echoed?
(Ignoring how impractical this is for the sake of the exercise.)
[*] At the Common Lisp level; I suppose there are low-level
facilities that allow one to examine "raw" memory contents.
By the way, perhaps it will help to know why exactly you need to
detect endianness. A specific reason might hint at a specific
solution.
---Vassil.
--
Bound variables, free programmers.
On 2008-01-08 01:47:38 +0000, Vassil Nikolov <···············@pobox.com> said:
> By the way, perhaps it will help to know why exactly you need to
> detect endianness. A specific reason might hint at a specific
> solution.
Yes, this begs the question :) If it is nigh impossible to
determine endianness, what use could this information have
apart from display purposes ("Your machine is little-endian")?
The only case that springs to mind is interoperability at the
file format level, where you'd have to write "by hand" 16-, 32-
or 64-bit values in the same way the _host_ does. But then
that file format ought to be specified at the outset, and
you would know which endianness to use regardless of the host's,
no?
--
JFB
Vassil Nikolov <···············@pobox.com> writes:
> Rainer Joswig <······@lisp.de> writes:
>> is there a way to determine the byte order
>> of the underlying 'machine'? Portable?
>
> You can obtain the answer to a related question by writing 1 to a
> file using an (UNSIGNED-BYTE 16) stream and then reading from the
> same file using an (UNSIGNED-BYTE 8) stream. Yes, behavior still
> depends on the implementation, so it only gives a partial answer.
> (In practice, it is highly probable that all implementations will
> work as we'd like them to.)
Not portably. Clisp always writes its files in little-endian, to make
its files portable from one platform to another.
--
__Pascal Bourguignon__ http://www.informatimago.com/
"Our users will know fear and cower before our software! Ship it!
Ship it and let them flee like the dogs they are!"
Rainer Joswig <······@lisp.de> writes:
> is there a way to determine the byte order
> of the underlying 'machine'? Portable?
Not in Common Lisp.
Not in standard C either.
In clisp, when you have clx: #+CLX-LITTLE-ENDIAN / #+CLX-BIG-ENDIAN
In gcc on normal 32-bit platforms:
#include <stdio.h>
int main(void){
    union {
        int i;
        char c[sizeof(int)];
    } v;
    v.i=0x01020304;
    /* v.c[0] is the byte at the lowest address */
    if((v.c[0]==0x01)&&(v.c[1]==0x02)&&(v.c[2]==0x03)&&(v.c[3]==0x04)){
        printf("big endian\n");
    }else if((v.c[3]==0x01)&&(v.c[2]==0x02)&&(v.c[1]==0x03)&&(v.c[0]==0x04)){
        printf("little endian\n");
    }else if((v.c[1]==0x01)&&(v.c[0]==0x02)&&(v.c[3]==0x03)&&(v.c[2]==0x04)){
        printf("pdp endian\n"); /* IIRC */
    }else{
        printf("other endian\n");
    }
    return(0);
}
You could do the same with some FFI in lisps having a FFI, or similar
in lisp providing a low-level access to the memory.
--
__Pascal Bourguignon__ http://www.informatimago.com/
Until real software engineering is developed, the next best practice
is to develop with a dynamic system that has extreme late binding in
all aspects. The first system to really do this in an important way
is Lisp. -- Alan Kay
On Jan 7, 2:02 pm, Pascal Bourguignon <····@informatimago.com> wrote:
> Rainer Joswig <······@lisp.de> writes:
> > is there a way to determine the byte order
> > of the underlying 'machine'? Portable?
>
> Not in Common Lisp.
>
> Not in standard C either.
You're joking, right?
// returns 1 if little endian, 0 if big
int am_i_little_endian() {
int x = 0x12345678;
char* b = (char*)&x;
return *b == (x & 0xFF);
}
Jeff M.
"Jeff M." <·······@gmail.com> writes:
> On Jan 7, 2:02 pm, Pascal Bourguignon <····@informatimago.com> wrote:
>> Rainer Joswig <······@lisp.de> writes:
>> > is there a way to determine the byte order
>> > of the underlying 'machine'? Portable?
>>
>> Not in Common Lisp.
>>
>> Not in standard C either.
>
> You're joking, right?
>
> // returns 1 if little endian, 0 if big
> int am_i_little_endian() {
> int x = 0x12345678;
> char* b = (char*)&x;
> return *b == (x & 0xFF);
> }
Are you *absolutely* sure that's standard C, and not just
something that happens to work most of the time?
On Jan 8, 2:37 pm, Raymond Wiker <····@RawMBP.local> wrote:
> "Jeff M." <·······@gmail.com> writes:
> > On Jan 7, 2:02 pm, Pascal Bourguignon <····@informatimago.com> wrote:
> >> Rainer Joswig <······@lisp.de> writes:
> >> > is there a way to determine the byte order
> >> > of the underlying 'machine'? Portable?
>
> >> Not in Common Lisp.
>
> >> Not in standard C either.
>
> > You're joking, right?
>
> > // returns 1 if little endian, 0 if big
> > int am_i_little_endian() {
> > int x = 0x12345678;
> > char* b = (char*)&x;
> > return *b == (x & 0xFF);
> > }
>
> Are you *absolutely* sure that's standard C, and not just
> something that happens to work most of the time?
Tell ya what, I'm more than willing to admit being wrong - I've had to
do it many times over the course of my life (and I'm sure I have many
more ahead of me). :-)
But if you can give me an example where it doesn't work, that's
infinitely more useful (to me and the OP) than just questioning
whether or not it works. I've used the above code many, many times in
production code and on numerous platforms. I have yet to see it not
work. Does that make me *absolutely* sure? No, but let's just say...
confident. :-)
Jeff M.
On Tue, 08 Jan 2008 13:23:50 -0800, Jeff M. wrote:
> Tell ya what, I'm more than willing to admit being wrong - I've had to
> do it many times over the course of my life (and I'm sure I have many
> more ahead of me).
>
> But if you can give me an example where it doesn't work, that's
> infinitely more useful (to me and the OP) than just questioning whether
> or not it works. I've used the above code many, many times in production
> code and on numerous platforms. I have yet to see it not work. Does that
> make me *absolutely* sure? No, but let's just say... confident.
Why do you use the value 0x12345678? Wouldn't the value 1 be sufficient
and more readable? I might be missing something, but:
int am_i_little_endian()
{
int x = 1;
char * b = (char*)&x;
return *b==1;
}
Didn't actually try the above though :-)
--
Sohail Somani
http://uint32t.blogspot.com
"Jeff M." <·······@gmail.com> writes:
> On Jan 8, 2:37 pm, Raymond Wiker <····@RawMBP.local> wrote:
>> "Jeff M." <·······@gmail.com> writes:
>> > On Jan 7, 2:02 pm, Pascal Bourguignon <····@informatimago.com> wrote:
>> >> Rainer Joswig <······@lisp.de> writes:
>> >> > is there a way to determine the byte order
>> >> > of the underlying 'machine'? Portable?
>>
>> >> Not in Common Lisp.
>>
>> >> Not in standard C either.
>>
>> > You're joking, right?
>>
>> > // returns 1 if little endian, 0 if big
>> > int am_i_little_endian() {
>> > int x = 0x12345678;
>> > char* b = (char*)&x;
>> > return *b == (x & 0xFF);
>> > }
>>
>> Are you *absolutely* sure that's standard C, and not just
>> something that happens to work most of the time?
>
> Tell ya what, I'm more than willing to admit being wrong - I've had to
> do it many times over the course of my life (and I'm sure I have many
> more ahead of me). :-)
>
> But if you can give me an example where it doesn't work, that's
> infinitely more useful (to me and the OP) than just questioning
> whether or not it works. I've used the above code many, many times in
> production code and on numerous platforms. I have yet to see it not
> work. Does that make me *absolutely* sure? No, but let's just say...
> confident. :-)
Well, IIRC, there's nothing in the C standard that prevents an
implementation from boxing the integers.
Assume that 0x12345678 is stored as:
+--+--+--+--+--+
|78|12|34|56|78|
+--+--+--+--+--+
(for example, there's a type tag 0x7 for integer, a size 0x4<<1 and a
bit = 0 for some garbage collector or anything else). Then your
function will return true, when obviously the bytes are really stored
in big endian, with boxing.
Now, if you restrict yourself to implementations running on 32-bit
machines with 8-bit char with unboxed two's-complement integers, ok, it
will work. You're just lucky this kind of machine represents 99.999%
of the chips sold. (In the above example, we'd have sizeof(int)==5,
assuming an 8-bit char).
--
__Pascal Bourguignon__ http://www.informatimago.com/
HEALTH WARNING: Care should be taken when lifting this product,
since its mass, and thus its weight, is dependent on its velocity
relative to the user.
On Tue, 08 Jan 2008 22:48:16 +0100, Pascal Bourguignon
<···@informatimago.com> wrote:
>"Jeff M." <·······@gmail.com> writes:
>
>> On Jan 8, 2:37 pm, Raymond Wiker <····@RawMBP.local> wrote:
>>> "Jeff M." <·······@gmail.com> writes:
>>> > On Jan 7, 2:02 pm, Pascal Bourguignon <····@informatimago.com> wrote:
>>> >> Rainer Joswig <······@lisp.de> writes:
>>> >> > is there a way to determine the byte order
>>> >> > of the underlying 'machine'? Portable?
>>>
>>> >> Not in Common Lisp.
>>>
>>> >> Not in standard C either.
>>>
>>> > You're joking, right?
>>>
>>> > // returns 1 if little endian, 0 if big
>>> > int am_i_little_endian() {
>>> > int x = 0x12345678;
>>> > char* b = (char*)&x;
>>> > return *b == (x & 0xFF);
>>> > }
>>>
>>> Are you *absolutely* sure that's standard C, and not just
>>> something that happens to work most of the time?
>>
>> Tell ya what, I'm more than willing to admit being wrong - I've had to
>> do it many times over the course of my life (and I'm sure I have many
>> more ahead of me). :-)
>>
>> But if you can give me an example where it doesn't work, that's
>> infinitely more useful (to me and the OP) than just questioning
>> whether or not it works. I've used the above code many, many times in
>> production code and on numerous platforms. I have yet to see it not
>> work. Does that make me *absolutely* sure? No, but let's just say...
>> confident. :-)
>
>Well, IIRC, there's nothing in the C standard that prevents an
>implementation from boxing the integers.
That's technically true but see below. The C standard doesn't say a
lot of things ... for example, it does not say you can't implement
integers as lines of dancing elephants.
>Assume that 0x12345678 is stored as:
>+--+--+--+--+--+
>|78|12|34|56|78|
>+--+--+--+--+--+
>(for example, there's a type tag 0x7 for integer, a size 0x4<<1 and a
>bit = 0 for some garbage collector or anything else). Then your
>function will return true, when obviously the bytes are really stored
>in big endian, with boxing.
>
>Now, if you restrict yourself to implementations running on 32-bit
>machines with 8-bit char with unboxed two's-complement integers, ok, it
>will work. You're just lucky this kind of machine represents 99.999%
>of the chips sold. (In the above example, we'd have sizeof(int)==5,
>assuming an 8-bit char).
The C standard is somewhat less than consistent and hard to reason
about ... particularly where pointers, type conversions and characters
are concerned. The whole notion of characters and strings in C was an
afterthought. C89/C90 is the least restrictive of the existing
standards, but even there I think your example would be
non-conforming.
First, C89 does seem to require 2's complement format - it's not
stated anywhere but the need is implicit in the integer promotion
rules ($3.2.1.2).
As for your tagged integer:
C89 specifies that sizeof() returns # of bytes and sizeof(char) == 1
($3.3.3.4). So char == byte. Though the bit-width of a byte is left
implementation dependent, this means that chars are atomic values
which could not have a separate tag.
The pointer casting rules ($3.3.4) specify that "It is guaranteed that
a pointer to an object of a given alignment may be converted to a
pointer to an object of the same alignment or a less strict alignment
and back again; the result shall compare equal to the original
pointer". char* is the least restrictive pointer type and so the int*
cast to a char* would not change the address.
char is defined to be an integer type ($3.2.1.5,6). The section on
integral type conversions ($3.2.1.2) says "when an integer is demoted
to an unsigned integer with smaller size, the result is the
nonnegative remainder on division by the number one greater than the
largest unsigned number that can be represented in the type with
smaller size. When an integer is demoted to a signed integer with
smaller size, or an unsigned integer is converted to its corresponding
signed integer, if the value cannot be represented the result is
implementation-defined."
Finally, the assignment operator ($3.3.16.1) specifies that "if the
value being stored in an object is accessed from another object that
overlaps in any way the storage of the first object, then the overlap
shall be exact". Thus the value represented by the bits of the
integer that are overlapped by the char must be preserved. Since a
char is an integer type, that value will be an integer.
Though nothing explicitly says an integer can't have a tag, it seems
to me that the char* cast from the integer's address could not legally
point to the integer's tag, but could only point to the integer
itself.
YMMV.
George
--
for email reply remove "/" from address
"Jeff M." <·······@gmail.com> writes:
> On Jan 8, 2:37 pm, Raymond Wiker <····@RawMBP.local> wrote:
>> "Jeff M." <·······@gmail.com> writes:
>> > On Jan 7, 2:02 pm, Pascal Bourguignon <····@informatimago.com> wrote:
>> >> Rainer Joswig <······@lisp.de> writes:
>> >> > is there a way to determine the byte order
>> >> > of the underlying 'machine'? Portable?
>>
>> >> Not in Common Lisp.
>>
>> >> Not in standard C either.
>>
>> > You're joking, right?
>>
>> > // returns 1 if little endian, 0 if big
>> > int am_i_little_endian() {
>> > int x = 0x12345678;
>> > char* b = (char*)&x;
>> > return *b == (x & 0xFF);
>> > }
>>
>> Are you *absolutely* sure that's standard C, and not just
>> something that happens to work most of the time?
>
> Tell ya what, I'm more than willing to admit being wrong - I've had to
> do it many times over the course of my life (and I'm sure I have many
> more ahead of me). :-)
Oh, I'm not saying that it won't work, and I've done something
similar myself - but I'm not absolutely certain that it is guaranteed
to work. Elsewhere in this thread, Maciej Katafiasz quoted some
passages from the standard that may be a sufficiently strong
guarantee, though.
> But if you can give me an example where it doesn't work, that's
> infinitely more useful (to me and the OP) than just questioning
> whether or not it works. I've used the above code many, many times in
> production code and on numerous platforms. I have yet to see it not
> work. Does that make me *absolutely* sure? No, but let's just say...
> confident. :-)
On Jan 8, 9:37 pm, Raymond Wiker <····@RawMBP.local> wrote:
> >> Not in Common Lisp.
>
> >> Not in standard C either.
>
> > You're joking, right?
>
> > // returns 1 if little endian, 0 if big
> > int am_i_little_endian() {
> > int x = 0x12345678;
> > char* b = (char*)&x;
> > return *b == (x & 0xFF);
> > }
>
> Are you *absolutely* sure that's standard C, and not just
> something that happens to work most of the time?
§6.2.6.1.4 of ISO/IEC 9899:1999 provides explicit provisions for
casting objects to arrays of bytes, and guarantees that such an array
will actually correspond to the in-memory representation.
The standard also gives the possibility of integer types containing
padding bits, and it is not specified how such bits can be detected.
It is, however, specified that value bits must be contiguous, and that
char is the smallest integer type. Therefore, assuming 8-bit bytes, the
above code is bound to be portable and to reliably detect
little-endianness[1] of the host machine.
Cheers,
Maciej
[1] Little-endianness understood as "least significant value byte
comes first". An implementation using little-endian value
representation with padding bits coming before the value bits wouldn't
be reliably detected, but then, any such padding-using implementation
would need to be treated as another type of endianness anyway by any
code that cares about endianness, so it's not really an issue.
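A minimal sketch of the approach described above, assuming 8-bit bytes and a 4-byte `uint32_t` (both are checked at run time). The names `byte_order` and `detect_byte_order` are made up for illustration and come from no library. It copies the object representation into an `unsigned char` array with `memcpy` and compares it with the three layouts mentioned in this thread:

```c
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names, invented for this sketch. */
enum byte_order { ORDER_LITTLE, ORDER_BIG, ORDER_PDP, ORDER_UNKNOWN };

enum byte_order detect_byte_order(void) {
    /* Expected in-memory layouts of 0x01020304, lowest address first. */
    static const unsigned char little[4] = { 0x04, 0x03, 0x02, 0x01 };
    static const unsigned char big[4]    = { 0x01, 0x02, 0x03, 0x04 };
    static const unsigned char pdp[4]    = { 0x02, 0x01, 0x04, 0x03 };
    uint32_t probe = 0x01020304u;
    unsigned char bytes[sizeof probe];

    /* Assumption: 8-bit bytes and exactly four of them per uint32_t. */
    if (CHAR_BIT != 8 || sizeof probe != 4)
        return ORDER_UNKNOWN;

    /* Copy the object representation into an unsigned char array. */
    memcpy(bytes, &probe, sizeof probe);

    if (memcmp(bytes, little, 4) == 0) return ORDER_LITTLE;
    if (memcmp(bytes, big, 4) == 0)    return ORDER_BIG;
    if (memcmp(bytes, pdp, 4) == 0)    return ORDER_PDP;
    return ORDER_UNKNOWN;
}
```

Anything that matches none of the three patterns is reported as unknown rather than guessed, which seems the safer choice given the padding-bit caveat in the footnote.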
Maciej Katafiasz <········@gmail.com> writes:
> ...
> §6.2.6.1.4 of ISO/IEC 9899:1999 provides explicit provisions for
> casting objects to arrays of bytes, and guarantees that such an array
> will actually correspond to the in-memory representation.
>
> The standard also gives the possibility of integer types containing
> padding bits, and it is not specified how such bits can be detected.
> It is, however, specified that value bits must be contiguous, and that
> char is the smallest integer type. Therefore, assuming 8-bit bytes, the
> above code is bound to be portable and to reliably detect
> little-endianness[1] of the host machine.
But it seems that---at least in theory---that would fail to
distinguish little-endian from any-kind-of-endian where there are
(at least) eight padding bits at the low address and they happen to
have the same value as the eight least significant bits of the
integer value...
In any case, this is interview "anti-question" material if nothing
else...
Thanks for the precise reference to the standard.
---Vassil.
--
Bound variables, free programmers.
>>>>> "Maciej" == Maciej Katafiasz <········@gmail.com> writes:
Maciej> On Jan 8, 9:37 pm, Raymond Wiker <····@RawMBP.local> wrote:
>> >> Not in Common Lisp.
>>
>> >> Not in standard C either.
>>
>> > You're joking, right?
>>
>> > // returns 1 if little endian, 0 if big
>> > int am_i_little_endian() {
>> >     int x = 0x12345678;
>> >     char* b = (char*)&x;
>> >     return *b == (x & 0xFF);
>> > }
>>
>>         Are you *absolutely* sure that's standard C, and not just
>> something that happens to work most of the time?
Maciej> §6.2.6.1.4 of ISO/IEC 9899:1999 provides explicit provisions for
Maciej> casting objects to arrays of bytes, and guarantees that such an array
Maciej> will actually correspond to the in-memory representation.
Maciej> The standard also gives the possibility of integer types containing
Maciej> padding bits, and it is not specified how such bits can be detected.
Maciej> It is, however, specified that value bits must be contiguous, and that
Maciej> char is the smallest integer type. Therefore, assuming 8-bit bytes, the
Maciej> above code is bound to be portable and to reliably detect
Maciej> little-endianness[1] of the host machine.
Neat.
Off-topic, but perhaps interesting....
I remember, long ago, that I once used a Harris H800 (?) "super-mini
computer". A rather interesting machine. It had 24-bit words, with
word addressable memory.
Pointers to strings were interesting. Because the machines were only
word addressable, characters were packed 3 per word, a pointer had to
be able to point to one of the 3 characters in a word. I forget the
exact details, but I think 2 bits out of a word were used to indicate
which of the 3 characters was being used. Of course, this meant you
couldn't address all of the possible address space available because
you used 2 of the 24 bits for other things. But pointers to words or
floats didn't have this, so casting pointers of these types to each
other caused problems.
And long ints were also weird. You'd think that 2 24-bit words were
used for a long int. That's right. But there was some kind of gap in
the middle so the expected 48 bit long ints didn't have 48 bits of
data in them. Normally you couldn't see this, but if you played some
games with shifting and oring/anding, you could actually see the gap
and create weird stuff.
I guess if the H800 were still alive (is it?) it probably
wouldn't be able to support an ISO/IEC 9899:1999 compiler.
I certainly learned a lot about writing portable code back then, since
my code ran on H800 and pc's and some Sun workstations. :-)
Ray
On Tue, 08 Jan 2008 21:37:53 +0100, Raymond Wiker <···@RawMBP.local>
wrote:
>"Jeff M." <·······@gmail.com> writes:
>
>> On Jan 7, 2:02 pm, Pascal Bourguignon <····@informatimago.com> wrote:
>>> Rainer Joswig <······@lisp.de> writes:
>>> > is there a way to determine the byte order
>>> > of the underlying 'machine'? Portable?
>>>
>>> Not in Common Lisp.
>>>
>>> Not in standard C either.
>>
>> You're joking, right?
>>
>> // returns 1 if little endian, 0 if big
>> int am_i_little_endian() {
>> int x = 0x12345678;
>> char* b = (char*)&x;
>> return *b == (x & 0xFF);
>> }
>
> Are you *absolutely* sure that's standard C, and not just
>something that happens to work most of the time?
The C standard doesn't specify whether chars are signed or unsigned so
this code might produce a compiler warning on ==, but it is guaranteed
to work: (x & 0xFF) = 0x78, *b = 0x78 if little endian or *b = 0x12 if
big endian.
It is definitely not as straightforward as Pascal Bourguignon's
solution using unions (though Pascal's tests were overkill).
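To make the signedness point concrete, here is a small sketch, assuming 8-bit chars and a 32-bit int. It is Jeff M.'s comparison with the probe value turned into a parameter; both function names are invented for illustration. The only difference between the two functions is whether the first byte is read through plain `char` or through `unsigned char`:

```c
/* Jeff M.'s comparison, reading the first byte through plain char.
   If plain char is signed, a byte >= 0x80 is read as a negative value. */
int first_byte_is_low_byte_plain(int x) {
    char *b = (char *)&x;
    return *b == (x & 0xFF);
}

/* Same comparison through unsigned char, so the byte is never negative. */
int first_byte_is_low_byte_unsigned(int x) {
    unsigned char *b = (unsigned char *)&x;
    return *b == (x & 0xFF);
}
```

With 0x12345678 both variants should agree, since 0x78 fits in a signed char. With a probe such as 0x123456F8, I would expect the plain-char variant to report "not little endian" on a little-endian machine whose plain char is signed, because the byte is read as -8 and compared with 248. So the choice of probe value matters unless `unsigned char` is used.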
George
--
for email reply remove "/" from address
> is there a way to determine the byte order
> of the underlying 'machine'? Portable?
>
Easiest solution would be to compare the result of
(machine-type)
with the contents of a previously made assoc. list.
Most probably a single pairings list could be constructed that would
work in every implementation.
In article
<····································@m77g2000hsc.googlegroups.com>,
·······@eurogaran.com wrote:
> > is there a way to determine the byte order
> > of the underlying 'machine'? Portable?
> >
>
> Easiest solution would be to compare the result of
> (machine-type)
> with the contents of a previously made assoc. list.
That would not be enough. On some processors
two different operating systems (or even programs)
may use different byte-orders.
>
> Most probably a single pairings list could be constructed that would
> work in every implementation.
--
http://lispm.dyndns.org/
> > Easiest solution would be to compare the result of
> > (machine-type)
> > with the contents of a previously made assoc. list.
>
> That would not be enough. On some processors
> two different operating systems (or even programs)
> may use different byte-orders.
I didn't know that. Could you give some example cases?
On Tue, 8 Jan 2008 02:47:08 -0800 (PST), ·······@eurogaran.com wrote:
>> > Easiest solution would be to compare the result of
>> > (machine-type)
>> > with the contents of a previously made assoc. list.
>>
>> That would not be enough. On some processors
>> two different operating systems (or even programs)
>> may use different byte-orders.
>
>I didn't know that. Could you give some example cases?
Just an addition to Rainer Joswig's longer response ... some of the
PowerPCs support changing byte order on a per VM page basis, so even
within a single process you could have your data either way. It's
quite useful for (un)marshalling data in a heterogeneous environment.
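Where no such hardware mode is available, (un)marshalling comes down to swapping bytes in software. A rough sketch, assuming C99's `uint32_t`; the name `swap_bytes32` is made up for illustration:

```c
#include <stdint.h>

/* Reverse the order of the four bytes in a 32-bit value. */
uint32_t swap_bytes32(uint32_t v) {
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}
```

Applying it twice gives back the original value, so the same function serves for both directions.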
George
btw: please make sure you attribute comments from other posters. It's
very hard to follow the conversation when you don't know who wrote what.
--
for email reply remove "/" from address
In article
<····································@v46g2000hsv.googlegroups.com>,
·······@eurogaran.com wrote:
> > > Easiest solution would be to compare the result of
> > > (machine-type)
> > > with the contents of a previously made assoc. list.
> >
> > That would not be enough. On some processors
> > two different operating systems (or even programs)
> > may use different byte-orders.
>
> I didn't know that. Could you give some example cases?
ARM6, SPARC v9, DEC Alpha, many PowerPC etc.
do support different byte-orders (to some degree).
For example the Virtual PC emulator on the PowerPC (G4)
used that to more efficiently emulate a little-endian
machine on a big-endian processor. So, you
had a Mac running Mac OS as big-endian and
the emulator running under Mac OS was running
in little-endian processing mode. Later the PowerPC 970 did not have
that capability.
http://developer.apple.com/documentation/Hardware/DeviceManagers/pci_srvcs/pci_cards_drivers/PCI_BOOK.24e.html
From some SUN paper:
However, some
recent computer architectures support both big endian and little endian modes.
Why?
The reason for a processor architecture to support both big and little endian
operation invariably derives from a requirement to support legacy
environments (operating systems and applications) of both endiannesses.
MIPS was originally a big endian architecture; MIPS added little endian
support to induce DEC, with a little endian legacy, to adopt MIPS processors
for its desktop systems. IBM had a big endian legacy on its workstation and
server systems and a little endian legacy on its Intel-based personal computers
and wanted to support both with the PowerPC architecture. The IA-64
architecture resulted from a collaboration between Intel, with a little endian
legacy, and Hewlett Packard with a big endian legacy on its workstations, so it,
too, supports both.
--
http://lispm.dyndns.org/
> > I didn't know that. Could you give some example cases?
>
> ARM6, SPARC v9, DEC Alpha, many PowerPC etc.
> do support different byte-orders (to some degree).
>
> For example the Virtual PC emulator on the PowerPC (G4)
> used that to more efficiently emulate a little-endian
> machine on a big-endian processor. So, you
> had a Mac running Mac OS as big-endian and
> the emulator running under Mac OS was running
> in little-endian processing mode. Later the PowerPC 970 did not have
> that capability.
Interesting. So MCL running on the MacOS would see PowerPC (64bit, big-
endian) as machine type, while say CLISP running simultaneously inside
the PC emulator would probably detect i386 (32bit, little-endian).
Which one should be considered as "correct" remains a rather
metaphysical question.
Perhaps the need to know the endianness should be considered an
indication of bad programming style to begin with.
In article
<····································@f3g2000hsg.googlegroups.com>,
·······@eurogaran.com wrote:
> > > I didn't know that. Could you give some example cases?
> >
> > ARM6, SPARC v9, DEC Alpha, many PowerPC etc.
> > do support different byte-orders (to some degree).
> >
> > For example the Virtual PC emulator on the PowerPC (G4)
> > used that to more efficiently emulate a little-endian
> > machine on a big-endian processor. So, you
> > had a Mac running Mac OS as big-endian and
> > the emulator running under Mac OS was running
> > in little-endian processing mode. Later the PowerPC 970 did not have
> > that capability.
>
> Interesting. So MCL running on the MacOS would see PowerPC (64bit, big-
> endian) as machine type, while say CLISP running simultaneously inside
> the PC emulator would probably detect i386 (32bit, little-endian).
> Which one should be considered as "correct" remains a rather
> metaphysical question.
>
> Perhaps the need to know the endianness should be considered an
> indicative of bad programming style to begin with.
So your software does not exchange data with other systems?
--
http://lispm.dyndns.org/
Rainer Joswig <······@lisp.de> wrote:
> In article
> <····································@f3g2000hsg.googlegroups.com>,
> ·······@eurogaran.com wrote:
[...]
> > Perhaps the need to know the endianness should be considered an
> > indicative of bad programming style to begin with.
It can be brittle.
> So your software does not exchange data with other systems?
I would recommend doing such data exchange using machine independent data
formats as much as possible. In particular, regardless of what endianness
your machine has, always read/write the data using one endianness (little,
big, or whatever). You shouldn't need to know the endianness of a machine
to do that.
-Vesa Karvonen
> I would recommend doing such data exchange using machine independent data
> formats as much as possible. In particular, regardless of what endianness
> your machine has, always read/write the data using one endianness (little,
> big, or whatever). You shouldn't need to know the endianness of a machine
> to do that.
>
That's exactly what they do with images:
JPEG contains big-endian values while GIF images contain little-endian
values.
·······@eurogaran.com writes:
>> I would recommend doing such data exchange using machine independent data
>> formats as much as possible. In particular, regardless of what endianness
>> your machine has, always read/write the data using one endianness (little,
>> big, or whatever). You shouldn't need to know the endianness of a machine
>> to do that.
>>
>
> That's exactly what they do with images:
> JPEG contains big-endian values while GIF images contain little-endian
> values.
And ESRI shapefiles contain both big-endian and little-endian values,
depending on the field. Fun!
Zach
In article <············@oravannahka.helsinki.fi>,
Vesa Karvonen <·············@cs.helsinki.fi> wrote:
> Rainer Joswig <······@lisp.de> wrote:
> > In article
> > <····································@f3g2000hsg.googlegroups.com>,
> > ·······@eurogaran.com wrote:
> [...]
> > > Perhaps the need to know the endianness should be considered an
> > > indicative of bad programming style to begin with.
>
> It can be brittle.
>
> > So your software does not exchange data with other systems?
>
> I would recommend doing such data exchange using machine independent data
> formats as much as possible. In particular, regardless of what endianness
> your machine has, always read/write the data using one endianness (little,
> big, or whatever). You shouldn't need to know the endianness of a machine
> to do that.
>
> -Vesa Karvonen
But you have to know. If you want to write big-endian binary data,
you need to know if your Lisp runs little- or big-endian.
The software I'm looking at is exactly doing that. It deals
with binary data and should run on different platforms.
Example:
If you write 32-bit values to a binary stream, the result
will be different depending on what machine / OS your Lisp system
runs on.
If you use a Lisp on SPARC/Solaris, you get different
results than under, say, x86/Solaris.
So, if you need to write big-endian 32bit data,
that means on a big-endian system you can just write the binary
data with WRITE-BYTE. On a little-endian system you have
to reorder the bytes.
It starts with TCP/IP. 32bit values are big-endian. You might
need to deal with that if you are on a little-endian system.
Java is also using big-endian.
On Jan 8, 3:35 pm, Rainer Joswig <······@lisp.de> wrote:
> In article <············@oravannahka.helsinki.fi>,
> Vesa Karvonen <·············@cs.helsinki.fi> wrote:
>
>
>
> > Rainer Joswig <······@lisp.de> wrote:
> > > In article
> > > <····································@f3g2000hsg.googlegroups.com>,
> > > ·······@eurogaran.com wrote:
> > [...]
> > > > Perhaps the need to know the endianness should be considered an
> > > > indicative of bad programming style to begin with.
>
> > It can be brittle.
>
> > > So your software does not exchange data with other systems?
>
> > I would recommend doing such data exchange using machine independent data
> > formats as much as possible. In particular, regardless of what endianness
> > your machine has, always read/write the data using one endianness (little,
> > big, or whatever). You shouldn't need to know the endianness of a machine
> > to do that.
>
> > -Vesa Karvonen
>
> But you have to know. If you want to write big-endian binary data,
> you need to know if your Lisp runs little- or big-endian.
> The software I'm looking at is exactly doing that. It deals
> with binary data and should run on different platforms.
You can always read and write 8-bit bytes at a time, and use ldb and
(setf ldb) to arrange those into larger values in the desired byte
order. Then your code will run unmodified under big-, little-, or
wacky-endian machines.
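A minimal sketch of the octet-at-a-time approach described above, assuming 32-bit unsigned values and big-endian output. The function names are hypothetical; only standard operators (LDB, BYTE, AREF) are used, so nothing here depends on the host's byte order.

```lisp
;; Hypothetical helper names; standard Common Lisp only.
(defun u32-to-octets-be (value)
  "Return a 4-element octet vector holding VALUE, most significant octet first."
  (let ((octets (make-array 4 :element-type '(unsigned-byte 8))))
    (dotimes (i 4 octets)
      ;; Octet 0 takes bits 24..31, octet 3 takes bits 0..7.
      (setf (aref octets i) (ldb (byte 8 (* 8 (- 3 i))) value)))))

(defun octets-to-u32-be (octets &optional (start 0))
  "Assemble four octets of OCTETS, beginning at START, into an integer."
  (let ((value 0))
    (dotimes (i 4 value)
      ;; (setf ldb) deposits each octet at its big-endian bit position.
      (setf (ldb (byte 8 (* 8 (- 3 i))) value)
            (aref octets (+ start i))))))
```

The vector can be handed to WRITE-SEQUENCE on an (unsigned-byte 8) stream, and READ-SEQUENCE fills one for the reverse direction. Reversing the index arithmetic gives the little-endian variant.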
In article
<····································@l32g2000hse.googlegroups.com>,
"Thomas F. Burdick" <········@gmail.com> wrote:
> On Jan 8, 3:35 pm, Rainer Joswig <······@lisp.de> wrote:
> > In article <············@oravannahka.helsinki.fi>,
> > Vesa Karvonen <·············@cs.helsinki.fi> wrote:
> >
> >
> >
> > > Rainer Joswig <······@lisp.de> wrote:
> > > > In article
> > > > <····································@f3g2000hsg.googlegroups.com>,
> > > > ·······@eurogaran.com wrote:
> > > [...]
> > > > > Perhaps the need to know the endianness should be considered an
> > > > > indicative of bad programming style to begin with.
> >
> > > It can be brittle.
> >
> > > > So your software does not exchange data with other systems?
> >
> > > I would recommend doing such data exchange using machine independent data
> > > formats as much as possible. In particular, regardless of what endianness
> > > your machine has, always read/write the data using one endianness (little,
> > > big, or whatever). You shouldn't need to know the endianness of a machine
> > > to do that.
> >
> > > -Vesa Karvonen
> >
> > But you have to know. If you want to write big-endian binary data,
> > you need to know if your Lisp runs little- or big-endian.
> > The software I'm looking at is exactly doing that. It deals
> > with binary data and should run on different platforms.
>
> You can always read and write 8-bit bytes at a time, and use ldb and
> (setf ldb) to arrange those into larger values in the desired byte
> order. Then your code will run unmodified under big-, little-, or
> wacky-endian machines.
Performance?
Rainer Joswig <······@lisp.de> writes:
> Performance?
Try it first, then profile, then fix it where performance is
unacceptable.
I use (unsigned-byte 8) streams and vectors to write
endian-independent values to image files, and it works fast enough for
me to generate tens of thousands of graphics every day. Your needs
might be different.
Zach
On Jan 8, 4:29 pm, Rainer Joswig <······@lisp.de> wrote:
> In article
> <····································@l32g2000hse.googlegroups.com>,
> "Thomas F. Burdick" <········@gmail.com> wrote:
>
>
>
> > On Jan 8, 3:35 pm, Rainer Joswig <······@lisp.de> wrote:
> > > In article <············@oravannahka.helsinki.fi>,
> > > Vesa Karvonen <·············@cs.helsinki.fi> wrote:
>
> > > > Rainer Joswig <······@lisp.de> wrote:
> > > > > In article
> > > > > <····································@f3g2000hsg.googlegroups.com>,
> > > > > ·······@eurogaran.com wrote:
> > > > [...]
> > > > > > Perhaps the need to know the endianness should be considered an
> > > > > > indicative of bad programming style to begin with.
>
> > > > It can be brittle.
>
> > > > > So your software does not exchange data with other systems?
>
> > > > I would recommend doing such data exchange using machine independent data
> > > > formats as much as possible. In particular, regardless of what endianness
> > > > your machine has, always read/write the data using one endianness (little,
> > > > big, or whatever). You shouldn't need to know the endianness of a machine
> > > > to do that.
>
> > > > -Vesa Karvonen
>
> > > But you have to know. If you want to write big-endian binary data,
> > > you need to know if your Lisp runs little- or big-endian.
> > > The software I'm looking at is exactly doing that. It deals
> > > with binary data and should run on different platforms.
>
> > You can always read and write 8-bit bytes at a time, and use ldb and
> > (setf ldb) to arrange those into larger values in the desired byte
> > order. Then your code will run unmodified under big-, little-, or
> > wacky-endian machines.
>
> Performance?
Unless you're writing a deep-packet inspecting router, I seriously
doubt that any reasonable CL will give you performance problems with
this technique (or obvious variations on it -- think of read-sequence,
for example).
But what makes you think you have alternatives? Okay, so you're doing
little-endian i/o and you've successfully detected that you're on a
big-endian machine; what do you do now? You *have* to arrange the
bytes in the order you need them, there's no magical way of avoiding
it.
On Tue, 08 Jan 2008 16:29:45 +0100, Rainer Joswig <······@lisp.de> wrote:
>
> Performance?
What do you think buffering is for?
--------------
John Thingstad
Rainer Joswig <······@lisp.de> writes:
> In article <············@oravannahka.helsinki.fi>,
> Vesa Karvonen <·············@cs.helsinki.fi> wrote:
>
>> Rainer Joswig <······@lisp.de> wrote:
>> > In article
>> > <····································@f3g2000hsg.googlegroups.com>,
>> > ·······@eurogaran.com wrote:
>> [...]
>> > > Perhaps the need to know the endianness should be considered an
>> > > indicative of bad programming style to begin with.
>>
>> It can be brittle.
>>
>> > So your software does not exchange data with other systems?
>>
>> I would recommend doing such data exchange using machine independent data
>> formats as much as possible. In particular, regardless of what endianness
>> your machine has, always read/write the data using one endianness (little,
>> big, or whatever). You shouldn't need to know the endianness of a machine
>> to do that.
>>
>> -Vesa Karvonen
>
> But you have to know. If you want to write big-endian binary data,
> you need to know if your Lisp runs little- or big-endian.
> The software I'm looking at is exactly doing that. It deals
> with binary data and should run on different platforms.
See my response earlier on this thread. In addition to what I said
there about read-vector/write-vector, we also supply as part of our
"osi" (operating system interface) module the four inet functions
htonl, htons, ntohl, and ntohs:
http://www.franz.com/support/documentation/8.1/doc/os-interface.htm#ntohl-op-bookmarkxx
for doing specific conversions without knowing the endianness of the
architecture you are on.
--
Duane Rettig ·····@franz.com Franz Inc. http://www.franz.com/
555 12th St., Suite 1450 http://www.555citycenter.com/
Oakland, Ca. 94607 Phone: (510) 452-2000; Fax: (510) 452-0182
On Jan 8, 9:35 am, Rainer Joswig <······@lisp.de> wrote:
>
> But you have to know. If you want to write big-endian binary data,
> you need to know if your Lisp runs little- or big-endian.
> The software I'm looking at is exactly doing that. It deals
> with binary data and should run on different platforms.
>
> Example:
>
> If you write 32 bit values to a binary stream, the result
> will be different depending what machine / OS your Lisp system
> runs on.
>
> If you use a Lisp on a SPARC/Solaris, you get different
> results, than under, say x86/Solaris.
>
> So, if you need to write big-endian 32bit data,
> that means on a big-endian system you can just write the binary
> data with WRITE-BYTE. On a little-endian system you have
> to reorder the bytes.
I think this particular passage, and this thread in general, are in
danger of confusing the separate issues of "data layout in byte-
addressed memory" with "data layout in I/O streams." I/O streams (and
files) are going to have complexities that go beyond "little endian"
and "big endian" distinctions: in particular, whether the streams have
implementation-, file-system- or OS-specific support for indicating
byte order and byte packing conventions.
Writing integers to a stream of element type (UNSIGNED-BYTE 32) does not
check the same thing at all as C code that packs a long/char[4]
union in memory and reads back the result.
> It starts with TCP/IP. 32bit values are big-endian. You might
> need to deal with that if you are on a little-endian system.
>
> Java is also using big-endian.
Rainer Joswig <······@lisp.de> writes:
> In article <············@oravannahka.helsinki.fi>,
> Vesa Karvonen <·············@cs.helsinki.fi> wrote:
>
>> Rainer Joswig <······@lisp.de> wrote:
>> > In article
>> > <····································@f3g2000hsg.googlegroups.com>,
>> > ·······@eurogaran.com wrote:
>> [...]
>> > > Perhaps the need to know the endianness should be considered an
>> > > indicative of bad programming style to begin with.
>>
>> It can be brittle.
>>
>> > So your software does not exchange data with other systems?
>>
>> I would recommend doing such data exchange using machine independent data
>> formats as much as possible. In particular, regardless of what endianness
>> your machine has, always read/write the data using one endianness (little,
>> big, or whatever). You shouldn't need to know the endianness of a machine
>> to do that.
>>
>> -Vesa Karvonen
>
> It starts with TCP/IP. 32bit values are big-endian. You might
> need to deal with that if you are on a little-endian system.
I guess there is the set of native functions (man section 3) to deal
with byte order:
htonl, htons, ntohl, ntohs -- convert values between host and network
byte order
With best regards,
Victor
Vesa Karvonen <·············@cs.helsinki.fi> writes:
>I would recommend doing such data exchange using machine independent data
>formats as much as possible. In particular, regardless of what endianness
>your machine has, always read/write the data using one endianness (little,
>big, or whatever). You shouldn't need to know the endianness of a machine
>to do that.
Ah, but to do so requires some part of the software to know the endianness.
(The only way around that is using only textual representations which is not
highly efficient)
Casper
--
Expressed in this posting are my opinions. They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.
Casper H.S. Dik <··········@sun.com> wrote:
> Vesa Karvonen <·············@cs.helsinki.fi> writes:
> >I would recommend doing such data exchange using machine independent data
> >formats as much as possible. In particular, regardless of what endianness
> >your machine has, always read/write the data using one endianness (little,
> >big, or whatever). You shouldn't need to know the endianness of a machine
> >to do that.
> Ah, but to do so requires some part of the software to know the endianness.
In the case of integers it does not, because you can always easily take
them apart via repeated division. Floating point values are more
difficult to serialize without primitive support from the compiler (but it
can be done).
-Vesa Karvonen
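The "repeated division" idea for integers can be sketched as follows. This is only an illustration, assuming non-negative integers; the function name is hypothetical. Each FLOOR by 256 yields the next octet as the remainder, so the host's byte order never enters into it.

```lisp
;; Hypothetical helper; uses only FLOOR from standard Common Lisp.
(defun integer-to-octets-le (value count)
  "Return a list of COUNT octets of the non-negative integer VALUE,
least significant octet first."
  (loop repeat count
        collect (multiple-value-bind (quotient remainder) (floor value 256)
                  ;; Keep the quotient for the next round, emit the remainder.
                  (setf value quotient)
                  remainder)))
```

Calling REVERSE on the result gives the big-endian order.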
* Vesa Karvonen <············@oravannahka.helsinki.fi>
Wrote on 8 Jan 2008 13:28:51 GMT:
| Rainer Joswig <······@lisp.de> wrote:
| It can be brittle.
|
|> So your software does not exchange data with other systems?
|
| I would recommend doing such data exchange using machine independent data
| formats as much as possible. In particular, regardless of what endianness
| your machine has, always read/write the data using one endianness (little,
| big, or whatever). You shouldn't need to know the endianness of a machine
| to do that.
I'm sure Rainer has a case where some data is being generated in the
host byte format. For example tcpdump(8) savefiles, intended for
dissection with pcap(3), can be read on a machine with a different byte
order than the ones they were saved on. Some field(s) may be specified
to be host byte order, in which case it becomes necessary to swab stuff
when reading the file, and to determine whether the byte order of the
machine differs from the machine where the dump was done.
Here is how I do it: the file magic indicates the byte order of the
machine doing the dump. When reading the file, start reading the magic
as little-endian, and then determine correct endianness to read the rest
of the file. (Any loopholes?) Sketch:
(defvar *pcap-file-header-magic*
'((:little-endian #xd4c3b2a1) ;1234
(:pdp-endian #xb2a1d4c3) ;3412 (BOGUS)!
(:big-endian #xa1b2c3d4))) ;4321
(defun read-dump-file (...) ...
(let* ((*endian* :little-endian) ...
(magic (read-binary ....)))
;; Then set endian again depending on the value if swapped
(let ((*endian*
(car (rassoc magic *pcap-file-header-magic* :key #'car))))
...
(read-binary ...) ...)))
--
Madhu
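A self-contained sketch of the magic-number check, working on raw octets rather than on a decoded integer. It assumes the classic pcap magic #xa1b2c3d4 written in the dumping host's byte order; the function names are hypothetical and this is not the read-binary code elided above.

```lisp
;; Hypothetical helpers; OCTETS is any sequence of the file's first bytes.
(defun pcap-byte-order (octets)
  "Return the byte order implied by the first four OCTETS of a savefile,
or NIL if the magic is not recognized."
  (let ((magic (coerce (subseq octets 0 4) 'list)))
    (cond ((equal magic '(#xd4 #xc3 #xb2 #xa1)) :little-endian)
          ((equal magic '(#xa1 #xb2 #xc3 #xd4)) :big-endian)
          (t nil))))

(defun decode-u32 (octets byte-order)
  "Assemble four OCTETS into an integer according to BYTE-ORDER."
  (let ((ordered (coerce (subseq octets 0 4) 'list))
        (value 0))
    ;; Normalize to most-significant-first, then accumulate.
    (when (eq byte-order :little-endian)
      (setf ordered (reverse ordered)))
    (dolist (octet ordered value)
      (setf value (+ (* value 256) octet)))))
```

Decoding the magic with the detected order should always give back #xa1b2c3d4, which makes a cheap sanity check before reading the rest of the header.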
Madhu <·······@meer.net> writes:
> ...
> Here is how I do it: the file magic indicates the byte order of the
> machine doing the dump.
And then there is the zero-width non-breaking space (U+FFFE, or is
it U+FEFF), and... UTF-8 rules. I don't know what the moral of this
story is---I wonder what the Duchess would say...
---Vassil.
--
Bound variables, free programmers.
Rainer Joswig <······@lisp.de> writes:
> In article
> <····································@f3g2000hsg.googlegroups.com>,
> ·······@eurogaran.com wrote:
>
>> > > I didn't know that. Could you give some example cases?
>> >
>> > ARM6, SPARC v9, DEC Alpha, many PowerPC etc.
>> > do support different byte-orders (to some degree).
>> >
>> > For example the Virtual PC emulator on the PowerPC (G4)
>> > used that to more efficiently emulate a little-endian
>> > machine on a big-endian processor. So, you
>> > had a Mac running Mac OS as big-endian and
>> > the emulator running under Mac OS was running
>> > in little-endian processing mode. Later the PowerPC 970 did not have
>> > that capability.
>>
>> Interesting. So MCL running on the MacOS would see PowerPC (64bit, big-
>> endian) as machine type, while say CLISP running simultaneously inside
>> the PC emulator would probably detect i386 (32bit, little-endian).
>> Which one should be considered as "correct" remains a rather
>> metaphysical question.
>>
>> Perhaps the need to know the endianness should be considered an
>> indicative of bad programming style to begin with.
>
> So your software does not exchange data with other systems?
Ah, but this is a totally different question!
You need to know the byte order of your file and transmission media.
Nothing to do with the bytesex of the processor/system.
You should write byte sequences. If your file format specifies a
24-bit integer I=(A*256+B)*256+C (with A, B, C in [0..255]) stored as:
+---+---+---+
| A | B | C | "big endian"
+---+---+---+
then you write it with:
(with-open-file (stream path :direction :output :element-type '(unsigned-byte 8))
  (write-byte (ldb (byte 8 16) i) stream)
  (write-byte (ldb (byte 8 8) i) stream)
  (write-byte (ldb (byte 8 0) i) stream))
On the other hand, if it specifies this order:
+---+---+---+
| C | A | B | "random endian"
+---+---+---+
then you write it with:
(with-open-file (stream path :direction :output :element-type '(unsigned-byte 8))
  (write-byte (ldb (byte 8 0) i) stream)
  (write-byte (ldb (byte 8 16) i) stream)
  (write-byte (ldb (byte 8 8) i) stream))
etc.
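Both layouts above differ only in the order in which the bit positions are visited, so they can be expressed with one helper. This is a sketch with a hypothetical function name; it returns the octets as a list so the result can be inspected before it is written out.

```lisp
;; Hypothetical helper; POSITIONS lists the bit offsets in file order.
(defun integer-octets (i positions)
  "Return the octets of integer I found at each bit offset in POSITIONS."
  (mapcar (lambda (position) (ldb (byte 8 position) i)) positions))
```

With I = #xAABBCC, the positions (16 8 0) give the "big endian" layout A B C and (0 16 8) give the "random endian" layout C A B. Writing the result is then a matter of calling WRITE-BYTE on each element with an (unsigned-byte 8) stream.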
--
__Pascal Bourguignon__ http://www.informatimago.com/
"Our users will know fear and cower before our software! Ship it!
Ship it and let them flee like the dogs they are!"
<·······@eurogaran.com> wrote:
+---------------
| > On some processors two different operating systems
| > (or even programs) may use different byte-orders.
|
| I didn't know that. Could you give some example cases?
+---------------
SGI's MIPS-based Irix workstations were big-endian;
DEC's MIPS-based Ultrix workstations were little-endian.
-Rob
-----
Rob Warnock <····@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607
Madhu <·······@meer.net> writes:
> * Rainer Joswig <····························@news-europe.giganews.com> :
> Wrote on Tue, 08 Jan 2008 11:38:24 +0100:
>
> | In article
> | <····································@m77g2000hsc.googlegroups.com>,
> | ·······@eurogaran.com wrote:
> |
> |> > is there a way to determine the byte order
> |> > of the underlying 'machine'? Portable?
> |> >
> |>
> |> Easiest solution would be to compare the result of
> |> (machine-type)
> |> with the contents of a previously made assoc. list.
> |
> | That would not be enough. On some processors
> | two different operating systems (or even programs)
> | may use different byte-orders.
>
> Can you use the technique I outlined in another post in this thread? It
> would involve creating 2 files with a certain magic, one on a big endian
> machine and one on a little endian machine. Copying the files to the
> target machine. Reading the files would be sufficient to determine
> endianness. I think this should be portable enough for use inside an
> application if you have control over configuration management.
The Common Lisp standard doesn't define any mapping of its file
formats (external-format) to the host files. The only standard
external-format is :DEFAULT, and nothing is specified about it!
However, most implementations on 8-bit addressable systems do implement
:element-type '(unsigned-byte 8) :external-format :default sanely,
that is, mapping each byte to an octet in the host file, without any
overhead. This would be the most you can do to read and write files
portably. However, it would be totally useless to infer byte order
since you have to explicitly write the bytes in the order you want.
For :element-type (unsigned-byte N) with (/= N 8), anything can
happen. To implement the requirements of the standard, most
implementations will add a header or a trailer to the file when N is
not a multiple of 8. Some implementations will write the bytes in an
order that depends on the host system, but some won't.
For example, clisp ALWAYS writes binary files in little-endian order,
to ensure portability of the files from all the platforms it works on.
--
__Pascal Bourguignon__ http://www.informatimago.com/
Un chat errant
se soulage
dans le jardin d'hiver
Shiki
·······@eurogaran.com writes:
>> is there a way to determine the byte order
>> of the underlying 'machine'? Portable?
>>
>
> Easiest solution would be to compare the result of
> (machine-type)
> with the contents of a previously made assoc. list.
>
> Most probably a single pairings list could be constructed that would
> work in every implementation.
AFAIK, PowerPC, ARM, etc., can work in either little-endian or
big-endian mode. Knowing only the processor name is not enough to know
what endianness is used by the system.
--
__Pascal Bourguignon__ http://www.informatimago.com/
"Our users will know fear and cower before our software! Ship it!
Ship it and let them flee like the dogs they are!"
Rainer Joswig <······@lisp.de> writes:
> Hi,
>
> is there a way to determine the byte order
> of the underlying 'machine'? Portable?
From the rest of this thread, you obviously understand that endianness
is not a characteristic of a machine but of a state of a machine (I
presume that's why you put quotes around 'machine'). Thus it tends to
be more a characteristic of an operating system than of a machine. I
suppose it could even be made a per-program basis, like the
distinction between 32 and 64 bits on machines which support both.
But usually those systems have separate libraries for each (or
libraries which can be configured for each), and I've never seen a set
of libraries that differ only in endianness - in that sense the
endianness tends to get paired with the operating system itself.
What operations are you trying to perform? In Allegro CL, we do two
things: We provide either :big-endian or :little-endian on *features*,
and we also implement simple-streams, which has an extended function
pair called read-vector/write-vector:
http://www.franz.com/support/documentation/8.1/doc/operators/excl/read-vector.htm
http://www.franz.com/support/documentation/8.1/doc/operators/excl/write-vector.htm
These are similar to read-sequence/write-sequence, but they are more
octet-fill oriented rather than element oriented. Each of these
functions has an :endian-swap keyword argument - you can specify a bit
pattern or certain keywords to indicate the kind of swapping, which
includes the :network-order keyword. Thus any files written with
write-vector with :endian-swap :network-order can be read from any
Lisp with a read-vector call with :endian-swap :network-order.
--
Duane Rettig ·····@franz.com Franz Inc. http://www.franz.com/
555 12th St., Suite 1450 http://www.555citycenter.com/
Oakland, Ca. 94607 Phone: (510) 452-2000; Fax: (510) 452-0182
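The *features* check mentioned above can be sketched as below. The function name is hypothetical, and the presence of :big-endian or :little-endian on *features* is an implementation convention rather than part of the standard, so the sketch falls back to :unknown when neither keyword is found.

```lisp
;; Hypothetical helper; relies only on the standard variable *FEATURES*.
(defun feature-byte-order ()
  "Return :LITTLE-ENDIAN or :BIG-ENDIAN if the implementation advertises
it on *FEATURES*, otherwise :UNKNOWN."
  (cond ((member :little-endian *features*) :little-endian)
        ((member :big-endian *features*) :big-endian)
        (t :unknown)))
```

The same keywords can be used with the #+ and #- reader conditionals, with the caveat that code guarded that way is silently skipped on implementations that advertise neither feature.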
In article <··············@gemini.franz.com>,
Duane Rettig <·····@franz.com> wrote:
> Rainer Joswig <······@lisp.de> writes:
>
> > Hi,
> >
> > is there a way to determine the byte order
> > of the underlying 'machine'? Portable?
>
> From the rest of this thread, you obviously understand that endianness
> is not a characteristic of a machine but of a state of a machine (I
> presume that's why you put quotes around 'machine'). Thus it tends to
> be more a characteristic of an operating system than of a machine. I
> suppose it could even be made a per-program basis, like the
> distinction between 32 and 64 bits on machines which support both.
> But usually those systems have separate libraries for each (or
> libraries which can be configured for each), and I've never seen a set
> of libraries that differ only in endianness - in that sense the
> endianness tends to get paired with the operating system itself.
>
> What operations are you trying to perform? In Allegro CL, we do two
> things: We provide either :big-endian or :little-endian on *features*,
> and we also implement simple-streams, which has an extended function
> pair called read-vector/write-vector:
>
> http://www.franz.com/support/documentation/8.1/doc/operators/excl/read-vector.htm
> http://www.franz.com/support/documentation/8.1/doc/operators/excl/write-vector.htm
>
> These are similar to read-sequence/write-sequence, but they are more
> octet-fill oriented rather than element oriented. Each of these
> functions has an :endian-swap keyword argument - you can specify a bit
> pattern or certain keywords to indicate the kind of swapping, which
> includes the :network-order keyword. Thus any files written with
> write-vector with :endian-swap :network-order can be read from any
> lisp with read-vector call with :endian-swap :network-order.
Thanks for the info!