How to write portable bitwise operation code? (or how not to think like a c hacker)

From: Liu Fung Sin
Subject: How to write portable bitwise operation code? (or how not to think like a c hacker)
Date: Sat, 07 Apr 2007 01:09:00 +0000
Message-ID: <1175908140.509372.71000@w1g2000hsg.googlegroups.com>

Hi all,

Coming from a C programming background, I have no problem writing this
routine to check whether a number is a valid ipv4 netmask:

int is_power_of_2_minus_1 (uint32 x)
{
    return (x & (x+1)) == 0;
}

int valid_netmask (uint32 x)
{
    return is_power_of_2_minus_1(~x);
}

valid_netmask(0xffffff00) => 1


However, I can't do the same in common lisp. For instance, I can't
assume the size of fixnum and that the result of lognot is not a
unsigned 32bit integer.

(defun power-of-2-minus-1-p (x)
  "This kinda works"
  (= (logand x (1+ x)) 0))

(power-of-2-minus-1-p #b11111111111111111111111111111111) => t
(power-of-2-minus-1-p #b00000000000000001111111111111111) => t
(power-of-2-minus-1-p #b00000000000000001110111111111111) => nil
(power-of-2-minus-1-p #b0) => t

(defun ipv4-netmask-p (netmask)
  "This blows"
  (power-of-2-minus-1-p (lognot netmask)))

So how do I do the following?

(uint32-lognot #b00000000000000001111111111111111)

=> #b11111111111111110000000000000000

Or maybe my question really is, how not to think like a c hacker when
I'm in the REPL? :-)

Thanks

Re: How to write portable bitwise operation code? (or how not to think like a c hacker) Rob Warnock
- Re: How to write portable bitwise operation code? (or how not to think like a c hacker) Rob Warnock
  - Re: How to write portable bitwise operation code? (or how not to think like a c hacker) Liu Fung Sin
    - Re: How to write portable bitwise operation code? (or how not to think like a c hacker) Rob Warnock
Re: How to write portable bitwise operation code? (or how not to think like a c hacker) Vassil Nikolov
Re: How to write portable bitwise operation code? (or how not to think like a c hacker) Zach Beane
- Re: How to write portable bitwise operation code? (or how not to think like a c hacker) Liu Fung Sin
  - Re: How to write portable bitwise operation code? (or how not to think like a c hacker) Damien Kick

From: Rob Warnock
Subject: Re: How to write portable bitwise operation code? (or how not to think like a c hacker)
Date: Sat, 07 Apr 2007 03:37:34 +0000
Message-ID: <5fudnQ5-KIdjjIrbnZ2dnUVZ_uygnZ2d@speakeasy.net>

Liu Fung Sin <···········@gmail.com> wrote:
+---------------
| Coming from a C programming background, I have no problem writing this
| routine to check whether a number is a valid ipv4 netmask:
...[C code elided]...
| valid_netmask(0xffffff00) => 1
| 
| However, I can't do the same in common lisp. For instance, I can't
| assume the size of fixnum and that the result of lognot is not a
| unsigned 32bit integer.
+---------------

Try not thinking so much about C and unsigned-32 and look at the
math of what you're trying to do in general [see if the number is
a string of 1's followed by a string of 0's and nothing else] and
then see what's already in Common Lisp that will help you. E.g., I
think this might do what you want:

   > (defun valid-netmask-p (x &optional (addr-width 32))
       (and (plusp x)
	    (= (integer-length x) addr-width)
	    (let ((y (- (ash 1 addr-width) 1 x))) ; strip/cmpl 1st run of 1's
	      (= y (1- (ash 1 (integer-length y))))))) ; also a run of 1's?

   VALID-NETMASK-P
   > (mapcar 'valid-netmask-p '(#b11111111111111111111111111111111
				#b11111111111111111111111100000000
				#b11111111111111110000000000000000
				#b10000000000000000000000000000000
				#b11111101111111110000000000000000
				#b01111111111111110000000000000000
				#b11111111111111110001000000000000
				#b00000000000000000000000000000000
				))

    (T T T T NIL NIL NIL NIL)
    > 

You'll notice I generalized it just a wee bit:

    > (valid-netmask-p #b111111111111111110000000000000000000000000000000 48)

    T
    > (valid-netmask-p #xffffffffffffffffffff000000000000 128)

    T
    > (valid-netmask-p #xffffffffffffffffffff000001000000 128)

    NIL
    > 


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607

From: Rob Warnock
Subject: Re: How to write portable bitwise operation code? (or how not to think like a c hacker)
Date: Sat, 07 Apr 2007 09:20:42 +0000
Message-ID: <bvudnW_wetz3_4rbnZ2dnUVZ_oernZ2d@speakeasy.net>

A few hours ago, I wrote:
+---------------
|   (defun valid-netmask-p (x &optional (addr-width 32))
|     (and (plusp x)
|          (= (integer-length x) addr-width)
|          (let ((y (- (ash 1 addr-width) 1 x))) ; strip/cmpl 1st run of 1's
| 	     (= y (1- (ash 1 (integer-length y))))))) ; also a run of 1's?
+---------------

Upon further reflection, I noticed a *much* simpler way to do this
[and probably much faster, since it doesn't use INTEGER-LENGTH]:

    (defun valid-netmask-p (x &optional (addr-width 32))
      (= x (- (ash 1 addr-width) (logand x (- x)))))  


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607

From: Liu Fung Sin
Subject: Re: How to write portable bitwise operation code? (or how not to think like a c hacker)
Date: Sat, 07 Apr 2007 17:22:44 +0000
Message-ID: <1175966564.643644.178350@y66g2000hsf.googlegroups.com>

Hi Rob,

On Apr 7, 2:20 am, ····@rpw3.org (Rob Warnock) wrote:
> Try not thinking so much about C and unsigned-32 and look at the
> math of what you're trying to do in general [see if the number is
> a string of 1's followed by a string of 0's and nothing else] and
> then see what's already in Common Lisp that will help you.

I guess my problem is that I'm far more comfortable to visualize what
I'm doing in C (you can basically imagine what the machine is doing
with each c statement) than in lisp.

Also I'm only familiar with the subset of CL vocabulary that help
building abstractions (function, macro, control structure) and not the
darker corner of the language (types, integer-length...)

> Upon further reflection, I noticed a *much* simpler way to do this
> [and probably much faster, since it doesn't use INTEGER-LENGTH]:

Thanks. That works perfectly.

I did some comparisons, for educational purpose.

It seems that (at least on 32-bit lisp) because everything is tagged
and there's no 32-bit fixnum arithmetic, there are certain things that
can't be done as efficiently as languages that are "closer" to the
machine (e.g. C)

But again, thanks for your reply. I learned quite a bit.

int valid_netmask (unsigned int netmask)
{
    unsigned int x = ~netmask;
    return (x & (x+1)) == 0;
}

/*
0804837c <valid_netmask>:
 804837c:	55                   	push   %ebp
 804837d:	89 e5                	mov    %esp,%ebp
 804837f:	83 ec 10             	sub    $0x10,%esp
 8048382:	8b 45 08             	mov    0x8(%ebp),%eax
 8048385:	f7 d0                	not    %eax
 8048387:	89 45 fc             	mov    %eax,0xfffffffc(%ebp)
 804838a:	8b 45 fc             	mov    0xfffffffc(%ebp),%eax
 804838d:	40                   	inc    %eax
 804838e:	23 45 fc             	and    0xfffffffc(%ebp),%eax
 8048391:	85 c0                	test   %eax,%eax
 8048393:	0f 94 c0             	sete   %al
 8048396:	0f b6 c0             	movzbl %al,%eax
 8048399:	c9                   	leave
 804839a:	c3                   	ret
*/

(deftype ipv4-integer ()
  '(integer 0 #xffffffff))

(defun ipv4-netmask-p (x)
  (declare (optimize (speed 3) (safety 0)))
  (declare (ipv4-integer x))
  (= x (the ipv4-integer
         (- #.(ash 1 32)
            (the ipv4-integer
              (logand x (the (integer #x-ffffffff 0) (- x))))))))

(disassemble #'ipv4-netmask-p)

; 0AD3B109:       8B55F4           MOV EDX, [EBP-12]          ; no-arg-
parsing entry point
;       0C:       E81D522CF6       CALL #x100032E             ;
GENERIC-NEGATE
;       11:       7302             JNB L0
;       13:       8BE3             MOV ESP, EBX
;       15: L0:   8BFA             MOV EDI, EDX
;       17:       8B55F4           MOV EDX, [EBP-12]
;       1A:       8BDC             MOV EBX, ESP
;       1C:       55               PUSH EBP
;       1D:       83EC08           SUB ESP, 8
;       20:       8BEB             MOV EBP, EBX
;       22:       B908000000       MOV ECX, 8
;       27:       FF15CC060005     CALL DWORD PTR [#x50006CC]
;       2D:       7302             JNB L1
;       2F:       8BE3             MOV ESP, EBX
;       31: L1:   8BFA             MOV EDI, EDX
;       33:       8B15ACB0D30A     MOV EDX, [#xAD3B0AC]       ;
4294967296
;       39:       E889502CF6       CALL #x10001C7             ;
GENERIC--
;       3E:       7302             JNB L2
;       40:       8BE3             MOV ESP, EBX
;       42: L2:   8BC2             MOV EAX, EDX
;       44:       A803             TEST AL, 3
;       46:       7405             JEQ L3
;       48:       8B48FD           MOV ECX, [EAX-3]
;       4B:       EB05             JMP L4
;       4D: L3:   C1F802           SAR EAX, 2
;       50:       8BC8             MOV ECX, EAX
;       52: L4:   8B45F4           MOV EAX, [EBP-12]
;       55:       A803             TEST AL, 3
;       57:       7405             JEQ L5
;       59:       8B40FD           MOV EAX, [EAX-3]
;       5C:       EB03             JMP L6
;       5E: L5:   C1F802           SAR EAX, 2
;       61: L6:   39C8             CMP EAX, ECX
;       63:       750F             JNE L8
;       65:       BA27000005       MOV EDX, 83886119
;       6A: L7:   8D65F8           LEA ESP, [EBP-8]
;       6D:       F8               CLC
;       6E:       8B6DFC           MOV EBP, [EBP-4]
;       71:       C20400           RET 4
;       74: L8:   BA0B000005       MOV EDX, 83886091
;       79:       EBEF             JMP L7
;       7B:       90               NOP
;       7C:       90               NOP
;       7D:       90               NOP
;       7E:       90               NOP
;       7F:       90               NOP
;       80:       CC0A             BREAK 10                   ; error
trap
;       82:       02               BYTE #X02
;       83:       18               BYTE #X18                  ;
INVALID-ARG-COUNT-ERROR
;       84:       4D               BYTE #X4D                  ; ECX

From: Rob Warnock
Subject: Re: How to write portable bitwise operation code? (or how not to think like a c hacker)
Date: Sun, 08 Apr 2007 02:22:19 +0000
Message-ID: <2sidnTt_LfRGzIXbnZ2dnUVZ_tKjnZ2d@speakeasy.net>

Liu Fung Sin <···········@gmail.com> wrote:
+---------------
| ····@rpw3.org (Rob Warnock) wrote:
| > Upon further reflection, I noticed a *much* simpler way to do this...
...
| I did some comparisons, for educational purpose.
| It seems that (at least on 32-bit lisp) because everything is tagged
| and there's no 32-bit fixnum arithmetic, there are certain things that
| can't be done as efficiently as languages that are "closer" to the
| machine (e.g. C)
...
| (deftype ipv4-integer ()
|   '(integer 0 #xffffffff))
| 
| (defun ipv4-netmask-p (x)
|   (declare (optimize (speed 3) (safety 0)))
|   (declare (ipv4-integer x))
|   (= x (the ipv4-integer
|          (- #.(ash 1 32)
|             (the ipv4-integer
|               (logand x (the (integer #x-ffffffff 0) (- x))))))))
| 
| (disassemble #'ipv4-netmask-p)
| ...[elided]...
+---------------

Well, if you *only* want to do the 32-bit case, then my code is
a bit too general. Part of the problem you're seeing is the
(ASH 1 32), which is generating a 33-bit number, which forces
bignum arithmetic. (Oops.)

If one rewrites the expression to get rid of that, and uses
an explicit LOGAND to trim the result to 32 bits instead of
declarations or THE [which *don't* always trim them properly],
then things get quite a bit better [at least in CMUCL, which seems
to do a pretty good job of propagating (LOGAND #xffffffff EXPR)
down to the leaves of EXPR] -- not much bigger than your C code
if you ignore the differences in calling-sequence overheads:

> (defun ipv4-netmask-p (x)
    (declare (optimize (speed 3) (safety 0)))
    (declare (type (unsigned-byte 32) x))
    (and (plusp x) ; Needed 'cuz the tight version doesn't reject 0.
	 (= x (logand #xffffffff (1+ (- #xffffffff (logand x (- x))))))))

IPV4-NETMASK-P
> (disassemble *)
; Compiling LAMBDA (X): 
; Compiling Top-Level Form: 

48131020:       .ENTRY "LAMBDA (X)"(x)       ; (FUNCTION (#) (MEMBER T NIL))
      38:       POP     DWORD PTR [EBP-8]
      3B:       LEA     ESP, [EBP-32]
      3E:       MOV     EAX, EDX

      40:       TEST    AL, 3                ; [:NON-LOCAL-ENTRY]
      42:       JEQ     L0
      44:       MOV     EAX, [EAX-3]
      47:       JMP     L1
      49: L0:   SAR     EAX, 2

      4C: L1:   CMP     EAX, 0               ; No-arg-parsing entry point
                                             ; [:NON-LOCAL-ENTRY]
      4F:       JNBE    L4

;;; [11] (AND (PLUSP X) (= X (LOGAND 4294967295 #)))

      51:       MOV     EDX, #x28F0000B      ; NIL
                                             ; [:BLOCK-START]

      56: L2: L3:MOV    ECX, [EBP-8]         ; [:BLOCK-START]
      59:       MOV     EAX, [EBP-4]
      5C:       ADD     ECX, 2
      5F:       MOV     ESP, EBP
      61:       MOV     EBP, EAX
      63:       JMP     ECX

;;; [15] (1+ (- 4294967295 (LOGAND X #)))

      65: L4:   MOV     ECX, EAX             ; [:BLOCK-START]
      67:       NEG     ECX
      69:       MOV     EDX, EAX
      6B:       AND     EDX, ECX
      6D:       MOV     ECX, 4294967295
      72:       SUB     ECX, EDX
      74:       INC     ECX
      75:       CMP     ECX, EAX
      77:       JEQ     L5

;;; [13] (= X (LOGAND 4294967295 (1+ #)))

      79:       MOV     EDX, #x28F0000B      ; NIL
                                             ; [:BLOCK-START]
      7E:       JMP     L3

      80: L5:   MOV     EDX, 686817319       ; T
                                             ; [:BLOCK-START]
      85:       JMP     L3
      87:       NOP
> 


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607

From: Vassil Nikolov
Subject: Re: How to write portable bitwise operation code? (or how not to think like a c hacker)
Date: Sat, 07 Apr 2007 23:38:49 +0000
Message-ID: <m3zm5jkb5i.fsf@localhost.localdomain>

On 6 Apr 2007 18:09:00 -0700, "Liu Fung Sin" <···········@gmail.com> said:
| ...
| However, I can't do the same in common lisp. For instance, I can't
| assume the size of fixnum and that the result of lognot is not a
| unsigned 32bit integer.

| (defun power-of-2-minus-1-p (x)
|   "This kinda works"
|   (= (logand x (1+ x)) 0))

  In the context of IPv4, you can probably assume that the type of X
  will be (UNSIGNED-BYTE 32) (and declare it accordingly).  Then,
  since in C, when x is a 32-bit unsigned integer, x+1 is equivalent
  to (MOD (+ X 1) (EXPT 2 32)) for an (UNSIGNED-BYTE 32) X, I'd check
  what the compiler generates for the latter expression (possibly with
  (OPTIMIZE SPEED (SAFETY 0))) before deciding on the implementation.

  ---Vassil.


-- 
The truly good code is the obviously correct code.

From: Zach Beane
Subject: Re: How to write portable bitwise operation code? (or how not to think like a c hacker)
Date: Mon, 09 Apr 2007 13:37:07 +0000
Message-ID: <m3hcrpekjg.fsf@unnamed.xach.com>

"Liu Fung Sin" <···········@gmail.com> writes:

> Hi all,
> 
> Coming from a C programming background, I have no problem writing this
> routine to check whether a number is a valid ipv4 netmask:
> 
> int is_power_of_2_minus_1 (uint32 x)
> {
>     return (x & (x+1)) == 0;
> }
> 
> int valid_netmask (uint32 x)
> {
>     return is_power_of_2_minus_1(~x);
> }
> 
> valid_netmask(0xffffff00) => 1
> 
> 
> However, I can't do the same in common lisp. For instance, I can't
> assume the size of fixnum and that the result of lognot is not a
> unsigned 32bit integer.

Here's what I'd write:

   (defun ipv4-netmask-p (integer)
     (let ((x (logandc2 #xFFFFFFFF integer)))
       (zerop (logand x (1+ x)))))

(ipv4-netmask-p #xFFFFFF00) => t
(ipv4-netmask-p #xFFFF0001) => nil

> So how do I do the following?
> 
> (uint32-lognot #b00000000000000001111111111111111)
>
> => #b11111111111111110000000000000000


(write (logandc2 #xFFFFFFFF #b00000000000000001111111111111111)
       :radix t :base 2) 

=> #b11111111111111110000000000000000

> Or maybe my question really is, how not to think like a c hacker when
> I'm in the REPL? :-)

Just takes practice.

Zach

From: Liu Fung Sin
Subject: Re: How to write portable bitwise operation code? (or how not to think like a c hacker)
Date: Mon, 09 Apr 2007 17:58:28 +0000
Message-ID: <1176141508.806714.49660@b75g2000hsg.googlegroups.com>

Zach Beane wrote:
> Here's what I'd write:
>
>    (defun ipv4-netmask-p (integer)
>      (let ((x (logandc2 #xFFFFFFFF integer)))
>        (zerop (logand x (1+ x)))))

Nice... this is a 1-1 translation of the c code.

I've read the description of logandc2 in clhs before but I don't
understand where/how to apply it. I think I am getting the idea, since
number in CL has arbitrary precision the compliment of a number
doesn't necessary have a fixed address width like in C (32/64 bit).
This is a bad habit coming from C that I always visualize what the
bits are like in the machine.

> Just takes practice.

Before I can do that I first need to understand the subject.

Thank you and also Rob for taking the time to show me the way.

From: Damien Kick
Subject: Re: How to write portable bitwise operation code? (or how not to think like a c hacker)
Date: Thu, 26 Apr 2007 07:54:09 +0000
Message-ID: <BUYXh.3942$Ut6.3669@newsread1.news.pas.earthlink.net>

Liu Fung Sin wrote:
> Zach Beane wrote:
>> Here's what I'd write:
>>
>>    (defun ipv4-netmask-p (integer)
>>      (let ((x (logandc2 #xFFFFFFFF integer)))
>>        (zerop (logand x (1+ x)))))
> 
> Nice... this is a 1-1 translation of the c code.
> 
> I've read the description of logandc2 in clhs before but I don't
> understand where/how to apply it. I think I am getting the idea, since
> number in CL has arbitrary precision the compliment of a number
> doesn't necessary have a fixed address width like in C (32/64 bit).
> This is a bad habit coming from C that I always visualize what the
> bits are like in the machine.

I'm not sure if you meant to imply this but visualizing the bits when 
writing C/C++ code is a bad idea, too; not only when trying to write 
lisp code.  I've been involved with products which had assumed sizes of 
char/short/int/long, which then tripped over such code when the platform 
was upgraded and supported a bigger address space, i.e. integer types 
got bigger.  I'm currently involved in a product which has a lot of 
networking code which was written on a big-endian architecture which is 
now tripping over code which "visualized the bits" now that the platform 
has been changed to little-endian.  Of course, all of that nasty bad 
code was written *way* perform I started working on it.  No, really. 
Honest.

>> Just takes practice.
> 
> Before I can do that I first need to understand the subject.
> 
> Thank you and also Rob for taking the time to show me the way.

I looked back in this thread on Google and didn't see anybody mention of 
the _Practical Common Lisp_ <http://www.gigamonkeys.com/book/> chapter 
on "Parsing Binary Files" <http://tinyurl.com/9p6ly>.  It was my own 
personal introduction to the wonderful little things, LDB and DPB.  It 
also showed me the wonders of being able to produce lisp code which was 
just about as succinct as the bad old C-style

struct Message { /* ... */ };
Message* p = (Message*)buffer;

But without all of the horrible nasties hiding in the C version; 
alignment, padding, assuming sizes and endianness of members, and 
whatever other little nasal demons it is waiting to provoke to fly out 
of ones nose.  Sure, it takes a bit of work to develop the framework 
that allows for something like the following:

(define-tagged-binary-class id3v2.3-frame ()
   ((id                (frame-id :length 4))
    (size              u4)
    (flags             u2)
    (decompressed-size (optional :type 'u4
                                 :if (frame-compressed-p flags)))
    (encryption-scheme (optional :type 'u1
                                 :if (frame-encrypted-p flags)))
    (grouping-identity (optional :type 'u1
                                 :if (frame-grouped-p flags))))
   (:dispatch (find-frame-class id)))

(defun frame-compressed-p (flags) (logbitp 7 flags))

(defun frame-encrypted-p (flags) (logbitp 6 flags))

(defun frame-grouped-p (flags) (logbitp 5 flags))

But once that framework is in place, the end result is, IMO, brilliant.