Hello!
I am trying to add one more representation scheme to Spartns[0], and it
would be necessary to use individual bits from the index keys, so I was
wondering if it is possible to treat any Common Lisp object such
as symbols, strings and structures as sequences of bits so I can use
logior, logand, ash etc (as I would do, for example, in C with sizeof
and uint8_t *).
From what I can see, it would be possible for strings, although in a
somewhat awkward way: use char-int on each character, then do the
operations, then use digit-char on the result.
So I search the web and CLtL, but there was nothing that could help with
this. Is there something I missed?
Thanks a lot,
J.
[0] http://aleph0.info/spartns -- I am planning to allow non-numbers
in the indices, turning it into both a tensor representation library and
an efficient multi-dimensional dictionary.
Jeronimo Pellegrini wrote:
> Hello!
>
> I am trying to add one more representation scheme to Spartns[0], and
> it would be necessary to use individual bits from the index keys, so
> I was wondering if it is possible to treat any Common Lisp object such
> as symbols, strings and structures as sequences of bits so I can use
> logior, logand, ash etc (as I would do, for example, in C with sizeof
> and uint8_t *).
>
> From what I can see, it would be possible for strings, although in a
> somewhat awkward way: use char-int on each character, then do the
> operations, then use digit-char on the result.
I know of no way to do this in general. As far as strings go I have
converted them to bit vectors and then used the functions BIT-XOR,
BIT-AND, etc. on the bit vectors, coercing the result back to a string.
Some other object types could first be written to strings, and in the
case of symbols the function SYMBOL-NAME used for that task. If this
approach would be useful and you want to see my "coerce" defuns to go to
and from bit vectors and strings I'll be glad to post them.
Carl Taylor
On 2008-08-22, Carl Taylor <··········@att.net> wrote:
> Jeronimo Pellegrini wrote:
>> Hello!
>>
>> I am trying to add one more representation scheme to Spartns[0], and
>> it would be necessary to use individual bits from the index keys, so
>> I was wondering if it is possible to treat any Common Lisp object such
>> as symbols, strings and structures as sequences of bits so I can use
>> logior, logand, ash etc (as I would do, for example, in C with sizeof
>> and uint8_t *).
>>
>> From what I can see, it would be possible for strings, although in a
>> somewhat awkward way: use char-int on each character, then do the
>> operations, then use digit-char on the result.
>
> I know of no way to do this in general. As far as strings go I have
> converted them to bit vectors and then used the functions BIT-XOR,
> BIT-AND, etc. on the bit vectors, coercing the result back to a string.
> Some other object types could first be written to strings, and in the
> case of symbols the function SYMBOL-NAME used for that task. If this
> approach would be useful and you want to see my "coerce" defuns to go to
> and from bit vectors and strings I'll be glad to post them.
Hmm, that could work. I was expecting to get an immediate sequence of
bit (or bytes, whatever) that uniquely identified each object. This is
for efficiency and also because it would be kind of elegant, but
anyway...
About the efficiency issue: I can solve the problem for a single
character:
(defun char-ash (c n)
(declare (optimize (speed 3) (debug 3) (safety 0))
(type character c)
(type (signed-byte 60) n))
(the fixnum (ash (char-int c) n)))
In SBCL, this compiles to:
i
;;;; (disassemble 'char-ash) ...
; 03BFD079: 4809C9 OR RCX, RCX ;
no-arg-parsing entry point
; 7C: 7912 JNS L1
; 7E: 48F7D9 NEG RCX
; 81: 4883F93F CMP RCX, 63
; 85: 7604 JBE L0
; 87: 31D2 XOR EDX, EDX
; 89: EB08 JMP L2
; 8B: L0: 48D3EA SHR RDX, CL
; 8E: EB03 JMP L2
; 90: L1: 48D3E2 SHL RDX, CL
; 93: L2: 48C1E203 SHL RDX, 3
; 97: 488D65F0 LEA RSP, [RBP-16]
; 9B: F8 CLC
; 9C: 488B6DF8 MOV RBP, [RBP-8]
; A0: C20800 RET 8
And if I only need to shift to left (n>0), I can declare
(type (unsigned-byte 60) n)), and I get:
;;;; (disassemble 'char-ash) ...
; 02D501C5: 48C1E203 SHL RDX, 3 ; no-arg-parsing entry point
; C9: 48C1F903 SAR RCX, 3
; CD: 48D3E2 SHL RDX, CL
; D0: 488D65F0 LEA RSP, [RBP-16]
; D4: F8 CLC
; D5: 488B6DF8 MOV RBP, [RBP-8]
; D9: C20800 RET 8
Pretty much what I intended! All functions inlined, and the assembly
goes right to what needs to be done (no checking, setting up, etc). [0]
So I suppose using, for example char-int, char-code and code-char in each
element is the right way to deal with strings?
[0] In Spartns, the optimize dclarations are configurable; I use
safety 0 for heavily tested code only.
Jeronimo Pellegrini <···@aleph0.info> writes:
> I am trying to add one more representation scheme to Spartns[0], and it
> would be necessary to use individual bits from the index keys, so I was
> wondering if it is possible to treat any Common Lisp object such
> as symbols, strings and structures as sequences of bits so I can use
> logior, logand, ash etc (as I would do, for example, in C with sizeof
> and uint8_t *).
What makes you think that symbols are represented with bits?
This is a computer:
http://en.wikipedia.org/wiki/MONIAC_Computer
there's no bit in there.
I'm not saying that there's a CL implementation running on that kind
of computer, but nonetheless, symbols don't have bits, they've got identity.
> From what I can see, it would be possible for strings, although in a
> somewhat awkward way: use char-int on each character, then do the
> operations, then use digit-char on the result.
>
> So I search the web and CLtL, but there was nothing that could help with
> this. Is there something I missed?
>
> Thanks a lot,
> J.
>
> [0] http://aleph0.info/spartns -- I am planning to allow non-numbers
> in the indices, turning it into both a tensor representation library and
> an efficient multi-dimensional dictionary.
Well, CL has GETHASH to do that.
(let ((table (make-hash-table :test (function equal))))
(setf (gethash 'index table) 1
(gethash 42 table) 2
(gethash "Hi" table) 3)
(list (gethash 'index table)
(gethash 42 table)
(gethash "Hi" table)))
--> (1 2 3)
--
__Pascal Bourguignon__ http://www.informatimago.com/
"I have challenged the entire quality assurance team to a Bat-Leth
contest. They will not concern us again."
On 2008-08-22, Pascal J. Bourguignon <···@informatimago.com> wrote:
> Jeronimo Pellegrini <···@aleph0.info> writes:
>> I am trying to add one more representation scheme to Spartns[0], and it
>> would be necessary to use individual bits from the index keys, so I was
>> wondering if it is possible to treat any Common Lisp object such
>> as symbols, strings and structures as sequences of bits so I can use
>> logior, logand, ash etc (as I would do, for example, in C with sizeof
>> and uint8_t *).
>
> What makes you think that symbols are represented with bits?
Oops! I had read about the symbol table before, but forgot about it.
> This is a computer:
> http://en.wikipedia.org/wiki/MONIAC_Computer
> there's no bit in there.
Right, but since I am running ANSI Common Lisp, whatever the computer
it runs on, the functions logior, logxor, logand, ash, etc should
work... :-)
> I'm not saying that there's a CL implementation running on that kind
> of computer, but nonetheless, symbols don't have bits, they've got identity.
Anyway -- I understand that not all objects will have an easy-to-get
sequence of bits that identify them. That leaves strings and characters that
could still be used with bitwise operators.
>> [0] http://aleph0.info/spartns -- I am planning to allow non-numbers
>> in the indices, turning it into both a tensor representation library and
>> an efficient multi-dimensional dictionary.
>
> Well, CL has GETHASH to do that.
>
> (let ((table (make-hash-table :test (function equal))))
> (setf (gethash 'index table) 1
> (gethash 42 table) 2
> (gethash "Hi" table) 3)
> (list (gethash 'index table)
> (gethash 42 table)
> (gethash "Hi" table)))
> --> (1 2 3)
Yes -- I wanted to implement, for example, Patricia trees. They would
have different properties, and would allow me to list all bit-vector
keys with a common prefix; traverse the whole set in the same order,
regardless of how the items were included, etc. That would not work
with hashtables.
(And actually, Spartns already has a "HASH" representation scheme,
which basically uses hashtables in the way you suggested).
J.
In article <························@read.cnntp.org>,
Jeronimo Pellegrini <···@aleph0.info> wrote:
> On 2008-08-22, Pascal J. Bourguignon <···@informatimago.com> wrote:
> > Jeronimo Pellegrini <···@aleph0.info> writes:
> >> I am trying to add one more representation scheme to Spartns[0], and it
> >> would be necessary to use individual bits from the index keys, so I was
> >> wondering if it is possible to treat any Common Lisp object such
> >> as symbols, strings and structures as sequences of bits so I can use
> >> logior, logand, ash etc (as I would do, for example, in C with sizeof
> >> and uint8_t *).
> >
> > What makes you think that symbols are represented with bits?
>
> Oops! I had read about the symbol table before, but forgot about it.
>
> > This is a computer:
> > http://en.wikipedia.org/wiki/MONIAC_Computer
> > there's no bit in there.
>
> Right, but since I am running ANSI Common Lisp, whatever the computer
> it runs on, the functions logior, logxor, logand, ash, etc should
> work... :-)
Yeah, there are bits down in the hardware, but what would it mean to
perform these operations on them?
A complex object like a symbol is a container with a bunch of pointers
in it, pointing to the name, value, function, plist, etc. It wouldn't
be a good idea to depend on these pointer values, because the garbage
collector can move all those objects around and change the pointers.
Anything you'd done that depended on the old raw values would no longer
be valid.
If you want a number that represents an object, and can be used safely,
use SXHASH.
--
Barry Margolin, ······@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
Barry Margolin <······@alum.mit.edu> writes:
> If you want a number that represents an object, and can be used safely,
> use SXHASH.
That will work as long as there is no requirement that the number
uniquely identify the object.
Although a hash function has to always map an object to the same hash
code, objects are allowed to share hash codes.
I would think that the OP needs to write his own "characteristic number"
function to generate the identifiers that he seeks. That is the only
way to guarantee that the resulting numbers have the desired properties.
For example, in ACL 7.0, vectors seem to hash to their length:
CL-USER> (sxhash #(1 2 3))
3
CL-USER> (sxhash #(4 5 6))
3
CL-USER> (sxhash #(4 5 6 7 8))
5
--
Thomas A. Russ, USC/Information Sciences Institute