From: David Gudeman
Subject: Representations of Dynamic Type Information
Date: 
Message-ID: <37749@optima.cs.arizona.edu>
I'm preparing a document that is intended to be an encycleopedic
summary of all of the different ways of encoding the type of a value
in dynamically typed languages.  In order to make this list as
comprehensive as possible, I thought I'd post to some relevant groups
on the net to see if anyone has a technique I'm not aware of.  Instead
of asking for techniques and having to go through all of the responses
looking for new ones, It seemed easier (for me anyway :-) to list the
ones I know about and hope that some implementers will be willing to
go through my list looking for their own favorite technique, and let
me know if it is missing.  Also, many of these techniques are either
folk-lore, or (probably re-)invented by me, so I would appreciate
knowing if anyone can give me references that have some claim to
originiality.

I don't need to know about cdr-coding or other methods to represent
data, I'm only interested in methods to encode dynamic type
information.

If you can help I prefer to get replies by mail.  If you post a
follow-up to this article, please notice that there are a lot of
groups in the subject line, and some replies may not be relevant for
some groups (I recommend comp.lang.misc for general discussions of
representational schemes).

Any and all help is greatly appreciated.

The LIST:

tagged-words (type information is in the machine word)
  tag fields (word is broken up into tag and data fields)
    tag field in high end (most-significant bits) of word
       use tag of all zeros for one type to avoid tagging cost
       negative ints get a tag of all ones, non-negative ints
                                  get a tag of all zeros
       use sign bit for one type
         use sign-bit = 0 for one type and optimize another type by
                                  giving it the tag of all ones in the
                                  high end and tagging by negation.
    tag field in low end of word
      use two-bit tags to represent word pointers to avoid shifting
        use the tag 00 for word pointers to save the cost of tagging
      use all zeros to optimize integer arithmetic
      optimize integer arithmetic by adding/subtracting tag
                                  after subtraction/addition
    tag field in both ends of word
       various combinations of tricks from the other two options
  partitioning by magnitude (type is encoded in the magnitude
                             of the word used to represent it)
    simple range tests to identify types
    segments with statically determined types
    segments with dynamically determined types
      use BIBOP to identify types
      identify type in the segment itself
        first word of segment
        last word of segment
object-pointers (untagged pointers refering to self-identifying blocks
                 on the heap)
  combinations of this scheme with the tagged-word scheme
descriptors (two-word data elements divided into a type word and a
             value word)
  encoding a qualifier in a descriptor
  encoding a cons cell in a descriptor
-- 
					David Gudeman
·······@cs.arizona.edu