Representations of Dynamic Type Information

From: David Gudeman
Subject: Representations of Dynamic Type Information
Date: Fri, 23 Apr 1993 21:04:06 +0000
Message-ID: <93-04-095@comp.compilers>

I'm preparing a document that is intended to be an encyclopedic summary
of all of the different ways of encoding the type of a value in
dynamically typed languages.  In order to make this list as comprehensive
as possible, I thought I'd post to some relevant groups on the net to see
if anyone has a technique I'm not aware of.  Instead of asking for
techniques and having to go through all of the responses looking for new
ones, I thought I'd list the ones I know about and hope that some
implementers will be willing to go through my list looking for their own
favorite technique, and let me know if it is missing.  Also, many of these
techniques are either folk-lore, or (probably re-)invented by me, so I
would appreciate knowing if anyone can give me references that have some
claim to originiality.

I don't need to know about cdr-coding or other methods to represent data,
I'm only interested in methods to encode dynamic type information.

If you can help I prefer to get replies by mail.  If you post a follow-up
to this article, please notice that there are a lot of groups in the
subject line, and some replies may not be relevant for some groups (I
recommend comp.lang.misc for general discussions of representational
schemes).

Any and all help is greatly appreciated.

The LIST:

tagged-words (type information is in the machine word)
  tag fields (word is broken up into tag and data fields)
    tag field in high end (most-significant bits) of word
       use tag of all zeros for one type to avoid tagging cost
       negative ints get a tag of all ones, non-negative ints
                                  get a tag of all zeros
       use sign bit for one type
         use sign-bit = 0 for one type and optimize another type by
                                  giving it the tag of all ones in the
                                  high end and tagging by negation.
    tag field in low end of word
      use two-bit tags to represent word pointers to avoid shifting
        use the tag 00 for word pointers to save the cost of tagging
      use all zeros to optimize integer arithmetic
      optimize integer arithmetic by adding/subtracting tag
                                  after subtraction/addition
    tag field in both ends of word
       various combinations of tricks from the other two options
  partitioning by magnitude (type is encoded in the magnitude
                             of the word used to represent it)
    simple range tests to identify types
    segments with statically determined types
    segments with dynamically determined types
      use BIBOP to identify types
      identify type in the segment itself
        first word of segment
        last word of segment
object-pointers (untagged pointers refering to self-identifying blocks
                 on the heap)
  combinations of this scheme with the tagged-word scheme
descriptors (two-word data elements divided into a type word and a
             value word)
  encoding a qualifier in a descriptor
  encoding a cons cell in a descriptor
--
David Gudeman
·······@cs.arizona.edu
-- 
Send compilers articles to ·········@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers.  Meta-mail to compilers-request.