Adding an "any" type to the JVM (was Re: Java vs lisp)

From: David Hopwood
Subject: Adding an "any" type to the JVM (was Re: Java vs lisp)
Date: Mon, 07 Apr 1997 00:00:00 +0000
Message-ID: <1997040704074274952@zetnet.co.uk>

In message <··········@epervier.CC.UMontreal.CA>
        ······@JSP.UMontreal.CA (MENARD Steve) writes: 

> Espen Vestre (··@nextel.no) wrote:
> : ······@lavielle.com (Rainer Joswig) writes:
> : 
> : > [...] SUN has built a *Java* virtual machine - and not a more
> : > universal VM. Sigh.
> : 
> : Could you, or someone else, elaborate on that?  It's somewhat
> : disappointing, I sort of naively hoped that Java was close enough to
> : lisp to make the Java VM suitable as a lisp platform, [...]
> 
>  Well, I'm not an expert at writing compilers, but I have just finished
> writing a Scheme->JVM backend for the Gambit-C system, so I can tell you
> why some features lacking in the JVM make it hard, if not impossible, to
> write efficient translators.

>   First, the lack of pointers to basic types. Since, at any moment, any
> data can be put in a list or vector, you must always be able to reference
> any data. So, to use simple types (like int, char, double, etc), you must
> wrap them in a class. Simple enough to do, however, the overhead of
> getting the data from within the wrapper object, and allocating an object
> for even simple arithmetics like +, is quite annoying.

Would the addition of an "any" type be sufficient? It seems to me that this
wouldn't be too hard to implement: "any" would be a tagged primitive type
holding an integer, char, boolean, float or reference (longs and doubles are
inconveniently large when tag bits are needed; it might be better to
implement those using wrappers). There would be bytecodes for conversion
between "any" and other primitive types (checked, where necessary), and some
operations such as addition that follow the normal Smalltalk or Lisp semantics
(either do a primitive addition or call a method).

For example (assuming the signature character for "any" is "T"):
  t2l:  convert an any to a long, either directly or by calling
        "longValue()J" - throw an exception if longValue is not found
  t2i:  convert an any to an int, either directly or by calling
        "intValue()I" - throw an exception if intValue is not found
  tadd: add two any values, using primitive addition (creating a
        java.math.BigInteger or BigDecimal on overflow) or by calling
        "add(T)T"
  also t2a, l2t, i2t, a2t, tmul, tsub, tdiv, tload, tstore, treturn, etc.
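
The intended semantics of t2i and tadd can be sketched in ordinary Java.
This is only an emulation under stated assumptions: Object stands in for the
unboxed tagged "any" word, the Addable interface is a hypothetical stand-in
for looking up "add(T)T", and the exception types are arbitrary choices.

```java
import java.math.BigInteger;

// Hypothetical stand-in for the "add(T)T" method that tadd would look up.
interface Addable {
    Object add(Object other);
}

// Emulation only: a real "any" would be a tagged word, not a boxed Object.
final class AnySketch {
    // t2i: convert directly when the value is an int, else call intValue().
    static int t2i(Object any) {
        if (any instanceof Integer) {
            return ((Integer) any).intValue();
        }
        if (any instanceof Number) {
            return ((Number) any).intValue();
        }
        // Assumed exception type; the proposal only says "throw an exception".
        throw new ClassCastException("no intValue() for " + any);
    }

    // tadd: primitive addition, promoting to BigInteger on overflow,
    // otherwise fall back to calling the operand's add method.
    static Object tadd(Object a, Object b) {
        if (a instanceof Integer && b instanceof Integer) {
            long sum = (long) ((Integer) a).intValue() + ((Integer) b).intValue();
            if (sum == (int) sum) {
                return Integer.valueOf((int) sum); // still fits in an int
            }
            return BigInteger.valueOf(sum);        // overflowed the int range
        }
        if (a instanceof Addable) {
            return ((Addable) a).add(b);
        }
        throw new UnsupportedOperationException("no add(T)T for " + a);
    }
}
```

In this emulation every result is still a heap object; the point of a real
tagged "any" would be that the common small-integer case needs no allocation.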

I think an "any" type would be a useful addition to the Java language, as
well as the VM (it would help in calling code in other languages, and could
be used to make the Reflection API and many other things much cleaner).

>   Second, there is no instruction, in the JVM, to jump to an arbitrary,
> unknown at compile-time location. You can't read an address within a
> variable and jump there.

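If you could do that, it would be infeasible to verify the bytecode: the
verifier needs to know every possible branch target statically in order to
check the stack and local variable types along each path.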

> So, you can either put each lambda (scheme
> procedure) in its own Java method, which means you can't do
> tail-call optimisation, or you end up creating a dispatch-table (which is
> what I did) and "hosting" all the scheme procedures within a single Java
> method. Quite easy, until you try to implement "load" ...
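
The dispatch-table approach described above might look roughly like the
following in Java source form. This is a hedged sketch, not the Gambit-C
backend's actual output: the labels EVEN and ODD and the method run are
made up for illustration, and two mutually tail-calling procedures stand in
for arbitrary Scheme code.

```java
// Hypothetical sketch: two Scheme procedures hosted in a single Java method.
final class DispatchSketch {
    // Assumed labels, one per hosted procedure.
    static final int EVEN = 0;
    static final int ODD = 1;

    // A tail call becomes "set the label, go round the loop again",
    // so the Java stack does not grow however long the call chain is.
    static boolean run(int pc, int n) {
        while (true) {
            switch (pc) {
                case EVEN:
                    if (n == 0) return true;
                    n = n - 1;
                    pc = ODD;      // tail call to the "odd?" procedure
                    break;
                case ODD:
                    if (n == 0) return false;
                    n = n - 1;
                    pc = EVEN;     // tail call to the "even?" procedure
                    break;
                default:
                    throw new IllegalStateException("unknown label " + pc);
            }
        }
    }
}
```

The set of cases is fixed when the method is compiled, which is where
"load" becomes awkward: procedures arriving at run time have no label in
the existing switch.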

Can't tail-call optimisation be done fairly simply by a JIT compiler? I
would think it just requires the JIT to recognize a method invocation
followed by a return, and compile that more efficiently.
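
As a rough illustration of the pattern in question, here is a minimal Java
sketch covering only the simplest case, a method that tail-calls itself.
The names sumTo and sumToLoop are made up for the example; the loop version
shows the effect of reusing the frame, not what any particular JIT emits.

```java
// Hypothetical sketch of a self tail call and its frame-reusing equivalent.
final class TailCallSketch {
    // The recursive invocation is immediately followed by a return,
    // so nothing in the caller's frame is needed after the call.
    static long sumTo(long n, long acc) {
        if (n == 0) {
            return acc;
        }
        return sumTo(n - 1, acc + n);
    }

    // The same computation with the call replaced by rebinding the
    // parameters and jumping back to the top of the method.
    static long sumToLoop(long n, long acc) {
        while (true) {
            if (n == 0) {
                return acc;
            }
            acc = acc + n;
            n = n - 1;
        }
    }
}
```

Without the optimisation, sumTo uses one stack frame per step and can
overflow the stack for large n, while sumToLoop runs in constant stack
space. Tail calls between different methods would need the JIT to replace
the frame rather than just loop.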

> Lastly, although the JVM is a stack machine, the local stack handling is
> very primitive. Most stack machines I've seen have ways to read cells
> other than the top, which does not hold true with the JVM. 

Yes, I agree, new bytecodes might have to be added to handle that.

The other change that would be useful to support other languages is
expanded types (using Eiffel terminology), especially arrays of expanded
types.

David Hopwood
·············@lmh.ox.ac.uk, ·······@zetnet.co.uk