From: Will Hartung
Subject: Commercial "Java" machine
Date: 
Message-ID: <2rtn6gF1e1me3U1@uni-berlin.de>
A company is coming out with a custom Java Appliance. Custom CPU, etc. It's
not interactive, it's basically just a Java VM machine to run server
applications.

http://java.ittoolbox.com/news/dispnews.asp?i=121712

A comment on a message board (since we're still lacking details):

"Yeah, you're missing something: One Azul box will have more Java horsepower
than an entire rack of servers (I don't know yet if they've unveiled just
how much, so that's all I can say,) and it can handle a couple of hundred GB
on a Java heap with no full GC pauses."

So, the Lisp Machine is born Yet Again, but this time it's speaking Java.

Regards,

Will Hartung
(·····@msoft.com)

From: Tim Bradshaw
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <1096397104.218757.40230@h37g2000oda.googlegroups.com>
Will Hartung wrote:
>
> "Yeah, you're missing something: One Azul box will have more Java
> horsepower than an entire rack of servers (I don't know yet if they've
> unveiled just how much, so that's all I can say,) and it can handle a
> couple of hundred GB on a Java heap with no full GC pauses."
>
> So, the Lisp Machine is born Yet Again, but this time it's speaking
> Java.

Sun were (going to) produce Java chips some time in the late 90s,
weren't they?  I wonder why they didn't take off since
language-specific HW is obviously such a great idea?

--tim
From: Will Hartung
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <2rtvl1F1epsleU1@uni-berlin.de>
"Tim Bradshaw" <··········@tfeb.org> wrote in message
····························@h37g2000oda.googlegroups.com...
> Will Hartung wrote:
> > So, the Lisp Machine is born Yet Again, but this time it's speaking
> > Java.
>
> Sun were (going to) produce Java chips some time in the late 90s,
> weren't they?  I wonder why they didn't take off since
> language-specific HW is obviously such a great idea?

Technically, it's a JVM vs a Java Language machine, but that's nitting
picks.

But what is interesting is that these folks may have added hardware support
of some kind for Garbage Collection (assuming the "several hundred GB heap
without a pausing full GC" is accurate).

It may well simply pan out as a multi-core CPU technology.

Regards,

Will Hartung
(·····@msoft.com)
From: Mike
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <10ljk0v8lf7rh12@corp.supernews.com>
In article <···············@uni-berlin.de>, Will Hartung wrote:
> 
> "Tim Bradshaw" <··········@tfeb.org> wrote in message
> ····························@h37g2000oda.googlegroups.com...
>> Will Hartung wrote:
>> > So, the Lisp Machine is born Yet Again, but this time it's speaking
>> > Java.
>>
>> Sun were (going to) produce Java chips some time in the late 90s,
>> weren't they?  I wonder why they didn't take off since
>> language-specific HW is obviously such a great idea?
> 
> Technically, it's a JVM vs a Java Language machine, but that's nitting
> picks.
> 
> But what is interesting is that these folks may have added hardware support
> of some kind for Garbage Collection (assuming the "several hundred GB heap
> without a pausing full GC" is accurate).
> 
> It may well simply pan out as a multi-core CPU technology.
> 
> Regards,
> 
> Will Hartung
> (·····@msoft.com)

There are JVMs embedded in PIC chips. Search for 'Tini'. It's an
interesting device.
From: Tim Bradshaw
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <1096449907.297563.244290@h37g2000oda.googlegroups.com>
Will Hartung wrote:
>
> But what is interesting is that these folks may have added hardware
> support of some kind for Garbage Collection (assuming the "several
> hundred GB heap without a pausing full GC" is accurate).

Well, permit me to play the cynic.

For recent (I measured the Sun 1.5RC JVM) stock-HW JVM implementations
the basic memory-management performance already seems fairly good. For
lots of transient data it is really good already I think (but this is
kind of a best case for a modern GC I suspect).  The harder case is
not pausing for huge long-lived-heaps without just copping out and
tenuring the whole lot.  Isn't that `just' a GC which runs in parallel
with the rest of the system?  While that's hard, I don't think it's
really new technology is it?  Existing JVMs already have some of this
anyway (see http://java.sun.com/docs/hotspot/gc1.4.2/index.html), and
they'll be getting more as time goes on.
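
For anyone wanting to repeat this kind of measurement, here is a minimal sketch, assuming a JVM that exposes the standard java.lang.management beans (available from Java 5 onward). The class name GcProbe, the method names, and the buffer sizes are made up for illustration. It churns through short-lived buffers, the "lots of transient data" case, and reports how many collections the JVM says it ran.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of any JVM or library.
final class GcProbe {
    /** Sums collection counts over every collector the JVM reports; -1 (undefined) counts as 0. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount());
        }
        return total;
    }

    /** Allocates short-lived buffers and returns the total number of bytes requested. */
    static long churn(int rounds, int bufferSize) {
        long allocated = 0;
        for (int i = 0; i < rounds; i++) {
            byte[] transientBuffer = new byte[bufferSize]; // garbage as soon as the iteration ends
            allocated += transientBuffer.length;
        }
        return allocated;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        long start = System.nanoTime();
        long bytes = churn(200_000, 64 * 1024);            // sizes chosen arbitrarily
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(bytes + " bytes requested in " + elapsedMs + " ms, "
                + (totalCollections() - before) + " collections reported");
    }
}
```

The numbers depend heavily on the JVM, its flags, and whether the JIT removes the allocations altogether, so treat this as a probe rather than a benchmark. The long-lived-heap case needs a workload that actually retains data.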

I guess the claim is that some HW trick makes this much better, and
that might well be true.  However it has to not make everything else
worse, and also not mean you can't keep up with the speeds of stock-HW
systems, as well.

--tim
From: Will Hartung
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <2s071bF1caekaU1@uni-berlin.de>
"Tim Bradshaw" <··········@tfeb.org> wrote in message
·····························@h37g2000oda.googlegroups.com...
> Will Hartung wrote:
> >
> > But what is interesting is that these folks may have added hardware
> > support of some kind for Garbage Collection (assuming the "several
> > hundred GB heap without a pausing full GC" is accurate).
>
> Well, permit me to play the cynic.

But of course!

> For recent (I measured the Sun 1.5RC JVM) stock-HW JVM implementations
> the basic memory-management performance already seems fairly good. For
> lots of transient data it is really good already I think (but this is
> kind of a best case for a modern GC I suspect).  The harder case is
> not pausing for huge long-lived-heaps without just copping out and
> tenuring the whole lot.  Isn't that `just' a GC which runs in parallel
> with the rest of the system?  While that's hard, I don't think it's
> really new technology is it?  Existing JVMs already have some of this
> anyway (see http://java.sun.com/docs/hotspot/gc1.4.2/index.html), and
> they'll be getting more as time goes on.

IMHO, Sun is basically at the forefront of GC tech right now, trying their
best to get Java working as best as it can. They've dumped a lot of time and
money into getting their JVM GC to work on their super multi-processor
machines. For our apps, the GC system has been working just fine, save at
extreme edge cases. (However, we have a current system that for some reason
has lost its ability to do incremental GC's after a certain point, and
rather just does Full GCs...very annoying -- it will Full GC from 1GB down
to 400MB, and then fill it back up again).

> I guess the claim is that some HW trick makes this much better, and
> that might well be true.  However it has to not make everything else
> worse, and also not mean you can't keep up with the speeds of stock-HW
> systems, as well.

This goes back to the whole "LispOS" and "Lisp Hardware" "thing". It has
never been clearly explained to me what advantages custom hardware/custom OS
would have in a GC environment. The other benefit of Lisp Hardware is
hardware support for tagging, but that's clearly not relevant in a JVM
machine.

As far as GC support in the OS, the best I've heard is where the GC is more
tightly integrated with the Virtual Memory system to the point where when
performing its GC, it can know to not page in pages off of disk. Delaying
GCing on them until they actually load, or whatever. Also, something about
"write boundaries" and such that I never quite understood (having not done
any real research in GC's in general).

So, I would THINK that if someone were going to implement a specific JVM
device, and that they were building their own CPUs to do it, they'd be able
to design the chips and/or underlying BIOS or OS to enhance the GC process.
I mean, don't you think there would be some benefit worth the time to
implement it?

Regards,

Will Hartung
(·····@msoft.com)
From: Mike
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <10lltqg34vrj2cc@corp.supernews.com>
In article <···············@uni-berlin.de>, Will Hartung wrote:
> 
> "Tim Bradshaw" <··········@tfeb.org> wrote in message
> ·····························@h37g2000oda.googlegroups.com...
>> Will Hartung wrote:
>> >
>> > But what is interesting is that these folks may have added hardware
>> > support of some kind for Garbage Collection (assuming the "several
>> > hundred GB heap without a pausing full GC" is accurate).
>>
>> Well, permit me to play the cynic.
> 
> But of course!
> 
>> For recent (I measured the Sun 1.5RC JVM) stock-HW JVM implementations
>> the basic memory-management performance already seems fairly good. For
>> lots of transient data it is really good already I think (but this is
>> kind of a best case for a modern GC I suspect).  The harder case is
>> not pausing for huge long-lived-heaps without just copping out and
>> tenuring the whole lot.  Isn't that `just' a GC which runs in parallel
>> with the rest of the system?  While that's hard, I don't think it's
>> really new technology is it?  Existing JVMs already have some of this
>> anyway (see http://java.sun.com/docs/hotspot/gc1.4.2/index.html), and
>> they'll be getting more as time goes on.
> 
> IMHO, Sun is basically at the forefront of GC tech right now, trying their
> best to get Java working as best as it can. They've dumped a lot of time and
> money into getting their JVM GC to work on their super multi-processor
> machines. For our apps, the GC system has been working just fine, save at
> extreme edge cases. (However, we have a current system that for some reason
> has lost its ability to do incremental GC's after a certain point, and
> rather just does Full GCs...very annoying -- it will Full GC from 1GB down
> to 400MB, and then fill it back up again).
> 
>> I guess the claim is that some HW trick makes this much better, and
>> that might well be true.  However it has to not make everything else
>> worse, and also not mean you can't keep up with the speeds of stock-HW
>> systems, as well.
> 
> This goes back to the whole "LispOS" and "Lisp Hardware" "thing". It has
> never been clearly explained to me what advantages custom hardware/custom OS
> would have in a GC environment. The other benefit of Lisp Hardware is
> hardware support for tagging, but that's clearly not relevant in a JVM
> machine.
> 
> As far as GC support in the OS, the best I've heard is where the GC is more
> tightly integrated with the Virtual Memory system to the point where when
> performing its GC, it can know to not page in pages off of disk. Delaying
> GCing on them until they actually load, or whatever. Also, something about
> "write boundaries" and such that I never quite understood (having not done
> any real research in GC's in general).
> 
> So, I would THINK that if someone were going to implement a specific JVM
> device, and that they were building their own CPUs to do it, they'd be able
> to design the chips and/or underlying BIOS or OS to enhance the GC process.
> I mean, don't you think there would be some benefit worth the time to
> implement it?
> 
> Regards,
> 
> Will Hartung
> (·····@msoft.com)

Honestly, the only reason I can think of for having a LispOS or
Lisp hardware is the neatness factor. It goes back to garage construction
of your computers and such. Creating a custom wearable that has
embedded lisp with head mounted display, twiddler keyboard, fully
mobile and connected at the same time. That's cool.

Mike
From: Christopher C. Stacy
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <uekkklsiv.fsf@news.dtpq.com>
"Will Hartung" <·····@msoft.com> writes:
> This goes back to the whole "LispOS" and "Lisp Hardware" "thing".
> It has never been clearly explained to me what advantages custom
> hardware/custom OS would have in a GC environment. The other benefit
> of Lisp Hardware is hardware support for tagging, but that's clearly
> not relevant in a JVM machine.

Two big things offhand are "Invisible Forwarding Pointer" type tags,
and GC bits (eg. read/write barriers) supported in the cache and TLB.
(I'm no expert on the Symbolics GC system, though.)
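
For readers unfamiliar with the term, here is a minimal sketch of a software write barrier using card marking, a common technique on stock hardware. It is not a description of the Symbolics design or of any particular JVM. The class CardTable, the card size, and the simulated heap array are all assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical model: a flat array stands in for old-generation memory.
final class CardTable {
    private static final int CARD_SHIFT = 9;   // 512 slots per card, an arbitrary illustrative size
    private final Object[] heap;               // simulated heap slots
    private final boolean[] dirty;             // one flag per card

    CardTable(int slots) {
        heap = new Object[slots];
        dirty = new boolean[(slots + (1 << CARD_SHIFT) - 1) >>> CARD_SHIFT];
    }

    /** The write barrier: perform the store, then mark the card containing the slot. */
    void write(int slot, Object ref) {
        heap[slot] = ref;
        dirty[slot >>> CARD_SHIFT] = true;
    }

    boolean isDirty(int card) {
        return dirty[card];
    }

    /** Number of cards a collector would have to rescan for modified references. */
    int dirtyCount() {
        int count = 0;
        for (boolean d : dirty) {
            if (d) count++;
        }
        return count;
    }

    /** Called after the collector has scanned the dirty cards. */
    void clean() {
        Arrays.fill(dirty, false);
    }
}
```

The cost here is an extra store on every reference write. Hardware support of the kind mentioned above would aim to make that bookkeeping cheaper or free, which is why it keeps coming up in these discussions.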
From: Tim Bradshaw
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <1096634133.690997.306920@k17g2000odb.googlegroups.com>
Will Hartung wrote:
>
> IMHO, Sun is basically at the forefront of GC tech right now, trying
> their best to get Java working as best as it can. They've dumped a lot
> of time and money into getting their JVM GC to work on their super
> multi-processor machines.

That's the impression I get from reading the manuals.  Last time I had
to deal with java (in the 1.0/1.1 timeframe) the GC on the Sun JVM was
really laughably bad - some 70s mark-and-sweep thing I think.  Now
there are clearly several very sophisticated collectors including ones
designed to support big machines.  I'm not sure if there are Lisp
systems with GCs as good as the Sun JVM now (perhaps Allegro, although
I don't know if they do big-machine-friendly GC yet).

> As far as GC support in the OS, the best I've heard is where the GC is
> more tightly integrated with the Virtual Memory system to the point
> where when performing its GC, it can know to not page in pages off of
> disk. Delaying GCing on them until they actually load, or whatever.
> Also, something about "write boundaries" and such that I never quite
> understood (having not done any real research in GC's in general).

I'd assume that this kind of framework would be useful in a
non-Lisp/Java OS too, and I wouldn't be surprised if Unices start
getting it, if they don't already have it (for instance I can think of
lots of hairy-IO scenarios where it would be very nice to be told when
pages arrive in core, so operations on their data can be scheduled as
the pages arrive).

I don't know about write barriers.  I remember someone (I think Duane)
who said that the single thing that would help most for Lisp-family
languages (including, I guess, Java) would be fast traps (no trip
through the OS), which may be related to this.  Again, is there a
reason this can't be done in a conventional OS?  I'm not sure.

--tim
From: Joe Marshall
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <vfdujndi.fsf@ccs.neu.edu>
"Tim Bradshaw" <··········@tfeb.org> writes:

> I don't know about write barriers.  I remember someone (I think Duane)
> who said that the single thing that would help most for Lisp-family
> languages (including, I guess, Java) would be fast traps (no trip
> through the OS), which may be related to this.  Again, is there a
> reason this can't be done in a conventional OS?  I'm not sure.

The hardware has to support `user-space' interrupts for the OS to be
able to do this.
From: Steven M. Haflich
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <b908d.5976$nj.699@newssvr13.news.prodigy.com>
Joe Marshall wrote:

> The hardware has to support `user-space' interrupts for the OS to be
> able to do this.

To amplify on Joe's remark, consider why we don't have this facility in
practice.

The problem for a language / gc implementor is that he really
wants to choose a design that will work across a wide range of
available systems.  Even if one motherboard implementor and OS
purveyor combination provides what we would want, there is still a
strong disincentive to exploit those features if the implementation
won't port to other platforms.  It is expensive and impractical to
support multiple fundamentally-different low-level designs, so we
are forced to choose designs that will work on common-denominator
platforms.

A secondary problem with an "infrequently used" feature like a
write barrier is that OS / hardware providers might not treat
them as critical parts of the OS.  If Microsoft or Dell broke a
feature like write barrier in some periodic OS/hardware release,
it might be a year or two before it was fixed.  This is
unbearable by purveyors of small-user-community languages.

So we language implementors don't use it because most of them
platform implementors  don't provide it, and most of them platform
implementors don't provide it because most of us language
implementors don't use it.
From: Tim Bradshaw
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <1096898803.624629.183980@k26g2000oda.googlegroups.com>
Steven M. Haflich wrote:
> The problem for a language / gc implementor is that he really
> wants to choose a design that will work across a wide range of
> available systems.  Even if one motherboard implementor and OS
> purveyor combination provides what we would want, there is still a
> strong disincentive to exploit those features if the implementation
> won't port to other platforms.  It is expensive and impractical to
> support multiple fundamentally-different low-level designs, so we
> are forced to choose designs that will work on common-denominator
> platforms.
>

I agree completely.  I'm still curious as to whether such a feature is
realistically possible on a modern machine (even though it still might
not be advisable to use it for the reasons you describe). By
`realistically possible' I guess I mean `could be supported without
some huge change to the way the OS works' where `huge change' might be
`not having (parts of) the OS run in some privileged mode' or something
like that
(I'm trying to avoid the silly `yes this can be done if you can live
with no hardware-enforced memory protection'-type answer that I
suspect I may get otherwise: I can't live with that).  Alternatively
`could it be done on current hardware while supporting a modern
Unix-flavour OS'.
(And this isn't a rhetorical question: I'd like to know!)

--tim
From: Tim Bradshaw
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <1096884842.507016.311400@k26g2000oda.googlegroups.com>
Joe Marshall wrote:
>
> The hardware has to support `user-space' interrupts for the OS to be
> able to do this.

Do typical modern architectures do so?

--tim
From: Joe Marshall
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <7jq6iewr.fsf@ccs.neu.edu>
"Tim Bradshaw" <··········@tfeb.org> writes:

> Joe Marshall wrote:
>>
>> The hardware has to support `user-space' interrupts for the OS to be
>> able to do this.
>
> Do typical modern architectures do so?

I don't think it is common, and it'd be pretty hard to do it right.
From: Michael J. Ferrador
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <nxF6d.10606$kq6.4678046@news4.srv.hcvlny.cv.net>
Tim Bradshaw wrote:
> Will Hartung wrote:
>
>> "Yeah, you're missing something: One Azul box will have more Java
>> horsepower than an entire rack of servers (I don't know yet if they've
>> unveiled just how much, so that's all I can say,) and it can handle a
>> couple of hundred GB on a Java heap with no full GC pauses."
>>
>> So, the Lisp Machine is born Yet Again, but this time it's speaking
>> Java.
>
> Sun were (going to) produce Java chips some time in the late 90s,
> weren't they?  I wonder why they didn't take off since
> language-specific HW is obviously such a great idea?
>
> --tim


ARM's Jazelle (But WHERE can I get a PDA w/ one !!!)

http://www.arm.com/products/solutions/Jazelle.html
From: Andras Simon
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <vcdsm926yen.fsf@csusza.math.bme.hu>
"Will Hartung" <·····@msoft.com> writes:

> 
> So, the Lisp Machine is born Yet Again, but this time it's speaking Java.
> 

Maybe it is, but I hear Common Lisp (via abcl). 

Andras
From: Gareth McCaughan
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <87acvannz6.fsf@g.mccaughan.ntlworld.com>
Will Hartung wrote:

> A comment on a message board (since we're still lacking details):
> 
> "Yeah, you're missing something: One Azul box will have more Java horsepower
> than an entire rack of servers (I don't know yet if they've unveiled just
> how much, so that's all I can say,) and it can handle a couple of hundred GB
> on a Java heap with no full GC pauses."

That would be very interesting if it were true.

We are constantly hearing how modern Java VMs are able to
run Java code at something like the same speed as native
code in languages like C -- say, within a factor of 2 on
typical problems.

So if these "Azul boxes" run Java an order of magnitude
or so faster than ordinary machines do, and ordinary machines
run Java at roughly the same speed as they run C, then
an Azul box should be able to run Java an order of magnitude
faster than a Pentium 4 Xeon or an Opteron or whatever runs C.

Call me skeptical if you will, but that doesn't seem
likely.

-- 
Gareth McCaughan
.sig under construc
From: Russell Wallace
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <415b0413.43892168@news.eircom.net>
On Tue, 28 Sep 2004 20:32:59 GMT, Gareth McCaughan
<················@pobox.com> wrote:

>So if these "Azul boxes" run Java an order of magnitude
>or so faster than ordinary machines do, and ordinary machines
>run Java at roughly the same speed as they run C, then
>an Azul box should be able to run Java an order of magnitude
>faster than a Pentium 4 Xeon or an Opteron or whatever runs C.
>
>Call me skeptical if you will, but that doesn't seem
>likely.

I think it's plausible. The thing is, C is a performance killer
because you're nailed to decades-old architectural decisions. The
point of Java is that, being VM/JIT-based to begin with, targeting it
specifically gives you the priceless gift of _a clean sheet of paper_.
If you use it properly, a great deal of the complexity of a Pentium or
whatever can just go away.

I'm looking forward to seeing how these things end up performing in
practice.

-- 
"Sore wa himitsu desu."
To reply by email, remove
the small snack from address.
From: John Thingstad
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <opse3q93gspqzri1@mjolner.upc.no>
On Wed, 29 Sep 2004 18:53:14 GMT, Russell Wallace  
<················@eircom.net> wrote:

> On Tue, 28 Sep 2004 20:32:59 GMT, Gareth McCaughan
> <················@pobox.com> wrote:
>
>> So if these "Azul boxes" run Java an order of magnitude
>> or so faster than ordinary machines do, and ordinary machines
>> run Java at roughly the same speed as they run C, then
>> an Azul box should be able to run Java an order of magnitude
>> faster than a Pentium 4 Xeon or an Opteron or whatever runs C.
>>
>> Call me skeptical if you will, but that doesn't seem
>> likely.
>
> I think it's plausible. The thing is, C is a performance killer
> because you're nailed to decades-old architectural decisions. The
> point of Java is that, being VM/JIT-based to begin with, targeting it
> specifically gives you the priceless gift of _a clean sheet of paper_.
> If you use it properly, a great deal of the complexity of a Pentium or
> whatever can just go away.
>
> I'm looking forward to seeing how these things end up performing in
> practice.
>

I can see why the Pentium chip would be linked to decades old  
architectural decisions.
However on RISC machines C runs clean.
(Particularly a generous number of general purpose registers help.)
I fail to see how the Java VM/JIT has something to offer that intrinsically
makes it faster.
If you can be a bit more concrete maybe I would take it a bit more  
seriously.

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
From: John Thingstad
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <opse3se5jepqzri1@mjolner.upc.no>
On Wed, 29 Sep 2004 21:04:41 +0200, John Thingstad  
<··············@chello.no> wrote:

> I fail to see how the Java VM/JIT has something to offer that
> intrinsically makes it faster.

That should have been faster on newer architectures.

I am not an expert on hardware but the only advance I know of that
Sun has introduced recently is more extensive use of asynchronous logic.
What I can't see is how this benefits Java VM any more than C compiled to
machine code.


-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
From: Russell Wallace
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <415b87a7.77581246@news.eircom.net>
On Wed, 29 Sep 2004 21:04:41 +0200, "John Thingstad"
<··············@chello.no> wrote:

>I can see why the Pentium chip would be linked to decades old  
>architectural decisions.
>However on RISC machines C runs clean.
>(Particularly a generous number of general purpose registers help.)
>I fail to see how the Java VM/JIT has something to offer that intrinsically
>makes it faster.

It's not what the JVM has that helps, it's what it _doesn't_ have -
or, more exactly, what it doesn't assume that C (by which I mean not
"a lawyer's interpretation of the C standard" but "typical extant C
code written for Unix-like operating systems") does.

- No hardware memory protection, virtual memory, paging etc.
I'm pretty sure this is a huge win. Look at some of the discussions on
comp.arch for just what a headache-inducing mess this is on all
existing systems.

- No decode, use LIW.
RISC and Itanium have cleaner decode than x86, but Transmeta-style LIW
is cleaner still. JVM wins because you can do this because you don't
need to preserve binary compatibility across CPU generations.

- Cleaner separation of code and data.
Don't need to support code pointers per se. Unlike x86, don't need to
support self-modifying code. Maybe can do like Transmeta and go for
code execution from I-cache only.

- No byte addressing, 16 bits is the smallest unit.
Probably only a small saving, but every little helps.

- Don't need so much complexity in out of order execution etc, assume
thread switch on cache miss.
Server-side Java tends to be heavily multithreaded, so you can go for
total throughput rather than desperately trying to crank as much
performance as possible out of a single thread.

- That means you don't need a zillion physical rename registers, from
what I've heard on comp.arch this sounds like a significant win.

- Similarly, don't need ultra deep pipelining to ratchet up the clock
speed, spend the resources on more cores instead.

- That means you don't need to be so careful about branch prediction
because the mispredict penalty isn't as bad.

- I've a feeling exception/interrupt handling can be simplified, but
I'm not familiar enough with how modern systems do this to make this
more than a vague guess.

- I'm guessing locking/thread sync and its interaction with cache,
garbage collection etc can also be simplified and streamlined if you
assume you're supporting nothing but Java and you know the GC
algorithm in advance, but again I'm not familiar enough with how
existing systems do it to substantiate the claim - but again look at
some of the threads on comp.arch for what a headache-inducing mess
this can be - at least in the new system's case the chip designers, GC
programmers etc will be talking to each other rather than guessing
what the other wants and/or working from Posix etc.

I find it plausible the above might amount to enough savings in size
and (more important) heat dissipation to let you get 24 cores on a
chip.

-- 
"Sore wa himitsu desu."
To reply by email, remove
the small snack from address.
From: Adam Warner
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <pan.2004.09.30.05.58.34.717584@consulting.net.nz>
Hi Russell Wallace,

> - No hardware memory protection, virtual memory, paging etc. I'm pretty
> sure this is a huge win. Look at some of the discussions on comp.arch
> for just what a headache-inducing mess this is on all existing systems.

In Java there is no way to address more than 2^31-1 elements in a
primitive array. Until Sun addresses this Java is unsuitable for many
important 64-bit computing tasks.

Any conforming high end JVM CPU architecture will inherit this flaw,
putting it at a disadvantage with general purpose 64-bit architectures.
And programmers won't want a return to writing code using segmented arrays
just as writing for huge address spaces becomes a reality.
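
To make the "segmented arrays" workaround concrete, here is a minimal sketch, assuming nothing beyond the standard language. The class name SegmentedLongArray and the segment size are invented for illustration. It wraps ordinary int-indexed arrays behind a long index, which is the sort of code programmers would have to write by hand.

```java
// Hypothetical long-indexed array built from ordinary int-indexed segments.
final class SegmentedLongArray {
    private static final int SEGMENT_BITS = 20;                 // 2^20 elements per segment (arbitrary)
    private static final int SEGMENT_SIZE = 1 << SEGMENT_BITS;
    private static final int SEGMENT_MASK = SEGMENT_SIZE - 1;

    private final long[][] segments;
    private final long length;

    SegmentedLongArray(long length) {
        // Guard keeps the segment count within int range (about 2^51 elements in this sketch).
        if (length < 0 || length > ((long) Integer.MAX_VALUE << SEGMENT_BITS)) {
            throw new IllegalArgumentException("unsupported length " + length);
        }
        this.length = length;
        int count = (int) ((length + SEGMENT_SIZE - 1) >>> SEGMENT_BITS);
        segments = new long[count][];
        long remaining = length;
        for (int i = 0; i < count; i++) {
            segments[i] = new long[(int) Math.min(SEGMENT_SIZE, remaining)];
            remaining -= segments[i].length;
        }
    }

    long length() {
        return length;
    }

    long get(long index) {
        checkIndex(index);
        // High bits pick the segment, low bits pick the slot inside it.
        return segments[(int) (index >>> SEGMENT_BITS)][(int) (index & SEGMENT_MASK)];
    }

    void set(long index, long value) {
        checkIndex(index);
        segments[(int) (index >>> SEGMENT_BITS)][(int) (index & SEGMENT_MASK)] = value;
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
    }
}
```

Every access pays for a shift, a mask, and a double indirection, and none of the standard library methods that take arrays can work on the result. That is the inconvenience being complained about.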

I'm uncertain whether Sun will successfully address this issue in a
reasonable time frame. This casts a dark shadow over the viability of
future CPU hardware implementing the Java Virtual Machine.

Regards,
Adam
From: Russell Wallace
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <415ba8ca.86065350@news.eircom.net>
On Thu, 30 Sep 2004 17:58:38 +1200, Adam Warner
<······@consulting.net.nz> wrote:

>In Java there is no way to address more than 2^31-1 elements in a
>primitive array. Until Sun addresses this Java is unsuitable for many
>important 64-bit computing tasks.

Yeah, that's a problem alright.

>Any conforming high end JVM CPU architecture will inherit this flaw,
>putting it at a disadvantage with general purpose 64-bit architectures.
>And programmers won't want a return to writing code using segmented arrays
>just as writing for huge address spaces becomes a reality.

On the upside, it's easier to add a feature than remove one, and this
one's both orthogonal and backward compatible. All these guys really
need to do is just unilaterally implement long array indexes. "Okay
programmers, now if you want big arrays you can use them. If you
don't, no problem, but if you do, go ahead on our system - and your
code will take advantage of it on Sun's systems when Sun wake up and
officially add it to the Java specification."

-- 
"Sore wa himitsu desu."
To reply by email, remove
the small snack from address.
From: John Thingstad
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <opse5qq0vzpqzri1@mjolner.upc.no>
On Thu, 30 Sep 2004 04:24:25 GMT, Russell Wallace  
<················@eircom.net> wrote:

> - No hardware memory protection, virtual memory, paging etc.
> I'm pretty sure this is a huge win. Look at some of the discussions on
> comp.arch for just what a headache-inducing mess this is on all
> existing systems.
>

This is not a C problem. It's an OS problem.
The reason to implement paging in hardware is to reduce the cost of having
many programs running without interfering with each other's address
spaces.
Paging in hardware is relatively cheap because pointer comparisons can be
done in parallel without taking up extra CPU cycles.
In a perfect world perhaps Java programs would be free of segment overruns
by design.
I find this difficult to accept without hard (real world) evidence.

>
> - Don't need so much complexity in out of order execution etc, assume
> thread switch on cache miss.
> Server-side Java tends to be heavily multithreaded, so you can go for
> total throughput rather than desperately trying to crank as much
> performance as possible out of a single thread.
>

VLIW requires a more complex compiler be it C or Java because pipelining
is explicit.
http://www.semiconductors.philips.com/acrobat/other/vliw-wp.pdf

Still not a C issue.

>
> I find it plausible the above might amount to enough savings in size
> and (more important) heat dissipation to let you get 24 cores on a
> chip.
>

I think the great win of a Java OS is in shared Java libraries
and central OS-controlled garbage collection (multi-processor friendly,
perhaps).

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
From: Russell Wallace
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <415c7389.137976425@news.eircom.net>
On Thu, 30 Sep 2004 22:48:26 +0200, "John Thingstad"
<··············@chello.no> wrote:

>In a perfect world perhaps Java programs would be free of segment overruns
>by design.

Doesn't need perfection, just needs disallowing of pointer arithmetic.
That reduces the problem to null pointer and array bounds checking,
both of which real Java implementations in fact do. (And factoring
array bounds checks out of inner loops is doable in practice, not just
in theory.)
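
A rough illustration of what "factoring array bounds checks out of inner loops" means, written out by hand. This is a sketch of the reasoning only, not a claim about what any particular JIT emits, and the class and method names are made up. Since the loop bounds are known before the loop starts, one range test up front can stand in for a test on every iteration.

```java
// Hypothetical demo class; names are illustrative only.
final class BoundsCheckDemo {
    /** Naive form: conceptually, every a[i] carries a null check and a range check. */
    static long sumChecked(int[] a, int from, int to) {
        long sum = 0;
        for (int i = from; i < to; i++) {
            sum += a[i];
        }
        return sum;
    }

    /** Same loop with the checks spelled out once, ahead of the loop body. */
    static long sumHoisted(int[] a, int from, int to) {
        if (from < to) {                       // only matters if the loop runs at all
            if (a == null) {
                throw new NullPointerException("a");
            }
            if (from < 0 || to > a.length) {
                throw new ArrayIndexOutOfBoundsException("range " + from + ".." + to);
            }
        }
        long sum = 0;
        for (int i = from; i < to; i++) {
            sum += a[i];                       // provably in range given the checks above
        }
        return sum;
    }
}
```

In source form the language still checks each a[i] in the second method. The point is that a compiler able to prove the precondition can drop the per-iteration checks, and with no pointer arithmetic in the language that proof is often within reach.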

>VLIW requires a more complex compiler be it C or Java because pipelining
>is explicit.

Well, Transmeta do it. I don't know how - their stuff is a trade
secret AFAIK; my guess would be that it's easier for JIT than static
compilation because you can look at what the code's actually doing at
run time rather than guessing what it will do; and easier for a
language with less random aliasing than C. In any event, that proves
it's doable.

>Still not a C issue.

In practice it is. C is designed on the assumption the distribution
format will be binary machine code, and the culture and existing code
base reflect that assumption. Java is designed on the assumption the
distribution format will be JITted byte code, which is what VLIW
needs.

>I think the great win of a Java OS is in shared Java libraries
>and central OS-controlled garbage collection (multiprocessor-friendly,
>perhaps).

*nod* Those are probably big wins too, particularly when they can
design the memory/MP system knowing how the GC works and vice versa.

-- 
"Sore wa himitsu desu."
To reply by email, remove
the small snack from address.
From: Tim Bradshaw
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <1096625972.059599.198370@k26g2000oda.googlegroups.com>
John Thingstad wrote:
>
> This is not a C problem. It's an OS problem.
> The reason to implement paging in hardware is to reduce the cost of having
> many programs running without interfering with each other's address
> spaces.

Well, another, very significant, reason is to do clever memory-mapping
tricks.  Modern OS's are full of these things, and they are a big win
in terms of elegance and (I suspect) performance.  The sort of thing I
mean is the facility to map, say, a file into a bit of address space,
and then to share that mapping between multiple processes.  This all
relies on virtual memory to work, and it's worth having, I think.
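
A minimal sketch of the kind of shared file mapping described above,
assuming Python's standard mmap module. For brevity the two mappings live
in one process and stand in for mappings held by two separate processes;
the function name and payload are made up for illustration.

```python
import mmap
import tempfile


def shared_mapping_demo(payload=b"hello, mapped file"):
    """Map one file twice with shared access and write through one view."""
    with tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()
        # Two independent write-through mappings of the same file.
        view_a = mmap.mmap(f.fileno(), len(payload), access=mmap.ACCESS_WRITE)
        view_b = mmap.mmap(f.fileno(), len(payload), access=mmap.ACCESS_WRITE)
        try:
            view_a[0:5] = b"HELLO"   # write through the first view
            return bytes(view_b)     # read back through the second view
        finally:
            view_a.close()
            view_b.close()
```

The write made through the first view is visible through the second
without any explicit copy, because both views refer to the same file
pages.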

> I think the great win of a Java OS is in shared Java libraries
> and central OS-controlled garbage collection (multiprocessor-friendly,
> perhaps).

I'm not sure that the Java libraries aren't shared now: why wouldn't
they be?  Maybe it's the fact that they have to be loaded rather than just
mapped into memory, which means you can't just share the memory mapping.
Hmm, that would make sense.  Of course this only matters if you run
more than one JVM - most of the applications I deal with have one per
machine.

--tim
From: Russell Wallace
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <415dc970.1281303@news.eircom.net>
On 1 Oct 2004 03:19:32 -0700, "Tim Bradshaw" <··········@tfeb.org>
wrote:

>Well, another, very significant, reason is to do clever memory-mapping
>tricks.  Modern OS's are full of these things, and they are a big win
>in terms of elegance and (I suspect) performance.  The sort of thing I
>mean is the facility to map, say, a file into a bit of address space,
>and then to share that mapping between multiple processes.  This all
>relies on virtual memory to work, and it's worth having, I think.

It may well be if you're running legacy Unix operating systems and C
programs that assume physical memory is inadequate but virtual memory
is present. If you have a clean sheet of paper, though, being able to
ditch this would be a big win.

-- 
"Sore wa himitsu desu."
To reply by email, remove
the small snack from address.
From: Tim Bradshaw
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <1096885007.479008.102840@h37g2000oda.googlegroups.com>
Russell Wallace wrote:
> It may well be if you're running legacy Unix operating systems and C
> programs that assume physical memory is inadequate but virtual memory
> is present. If you have a clean sheet of paper, though, being able to
> ditch this would be a big win.

Precisely the opposite is true.  These facilities are used most by
applications (and OS code) which have been written relatively
recently.  It's nothing to do with applications which assume they
have to page memory because there isn't enough, it's to do with
applications which can benefit from address-remapping.

--tim
From: Russell Wallace
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <416186c0.246389768@news.eircom.net>
On 4 Oct 2004 03:16:47 -0700, "Tim Bradshaw" <··········@tfeb.org>
wrote:

>Russell Wallace wrote:
>> It may well be if you're running legacy Unix operating systems and C
>> programs that assume physical memory is inadequate but virtual memory
>> is present. If you have a clean sheet of paper, though, being able to
>> ditch this would be a big win.
>
>Precisely the opposite is true.  These facilities are used most by
>applications (and OS code) which have been written relatively
>recently.

Typically using operating systems, programming languages and
techniques designed in the 1970s.

>It's nothing to do with applications which assume they
>have to page memory because there isn't enough, it's to do with
>applications which can benefit from address-remapping.

Sometimes they can _on existing hardware and operating systems that
are optimized for that, where the price has already been paid_. That
has nothing to do with what would be the most efficient design
starting with blank silicon - which is what the current discussion is
about.

Or do you think current systems are an optimal design, that they're
what you'd come up with if you had a clean sheet of paper and nothing
better is possible? If not, what's your idea how to do better?

-- 
"Always look on the bright side of life."
To reply by email, remove the small snack from address.
From: Joe Marshall
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <r7oijn7i.fsf@ccs.neu.edu>
"John Thingstad" <··············@chello.no> writes:

> This is not a C problem. It's an OS problem.
> The reason to implement paging in hardware is to reduce the cost of having
> many programs running without interfering with each other's address
> spaces.

That is *a* reason, but I think the primary reason is because you
don't want buggy programs taking down the entire system.
From: Julian Stecklina
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <867jq950aw.fsf@goldenaxe.localnet>
················@eircom.net (Russell Wallace) writes:

> - No hardware memory protection, virtual memory, paging etc.

Would that not be a huge step backwards? At least virtual memory and
paging are vital for modern OS design. Say, what would you do if you
run low on RAM? Would you want to do some kind of user-level swapping?

Regards,
-- 
                    ____________________________
 Julian Stecklina  /  _________________________/
  ________________/  /
  \_________________/  LISP - truly beautiful
From: Russell Wallace
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <415f0e76.84499595@news.eircom.net>
On Sat, 02 Oct 2004 04:19:35 +0200, Julian Stecklina
<··········@web.de> wrote:

>················@eircom.net (Russell Wallace) writes:
>
>> - No hardware memory protection, virtual memory, paging etc.
>
>Would that not be a huge step backwards? At least virtual memory and
>paging are vital for modern OS design.

"Modern" operating systems are, to put it frankly, so badly designed
"crazy" doesn't even begin to cover it. I've no idea what would; I'd
probably have to go rummaging in H.P. Lovecraft for appropriate words.

And modern hardware+software combinations waste at least an order of
magnitude of performance relative to what could be obtained at the
current process node, maybe two orders. In other words, at least 90%
of potential performance is being squandered on supporting bad
software. This might be a way out of that.

>Say, what would you do if you
>run low on RAM? Would you want to do some kind of user-level swapping?

Suspend some processes and swap them to disk until the remaining ones
have enough RAM; that can be done purely in software on a
virtual-machine based system.

-- 
"Sore wa himitsu desu."
To reply by email, remove
the small snack from address.
From: Tim Bradshaw
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <1096885815.989053.131160@h37g2000oda.googlegroups.com>
Russell Wallace wrote:
> And modern hardware+software combinations waste at least an order of
> magnitude of performance relative to what could be obtained at the
> current process node, maybe two orders. In other words, at least 90%
> of potential performance is being squandered on supporting bad
> software. This might be a way out of that.
>

Right.  So when I benchmark applications on these terrible modern
systems which can allocate ephemeral memory at 5-10 instructions per
word, including GC overhead, I should presumably be expecting 1/2
instruction per word, including GC?  When I benchmark IO systems which
can sustain disk I/O at 90% of memory bandwidth (using fancy memory
mapping tricks, by the way, to avoid expensive copies of data), I
should, presumably, be expecting disk I/O at 10x memory bandwidth? That
would certainly be impressive.
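
The post does not say how the allocator being benchmarked works. One
common reason ephemeral allocation can cost only a handful of
instructions is bump-pointer allocation in a nursery, sketched below. The
class is hypothetical, and the collect step assumes every object in the
nursery is already dead, which a real generational collector cannot
assume (it would copy survivors out first).

```python
class Nursery:
    """Toy bump-pointer allocator over a fixed-size region of words."""

    def __init__(self, size_words):
        self.size = size_words
        self.top = 0              # next free word
        self.collections = 0      # how many times the nursery was reset

    def alloc(self, n_words):
        """Return the word offset of a fresh block of n_words."""
        if n_words > self.size:
            raise MemoryError("object larger than nursery")
        if self.top + n_words > self.size:
            self.collect()        # slow path: region exhausted
        addr = self.top           # fast path: one compare and one add
        self.top += n_words
        return addr

    def collect(self):
        # Assumption for this sketch: nothing survives, so just reset.
        self.top = 0
        self.collections += 1
```

The fast path is a limit comparison plus a pointer increment, which is
why the per-word cost can be so low when most objects die young.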

> Suspend some processes and swap them to disk until the remaining ones
> have enough RAM; that can be done purely in software on a
> virtual-machine based system.

Right, right, whole process swapping.  Like the PDP11 did.  Really
good news for performance, that.

Do you actually have any idea at *all* about the performance of modern
systems?

--tim
From: Russell Wallace
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <4161827f.245300886@news.eircom.net>
On 4 Oct 2004 03:30:16 -0700, "Tim Bradshaw" <··········@tfeb.org>
wrote:

>Russell Wallace wrote:
>> And modern hardware+software combinations waste at least an order of
>> magnitude of performance relative to what could be obtained at the
>> current process node, maybe two orders. In other words, at least 90%
>> of potential performance is being squandered on supporting bad
>> software. This might be a way out of that.
>
>Right.  So when I benchmark applications on these terrible modern
>systems which can allocate ephemeral memory at 5-10 instructions per
>word, including GC overhead, I should presumably be expecting 1/2
>instruction per word, including GC?

I said "performance", not "memory allocation performance". Does the
software you use at the moment spend most of its time in memory
allocation and GC?

>When I benchmark IO systems which
>can sustain disk I/O at 90% of memory bandwidth (using fancy memory
>mapping tricks, by the way, to avoid expensive copies of data), I
>should, presumably, be expecting disk I/O at 10x memory bandwidth? That
>would certainly be impressive.

You have enough money to spare that there's no problem buying machines
that can sustain disk I/O at 90% of memory bandwidth (assuming the
reason for this isn't lack of memory bandwidth)? If you have, _that's_
impressive - and you probably don't need the new system under
discussion (which is being specifically advertised as a budget saver).
I'm talking about what could be done with limited resources, not what
can be done if you have truckloads of money to splash around.

>Right, right, whole process swapping.  Like the PDP11 did.  Really
>good news for performance, that.

This is 2004; the way to get things running fast is to have enough RAM
for your working set. Are you really under the impression having your
working set swapping to disk is an efficient way to get good
performance? Swapping to disk _at all_ is only a good idea nowadays if
a) you have processes that aren't doing anything at the moment, b) you
don't care about performance or c) you have money to burn.

(Yes, there are jobs that involve processing data sets in the
terabyte-to-petabyte range which still can't fit in RAM. No, a virtual
memory architecture designed over two decades ago is not the most
efficient way to get performance on those jobs.)

>Do you actually have any idea at *all* about the performance of modern
>systems?

Hints:
1) Take a look at the relative cost of memory bandwidth versus disk
bandwidth.
2) Take a look at a die photo of a modern CPU.

-- 
"Always look on the bright side of life."
To reply by email, remove the small snack from address.
From: Tim Bradshaw
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <1096972447.347864.69540@k17g2000odb.googlegroups.com>
Russell Wallace wrote:
> You have enough money to spare that there's no problem buying machines
> that can sustain disk I/O at 90% of memory bandwidth (assuming the
> reason for this isn't lack of memory bandwidth)? If you have, _that's_
> impressive - and you probably don't need the new system under
> discussion (which is being specifically advertised as a budget saver).

They're not very expensive.  Not quite PCs but not big high-end boxes.
`Mid-range' I think is the term (although the things we use are sold
as entry-level).

> [Blah]

No, sorry, I don't really have the energy to reply to all this
idiocy.  You go off and design your machines.

--tim
From: Russell Wallace
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <41629875.316469658@news.eircom.net>
On 5 Oct 2004 03:34:07 -0700, "Tim Bradshaw" <··········@tfeb.org>
wrote:

>No, sorry, I don't really have the energy to reply to all this
>idiocy.  You go off and design your machines.

So what you're saying here is that you're an obnoxious prick as well
as a fool. Fine, I'll leave you to it.

-- 
"Always look on the bright side of life."
To reply by email, remove the small snack from address.
From: Tim Bradshaw
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <1096997501.264578.99480@k26g2000oda.googlegroups.com>
>
> So what you're saying here is that you're an obnoxious prick as well
> as a fool. Fine, I'll leave you to it.

Oh yes.  I am indeed incredibly rude, and I know absolutely nothing
about anything, and I am trapped in the past.  You only have to look
at my cll postings to see that!  You on the other hand are very
smart.  I see that now, very clearly.

Here's a nice problem for you, I'd be interested to know how you'd go
about solving it with your clean sheet design.

Data is arriving from some IO device - a network or a disk or
something like that.  It appears in some buffer.  You have several
people (programs, if you like) who might want to look at this data.
You don't trust these programs.  They need to be able to modify the
data without doing anything special but those modifications should not
be visible to any of the other programs.  There might be many of these
programs (they may be running on several processors, although it's
probably best to assume this is an SMP machine), and there is a lot of
this data, and it is arriving very fast.  Design a system which allows
you to do this, and make it as efficient as you can (speed is more
important than space, but space matters too).  Take into account the
fundamental constraints of modern systems (the most important being, I
think, that high main memory latency is physically unavoidable, and
low-latency memory is always small (even though most of the CPU is
made of it nowadays)). I know how this is done in practice, but
obviously the systems I know about are all just legacy crap, so the
tricks they use are not interesting to a truly modern man such as
yourself.
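
The post leaves the answer unstated. One technique that fits the
requirements as described (each program may modify the data freely, but
no program sees another's changes) is a private, copy-on-write mapping.
The sketch below assumes Python's standard mmap module and uses a
temporary file to stand in for the incoming buffer; the function name and
payload are hypothetical, and both views live in one process for brevity.

```python
import mmap
import tempfile


def private_views_demo(payload=b"incoming packet data"):
    """Give two consumers copy-on-write views of the same buffer."""
    with tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()
        # ACCESS_COPY: writes affect only this mapping, never the file.
        view_a = mmap.mmap(f.fileno(), len(payload), access=mmap.ACCESS_COPY)
        view_b = mmap.mmap(f.fileno(), len(payload), access=mmap.ACCESS_COPY)
        try:
            view_a[0:8] = b"MODIFIED"    # consumer A scribbles on its view
            seen_by_a = bytes(view_a)
            seen_by_b = bytes(view_b)    # consumer B still sees the original
            f.seek(0)
            on_disk = f.read()           # the backing data is untouched
        finally:
            view_a.close()
            view_b.close()
    return seen_by_a, seen_by_b, on_disk
```

Only the pages a consumer actually writes to get copied; untouched pages
stay shared, which is where the speed and space savings come from.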

--tim
From: John Thingstad
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <opsfejjoiupqzri1@mjolner.upc.no>
On Tue, 05 Oct 2004 12:51:16 GMT, Russell Wallace  
<················@eircom.net> wrote:

> On 5 Oct 2004 03:34:07 -0700, "Tim Bradshaw" <··········@tfeb.org>
> wrote:
>
>> No, sorry, I don't really have the energy to reply to all this
>> idiocy.  You go off and design your machines.
>
> So what you're saying here is that you're an obnoxious prick as well
> as a fool. Fine, I'll leave you to it.
>

You obviously have an interest in OS design.
I suggest you read up on modern OS design despite the impression that it  
seems stupid.
Pay particular attention to chapters on virtual memory.
When you are done continue attacking C and replacing it with Java. ;)
(Me I prefer Lisp)

From: Russell Wallace
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <4162b697.281050@news.eircom.net>
On Tue, 05 Oct 2004 16:51:14 +0200, "John Thingstad"
<··············@chello.no> wrote:

>I suggest you read up on modern OS design despite the impression that it  
>seems stupid.

I have done - and believe me, I'm not claiming the people working on
Unix, VMS etc are stupid; I understand the problems they have to solve
regarding backward compatibility, and it amazes me anyone can do that
sort of stuff without ending up in the loony bin.

But the subject under discussion was what could be done if you had a
clean sheet of paper and didn't have to support 10 million lines of
legacy C; and in that situation, "how would you solve problem X?" may
be best answered by "I wouldn't create problem X in the first place".

>When you are done continue attacking C and replacing it with Java. ;)
>(Me I prefer Lisp)

Lisp is a better designed language than Java; now go ahead and
convince the world's IT departments of that ^.~ In the meantime, a
move to anything that doesn't allow casual access to random memory
addresses would be a huge step forward.

-- 
"Always look on the bright side of life."
To reply by email, remove the small snack from address.
From: Gareth McCaughan
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <877jqcn3l5.fsf@g.mccaughan.ntlworld.com>
Russell Wallace wrote:

[I said:]
>> So if these "Azul boxes" run Java an order of magnitude
>> or so faster than ordinary machines do, and ordinary machines
>> run Java at roughly the same speed as they run C, then
>> an Azul box should be able to run Java an order of magnitude
>> faster than a Pentium 4 Xeon or an Opteron or whatever runs C.
>> 
>> Call me skeptical if you will, but that doesn't seem
>> likely.

[Russell:]
> I think it's plausible. The thing is, C is a performance killer
> because you're nailed to decades-old architectural decisions. The
> point of Java is that, being VM/JIT-based to begin with, targeting it
> specifically gives you the priceless gift of _a clean sheet of paper_.
> If you use it properly, a great deal of the complexity of a Pentium or
> whatever can just go away.

C isn't all that bad. But I'm sure it's true that if you
sat down now to write a language whose compiled code could
execute really fast, you could do better than C. But, um,
Java was even less designed with that in mind than C was,
no?

-- 
Gareth McCaughan
.sig under construc
From: Robert Swindells
Subject: Re: Commercial "Java" machine
Date: 
Message-ID: <pan.2004.09.28.21.19.19.487506@fdy2.demon.co.uk>
On Tue, 28 Sep 2004 12:01:40 -0700, Will Hartung wrote:

> A company is coming out with a custom Java Appliance. Custom CPU, etc. It's
> not interactive, it's basically just a Java VM machine to run server
> applications.
> 
> http://java.ittoolbox.com/news/dispnews.asp?i=121712
> 
> A comment on a message board (since we're still lacking details):
> 
> "Yeah, you're missing something: One Azul box will have more Java horsepower
> than an entire rack of servers (I don't know yet if they've unveiled just
> how much, so that's all I can say,) and it can handle a couple of hundred GB
> on a Java heap with no full GC pauses."

There has been a brief thread on this in comp.arch.

Robert Swindells