From: ············@gmail.com
Subject: Next Generation of Language
Date: 
Message-ID: <1168262484.810792.90430@s34g2000cwa.googlegroups.com>
From this link
http://itpro.nikkeibp.co.jp/a/it/alacarte/iv1221/matsumoto_1.shtml
(Note: Japanese)
Matz, the creator of Ruby, said that in the next 10 years, desktop
computers with 64 or 128 cores will be common. It's nearly impossible
to write that many threads by hand; it should be done automatically,
so maybe functional languages will do a better job at parallel
programming than procedural languages like C or Ruby.

From: Ben
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168265574.746548.104790@11g2000cwr.googlegroups.com>
On Jan 8, 8:21 am, ·············@gmail.com" <············@gmail.com>
wrote:
> From this link
> http://itpro.nikkeibp.co.jp/a/it/alacarte/iv1221/matsumoto_1.shtml
> (Note: Japanese)
> Matz, the creator of Ruby, said that in the next 10 years, desktop
> computers with 64 or 128 cores will be common. It's nearly impossible
> to write that many threads by hand; it should be done automatically,
> so maybe functional languages will do a better job at parallel
> programming than procedural languages like C or Ruby.

Large embedded systems quite often have that many threads.  Obviously,
they aren't all actually executing simultaneously on the processors we
have right now, but various numbers of them are run depending on the
platform, so the system is (or should be anyway) coded to handle each
thread executing at any time.  Not that I disagree with your point -
functional programming would be a great help as our systems grow in
complexity.

Begin old embedded programmer rant:
Kids these days just have no idea how to watch for side effects and
avoid them, or why they should.  What are they learning in school?!
And don't even ask them to create a formal state machine for the side
effects they need.  They'd rather throw fifteen booleans in there and
hope they can cover every possibility!
End rant.
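
(A concrete sketch of the difference, in Lisp for the sake of this group:
one explicit transition table instead of fifteen booleans. The states and
events here are invented for illustration.)

(defun next-state (state event)
  ;; Explicit transition table: returns the new state,
  ;; or NIL for an illegal transition.
  (case state
    (:idle    (case event (:start  :running)))
    (:running (case event (:pause  :paused)
                          (:stop   :idle)))
    (:paused  (case event (:resume :running)
                          (:stop   :idle)))))

;; (next-state :idle :start) ==> :RUNNING
;; (next-state :idle :pause) ==> NIL -- an illegal transition, and visibly so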

  Regardless of the language used on an actual product, training people
in functional programming teaches them the skills they need when
writing large scale concurrent apps, or small, single threaded apps, or
any code that they don't want to be patching for the next 30 years.

BTW:  Has anyone done any hard real time work using Lisp?  How'd it go?
From: Steven L. Collins
Subject: Re: Next Generation of Language
Date: 
Message-ID: <ic6dnZJMroy_hj7YnZ2dnUVZ_vyunZ2d@comcast.com>
"Ben" <········@gmail.com> wrote in message 
·····························@11g2000cwr.googlegroups.com...
> On Jan 8, 8:21 am, ·············@gmail.com" <············@gmail.com>
> wrote:
>> From this link
>> http://itpro.nikkeibp.co.jp/a/it/alacarte/iv1221/matsumoto_1.shtml
>> (Note: Japanese)
>> Matz, the creator of Ruby, said that in the next 10 years, desktop
>> computers with 64 or 128 cores will be common. It's nearly impossible
>> to write that many threads by hand; it should be done automatically,
>> so maybe functional languages will do a better job at parallel
>> programming than procedural languages like C or Ruby.
>
> Large embedded systems quite often have that many threads.  Obviously,
> they aren't all actually executing simultaneously on the processors we
> have right now, but various numbers of them are run depending on the
> platform, so the system is (or should be anyway) coded to handle each
> thread executing at any time.  Not that I disagree with your point -
> functional programming would be a great help as our systems grow in
> complexity.
>
> Begin old embedded programmer rant:
> Kids these days just have no idea how to watch for side effects and
> avoid them, or why they should.  What are they learning in school?!
> And don't even ask them to create a formal state machine for the side
> effects they need.  They'd rather throw fifteen booleans in there and
> hope they can cover every possibility!
> End rant.
>
>  Regardless of the language used on an actual product, training people
> in functional programming teaches them the skills they need when
> writing large scale concurrent apps, or small, single threaded apps, or
> any code that they don't want to be patching for the next 30 years.
>
> BTW:  Has anyone done any hard real time work using Lisp?  How'd it go?
>
See "Real-Time programming in Common Lisp" by James R. Allard and Lowell B. 
Hawkinson
From: Marc Battyani
Subject: Re: Next Generation of Language
Date: 
Message-ID: <CZ-dnZ8mptMrND7YnZ2dnUVZ8tGqnZ2d@giganews.com>
"Ben" <········@gmail.com> wrote
>
> BTW:  Has anyone done any hard real time work using Lisp?  How'd it go?

Real time robot driving + Data acquisition + 3D display (OpenGL) in Lisp.

http://www.fractalconcept.com/asp/lhw3sdataQuxi9oMj8B==
The very hard real time (ns scale) is done in VHDL in an FPGA and everything 
else in Lisp (LispWorks).

Marc
From: Kjetil Svalastog Matheussen
Subject: Re: Next Generation of Language
Date: 
Message-ID: <Pine.LNX.4.64.0701091542250.21158@iannis.localdomain>
On Mon, 8 Jan 2007, Ben wrote:

> 
> BTW:  Has anyone done any hard real time work using Lisp?  How'd it go?
> 

Well, sort of:
http://www.notam02.no/arkiv/doc/snd-rt/

And by "Well, sort of", I mean that its definitely hard real time, but 
perhaps not lisp, because its not consing. However, consing is definitely 
possible, and even garbage collecting is possible (if it had been 
implemented). I am however not sure how useful hard-core consing would be 
for this kind of work.
From: Kjetil Svalastog Matheussen
Subject: Re: Next Generation of Language
Date: 
Message-ID: <Pine.LNX.4.64.0701091553460.21158@iannis.localdomain>
On Tue, 9 Jan 2007, Kjetil Svalastog Matheussen wrote:

> 
> 
> On Mon, 8 Jan 2007, Ben wrote:
> 
> > 
> > BTW:  Has anyone done any hard real time work using Lisp?  How'd it go?

I also think I read something about the Lisp guys at the University of 
Texas at Austin using RScheme to control robots. RScheme was once hard 
real-time capable.
From: ······@yahoo.com
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168489207.874750.19330@77g2000hsv.googlegroups.com>
I don't think there is anything more anti-parallelism than Lisp. Lisp
is recursive: a function basically has to wait for another instance of
itself to finish before continuing. Where is the parallelism?

Kjetil Svalastog Matheussen wrote:
> On Mon, 8 Jan 2007, Ben wrote:
>
> >
> > BTW:  Has anyone done any hard real time work using Lisp?  How'd it go?
> >
>
> Well, sort of:
> http://www.notam02.no/arkiv/doc/snd-rt/
>
> And by "Well, sort of", I mean that its definitely hard real time, but
> perhaps not lisp, because its not consing. However, consing is definitely
> possible, and even garbage collecting is possible (if it had been
> implemented). I am however not sure how useful hard-core consing would be
> for this kind of work.
From: Barry Margolin
Subject: Re: Next Generation of Language
Date: 
Message-ID: <barmar-C2102D.00045711012007@comcast.dca.giganews.com>
In article <·······················@77g2000hsv.googlegroups.com>,
 ······@yahoo.com wrote:

> I don't think there is anything more anti-parallelism than Lisp. Lisp
> is recursive: a function basically has to wait for another instance of
> itself to finish before continuing. Where is the parallelism?

Functions like MAPCAR easily lend themselves to parallel variants that 
operate on many elements concurrently.  *Lisp, the Lisp dialect for the 
massively-parallel Connection Machine, was built around operations like 
this.

For coarse-grained parallelism, you can easily make use of the 
multi-threading features of most modern Lisps.
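
For instance, here is a minimal sketch of a parallel MAPCAR, assuming the 
bordeaux-threads portability layer (bt:).  One thread per element is 
naive -- a real version would pool threads -- but it shows the shape:

(defun pmapcar (fn list)
  ;; Like MAPCAR, but applies FN to each element in its own thread.
  (let ((threads (mapcar (lambda (x)
                           (bt:make-thread (lambda () (funcall fn x))))
                         list)))
    ;; JOIN-THREAD is assumed to return each thread function's value,
    ;; so results come back in order.
    (mapcar #'bt:join-thread threads)))

;; (pmapcar (lambda (n) (* n n)) '(1 2 3 4)) ==> (1 4 9 16)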

-- 
Barry Margolin, ······@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
From: Paul Wallich
Subject: Re: Next Generation of Language
Date: 
Message-ID: <eo6132$7ul$1@reader2.panix.com>
Barry Margolin wrote:
> In article <·······················@77g2000hsv.googlegroups.com>,
>  ······@yahoo.com wrote:
> 
>> I don't think there is anything more anti-parallelism than Lisp. Lisp
>> is recursive: a function basically has to wait for another instance of
>> itself to finish before continuing. Where is the parallelism?
> 
> Functions like MAPCAR easily lend themselves to parallel variants that 
> operate on many elements concurrently.  *Lisp, the Lisp dialect for the 
> massively-parallel Connection Machine, was built around operations like 
> this.
> 
> For coarse-grained parallelism, you can easily make use of the 
> multi-threading features of most modern Lisps.

And even if you (unnecessarily) stick to recursive expressions of your 
algorithm, you can do a fair amount of work in parallel by using futures 
(q.v.) or techniques analogous to loop-unrolling.
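
A future can be faked in a few lines with threads -- a sketch only, 
assuming a bordeaux-threads-style API, where FORCE blocks until the 
value is ready:

(defun future (thunk)
  ;; Start computing THUNK in a background thread; return a handle.
  (bt:make-thread thunk))

(defun force (handle)
  ;; Block until the future's value is ready, then return it.
  (bt:join-thread handle))

;; Hypothetical use, with WALK standing in for the recursive algorithm:
;;   (let ((left (future (lambda () (walk left-subtree)))))
;;     (+ (walk right-subtree) (force left)))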

paul
From: Thomas A. Russ
Subject: Re: Next Generation of Language
Date: 
Message-ID: <ymi7ivtfl6n.fsf@sevak.isi.edu>
······@yahoo.com writes:

> I don't think there is anything more anti-parallelism than Lisp. Lisp
> is recursive: a function basically has to wait for another instance of
> itself to finish before continuing. Where is the parallelism?

MAPCAR.

-- 
Thomas A. Russ,  USC/Information Sciences Institute
From: Alex Mizrahi
Subject: Re: Next Generation of Language
Date: 
Message-ID: <45a68040$0$49201$14726298@news.sunsite.dk>
(message (Hello ·······@yahoo.com)
(you :wrote  :on '(10 Jan 2007 20:20:08 -0800))
(

 w> I don't think there is anything more anti-parallelism than Lisp. Lisp
 w> is recursive: a function basically has to wait for another instance of
 w> itself to finish before continuing. Where is the parallelism?

i have an execute-parallel function which is able to execute a list of jobs in 
parallel:

(execute-parallel
    (loop for i in list
        collect (let ((i i))  ; fresh binding, so each closure captures its own I
                  (lambda () (do-something-with i))))
  5)

with some macrology you can make it even more concise, e.g.

(with-parallel-execution (5)
 (loop for i in list
    collect (p-job (do-something-with i))))

so, with some macrology you can make concurrent execution very easy, while 
at the same time keeping it controllable.
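
(execute-parallel itself isn't shown above; one plausible implementation, 
assuming the bordeaux-threads library, is a fixed pool of worker threads 
draining a shared job list:)

(defun execute-parallel (jobs n-workers)
  ;; Run the zero-argument JOBS functions using at most
  ;; N-WORKERS concurrently executing worker threads.
  (let ((lock  (bt:make-lock))
        (queue (copy-list jobs)))
    (flet ((worker ()
             (loop for job = (bt:with-lock-held (lock) (pop queue))
                   while job
                   do (funcall job))))
      (mapc #'bt:join-thread
            (loop repeat n-workers
                  collect (bt:make-thread #'worker))))))
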
actually, totally uncontrolled concurrency IS A PROBLEM in functional 
languages -- it's ok when you do internal calculations, but once you need 
some interaction with the outer world -- e.g. I/O, network protocols -- you 
need some non-obvious constructs (like monads in Haskell). do you really think 
you need that shit -- simplify implicit concurrency at the cost of complicating 
communication with the linear-time outer world?
some people dig it, but i suspect that most programmers will find that very 
confusing, and they'd prefer some easily controllable concurrency.

actually, you can make a non-deterministic concurrent language in Lisp --  
take a look at Screamer. i suspect it's possible to make Screamer 
concurrent. but you'll have controllable concurrency -- you'll specify which 
computations you want to run in parallel nondeterministically; the others you 
do in simple linear order.

)
(With-best-regards '(Alex Mizrahi) :aka 'killer_storm)
"People who lust for the Feel of keys on their fingertips (c) Inity") 
From: Robert Maas, see http://tinyurl.com/uh3t
Subject: Re: Next Generation of Language
Date: 
Message-ID: <rem-2007nov09-006@yahoo.com>
> From: ······@yahoo.com
> I don't think there is anything more anti-parallelism than Lisp. Lisp
> is recursive: a function basically has to wait for another instance of
> itself to finish before continuing. Where is the parallelism?

OK, consider the most trivial recursive-definition example, factorial:
I you use non-branching recursion merely to emulate iteration,
that's stupid, requiring tail-recursion optimization to avoid stack
overflow. But if you divide-and-conquer, the parallelism is obvious:

(defun lo+hi-product (lo hi)
  "Computes product lo * (lo+1) * (lo+2) ... up to but not including hi"
  (if (>= (+ 1 lo) hi)
    lo
    (let ((mid (ceiling (+ lo hi) 2)))
      (* (lo+hi-product lo mid)
         (lo+hi-product mid hi)))))
;(lo+hi-product 2 3) ==> 2
;(lo+hi-product 1 6) ==> 120
;Caveat: The above code works correctly only when lo is an integer.
;A slight change in the logic would be needed to handle products such
; as 1.3 * 2.3 * 3.3 * 4.3 etc., left as exercise to newbies/students.

(defun factorial (n)
  (lo+hi-product 1 (+ 1 n)))
;(factorial 5) ==> 120

In general, recursive algorithms are more often divide-and-conquer than
emulations of iteration. IMO it's unfortunate that the really-iterative
definition of factorial is usually given as the first example of
"recursion" in schools.

<mathCliche>In general, it helps to solve a more general problem
than the one given. The more general problem is often actually
easier to understand than the given special case.</mathCliche>
In this case, generalizing 1*2*3*... to LO*(LO+1)*(LO+2)*...
makes it more obvious how to divide-and-conquer.
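
To make the parallelism concrete, here is a sketch (assuming the
bordeaux-threads API, and that BT:JOIN-THREAD returns the thread
function's value) that evaluates the two halves concurrently.  Spawning
a thread per split is wasteful -- a realistic version would stop
splitting below some cutoff -- but the divide-and-conquer structure
maps directly onto threads:

(defun plo+hi-product (lo hi)
  "Parallel LO+HI-PRODUCT: the two halves run concurrently."
  (if (>= (+ 1 lo) hi)
    lo
    (let* ((mid (ceiling (+ lo hi) 2))
           (left (bt:make-thread (lambda () (plo+hi-product lo mid))))
           (right (plo+hi-product mid hi))) ; current thread takes one half
      (* (bt:join-thread left) right))))
;(plo+hi-product 1 6) ==> 120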
From: ······@corporate-world.lisp.de
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168615951.986754.107390@v45g2000cwv.googlegroups.com>
Ben wrote:

> On Jan 8, 8:21 am, ·············@gmail.com" <············@gmail.com>
> wrote:
> > From this link
> > http://itpro.nikkeibp.co.jp/a/it/alacarte/iv1221/matsumoto_1.shtml
> > (Note: Japanese)
> > Matz, the creator of Ruby, said that in the next 10 years, desktop
> > computers with 64 or 128 cores will be common. It's nearly impossible
> > to write that many threads by hand; it should be done automatically,
> > so maybe functional languages will do a better job at parallel
> > programming than procedural languages like C or Ruby.
>
> Large embedded systems quite often have that many threads.  Obviously,
> they aren't all actually executing simultaneously on the processors we
> have right now, but various numbers of them are run depending on the
> platform, so the system is (or should be anyway) coded to handle each
> thread executing at any time.  Not that I disagree with your point -
> functional programming would be a great help as our systems grow in
> complexity.
>
> Begin old embedded programmer rant:
> Kids these days just have no idea how to watch for side effects and
> avoid them, or why they should.  What are they learning in school?!
> And don't even ask them to create a formal state machine for the side
> effects they need.  They'd rather throw fifteen booleans in there and
> hope they can cover every possibility!
> End rant.
>
>   Regardless of the language used on an actual product, training people
> in functional programming teaches them the skills they need when
> writing large scale concurrent apps, or small, single threaded apps, or
> any code that they don't want to be patching for the next 30 years.
>
> BTW:  Has anyone done any hard real time work using Lisp?  How'd it go?

There was a LispWorks version with a real-time GC running on a large
ATM Switch.
G2 from Gensym uses a Lisp without GC.
From: Robert Maas, see http://tinyurl.com/uh3t
Subject: Re: Next Generation of Language
Date: 
Message-ID: <rem-2007nov09-003@yahoo.com>
> From: "Ben" <········@gmail.com>
> Begin old embedded programmer rant:
> Kids these days just have no idea how to watch for side effects and
> avoid them, or why they should.  What are they learning in school?!
> And don't even ask them to create a formal state machine for the side
> effects they need.  They'd rather throw fifteen booleans in there and
> hope they can cover every possibility!
> End rant.

So why don't you do something to fix this problem? Join with me to
create a Web-based (CGI/CommonLisp) tutorial which coaches kids
toward having these sorely missing skills? Our online tutorial
would be a fully CAI (Computer-Assisted-Instruction) application,
making liberal use of my SegMat algorithms for coaching the student
towards the correct short-answer fill-in
(see my early 2001 demos at
 <http://shell.rawbw.com/~rem/cgi-bin/topscript.cgi>
 ranging from single word fillins through phrase/sentence fill-in
 such as answers to chicken-cross-road riddle to very long entries
 such as complete song lyrics with rollover)
and also my all-but-one algorithm for effective flashcard drill
(use the guest1 login at my current active Web service site at
 <http://tinyurl.com/uh3t>
 for example try learning Spanish common words in context).


From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168276726.751899.9170@38g2000cwa.googlegroups.com>
············@gmail.com wrote:

> Matz, the creator of Ruby, said that in the next 10 years, desktop
> computers with 64 or 128 cores will be common. It's nearly impossible
> to write that many threads by hand; it should be done automatically,
> so maybe functional languages will do a better job at parallel
> programming than procedural languages like C or Ruby.

I think the only interesting bit of this is what people will do with
this on desktops.  Server applications frequently have plenty of
parallelism to exploit and people are becoming very sensitive to power
issues (not, I think, out of any sense of responsibility but because it
is now often hard to fill racks in DCs without exceeding power &
cooling budgets, and both are also expensive of course).  This is
finally driving people towards multiple core systems clocked less
aggressively (quadratic dependence of power on clock speed really makes
a difference here).

Even when there is not enough parallelism to exploit you can use
multiple-core machines to consolidate lots of less-threaded
applications efficiently, either using one of the somewhat horrible
machine-level virtualisation things or something more lightweight like
zones.

Of course, all this is predicated on there being enough memory
bandwidth that everything doesn't just starve.  I dunno how good
current seriously-multicore systems are in this respect.

But on the desktop most of these applications aren't very interesting,
so finding something for a seriously multicore system to do might be
more of a challenge.  There is, of course, the argument that it doesn't
matter very much: given that it's expensive to provide enough memory
bandwidth & that desktop applications are often much more latency
sensitive than server ones, but somehow desktop processors ship with
much less cache than those for servers, one has to wonder whether
anyone actually really notices.  I suspect desktops already spend most
of their time either idle or stalled waiting for memory.  Adding more
cores will just mean they spend more time doing both.  Not that this
will stop anyone, of course.
From: Pascal Bourguignon
Subject: Re: Next Generation of Language
Date: 
Message-ID: <87bql94abb.fsf@thalassa.informatimago.com>
"Tim Bradshaw" <··········@tfeb.org> writes:
> I think the only interesting bit of this is what people will do with
> this on desktops.  

Simulations, 3D, virtual worlds, Neural Networks, (game) AI.

Indeed, it would be nice to add several parallel memory buses, or just
have big L2 or L3 caches.  
With a 64MB L2 cache PER core and 64 cores, you have 4GB of RAM on chip.

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
Small brave carnivores
Kill pine cones and mosquitoes
Fear vacuum cleaner
From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168298143.150234.124410@38g2000cwa.googlegroups.com>
Pascal Bourguignon wrote:

> Simulations,

Of what?

> 3D,

perhaps, but there's probably more than enough computing power already
to do all but the rendering, and the rendering will never be done in a
general purpose CPU I should think (as it isn't now, of course).

> virtual worlds,

To be interesting these involve information that travels over networks
which have latencies of significant fractions of a second and
constrained bandwidth.  Seems unlikely that vast local computing power
will help that much.

> Neural Networks,

To what end?

> (game) AI.

I know nothing about games really, but I'd lay odds that the thing that
consumes almost all the computational resource is rendering. See above.

>
> Indeed, it would be nice to add several parallel memory buses, or just
> have big L2 or L3 caches.
> With a 64MB L2 cache PER core and 64 cores, you have 4GB of RAM on chip.

There are lots of reasons why that sort of thing is hard and expensive.
From: Spiros Bousbouras
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168298748.558477.152070@11g2000cwr.googlegroups.com>
Tim Bradshaw wrote:
> Pascal Bourguignon wrote:
> > (game) AI.
>
> I know nothing about games really, but I'd lay odds that the thing that
> consumes almost all the computational resource is rendering. See above.

If you want to analyse chess positions you can never
have too much speed and it has nothing to do with
rendering. I'm sure it's the same situation with go and
many other games.
From: Ken Tilton
Subject: Re: Next Generation of Language
Date: 
Message-ID: <iIBoh.69$GY1.25@newsfe10.lga>
Spiros Bousbouras wrote:
> Tim Bradshaw wrote:
> 
>>Pascal Bourguignon wrote:
>>
>>>(game) AI.
>>
>>I know nothing about games really, but I'd lay odds that the thing that
>>consumes almost all the computational resource is rendering. See above.
> 
> 
> If you want to analyse chess positions you can never
> have too much speed and it has nothing to do with
> rendering. I'm sure it's the same situation with go and
> many other games.
> 

That's kind of a reductio ad absurdum argument. Deciding the edge in a 
middle game chess position is a tad trickier than deciding if that 
cluster bomb went off close enough to this NPC to kill it.

otoh, GPUs are already a form of parallelization and the computations 
they do lend themselves nicely to on-board parallelization, so ... well, 
come to think of it, people are already playing realtime online so games 
may be held back by Net bandwidth for a long time to come, I guess until 
they get bored with that and just take to the streets roving in gangs 
attacking passers-by.

kt

-- 
The Dalai Lama gets the same crap all the time.
   -- Kenny Tilton on c.l.l when accused of immodesty
From: Ray Dillinger
Subject: Re: Next Generation of Language
Date: 
Message-ID: <45afc469$0$69019$742ec2ed@news.sonic.net>
Ken Tilton wrote:

> Spiros Bousbouras wrote:

>> If you want to analyse chess positions you can never
>> have too much speed and it has nothing to do with
>> rendering. I'm sure it's the same situation with go and
>> many other games.


> That's kind of a reductio ad absurdum argument. Deciding the edge in a 
> middle game chess position is a tad trickier than deciding if that 
> cluster bomb went off close enough to this NPC to kill it.


No, it's not.  If you give game designers the power, they're
going to do generalized min-maxing with generalized pruning
to decide exactly which armaments to deploy in the next ten
seconds, and when it comes down to a choice between spending
ammo for the machine gun and spending a missile, they're going
to test both scenarios against opponent's responses and odds of
opponents still being able to respond with a 3-move lookahead
generating some thousands of scenarios, simulate them all, and
pick the highest-scored one.  Just like they do now with simple
games such as chess.

			Bear
From: Ken Tilton
Subject: Re: Next Generation of Language
Date: 
Message-ID: <N5Qrh.22$Cw6.7@newsfe10.lga>
Ray Dillinger wrote:
> Ken Tilton wrote:
> 
>> Spiros Bousbouras wrote:
> 
> 
>>> If you want to analyse chess positions you can never
>>> have too much speed and it has nothing to do with
>>> rendering. I'm sure it's the same situation with go and
>>> many other games.
> 
> 
> 
>> That's kind of a reductio ad absurdum argument. Deciding the edge in a 
>> middle game chess position is a tad trickier than deciding if that 
>> cluster bomb went off close enough to this NPC to kill it.
> 
> 
> 
> No, it's not.

Is, too!

Oh, hang on, you agree, because from here on you say nothing about the 
computational complexity of deciding a chess move vs a bomb kill:

>  If you give game designers the power, they're
> going to do generalized min-maxing with generalized pruning
> to decide exactly which armaments to deploy in the next ten
> seconds, and when it comes down to a choice between spending
> ammo for the machine gun and spending a missile, they're going
> to test both scenarios against opponent's responses and odds of
> opponents still being able to respond with a 3-move lookahead
> generating some thousands of scenarios, simulate them all, and
> pick the highest-scored one. 

Aside from that being a non-contradicting contradiction, which game do 
you have in mind that demonstrates this unfed thirst for CPU power? ie, 
Which game, IYHO, has the best AI?

kzo

-- 
The Dalai Lama gets the same crap all the time.
   -- Kenny Tilton on c.l.l when accused of immodesty
From: Timofei Shatrov
Subject: Re: Next Generation of Language
Date: 
Message-ID: <45aff76a.47929599@news.readfreenews.net>
On Thu, 18 Jan 2007 14:45:36 -0500, Ken Tilton <·········@gmail.com> tried to
confuse everyone with this message:


>>  If you give game designers the power, they're
>> going to do generalized min-maxing with generalized pruning
>> to decide exactly which armaments to deploy in the next ten
>> seconds, and when it comes down to a choice between spending
>> ammo for the machine gun and spending a missile, they're going
>> to test both scenarios against opponent's responses and odds of
>> opponents still being able to respond with a 3-move lookahead
>> generating some thousands of scenarios, simulate them all, and
>> pick the highest-scored one. 
>
>Aside from that being a non-contradicting contradiction, which game do 
>you have in mind that demonstrates this unfed thirst for CPU power? ie, 
>Which game, IYHO, has the best AI?
>

Take Britain's best-selling game, for example: Football Manager 2007.
Graphics are purely cosmetic. CPU intensive as hell. I think this is the game
with the most complicated AI on the market.

Another example is the underground hit Dwarf Fortress, which also needs a lot of
computational power while using ANSI characters for graphics.

-- 
|Don't believe this - you're not worthless              ,gr---------.ru
|It's us against millions and we can't take them all... |  ue     il   |
|But we can take them on!                               |     @ma      |
|                       (A Wilhelm Scream - The Rip)    |______________|
From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1169169266.750812.270220@51g2000cwl.googlegroups.com>
Ray Dillinger wrote:

>
> No, it's not.  If you give game designers the power, they're
> going to do generalized min-maxing with generalized pruning
> to decide exactly which armaments to deploy in the next ten
> seconds, and when it comes down to a choice between spending
> ammo for the machine gun and spending a missile, they're going
> to test both scenarios against opponent's responses and odds of
> opponents still being able to respond with a 3-move lookahead
> generating some thousands of scenarios, simulate them all, and
> pick the highest-scored one.  Just like they do now with simple
> games such as chess.

In many cases they won't do that, because if they did then the computer
player would outplay the humans all the time, and the game would then
not sell very well.
From: Ken Tilton
Subject: Re: Next Generation of Language
Date: 
Message-ID: <GOVrh.2$ka.0@newsfe12.lga>
Tim Bradshaw wrote:
> Ray Dillinger wrote:
> 
> 
>>No, it's not.  If you give game designers the power, they're
>>going to do generalized min-maxing with generalized pruning
>>to decide exactly which armaments to deploy in the next ten
>>seconds, and when it comes down to a choice between spending
>>ammo for the machine gun and spending a missile, they're going
>>to test both scenarios against opponent's responses and odds of
>>opponents still being able to respond with a 3-move lookahead
>>generating some thousands of scenarios, simulate them all, and
>>pick the highest-scored one.  Just like they do now with simple
>>games such as chess.
> 
> 
> In many cases they won't do that, because if they did then the computer
> player would outplay the humans all the time, and the game would then
> not sell very well.
> 

Yeah, I have to assume they hold back the bot race car drivers in GT4. 
But while researching this issue the better to do battle with you yobs I 
saw one AI bigwig say that one advantage of AI would be that the game would 
not have to cheat to keep up.

I have no facts to back this up, but something makes me think MMORPGs are 
taking things more towards human vs. human. Much more draw there. Look at 
Usenet addiction... :)

If so, the bottlenecks continue to be net bandwidth and the CPU's 
ability to feed the GPU (and the GPU).

The wise game developer drops out of the CPU/GPU race and works on game 
play -- the story, the setting, the rules. But that is like saying the 
smart film studio should concentrate on great screenplays -- nah, too 
hard, get Pixar on the phone and gimme my checkbook. Is DiCaprio 
available? Jolie?

kzo

-- 
The Dalai Lama gets the same crap all the time.
   -- Kenny Tilton on c.l.l when accused of immodesty
From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168333745.243377.150650@s80g2000cwa.googlegroups.com>
Spiros Bousbouras wrote:
> Tim Bradshaw wrote:

>
> If you want to analyse chess positions you can never
> have too much speed and it has nothing to do with
> rendering. I'm sure it's the same situation with go and
> many other games.

Quite. Those kinds of games are really popular on PCs, I hear: no one
plays all those tedious `video games' any more.
From: Spiros Bousbouras
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168481307.874093.69730@p59g2000hsd.googlegroups.com>
Madhu wrote:
> * "Spiros Bousbouras"  <························@11g2000cwr.XXXXX.com> :
> | If you want to analyse chess positions you can never
> | have too much speed and it has nothing to do with
> | rendering. I'm sure it's the same situation with go and
> | many other games.
>
> But having more than one core will not be a benefit if your algorithms
> are graph based and have to search a tree. IIRC most graph algorithms
> (dfs bfs) are inherently unparallelizable.

I don't know what algorithms are being used for
chess engines but they definitely benefit from parallel
processing. Well-known programs like Fritz, Shredder,
Crafty etc. play better with many processors. I stumbled
upon a PhD thesis on the net once about parallel chess-playing
algorithms. And of course Hydra
(http://www.hydrachess.com/main.cfm) uses many processors.
From: Christopher Browne
Subject: Re: Next Generation of Language
Date: 
Message-ID: <87vejetbmd.fsf@wolfe.cbbrowne.com>
In an attempt to throw the authorities off his trail, Madhu <·······@meer.net> transmitted:
> * "Spiros Bousbouras"  <························@11g2000cwr.XXXXX.com> :
> | If you want to analyse chess positions you can never
> | have too much speed and it has nothing to do with
> | rendering. I'm sure it's the same situation with go and
> | many other games.
>
> But having more than one core will not be a benefit if your algorithms
> are graph based and have to search a tree. IIRC most graph algorithms
> (dfs bfs) are inherently unparallelizable.

Many probably aren't designed to be able to be parallelized; that's
not necessarily quite the same thing as being unable to be decomposed
into parallelizable portions.

It is, however, pretty fair to say that parallel decomposition
continues to be a troublesome problem.
-- 
let name="cbbrowne" and tld="gmail.com" in name ^ ·@" ^ tld;;
http://linuxdatabases.info/info/spreadsheets.html
"What I find most amusing  about com and .NET  is that they are trying
to solve a problem I only had when programming using MS tools."
-- Max M <····@mxm.dk> (on comp.lang.python)
From: Barry Margolin
Subject: Re: Next Generation of Language
Date: 
Message-ID: <barmar-966206.00230511012007@comcast.dca.giganews.com>
In article <··············@robohate.meer.net>, Madhu <·······@meer.net> 
wrote:

> * "Spiros Bousbouras"  <························@11g2000cwr.XXXXX.com> :
> | If you want to analyse chess positions you can never
> | have too much speed and it has nothing to do with
> | rendering. I'm sure it's the same situation with go and
> | many other games.
> 
> But having more than one core will not be a benefit if your algorithms
> are graph based and have to search a tree. IIRC most graph algorithms
> (dfs bfs) are inherently unparallelizable.

I think there was a Chess program for the Connection Machine, a 
massively parallel computer with thousands of very simple processors 
(or, in the case of the CM-5 model, hundreds of SPARC processors).  I 
don't know the specifics of the algorithm, but my guess is that it 
worked by assigning analysis of different positions at a particular ply 
to each processor.  Walking the tree isn't very parallelizable, but once 
you've reached the leaves you can get quite a bit of benefit.

-- 
Barry Margolin, ······@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
From: Juan R.
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168509823.635811.23580@k58g2000hse.googlegroups.com>
Madhu wrote:

> * "Spiros Bousbouras"  <························@11g2000cwr.XXXXX.com> :
> | If you want to analyse chess positions you can never
> | have too much speed and it has nothing to do with
> | rendering. I'm sure it's the same situation with go and
> | many other games.
>
> But having more than one core will not be a benefit if your algorithms
> are graph based and have to search a tree. IIRC most graph algorithms
> (dfs bfs) are inherently unparallelizable.

And couldn't a parallel tree search distribute subtree searches
between cores at each branching point?

       \             1 core search
        \
        /\           2 cores search
       /  \
      /    \
     /\     \        3 cores search
    /  \     \
   /    \     \
      Target
From: Maciek Pasternacki
Subject: Re: Next Generation of Language
Date: 
Message-ID: <87r6u1plm6.fsf@lizard.king>
On Sweetmorn, Chaos 11, 3173 YOLD, Juan R. wrote:

>> | If you want to analyse chess positions you can never
>> | have too much speed and it has nothing to do with
>> | rendering. I'm sure it's the same situation with go and
>> | many other games.
>>
>> But having more than one core will not be a benefit if your algorithms
>> are graph based and have to search a tree. IIRC most graph algorithms
>> (dfs bfs) are inherently unparallelizable.
>
> And couldn't a parallel tree search distribute subtree searches
> between cores at each branching point?

I thought about parallelizing DFS with queues:

single thread would work like:
(loop
  (if *node-queue*
    (let ((node (dequeue *node-queue*)))
      (do-something-with node)
      (dolist (subnode (children node))
        (enqueue subnode *node-queue*)))
    (return)))

Search would start with enqueuing the root node, and would end by any
thread setting *node-queue* to NIL.  This would be parallelizable over
any number of cores (supposing one doesn't care about exact DFS search
order -- but if one cared about order, one wouldn't have considered
parallelizing).
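
For several threads to share that loop safely, the emptiness test and
the dequeue must be a single atomic step.  A sketch of the queue
operations, assuming bordeaux-threads and a plain list as the queue
(the queue argument is folded into the special variable for brevity):

(defvar *node-queue* '())
(defvar *queue-lock* (bt:make-lock))

(defun dequeue ()
  ;; Atomically pop one node; NIL means the queue was empty.
  (bt:with-lock-held (*queue-lock*)
    (pop *node-queue*)))

(defun enqueue (node)
  (bt:with-lock-held (*queue-lock*)
    (push node *node-queue*)))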

-- 
__    Maciek Pasternacki <·······@japhy.fnord.org> [ http://japhy.fnord.org/ ]
`| _   |_\  /{ I once thought the same under the arch of a vault in a church.
,|{-}|}| }\/ Why, I thought, does the vault not collapse, since nothing holds
\/   |____/ it up? Because, I answered, all the stones want to move at once. }(J.P.) -><-
From: Madhu
Subject: Re: Next Generation of Language
Date: 
Message-ID: <m3hcuwaose.fsf@robohate.meer.net>
* Maciek Pasternacki  <··············@lizard.king> :
| On Sweetmorn, Chaos 11, 3173 YOLD, Juan R. wrote:
|
|>> | If you want to analyse chess positions you can never have too
|>> | much speed and it has nothing to do with rendering. I'm sure
|>> | it's the same situation with go and many other games.
|>>
|>> But having more than one core will not be a benefit if your
|>> algorithms are graph based and have to search a tree. IIRC most
|>> graph algorithms (dfs bfs) are inherently unparallelizable.
|>
|> And couldn't a parallel tree search distribute subtree searches
|> between cores at each branching point?
[...]
| single thread would work like:
| (loop
|   (if *node-queue*
|     (let ((node (dequeue *node-queue*)))
|       (do-something-with node)
|       (dolist (subnode (children node))
|         (enqueue subnode *node-queue*)))
|     (return)))
|
| Search would start with enqueuing root node, and would end by any
| thread setting *node-queue* to NIL.  This would be parallelizable
| over any number of cores (supposing one doesn't care about exact DFS
| search order -- but if one cared about order, one wouldn't have
| considered parallelizing).

Your stopping criterion will have to be different.  Also, if your
input is not a tree, this algorithm will expand the same node multiple
times.  This [inefficiency] can be done in parallel, of course :)

Which is why order tends to be important in DFS, and why it is
unsuitable for decomposition.  Of course, as others have noted, once
the leaves are reached there are usually gains to be made.  The point
I wanted to make was akin to that in chemistry, where the overall rate
of a reaction is limited by the rate of the slowest step. (The slowest
step here being walking the graph)

--
Madhu
From: Alex Mizrahi
Subject: Re: Next Generation of Language
Date: 
Message-ID: <45a75d8f$0$49204$14726298@news.sunsite.dk>
(message (Hello 'Madhu)
(you :wrote  :on '(Fri, 12 Jan 2007 09:18:17 +0530))
(

 M> Your stopping criterion will have to be different.  Also, if your
 M> input is not a tree, this algorithm will expand the same node multiple
 M> times.  This [inefficiency] can be done in parallel, of course :)

 M> Which is why order tends to be important in DFS, and why it is
 M> unsuitable for decomposition.

and how is order important, and how will it help you to find circularities?

i've checked whether i'm dumb by reading the wikipedia article. didn't find 
anything new about 'important order' -- they offer two approaches for graphs 
with cycles: remembering which nodes were traversed, and iterative deepening 
(if the graph is very large).
both of these approaches will work for parallel DFS.

btw, you can find articles about parallel DFS and BFS in google. maybe you'd 
better first educate yourself before saying strange stuff?

)
(With-best-regards '(Alex Mizrahi) :aka 'killer_storm)
"People who lust for the Feel of keys on their fingertips (c) Inity") 
From: Alex Mizrahi
Subject: Re: Next Generation of Language
Date: 
Message-ID: <45a7bc13$0$49198$14726298@news.sunsite.dk>
(message (Hello 'Madhu)
(you :wrote  :on '(Fri, 12 Jan 2007 19:35:20 +0530))
(

 M> | and how is order important, and how will it help you to find
 M> | circularities?  i've checked whether i'm dumb by reading the wikipedia
 M> | article. didn't find anything new about 'important order' -- they
 M> | offer two approaches for graphs with cycles: remembering which nodes
 M> | were traversed, and iterative deepening (if the graph is very large).
 M> | both of these approaches will work for parallel DFS.

 M> "Remembering nodes" effectively imposes an ordering - the order
 M> induced by the DFS search.

i see no connection between remembering nodes and ordering.

one obvious way will be to use a hash-table-like construct to remember nodes; 
although it will have a questionable performance impact, it can be shared 
among different processors, so there's no need for any ordering, and it 
should work.
the other obvious way will be setting marks on the nodes themselves -- and 
again, there's no order involved and no issues with multiple processes.

 M> | btw, you can find articles about parallel DFS and BFS in
 M> | google. maybe you'd better first educate yourself before saying
 M> | strange stuff?

 M> I think the theoretical limits of what I'm saying have been well
 M> established for decades now. I learnt some of this stuff in a graduate
 M> course in parallel algorithms.

you're saying that ordering is somehow important in DFS, i don't see why. 
can you point to such an order-sensitive DFS?
maybe you just don't recall correctly what you have learned?

btw, decades ago 'parallelization' might have meant calculations on some 
cluster. but we are speaking about calculations on multiple cores of one 
machine -- in this case 'communication' speed between nodes is orders of 
magnitude more effective, thus algorithms that are practically infeasible 
on a cluster will work very well on a multicore system. for example, the 
aforementioned marking method can consume lotsa traffic, but on an SMP system 
its cost is about 0, since multiple cores use the same memory (they might 
need to synchronize caches, but i think that cost is very low).

)
(With-best-regards '(Alex Mizrahi) :aka 'killer_storm)
"People who lust for the Feel of keys on their fingertips (c) Inity") 
From: Maciek Pasternacki
Subject: Re: Next Generation of Language
Date: 
Message-ID: <87ac0op2qv.fsf@lizard.king>
On Boomtime, Chaos 12, 3173 YOLD, Madhu wrote:

> |>> | If you want to analyse chess positions you can never have too
> |>> | much speed and it has nothing to do with rendering. I'm sure
> |>> | it's the same situation with go and many other games.
> |>>
> |>> But having more than one core will not be a benefit if your
> |>> algorithms are graph based and have to search a tree. IIRC most
> |>> graph algorithms (dfs bfs) are inherently unparallelizable.
> |>
> |> And couldn't a parallel tree search distribute subtree searches
> |> between cores at each branching point?
> [...]
> | single thread would work like:
> | (loop
> |   (if *node-queue*
> |     (let ((node (dequeue *node-queue*)))
> |       (do-something-with node)
> |       (dolist (subnode (children node))
> |         (enqueue subnode *node-queue*)))
> |     (return)))
> |
> | Search would start with enqueuing root node, and would end by any
> | thread setting *node-queue* to NIL.  This would be parallelizable
> | over any number of cores (supposing one doesn't care about exact DFS
> | search order -- but if one cared about order, one wouldn't have
> | considered parallelizing).
>
> Your stopping criterion will have to be different.  Also, if your
> input is not a tree, this algorithm will expand the same node multiple
> times.  This [inefficiency] can be done in parallel, of course :)

Yup, I posted DFS for trees for simplicity, but it can be generalized
by marking visited nodes (with a single lock -- atomically mark and
enqueue a node, and don't enqueue marked nodes).
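
Concretely, the atomic mark-and-enqueue could look like this -- a
sketch assuming bordeaux-threads, with visited nodes remembered in a
hash table keyed by identity:

(defvar *visited*    (make-hash-table :test #'eq))
(defvar *node-queue* '())
(defvar *queue-lock* (bt:make-lock))

(defun mark-and-enqueue (node)
  ;; Enqueue NODE unless some thread has already seen it;
  ;; holding the single lock makes the test-and-mark atomic.
  (bt:with-lock-held (*queue-lock*)
    (unless (gethash node *visited*)
      (setf (gethash node *visited*) t)
      (push node *node-queue*))))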

-- 
__    Maciek Pasternacki <·······@japhy.fnord.org> [ http://japhy.fnord.org/ ]
`| _   |_\  / { Razors pain you; rivers are damp; acids stain you; and drugs
,|{-}|}| }\/ cause cramp.  Guns aren't lawful; nooses give; gas smells awful;
\/   |____/ you might as well live.               }  (  Dorothy Parker )  -><-
From: Robert Maas, see http://tinyurl.com/uh3t
Subject: Re: Next Generation of Language
Date: 
Message-ID: <rem-2007nov09-009@yahoo.com>
> From: Madhu <·······@meer.net>
> if your input is not a tree, this algorithm will expand the same
> node multiple times.  This [inefficiency] can be done in parallel,
> of course :) Which is why order tends to be important in DFS, and
> why it is unsuitable for decomposition.

If you use any parallelism, then by definition it isn't *depth*first*
search, because you're already starting a second branch before the
first branch is completely finished. So I'm going to ignore the
reference to DFS in what you said, and treat the more general case
of top-down traversal of DAG (directed acyclic graph) from a unique
top node.

In the context here, per-CPU computational resources greatly exceed
inter-CPU communication capability. Accordingly there's virtually
no cost for computing hash numbers for situations and using them to
pass service requests to a distributed hash table and thereby
consolidating duplicate appearances of the same situation and then
caching results. Note that it isn't necessary to pass the entire
situation from the node that spawns it to the node owning its hash.
It's sufficient for the spawner to ask "are you working on this
already" via the hash code, and if so then let the work continue
and set a callback for when the task is done; if not, then set a back-link
and work on it yourself if you're not overloaded relative to the
hash-owner.
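
Stripped of the distribution machinery, the caching idea is ordinary
memoization keyed on the situation.  A single-machine sketch, where
COMPUTE-SITUATION and the EQUALP hashing of situations are stand-ins:

(defvar *situation-cache* (make-hash-table :test #'equalp))

(defun cached-result (situation)
  ;; Return the cached result for SITUATION, computing it at most once,
  ;; so duplicate appearances of the same situation are consolidated.
  (multiple-value-bind (result foundp)
      (gethash situation *situation-cache*)
    (if foundp
        result
        (setf (gethash situation *situation-cache*)
              (compute-situation situation)))))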
From: Pascal Bourguignon
Subject: Re: Next Generation of Language
Date: 
Message-ID: <87zm8t2i18.fsf@thalassa.informatimago.com>
"Tim Bradshaw" <··········@tfeb.org> writes:

> Pascal Bourguignon wrote:
>
>> Simulations,
>
> Of what?

Of anything.  Galaxies, planets, weather, ecosystems, animals, cells,
nanobots, chemicals, particles, etc.


>> 3D,
>
> perhaps, but there's probably more than enough computing power already
> to do all but the rendering, and the rendering will never be done in a
> general purpose CPU I should think (as it isn't now, of course).
>
> virtual worlds,
>
> To be interesting these involve information that travels over networks
> which have latencies of significant fractions of a second and
> constrained bandwidth.  Seems unlikely that vast local computing power
> will help that much.

Let me see, in my far-out corner of the net, my ISP has doubled my ADSL
speed every year (without my even asking). In ten years, I should have
2Gb/s of Internet bandwidth here.  I don't think 64 cores will be too
many to handle that.



> Neural Networks,
>
> To what end?

To do your job in your place.  In ten years, we'll have enough
processing power and memory in desktop computers to model a whole
human brain.  Better have parallel processors then, if you want to
emulate one at an acceptable speed.



> (game) AI.
>
> I know nothing about games really, but I'd lay odds that the thing that
> consumes almost all the computational resource is rendering. See above.


That's because rendering consumes all the CPU, so nothing else is done in
games (but tricks).  


>> Indeed, it would be nice to add several parallel memory buses, or just
>> have big L2 or L3 caches.
>> With a 64MB L2 cache PER core and 64 cores, you have 4GB of RAM on chip.
>
> There are lots of reasons why that sort of thing is hard and expensive.

They won't be anymore.


-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

"Our users will know fear and cower before our software! Ship it!
Ship it and let them flee like the dogs they are!"
From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168334857.190727.297040@i15g2000cwa.googlegroups.com>
Pascal Bourguignon wrote:

>
> Of anything.  Galaxies, planets, meteo, ecosystems, animals, cells,
> nanobots, chemicals, particules, etc.

I think you're missing my point (which is, I admit, a bit subtle,
especially by the standards of cll in recent history).

I don't doubt that multicore processors will make it into desktops in a
big way.  But they will make it into desktops for reasons which are not
what you might think, and in particular are substantially not due to
the computational demands of desktop applications.

So for instance: do a commercially significant proportion of desktop
users spend their time simulating ecosystems, the weather, galaxies
etc?  I suggest that they do not, and they will not. (As to whether
multicore CPUs are a good approach to that kind of simulation: I
suspect they're not as good as you might think, but that's another
story.)

> Let me see, in my far-out corner of the net, my ISP has doubled my ADSL
> speed every year (without my even asking). In ten years, I should have
> 2Gb/s of Internet bandwidth here.  I don't think 64 cores will be too
> many to handle that.

I suspect it will top out long before that.  It's amazing (really, it
is a deeply spectacular feat of engineering) what can be got over a
phone line, but there are limits.  More to the point latency is not
something you can make vanish.

> That's because rendering consumes all the CPU, so nothing else is done in
> games (but tricks).

All the interesting rendering happens in the video card, and I don't
see that changing any time soon.  Video cards already often have
considerably more (specialised) computing power than general purpose
CPUs on serious gaming PCs.  They cost more than the rest of the
machine too.

> They won't be anymore.

Sorry, they will.  The memory wall is emphatically not going away.

--tim
From: Pascal Bourguignon
Subject: Re: Next Generation of Language
Date: 
Message-ID: <87ejq42yx2.fsf@thalassa.informatimago.com>
"Tim Bradshaw" <··········@tfeb.org> writes:

> Pascal Bourguignon wrote:
>
>>
>> Of anything.  Galaxies, planets, meteo, ecosystems, animals, cells,
>> nanobots, chemicals, particules, etc.
>
> I think you're missing my point (which is, I admit, a bit subtle,
> especially by the standards of cll in recent history).
>
> I don't doubt that multicore processors will make it into desktops in a
> big way.  But they will make it into desktops for reasons which are not
> what you might think, and in particular are substantially not due to
> the computational demands of desktop applications.
>
> So for instance: do a commercially significant proportion of desktop
> users spend their time simulating ecosystems, the weather, galaxies
> etc?  I suggest that they do not, and they will not. (As to whether
> multicore CPUs are a good approach to that kind of simulation: I
> suspect they're not as good as you might think, but that's another
> story.)

Ah, well, if you want to discuss the real reason why they'll be put in
desktop PCs, it's clearly because they cannot increase the clock
frequency much more, so instead of being able to say "Hey, my PC has a
5GHz processor!", we must be able to say "Hey, my PC has 128 cores!".



Anyways, you leave us in suspense.  What is this reason, that we're not
suspecting, why multicores will spread on desktops?


>> Let me see, in my far-out corner of the net, my ISP has doubled my ADSL
>> speed every year (without my even asking). In ten years, I should have
>> 2Gb/s of Internet bandwidth here.  I don't think 64 cores will be too
>> many to handle that.
>
> I suspect it will top out long before that.  It's amazing (really, it
> is a deeply spectacular feat of engineering) what can be got over a
> phone line, but there are limits.  More to the point latency is not
> something you can make vanish.

They'll have switched to optical fiber by then.  Perhaps by nanobots
that would convert the copper in the existing cable into some kind of
transparent copper crystal ;-)


-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

This is a signature virus.  Add me to your signature and help me to live.
From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168345148.098241.181670@i15g2000cwa.googlegroups.com>
Pascal Bourguignon wrote:
> "Tim Bradshaw" <··········@tfeb.org> writes:
>

> Ah, well, if you want to discuss the real reason why they'll be put in
> desktop PCs, it's clearly because they cannot increase the clock
> frequency much more, so instead of being able to say "Hey, my PC has a
> 5GHz processor!", we must be able to say "Hey, my PC has 128 cores!".

> Anyways, you leave us in suspense.  What is this reason, that we're not
> suspecting, why multicores will spread on desktops?

Basically what you say (so, clearly it is what you were expecting
anyway): they have to keep selling upgrades to people.

I think a couple of other reasons (don't know their respective
importance) are
- power consumption: people, I hope, will finally realise that liquid
cooled PCs are actually not funny;
- it's what the CPU vendors will have to sell: they won't want to spend
huge amounts of money developing entirely different desktop and server
processors (I'm not sure if this argument holds water).
From: Kirk  Sluder
Subject: Re: Next Generation of Language
Date: 
Message-ID: <kirk-CF6DAD.09132509012007@newsclstr02.news.prodigy.com>
In article <························@i15g2000cwa.googlegroups.com>,
 "Tim Bradshaw" <··········@tfeb.org> wrote:

> - it's what the CPU vendors will have to sell: they won't want to spend
> huge amounts of money developing entirely different desktop and server
> processors (I'm not sure if this argument holds water).

The high-performance processors are already moving to quad core.  
None of the chip makers appear to have much interest in simplifying 
the market in that way.
From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168413057.772483.12280@77g2000hsv.googlegroups.com>
Kirk Sluder wrote:

> The high-performance processors are already moving to quad core.

Well, of course, the high-end processors shipped for the server market
will always be somewhat ahead of those for desktops, as there's a
commercially significant number of people willing to pay thousands of
dollars per part there, while the market for desktops that expensive is
really very small indeed (not many people buy $5-10k desktops any
more).  It takes a year or two for parts to get cheap enough to make it
into the desktop market (and they're often lower spec in various ways -
less cache per core etc - to get the price down further).

But actually I read a review of a quad core Intel CPU which was clearly
aimed at (high-end) desktops just the other day (though it was some
horrible two-chips-in-one-package thing, so not actually quad core in
any real sense at all, merely a pair of very densely packed dual-core
CPUs).  Outside of x86, 8 core CPUs have been shipping in significant
numbers for a while.

> None of the chip makers appear to have much interest in simplifying
> the market in that way.

The point I was making is that the same basic implementation
architecture is what will ship into both markets, not that identical
parts will.  And it's not a matter of "simplifying the market", it's a
matter of it being far too expensive for processor vendors to develop
entirely independent lines for markets which are so closely related.

--tim
From: Christopher Browne
Subject: Re: Next Generation of Language
Date: 
Message-ID: <87ps9mtbbv.fsf@wolfe.cbbrowne.com>
Martha Stewart called it a Good Thing when "Tim Bradshaw" <··········@tfeb.org> wrote:
> Kirk Sluder wrote:
>
>> The high-performance processors are already moving to quad core.
>
> Well, of course, the high-end processors shipped for the server market
> will always be somewhat ahead of those for desktops, as there's a
> commercially significant number of people willing to pay thousands of
> dollars per part there, while the market for desktops that expensive is
> really very small indeed (not many people buy $5-10k desktops any
> more).  It takes a year or two for parts to get cheap enough to make it
> into the desktop market (and they're often lower spec in various ways -
> less cache per core etc - to get the price down further).

Note that laptop makers are starting to trumpet "Intel Dual Core!!!",
so while that may not be quad-core, they're certainly pushing in this
direction.  And they're trying to at least pretend that users need
more CPU power so as to use this.

The "next generation" apps that can chew the CPU resources appear to
be compression/decompression of video and sound.  

How useful that *truly* is may be questionable.  Having dual cores on
my laptop doesn't make ssh sessions run perceptibly faster, and that,
along with Firefox sessions, is primarily what that laptop does...

> But actually I read a review of a quad core Intel CPU which was clearly
> aimed at (high-end) desktops just the other day (though it was some
> horrible two-chips-in-one-package thing, so not actually quad core in
> any real sense at all, merely a pair of very densely packed dual-core
> CPUs).  Outside of x86, 8 core CPUs have been shipping in significant
> numbers for a while.

There's a company trying to hawk systems based on 16-core MIPS CPUs.
I wish them well, but suspect it may not go well for them...

>> None of the chip makers appear to have much interest in simplifying
>> the market in that way.
>
> The point I was making is that the same basic implementation
> architecture is what will ship into both markets, not that identical
> parts will.  And it's not a matter of "simplifying the market", it's a
> matter of it being far too expensive for processor vendors to develop
> entirely independent lines for markets which are so closely related.

Which is essentially how Opteron "took off," whereas IA-64 seems to be
essentially of no commercial interest.
-- 
output = ("cbbrowne" ·@" "gmail.com")
http://cbbrowne.com/info/nonrdbms.html
Error: Keyboard not attached. Press F1 to continue. 
From: Robert Maas, see http://tinyurl.com/uh3t
Subject: Re: Next Generation of Language
Date: 
Message-ID: <rem-2007nov09-005@yahoo.com>
> > I suspect it will top out long before that.  It's amazing (really, it
> > is a deeply spectacular feat of engineering) what can be got over a
> > phone line, but there are limits.  More to the point latency is not
> > something you can make vanish.
> From: Pascal Bourguignon <····@informatimago.com>
> They'll have switched to optical fiber by then.  Perhaps by nanobots
> that would convert the copper in the existing cable into some kind of
> transparent copper crystal ;-)

Nah, copper is becoming a scarce resource. Criminals are already
stripping copper out of electrical wiring, water pipes, anything
else they can find that doesn't have an armed guard, to sell on the
salvage market. Criminals are even risking their lives stripping
copper out of live high-voltage power lines and substations. We're
going to need to hire guards to watch *all* our long-distance power
lines 24/7 to protect them. We need to replace all that copper
wiring used for communication with cheap silicon dioxide to relieve
the copper shortage and thereby lower the salvage value of copper
still needed for power distribution.

I propose the nanobots would first drill a parallel tube alongside
the existing cable, to use for importing new materials and getting
rid of copper "waste", then march along an existing cable strand
replacing copper wire with glass fiber. If any long path is split
into short segments, with an access point at each junction between
successive segments, then a team of nanobots could replace all the
segments of a wire in parallel, so the downtime for a long strand
would be merely the time it takes one nanobot to replace one
segment of wire. Then as soon as one strand is replaced, a second
strand is taken down and replaced, etc., until all the strands in
the entire multi-wire cable are now optical fiber. Note that each
time one strand is segment-wise replaced, the nanobots have
advanced one step from left to right:
        Access------------Access------------Access------------Access
Start1: Nanobot1----------Nanobot2----------Nanobot3----------Empty
Midway: Empty-->Nanobot1>-------->Nanobot2>-------->Nanobot3>-Empty
Final1: Empty-------------Nanobot1----------Nanobot2----------Nanobot3
so the leftmost access point is empty of nanobot while the rightmost
has a nanobot with nothing to do. There are two fixes for this:
- Have a spare/new nanobot which is positioned at the leftmost access
   point sometime during the replacement sweep, so at the start of
   the next sweep it's already ready to start, and retire the
   rightmost nanobot which ran off the end:
Final1: Empty-------------Nanobot1----------Nanobot2----------Nanobot3Done
Start2: NewNanobot--------Nanobot2----------Nanobot3----------Empty
Sweep2: Empty-->NewNanobot>------>Nanobot1>-------->Nanobot2>-Empty
Final2: Empty-------------NewNanobot-------Nanobot1----------Nanobot2Done
- Alternate direction of sweeping:
Start2: Empty-------------Nanobot1----------Nanobot2----------Nanobot3
Sweep2: Empty--<Nanobot1<--------<Nanobot2<--------<Nanobot3<-Empty
Final2: Nanobot1----------Nanobot2----------Nanobot3----------Empty

Meanwhile, I've proposed contracting otherwise homeless/unemployed
people to stand guard over neighborhoods and copper resources to
alert police to all sorts of crime that may happen. This guarding
could be partly in-the-field patrols with cellphones with video
cameras and partly in-office staff watching via video surveillance,
depending on which is most effective in any given locale.


From: Kirk  Sluder
Subject: Re: Next Generation of Language
Date: 
Message-ID: <kirk-3FE815.09101109012007@newsclstr02.news.prodigy.com>
In article <························@i15g2000cwa.googlegroups.com>,
 "Tim Bradshaw" <··········@tfeb.org> wrote:

> I don't doubt that multicore processors will make it into desktops in a
> big way.  But they will make it into desktops for reasons which are not
> what you might think, and in particular are substantially not due to
> the computational demands of desktop applications.

Well, just speaking as a Mac user, the benefits of multi-core CPUs
for desktop applications beyond simulating ecosystems have been noted
by consumers.  In addition to consumer/professional applications that
are designed for multiple processors (such as photoshop) doing just
about anything in a modern operating system involves multiple threads
competing for CPU time.  

If anything is true about computing it's that users find ways to
(occasionally) soak up any available power in their price range.  One
of the areas where I'm seeing consumer growth/interest is multimedia
development.  So "desktop application" is expanding to include video
authoring and DVD mastering.  Five years ago these were applications
where I used to put a sign on a computer saying, "compressing--don't
touch" overnight.

I will agree that price and heat considerations are nice bonuses.  
But most consumers are not that interested in heat provided that 
they don't get burns from their laptop.  They do pay attention to 
benchmarks such as MSWord scrolling, photoshop filter application or 
zip file creation.
From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168415128.371873.262730@p59g2000hsd.googlegroups.com>
Kirk Sluder wrote:

>
> Well, just speaking as a Mac user, the benefits of multi-core CPUs
> for desktop applications beyond simulating ecosystems have been noted
> by consumers.

I suspect you are confusing several issues here.

- The first multicore (in a single socket) Macs were the Intel boxes, so
there are multiple near-simultaneous performance changes tangled up -
significantly higher single-core performance, probably significantly
better compilation technology, and multiple cores per die.

- it is far, far easier to find something for a 2 core box (whether
multiple cores per socket or multiple sockets) to do than to find
something for an 8, 16, 32 core box to do.

- It is easy to design a 2 core system in such a way that it has
reasonable performance - you've been able to get 2 core systems (in the
form of 2 socket systems) from box-shifters like Dell (and actually
Apple I think - when did they start shipping 2 socket PPC machines?)
for a very long time, but no one who isn't doing serious design has
been shipping systems with more than 4 cores.  This will change as it
becomes cheap to put large numbers of cores in a box.

- The consumer market is notoriously bad at making judgements about
performance and what influences it.  No, that's wrong: the 10s-to-100s
of k per box market is notoriously bad at making judgements about
performance and what influences it; the consumer market is just even
worse.

> In addition to consumer/professional applications that
> are designed for multiple processors (such as photoshop) doing just
> about anything in a modern operating system involves multiple threads
> competing for CPU time.

Well, they're competing for something, for sure.  That something is
probably, in general, memory access rather than actual CPU time.
Unfortunately there are almost no tools available which show how much
time a processor spends stalled waiting for memory rather than doing
anything useful - certainly all the standard tools show that time as
the processor being busy.  You can deal with this with a single core if
it's multithreaded (by which I mean that the core itself selects which
thread to execute on each clock cycle, based typically on whether
memory accesses have completed).  Such multithreaded cores have existed
for a while - the Tera MTA was a well-known example in the HPC arena,
and Sun's Niagara also has multiple threads per core (4, with I think 8
or 16 in the next generation design).  Unfortunately I suspect you need
OS & compiler support to take advantage of these systems.

> I will agree that price and heat considerations are nice bonuses.
> But most consumers are not that interested in heat provided that
> they don't get burns from their laptop.

However they do care about things like battery life, noise, and system
cost which correlate quite well with power consumption.  And they
*will* care about power consumption (even the Americans) when the
systems start costing significantly more than their purchase cost to
run for a year.

--tim
From: Robert Uhl
Subject: Re: Next Generation of Language
Date: 
Message-ID: <m3wt3umm9j.fsf@latakia.dyndns.org>
"Tim Bradshaw" <··········@tfeb.org> writes:
>
> However they do care about things like battery life, noise, and system
> cost which correlate quite well with power consumption.  And they
> *will* care about power consumption (even the Americans) when the
> systems start costing significantly more than their purchase cost to
> run for a year.

How long until that's the case?  I just built a new box with a Pentium D
(said box is never turned off, ever), and the gas & electricity bill for
my entire home is still around $40-$60/month, depending on the season of
the year.  And I'm a homebrewer, which means that I spend a significant
amount of electricity heating 6 1/2 gallons of liquid and boiling it
down to 5 1/4 gallons.  Oh, and it's winter here in Denver, so I have to
heat my home.

-- 
Robert Uhl <http://public.xdi.org/=ruhl>
I have sustained a continual bombardment & cannonade for 24 hours & have
not lost a man.  The enemy has demanded a surrender at discretion,
otherwise the garrison are to be put to the sword if the fort is taken.
I have answered the demand with a cannon shot, and our flag still waves
proudly from the walls.  I shall never surrender nor retreat.
                                         --William B. Travis
From: John Thingstad
Subject: Re: Next Generation of Language
Date: 
Message-ID: <op.tly7oytxpqzri1@pandora.upc.no>
On Thu, 11 Jan 2007 01:37:28 +0100, Robert Uhl <·········@NOSPAMgmail.com>  
wrote:

> "Tim Bradshaw" <··········@tfeb.org> writes:
>>
>> However they do care about things like battery life, noise, and system
>> cost which correlate quite well with power consumption.  And they
>> *will* care about power consumption (even the Americans) when the
>> systems start costing significantly more than their purchase cost to
>> run for a year.
>
> How long until that's the case?  I just built a new box with a Pentium D
> (said box is never turned off, ever), and the gas & electricity bill for
> my entire home is still around $40-$60/month, depending on the season of
> the year.  And I'm a homebrewer, which means that I spend a significant
> amount of electricity heating 6 1/2 gallons of liquid and boiling it
> down to 5 1/4 gallons.  Oh, and it's winter here in Denver, so I have to
> heat my home.
>

Well, modern computers come with power-saving features.
Max consumption on my machine is about 400 W.
For a machine with dual graphics boards consumption can be as high as
1000 W.
But average consumption is much lower, more like 40 W.
So, about as much as a light bulb.

-- 
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
From: Kirk  Sluder
Subject: Re: Next Generation of Language
Date: 
Message-ID: <kirk-06A62E.14341410012007@newsclstr02.news.prodigy.com>
In article <························@p59g2000hsd.googlegroups.com>,
 "Tim Bradshaw" <··········@tfeb.org> wrote:

> Kirk Sluder wrote:
> 
> >
> > Well, just speaking as a Mac user, the benefits of multi-core CPUs
> > for desktop applications beyond simulating ecosystems have been noted
> > by consumers.
> 
> I suspect you are confusing several issues here.
> 
> - The first multicore (in a single socket) Macs were the Intel boxes, so
> there are multiple near-simultaneous performance changes tangled up -
> significantly higher single-core performance, probably significantly
> better compilation technology, and multiple cores per die.

The first generation of Apple Intel hardware did offer a mix of 
single-core and dual-core processors.  At that time there were 
comparisons made and discussion about the impact of two processor 
vs. one processor systems.  (And as you point out, discussion of 
multiple CPUs in a system isn't new.)

> However they do care about things like battery life, noise, and system
> cost which correlate quite well with power consumption.  And they
> *will* care about power consumption (even the Americans) when the
> systems start costing significantly more than their purchase cost to
> run for a year.

True.  I still think that some of the demand is driven by mainstream 
adoption of increasingly resource-hungry applications that had 
previously required some pretty expensive custom hardware.  

> 
> --tim
From: Alex Mizrahi
Subject: Re: Next Generation of Language
Date: 
Message-ID: <45a797cf$0$49205$14726298@news.sunsite.dk>
(message (Hello 'Tim)
(you :wrote  :on '(9 Jan 2007 23:45:28 -0800))
(

 TB> Unfortunately there are almost no tools available which show how much
 TB> time a processor spends stalled waiting for memory rather than doing
 TB> anything useful - certainly all the standard tools show that time as
 TB> the processor being busy.

what about Intel's VTune profiler for Intel processors?
there's also AMD's profiler for AMD processors (not as cool, though).

)
(With-best-regards '(Alex Mizrahi) :aka 'killer_storm)
"People who lust for the Feel of keys on their fingertips (c) Inity") 
From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168614916.552271.303700@s34g2000cwa.googlegroups.com>
Alex Mizrahi wrote:

> what about Intel's VTune profiler for Intel processors?
> there's also AMD's profiler for AMD processors (not as cool, though).

I don't know about these, but yes, there are profilers of course - to
be useful they typically need to be able to get access to various
counters in the processor which let them know how many instructions are
stalling, what's happening to the caches, etc.  What I really meant was
that a lot of the tools people use to decide where the time is going
don't usefully tell you whether the system is waiting for memory all
the time or whether it's actually really busy.
From: ············@gmail.com
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168760963.454856.73870@11g2000cwr.googlegroups.com>
Tim Bradshaw wrote:
> I don't know about these, but yes, there are profilers of course - to
> be useful they typically need to be able to get access to various
> counters in the processor which let them know how many instructions are
> stalling, what's happening to the caches, etc.  What I really meant was
> that a lot of the tools people use to decide where the time is going
> don't usefully tell you whether the system is waiting for memory all
> the time or whether it's actually really busy.

A number of profilers do (Intel's VTune among them, I imagine) --
you can count cache misses and compare them with the number of loads
and stores to get an idea of whether your application is successfully
exploiting locality.  If you use the PAPI library, you can get that
information without paying for VTune.
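
To make the locality check concrete, here is a minimal sketch in Lisp
(this being c.l.l) of the arithmetic involved; the counter values are
assumed to come from PAPI presets such as PAPI_L2_TCM, PAPI_LD_INS and
PAPI_SR_INS, read around the code region of interest:

  (defun cache-miss-ratio (misses loads stores)
    ;; Fraction of memory accesses that missed the cache; the lower
    ;; the ratio, the better the code is exploiting locality.
    (/ misses (+ loads stores)))

  ;; e.g. (cache-miss-ratio 1000000 80000000 20000000) => 1/100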

mfh
From: Marc Battyani
Subject: Re: Next Generation of Language
Date: 
Message-ID: <6JydneCC2I9CNz7YnZ2dnUVZ8saonZ2d@giganews.com>
"Tim Bradshaw" <··········@tfeb.org> wrote
>
> I suspect it will top out long before that.  It's amazing (really, it
> is a deeply spectacular feat of engineering) what can be got over a
> phone line, but there are limits.  More to the point latency is not
> something you can make vanish.
[...]
> Sorry, they will.  The memory wall is emphatically not going away.

It's getting even worse as the bandwidth increases. Just look at DDR3
and FB-DIMM.
Memory access latency is already the limiting factor in most cases
(except for LINPACK and that kind of benchmark).
For HPC applications, let's switch to FPGA+QDRII+DSLs written in Lisp!
(DSL = Domain Specific Language, not ADSL ;-)

Marc
From: Juan R.
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168508395.688810.68410@i56g2000hsf.googlegroups.com>
Tim Bradshaw wrote:

> So for instance: do a commercially significant proportion of desktop
> users spend their time simulating ecosystems, the weather, galaxies
> etc?  I suggest that they do not, and they will not.

Maybe yes, in some irrelevant MSWord 2017 assistant [remember the
current Earth assistant in Word 2000].
From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168509013.881688.101260@i56g2000hsf.googlegroups.com>
Juan R. wrote:

>
> Maybe yes, in some irrelevant MSWord 2017 assistant [remember the
> current Earth assistant in Word 2000].

I dunno much about Word I'm afraid.  But of course, you're nearly
right.  It won't be Word though, it will be systems which have been
taken over by botnets, which botnets will be fighting computational
arms races with anti-spam systems.  By 2017 essentially all the world's
computing resources will have been taken over by the battle between
spam/antispam & virus/antivirus systems, with the remainder used up by
video games.  Humans will be extinct.

--tim
From: Szabolcs Szucs
Subject: Re: Next Generation of Language
Date: 
Message-ID: <s7mac0pzx36.fsf@login09.caesar.elte.hu>
Hi,

"Tim Bradshaw" <··········@tfeb.org> writes:

> [...]
> arms races with anti-spam systems.  By 2017 essentially all the world's
> computing resources will have been taken over by the battle between
> spam/antispam & virus/antivirus systems, with the remainder used up by

Spam isn't a problem if one can use PGP-like systems and other
channels to maintain trust values.

Who cares about viruses on LispOS? :) Viruses are OS- and
architecture-dependent entities. Diversity in this domain is going to
solve this.

> video games.  Humans will be extinct.
>
> --tim
>

=--= 
Szucs Szabolcs Laszlo
(36 30) 228 42 40




9BF6 00E9 0CA9 B3A8 5234  03DB 3FB3 F85B 0033 74D7
http://keyserver.noreply.org/pks/lookup?search=kotee%40elte.hu&fingerprint=on
From: Chris Barts
Subject: Re: Next Generation of Language
Date: 
Message-ID: <pan.2007.01.11.11.17.03.666996@tznvy.pbz>
On Thu, 11 Jan 2007 11:17:33 +0100, Szabolcs Szucs wrote:

> 
> Who cares about viruses on LispOS? :) Viruses are OS- and
> architecture-dependent entities. Diversity in this domain is going to
> solve this.
> 

And mandatory DRM is going to solve diversity. Why should it be legal for
anyone to own a computer that can do any more than
Sony/BMI/Disney/Vivendi/Universal/AOL/TimeWarner say it should?

-- 
My address happens to be com (dot) gmail (at) usenet (plus) chbarts,
wardsback and translated.
It's in my header if you need a spoiler.


From: Robert Maas, see http://tinyurl.com/uh3t
Subject: Re: Next Generation of Language
Date: 
Message-ID: <rem-2007nov09-007@yahoo.com>
> From: Szabolcs Szucs <·····@elte.hu>
> Spam isn't a problem if one can use PGP-like systems and other
> channels to maintain trust values.

Do you know of any SMTP server which filters on the basis of PGP
signatures from trusted known senders, issuing 5yz rejection codes
for all anonymous/untrusted traffic? It is *not* acceptable to
accept all e-mail and then divert the 99% which is spam to a side
folder which will never be examined, because false positives hang
forever, with the sender wondering why the recipient never answered
and the recipient wondering why he hasn't heard from the sender for a
long time (or, in the case of a first contact, the recipient never
becoming aware that the sender was attempting to open communication).
At least with a 5yz rejection, the sender knows promptly that the
e-mail didn't go through, and can try an alternate communication
method.

> Who cares about viruses on LispOS? :) Viruses are OS- and
> architecture-dependent entities. Diversity in this domain is going to
> solve this.

Have you considered my proposal for CGI/PHP applications to act as
alternate e-mail servers, whereby each Web master can set up
his/her own individual standards for accepting/rejecting e-mail,
with many different individualized types of Turing tests as
qualification for sending e-mail and getting it accepted by the server,
thereby keeping sufficient diversity to prevent defeat by spambot authors?


From: Juan R.
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168506432.758008.67180@77g2000hsv.googlegroups.com>
Pascal Bourguignon wrote:

> > Neural Networks,
> >
> > To what end?
>
> To do your job in your place.  In ten years, we'll have enough
> processing power and memory in desktop computers to model a whole
> human brain.  Better have parallel processors then, if you want to
> emulate one at an acceptable speed.

From where did you get that data?

So far as I know, the prediction is that at some time in the second
half of this century the fastest supercomputers could offer us only a
1000 s MD simulation of an _E. coli_ (~ 10^10 heavy atoms). MD
simulations are very inexpensive and rough. The prediction suggests no
accurate _ab initio_ model will be available this century.
From: John Thingstad
Subject: Re: Next Generation of Language
Date: 
Message-ID: <op.tlzf7da9pqzri1@pandora.upc.no>
On Thu, 11 Jan 2007 10:07:12 +0100, Juan R.  
<··············@canonicalscience.com> wrote:

>
> Pascal Bourguignon wrote:
>
>> > Neural Networks,
>> >
>> > To what end?
>>
>> To do your job in your place.  In ten years, we'll have enough
>> processing power and memory in desktop computers to model a whole
>> human brain.  Better have parallel processors then, if you want to
>> emulate one at an acceptable speed.
>
> From where did you get that data?
>
> So far as I know, the prediction is that at some time in the second
> half of this century the fastest supercomputers could offer us only a
> 1000 s MD simulation of an _E. coli_ (~ 10^10 heavy atoms). MD
> simulations are very inexpensive and rough. The prediction suggests
> no accurate _ab initio_ model will be available this century.
>

My suggestion is to forget Moore's law.
The rate of increase in computing power has been decreasing for some time.
Growth is no longer exponential but linear.
Say, a quad-core CPU has 180% the speed of a single core.
See Amdahl's law (Wikipedia).
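
As a quick sketch in Lisp (the 60% parallelizable fraction below is my
assumption, chosen because it reproduces the 180% figure):

  (defun amdahl-speedup (n p)
    ;; Amdahl's law: speedup on N cores when a fraction P of the
    ;; work parallelizes perfectly.
    (/ 1 (+ (- 1 p) (/ p n))))

  ;; (amdahl-speedup 4 0.6) => 1.8181819, i.e. ~180% of a single core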

-- 
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
From: Juan R.
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168517117.407750.117690@i39g2000hsf.googlegroups.com>
John Thingstad wrote:

> > So far as I know, the prediction is that at some time in the second
> > half of this century the fastest supercomputers could offer us only
> > a 1000 s MD simulation of an _E. coli_ (~ 10^10 heavy atoms). MD
> > simulations are very inexpensive and rough. The prediction suggests
> > no accurate _ab initio_ model will be available this century.
> >
>
> My suggestion is to forget Moore's law.
> The rate of increase in computing power has been decreasing for some time.
> Growth is no longer exponential but linear.
> Say, a quad-core CPU has 180% the speed of a single core.
> See Amdahl's law (Wikipedia).

Yes, the prediction assumed an exponential rise. Of course, if the
rise is linear in the next decades then we will not see an MD
simulation of E. coli in this century, and no _ab initio_ simulation
for a very, very long time.

I believe that the law you cite assumes the parallelizable part of the
task is of order N^0, and I do not agree, because it does not account
for synchronization issues between cores, which are N-dependent.

Topologically, performance is approximately

[N * 100] - [b * [ [N * [N - 1]] / 2] ]

Assuming a 60-70% gain for general tasks on the Opteron dual-core [1],
one obtains b = 35.

Therefore for quad systems performance would be of the order of 190%
for the Opteron.

A more realistic formula might account for the N-dependence of b and
for the real topological design; for instance, if the topology for 8
cores is

   C-C-C-C
   |x| |x|
   C-C-C-C

then the above formula does not apply. Performance would be of the
order of 240%.

Note that for two perfectly parallel quad systems it would be
190 * 2 = 380%.


[1]
http://www-03.ibm.com/servers/eserver/opteron/pdf/IBM_dualcore_whitepaper.pdf
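
As a sketch, the formula above in Lisp (with b = 35 as estimated from
the dual-core figure):

  (defun topo-performance (n b)
    ;; Estimated aggregate performance, as a percentage of one core,
    ;; for N cores with synchronization cost B per pair of cores.
    (- (* n 100) (* b (/ (* n (1- n)) 2))))

  ;; (topo-performance 2 35) => 165  ; dual core: the ~65% gain from [1]
  ;; (topo-performance 4 35) => 190  ; the quad-core figure above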
From: John Thingstad
Subject: Re: Next Generation of Language
Date: 
Message-ID: <op.tlzpmjyppqzri1@pandora.upc.no>
On Thu, 11 Jan 2007 13:05:17 +0100, Juan R.  
<··············@canonicalscience.com> wrote:

Nop. The numbers are gotten from Amdahl's law.
They are also the numbers Intel use! (180%)
Yes, they are approximate. Obviously.


-- 
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
From: Juan R.
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168591153.676220.47550@q2g2000cwa.googlegroups.com>
John Thingstad wrote:

> On Thu, 11 Jan 2007 13:05:17 +0100, Juan R.
> <··············@canonicalscience.com> wrote:
>
> Nop. The numbers are gotten from Amdahl's law.
> They are also the numbers Intel use! (180%)
> Yes, they are approximate. Obviously.

I estimated b = 35 from AMD Opteron data on dual-cores [1].

From that parameter and the topological performance

[N * 100] - [b * [ [N * [N - 1]] / 2] ]

I got 190% for the AMD quad-core. It is close to the 180% you claim for
an Intel quad-core. If b = 36.67 for the Intel part then you would get
the 180%.

If you compute the speedup from the above formula you obtain

N - [ [[b * N] / 200] * [N - 1] ]

which looks like Gustafson's law [2] with alpha = [b * N] / 200.

The N-dependence of alpha is a function of the explicit expression for
b = b(N), which depends on both the technology and the algorithm.

For a small number of cores, both laws may provide 'equivalent'
answers. But I know of published reports of further violations of
Amdahl's law at 1024 cores. I do not know of violations of
Gustafson's law.
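
As a quick Lisp check (a sketch, using the b = 35 estimated earlier)
that the two expressions agree:

  (defun speedup-from-b (n b)
    ;; Speedup N - [[b * N] / 200] * [N - 1], i.e. the performance
    ;; formula above divided by 100.
    (- n (* (/ (* b n) 200) (1- n))))

  (defun gustafson-speedup (n alpha)
    ;; Gustafson's law: S = N - alpha * [N - 1].
    (- n (* alpha (1- n))))

  ;; (speedup-from-b 4 35)                   => 19/10
  ;; (gustafson-speedup 4 (/ (* 35 4) 200))  => 19/10, the same,
  ;;                                            with alpha = 7/10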

In any case, I think the main idea in the thread is that
parallelization is not good for general usage, because with four cores
one generally obtains less than double the power of a single core, and
a dual core adds less than 3/4 over a single one. This is the main
reason that 2- and 4-core Intel and AMD parts are more popular - and
economically viable - than hypothetical 8-, 16-, or 32-core units.

About the question of LISP being used in scientific computing, I know
of no scientist or mathematician working on high-performance
algorithms (e.g. in computational chemistry) who uses LISP. I always
read Fortran code.

P.S.: What do you mean by "nop"? No-op?


[1]
http://www-03.ibm.com/servers/eserver/opteron/pdf/IBM_dualcore_whitepaper.pdf

[2]  http://en.wikipedia.org/wiki/Gustafson%27s_law
From: Robert Maas, see http://tinyurl.com/uh3t
Subject: Re: Next Generation of Language
Date: 
Message-ID: <rem-2007nov09-004@yahoo.com>
> From: Pascal Bourguignon <····@informatimago.com>
> >> Simulations,
> > Of what?
> Of anything.  Galaxies, planets, weather, ecosystems, animals, cells,
> nanobots, chemicals, particles, etc.

It seems to me that it would be entirely sufficient to run the
actual simulation on a server farm and then pipe the video feed to
the user over any ordinary NTSC/PAL/SECAM or HDTV circuit. No need
to run the simulation on a "desktop" computer, right?

> Let me see, in my far out corner of the net, my ISP doubled my ADSL
> speed every year (without me asking anything even). In ten years, I
> should have here 2Gb/s of Internet bandwidth.  I don't think 64 cores
> will be too many to handle that.

Nitpick: Past performance does not guarantee future performance,
on the stock market, or in customer relations/services.
But yes I tentatively agree (if the trend does actually continue).

> In ten years, we'll have enough processing power and memory in
> desktop computers to model a whole human brain.

Does Hans Moravec agree with this prediction?

> >> Indeed, it would be nice to add several parallel memory buses, or just
> >> have big L2 or L3 caches.
> >> With a 64MB L2 cache PER core and 64 cores, you have 4GB of RAM on chip.
> > There are lots of reasons why that sort of thing is hard and expensive.
> They won't be anymore.

I'm just imagining here: 1 GB fast RAM on each CPU semi-chip, 16 or
32 semi-chips integrated into each mega-chip, 8 mega-chips in a box
with power supply etc., 64 GB slow RAM in the box, virtual memory
swapping pages of slow box-RAM into fast CPU-RAM, so if your
effective working set per CPU is less than 1 GB you don't thrash.
Then gigantic-capacity hard disk with second level of swapping
between slow box-RAM and disk, so if total working set is less than
64 GB you don't thrash there either. How feasible is that at what
cost today/RSN?
From: Rob Thorpe
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168338898.973500.89020@51g2000cwl.googlegroups.com>
Tim Bradshaw wrote:
> Pascal Bourguignon wrote:
>
> > (game) AI.
>
> I know nothing about games really, but I'd lay odds that the thing that
> consumes almost all the computational resource is rendering. See above.

The rendering does, but most of it is performed in the graphics
hardware.
Of the remaining runtime a large proportion is consumed by game "AI".
I remember reading that one recent first-person shooter used 50% of
its runtime for the AI controlling the movement of the enemies.

Whether this type of AI actually is AI of course varies from game to
game.
From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168342188.993041.326220@i15g2000cwa.googlegroups.com>
Rob Thorpe wrote:

> The rendering does, but it performs most of it in the graphics
> hardware.

That was my point: the computationally demanding part of the
application *already* lives in special-purpose hardware, and will
continue to do so.  And I suggest that as games become closer and
closer to being photorealistic, the rendering will consume vastly more
resources, almost none of which can usefully be provided by a general
purpose multicore CPU.

> Of the remaining runtime a large proportion is consumed by game "AI".

Yes, and the issue is: how much more of that is needed?  I suggest the
answer might be `much much less than is needed in the rendering'.

--tim
From: pTymN
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168444589.282107.325500@i56g2000hsf.googlegroups.com>
I work in the video games industry, and I think that multicore
processors are going to kill the PPU (physics processing unit) cards
that Ageia is trying to release. For the foreseeable future, more
realistic collision detection and particle based physics will happily
consume as many processors as we can throw at the problem. It will not
be cheap to add interactive fluids to a game, and this is one problem
that requires fairly random memory access, so GPUs won't be as useful.

I work on Gamebryo, and we recently parallelized our physics and
collision libraries. Triangle mesh to triangle mesh collisions are
computationally expensive and can be done in parallel.
From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168455717.815364.106930@k58g2000hse.googlegroups.com>
pTymN wrote:

> I work in the video games industry, and I think that multicore
> processors are going to kill the PPU (physics processing unit) cards
> that Aegia is trying to release. For the foreseeable future, more
> realistic collision detection and particle based physics will happily
> consume as many processors as we can throw at the problem.

How good is the locality of these codes?  I suspect that one feature of
multicore desktop systems will be that they will be *extremely* short
of memory bandwidth, because that's expensive to provide.

--tim
From: Chris Barts
Subject: Re: Next Generation of Language
Date: 
Message-ID: <pan.2007.01.11.11.18.29.948318@tznvy.pbz>
On Wed, 10 Jan 2007 11:01:57 -0800, Tim Bradshaw wrote:

> these codes

How many people have forgotten that 'code' is a mass noun and, as such,
does not take plurals? Do you also say 'these muds' and 'these dusts'?

-- 
My address happens to be com (dot) gmail (at) usenet (plus) chbarts,
wardsback and translated.
It's in my header if you need a spoiler.


From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168515479.220367.4100@i39g2000hsf.googlegroups.com>
Chris Barts wrote:

>
> How many people have forgotten that 'code' is a mass noun and, as such,
> does not take plurals? Do you also say 'these muds' and 'these dusts'?

How many people have forgotten that *language changes over time* and is
not something handed down from the elder days, never to be changed?
The sense of `codes' I gave is very common in the HPC community where
"a code" typically refers to something approximating to a particular
implementation of an algorithm.  The plural use, which is more common,
means something like "implementations of algorithms".  Thus for
instance a paper title from 2001: "High-Performance Java Codes for
Computational Fluid Dynamics", and many other examples.  Note that this
use is quite different from the mass-noun use: "this code" would refer
to a *single* program, or some chunk of code from it, while "these
codes" would refer to a number of programs, or rather to their
computational kernels.

And in fact, this usage is fairly similar to the way you might use
"these muds" say, to refer to a number of different kinds of mud.  I'm
quite sure I could find such usages in the geological literature, since
I'm already aware of usages like "these shales" to refer to different
kinds of shale.

--tim
From: Rob Warnock
Subject: Re: Next Generation of Language
Date: 
Message-ID: <cb2dncx2oPyulDrYnZ2dnUVZ_obinZ2d@speakeasy.net>
Tim Bradshaw <··········@tfeb.org> wrote:
+---------------
| Chris Barts wrote:
| > How many people have forgotten that 'code' is a mass noun and, as such,
| > does not take plurals? Do you also say 'these muds' and 'these dusts'?
| 
| How many people have forgotten that *language changes over time* and is
| not something handed down from the elder days, never to be changed?
| The sense of `codes' I gave is very common in the HPC community where
| "a code" typically refers to something approximating to a particular
| implementation of an algorithm.  The plural use, which is more common,
| means something like "implementations of algorithms".
+---------------

Yup. Far too much of the HPC market consists of simply rerunning 1960's
"dusty deck" codes with different inputs and larger array dimensions.


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: ············@gmail.com
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168635134.640423.237430@m58g2000cwm.googlegroups.com>
Rob Warnock wrote:
> Yup. Far too much of the HPC market consists of simply rerunning 1960's
> "dusty deck" codes with different inputs and larger array dimensions.

Um, not in my world ;)

1. First of all, the 1960's codes probably won't even compile, since
that's pre-Fortran 77 ;P

2. The parallel codes tend to use MPI which is a creation of the early
90's.  There is a strong push to develop parallel HPC languages that
give a shared-memory model without hiding the costs of communication.
DARPA is paying big bucks to several different companies (Sun, IBM,
Cray) to work on this.

3. The _interfaces_ might be old (BLAS, LAPACK) but are constantly
under development -- people demand new routines all the time, and
_they_get_them_.

4. The algorithms are continually evolving -- for example, if the
1960's code was for solving an elliptic PDE, it was probably using some
relaxation iteration to solve the linear system.  Recent codes probably
use multigrid, which is _way_ more efficient.

so yeah, don't diss the HPC world, we're the ones who'll figure out how
to use all those cores that are going to be on your desktop in the next
few years ;P

with friendliness
mfh
From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168640409.714698.233110@11g2000cwr.googlegroups.com>
············@gmail.com wrote:

> Rob Warnock wrote:
> > "dusty deck" codes
>
> 1960's codes
>
> parallel codes
>
> Recent codes

OK, I think this makes the point that "codes" is a common usage in the
HPC community.  I will expect delivery of Chris Barts' head on a
platter tomorrow morning.  You can do what you want with the rest of
him.
From: Chris Barts
Subject: Re: Next Generation of Language
Date: 
Message-ID: <pan.2007.01.13.05.55.03.498015@tznvy.pbz>
On Fri, 12 Jan 2007 14:20:11 -0800, Tim Bradshaw wrote:

> 
> OK, I think this makes the point that "codes" is a common usage in the
> HPC community.  I will expect delivery of Chris Barts' head on a
> platter tomorrow morning.  You can do what you want with the rest of
> him.

Doesn't matter. It merely means a lot of people are wrong, and a lot of
people need frying.

-- 
My address happens to be com (dot) gmail (at) usenet (plus) chbarts,
wardsback and translated.
It's in my header if you need a spoiler.


From: Chris Barts
Subject: Re: Next Generation of Language
Date: 
Message-ID: <pan.2007.01.13.05.56.28.181008@tznvy.pbz>
On Thu, 11 Jan 2007 03:37:59 -0800, Tim Bradshaw wrote:

> Chris Barts wrote:
> 
>>
>> How many people have forgotten that 'code' is a mass noun and, as such,
>> does not take plurals? Do you also say 'these muds' and 'these dusts'?
> 
> How many people have forgotten that *language changes over time* and is
> not something handed down from the elder days, never to be changed?

"Like, wow, dude! Language is whatever I say it is! Crumb buttercake up
the windowpane with the black shoehorn butterhorse!"

Grow up.

-- 
My address happens to be com (dot) gmail (at) usenet (plus) chbarts,
wardsback and translated.
It's in my header if you need a spoiler.


From: Kirk  Sluder
Subject: Re: Next Generation of Language
Date: 
Message-ID: <kirk-20FF6B.02404113012007@newsclstr03.news.prodigy.net>
In article <······························@tznvy.pbz>,
 Chris Barts <··············@tznvy.pbz> wrote:

> On Thu, 11 Jan 2007 03:37:59 -0800, Tim Bradshaw wrote:
> 
> > Chris Barts wrote:
> > 
> >>
> >> How many people have forgotten that 'code' is a mass noun and, as such,
> >> does not take plurals? Do you also say 'these muds' and 'these dusts'?
> > 
> > How many people have forgotten that *language changes over time* and is
> > not something handed down from the elder days, never to be changed?
> 
> "Like, wow, dude! Language is whatever I say it is! Crumb buttercake up
> the windowpane with the black shoehorn butterhorse!"

Actually, to be pedantic, 'code' is a collective noun similar to 
'government,' 'people' or 'team' with the appropriate definition for 
this context being: "Any system of symbols and rules for expressing 
information or instructions in a form usable by a computer or other 
machine for processing or transmitting information."  

My OED cites plural uses going back to the mid-18th century:
1735: Larger far Than civil codes with all their glosses are. 
1818: The different German tribes were first governed by codes of 
laws formed by their respective chiefs. 
1875: Maritime codes of signals.

And finally in cybernetics
1970: A saving in computer time..compared with the discrete ordinate 
codes NIOBE and STRAINT.

This makes sense given that the derivation is from CODEX or book, 
and the 14th century use of the term focuses on the code of laws 
created by specific Roman Emperors (presumably in contrast with 
codes created by other Roman emperors.)  

And in fact, the OED has a specific definition for the plural "muds,"

"2. In pl. Tracts of mud on the margin of a tidal river; mudflats."

1755: Near half a mile front on said river of flats or muds, which 
yields extraordinary pasture at all seasons. 
1902: There are still no flounders on the famous Bishop's Muds.

Along with comparative use of "muds" as a plural:
1885: Encycl. Brit. XVIII. 122/1 At some points in the same regions 
are found green muds and sands, which, as regards their 
origin..resemble the blue muds.

There is even documented use of "dusts" in there to compare between 
types of dusts.  "Of the many dusts tested, wheat dust was most 
flammable."

Of course I'm a pretty strong descriptivist who feels that the 
pragmatics of language do more to keep things stable than 
well-meaning but misguided grammar wonks who don't understand what 
they seek to defend.  So I'll just point out that you are arguing 
against about 250 years of formal use on this one.
From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168701007.742167.194760@l53g2000cwa.googlegroups.com>
Kirk Sluder wrote:

> My OED cites plural uses going back to the mid-18th century:
> 1735: Larger far Than civil codes with all their glosses are.
> 1818: The different German tribes were first governed by codes of
> laws formed by their respective chiefs.
> 1875: Maritime codes of signals.

Yes, I should have thought of things like "legal code"/"legal codes",
which are clearly evidence of plural use going back a long time ...
>
> And finally in cybernetics
> 1970: A saving in computer time..compared with the discrete ordinate
> codes NIOBE and STRAINT.

... and that might be the HPC usage, though it depends exactly on what
NIOBE and STRAINT were - I suppose one could look up the OED citation
if one cared.

> And in fact, the OED has a specific definition for the plural "muds,"

Heh

> There is even documented use of "dusts" in there to compare between
> types of dusts.  "Of the many dusts tested, wheat dust was most
> flammable."

That's exactly the usage I would have expected.

> well-meaning but misguided grammar wonks who don't understand what
> they seek to defend.

He won't like split infinitives either.  Mind you, I own both of the
Fowlers' books and have read them more-or-less end to end.  Mostly, I
think, because they're so nicely written.

--tim
From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168701139.573700.122790@a75g2000cwd.googlegroups.com>
Chris Barts wrote:

> "Like, wow, dude! Language is whatever I say it is! Crumb buttercake up
> the windowpane with the black shoehorn butterhorse!"

I'm afraid I can make neither head nor tail of your curious colonial
speech.
From: Chris Barts
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168752780_1232@corp.com>
Tim Bradshaw <··········@tfeb.org> wrote on Saturday 13 January 2007 08:12
in comp.lang.lisp <························@a75g2000cwd.googlegroups.com>:

> Chris Barts wrote:
> 
>> "Like, wow, dude! Language is whatever I say it is! Crumb buttercake up
>> the windowpane with the black shoehorn butterhorse!"
> 
> I'm afraid I can make neither head nor tail of your curious colonial
> speech.

You know, I never thought I could jerk your chain this effectively.

-- 
My address happens to be com (dot) gmail (at) usenet (plus) chbarts,
wardsback and translated.
It's in my header if you need a spoiler.

From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168786566.364341.270440@v45g2000cwv.googlegroups.com>
Chris Barts wrote:

>
> You know, I never thought I could jerk your chain this effectively.

Do you understand the phrase "taking the piss"?
From: John Thingstad
Subject: Re: Next Generation of Language
Date: 
Message-ID: <op.tlzpfloapqzri1@pandora.upc.no>
On Wed, 10 Jan 2007 16:56:29 +0100, pTymN <·········@gmail.com> wrote:

> I work in the video games industry, and I think that multicore
> processors are going to kill the PPU (physics processing unit) cards
> that Ageia is trying to release. For the foreseeable future, more
> realistic collision detection and particle based physics will happily
> consume as many processors as we can throw at the problem. It will not
> be cheap to add interactive fluids to a game, and this is one problem
> that requires fairly random memory access, so GPUs won't be as useful.
>
> I work on Gamebryo, and we recently parallelized our physics and
> collision libraries. Triangle mesh to triangle mesh collisions are
> computationally expensive and can be done in parallel.
>

Sorry!  This was never supposed to end up here..

-- 
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
From: ············@gmail.com
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168548447.698201.64250@77g2000hsv.googlegroups.com>
Tim Bradshaw wrote:
> Of course, all this is predicated on there being enough memory
> bandwidth that everything doesn't just starve.  I dunno how good
> current seriously-multicore systems are in this respect.

Intel's proposed 80-core architecture will have DRAM attached to each
core -- sort of how Cell has "local stores" attached to each SPE.
That's how they plan to solve the BW problem -- amortize it over all
the cores.

Basically it means that scicomp people like me get a huge job advantage
'cause we know how to deal with these issues (that's what i'm hoping
anyway ;P ).

mfh
From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168549437.833955.160460@i39g2000hsf.googlegroups.com>
············@gmail.com wrote:

> Intel's proposed 80-core architecture will have DRAM attached to each
> core -- sort of how Cell has "local stores" attached to each SPE.
> That's how they plan to solve the BW problem -- amortize it over all
> the cores.

Don't we call that `cache' normally?  (yes, I know, they'll be *big*
caches, but only big by today's standards, in the same sense that
today's machines have as much cache as yesterday's had main memory.)
From: George Neuner
Subject: Re: Next Generation of Language
Date: 
Message-ID: <k7vdq216cqf4jhpjourf8kqi53mh3hmh12@4ax.com>
On 11 Jan 2007 13:03:57 -0800, "Tim Bradshaw" <··········@tfeb.org>
wrote:

>············@gmail.com wrote:
>
>> Intel's proposed 80-core architecture will have DRAM attached to each
>> core -- sort of how Cell has "local stores" attached to each SPE.
>> That's how they plan to solve the BW problem -- amortize it over all
>> the cores.
>
>Don't we call that `cache' normally?  (yes, I know, they'll be *big*
>caches, but only big by today's standards, in the same sense that
>today's machines have as much cache as yesterday's had main memory.)

Well, on Cells the private memories are not cache but staging memories
... the main processor has to move data into and out of them on behalf
of the coprocessors.  It's very similar to the multi-level memory
system used on the old Crays, where the CPU had to fetch and organize
data to feed the array processors and store the results back to the
shared main memory.

AFAIK, no one has tried to offer a hardware solution to staging
computations in a distributed memory system since the KSR1 (circa
1990, which failed due to the company's creative bookkeeping rather
than the machine's technology).  Everyone now relies on software
approaches like MPI and PVM.

George
--
for email reply remove "/" from address
From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168594529.082616.310760@38g2000cwa.googlegroups.com>
George Neuner wrote:

> Well, on Cells the private memories are not cache but staging memories
> ... the main processor has to move data into and out of them on behalf
> of the coprocessors.  It's very similar to the multi-level memory
> system used on the old Cray's where the CPU had to fetch and organize
> data to feed the array processors and store the results back to the
> shared main memory.

It doesn't matter very much who moves the data, it's still cache :-).
The issue that counts, really, is what the programming model is at the
user level. No one should need to care whether things are done
automagically by the hardware as most L1/L2 caches are today, or by
hardware with substantial SW support as, say, MMUs, or almost entirely
by SW with some small amount of HW support, as, say, disk paging.
(Actually, the second thing that counts is whether the HW can
efficiently support the programming model you choose.)

>
> AFAIK, no one has tried to offer a hardware solution to staging
> computations in a distributed memory system since the KSR1 (circa
> 1990, which failed due to the company's creative bookkeeping rather
> than the machine's technology).  Everyone now relies on software
> approaches like MPI and PVM.

Well, I think they have actually, in all but name: that's essentially
what NUMA machines are.  Such machines are quite common, of course
(well, for bigger systems anyway): all Sun's recent larger machines (4
& 5-digit sunfire boxes) are basically NUMA, and it may be that smaller
ones are too.

Of course, as I said above, this comes down to programming model and
how much HW support you need for it.  I think the experience of the
last 10-20 years is that a shared memory model (perhaps "shared address
space"?), preferably with cache-coherency, is a substantially easier
thing to program for than a distributed memory model. Whether that will
persist, who knows (I suspect it will, for a surprisingly long time).
Of course the physical memory that underlies this model will become
increasingly distributed, as it already has to a great extent.

--tim
From: Ken Tilton
Subject: Re: Next Generation of Language
Date: 
Message-ID: <o7Mph.14$5s6.5@newsfe08.lga>
Tim Bradshaw wrote:
> George Neuner wrote:
> 
> 
>>Well, on Cells the private memories are not cache but staging memories
>>... the main processor has to move data into and out of them on behalf
>>of the coprocessors.  It's very similar to the multi-level memory
>>system used on the old Cray's where the CPU had to fetch and organize
>>data to feed the array processors and store the results back to the
>>shared main memory.
> 
> 
> It doesn't matter very much who moves the data, it's still cache :-).
> The issue that counts, really, is what the programming model is at the
> user level. 

http://common-lisp.net/project/cells/

<sigh>

kenny
From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168614358.225737.55610@i15g2000cwa.googlegroups.com>
Ken Tilton wrote:

> http://common-lisp.net/project/cells/

That's either a good pun or you weren't reading the thread carefully
enough.  Or perhaps you're a variant Kibo.
From: Ken Tilton
Subject: Re: Next Generation of Language
Date: 
Message-ID: <KeQph.369$3w7.279@newsfe12.lga>
Tim Bradshaw wrote:
> Ken Tilton wrote:
> 
> 
>>http://common-lisp.net/project/cells/
> 
> 
> That's either a good pun or you weren't reading the thread carefully
> enough.

Well that is certainly the case, but I /did/ see it originate with gosh 
what would we ever do with all these CPUs and continue with golly it is 
hard to parallelize applications and Cells /do/ automatically decompose 
a complex application into automatically parallelizable chunks of code 
and people interested in parallel programming /have/ suggested Cells 
could be a breakthru there and you /did/ ask about user programming. 
Perhaps this does not rise to the high level of seriousness, accuracy, 
and relevance required of articles posted to Usenet and especially 
c.l.l, but it was all I had.

kzo

-- 
The Dalai Lama gets the same crap all the time.
   -- Kenny Tilton on c.l.l when accused of immodesty
From: ············@gmail.com
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168634504.923403.257000@a75g2000cwd.googlegroups.com>
Tim Bradshaw wrote:
> It doesn't matter very much who moves the data, it's still cache :-).

Actually, for performance, it can help a lot (in the HPC world) if you
have software control of data movement.  I can dig up some citations if
you are interested.

I would argue that a good programming language for HPC applications
would let programmers know when they are about to do something
expensive.  Communication is way more expensive than arithmetic, no?

mfh
From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168639875.758355.205120@m58g2000cwm.googlegroups.com>
············@gmail.com wrote:


> Actually, for performance, it can help a lot (in the HPC world) if you
> have software control of data movement.  I can dig up some citations if
> you are interested.

Yes, of course, but the HPC world is an odd one.  A similar argument
(and this isn't sarcasm) says that for performance it can help a lot if
you don't use a dynamic/high-level language, avoid non-array datatypes
&c &c.  It can, but  for most application areas there are other
considerations.

>
> I would argue that a good programming language for HPC applications
> would let programmers know when they are about to do something
> expensive.  Communication is way more expensive than arithmetic, no?

Yes, I agree with this.  But HPC is, as I said, odd (though I'm
interested in it).  For most applications you want to have some idea of
what the performance model of the machine is like (which is beyond the
vast majority of programmers for a start), to write code which (if
performance is an issue, which very often it is not) should sit well
with that model, and then to allow the machine (in the `compiler, HW
and OS' sense) to do most of the boring work of, for instance, making
sure memory is local to threads &c &c.

I can see that I've now got completely sidetracked.  My original point
in this thread was that multicore systems will end up on desktops for
reasons which have rather little to do with performance, and now I'm
talking about HPC :-)  I will go and devour some minions.

--tim
From: ············@gmail.com
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168761414.963541.92910@11g2000cwr.googlegroups.com>
Tim Bradshaw wrote:
> Yes, of course, but the HPC world is an odd one.  A similar argument
> (and this isn't sarcasm) says that for performance it can help a lot if
> you don't use a dynamic/high-level language, avoid non-array datatypes
> &c &c.  It can, but  for most application areas there are other
> considerations.

Of course :)  I'm not suggesting that Joe/Jane Programmer be required
to insert software prefetches into their code ;P  One should of course
first code for correctness; then, if (and ONLY IF) performance is
inadequate, profile to find the bottlenecks, and apply trickier
optimizations to those as necessary.
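
As a concrete version of that workflow in Lisp (TIME is standard CL;
the statistical profiler shown is SBCL-specific):

    ;; A stand-in workload.
    (defun naive-fib (n)
      (if (< n 2) n (+ (naive-fib (- n 1)) (naive-fib (- n 2)))))

    (time (naive-fib 30))       ; portable: wall-clock time and consing

    ;; Only if that shows a problem: find where the time actually goes.
    (require :sb-sprof)
    (sb-sprof:with-profiling (:report :flat :max-samples 10000)
      (naive-fib 30))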

I would argue that the HPC world has a lot to do with the game world (a
large number of mathematical floating-point computations; physics
calculations; some tolerance for inaccuracy in many cases) and the
embedded world (more strict resource restrictions and performance
requirements than usual).

> Yes, I agree with this.  But HPC is, as I said, odd (though I'm
> interested in it).  For most applications you want to have some idea of
> what the performance model of the machine is like (which is beyond the
> vast majority of programmers for a start), to write code which (if
> performance is an issue, which very often it is not) should sit well
> with that model, and then to let the machine (in the `compiler, HW
> and OS' sense) do most of the boring work of, for instance, making sure
> memory is local to threads &c &c.

That's right, that kind of stuff should be automated, and the
infrastructure exists to do that already.

> I can see that I've now got completely sidetracked.  My original point
> in this thread was that multicore systems will end up on desktops for
> reasons which have rather little to do with performance, and now I'm
> talking about HPC :-)  I will go and devour some minions.

Heh heh, HPC will rule the world!!!  *brainwashes more minions*

mfh
From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168842260.616634.7020@38g2000cwa.googlegroups.com>
············@gmail.com wrote:

>
> I would argue that the HPC world has a lot to do with the game world (a
> large number of mathematical floating-point computations; physics
> calculations; some tolerance for inaccuracy in many cases) and the
> embedded world (more strict resource restrictions and performance
> requirements than usual).
>

Yes, I think that's definitely true - game programming *is* HPC
programming, albeit you tend to be targeting a platform whose end-user
cost is hundreds, not millions, of dollars.

I think that one important issue for general-purpose processors (or
general-purpose computer systems, be they multi-core, multi-socket, or
multi-board) is that they need to be able to support naive programs,
and support them without too catastrophic a performance hit.  "Naive
programs" are probably something like "programs that assume a
cache-coherent SMP system".  That's not true for true HPC systems
or for special games hardware, be it graphics cards or consoles.
Though obviously even there you don't want to make the thing *too* hard
to program for.

--tim
From: George Neuner
Subject: Re: Next Generation of Language
Date: 
Message-ID: <e3ufq25kjqvrnucknj89i86drieikhce0b@4ax.com>
On 12 Jan 2007 01:35:29 -0800, "Tim Bradshaw" <··········@tfeb.org>
wrote:

>George Neuner wrote:
>
>> Well, on Cells the private memories are not cache but staging memories
>> ... the main processor has to move data into and out of them on behalf
>> of the coprocessors.
>
>It doesn't matter very much who moves the data, it's still cache :-).
>The issue that counts, really, is what the programming model is at the
>user level. No one should need to care whether things are done
>automagically by the hardware as most L1/L2 caches are today, or by
>hardware with substantial SW support as, say, MMUs, or almost entirely
>by SW with some small amount of HW support, as, say disk paging.
>(Actually, the second thing that counts is whether the HW can
>efficiently support the programming model you choose.)

I have considerable experience with manual staging (on DSPs) and I can
tell you that it is a royal PITA to schedule several functional units
and keep them going full blast using software alone.  

Cell is less onerous only because of the granularity of the code the
coprocessors can execute - whole functions or miniprograms rather than
the baby steps DSP units can take.


>> AFAIK, no one has tried to offer a hardware solution to staging
>> computations in a distributed memory system since the KSR1 (circa
>> 1990, which failed due to the company's creative bookkeeping rather
>> than the machine's technology).  Everyone now relies on software
>> approaches like MPI and PVM.
>
>Well, I think they have actually, in all but name: that's essentially
>what NUMA machines are.  Such machines are quite common, of course
>(well, for bigger systems anyway): all Sun's recent larger machines (4
>& 5-digit sunfire boxes) are basically NUMA, and it may be that smaller
>ones are too.

Non Uniform Memory Access simply means different memories have
different access times - that describes just about every machine made
today.  The NUMA model distinguishes between "near" and "far" memories
in terms of access time, but does not distinguish by how the memories
are connected - a system with fast cache and slower main memory fits
the model just as well as one with a butterfly network between CPU and
memory.


>Of course, as I said above, this comes down to programming model and
>how much HW support you need for it.  I think the experience of the
>last 10-20 years is that a shared memory model (perhaps "shared address
>space"?), preferably with cache-coherency, is a substantially easier
>thing to program for than a distributed memory model. Whether that will
>persist, who knows (I suspect it will, for a surprisingly long time).
>Of course the physical memory that underlies this model will become
>increasingly distributed, as it already has to a great extent.

It's all about the programming model and I think you are on the right
track.  Shared address space is the right approach, IMO, and further,
I believe it should be implemented in hardware.

That is why I mentioned KSR1 - the only massive multiprocessor I know
of that tried to help the programmer.  KSR1 was a distributed memory
multiprocessor (256..1088 CPUs) with a multilevel caching tree network
which provided the programmer with the illusion of a shared memory.
The KSR1 ran a version of OSF/1, so software written for any shared
memory Unix multiprocessor was relatively easy to port - an important
consideration because most people looking to buy a supercomputer were
outgrowing a shared memory machine.

There was, of course, a penalty paid for the illusion of shared
memory.  Estimates were that the cache consistency model slowed the
machine by 15-25% vs comparable MPI designs, but IMO that was more
than made up for by the ease of programming.  The second generation
KSR2 improved shared memory speeds considerably, but few people ever
saw one - the company went belly up before it was formally introduced.


George
--
for email reply remove "/" from address
From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168641772.203903.39190@l53g2000cwa.googlegroups.com>
George Neuner wrote:


> I have considerable experience with manual staging (on DSPs) and I can
> tell you that it is a royal PITA to schedule several functional units
> and keep them going full blast using software alone.

I bet it is!



> Non Uniform Memory Access simply means different memories have
> different access times - that describes just about every machine made
> today.  The NUMA model distinguishes between "near" and "far" memories
> in terms of access time, but does not distinguish by how the memories
> are connected - a system with fast cache and slower main memory fits
> the model just as well as one with a butterfly network between CPU and
> memory.

I agree with this in theory, and in that sense nearly all machines are
NUMA (the exceptions being some recent cacheless designs which aimed to
hide latency with heavily multithreaded processors).  But I think the
conventional use of the term is for multiprocessors where all memory is
"more local" (in time terms) to some processors than it is to others,
and that was the sense in which I was using it.  You can think of these
kinds of machines as systems where there is only cache memory.  It
seems to me inevitable that all large machines will become NUMA, if
they are not all already.  And the nonuniformity will increase
over time.

My argument is that physically, these machines actually are distributed
memory systems, but their programming model is that of a shared memory
system.  And this illusion is maintained by a combination of hardware
(route requests to non-local memory over the interconnect, deal with
cache-coherency etc) and system-level software (arrange life so that
memory is local to the threads which are using it where that is
possible etc).

Of course these machines typically are not MPP systems, and are also
typically not HPC-oriented.  Though I think SGI made NUMA systems with
really quite large numbers of processors, and a Sun E25K can have 144
cores (72 2-core processors), though I think it would be quite unusual
to run a configuration like that as a single domain.

--tim
From: Rob Warnock
Subject: Re: Next Generation of Language
Date: 
Message-ID: <Cq2dnVS8c9jBLTXYnZ2dnUVZ_t-mnZ2d@speakeasy.net>
Tim Bradshaw <··········@tfeb.org> wrote:
+---------------
| My argument is that physically, these machines actually are distributed
| memory systems, but their programming model is that of a shared memory
| system.  And this illusion is maintained by a combination of hardware
| (route requests to non-local memory over the interconnect, deal with
| cache-coherency etc) and system-level software (arrange life so that
| memory is local to the threads which are using it where that is
| possible etc).
| 
| Of course these machines typically are not MPP systems, and are also
| typically not HPC-oriented.  Though I think SGI made NUMA systems with
| really quite large numbers of processors, and a Sun E25K can have 144
| cores (72 2-core processors), though I think it would be quite unusual
| to run a configuration like that as a single domain.
+---------------

SGI *still* makes large ccNUMA systems, the Altix 4700 series,
which offers a very large global main memory, up to 128 TB(!),
with global cache coherency (sequential consistency, to be specific)
and with up to 512 Itanium CPUs standard (up to 1024 by special order)
in a *single* domain, that is, a single instance of Linux; see:

    http://www.sgi.com/products/servers/altix/4000/

Two things make this scale well:

1. A directory-based cache coherency system, which keeps cache
   line ownership information with the memory subsystem the
   cache line is in (a toy sketch follows the list).

2. Compared to other large ccNUMA or NUMA systems, a really low
   ratio of remote to local memory access times, varying between
   3:1 to 4:1 for large to very-large systems.
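
To make point 1 concrete, here is a toy model of what one directory
entry tracks -- a sketch only, since real protocols have more states
and many transient cases:

    ;; One entry per cache line, kept at the line's home memory.
    (defstruct dir-entry
      (state :uncached)   ; :uncached, :shared, or :exclusive
      (sharers '())       ; nodes holding read-only copies
      (owner nil))        ; node holding the writable copy

    (defun read-request (entry node)
      ;; A node asks for a read-only copy.
      (when (eq (dir-entry-state entry) :exclusive)
        ;; The current owner must write back and demote to shared.
        (push (dir-entry-owner entry) (dir-entry-sharers entry))
        (setf (dir-entry-owner entry) nil))
      (setf (dir-entry-state entry) :shared)
      (pushnew node (dir-entry-sharers entry)))

    (defun write-request (entry node)
      ;; A node asks for ownership; all sharers get invalidations.
      (setf (dir-entry-sharers entry) '()
            (dir-entry-owner entry) node
            (dir-entry-state entry) :exclusive))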

And, yes, there are quite a few HPC customers who run systems that
large as single images for SMP-style codes which don't convert to
MPI style very well.


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Pascal Bourguignon
Subject: Re: Next Generation of Language
Date: 
Message-ID: <87odp5w3zv.fsf@thalassa.informatimago.com>
"Tim Bradshaw" <··········@tfeb.org> writes:

> ············@gmail.com wrote:
>
>> Intel's proposed 80-core architecture will have DRAM attached to each
>> core -- sort of how Cell has "local stores" attached to each SPE.
>> That's how they plan to solve the BW problem -- amortize it over all
>> the cores.
>
> Don't we call that `cache' normally?  (yes, I know, they'll be *big*
> caches, but only big by today's standards, in the same sense that
> today's machines have as much cache as yesterday's had main memory.)

Well, the fact that L1 and L2 caches are totally transparent to the
programmer while the HD cache is somewhat less so is no reason to
distinguish them.

You've probably already seen this pyramid with the registers in the
top corner, above layers of memories,  L1, L2 and now L3, the RAM, the
HD, the tapes, etc.  We could also add layers for the Internet and the
physical world.

RAM is used as cache for the HD.  The HD is used as cache for the big
storage repositories on tapes or CD, or for the Internet.  The
Internet is used as a cache for the real world.  Our computers don't
need robotic extensions to access information in the real world,
because the real world is cached in the Internet.  (Well, it may be
useful to have these robotic extensions to let the computer access
the real world itself, instead of having armies of humans filling
Wikipedia and other pages indexed by Google.)

It's only a matter of the OS hiding all these details.  Use mmap
instead of open/read/write/close.  Add an imap(2) and call
imap(address,"http://en.wikipedia.org/wiki/Raven");
instead of sending your robotic extensions to go watch birds.
Of course, it helps to have a big addressing space.
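
The mmap half of that is reachable from Lisp today -- a sketch using
SBCL's sb-posix bindings (SBCL-specific; error handling omitted, and
imap(2) of course remains hypothetical):

    ;; Map a file read-only and peek at its first byte.
    (defun mmap-first-byte (path)
      (let* ((fd   (sb-posix:open path sb-posix:o-rdonly))
             (len  (sb-posix:stat-size (sb-posix:fstat fd)))
             (addr (sb-posix:mmap nil len sb-posix:prot-read
                                  sb-posix:map-private fd 0)))
        (unwind-protect
             (sb-sys:sap-ref-8 addr 0)
          (sb-posix:munmap addr len)
          (sb-posix:close fd))))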

Earth's surface is 510,065,600 km²(*), that's 510,065,600e12 mm², or 69
bits to identify each mm² of Earth's surface.  So we'll have to wait
for 128-bit processors to be able to mmap every bit of the Earth's
surface into the virtual memory space of our computers.  In the
meantime, we can just implement our own 128-bit virtual address space,
and a mere emap(2) syscall is all that is needed to address the
(physical) desktop of your coworkers on another continent, through
remote-presence robots.
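
The bit count checks out at the REPL:

    ;; 510,065,600 km^2 = 510,065,600 * 10^12 mm^2; bits per mm^2:
    (ceiling (log (* 510065600 1d12) 2))   ; => 69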



(*) I'm too lazy to compute it tonight, so I just copied the number
cached in Wikipedia; beware! ;-)

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

HEALTH WARNING: Care should be taken when lifting this product,
since its mass, and thus its weight, is dependent on its velocity
relative to the user.
From: Tim Bradshaw
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168593447.606139.59810@11g2000cwr.googlegroups.com>
Pascal Bourguignon wrote:

> You've probably already seen this pyramid with the registers in the
> top corner, above layers of memories,  L1, L2 and now L3, the RAM, the
> HD, the tapes, etc.  We could also add layers for the Internet and the
> physical world.

Thanks.  I'm reasonably familiar with computer architecture.
From: Pascal Bourguignon
Subject: Re: Next Generation of Language
Date: 
Message-ID: <87bql4wq60.fsf@thalassa.informatimago.com>
"Tim Bradshaw" <··········@tfeb.org> writes:

> Pascal Bourguignon wrote:
>
>> You've probably already seen this pyramid with the registers in the
>> top corner, above layers of memories,  L1, L2 and now L3, the RAM, the
>> HD, the tapes, etc.  We could also add layers for the Internet and the
>> physical world.
>
> Thanks.  I'm reasonably familiar with computer architecture.

Well, my point was that it doesn't matter much whether the memory
that's near the processor is cache memory or normal RAM; the only
difference is who fills it, the MMU or the OS.

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

"Indentation! -- I will show you how to indent when I indent your skull!"
From: Robert Maas, see http://tinyurl.com/uh3t
Subject: Re: Next Generation of Language
Date: 
Message-ID: <rem-2007nov09-008@yahoo.com>
> From: Pascal Bourguignon <····@informatimago.com>
> You've probably already seen this pyramid with the registers in the
> top corner, above layers of memories,  L1, L2 and now L3, the RAM, the
> HD, the tapes, etc.  We could also add layers for the Internet and the
> physical world.
>
> RAM is used as cache for the HD.  The HD is used as cache for the big
> storage repositories on tapes or CD, or for the Internet.  The
> Internet is used as a cache for the real world.  Our computers don't
> need robotic extensions to access information in the real world,
> because the real world is cached in the Internet.  (Well, it may be
> useful to have these robotic extensions to let the computer access
> the real world itself, instead of having armies of humans filling
> Wikipedia and other pages indexed by Google.)

Hey, I wish I had seen this back when you wrote it, so that I could
have nominated it for POTM (Post Of The Month).  It's almost like
Kent Pitman's great philosophical essays about software.

As for Wikipedia: this is just proper division/specialization of labor.
For raw data I agree robots should collect it automatically, and indeed
for most astronomy nowadays that is almost exactly what is being done.
But for cogitating, humans are at present better skilled than
computer algorithms, so humans are still best in those roles.
So Wikipedia will continue to be a human work (aided by groupware)
for the foreseeable future.

> It's only a matter of the OS hiding all these details.  Use mmap instead
> of open/read/write/close.  Add an imap(2) and call
> imap(address,"http://en.wikipedia.org/wiki/Raven");

Or encapsulate it all inside a higher level of abstraction, and
call the high-level method instead of directly calling mmap or imap.

> Of course, it helps to have a big addressing space.

We already have it: URLs/URIs etc.
(Hmm, the following visual pun occurred to me just now: URl,
 which is a lower-case "el" but looks like an upper-case "eye" in some fonts.
 Maybe in the future when I'm not sure whether I am talking about
 a URL or URI, I can say "URl" deliberately? If I do, will you spank me??)


From: Alex Mizrahi
Subject: Re: Next Generation of Language
Date: 
Message-ID: <45a25333$0$49207$14726298@news.sunsite.dk>
(message (Hello ·············@gmail.com)
(you :wrote  :on '(8 Jan 2007 05:21:24 -0800))
(

 ??>> From this link
 s> http://itpro.nikkeibp.co.jp/a/it/alacarte/iv1221/matsumoto_1.shtml
 s> (Note : Japanese)
 s> Matsu, the creator of Ruby, said in the next 10 years,  64 or 128 cores
 s> desktop computers will be common, it's nearly impossible to simply
 s> write that many threads, it should be done automatically, so maybe
 s> functional language will do a better job in parallel programming than
 s> procedural language like C or Ruby.

nobody needs just 64 cores. they need that many cores FOR A SPECIFIC TASK.
if it's a web server, it's easily parallelizable -- you can handle each 
request in a separate thread.
there are well-known parallelization techniques for scientific tasks that 
involve large matrices, etc.
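
a minimal sketch of that pattern, assuming the portable usocket and
bordeaux-threads libraries (neither is part of standard CL):

    ;; Thread-per-request echo server: each accepted connection is
    ;; handled in its own thread, so requests proceed in parallel.
    (defun handle-client (stream)
      (let ((line (read-line stream nil)))
        (when line
          (write-line line stream)
          (force-output stream))))

    (defun serve (port)
      (let ((listener (usocket:socket-listen "127.0.0.1" port)))
        (loop
          (let ((conn (usocket:socket-accept listener)))
            (bt:make-thread
             (lambda ()
               (unwind-protect
                    (handle-client (usocket:socket-stream conn))
                 (usocket:socket-close conn))))))))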

so, actually there's not much need for automatic parallel programming. tasks 
requiring high performance ALREADY run in parallel. i bet if you run 
some usual single-core task with some magic auto-parallel language, you 
won't see significant benefits.

btw, you don't have to wait 10 years. you can buy a GeForce 8800 for $500; 
it has hundreds of computing cores.
http://developer.nvidia.com/object/cuda.html
---
What is CUDA technology?

GPU computing with CUDA technology is an innovative combination of computing 
features in next generation NVIDIA GPUs that are accessed through a standard 
"C" language.  Where previous generation GPUs were based on "streaming 
shader programs", CUDA programmers use "C" to create programs called 
threads that are similar to multi-threading programs on traditional CPUs. 
In contrast to multi-core CPUs, where only a few threads execute at the same 
time, NVIDIA GPUs featuring CUDA technology process thousands of threads 
simultaneously enabling a higher capacity of information flow.
---

it would be nice to program that CUDA thing in Lisp instead of C :)
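
you can get partway there already via CFFI against the CUDA driver API
-- a minimal sketch (cuInit and cuDeviceGetCount are real driver entry
points; a usable binding needs far more: memory, modules, kernel
launches):

    ;; Load the driver library; names and paths vary by platform.
    (cffi:define-foreign-library libcuda
      (:unix "libcuda.so")
      (t (:default "cuda")))
    (cffi:use-foreign-library libcuda)

    ;; CUresult cuInit(unsigned int Flags)
    (cffi:defcfun ("cuInit" cu-init) :int
      (flags :unsigned-int))

    ;; CUresult cuDeviceGetCount(int *count)
    (cffi:defcfun ("cuDeviceGetCount" cu-device-get-count) :int
      (count (:pointer :int)))

    (defun cuda-device-count ()
      (cu-init 0)
      (cffi:with-foreign-object (n :int)
        (cu-device-get-count n)
        (cffi:mem-ref n :int)))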

)
(With-best-regards '(Alex Mizrahi) :aka 'killer_storm)
"People who lust for the Feel of keys on their fingertips (c) Inity") 
From: ············@gmail.com
Subject: Re: Next Generation of Language
Date: 
Message-ID: <1168548057.509651.33970@77g2000hsv.googlegroups.com>
Alex Mizrahi wrote:
> nobody needs just 64 cores. they need that many cores FOR A SPECIFIC TASK.
> if it's a web server, it's easily parallelizable -- you can handle each
> request in a separate thread.
> there are well-known parallelization techniques for scientific tasks that
> involve large matrices, etc.
>
> so, actually there's not much need for automatic parallel programming. tasks
> requiring high performance ALREADY run in parallel.

Um, yes and no; the big problem is not the huge-scale HPC applications
(for which NSF, DARPA, etc. fund the research necessary to parallelize
them effectively) but the smaller stuff, e.g. Matlab code written by
engineers for ad-hoc problem solving.  Individual cores aren't getting
much faster, so in order to get performance gains from multiple cores,
parallelism needs to be automatically extracted (because we can't
expect Joe / Jane Civil Engineer to know how to extract it manually) or
at least made easy to extract (in the style of Matlab*p or the new
parallel Matlab that Cleve Moler is pushing).

> i bet if you run some usual single-core task with some magic auto-parallel language, you
> won't see significant benefits.

Well, extracting parallelism effectively on future machines is going to
require a lot of tuning, much of which may be done automatically (like
ATLAS or FFTW do for single-processor codes).  It's quite probable that
people who code parallel libraries (e.g. for sorting or linear algebra)
will in the future write very little explicitly parallel code --
instead they will supply annotations to an auto-tuner.  See e.g. the
PLAPACK project.
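
The empirical-search idea in miniature, with BLOCKED-MULTIPLY standing
in as a hypothetical kernel being tuned:

    ;; Time the kernel once per candidate block size; keep the fastest.
    (defun best-block-size (kernel candidates)
      (caar (sort (mapcar (lambda (b)
                            (let ((start (get-internal-real-time)))
                              (funcall kernel b)
                              (cons b (- (get-internal-real-time)
                                         start))))
                          candidates)
                  #'< :key #'cdr)))

    ;; e.g. (best-block-size
    ;;        (lambda (b) (blocked-multiply a bm c :block-size b))
    ;;        '(16 32 64 128))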

> btw, you don't have to wait 10 years. you can buy a GeForce 8800 for $500;
> it has hundreds of computing cores.

Heh, wait until they at least document their floating-point semantics
first ;P

mfh
From: Robert Maas, see http://tinyurl.com/uh3t
Subject: Re: Next Generation of Language
Date: 
Message-ID: <rem-2007nov09-002@yahoo.com>
Aside: I found this thread while looking for something else, namely
       implementations of PGP signatures (for newsgroup articles
       and e-mail) written in Common Lisp. Google search for Common
       Lisp PGP didn't turn up anything useful, so next I tried
       simply searching for PGP by itself within comp.lang.lisp,
       whereupon this interesting thread turned up:
> From: ·············@gmail.com" <············@gmail.com>
> From this link
> http://itpro.nikkeibp.co.jp/a/it/alacarte/iv1221/matsumoto_1.shtml
> (Note : Japanese)

Yeah. Do you know of a plain text (or HTML) US-ASCII translation?

> Matsu, the creator of Ruby, said in the next 10 years,  64 or 128 cores
> desktop computers will be common, it's nearly impossible to simply
> write that many threads, it should be done automatically, so maybe
> functional language will do a better job in parallel programming than
> procedural language like C or Ruby.

What is the meaning of "cores"?? Are you referring to processing units,
as in parallel-processor computers?

If that's what you and Matsu mean: at present we already have a need
for multiple processors, as the only safe way to deal with threats
from the network (worms/viruses including spambots, trojans, etc.),
which require multiple levels of firewall, and with local threats
(somebody breaks into your office while you are away and tries to
use your computer).  With a single processor trying to emulate
firewalls and local security as well as local user login, remote user
login, file encryption, etc., it's just too easy for a single point
of failure to compromise *all* security measures at once.  With each
firewall, filesystem, etc. in a separate box, a trespasser could
simply reconnect the boxes in a different configuration to defeat all
security measures; with most of the wiring under the desk and behind
the boxes, the authorized user might not even notice the wiring had
been changed for weeks, if ever.  But consider a single box with
multiple processors: the primary Internet firewall/interface running
in pROM, using public-key cryptosystems to communicate with the other
processors (including duplicate event loggers); the secondary
firewall running on its own processor out of RAM, but configurable
only via fully public-key-authorized and logged transactions; the
keyboard controller also running in pROM, transmitting all keystrokes
to the event loggers, and the same for the mouse controller; and all
JavaScript or other client-side scripts running in their own "play
pen" environment isolated from the rest of the operating system and
applications.  Then it'd be impossible for a network worm to
accomplish anything nasty, and a physical intruder would need to
break the seal on the computer box in a visibly evident way, as well
as do some pretty difficult mangling of the internal wiring, before
any security breach could occur.

We also have a practical use for additional multiple processors:
Simply run each user application, and each network service, on a
separate processor. Maybe even assign multiple processors for
multiple simultaneous instances of a single service (such as CGI or
J2EE/JSP) if that service is heavily loaded in practice. With 64
processors, the bottleneck of Apache service (for hundreds of
near-simultaneous clients) would be eliminated, and raw network
bandwidth would become the limiting factor, which can then be
remedied by faster service up to the maximum technically available
speed and then by multiple network connections if even more
bandwidth is needed. Of course XML with its extreme verbosity
should be scrapped in favor of s-expressions and/or data-compressed
network connections. (And if the one server uses compressed TCP
streams but the thousands of clients are old-fashioned computers
which can't handle their individual CPU loads, have multiple remote
proxies/gateways to convert between normal TCP and compressed-TCP.)
With lots of available processors, the CPU load of data-compressed
network connections would not be a problem. Each compressed-TCP
stream could have its own dedicated CPU if necessary. Hey, we've
run out of processors, need to upgrade from 64 to 128. Who said
we'd be wasting most of those processors if we don't use functional
programming languages? Just ordinary network services, on a heavily
loaded server, would use them all up already!!


          +--------------------+