From: namekuseijin
Subject: Re: ANN: Mosh 0.1.0 released - A Fast R6RS Scheme interpreter
Date: 
Message-ID: <gtl93p$2v9n$1@adenine.netfront.net>
Abdulaziz Ghuloum wrote:
> MzScheme should be faster than what it's showing.  Basically,
> MzScheme uses libgmp (iirc), but so does ikarus.  On the same
> machine (MacBook Intel Core2Duo 2GHz), Ikarus is 6~7 times
> faster than both, which is unexpected (I compared the answers
> just to make sure Ikarus is not cheating).

Holy baloney!  You're are right!

············@nameku:~$ ikarus
Ikarus Scheme version 0.0.3
Copyright (c) 2006-2008 Abdulaziz Ghuloum

 > (define (fact n)
	(let f ((n n) (r 1))
	  (if (< n 2) r (f (- n 1) (* r n)))))
 > (time (and (fact 100000) #t))
running stats for (and (fact 100000) #t):
     2405 collections
     4988 ms elapsed cpu time, including 664 ms collecting
     4990 ms elapsed real time, including 691 ms collecting
     9931119600 bytes allocated
#t

It's the same if the number is printed, except you have to use an 
external cronometer or else you'll miss the timing that goes before the 
printed output!

This is even faster than sbcl!

············@nameku:~$ sbcl
This is SBCL 1.0.11.debian, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses.  See the CREDITS and COPYING files in the
distribution for more information.
* (defun fact (n &optional (r 1))
	(if (< n 2) r
		(fact (1- n) (* r n))))

FACT
* (time (and (fact 100000) t))

Evaluation took:
   9.696 seconds of real time
   9.192575 seconds of user run time
   0.504031 seconds of system run time
   [Run times include 0.904 seconds GC run time.]
   0 calls to %EVAL
   0 page faults and
   9,931,045,056 bytes consed.
T

Saint GMP, Batman!

> $ gsi
> Gambit v4.0.1
> 
>  > (define (fact n)
>     (let f ((n n) (r 1))
>       (if (< n 2) r
>          (f (- n 1) (* r n)))))
>  > (time (and (fact 100000) #t))
> (time (and (fact 100000) #t))
>     56927 ms real time
>     56234 ms cpu time (45247 user, 10987 system)
>     15321 collections accounting for 4615 ms real time (2809 user, 1739 
> system)
>     9979093064 bytes allocated
>     no minor faults
>     no major faults
> #t

I was really surprised at this so I had to check it out for myself! gsi 
truly gives about the same time as code compiled by gsc and gcc!

BTW, guys, this is an extensive paper on a number of algorithms for 
factorial that give much better performance:

http://www.cs.berkeley.edu/~fateman/papers/factorial.pdf

You facts will never be the same again... bwahahahaa

From: higepon
Subject: Re: ANN: Mosh 0.1.0 released - A Fast R6RS Scheme interpreter
Date: 
Message-ID: <b15b3e97-d0fd-4418-b3dd-ecf2adf0cea0@q33g2000pra.googlegroups.com>
Sorry I tried my Mosh 0.1.0 on my Ubuntu 9.04 (libgmp.so.3.4.4)
It works without memory swapping!

% mosh
mosh>(import (mosh))
#<unspecified>
mosh>(define (fact n)
             (let f ((n n) (r 1))
               (if (< n 2) r
                  (f (- n 1) (* r n)))))
#<unspecified>
mosh>(time (and (fact 100000) #t))y

;;40.426059 real 40.186512 user 0.100006 sys


% mzscheme
Welcome to MzScheme v4.1.3 [3m], Copyright (c) 2004-2008 PLT Scheme
Inc.
>       (define (fact n)
             (let f ((n n) (r 1))
               (if (< n 2) r
                  (f (- n 1) (* r n)))))
         (time (and (fact 100000) #t))
> cpu time: 61291 real time: 61475 gc time: 10828

namekuseijin's case can be GC problem.
Mosh uses Boehm GC. And it needs proper configuration flags for each
operating systems.
Will you please show me your "uname -ar" ?


Thomas Lord wrote:

> Not to impose but
> please consider writing up
> a brief description of the implementation
>with special attention to what makes it fast
>and to what the low-level (sub-scheme) interface
>to the run-time system is like.

Sure.

Mosh uses Psyntax written by Abdulaziz Ghuloum for frontend.
It helps to expand R6RS library system and syntax-case.
Good library thanks.

Backend is a simple stack VM with many optimizations.
Eg.
- procedure inlining
- constant folding
- jump destination peephole optimaztion
- instruction unification
- direct threded code
- gloc

Source files http://code.google.com/p/mosh-scheme/source/browse/trunk
- instruction.scm: Mosh's instruction set.
- VM-Run.cpp: VM loop.
- compiler.scm: compiler
- vm.scm: subset of VM written in C++

It uses Boehm GC for memory manament.

Cheers.
From: Vend
Subject: Re: ANN: Mosh 0.1.0 released - A Fast R6RS Scheme interpreter
Date: 
Message-ID: <154cdf43-eaa4-4f27-b30a-e117c93dc311@o27g2000vbd.googlegroups.com>
On 4 Mag, 04:03, higepon <·······@gmail.com> wrote:
> Sorry I tried my Mosh 0.1.0 on my Ubuntu 9.04 (libgmp.so.3.4.4)
> It works without memory swapping!
>
> % mosh
> mosh>(import (mosh))
> #<unspecified>
> mosh>(define (fact n)
>              (let f ((n n) (r 1))
>                (if (< n 2) r
>                   (f (- n 1) (* r n)))))
> #<unspecified>
> mosh>(time (and (fact 100000) #t))y
>
> ;;40.426059 real 40.186512 user 0.100006 sys
>
> % mzscheme
> Welcome to MzScheme v4.1.3 [3m], Copyright (c) 2004-2008 PLT Scheme
> Inc.>       (define (fact n)
>
>              (let f ((n n) (r 1))
>                (if (< n 2) r
>                   (f (- n 1) (* r n)))))
>          (time (and (fact 100000) #t))
>
> > cpu time: 61291 real time: 61475 gc time: 10828
>
> namekuseijin's case can be GC problem.
> Mosh uses Boehm GC. And it needs proper configuration flags for each
> operating systems.
> Will you please show me your "uname -ar" ?

I haven't looked at your code, but I suppose you allocate bigints with
gc_malloc() and Bohem complains that tracing through these large
memory blocks is expensive (since it doesn't know what regions of a
block are pointers it needs to examine it all).

I presume that could be solved by replacing gc_malloc() with
gc_malloc_atomic() for bigints.
From: higepon
Subject: Re: ANN: Mosh 0.1.0 released - A Fast R6RS Scheme interpreter
Date: 
Message-ID: <dfa7a746-f1ad-4c0d-9fa2-888627b55e43@j9g2000prh.googlegroups.com>
Hi

namekuseijin wrote:
> > Will you please register the issue?
> Done.

Thank you!

fujita-y wrote:
> Please use '--heap-limit' flag to run fact program on Ypsilon. :)

Cool option. It offers safe and stable runnig. :-)

leppie wrote:
> I can finally compare!  :)

Yeah. IronScheme is an older brother of Mosh and a rival.

On Mosh 0.0.1, Biginteger library GMP is not faster version (not asm
version).
It may be slower than on Linux.
I will fix this next release.

BTW, backend of IronScheme (DLR) may have JIT. So it can run very
faster?

Vend wrote:

> I presume that could be solved by replacing gc_malloc() with
> gc_malloc_atomic() for bigints.

Good point. I tried to use gc_malloc_atomic.
And it runs well w/o any warnings. :)

I'm not sure all of allocations in GMP is pointer free or not.
But with gc_malloc_atomic, Mosh passes all tests.

I will realease the fixed version of Mosh tonight.

Cheers.

On 5$B7n(B6$BF|(B, $B8aA0(B1:51, Vend <······@virgilio.it> wrote:
> On 4 Mag, 04:03, higepon <·······@gmail.com> wrote:
>
>
>
> > Sorry I tried my Mosh 0.1.0 on my Ubuntu 9.04 (libgmp.so.3.4.4)
> > It works without memory swapping!
>
> > % mosh
> > mosh>(import (mosh))
> > #<unspecified>
> > mosh>(define (fact n)
> >              (let f ((n n) (r 1))
> >                (if (< n 2) r
> >                   (f (- n 1) (* r n)))))
> > #<unspecified>
> > mosh>(time (and (fact 100000) #t))y
>
> > ;;40.426059 real 40.186512 user 0.100006 sys
>
> > % mzscheme
> > Welcome to MzScheme v4.1.3 [3m], Copyright (c) 2004-2008 PLT Scheme
> > Inc.>       (define (fact n)
>
> >              (let f ((n n) (r 1))
> >                (if (< n 2) r
> >                   (f (- n 1) (* r n)))))
> >          (time (and (fact 100000) #t))
>
> > > cpu time: 61291 real time: 61475 gc time: 10828
>
> > namekuseijin's case can be GC problem.
> > Mosh uses Boehm GC. And it needs proper configuration flags for each
> > operating systems.
> > Will you please show me your "uname -ar" ?
>
> I haven't looked at your code, but I suppose you allocate bigints with
> gc_malloc() and Bohem complains that tracing through these large
> memory blocks is expensive (since it doesn't know what regions of a
> block are pointers it needs to examine it all).
>
> I presume that could be solved by replacing gc_malloc() with
> gc_malloc_atomic() for bigints.
From: Vend
Subject: Re: ANN: Mosh 0.1.0 released - A Fast R6RS Scheme interpreter
Date: 
Message-ID: <4e334a94-cd36-4db0-b344-e24716bcf79a@j12g2000vbl.googlegroups.com>
On 6 Mag, 09:32, higepon <·······@gmail.com> wrote:
> Vend wrote:
> > I presume that could be solved by replacing gc_malloc() with
> > gc_malloc_atomic() for bigints.
>
> Good point. I tried to use gc_malloc_atomic.
> And it runs well w/o any warnings. :)
>
> I'm not sure all of allocations in GMP is pointer free or not.
> But with gc_malloc_atomic, Mosh passes all tests.

"GMP may use allocated blocks to hold pointers to other allocated
blocks. This will limit the assumptions a conservative garbage
collection scheme can make. " -
http://gmplib.org/manual/Custom-Allocation.html

You can't use gc_malloc_atomic() for internal GMP allocations, it
would introduce leaks.

If there is some way of calling the destructor when deallocating a C++
class, then you should leave the default memory allocation functions
of GMP and call mpz_clear() in the destructor of your Bignum class.

> I will realease the fixed version of Mosh tonight.
>
> Cheers.
From: higepon
Subject: Re: ANN: Mosh 0.1.0 released - A Fast R6RS Scheme interpreter
Date: 
Message-ID: <a647c9ce-bd1f-45c3-98f0-401d53bb237a@k19g2000prh.googlegroups.com>
Hi Vend.

> blocks. This will limit the assumptions a conservative garbage
> You can't use gc_malloc_atomic() for internal GMP allocations, it
> would introduce leaks.

Thank you.

> If there is some way of calling the destructor when deallocating a C++
> class, then you should leave the default memory allocation functions
> of GMP and call mpz_clear() in the destructor of your Bignum class

We can do this like.
 class Bignum : public gc_cleanup // destructor will be called

I've try above, but same warings are issued.
I'll check whether the size of GC_malloc-ed is correct or not.

Best regards.

> collection scheme can make. " -http://gmplib.org/manual/Custom-Allocation.html
>
> You can't use gc_malloc_atomic() for internal GMP allocations, it
> would introduce leaks.
>
> If there is some way of calling the destructor when deallocating a C++
> class, then you should leave the default memory allocation functions
> of GMP and call mpz_clear() in the destructor of your Bignum class.
>
> > I will realease the fixed version of Mosh tonight.
>
> > Cheers.
>
>
From: higepon
Subject: Re: ANN: Mosh 0.1.0 released - A Fast R6RS Scheme interpreter
Date: 
Message-ID: <59477f88-27f5-4cd8-9011-3f9c0a061b1f@w35g2000prg.googlegroups.com>
Reading this document, I understand why these warnings are issued.
http://www.hpl.hp.com/personal/Hans_Boehm/gc/debugging.html

On 32bit architecture, using GMP Bignum creates many "false pointer",
which seem to be pointer for GC but is internal representation of
Bignum (array of word).

It will take long time to fix.

Cheers.


On 5$B7n(B6$BF|(B, $B8a8e(B9:40, higepon <·······@gmail.com> wrote:
> Hi Vend.
>
> > blocks. This will limit the assumptions a conservative garbage
> > You can't use gc_malloc_atomic() for internal GMP allocations, it
> > would introduce leaks.
>
> Thank you.
>
> > If there is some way of calling the destructor when deallocating a C++
> > class, then you should leave the default memory allocation functions
> > of GMP and call mpz_clear() in the destructor of your Bignum class
>
> We can do this like.
>  class Bignum : public gc_cleanup // destructor will be called
>
> I've try above, but same warings are issued.
> I'll check whether the size of GC_malloc-ed is correct or not.
>
> Best regards.
>
> > collection scheme can make. " -http://gmplib.org/manual/Custom-Allocation.html
>
> > You can't use gc_malloc_atomic() for internal GMP allocations, it
> > would introduce leaks.
>
> > If there is some way of calling the destructor when deallocating a C++
> > class, then you should leave the default memory allocation functions
> > of GMP and call mpz_clear() in the destructor of your Bignum class.
>
> > > I will realease the fixed version of Mosh tonight.
>
> > > Cheers.
>
>
From: Vend
Subject: Re: ANN: Mosh 0.1.0 released - A Fast R6RS Scheme interpreter
Date: 
Message-ID: <e104bf2d-08f2-4b7f-b9fa-0c4c7d54b48a@s16g2000vbp.googlegroups.com>
On 7 Mag, 10:53, higepon <·······@gmail.com> wrote:
> Reading this document, I understand why these warnings are issued.http://www.hpl.hp.com/personal/Hans_Boehm/gc/debugging.html
>
> On 32bit architecture, using GMP Bignum creates many "false pointer",
> which seem to be pointer for GC but is internal representation of
> Bignum (array of word).

But if you don't redefine GMP allocation function shouldn't GMP object
be ignored by the tracer?

> It will take long time to fix.
>
> Cheers.
From: higepon
Subject: Re: ANN: Mosh 0.1.0 released - A Fast R6RS Scheme interpreter
Date: 
Message-ID: <9ed4ec73-85a4-4b52-818c-383d00e6a249@r31g2000prh.googlegroups.com>
> But if you don't redefine GMP allocation function shouldn't GMP object
> be ignored by the tracer?

Yes.
I tried it today.
But much slower, because GMP calls realloc too many times.
From: Vend
Subject: Re: ANN: Mosh 0.1.0 released - A Fast R6RS Scheme interpreter
Date: 
Message-ID: <5e0b916e-8d17-4ceb-8d24-365fa1ef12f7@s16g2000vbp.googlegroups.com>
On 8 Mag, 16:36, higepon <·······@gmail.com> wrote:
> > But if you don't redefine GMP allocation function shouldn't GMP object
> > be ignored by the tracer?
>
> Yes.
> I tried it today.
> But much slower, because GMP calls realloc too many times.

Doesn't it do that if garbage collection is used?
From: higepon
Subject: Re: ANN: Mosh 0.1.0 released - A Fast R6RS Scheme interpreter
Date: 
Message-ID: <d9e18132-b170-4123-a741-79eb6e8cb351@x1g2000prh.googlegroups.com>
I think I fixed this problem.

Relesed Mosh 0.1.2
http://code.google.com/p/mosh-scheme/downloads/list

What I've done for the problem are following.
(1) For GMP, use malloc/realloc/free instead of GC_malloc/realloc/
free.
    This never make "false pointer".

(2) Cleanup the malloc-ed mpz_t with destructor of Bignum class
(extends gc_cleanup).

(3) hook the realloc, call GC_gcollect on right timing.

I also cleanup the code and reduce memory allocations for Bignum.
Mosh becomes much faster I think. :-)

*** Mosh taks about 9sec
% mosh -v
Mosh R6RS scheme interpreter, version 0.1.2
% mosh
mosh>(define (fact n)
             (let f ((n n) (r 1))
               (if (< n 2) r
                  (f (- n 1) (* r n)))))
#<unspecified>
mosh>(time (and (fact 100000) #t))

;;9.126263 real 8.396524 user 0.676042 sys
#t

*** Ikarus takes about 6.5 sec
% ikarus
Ikarus Scheme version 0.0.4-rc1+, 64-bit (revision 1773, build
2009-05-13)
Copyright (c) 2006-2009 Abdulaziz Ghuloum

> (define (fact n)
             (let f ((n n) (r 1))
               (if (< n 2) r
                  (f (- n 1) (* r n)))))
> (time (and (fact 100000) #t))
running stats for (and (fact 100000) #t):
    1194 collections
    6536 ms elapsed cpu time, including 648 ms collecting
    6568 ms elapsed real time, including 637 ms collecting
    9932311040 bytes allocated


Thank you your kind advice.

Cheers.

---
MINOWA Taro(Higepon)

http://www.monaos.org/
http://code.google.com/p/mosh-scheme/
http://www.facebook.com/people/Taro_Minowa_Higepon/649065487
http://friendfeed.com/higepon
From: higepon
Subject: Re: ANN: Mosh 0.1.0 released - A Fast R6RS Scheme interpreter
Date: 
Message-ID: <940b7ec5-031e-42f8-92c0-3de0e0d4442f@18g2000prx.googlegroups.com>
I think I fixed this problem.

Relesed Mosh 0.1.2
http://code.google.com/p/mosh-scheme/downloads/list

What I've done for the problem are following.
(1) For GMP, use malloc/realloc/free instead of GC_malloc/realloc/
free.
    This never make "false pointer".

(2) Cleanup the malloc-ed mpz_t with destructor of Bignum class
(extends gc_cleanup).

(3) hook the realloc, call GC_gcollect on right timing.

I also cleanup the code and reduce memory allocations for Bignum.
Mosh becomes much faster I think. :-)

*** Mosh takes about 9sec.
% mosh -v
Mosh R6RS scheme interpreter, version 0.1.2
sewashi% mosh
mosh>(define (fact n)
             (let f ((n n) (r 1))
               (if (< n 2) r
                  (f (- n 1) (* r n)))))
#<unspecified>
mosh>(time (and (fact 100000) #t))

;;9.126263 real 8.396524 user 0.676042 sys
#t

*** Ikarus takes about 6.5 sec
% ikarus
Ikarus Scheme version 0.0.4-rc1+, 64-bit (revision 1773, build
2009-05-13)
Copyright (c) 2006-2009 Abdulaziz Ghuloum

> (define (fact n)
             (let f ((n n) (r 1))
               (if (< n 2) r
                  (f (- n 1) (* r n)))))
> (time (and (fact 100000) #t))
running stats for (and (fact 100000) #t):
    1194 collections
    6536 ms elapsed cpu time, including 648 ms collecting
    6568 ms elapsed real time, including 637 ms collecting
    9932311040 bytes allocated


Thank you for your kind advice.

Cheers.

---
MINOWA Taro(Higepon)

http://www.monaos.org/
http://code.google.com/p/mosh-scheme/
http://www.facebook.com/people/Taro_Minowa_Higepon/649065487
http://friendfeed.com/higepon
From: leppie
Subject: Re: ANN: Mosh 0.1.0 released - A Fast R6RS Scheme interpreter
Date: 
Message-ID: <fa88c8a3-135f-44dc-8db3-c4aa71e1c058@s28g2000vbp.googlegroups.com>
On May 13, 3:56 pm, higepon <·······@gmail.com> wrote:
> % mosh -v
> Mosh R6RS scheme interpreter, version 0.1.2
> sewashi% mosh
> mosh>(define (fact n)
>              (let f ((n n) (r 1))
>                (if (< n 2) r
>                   (f (- n 1) (* r n)))))
> #<unspecified>
> mosh>(time (and (fact 100000) #t))
>
> ;;9.126263 real 8.396524 user 0.676042 sys
> #t

Sweet!  That's one fast interpreter!

Can you do a timing with Chez Petite too, for comparison's sake?  :)
From: higepon
Subject: Re: ANN: Mosh 0.1.0 released - A Fast R6RS Scheme interpreter
Date: 
Message-ID: <07ab381a-6410-44cf-8643-7e830f8ab7d8@x1g2000prh.googlegroups.com>
> Sweet!  That's one fast interpreter!

Thanks.

> Can you do a timing with Chez Petite too, for comparison's sake?  :)

Here is Petite Chez Scheme (threaded version)

% petite
Petite Chez Scheme Version 7.4
Copyright (c) 1985-2007 Cadence Research Systems

> (define (fact n)
            (let f ((n n) (r 1))
               (if (< n 2) r
                   (f (- n 1) (* r n)))))
> (time (and (fact 100000) #t))
(time (and (fact 100000) ...))
    635 collections
    23133 ms elapsed cpu time, including 116 ms collecting
    23429 ms elapsed real time, including 118 ms collecting
    9931612296 bytes allocated, including 9928743496 bytes reclaimed
From: Vend
Subject: Re: ANN: Mosh 0.1.0 released - A Fast R6RS Scheme interpreter
Date: 
Message-ID: <60add991-9df6-4dd9-95f3-a9cc3f621f53@g20g2000vba.googlegroups.com>
On 13 Mag, 17:11, leppie <········@gmail.com> wrote:
> On May 13, 3:56 pm, higepon <·······@gmail.com> wrote:
>
> > % mosh -v
> > Mosh R6RS scheme interpreter, version 0.1.2
> > sewashi% mosh
> > mosh>(define (fact n)
> >              (let f ((n n) (r 1))
> >                (if (< n 2) r
> >                   (f (- n 1) (* r n)))))
> > #<unspecified>
> > mosh>(time (and (fact 100000) #t))
>
> > ;;9.126263 real 8.396524 user 0.676042 sys
> > #t
>
> Sweet!  That's one fast interpreter!
>
> Can you do a timing with Chez Petite too, for comparison's sake?  :)

On my machine Iron Scheme takes about 80 seconds, while Mzscheme
freezes for a long time then returns an error.
From: namekuseijin
Subject: Re: ANN: Mosh 0.1.0 released - A Fast R6RS Scheme interpreter
Date: 
Message-ID: <2512085f-a4d2-4462-a5d8-31601014f539@21g2000vbk.googlegroups.com>
On May 14, 1:06 pm, Vend <······@virgilio.it> wrote:
> On my machine Iron Scheme takes about 80 seconds, while Mzscheme
> freezes for a long time then returns an error.

Welcome to MzScheme v4.0.2 [3m], Copyright (c) 2004-2008 PLT Scheme
Inc.
> (define (fact n)
        (let f ((n n) (r 1))
                (if (< n 2) r
                        (f (- n 1) (* r n)))))
> (fact 7)
5040
> (time (and (fact 100000) #t))
cpu time: 55255 real time: 55639 gc time: 7468
#t

Core 2 Duo here on Windows at work.  I remember I also benchmarked it
at home Linux with a more up-to-date mzscheme.

You're using the latest download?  I believe the PLT guys were
alerting people to report strange behavior with the latest release,
it's going multicore or something and may not be without bugs...