Just read this blog entry:
http://www.method-combination.net/blog/archives/2007/09/29/wide-finder-in-lisp.html
Essentially, the fellow implemented Bray's Wide Finder
(http://www.tbray.org/ongoing/When/200x/2007/09/20/Wide-Finder) in
Common Lisp. The remarkable thing is that the Lisp version runs about
as fast (or as slowly) as the Ruby version. This is kinda surprising,
given that Ruby is the canonically slow language and that SBCL is
compiled. I figured it'd be an interesting exercise in optimisation.
The code is below:
(defun run (&rest logs)
(let ((counts (make-hash-table :test #'equal)))
(dolist (filename logs)
(with-open-file (stream filename :direction :input
:external-format :latin-1)
(loop for line = (read-line stream nil stream)
until (eq line stream)
do (cl-ppcre:register-groups-bind
(match)
("GET /ongoing/When/\\d{3}x/(\\d{4}/\\d{2}/\\d{2}/[^ .]+) " line)
(incf (gethash match counts 0))))))
(loop for key being the hash-keys of counts
collect key into keys
finally (map nil #'(lambda (x)
(format t "~D: ~A~%" (gethash x counts) x))
(subseq (sort keys #'>
:key #'(lambda (x) (gethash x counts)))
0 10)))))
Looking at it, it's majorly straightforward: initialise a hash table;
read a file line-by-line; look for a regexp and increment the match's
hash table entry when found; later, print everything out.
In fact, I can't really think how to improve it. Is it just the case
that SBCL's IO is slow? I think I noticed the same thing when working
on some SPOJ entries--I couldn't read files in the time given, which was
kinda annoying.
--
Robert Uhl <http://public.xdi.org/=ruhl>
Very often many things are said by the Holy Scriptures and in it many
names are used not in a literal sense...those who have minds
understand this. --St. Isaac the Syrian
Robert Uhl <·········@NOSPAMgmail.com> wrote:
> Just read this blog entry:
>
> http://www.method-combination.net/blog/archives/2007/09/29/wide-finder-in-lisp.html
>
> Essentially, the fellow implemented Bray's Wide Finder
> (http://www.tbray.org/ongoing/When/200x/2007/09/20/Wide-Finder) in
> Common Lisp. The remarkable thing is that the Lisp version runs about
> as fast (or as slowly) as the Ruby version. This is kinda surprising,
> given that Ruby is the canonically slow language and that SBCL is
> compiled.
It doesn't seem particularily surprising to me, considering that
reading in log files and matching the lines to regexps is a bread and
butter use case of scripting languages. Of course a scripting language
is going to be optimized for that case. On this program I expect
practically all the time will be spent in libraries written in C, with
very little time spent actually executing Ruby.
> In fact, I can't really think how to improve it. Is it just the
> case that SBCL's IO is slow?
As far as I know the performance of READ-LINE in SBCL is pretty good
when compared to other Lisp implementations. But it is fundamentally
going to be slower than just reading in a bunch of raw bytes in C,
since the character IO pipeline will end up copying the data around
multiple times between buffers, transcoding from the native external
format to the internal utf-32 presentation, etc.
--
Juho Snellman
Robert Uhl <·········@NOSPAMgmail.com> writes:
> (defun run (&rest logs)
> (let ((counts (make-hash-table :test #'equal)))
> (dolist (filename logs)
> (with-open-file (stream filename :direction :input
> :external-format :latin-1)
> (loop for line = (read-line stream nil stream)
> until (eq line stream)
> do (cl-ppcre:register-groups-bind
> (match)
> ("GET /ongoing/When/\\d{3}x/(\\d{4}/\\d{2}/\\d{2}/[^ .]+) " line)
> (incf (gethash match counts 0))))))
Looks like you're building the RegEx for every line in the file. Try
building it before, it might make a huge difference.
On 2 öÏ×, 07:34, Robert Uhl <·········@NOSPAMgmail.com> wrote:
> Just read this blog entry:
>
> http://www.method-combination.net/blog/archives/2007/09/29/wide-finde...
>
> Essentially, the fellow implemented Bray'sWideFinder
> (http://www.tbray.org/ongoing/When/200x/2007/09/20/Wide-Finder) in
> Common Lisp. The remarkable thing is that the Lisp version runs about
> as fast (or as slowly) as the Ruby version. This is kinda surprising,
> given that Ruby is the canonically slow language and that SBCL is
> compiled. I figured it'd be an interesting exercise in optimisation.
According to the latest benchmarks (as of 22 Nov, 07:40 UTC),
the Perl version is the fastest.[1] It just interesting to see if it
is
possible to optimize the Common Lisp version similarly to the
optimizations
done in the fastest implementations.
[1] http://www.tbray.org/ongoing/When/200x/2007/10/30/WF-Results
With best regards,
Victor
http://vityok.org.ua
På Thu, 22 Nov 2007 08:43:44 +0100, skrev Vityok <······@ua.fm>:
> On 2 Жов, 07:34, Robert Uhl <·········@NOSPAMgmail.com> wrote:
>> Just read this blog entry:
>>
>> http://www.method-combination.net/blog/archives/2007/09/29/wide-finde...
>>
>> Essentially, the fellow implemented Bray'sWideFinder
>> (http://www.tbray.org/ongoing/When/200x/2007/09/20/Wide-Finder) in
>> Common Lisp. The remarkable thing is that the Lisp version runs about
>> as fast (or as slowly) as the Ruby version. This is kinda surprising,
>> given that Ruby is the canonically slow language and that SBCL is
>> compiled. I figured it'd be an interesting exercise in optimisation.
>
> According to the latest benchmarks (as of 22 Nov, 07:40 UTC),
> the Perl version is the fastest.[1] It just interesting to see if it
> is
> possible to optimize the Common Lisp version similarly to the
> optimizations
> done in the fastest implementations.
>
> [1] http://www.tbray.org/ongoing/When/200x/2007/10/30/WF-Results
>
> With best regards,
>
> Victor
>
> http://vityok.org.ua
Seems the I/O is the culprit. CL's stream libraries are inefficient. I
think CLISP has the fastest streams and thus provides the best
performance. (I'm not sure I mostly use LispWorks)
The best "optimation" is to set *print-pretty* to nil. (Many
implementations have it nil per default.)
There is a flaw in the design of the pretty printer that makes common
operations very slow.
Appart from that if you are using clsql then when connecting use :pool t
:if-exists :old to pool connections and reuse them rather than make new
all the time. A connection to MySQL will probaly run faster than one to
pgsql. As you probaly gathered the speed of the libraries are more
importand than the speed of the compiled code in most CRUD (Create Read
Update Delete) App's.
--------------
John Thingstad