From: Xah Lee
Subject: what are the most frequently used functions?
Date: 
Message-ID: <1162020254.660829.124900@e3g2000cwe.googlegroups.com>
I had an idea today.

I wanted to know which functions are used most frequently in the
emacs lisp language. I thought I could write a quick script that goes
through all the elisp library locations and gets the word-frequency
report I want.

I started with a simple program:
http://xahlee.org/p/titus/count_word_frequency.py

and applied it to a Shakespeare text. Here's a sample result:
http://xahlee.org/p/titus/word_frequency.html

Then I wrote a more elaborate one that recurses through directories to
work on the elisp code treasury.

The code is here:
http://xahlee.org/x/count_word_frequency.py

and I got a strange result. The word “the” appeared at the top,
along with many other English words. I quickly realized that these
come from the lisp functions' doc strings (not comments).
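The effect is easy to reproduce. What follows is only a minimal sketch of a naive word-frequency counter, not the script linked above; the names `count_words_in_text` and `count_words_in_tree`, the `.el` suffix default, and the token pattern are all assumptions made for illustration. Because it has no notion of strings or comments, every word of a doc string is counted like code.

```python
import os
import re
from collections import Counter

# Assumed token pattern: a letter followed by letters, digits, or hyphens.
WORD = re.compile(r"[A-Za-z][A-Za-z0-9-]*")

def count_words_in_text(text):
    """Count lowercased word-like tokens, ignoring all syntax."""
    return Counter(w.lower() for w in WORD.findall(text))

def count_words_in_tree(root, suffix=".el"):
    """Sum the counts over every file under root whose name ends with suffix."""
    total = Counter()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(suffix):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="replace") as f:
                    total.update(count_words_in_text(f.read()))
    return total

# The doc string contributes "the" twice, as often as the real call to car.
sample = '(defun f (x) "Return the car of the list X." (car x))'
print(count_words_in_text(sample).most_common(3))
```

On the sample line, “the” is counted as often as `car`, which is the problem described above in miniature.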

At this point, it dawned on me that there's no easy way to work around
this, unless I write the script in elisp, which has functions that
read lisp code and can easily filter out doc strings.

Originally, I planned to use the word-frequency script on Perl, Python,
and Java as well as Elisp. However, it now seems to me that this
task is nigh impossible. Each of these languages has its own doc string
syntax. It's going to be a heavy undertaking if the word-frequency
script is to work with all of them, since that amounts to writing a
parser for each language.

Alternatively, one can write a separate word-frequency script in each
language in question, since most languages have facilities for dealing
with their own syntax. However, this is still not trivial, and amounts
to several programming efforts.

Would anyone be interested in this problem?

PS bpalmer on #emacs irc.freenode wrote an elisp quickie to deal with
lisp, but that program is currently not fully working... see the bottom of
http://paste.lisp.org/display/28840

  Xah
  ···@xahlee.org
∑ http://xahlee.org/

From: Barry Margolin
Subject: Re: what are the most frequently used functions?
Date: 
Message-ID: <barmar-9B3FB2.10403928102006@comcast.dca.giganews.com>
In article <························@e3g2000cwe.googlegroups.com>,
 "Xah Lee" <···@xahlee.org> wrote:

> I had a idea today.
> 
> I wanted to know what are the top most frequently used functions in the
> emacs lisp language. I thought i can write a quick script that go thru
> all the elisp library locations and get a word-frequency report i want.
> 
> I started with a simple program:
> http://xahlee.org/p/titus/count_word_frequency.py
> 
> and applied it to a Shakespeare text. Here's a sample result:
> http://xahlee.org/p/titus/word_frequency.html
> 
> Then, i wrote a more elaborate one that recurse thru directories to
> work on elisp code treasury.
> 
> The code is here:
> http://xahlee.org/x/count_word_frequency.py
> 
> and i got a strange result. The word “the” appeared on the top,
> along with many other English words. I quickly realized that these are
> due to lisp function's doc strings. (not comments)
> 
> At this point, it dawned on me that there's no easy way to work around
> this, Unless, i write this script in elisp which has functions that
> read lisp code and can easily filter out doc strings.

For Lisp, just look for symbols that are immediately preceded by ( or 
#'.  The tokens after ( are not always functions, since this is also 
used for constructing literal lists and for subforms of special 
operators (e.g. the variable names in LET bindings) but I think the ones 
that aren't functions will have low enough frequency that they won't 
impact the results.
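As a rough sketch of this heuristic (in Python, since that is what the original script uses; the pattern and the name `count_heads` are assumptions, not anyone's actual code), one regular expression picks up the symbol that directly follows `(` or `#'`:

```python
import re
from collections import Counter

# A run of symbol characters immediately after "(" or "#'".
# The excluded characters are an assumed approximation of what ends a symbol.
HEAD = re.compile(r"""(?:\(|#')([^\s()'"`,;#]+)""")

def count_heads(source):
    """Count symbols in head position; strings and comments are not stripped."""
    return Counter(HEAD.findall(source))

sample = '(defun f (x) "Return the car of X." (mapcar #\'car (list x x)))'
counts = count_heads(sample)
print(counts["mapcar"], counts["car"], counts["the"], counts["x"])
```

The doc string words drop out because none of them follows a parenthesis. The parameter list `(x)` is still counted, which is the kind of low-frequency false positive mentioned above; a `(` inside a string or comment would be counted too.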

Perl would be harder, I think.  For ordinary function calls you can look 
for a word followed by (, but built-in functions allow use without 
parentheses around the parameters.
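A small sketch of why the same trick undercounts in Perl, again in Python and with an assumed pattern and an assumed helper name `count_paren_calls`: it only sees a word followed by `(`, so built-ins written without parentheses are missed.

```python
import re
from collections import Counter

# A word (optionally package-qualified with ::) followed by "(".
CALL = re.compile(r"\b([A-Za-z_][A-Za-z0-9_:]*)\s*\(")

def count_paren_calls(perl_source):
    """Count words directly followed by an opening parenthesis."""
    return Counter(CALL.findall(perl_source))

# print, sort and keys are called here without parentheses.
sample = 'print join(", ", sort keys %h); push(@a, length($s));'
counts = count_paren_calls(sample)
print(counts["join"], counts["push"], counts["length"], counts["print"])
```

Here `join`, `push` and `length` are found, while `print`, `sort` and `keys` are not. The pattern would also count keywords such as `if (` or `while (`, so a real script would need a list of words to skip.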

-- 
Barry Margolin, ······@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
From: Xah Lee
Subject: Re: what are the most frequently used functions?
Date: 
Message-ID: <1162101329.856825.186110@e64g2000cwd.googlegroups.com>
Barry Margolin wrote:
« For Lisp, just look for symbols that are immediately preceded by (
...»

Thanks a lot! Great thought.

I've done so, and it counts satisfactorily.
http://xahlee.org/emacs/function-frequency.html

Will take a break and think about Perl, Python, Java later...  For
Python and Java, I think the report will also have to count method
calls, since that's what these languages deal with... slightly more
complex than with purely functional languages...
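For Python at least, the language's own facilities make this fairly short. A minimal sketch using the standard `ast` module follows; the function name `count_calls` and the convention of prefixing method names with a dot are assumptions for illustration. Since it works on the parse tree, doc strings are never mistaken for calls.

```python
import ast
from collections import Counter

def count_calls(source):
    """Count plain function calls by name and method calls by attribute name."""
    counts = Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):
                counts[func.id] += 1            # e.g. len(xs)
            elif isinstance(func, ast.Attribute):
                counts["." + func.attr] += 1    # e.g. xs.append(...)
    return counts

sample = (
    "def f(xs):\n"
    '    """Return the sorted list after the append."""\n'
    "    xs.append(len(xs))\n"
    "    return sorted(xs)\n"
)
print(count_calls(sample))
```

This only records the method name, not the class it belongs to, since that is generally not known without running or type-checking the code.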

  Xah
  ···@xahlee.org
∑ http://xahlee.org/
From: robert
Subject: Re: what are the most frequently used functions?
Date: 
Message-ID: <ehvevg$614$1@news.albasani.net>
Xah Lee wrote:
> I had a idea today.
> 
> I wanted to know what are the top most frequently used functions in the
> emacs lisp language. I thought i can write a quick script that go thru
> all the elisp library locations and get a word-frequency report i want.
> 
> I started with a simple program:
> http://xahlee.org/p/titus/count_word_frequency.py
> 
> and applied it to a Shakespeare text. Here's a sample result:
> http://xahlee.org/p/titus/word_frequency.html
> 
> Then, i wrote a more elaborate one that recurse thru directories to
> work on elisp code treasury.
> 
> The code is here:
> http://xahlee.org/x/count_word_frequency.py
> 
> and i got a strange result. The word “the” appeared on the top,
> along with many other English words. I quickly realized that these are
> due to lisp function's doc strings. (not comments)

It would be interesting to see whether the type-checking "The" in lisp is still frequent. I doubt it.

> At this point, it dawned on me that there's no easy way to work around
> this, Unless, i write this script in elisp which has functions that
> read lisp code and can easily filter out doc strings.
> 
> Originally, i planned to use the word-frequency script on Perl, Python,
> as well as Java, as well as Elisp. However, now it seems to me this
> task is nigh impossible. Each of these lang has their own doc string
> syntax. It's gonna be a heavy undertaking if the word-frequency script
> is to work with all these langs, since that amounts to writing a parser
> for each lang.
> 
> Alternatively, one can write multiple word-frequency scripts using each
> lang in question, since most lang has facilities to deal with its own
> syntax. However, this is still not trivial, and amounts to several
> programing efforts.

Editor code (maybe best scintilla/sc1; also check emacs itself, ...) has libraries for colorizing comments in all kinds of programming languages ...

> Anyone would be interested in this problem?

I have a theory that "bad source code" has more if/else/elif/case/switch dispatching statements per number of code words (lines..) than "good code" - independent of the language.

If you can count this ratio and correlate it with, say, an sf-ranking and with languages, that would be highly interesting to me... (if so, drop a pointer in this thread / repeated subject)
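A rough sketch of how such a ratio could be counted, in Python; the keyword set is taken from the list above, while the function name `branch_ratio` and the word pattern are assumptions. It says nothing about whether the theory holds, and it counts words inside comments and strings too.

```python
import re

# Dispatching keywords named in the post above.
BRANCH_WORDS = {"if", "else", "elif", "case", "switch"}

def branch_ratio(source):
    """Return the fraction of word tokens that are dispatching keywords."""
    words = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source)
    if not words:
        return 0.0
    branches = sum(1 for w in words if w in BRANCH_WORDS)
    return branches / len(words)

# Five word tokens (if, x, y, else, y), two of them dispatching keywords.
print(branch_ratio("if x: y = 1\nelse: y = 2"))
```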



-robert
From: Jürgen Exner
Subject: Re: what are the most frequently used functions?
Date: 
Message-ID: <ghH0h.91$pU3.67@trndny08>
> Xah Lee wrote:
>> I had a idea today.

Oh, really? You should mark your calendar and celebrate the day annually!!!

>> I wanted to know what are the top most frequently used functions in
>> the emacs lisp language.

And the relationship with Perl, Python, Java is exactly what?

jue 
From: Barry Margolin
Subject: Re: what are the most frequently used functions?
Date: 
Message-ID: <barmar-969446.10342928102006@comcast.dca.giganews.com>
In article <···············@trndny08>,
 "Jürgen Exner" <········@hotmail.com> wrote:

> > Xah Lee wrote:
> >> I had a idea today.
> 
> Oh, really? You should mark your calendar and celebrate the day annually!!!
> 
> >> I wanted to know what are the top most frequently used functions in
> >> the emacs lisp language.
> 
> And the relationship with Perl, Python, Java is exactly what?
> 
> jue 

His script is written in Python, and he wrote:

> Originally, i planned to use the word-frequency script on Perl, Python,
> as well as Java, as well as Elisp.

which makes it on-topic for those groups.

-- 
Barry Margolin, ······@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
From: robert
Subject: Re: what are the most frequently used functions?
Date: 
Message-ID: <ehvh1b$8q8$1@news.albasani.net>
Jürgen Exner wrote:
>> Xah Lee wrote:
>>> I had a idea today.
> 
> Oh, really? You should mark your calendar and celebrate the day annually!!!
> 
>>> I wanted to know what are the top most frequently used functions in
>>> the emacs lisp language.
> 
> And the relationship with Perl, Python, Java is exactly what?

read more of the context and answer to the OP
From: Dr.Ruud
Subject: Re: what are the most frequently used functions?
Date: 
Message-ID: <ehvrno.1d4.1@news.isolution.nl>
robert wrote:

> read more of the context and answer to the OP

That OP is invisible in most relevant contexts.

-- 
Affijn, Ruud

"Gewoon is een tijger."