I'm trying to write a function that will take a string and
return the words in the string as a list of symbols, excluding
nonsense characters and punctuation marks. e.g.
==> (string-to-symbol "This is a sentence with some *&^%$ character's")
(THIS IS A SENTENCE WITH SOME CHARACTER S)
It seems like something that you can do with
read-from-string, but I can't figure out how.
Any help would be greatly appreciated.
Ken
--
________________________________________________________________________
while she died, while she died,
i listened to women singing.
your words then fell into me,
like stones raining on the ground. -- LBH
________________________________________________________________________
In article <············@news.fas.harvard.edu>,
Kenneth Liu <·····@fas.harvard.edu> wrote:
>I'm trying to write a function that will take a string and
>return the words in the string as a list of symbols, excluding
>nonsense characters and punctuation marks. e.g.
>
>==> (string-to-symbol "This is a sentence with some *&^%$ character's")
>
>(THIS IS A SENTENCE WITH SOME CHARACTER S)
>
>It seems like something that you can do with
>read-from-string, but I can't figure out how.
To do it with READ-FROM-STRING, you would have to create a readtable where
all the punctuation characters have the syntax of #\space.
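
A minimal sketch of the readtable idea, under some assumptions: the names MAKE-LETTERS-ONLY-READTABLE and READ-WORDS are made up for illustration, and instead of listing punctuation characters one by one it gives every non-letter ASCII character (digits included) the syntax of a space. It calls READ in a loop rather than READ-FROM-STRING on a parenthesized copy, since the parentheses are whitespace in this readtable too.

```lisp
;; Hypothetical helper: copy the standard readtable and make every
;; non-letter ASCII character behave like whitespace.
(defun make-letters-only-readtable ()
  "Return a readtable in which non-letter ASCII characters act as whitespace."
  (let ((table (copy-readtable nil)))          ; NIL means the standard readtable
    (loop for code from 0 below 128
          do (let ((char (code-char code)))
               ;; CODE-CHAR may return NIL for codes with no character.
               (when (and char (not (alpha-char-p char)))
                 (set-syntax-from-char char #\Space table))))
    table))

;; Hypothetical entry point: read tokens until end of string.
(defun read-words (string)
  "Return the runs of letters in STRING as a list of symbols."
  (let ((*readtable* (make-letters-only-readtable))
        (*read-eval* nil))                     ; defensive; # is whitespace here anyway
    (with-input-from-string (in string)
      ;; The stream object itself serves as a unique end-of-file marker.
      (loop for token = (read in nil in)
            until (eq token in)
            collect token))))

(read-words "This is a sentence with some *&^%$ character's")
```

The symbols are interned in whatever package is current at the time of the call, and characters outside ASCII keep their usual syntax. Building the readtable once and reusing it would avoid the per-call copy.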
Or you could scan through the string for punctuation characters, and call
READ-FROM-STRING repeatedly on each substring you identify, or just call
(intern (string-upcase (subseq string start end))) directly.
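
The scanning approach could look roughly like the sketch below. The function name SPLIT-INTO-SYMBOLS is hypothetical, and it assumes a "word" is simply a maximal run of characters satisfying ALPHA-CHAR-P. It takes the substring with SUBSEQ before upcasing, because STRING-UPCASE with :START and :END only limits which part is converted and still returns a string as long as its argument.

```lisp
;; Hypothetical name; a word is assumed to be a maximal run of letters.
(defun split-into-symbols (string)
  "Return the runs of letters in STRING as a list of interned symbols."
  (let ((symbols '())
        (start (position-if #'alpha-char-p string)))
    (loop while start
          do (let ((stop (or (position-if-not #'alpha-char-p string :start start)
                             (length string))))
               ;; Intern the upcased word in the current package.
               (push (intern (string-upcase (subseq string start stop))) symbols)
               ;; Find the beginning of the next word, if there is one.
               (setf start (position-if #'alpha-char-p string :start stop))))
    (nreverse symbols)))

(split-into-symbols "This is a sentence with some *&^%$ character's")
```

Since the reader is never involved, nothing in the input can trigger reader macros, and digits are dropped along with the punctuation.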
--
Barry Margolin, ······@bbnplanet.com
GTE Internetworking, Powered by BBN, Cambridge, MA
Support the anti-spam movement; see <http://www.cauce.org/>
Please don't send technical questions directly to me, post them to newsgroups.
Kenneth Liu wrote:
> I'm trying to write a function that will take a string and
> return the words in the string as a list of symbols, excluding
> nonsense characters and punctuation marks. e.g.
>
> ==> (string-to-symbol "This is a sentence with some *&^%$ character's")
Actually, "characters" shouldn't have an apostrophe in the above sentence.
dave
You (Kenneth Liu <·····@fas.harvard.edu>) wrote
in newsgroup comp.lang.lisp, on 24 Dec 1997 15:24:35 GMT:
>
> I'm trying to write a function that will take a string and
> return the words in the string as a list of symbols, excluding
> nonsense characters and punctuation marks. e.g.
>
> ==> (string-to-symbol "This is a sentence with some *&^%$ character's")
>
> (THIS IS A SENTENCE WITH SOME CHARACTER S)
>
> It seems like something that you can do with
> read-from-string, but I can't figure out how.
>
> Any help would be greatly appreciated.
>
(defun string-to-list (string)
  "Convert a string to a list of words."
  (read-from-string (concatenate 'string "(" string ")")))

(defun remove-punctuation (string)
  "Replace a string's punctuation characters with spaces."
  (substitute-if #\space #'punctuation-p string))

(defun punctuation-p (char)
  (find char "*&^%$"))

(string-to-list (remove-punctuation "This is a sentence with some *&^%$ character's"))
==> (THIS IS A SENTENCE WITH SOME CHARACTER 'S) ;
==> 48
As Dave correctly pointed out, "characters" shouldn't have an apostrophe in the
above sentence. The apostrophe is not in the punctuation set, so the reader
treats it as a quote (see the 'S in the result above).
Hope this helps.
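
If the apostrophe (and anything else that is not a letter) should be dropped as well, one possible variation is sketched below. The names NON-LETTER-P and LETTERS-ONLY-WORD-LIST are made up for illustration, and treating every non-letter, digits included, as punctuation is an assumption rather than something the original question specifies.

```lisp
;; Hypothetical predicate: anything that is not a letter counts as punctuation.
(defun non-letter-p (char)
  (not (alpha-char-p char)))

;; Hypothetical variant of the wrap-in-parentheses trick.
(defun letters-only-word-list (string)
  "Return the words of STRING as a list of symbols, ignoring non-letters."
  (let ((*read-eval* nil))   ; defensive; #\# is already replaced by a space
    ;; VALUES keeps only the list and drops the position returned
    ;; as the second value of READ-FROM-STRING.
    (values
     (read-from-string
      (concatenate 'string
                   "("
                   (substitute-if #\Space #'non-letter-p string)
                   ")")))))

(letters-only-word-list "This is a sentence with some *&^%$ character's")
```

Because parentheses, quotes and backslashes in the input are replaced before the reader sees them, stray delimiters in the string cannot unbalance the list.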
________________________________________________________________
Valentino Kyriakides Universitaet Hamburg, FB. Informatik
Arbeitsbereich Softwaretechnik
________________________________________________________________
E-Mail: ········@informatik.uni-hamburg.de (ASCII, MIME)
NeXTmail: ····················@public.uni-hamburg.de
________________________________________________________________
Thanks to everyone who responded. You've all been very helpful, and
sorry about the "character's" goof. I'm usually much better at
grammar (considering I'm an English major;)
Ken