From: Anonymous
Subject: Depth / Breadth Searches for Web
Date: 
Message-ID: <3a9adcf8$1_1@anonymous>
I am trying to create a search agent that can compare the times of different
search methods. I have written one in Java; it is relatively slow, but easy
to implement on the web. I am now trying to develop one in Lisp; does anybody
know how to make it search Internet sites? Any reference materials? Small
Lisp robots?

(PS- I do know about the guidelines for web agents / robots).
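
Roughly the kind of comparison I have in mind, as a sketch in Lisp (FETCH-LINKS
is just a placeholder for whatever page-fetching code ends up being used; only
the frontier handling and the timing matter here):

;;; Sketch only: FETCH-LINKS should take a URL string and return the list
;;; of URL strings linked from that page.
(defun search-web (start fetch-links &key (strategy :breadth) (limit 100))
  "Visit up to LIMIT pages from START.  :DEPTH treats the frontier as a
stack, :BREADTH as a queue.  Returns the visited URLs in order."
  (let ((frontier (list start))
        (visited  '()))
    (loop
      (when (or (null frontier) (>= (length visited) limit))
        (return (nreverse visited)))
      (let ((url (pop frontier)))
        (unless (member url visited :test #'string=)
          (push url visited)
          (let ((links (funcall fetch-links url)))
            (setf frontier
                  (ecase strategy
                    (:depth   (append links frontier))          ; stack
                    (:breadth (append frontier links))))))))))  ; queue

(defun time-search (start fetch-links strategy)
  "Seconds of wall-clock time for one crawl with STRATEGY."
  (let ((begin (get-internal-real-time)))
    (search-web start fetch-links :strategy strategy)
    (/ (- (get-internal-real-time) begin)
       internal-time-units-per-second)))

The comparison itself would then just be (time-search url #'fetch-links :depth)
against (time-search url #'fetch-links :breadth) over the same start page.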




From: Anonymous
Subject: Re: Depth / Breadth Searches for Web
Date: 
Message-ID: <3a9ae63a_2@anonymous>
By the way, the anonymous posting is from me - ·······@cs.arizona.edu

Our ISP automatically posts everything anon. Sorry.


"Anonymous" <·········@anonymous.anonymous> wrote in message
·················@anonymous...
> I am trying to create a search agent that can compare the times of different
> search methods. I have written one in Java; it is relatively slow, but easy
> to implement on the web. I am now trying to develop one in Lisp; does anybody
> know how to make it search Internet sites? Any reference materials? Small
> Lisp robots?
>
> (PS- I do know about the guidelines for web agents / robots).



From: Joe Marshall
Subject: Re: Depth / Breadth Searches for Web
Date: 
Message-ID: <itlxgftx.fsf@content-integrity.com>
Anonymous <·········@anonymous.anonymous> writes:

> By the way, the anonymous posting is from me - ·······@cs.arizona.edu
> 
> Our ISP automatically posts everything anon. Sorry.

Your news server supplies other hosts that do not strip off
identification.  Try mammoth.usenet-access.com



>   --------== Posted Anonymously via Newsfeeds.Com ==-------
>      Featuring the worlds only Anonymous Usenet Server
>     -----------== http://www.newsfeeds.com ==----------


I don't know how to get rid of this annoying micro-spam, though.

-----= Posted via Newsfeeds.Com, Uncensored Usenet News =-----
http://www.newsfeeds.com - The #1 Newsgroup Service in the World!
-----==  Over 80,000 Newsgroups - 16 Different Servers! =-----
From: Rainer Joswig
Subject: Re: Depth / Breadth Searches for Web
Date: 
Message-ID: <joswig-5792A5.04421627022001@goliath.news.is-europe.net>
In article <··········@anonymous>,
 Anonymous <·········@anonymous.anonymous> wrote:

> By the way, the anonymous posting is from me - ·······@cs.arizona.edu
> 
> Our ISP automatically posts everything anon. Sorry.
> 
> 
> "Anonymous" <·········@anonymous.anonymous> wrote in message
> ·················@anonymous...
> > I am trying to create a search agent that can compare the times of different
> > search methods. I have written one in Java; it is relatively slow, but easy
> > to implement on the web. I am now trying to develop one in Lisp; does anybody
> > know how to make it search Internet sites? Any reference materials? Small
> > Lisp robots?
> >
> > (PS- I do know about the guidelines for web agents / robots).

CL-HTTP has a constraint-guided web-walker.

http://www.ai.mit.edu/projects/iiip/doc/cl-http/w4/w4.html
From: Anonymous
Subject: Re: Depth / Breadth Searches for Web
Date: 
Message-ID: <3a9c1995_3@anonymous>
I checked out the web-walker site. Good stuff. Do you work with this, or are
you familiar with its implementation? It seems really interesting; thanks for
the tip.


"Rainer Joswig" <······@corporate-world.lisp.de> wrote in message
·································@goliath.news.is-europe.net...
>
> CL-HTTP has a constraint-guided web-walker.
>
> http://www.ai.mit.edu/projects/iiip/doc/cl-http/w4/w4.html



From: Jochen Schmidt
Subject: Re: Depth / Breadth Searches for Web
Date: 
Message-ID: <97eoc2$omkt4$1@ID-22205.news.dfncis.de>
Anonymous wrote:

> I am trying to create a search agent that can compare the times of different
> search methods. I have written one in Java; it is relatively slow, but easy
> to implement on the web. I am now trying to develop one in Lisp; does anybody
> know how to make it search Internet sites? Any reference materials? Small
> Lisp robots?
> 
> (PS- I do know about the guidelines for web agents / robots).

I had an idea on this topic some time ago. It is possibly nothing new, but 
perhaps it helps someone. I would be interested in what others have to say 
about it.

The idea is to develop a herd of agents with different tasks. Some of them 
do little more than try to fetch a whole local site: given a URL, they collect 
all the relatively referenced pages they can reach from that URL. ("Relatively 
referenced" here also includes absolute links that point to an address on the 
same host and are therefore relative in practice.)
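
A very rough sketch of what such a collector might do (the names EXTRACT-HREFS
and LOCAL-LINK-P are made up, and the href scanning is deliberately naive - a
real agent would want a proper HTML parser):

;;; Sketch with made-up names; stands in for real HTML parsing.
(defun extract-hrefs (html)
  "Return the href=\"...\" targets found in the string HTML."
  (let ((results '())
        (start 0))
    (loop
      (let ((pos (search "href=\"" html :start2 start :test #'char-equal)))
        (unless pos (return (nreverse results)))
        (let* ((begin (+ pos (length "href=\"")))
               (end   (position #\" html :start begin)))
          (unless end (return (nreverse results)))
          (push (subseq html begin end) results)
          (setf start (1+ end)))))))

(defun local-link-p (link host)
  "True if LINK is a relative reference, or an absolute one on HOST."
  (let ((scheme-end (search "://" link)))
    (or (null scheme-end)                   ; relative reference
        (let ((from (+ scheme-end 3)))      ; absolute: compare the host part
          (string-equal host link
                        :start2 from
                        :end2 (min (length link)
                                   (+ from (length host))))))))

The relative links get resolved against the base URL and handed back to the
walker; the absolute links that fail LOCAL-LINK-P are the candidates for other
agents.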

Now imagine another agent that does not search through the pages directly, 
but whose task is to concatenate all the pages of one site (at least 
logically). Now we take a synonym database such as WordNet and search the 
concatenated site for direct hits of the search words or their synonyms. 
Then we compute a discrete relevance score over all the words of the 
concatenated site: whenever we find a search word we get a "spike" (the more 
closely related to a direct search word, the higher), and we also raise the 
function values in the neighborhood of that spike with a Gaussian curve. 
After doing this for the whole site, we look at the parts of the score that 
lie above some chosen threshold and check whether they contain external 
links. If they do, the agent selects one of the free collector agents and 
gives it the task of fetching the found URL.
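
In code the scoring could look roughly like this (again the names are invented;
SEARCH-WORD-P stands in for the direct-hit-or-synonym test, e.g. a WordNet
lookup, and WORDS is a vector of the words of the concatenated site):

;;; Sketch with made-up names; SEARCH-WORD-P hides the synonym lookup.
(defun relevance-profile (words search-word-p &key (sigma 10))
  "One score per word: every hit adds a spike of height 1 at its position,
spread over the neighborhood by a Gaussian of width SIGMA."
  (let* ((n (length words))
         (scores (make-array n :element-type 'double-float
                               :initial-element 0d0)))
    (dotimes (i n scores)
      (when (funcall search-word-p (aref words i))
        (loop for j from (max 0 (- i (* 3 sigma)))
              below (min n (+ i (* 3 sigma) 1))
              do (incf (aref scores j)
                       (exp (- (/ (expt (- j i) 2)
                                  (* 2.0d0 sigma sigma))))))))))

(defun hot-spots (scores &key (threshold 0.5d0))
  "Indices whose score lies above THRESHOLD."
  (loop for i below (length scores)
        when (> (aref scores i) threshold) collect i))

The external links that fall near the indices returned by HOT-SPOTS would be
the ones handed to a free collector agent.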

Another agent may analyze what other search engines return for the search 
words. One agent may search through Usenet and another may search IRC. All 
the agents can run on completely different servers on completely different 
LANs, which probably gives them a huge combined bandwidth.

This is maybe not really new, but I would like to hear what others think 
of it.

Regards,
Jochen

From: Anonymous
Subject: Re: Depth / Breadth Searches for Web
Date: 
Message-ID: <3a9aea8b_2@anonymous>
I have actually thought of something similar. I would like to create a series
of goal-oriented autonomous agents that can perform memory dumps on my machine
to help keep bandwidth down (plus I have my own sites to test on). However,
since most people communicate differently (as in the typical structure of
their sentences, i.e. the way they structure sentences through verb phrases,
noun phrases, etc.), they are likely to want to conduct searches differently.
So if the agents could use linguistic features to 'learn' the user's
speech-formation patterns, they could look for spikes based on an individual
user's tendencies. I haven't really considered the value of using a Gaussian
distribution curve, but it would be interesting.

The other idea would be to create a master agent, one that tracks all of the
individual agents and groups them by 'personality': an agent team creates a
'person' and has 'learned' their grammatical tendencies. Then similar 'people'
are grouped under a master-agent 'personality', perhaps to give the 'learning'
of an agent group more of an advantage. Far-fetched, I know, but I have some
ideas of how to make this happen, and I have had limited success with my Java
agents. I feel that the language is hurting me, though, because it is not the
best language for dealing with lists.
(·······@cs.arizona.edu)


"Jochen Schmidt" <···@dataheaven.de> wrote in message
···················@ID-22205.news.dfncis.de...
> Then we compute a discrete relevance score over all the words of the
> concatenated site: whenever we find a search word we get a "spike" (the
> more closely related to a direct search word, the higher), and we also
> raise the function values in the neighborhood of that spike with a
> Gaussian curve.


