From: Mark Conrad
Subject: How to dig up all old newsgroup posts?
Date: 
Message-ID: <010420031028290690%nospam@iam.invalid>
Excuse this dumb question, but I do not know how to go about digging up
all the old newsgroup postings of this newsgroup.

I assume Google is used, and I assume I better allow lots of storage
and time for the download.

Can anyone suggest how I should word my Google request, and should I
quote some of the terms in the request?

My reason for doing this is to cull out those posts in the past that
interest me, and make some sort of huge final file containing all the
posts of interest.

I am trying to get on board regarding Common Lisp, and I find the older
posts to have very high educational value, far exceeding what can be
found in the usual Lisp books that I have access to.

Thanks for any and all suggestions as to exactly how I should use
Google for my 'learning-project'.

Mark-

From: Nils Goesche
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <87pto6vz4q.fsf@darkstar.cartan>
Mark Conrad <······@iam.invalid> writes:

> Excuse this dumb question, but I do not know how to go about
> digging up all the old newsgroup postings of this newsgroup.
> 
> I assume Google is used, and I assume I better allow lots of
> storage and time for the download.

When you are through with all of some 79000 comp.lang.lisp
postings you'll be ripe for an asylum, I guess.

Go to http://groups.google.com and click on ``Advanced Group
Search''.

Regards,
-- 
Nils Gösche
Ask not for whom the <CONTROL-G> tolls.

PGP key ID #xD26EF2A0
From: Nils Goesche
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <87llyuvyxw.fsf@darkstar.cartan>
Nils Goesche <···@cartan.de> writes:

> .. some 79000 comp.lang.lisp postings

Make that: ``some 79000 /threads/''!!

You have been warned.

Regards,
-- 
Nils Gösche
Ask not for whom the <CONTROL-G> tolls.

PGP key ID #xD26EF2A0
From: Mark Conrad
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <010420032206556504%nospam@iam.invalid>
In article <··············@darkstar.cartan>, Nils Goesche
<···@cartan.de> wrote:

> > Excuse this dumb question, but I do not know how to go about
> > digging up all the old newsgroup postings of this newsgroup.
> > 
> > I assume Google is used, and I assume I better allow lots of
> > storage and time for the download.
> 
> When you are through with all of some 79000 comp.lang.lisp
> postings you'll be ripe for an asylum, I guess.
> 
> Go to http://groups.google.com and click on ``Advanced Group
> Search''.


Wow, did not realize there were that many threads..

Thanks for the info' about getting started with the Google search.

Mark-
From: Peter Seibel
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <m3pto5to7k.fsf@localhost.localdomain>
Mark Conrad <······@iam.invalid> writes:

> Wow, did not realize there were that many threads..

And be sure, once you're done with comp.lang.lisp that you then deal
with all the net.lang.lisp postings. ;-) You only have 21 years of
catching up to do.

Here's a starting point. Perhaps *the* starting point:

<http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&selm=anews.Aucbarpa.997>

-Peter

-- 
Peter Seibel                                      ·····@javamonkey.com

  The intellectual level needed   for  system design is  in  general
  grossly  underestimated. I am  convinced  more than ever that this
  type of work is very difficult and that every effort to do it with
  other than the best people is doomed to either failure or moderate
  success at enormous expense. --Edsger Dijkstra
From: Mark Conrad
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <020420030853481188%nospam@iam.invalid>
In article <··············@localhost.localdomain>, Peter Seibel
<·····@javamonkey.com> wrote:

> > Wow, did not realize there were that many threads..
> 
> And be sure, once you're done with comp.lang.lisp that you then deal
> with all the net.lang.lisp postings. ;-) You only have 21 years of
> catching up to do.

You got that right.

It might be worth my effort to spend time trying to automate my
preferences as to what posts to save versus what posts to toss out.

That may not be possible, however, given the present state of the art
in getting software to "guess" what might be of interest to me.

Lots of times pearls of wisdom are buried in the text of a post, hard
for a human to extract, let alone a computer script.
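
A crude first pass at the automation idea above might be plain keyword matching on posts already saved to disk. The following is only a minimal sketch under stated assumptions: the saved archive is one text file in which every post begins with a "From: " line, and the function names (split_posts, interest_score, select_posts) and the keyword list are hypothetical. It will certainly miss the buried pearls, so treat it as a way to shrink the pile, not to judge it.

```python
def split_posts(archive_text):
    """Split a saved archive into posts.

    Assumption: every post starts with a line beginning 'From: '.
    """
    posts, current = [], []
    for line in archive_text.splitlines():
        if line.startswith("From: ") and current:
            posts.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        posts.append("\n".join(current))
    return posts


def interest_score(post, keywords):
    """Count case-insensitive keyword hits, skipping quoted ('>') lines.

    Header lines such as Subject: are counted as well.
    """
    own_lines = [ln for ln in post.splitlines()
                 if not ln.lstrip().startswith(">")]
    text = "\n".join(own_lines).lower()
    return sum(text.count(k.lower()) for k in keywords)


def select_posts(archive_text, keywords, threshold=1):
    """Keep the posts whose score reaches the threshold."""
    return [p for p in split_posts(archive_text)
            if interest_score(p, keywords) >= threshold]
```

Calling select_posts(text, ["macro", "CLOS"]) would keep only the posts whose own, unquoted text mentions either term at least once; raising the threshold makes the filter stricter.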

Now, a puzzle for anyone out there  ;-)

How many "extremely valuable" old posts would any given person have to
collect in order to make sure he extracted everything of interest to
himself - - - probably an impossibly large number of posts.

This just might be the age when we should specialize our knowledge,
like the medical profession does.<g>

Mark-
From: Barry Margolin
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <JTmia.11$af5.871@paloalto-snr1.gtei.net>
In article <·························@iam.invalid>,
Mark Conrad  <······@iam.invalid> wrote:
>Excuse this dumb question, but I do not know how to go about digging up
>all the old newsgroup postings of this newsgroup.
>
>I assume Google is used, and I assume I better allow lots of storage
>and time for the download.

I don't think Google provides any way to download all the postings in a
batch.  You'd probably have to write a script that does it.

BTW, I once did something similar to this.  Back when Common Lisp was first
being developed in the early 80's, the discussion took place on an Arpanet
mailing list.  Sometime in the mid-to-late 80's I printed out the archive
of this, and read all the discussions that led to the production of CLTL.

-- 
Barry Margolin, ··············@level3.com
Genuity Managed Services, a Level(3) Company, Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.
From: Henrik Motakef
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <87brzq7xab.fsf@interim.henrik-motakef.de>
Barry Margolin <··············@level3.com> writes:

> BTW, I once did something similar to this.  Back when Common Lisp was first
> being developed in the early 80's, the discussion took place on an Arpanet
> mailing list.  Sometime in the mid-to-late 80's I printed out the archive
> of this, and read all the discussions that led to the production of CLTL.

You don't have this archive still available electronically, by chance?
Or know where one could get it elsewhere?

Regards
Henrik
From: Kent M Pitman
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <sfwznnand00.fsf@shell01.TheWorld.com>
Henrik Motakef <··············@web.de> writes:

> Barry Margolin <··············@level3.com> writes:
> 
> > BTW, I once did something similar to this.  Back when Common Lisp was first
> > being developed in the early 80's, the discussion took place on an Arpanet
> > mailing list.  Sometime in the mid-to-late 80's I printed out the archive
> > of this, and read all the discussions that led to the production of CLTL.
> 
> You don't have this archive still available electronically, by chance?
> Or know where one could get it elsewhere?

Dunno.  Might ask the authors of "Kneejerk Anti-LOOPism and
other Email Phenomena"
 http://ccs.mit.edu/papers/CCSWP150.html
From: Barry Margolin
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <0fnia.15$af5.810@paloalto-snr1.gtei.net>
In article <··············@interim.henrik-motakef.de>,
Henrik Motakef  <··············@web.de> wrote:
>Barry Margolin <··············@level3.com> writes:
>
>> BTW, I once did something similar to this.  Back when Common Lisp was first
>> being developed in the early 80's, the discussion took place on an Arpanet
>> mailing list.  Sometime in the mid-to-late 80's I printed out the archive
>> of this, and read all the discussions that led to the production of CLTL.
>
>You don't have this archive still available electronically, by chance?
>Or know where one could get it elsewhere?

I never had a personal electronic copy, I printed it out (it was several
large binders).  I held on to it across a few office moves, but eventually
I tossed it.  It's presumably on the backup tapes of the old MIT-AI
machine, and maybe someone has those archived somewhere accessible.

From: Pascal Costanza
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <costanza-546E60.02322802042003@news.netcologne.de>
In article <················@paloalto-snr1.gtei.net>,
 Barry Margolin <··············@level3.com> wrote:

> In article <··············@interim.henrik-motakef.de>,
> Henrik Motakef  <··············@web.de> wrote:
> >Barry Margolin <··············@level3.com> writes:
> >
> >> BTW, I once did something similar to this.  Back when Common Lisp was first
> >> being developed in the early 80's, the discussion took place on an Arpanet
> >> mailing list.  Sometime in the mid-to-late 80's I printed out the archive
> >> of this, and read all the discussions that led to the production of CLTL.
> >
> >You don't have this archive still available electronically, by chance?
> >Or know where one could get it elsewhere?
> 
> I never had a personal electronic copy, I printed it out (it was several
> large binders).  I held on to it across a few office moves, but eventually
> I tossed it.  It's presumably on the backup tapes of the old MIT-AI
> machine, and maybe someone has those archived somewhere accessible.

Is http://www.apl.jhu.edu/~hall/lisp/Early-CL-History.text the stuff you are talking about?

(Don't ask me how I found this...)


Pascal

-- 
"If I could explain it, I wouldn't be able to do it."
A.M.McKenzie
From: Barry Margolin
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <NkDia.2$NE1.372@paloalto-snr1.gtei.net>
In article <······························@news.netcologne.de>,
Pascal Costanza  <········@web.de> wrote:
>Is http://www.apl.jhu.edu/~hall/lisp/Early-CL-History.text the stuff you
>are talking about?

No, I was talking about something that filled several hundred sheets of
paper when I printed it 2-up double-sided.

From: Thaddeus L Olczyk
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <j7pk8voi9t4unvoipe2v2m67i7d4ou9hc7@4ax.com>
On Tue, 01 Apr 2003 21:10:20 GMT, Barry Margolin
<··············@level3.com> wrote:

>In article <··············@interim.henrik-motakef.de>,
>Henrik Motakef  <··············@web.de> wrote:
>>Barry Margolin <··············@level3.com> writes:
>>
>>> BTW, I once did something similar to this.  Back when Common Lisp was first
>>> being developed in the early 80's, the discussion took place on an Arpanet
>>> mailing list.  Sometime in the mid-to-late 80's I printed out the archive
>>> of this, and read all the discussions that led to the production of CLTL.
>>
>>You don't have this archive still available electronically, by chance?
>>Or know where one could get it elsewhere?
>
>I never had a personal electronic copy, I printed it out (it was several
>large binders).  I held on to it across a few office moves, but eventually
>I tossed it.  It's presumably on the backup tapes of the old MIT-AI
>machine, and maybe someone has those archived somewhere accessible.
It's scary what kinds of things are archived from old systems.
One shudders at future CS archeologists plumbing the depths
of newly discovered archives, trying to get a clue why CS took this
turn or that.
--------------------------------------------------
Thaddeus L. Olczyk, PhD
Think twice, code once.
From: Kent M Pitman
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <sfwd6k55nup.fsf@shell01.TheWorld.com>
Thaddeus L Olczyk <······@interaccess.com> writes:

> It's scary what kinds of things are archived from old systems.
> One shudders at future CS archeologists plumbing the depths
> of newly discovered archives, trying to get a clue why CS took this
> turn or that.

I am particularly worried about the paper records. They will eventually
fade, tear, or get thrown out.

In the far distant future, looking back in time at the beginning of
computer science will be like looking at the distant end of the
Universe through the Hubble telescope.  By examining early web
archives and Deja News records, we'll be able to get really, really
close to the Big Bang that started it all.  But the last few decades,
from 1994 and before going back another 3 or 4 decades, will be
missing because it was on paper.  So the true nature of the
Informational Big Bang will elude people.  

This is one reason I freely post my imperfect recollections of the
obscure corner of the older times that I was privy to here on the
relevant newsgroup, so there is at least the echo of what happened
from one person's point of view.  And I assume others do likewise here
and elsewhere, to add dimensionality to that kind of record.

But I see a great deal of historical revisionism that comes from
people with fading memories, people with out of control egos, and
people of good intent who are just confused or misled by others'
accounts and who repeat the wrong information until it sounds like
fact.  What is really needed to counter this is not debate but first
hand subject matter.  I've done what I can, when I can afford the time,
to take my own paper records and put them online.  I've seen others do
likewise.  But there is still much to do, and anyone else with
personal sets of hardcopy should do what they can to preserve the
historical record into electronic media for the sake of History.
From: Raymond Toy
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <4nel4lf2vo.fsf@edgedsp4.rtp.ericsson.se>
>>>>> "Kent" == Kent M Pitman <······@world.std.com> writes:

    Kent> fact.  What is really needed to counter this is not debate but first
    Kent> hand subject matter.  I've done what I can, when I can afford the time,
    Kent> to take my own paper records and put them online.  I've seen others do
    Kent> likewise.  But there is still much to do, and anyone else with
    Kent> personal sets of hardcopy should do what they can to preserve the
    Kent> historical record into electronic media for the sake of History.

Even that can be hard.  Technology is changing so fast that electronic
media of today has a good chance of not working in the relatively near
future.  A serious problem that I think people are looking at.

Ray
From: Kent M Pitman
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <sfwd6k4ga24.fsf@shell01.TheWorld.com>
Raymond Toy <···@rtp.ericsson.se> writes:

> >>>>> "Kent" == Kent M Pitman <······@world.std.com> writes:
> 
>     Kent> fact.  What is really needed to counter this is not debate
>     Kent> but first hand subject matter.  I've done what I can, when
>     Kent> I can afford the time, to take my own paper records and
>     Kent> put them online.  I've seen others do likewise.  But there
>     Kent> is still much to do, and anyone else with personal sets of
>     Kent> hardcopy should do what they can to preserve the
>     Kent> historical record into electronic media for the sake of
>     Kent> History.
> 
> Even that can be hard.  Technology is changing so fast that electronic
> media of today has a good chance of not working in the relatively near
> future.  A serious problem that I think people are looking at.

I am willing to trust that anything stored in today's documented media
(e.g., HTML or GIF or PostScript/PDF) will be accessible to future 
historians.

Also, even if they lose the specs, reverse engineering HTML won't be
that difficult, which is why it's my medium of choice.

And, for that matter, web crawlers exhaustively archiving today's internet 
for posterity have probably also accidentally archived some bootlegged
copies of the programs necessary to read PDF, etc. ;)

Which is not to say that creating enduring formats doesn't have a purpose,
but I really think HTML is, accidentally or not, a pretty good one.
From: Barry Margolin
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <%6Eia.10$NE1.389@paloalto-snr1.gtei.net>
In article <···············@shell01.TheWorld.com>,
Kent M Pitman  <······@world.std.com> wrote:
>Raymond Toy <···@rtp.ericsson.se> writes:
>> Even that can be hard.  Technology is changing so fast that electronic
>> media of today has a good chance of not working in the relatively near
>> future.  A serious problem that I think people are looking at.
>
>I am willing to trust that anything stored in today's documented media
>(e.g., HTML or GIF or PostScript/PDF) will be accessible to future 
>historians.
>
>Also, even if they lose the specs, reverse engineering HTML won't be
>that difficult, which is why it's my medium of choice.

I don't think it's the document format that's the main problem, it's the
storage technology.  I think you'd have to go on a long hunt to be able to
read a DECtape these days, and even 9-track tape drives are pretty scarce.

From: Nils Goesche
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <87of3o6dhb.fsf@darkstar.cartan>
Barry Margolin <··············@level3.com> writes:

> In article <···············@shell01.TheWorld.com>,
> Kent M Pitman  <······@world.std.com> wrote:

> >Also, even if they lose the specs, reverse engineering HTML
> >won't be that difficult, which is why it's my medium of
> >choice.
> 
> I don't think it's the document format that's the main problem,
> it's the storage technology.

Right -- if they could decipher texts from ancient Egypt, PDF
should be manageable, too :-)

> I think you'd have to go on a long hunt to be able to read a
> DECtape these days, and even 9-track tape drives are pretty
> scarce.

But I bet they last longer than a number of paperback books of
mine which, printed on non-acid-free paper, practically burn away
while I watch.

Regards,
-- 
Nils Gösche
Ask not for whom the <CONTROL-G> tolls.

PGP key ID #xD26EF2A0
From: Don Geddis
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <m3wuic7ysl.fsf@maul.geddis.org>
Barry Margolin <··············@level3.com> writes:
> I don't think it's the document format that's the main problem, it's the
> storage technology.  I think you'd have to go on a long hunt to be able to
> read a DECtape these days, and even 9-track tape drives are pretty scarce.

I wonder if the net is resolving this problem.  Put documents online behind
a web server, and it doesn't matter if the underlying hardware changes.
So: instead of backing up to tape or CD or DVD, just back up to a 200GB
hard drive running online somewhere.  As long as hard drive technology grows
significantly faster than your archiving needs (not unreasonable), you should
be able to survive by constantly keeping your records on the net, and perhaps
copying them over to new hardware every 5-10 years.
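
The periodic copy to new hardware only helps if the copy is faithful. A minimal sketch of one way to check that, assuming the archive is an ordinary directory tree small enough to hash file by file; the function names (checksum_manifest, verify_copy) are hypothetical, and SHA-256 is just one reasonable choice of digest:

```python
import hashlib
from pathlib import Path


def checksum_manifest(root):
    """Map each file's path (relative to root) to the SHA-256 of its bytes."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def verify_copy(old_root, new_root):
    """Return relative paths that are missing from, or differ in, the new copy."""
    old = checksum_manifest(old_root)
    new = checksum_manifest(new_root)
    return sorted(name for name, digest in old.items()
                  if new.get(name) != digest)
```

An empty list from verify_copy(old_dir, new_dir) means every file in the old tree has an identical counterpart in the new one; files that exist only in the new tree are deliberately ignored.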

_______________________________________________________________________________
Don Geddis                    http://don.geddis.org              ···@geddis.org
From: James A. Crippen
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <m3r88jc1so.fsf@kappa.unlambda.com>
Don Geddis <···@geddis.org> writes:

> Barry Margolin <··············@level3.com> writes:
>> I don't think it's the document format that's the main problem, it's the
>> storage technology.  I think you'd have to go on a long hunt to be able to
>> read a DECtape these days, and even 9-track tape drives are pretty scarce.

There are some people involved in PDP-10 emulation that have the
ability to read DECtapes and 9-tracks.  As for other formats, your
mileage may vary.

Of course for punched cards and paper tape it's easy.  Just use the
eyeball-finger interface provided with every computer user.  Transfer
rate is kinda low, though.

> I wonder if the net is resolving this problem.  Put documents online behind
> a web server, and it doesn't matter if the underlying hardware changes.
> So: instead of backing up to tape or CD or DVD, just back up to a 200GB
> hard drive running online somewhere.  As long as hard drive technology grows
> significantly faster than your archiving needs (not unreasonable), you should
> be able to survive by constantly keeping your records on the net, and perhaps
> copying them over to new hardware every 5-10 years.

The real way to back up online is just to put your random stuff online
and get other people to look at it.  There's a high likelihood that if
it's important enough other people will download local copies,
particularly if it's something large and unwieldy, like your average
tarball.

Linus Torvalds joked about this at one point.  He lost a copy of
kernel sources in some random crash so he just asked other people if
they had made copies.  Of course someone else had one from his recent
posting of it, so he got the restored copy from them.  Instant backup.

As for news messages, it appears that except for some spam and certain
(ahem) newsgroups Google seems to be working hard at caching
everything ever posted to Usenet since at least 1993, or maybe earlier
(I've forgotten).  Other groups seem to have threatened to cache the
contents of as many websites as possible, but I haven't looked into
the effectiveness of that sort of effort.

So it remains that the best way to back up data is still to make it so
important to someone else that they do the backing up for you, 'for
their own good'.

This is nothing new, of course.  Just look at what the police
organizations or credit reporting agencies do with your data.  You
make it seem so indispensable to them that they keep the data forever
without you having to do a thing to make sure it's backed up.  Easy!

'james

-- 
James A. Crippen <james at unlambda.com> Lambda Unlimited
61.2204N, -149.8964W                     Recursion 'R' Us
Anchorage, Alaska, USA, Earth            Y = \f.(\x.f(xx))(\x.f(xx))
From: Chris Beggy
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <87znn8pvkp.fsf@lackawana.kippona.com>
Raymond Toy <···@rtp.ericsson.se> writes:

>>>>>> "Kent" == Kent M Pitman <······@world.std.com> writes:
>
>     Kent> fact.  What is really needed to counter this is not debate but first
>     Kent> hand subject matter.  I've done what I can, when I can afford the time,
>     Kent> to take my own paper records and put them online.  I've seen others do
>     Kent> likewise.  But there is still much to do, and anyone else with
>     Kent> personal sets of hardcopy should do what they can to preserve the
>     Kent> historical record into electronic media for the sake of History.
>
> Even that can be hard.  Technology is changing so fast that electronic
> media of today has a good chance of not working in the relatively near
> future.  A serious problem that I think people are looking at.

Or a serious problem that people are laughing at:

   http://www.deadmedia.org/

Chris
From: Barry Margolin
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <hpDia.3$NE1.283@paloalto-snr1.gtei.net>
In article <··············@edgedsp4.rtp.ericsson.se>,
Raymond Toy  <···@rtp.ericsson.se> wrote:
>>>>>> "Kent" == Kent M Pitman <······@world.std.com> writes:
>
>    Kent> fact.  What is really needed to counter this is not debate but first
>    Kent> hand subject matter.  I've done what I can, when I can afford the time,
>    Kent> to take my own paper records and put them online.  I've seen others do
>    Kent> likewise.  But there is still much to do, and anyone else with
>    Kent> personal sets of hardcopy should do what they can to preserve the
>    Kent> historical record into electronic media for the sake of History.
>
>Even that can be hard.  Technology is changing so fast that electronic
>media of today has a good chance of not working in the relatively near
>future.  A serious problem that I think people are looking at.

Heard what I think was an April Fools story last night on NPR, about a
group at the Library of Congress working on archiving all their sound
recordings onto 78 RPM phonograph records.  The premise was that although
this may not be the highest fidelity, it's the most durable medium.
Technology may make CD's, DVD's, and MP3's unreadable in the future, but
you can listen to a phonograph simply by putting a pin in the groove and
turning it with your hand.  So even if we nuke ourselves back to the stone
age, we'll be able to recover these recordings.

From: Joe Marshall
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <of3pc55w.fsf@ccs.neu.edu>
Kent M Pitman <······@world.std.com> writes:

> I am particularly worried about the paper records. They will eventually
> fade, tear, or get thrown out.

Paper records last longer than electronic ones.
From: Kent M Pitman
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <sfw8yusg9x1.fsf@shell01.TheWorld.com>
Joe Marshall <···@ccs.neu.edu> writes:

> Kent M Pitman <······@world.std.com> writes:
> 
> > I am particularly worried about the paper records. They will eventually
> > fade, tear, or get thrown out.
> 
> Paper records last longer than electronic ones.

Individually, probably.  If all of civilization destroys itself, it
won't matter.  But there's a lot of archiving/replication/spidercaching
going on such that I think things that exist for a long time in public view
are pretty safe.

Not that I don't make personal backups of my stuff onto more than one
CD-ROM (in case of scratches and such) and periodically move them to a
safe deposit box geographically separated from my house by a large 
distance...
From: Ivan Boldyrev
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <qm5ulxqht.ln2@elaleph.borges.cgitftp.uiggm.nsc.ru>
On 8336 day of my life Joe Marshall wrote:
> Kent M Pitman <······@world.std.com> writes:
> 
> > I am particularly worried about the paper records. They will eventually
> > fade, tear, or get thrown out.
> 
> Paper records last longer than electronic ones.

Really.  I saw books published 100 years ago and more.  But has
anyone ever seen 50-year-old electronic records? :-)

-- 
Ivan Boldyrev
PGP fp: 3640 E637 EE3D AA51 A59F 3306 A5BD D198 5609 8673  ID 56098673

Violets are red, Roses are blue. //
I'm schizophrenic, And so am I.
From: Kent M Pitman
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <sfwpto3jl4h.fsf@shell01.TheWorld.com>
Ivan Boldyrev <···············@cgitftp.uiggm.nsc.ru> writes:

> On 8336 day of my life Joe Marshall wrote:
> > Kent M Pitman <······@world.std.com> writes:
> > 
> > > I am particularly worried about the paper records. They will eventually
> > > fade, tear, or get thrown out.
> > 
> > Paper records last longer than electronic ones.
> 
> Really.  I saw books published 100 years ago and more.  But has
> anyone ever seen 50-year-old electronic records? :-)




Btw, I don't think most of the records are "published".

Most of the things I'm thinking about were spun off of dot-matrix or
laser printers and are poorly preserved.  They are largely without
covers and not on acid-free paper.  They fade due to light, the ink
sticks together from page to page and transfers onto other pages if
the pages are fan-folded so that pages with text are juxtaposed.
I have a ton of stuff like this, perhaps almost literally, and have
recently observed that about 25% of it that was in my supposedly
moisture-controlled storage room ended up under some moisture anyway
somehow, and was so far gone that it needs to simply be disposed of.

And I hate to sound anti-establishment, but I think a very wrong and
biased view of history comes from only considering published books and
PhD-level reports.  As they say, the "winners" write the history books.
But that doesn't mean they write everything ever written.  Important
records of ANSI CL include the set of _failed_ cleanups.  I certainly
have copies of them, even though they don't seem to still exist at
the online document archives in
 ftp://ftp.parc.xerox.com/pub/cl/
(Possibly because they failed they were never publicly available,
or maybe they are available embedded in the mail.)  This is stuff that
happens to be online, but there are many such things that aren't.
Working Papers at the MIT AI Lab are documents that were, by their
nature, not by their subsequent importance, declared to be not fit for
reference.  That was a weird way to do things--there are important
"working papers" and not-very-important "memos" as a result.  And I
have random little scraps of printout in my files that are things that
batted about the net as useful wisdom from before there was a web,
design discussions on mailing lists that were probably never archived,
etc.  All of this WILL fade and in less than 100 years if I don't do
something to actively prevent it.

The fact that some hardcopy documents are long-lived is not proof that
putting things into hardcopy will make things live long.
From: Barry Margolin
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <dm4ja.31$jn5.1096@paloalto-snr1.gtei.net>
In article <···············@shell01.TheWorld.com>,
Kent M Pitman  <······@world.std.com> wrote:
>Most of the things I'm thinking about were spun off of dot-matrix or
>laser printers and are poorly preserved.  They are largely without
>covers and not on acid-free paper.  They fade due to light, the ink
>sticks together from page to page and transfers onto other pages if
>the pages are fan-folded so that pages with text are juxtaposed.
>I have a ton of stuff like this, perhaps almost literally, and have
>recently observed that about 25% of it that was in my supposedly
>moisture-controlled storage room ended up under some moisture anyway
>somehow, and was so far gone that it needs to simply be disposed of.

One of the things I've noticed about the modern, electronic age is that it
seems like much less stuff gets saved.

Every year or so you hear reports about an archivist unearthing a
previously-unknown manuscript by someone like Mozart.  Most ex-presidents
have libraries where you can find just about everything they've written,
from inauguration speeches down to memos.

Each of us has probably written thousands of computer programs in our
careers, but unless you're a real pack-rat, what's the chance that a
throw-away script you wrote years ago will show up anywhere?  It might be
on some ancient backup tape -- if they haven't been recycled and if it's
still readable.  There's probably a museum where you can see the first
thing Shakespeare wrote, but I'll be damned if I can dig up my first Emacs
init file.  These things got into museums because their creators were
considered masters, but at the time that they were saving them they didn't
know that history would eventually brand them this way; they were just guys
doing their jobs, and saving and cataloguing your work was SOP in those
days.  Someone even has the book in whose margin Fermat scribbled his
famous Last Theorem, but how many of us have saved notepads on which we've
doodled while designing applications and operating systems (often we design
things on whiteboards, which are quickly reused).

-- 
Barry Margolin, ··············@level3.com
Genuity Managed Services, a Level(3) Company, Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.
From: Erann Gat
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <gat-0304031658390001@192.168.1.51>
In article <·················@paloalto-snr1.gtei.net>, Barry Margolin
<··············@level3.com> wrote:

> One of the things I've noticed about the modern, electronic age is that it
> seems like much less stuff gets saved.

No, I don't think that's true.

> Every year or so you hear reports about an archivist unearthing a
> previously-unknown manuscript by someone like Mozart.  Most ex-presidents
> have libraries where you can find just about everything they've written,
> from inauguration speeches down to memos.

Yes, but this only happens *after* they become President.

> There's probably a museum where you can see the first
> thing Shakespeare wrote,

No, actually there isn't.  No one even knows for sure where Shakespeare
grew up.  The only records we have of Shakespeare are those that started
to be kept after he became famous (and he was famous even in his own day).

> but I'll be damned if I can dig up my first Emacs
> init file.

I can easily find every posting I've ever made to comp.lang.lisp.  (And,
alas, so can everyone else.)  This level of historical logging is
unprecedented in history.

> These things got into museums because their creators were
> considered masters, but at the time that they were saving them they didn't
> know that history would eventually brand them this way; they were just guys
> doing their jobs, and saving and cataloguing your work was SOP in those
> days.  Someone even has the book in whose margin Fermat scribbled his
> famous Last Theorem,

Again, by the time this happened, Fermat was already a famous mathematician.

> but how many of us have saved notepads on which we've
> doodled while designing applications and operating systems (often we design
> things on whiteboards, which are quickly reused).

I actually have files that go back to my undergrad days, and a few things
that go all the way back to high school.  I'm still waiting for the
historians to take an interest.

E.

-- 
The opinions expressed here are my own and do not necessarily
reflect the views of JPL or NASA.
From: James A. Crippen
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <m3pto1n3bj.fsf@kappa.unlambda.com>
···@jpl.nasa.gov (Erann Gat) writes:

> I actually have files that go back to my undergrad days, and a few things
> that go all the way back to high school.  I'm still waiting for the
> historians to take an interest.

They sure are slow, aren't they.  I guess it comes from digging around
in dusty files for so long.  But I've been waiting for them to come
knocking for a long time, myself.  I swear someone should complain to
the government about historians and how they're just not paying enough
attention to the modern world...

'james

-- 
James A. Crippen <james at unlambda.com> Lambda Unlimited
61.2204N, -149.8964W                     Recursion 'R' Us
Anchorage, Alaska, USA, Earth            Y = \f.(\x.f(xx))(\x.f(xx))
From: Kent M Pitman
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <sfw65pv0w8c.fsf@shell01.TheWorld.com>
Barry Margolin <··············@level3.com> writes:

> One of the things I've noticed about the modern, electronic age is that it
> seems like much less stuff gets saved.

Probably the number of things worth saving is constant.  We've just enabled
the creation of more junk, so more junk gets lost...

> Each of us has probably written thousands of computer programs in our
> careers, but unless you're a real pack-rat, what's the chance that a
> throw-away script you wrote years ago will show up anywhere? 

(I probably have close to all the things I've done... Heh... Except from
a few companies that had strict leave-it-behind policies.  I'm a serious
packrat.  I've got tons and tons of email... At MIT, long ago, disk space
was so expensive I used to hardcopy it to get JPG off my back for my
disk use... But email was smaller back then (fewer headers) and it only
occupies a bit more than a cubic foot of hardcopy...)

> It might be
> on some ancient backup tape -- if they haven't been recycled and if it's
> still readable.  There's probably a museum where you can see the first
> thing Shakespeare wrote, but I'll be damned if I can dig up my first Emacs
> init file.  These things got into museums because their creators were
> considered masters, but at the time that they were saving them they didn't
> know that history would eventually brand them this way; they were just guys
> doing their jobs, and saving and cataloguing your work was SOP in those
> days.  Someone even has the book in whose margin Fermat scribbled his
> famous Last Theorem, but how many of us have saved notepads on which we've
> doodled while designing applications and operating systems (often we design
> things on whiteboards, which are quickly reused).

That's true.  I didn't save all of my whiteboards.
From: Coby Beck
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <b6j4ch$1iao$1@otis.netspace.net.au>
"Kent M Pitman" <······@world.std.com> wrote in message
····················@shell01.TheWorld.com...
> Barry Margolin <··············@level3.com> writes:
> > days.  Someone even has the book in whose margin Fermat scribbled his
> > famous Last Theorem, but how many of us have saved notepads on which we've
> > doodled while designing applications and operating systems (often we design
> > things on whiteboards, which are quickly reused).
>
> That's true.  I didn't save all of my whiteboards.

Enter the digital camera!  All you pack-rats delight...

--
Coby Beck
(remove #\Space "coby 101 @ bigpond . com")
From: James A. Crippen
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <m3n0j5n395.fsf@kappa.unlambda.com>
"Coby Beck" <·····@mercury.bc.ca> writes:

> "Kent M Pitman" <······@world.std.com> wrote in message
> ····················@shell01.TheWorld.com...
>> Barry Margolin <··············@level3.com> writes:
>> > days.  Someone even has the book in whose margin Fermat scribbled his
>> > famous Last Theorem, but how many of us have saved notepads on which we've
>> > doodled while designing applications and operating systems (often we design
>> > things on whiteboards, which are quickly reused).
>>
>> That's true.  I didn't save all of my whiteboards.
>
> Enter the digital camera!  All you pack-rats delight...

Yeah, but then you have to print all those pictures out or else they
get lost...  Same problem. :-)

'james

-- 
James A. Crippen <james at unlambda.com> Lambda Unlimited
61.2204N, -149.8964W                     Recursion 'R' Us
Anchorage, Alaska, USA, Earth            Y = \f.(\x.f(xx))(\x.f(xx))
From: Mark Conrad
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <010420032207137644%nospam@iam.invalid>
In article <················@paloalto-snr1.gtei.net>, Barry Margolin
<··············@level3.com> wrote:

> >Excuse this dumb question, but I do not know how to go about digging up
> >all the old newsgroup postings of this newsgroup.
> >
> >I assume Google is used, and I assume I better allow lots of storage
> >and time for the download.
> 
> I don't think Google provides any way to download all the postings in a
> batch.  You'd probably have to write a script that does it.

Thanks for forewarning me about the necessity for creating a
script.

If I have difficulty creating a workable script, I will just download
the postings manually, starting with the most recent postings.

That should keep me busy for some time  :)

Mark-
From: Simon András
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <vcd65pxyssd.fsf@tarski.math.bme.hu>
Mark Conrad <······@iam.invalid> writes:

> Thanks for forewarning me about the necessity for creating a
> script.
> 
> If I have difficulty creating a workable script, I will just download
> the postings manually, starting with the most recent postings.
> 
> That should keep me busy for some time  :)

I think you'd be better off spending your time writing the script in
Lisp, using e.g. the http client functions of aserve.

Andras


> 
> Mark-
From: Mark Conrad
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <020420030854062276%nospam@iam.invalid>
In article <···············@tarski.math.bme.hu>, Simon András
<······@math.bme.hu> wrote:

> I think you'd be better off spending your time writing the script in
> Lisp, using e.g. the http client functions of aserve.

Thanks for that tip - I assume "aserve" is a Common Lisp term.

I have been inactive in Lisp for many years and look on myself as a
rank novice at present, so please excuse the ignorance on my part.

Mark-
From: Simon András
Subject: Re: How to dig up all old newsgroup posts?
Date: 
Message-ID: <vcd1y0kzhka.fsf@tarski.math.bme.hu>
Mark Conrad <······@iam.invalid> writes:

> In article <···············@tarski.math.bme.hu>, Simon András
> <······@math.bme.hu> wrote:
> 
> > I think you'd be better off spending your time writing the script in
> > Lisp, using e.g. the http client functions of aserve.
> 
> Thanks for that tip - I assume "aserve" is a Common Lisp term.

Not really. It's an open-source web server written in Common Lisp by
Franz Inc., available from
http://opensource.franz.com/aserve/index.html. If you're not using
Franz's Allegro CL, then look for the Portable AllegroServe link on
that page. In case you haven't settled on a particular implementation,
I'd suggest (the fully functional trial version of) Allegro CL, for
various reasons.

Andras