Monday 12 March 2018

Newsgroups: I Made a Difference

In 2013, I had the idea of making a newsgroup search engine. All I had to do was download all newsgroup postings and index them.

Well, I wrote some multi-threaded Python code, and got a Giganews Diamond account which allowed a lot of threads and unlimited downloads.
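The original download code isn't shown here, so what follows is only a rough sketch of the general idea of fetching many newsgroups on a pool of threads. The names `download_groups` and `fetch_group` are hypothetical; `fetch_group` stands in for whatever function actually talks to the news server.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_groups(groups, fetch_group, max_workers=8):
    """Call fetch_group(name) for every group on a pool of worker threads.

    fetch_group is a hypothetical callable supplied by the caller; it is
    assumed to download one newsgroup and return something (a path, a count).
    Returns two dicts keyed by group name: results and errors.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each submitted future back to the group it is fetching.
        futures = {pool.submit(fetch_group, name): name for name in groups}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:
                # Record the failure so one bad group doesn't stop the rest.
                errors[name] = exc
    return results, errors
```

Keeping `max_workers` at or below the number of connections the account allows seems the sensible choice, since each thread would presumably hold one connection open.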

After a month I had most newsgroups - excepting binaries - and it came to 800GB.

Trying to index THAT lot was impossible.

So in mid-2016, I just heaved the 800GB onto a 1TB 2.5" USB hard drive and sent it to the Internet Archive.

I thought they hadn't done anything with the drive.

After a few weeks of confusion, a friend of mine - connected to the IA - got it all sorted out. They had uploaded all my data here:

https://archive.org/details/giganewsnewsgroups2003to2013

(2003 - 2013 because 2013 is when I last indexed them).

There was a mystery of the missing drive. Basically, they mislaid my original drive, so they said they couldn't send it back to me in the post. My friend got that sorted out too and a new drive is on its way.

Anyway, it's all there, independent of Google Groups.

To download a newsgroup, find the collection for its top-level hierarchy (below, gna-comp is comp.*):

https://archive.org/download/gna-comp/comp.sys.amiga.announce.mbox.gz

And unzip it. You'll be able to load it into your email client or newsgroup reader.
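For anyone who would rather script this, here is a minimal Python sketch using only the standard library. It assumes every hierarchy follows the same `gna-<top-level>` URL pattern as the comp.* example above, which I haven't verified for all of them, and the function names are just illustrative.

```python
import gzip
import mailbox
import shutil


def archive_url(group):
    """Build the download URL for one newsgroup.

    Assumption: every hierarchy uses the gna-<top-level> pattern seen
    in the comp.* example.
    """
    top = group.split(".", 1)[0]
    return f"https://archive.org/download/gna-{top}/{group}.mbox.gz"


def decompress_mbox(gz_path, mbox_path):
    """Decompress a downloaded .mbox.gz file to a plain mbox file."""
    with gzip.open(gz_path, "rb") as src, open(mbox_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return mbox_path


def list_subjects(mbox_path):
    """Return the Subject header of every message in an mbox file."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        return [msg["Subject"] for msg in box]
    finally:
        box.close()
```

The file itself can be fetched with something like `urllib.request.urlretrieve(archive_url("comp.sys.amiga.announce"), "announce.mbox.gz")` before calling `decompress_mbox` on it.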

I'm hoping to index 2013-2018, and I have until 2023 to do so (Giganews goes back only 10 years).

My 1981-1991 newsgroup search engine is at: http://www.dejadejadeja.com/

Finally, the technology behind my newsgroup search engine is called MailXplorer, and it's here:

http://www.sanfransys.com/mailxplorer

I'm looking for some customers who want to search old hard drives for missing emails.

That's all! Thanks for reading.