#259
March 30th, 2006, 09:44 AM
gursikh
Newbie Floating Down The Mistic River
 
Join Date: Apr 2005
Posts: 54

I was also wondering about the same. I also figured he knew what he was doing.

While Connell is off working on the plugin, I was wondering if there are any Java programmers out there willing to do a little work in preparing the data properly.

I have looked into what is required, and it's not much. The current method Connell is using seems very... uhmm... unnecessary.

All that needs to be done is to take the needed archive, in our case enwiki-pages-articles (available here: http://download.wikimedia.org/ ), and parse it into Connell's format. Fortunately, tools already exist to do this. All we would have to do is write a new filter class for the tool MWDumper (download here: http://download.wikimedia.org/tools/ , readme here: http://www.mediawiki.org/wiki/MWDumper ).

(please see my user talk page on the rockipedia site for some detail as to how I came to this conclusion -- discussion with wikipedia dev)
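To give a rough idea of how small the parsing step is, here is a minimal sketch in plain Java. It does not use MWDumper's own classes; it streams a MediaWiki XML export with the standard StAX API and hands each page's title and text to a callback. The names DumpSketch and forEachPage are made up for illustration, and since Connell's format isn't spelled out here, the callback is just a placeholder where the real output would be written.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.function.BiConsumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of MWDumper.
class DumpSketch {

    /** Streams a MediaWiki XML export and calls handler(title, text) once per page. */
    static void forEachPage(Reader xml, BiConsumer<String, String> handler)
            throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Merge adjacent text chunks so each element's text arrives in one piece.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLStreamReader reader = factory.createXMLStreamReader(xml);

        String title = null;
        String text = null;
        StringBuilder buf = new StringBuilder();

        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                buf.setLength(0); // start collecting text for the new element
                if (reader.getLocalName().equals("page")) {
                    title = null;
                    text = null;
                }
            } else if (event == XMLStreamConstants.CHARACTERS
                    || event == XMLStreamConstants.CDATA) {
                buf.append(reader.getText());
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                String name = reader.getLocalName();
                if (name.equals("title")) {
                    title = buf.toString();
                } else if (name.equals("text")) {
                    text = buf.toString(); // with several revisions, the last one wins
                } else if (name.equals("page")) {
                    handler.accept(title, text == null ? "" : text);
                }
                buf.setLength(0);
            }
        }
        reader.close();
    }

    public static void main(String[] args) throws XMLStreamException {
        // Tiny inline sample standing in for the real enwiki-pages-articles dump.
        String sample = "<mediawiki><page><title>Example</title>"
                + "<revision><text>Some wikitext</text></revision></page></mediawiki>";
        // Placeholder output: the real version would write Connell's format here.
        forEachPage(new StringReader(sample),
                (title, text) -> System.out.println(title + "\t" + text.length()));
    }
}
```

Because it streams rather than loading the whole file, the same loop should cope with a full dump if it is fed from a file reader. A proper MWDumper filter would do the equivalent work inside that tool's own writer classes, which I have not tried to reproduce here.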

Once we have the tool modified we can run it and have the data ready in a few hours!

I'm not very good at this wiki business, so if there are any competent people out there who are interested, please put your effort into the site (http://rockipedia.techmight.com)

gursikh
__________________
Code:
        *****Man********Jeetai********Jugg*******Jeet*******
        __   _|)____  ____(|_\\_   _________   ____(|____
         _|_| |  |     q_| | _/     q_| _| |    q_| | _/    ||
         \| | | ( )    / | | _)     / | \| |    / | | _)    ||
                                         ---          --
            Self-conquest is the conquest of the world!