#221
March 20th, 2006, 08:13 AM
connell
Newbie Floating Down The Mistic River
 
Join Date: Feb 2006
Location: Scotland
Posts: 1
Quote:
Originally Posted by partisan
28-55 hours for the conversion of the wiktionary
Yeah, with the current software this is a problem (it would take _ages_ for Wikipedia), but there are reasons for this:

1 - The Wiktionary is converted into HTML using the MediaWiki software, so essentially there are 2 conversions taking place (MediaWiki to HTML, then HTML to rockipedia). With more time, a direct MediaWiki-to-rockipedia converter could be written to drastically reduce conversion time.

2 - The MediaWiki and MySQL content store that I am using are on separate machines (due to file space restrictions). This introduces the extra slowdown of transferring everything over TCP/IP across a network. This slowdown would be completely removed by the converter consolidation in 1.

3 - The conversion software is written in Perl using the HTML::Tree module, which is overkill for what is needed.
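
To give a rough idea of what the consolidation in point 1 might look like, here is a minimal sketch that turns a small subset of MediaWiki markup straight into a target markup in one step, with no HTML in between. The output markers (`{b}`, `{i}`, `{link:...}`) are purely hypothetical placeholders, since the real rockipedia format is not described in this post, and a real converter would also have to deal with templates, tables, headings and so on.

```python
import re

# Patterns for a tiny subset of MediaWiki markup.
BOLD = re.compile(r"'''(.+?)'''")
ITALIC = re.compile(r"''(.+?)''")
LINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")


def _link(match):
    # [[target]] or [[target|label]]; the label defaults to the target.
    target = match.group(1).strip()
    label = (match.group(2) or target).strip()
    # Hypothetical output marker, NOT the real rockipedia format.
    return "{link:%s}%s{/link}" % (target, label)


def convert_wikitext(text):
    """Convert bold, italic and internal links without an HTML step."""
    # Bold first: ''' would otherwise be consumed by the italic pattern.
    text = BOLD.sub(r"{b}\1{/b}", text)
    text = ITALIC.sub(r"{i}\1{/i}", text)
    return LINK.sub(_link, text)


print(convert_wikitext("'''dog''' is a ''noun''; see [[cat]] and [[wolf|wolves]]"))
```

Working on the wikitext directly like this would also mean there is no need to build a full HTML parse tree for each page.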

Quote:
Originally Posted by partisan
I thought enhancing the dictionary plugin could be an alternative to your approach
I'm not sure the users of the dictionary plugin would want the bloat that the extra features bring. The fact remains, however, that most of the nitty-gritty work of the plugin is done (although the code is still very messy and unoptimised) and IMO an alpha release is not 100 years away.

Having scanned the dictionary plugin code, it appears that the contents of the dictionary are stored in 2 files, an index file and a content file. This causes problems with large amounts of data because FAT32, the filesystem used by the jukeboxes, caps the size of individual files (4GB by specification, or 2GB in software that uses signed 32-bit file offsets). Being able to split the content into many files overcomes this and uses the filesystem's design to aid in the organisation of the topics. The dictionary plugin has only 315 lines of code and does not have the structures required to easily add formatting or linking support. In contrast, the wiki plugin requires over 600 lines of code to implement the features it has so far (although this is likely to change as more are implemented and optimisation takes place).
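
The idea of splitting the content across many files can be sketched roughly as below. This is only an illustration, not the wiki plugin's actual layout: the file naming (`chunk0000.dat`), the in-memory index and the function names are all assumptions made for the example.

```python
import os


def _chunk_path(out_dir, chunk_no):
    # Hypothetical naming scheme for the content files.
    return os.path.join(out_dir, "chunk%04d.dat" % chunk_no)


def write_chunks(articles, out_dir, max_bytes):
    """Write (title, text) pairs into numbered files capped at max_bytes.

    Returns an index mapping title -> (chunk_number, offset, length).
    A single article larger than max_bytes still gets a file of its own.
    """
    index = {}
    chunk_no = 0
    used = 0
    out = open(_chunk_path(out_dir, chunk_no), "wb")
    try:
        for title, text in articles:
            data = text.encode("utf-8")
            # Start a new file if this article would push us past the cap.
            if used > 0 and used + len(data) > max_bytes:
                out.close()
                chunk_no += 1
                used = 0
                out = open(_chunk_path(out_dir, chunk_no), "wb")
            index[title] = (chunk_no, used, len(data))
            out.write(data)
            used += len(data)
    finally:
        out.close()
    return index


def read_article(index, out_dir, title):
    """Seek straight to one article using the index entry."""
    chunk_no, offset, length = index[title]
    with open(_chunk_path(out_dir, chunk_no), "rb") as f:
        f.seek(offset)
        return f.read(length).decode("utf-8")
```

With a cap set safely below the filesystem's per-file limit, no single content file can grow too large, and a lookup only ever opens the one file that holds the requested topic.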

Perhaps the best approach would be to analyse how the dictionary plugin achieves its search speed and incorporate that into the wiki plugin.
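
How the dictionary plugin actually gets its speed still needs to be checked in its source. As a point of comparison, one common technique for this kind of index is a binary search over sorted titles, sketched below. The two parallel lists and the `lookup` name are assumptions for illustration only, not something taken from the dictionary plugin.

```python
import bisect


def lookup(sorted_titles, offsets, title):
    """Binary-search a sorted title list; return the matching offset or None."""
    i = bisect.bisect_left(sorted_titles, title)
    if i < len(sorted_titles) and sorted_titles[i] == title:
        return offsets[i]
    return None


# Titles must be kept in sorted order; offsets[i] belongs to sorted_titles[i].
titles = ["ant", "bee", "cat"]
offsets = [0, 120, 310]
print(lookup(titles, offsets, "bee"))
```

A search like this needs only about log2(N) comparisons for N topics, which matters a lot on a slow player.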

Connell