[1:15]<Wes-> Dantman: How expensive is bandwidth over there? I want to mirror the wiki ("functional backup") - but without Last-Modified headers, it's going to be a VERY arduous re-crawl
[1:15]<Wes-> and the one I started this morning was mysteriously killed, grrrr
[1:16]<Dantman> Wes-, rsync xml dumps?
[1:16]<Wes-> Dantman: xml dumps aren't helpful, very hard to read
[1:16]<Wes-> Just want HTML content
[1:20]<Dantman> Hmm... now that I think about it... isn't the import script intelligent?
[1:32]<Dantman> Wes-, sounds like MediaWiki's import functionality should be smart enough... You should be able to create a simple local MW install, and on a daily basis rsync the .xml dump from it, import that into the install, and dump html from there.
[1:34]<Dantman> I wonder which would be more bw efficient though... .xml rsync, or rsync of a .html dump I make
[1:39]<Dantman> n/m... perhaps it's not intelligent
[1:39]<Dantman> *sigh* Maybe I should do a .html dump alongside the .xml dumps
[1:44]<Dantman> Ok, n/m again... it IS intelligent, importing .xml dumps will work
[1:48]<Dantman> Ugh... zumbrunn you know that change in dns for just commonjs.org made the rsync line I gave to everyone invalid
[1:54]<Dantman> Wes-, rsync -r wiki.commonjs.org::commonjs/htmldump/ .
[1:59]<Dantman> Wes-, I added html dumping to the cron job, you can rsync it daily... you should probably rsync a full.xml with it too just for backup purposes...
[2:01]<Dantman> ^_^ rsync uses an efficient diff algo, so last-modifieds don't matter
[4:09]* Dantman gives up on a random idea that probably wouldn't be worth implementing...
[4:09]<Dantman> ... heh, I was thinking of making it so that searching on the wiki would return results from both the wiki and the mailing list.
[5:19]<- *Wyverald* h
[5:19]<- *Wyverald* help
[12:45]<Wes-> dantman: skipping non-regular file "images/thumb/4/4d/AP_CommonJS.jpg/120px-AP_CommonJS.jpg"
[12:46]<Wes-> dantman: for all the images, etc, during rsync -- any idea what's up with that? Are they maybe links on your end?
[12:49]<Wes-> dantman: That htmldump is good stuff, though!!!!
[12:50]<Wes-> very few links need re-writing locally to work
[12:50]<Wes-> Oh, hm, local re-writes won't necessarily be kind to rsync
[12:51]<Wes-> Oh scratch the "very few" part
[12:54]* Wes- gets scripting
[13:54]<Wes-> dantman: http://wiki.commonjs.org.mirrors.page.ca/
[13:55]<Wes-> dantman: http://wiki.commonjs.org.mirrors.page.ca/update_cjs_wiki.sh
[13:59]<Wes-> WTF? Where did dan go?
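
The worry at [12:50] that local link rewrites "won't necessarily be kind to rsync" can be sidestepped by never editing the tree that rsync writes into. What follows is a minimal sketch of that idea, not the contents of update_cjs_wiki.sh, which the log does not show. The directory names `pristine` and `served`, the function names, and the sed expression are all hypothetical; only the rsync source path and the `-r` flag come from the log.

```shell
#!/bin/sh
# Sketch only: directory names, function names, and the link rewrite below
# are assumptions for illustration, not taken from the log.
set -eu

PRISTINE="${PRISTINE:-./pristine}"   # untouched rsync target (hypothetical name)
SERVED="${SERVED:-./served}"         # copy that receives local rewrites (hypothetical name)
SOURCE="${SOURCE:-wiki.commonjs.org::commonjs/htmldump/}"  # path quoted at [1:54]

# Pull the dump into the pristine tree only, so rsync always compares
# the remote files against files it wrote itself.
sync_dump() {
    mkdir -p "$PRISTINE"
    rsync -r "$SOURCE" "$PRISTINE/"
}

# Rebuild the served tree from the pristine one and rewrite links there.
# The sed expression is a placeholder; real rewrites depend on the dump.
publish() {
    rm -rf "$SERVED"
    mkdir -p "$SERVED"
    cp -R "$PRISTINE/." "$SERVED/"
    # -i.bak is accepted by both GNU and BSD sed; backups are removed afterwards.
    find "$SERVED" -type f -name '*.html' -exec \
        sed -i.bak 's|http://wiki.commonjs.org/|/|g' {} +
    find "$SERVED" -type f -name '*.bak' -exec rm -f {} +
}
```

A daily cron entry could source this file and run `sync_dump && publish`. On the [12:45] message: rsync prints "skipping non-regular file" for symlinks, among other non-regular files, when it is run without `-l` or `-L`. The log never confirms that the server's images are symlinks, but if they are, adding `-L` to the rsync call would copy the files they point to.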