Fixing "Random Page"
The Knot Atlas contains a few hundred human-generated pages and a few thousand computer-generated pages, one per knot with up to 14 crossings and a few more for a few other classes of knots and links. If we were to choose a page at random within the Knot Atlas, with uniform distribution, we would most likely get the page of some anonymous 13- or 14-crossing knot, about which there is not much of interest to say. We thus needed to change the randomization procedure to one that gives a higher weight to the more "interesting" pages.
The standard MediaWiki random page algorithm is a bit strange: at creation time, every page is assigned a random number between 0 and 1 (the column page_random in the MySQL table mw_page). Then, in order to select a random page, a further random number is generated, and the MySQL database is queried to order all pages by their page_random and return the first page whose page_random is higher (in a cyclic sense) than that number. The page_random attribute of any given page never changes. Thus if the page_random of a given page is just a tiny bit more than the page_random of some other page, then this page is unlikely to ever be selected "at random".
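The selection rule described above can be sketched in a few lines of PHP. This is only an illustration and not MediaWiki's actual code: the function pickPage() and the in-memory array of page_random values are hypothetical stand-ins for the database query, and the three pages are made up.

```php
<?php
// Illustrative sketch only: pickPage() and the $pages array are hypothetical
// stand-ins for the MySQL query that MediaWiki runs against mw_page.

// Return the title of the first page whose page_random is >= $r,
// wrapping around to the smallest page_random if no such page exists.
function pickPage(array $pages, float $r): string {
    asort($pages);                    // order by page_random, keeping titles as keys
    foreach ($pages as $title => $pageRandom) {
        if ($pageRandom >= $r) {
            return $title;
        }
    }
    return array_key_first($pages);   // cyclic wrap-around
}

// Three made-up pages; B sits just above A, so the gap "behind" B is only 0.001.
$pages = ['A' => 0.500, 'B' => 0.501, 'C' => 0.900];

// Estimate how often each page is selected.
$counts = ['A' => 0, 'B' => 0, 'C' => 0];
mt_srand(1);
$trials = 100000;
for ($i = 0; $i < $trials; $i++) {
    $r = mt_rand() / (mt_getrandmax() + 1);   // uniform in [0, 1)
    $counts[pickPage($pages, $r)]++;
}
foreach ($counts as $title => $n) {
    printf("%s: %.4f\n", $title, $n / $trials);
}
```

In this toy setup the selection probability of each page equals the length of the gap behind it: about 0.6 for A (including the wrap-around from 0.9 to 1), about 0.001 for B, and about 0.399 for C.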
We exploit the strange MediaWiki selection algorithm to our benefit by re-assigning the page_random attribute non-randomly, so that the gaps "behind" less interesting pages are shorter and those pages are therefore chosen with lower probability. This is done by the program RebuildRandomPage.php, which we run periodically in the maintenance subdirectory of our MediaWiki installation. The program is quoted below.
<?php
// Rebuilds the page_random column; uncomment
// "mysql_query($action, $connection);"
// near the end of the file for wet runs.

require_once( "commandLine.inc" );

// Relative weight of a page, determined by its title.
function p($title) {
  $p = 1;
  if (ereg("^8_[0-9]+$", $title)) $p=15/21;
  if (ereg("^8_[0-9]+_Quantum_Invariants$", $title)) $p=15/21;
  if (ereg("^9_[0-9]+$", $title)) $p=15/49;
  if (ereg("^9_[0-9]+_Quantum_Invariants$", $title)) $p=15/49;
  if (ereg("^10_[0-9]+$", $title)) $p=15/165;
  if (ereg("^10_[0-9]+_Quantum_Invariants$", $title)) $p=15/165;
  if (ereg("^K11[an][0-9]+$", $title)) $p=15/552;
  if (ereg("^K12[an][0-9]+$", $title)) $p=15/2176;
  if (ereg("^K13[an][0-9]+$", $title)) $p=15/9988;
  if (ereg("^K14[an][0-9]+$", $title)) $p=15/46962;
  if (ereg("^L8[an][0-9]+$", $title)) $p=15/29;
  if (ereg("^L9[an][0-9]+$", $title)) $p=15/83;
  if (ereg("^L10[an][0-9]+$", $title)) $p=15/287;
  if (ereg("^L11[an][0-9]+$", $title)) $p=15/1007;
  if (ereg("^L12[an][0-9]+$", $title)) $p=15/4276;
  if (ereg("^L13[an][0-9]+$", $title)) $p=15/7539;
  if (ereg("^T\([0-9]+,[0-9]+\)$", $title)) $p=15/36;
  return $p;
}

$connection = mysql_connect($wgDBserver, $wgDBuser, $wgDBpassword);
mysql_select_db($wgDBname, $connection);
$query = "SELECT * FROM mw_page WHERE page_namespace=0 AND page_is_redirect=0";

// First pass: compute the total weight $Z and the page count $N.
$res = mysql_query($query, $connection);
$Z = 0; $N=0;
while ($row = mysql_fetch_array($res)) {
  $title = $row["page_title"];
  $p = p($title);
  print "$title -> $p\n";
  $Z += $p; ++$N;
}

// Second pass: set page_random to the running sum of normalized weights.
$res = mysql_query($query, $connection);
$r = 0;
while ($row = mysql_fetch_array($res)) {
  $id = $row["page_id"];
  $title = $row["page_title"];
  $random = $row["page_random"];
  $p = p($title);
  $r += $p/$Z;
  $action = "UPDATE mw_page SET page_random=$r WHERE page_id=$id";
  print "$title, $random: $action\n";
  // mysql_query($action, $connection);
}

print "\$N=$N; \$Z=$Z\n";
?>
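Stripped of database access, the core of the program above can be sketched as follows. This is a minimal illustration under stated assumptions, not a replacement for the program: the function names weight() and assignPageRandom() and the list of page titles are hypothetical, only three of the title patterns are copied over, and preg_match() is used in place of ereg(), which was removed in PHP 7.

```php
<?php
// Illustrative sketch only: weight() and assignPageRandom() are hypothetical
// names, and the page list is made up. preg_match() stands in for ereg().

// Relative weight of a page, using three of the patterns from the program above.
function weight(string $title): float {
    if (preg_match('/^8_[0-9]+$/', $title))           return 15 / 21;
    if (preg_match('/^K14[an][0-9]+$/', $title))      return 15 / 46962;
    if (preg_match('/^T\([0-9]+,[0-9]+\)$/', $title)) return 15 / 36;
    return 1.0;   // every other page keeps full weight
}

// Assign each page the running total of normalized weights, so that the gap
// behind a page (and hence its selection probability) is weight / total.
function assignPageRandom(array $titles): array {
    $total = 0.0;
    foreach ($titles as $title) {
        $total += weight($title);
    }
    $assigned = [];
    $running = 0.0;
    foreach ($titles as $title) {
        $running += weight($title) / $total;
        $assigned[$title] = $running;
    }
    return $assigned;
}

$titles = ['Main_Page', '8_17', 'K14n1234', 'T(5,2)'];
foreach (assignPageRandom($titles) as $title => $pageRandom) {
    printf("%-10s page_random = %.6f\n", $title, $pageRandom);
}
```

The denominators in the program appear to be the number of knots or links in each class (for example, there are 21 knots with 8 crossings). If so, each class as a whole weighs about as much as 15 ordinary pages.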