Fixing "Random Page"
The Knot Atlas contains a few hundred human-generated pages and a few thousand computer-generated pages: one per knot with up to 14 crossings, plus a few more for a few other classes of knots and links. If we were to choose a page uniformly at random within the Knot Atlas, we would most likely get the page of some anonymous 13- or 14-crossing knot, about which there is not much of interest to say. We therefore needed to change the randomization procedure to one that gives more weight to the more "interesting" pages.
The standard MediaWiki random page algorithm is a bit strange: at creation time, every page is assigned a random number between 0 and 1 (the column page_random in the MySQL table mw_page). To select a random page, a further random number is generated, and the MySQL database is queried to order all pages by their page_random and return the first page whose page_random is higher (in a cyclic sense) than that number. The page_random attribute of a given page never changes. Thus if a page's page_random is just a tiny bit more than the page_random of some other page, that page is unlikely ever to be selected "at random".
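The selection rule just described can be sketched on an in-memory array. This is only an illustration, not MediaWiki's code: the function name pickRandomPage, the page titles, and the page_random values are all hypothetical, and treating the comparison as "greater than or equal" is an assumption.

```php
<?php
// Sketch of the selection rule, assuming pages are held in an array
// that maps a (hypothetical) title to its page_random value.
function pickRandomPage(array $pages, float $r): string {
    asort($pages);                      // order by page_random, ascending
    foreach ($pages as $title => $value) {
        if ($value >= $r) {
            return $title;              // first page at or above $r
        }
    }
    // Cyclic wrap-around: no page lies above $r, so take the smallest.
    return array_key_first($pages);
}

$pages = ['A' => 0.10, 'B' => 0.11, 'C' => 0.70];
echo pickRandomPage($pages, 0.105), "\n"; // B
echo pickRandomPage($pages, 0.50), "\n";  // C
echo pickRandomPage($pages, 0.90), "\n";  // A (wrap-around)
```

In this toy example, page B is returned only when the generated number falls between 0.10 and 0.11, so it would be picked about 1% of the time, while C would be picked about 59% of the time.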
We exploit this strange selection algorithm to our benefit by re-assigning the page_random attribute non-randomly, so that the gaps "behind" less interesting pages are shorter and those pages are chosen with lower probability. This is done by the program RebuildRandomPage.php, which we run periodically in the maintenance subdirectory of our MediaWiki installation. The program is quoted below.
<?php
// Rebuilds the page_random column; uncomment
// "mysql_query($action, $connection);"
// near the end of the file for wet runs.
require_once( "commandLine.inc" );
// Weight of a page title: 1 by default, smaller for the large tabulated classes.
function p($title) {
    $p = 1;
    if (preg_match('/^8_[0-9]+$/', $title)) $p=15/21;
    if (preg_match('/^8_[0-9]+_Quantum_Invariants$/', $title)) $p=15/21;
    if (preg_match('/^9_[0-9]+$/', $title)) $p=15/49;
    if (preg_match('/^9_[0-9]+_Quantum_Invariants$/', $title)) $p=15/49;
    if (preg_match('/^10_[0-9]+$/', $title)) $p=15/165;
    if (preg_match('/^10_[0-9]+_Quantum_Invariants$/', $title)) $p=15/165;
    if (preg_match('/^K11[an][0-9]+$/', $title)) $p=15/552;
    if (preg_match('/^K12[an][0-9]+$/', $title)) $p=15/2176;
    if (preg_match('/^K13[an][0-9]+$/', $title)) $p=15/9988;
    if (preg_match('/^K14[an][0-9]+$/', $title)) $p=15/46962;
    if (preg_match('/^L8[an][0-9]+$/', $title)) $p=15/29;
    if (preg_match('/^L9[an][0-9]+$/', $title)) $p=15/83;
    if (preg_match('/^L10[an][0-9]+$/', $title)) $p=15/287;
    if (preg_match('/^L11[an][0-9]+$/', $title)) $p=15/1007;
    if (preg_match('/^L12[an][0-9]+$/', $title)) $p=15/4276;
    if (preg_match('/^L13[an][0-9]+$/', $title)) $p=15/7539;
    if (preg_match('/^T\([0-9]+,[0-9]+\)$/', $title)) $p=15/36;
    return $p;
}
$connection = mysql_connect($wgDBserver, $wgDBuser, $wgDBpassword);
mysql_select_db($wgDBname, $connection);
$query = "SELECT *
    FROM mw_page
    WHERE page_namespace=0 AND page_is_redirect=0";
// First pass: compute the total weight $Z and the page count $N.
$res = mysql_query($query, $connection);
$Z = 0; $N=0;
while ($row = mysql_fetch_array($res)) {
    $title = $row["page_title"];
    $p = p($title);
    print "$title -> $p\n";
    $Z += $p; ++$N;
}
// Second pass: set each page_random to the running sum of normalized
// weights, so that the gap behind a page is $p/$Z.
$res = mysql_query($query, $connection);
$r = 0;
while ($row = mysql_fetch_array($res)) {
    $id = $row["page_id"];
    $title = $row["page_title"];
    $random = $row["page_random"];
    $p = p($title);
    $r += $p/$Z;
    $action = "UPDATE mw_page SET page_random=$r WHERE page_id=$id";
    print "$title, $random: $action\n";
    // mysql_query($action, $connection);
}
print "\$N=$N; \$Z=$Z\n";
?>
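To see why the running sum in the second loop yields the intended probabilities, here is a minimal database-free sketch of the same arithmetic. The function name assignPositions and the four titles and weights are hypothetical and chosen only to keep the numbers easy to follow; they are not taken from the Knot Atlas.

```php
<?php
// Assign each page the running sum of its normalized weight, mirroring
// the "$r += $p/$Z" step of the program above (no database involved).
function assignPositions(array $weights): array {
    $Z = array_sum($weights);           // total weight, as $Z in the program
    $r = 0.0;
    $positions = [];
    foreach ($weights as $title => $p) {
        $r += $p / $Z;                  // the gap behind this page is $p / $Z
        $positions[$title] = $r;
    }
    return $positions;
}

// Hypothetical weights: one "interesting" page and three low-weight pages.
$weights = ['Main' => 1.0, 'K13a1' => 0.25, 'K13a2' => 0.25, 'K13a3' => 0.5];
$previous = 0.0;
foreach (assignPositions($weights) as $title => $r) {
    // The gap between consecutive positions is the selection probability.
    printf("%-6s page_random=%.3f probability=%.3f\n", $title, $r, $r - $previous);
    $previous = $r;
}
```

Here the total weight is 2, so the page with weight 1 occupies half of the unit interval and would be selected half of the time, while each page with weight 0.25 would be selected one time in eight. In the same way, a class of n pages that each carry weight 15/n has a combined weight of 15.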