Creating the take home database

From Knot Atlas

Revision as of 10:45, 18 July 2007

This page is really only for Scott and Dror. It documents the procedure for creating RDF dumps from the Knot Atlas. If you're only interested in the actual data, go to The Take Home Database.

Quick start

  1. Go to /www/html/data/
  2. As root, run ./create-rdf.sh

What does this do?

XML dump

First, we dump the current version of every page in the Knot Atlas to XML, using MediaWiki's /w/maintenance/dumpBackup.php script. This produces the file katlas.xml.
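
As a rough sketch, the dump step might look something like the following. The --current and --quiet options are standard dumpBackup.php options, but the dump_xml function name, the MW_MAINT variable, and the exact options that create-rdf.sh really passes are assumptions made for illustration. The path is copied from the text above and may be relative to the web root on the actual server.

```shell
#!/bin/sh
# Hypothetical sketch of the XML dump step; the real create-rdf.sh may differ.

# Location of the MediaWiki maintenance scripts (path as quoted on this page;
# override MW_MAINT if it lives elsewhere on the server).
MW_MAINT="${MW_MAINT:-/w/maintenance}"

dump_xml() {
    # $1 = output file, e.g. katlas.xml
    # --current : export only the latest revision of each page
    # --quiet   : suppress progress reporting
    # dumpBackup.php writes the XML to standard output, so we redirect it.
    php "$MW_MAINT/dumpBackup.php" --current --quiet > "$1"
}

# Assumed usage: dump_xml katlas.xml
```

Wrapping the call in a function keeps the output filename in one place, so the later steps can refer to the same katlas.xml.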

Convert to RDF

This is actually a two-step process. First, we create RDF statements in an on-disk RDF repository. Second, we dump these statements back out to the file katlas.rdf.

Cleaning up

The file katlas.xml gets deleted, as it's not so useful, and the file katlas.rdf gets gzipped to katlas.rdf.gz. It's then available at http://katlas.org/data/katlas.rdf.gz.
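
A minimal sketch of this cleanup, assuming both files sit in the same directory. The cleanup_dump function name is hypothetical, and create-rdf.sh may well do this differently.

```shell
#!/bin/sh
# Hypothetical sketch of the cleanup step; the real create-rdf.sh may differ.

cleanup_dump() {
    # $1 = directory holding katlas.xml and katlas.rdf
    dir="$1"
    # The intermediate XML dump is not needed once the RDF exists.
    rm -f "$dir/katlas.xml"
    # gzip replaces katlas.rdf with katlas.rdf.gz; -f overwrites an older .gz.
    gzip -f "$dir/katlas.rdf"
}

# Assumed usage: cleanup_dump /www/html/data
```

Using -f means a rerun replaces the previous katlas.rdf.gz rather than stopping to ask whether to overwrite it.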