RDF | Techstreams

Practical semantic web – creating a catalog of Linked Data

April 4, 2010

Posted by Anand Mallaya in computers, internet, technology, tutorial, web.
Tags: dataset, linked data, linked data catalog, linked open data catalog, LOD catalog, RDF, semantic web, VoID vocabulary

Today I am going to create a semantic web document: a catalog of the Linked Data datasets listed on linkeddata.org. That list is what I am going to publish in RDF.

Choose the correct vocabulary – There are generic vocabularies such as Dublin Core and FOAF, and specialized vocabularies such as DCAT and VoID for creating catalogs. DCAT is designed for government data catalogs, so I chose the VoID vocabulary, which is designed for a single dataset provider. It also reuses generic vocabularies such as FOAF and Dublin Core.
Select suitable tools – You need a tool to edit the RDF document. There are plenty of RDF editors, such as Rhodonite, a tool for RDF editing and browsing, but I couldn't understand it well because of its poor documentation and help. So I chose an online VoID editor from DERI Galway. Its output is in Turtle format, but there are tools to convert a Turtle document to RDF/XML, such as the online RDF validator/converter at rdfabout.com.
Create the semantic graph – First I choose a dataset and add it to my catalog. To start with, I chose the CrunchBase entry listed on linkeddata.org. Go to the VoID editor and enter the following details:

Dataset URI: http://cb.semsol.org/

Dataset Homepage URI: http://cb.semsol.org/

Dataset Name: Crunchbase

Dataset Description: RDFized Crunchbase entries

Example Resource: http://cb.semsol.org/company/yahoo

Dataset Topic : business, database

Vocabulary URI: http://www.w3.org/1999/02/22-rdf-syntax-ns# , http://www.w3.org/2003/01/geo/wgs84_pos#, http://cb.semsol.org/ns#

Publisher : http://semsol.com

SPARQL endpoint : http://cb.semsol.org/sparql

Now the entry for the Crunchbase dataset is ready in the VoID vocabulary, in the text area on the right side. It is in Turtle notation.

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix void: <http://rdfs.org/ns/void#> .
@prefix : <#> .

## your dataset
<http://cb.semsol.org/> rdf:type void:Dataset ;
    foaf:homepage <http://cb.semsol.org/> ;
    dcterms:title "Crunchbase" ;
    dcterms:description "RDFized Crunchbase entries" ;
    dcterms:publisher <http://semsol.com> ;
    void:sparqlEndpoint <http://cb.semsol.org/sparql> ;
    void:vocabulary <http://www.w3.org/1999/02/22-rdf-syntax-ns#> ;
    void:vocabulary <http://www.w3.org/2003/01/geo/wgs84_pos#> ;
    void:vocabulary <http://cb.semsol.org/ns#> ;
    void:exampleResource <http://cb.semsol.org/company/yahoo> ;
    dcterms:subject <http://dbpedia.org/resource/Database> ;
    dcterms:subject <http://dbpedia.org/resource/Business> .
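If filling in the online form for every dataset gets tedious, an entry like the one above could also be produced by a small script. This is only a minimal sketch and not how the DERI editor works: the helper names (iri, literal, void_entry) and the parameter names are my own assumptions, and it only handles simple single-line text values.

```python
# Hypothetical helper (not part of the VoID editor): renders one
# VoID dataset description as Turtle text from the form fields.

PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "void": "http://rdfs.org/ns/void#",
}


def iri(value):
    """Wrap a URI in angle brackets, as Turtle expects."""
    return "<" + value + ">"


def literal(value):
    """Quote a plain string, escaping backslashes and double quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def void_entry(uri, homepage, title, description, publisher,
               sparql_endpoint, vocabularies, example, subjects):
    """Return a Turtle document describing a single void:Dataset."""
    pairs = [
        ("rdf:type", "void:Dataset"),
        ("foaf:homepage", iri(homepage)),
        ("dcterms:title", literal(title)),
        ("dcterms:description", literal(description)),
        ("dcterms:publisher", iri(publisher)),
        ("void:sparqlEndpoint", iri(sparql_endpoint)),
    ]
    pairs += [("void:vocabulary", iri(v)) for v in vocabularies]
    pairs.append(("void:exampleResource", iri(example)))
    pairs += [("dcterms:subject", iri(s)) for s in subjects]

    header = "".join(
        "@prefix %s: <%s> .\n" % (prefix, ns) for prefix, ns in PREFIXES.items()
    )
    # Predicate/object pairs for the same subject are separated by " ;".
    body = " ;\n    ".join("%s %s" % (p, o) for p, o in pairs)
    return header + "\n" + iri(uri) + " " + body + " .\n"


crunchbase = void_entry(
    uri="http://cb.semsol.org/",
    homepage="http://cb.semsol.org/",
    title="Crunchbase",
    description="RDFized Crunchbase entries",
    publisher="http://semsol.com",
    sparql_endpoint="http://cb.semsol.org/sparql",
    vocabularies=[
        "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
        "http://www.w3.org/2003/01/geo/wgs84_pos#",
        "http://cb.semsol.org/ns#",
    ],
    example="http://cb.semsol.org/company/yahoo",
    subjects=[
        "http://dbpedia.org/resource/Database",
        "http://dbpedia.org/resource/Business",
    ],
)
print(crunchbase)
```

Calling void_entry once per dataset with different values gives one Turtle entry each, which can then go through the same conversion step as a hand-made entry.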


Convert to an RDF/XML file (serialize) – Copy the dataset details in Turtle notation and go to the RDF/XML converter tool. Paste the Turtle content there, select N-Triples/Turtle as the input format, and click validate. The result is given below.

<?xml version="1.0"?>
<rdf:RDF xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:void="http://rdfs.org/ns/void#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dcterms="http://purl.org/dc/terms/">
<void:Dataset rdf:about="http://cb.semsol.org/">
<foaf:homepage rdf:resource="http://cb.semsol.org/" />
<dcterms:title>Crunchbase</dcterms:title>
<dcterms:description>RDFized Crunchbase entries</dcterms:description>
<dcterms:publisher rdf:resource="http://semsol.com" />
<void:sparqlEndpoint rdf:resource="http://cb.semsol.org/sparql" />
<void:vocabulary rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#" />
<void:vocabulary rdf:resource="http://www.w3.org/2003/01/geo/wgs84_pos#" />
<void:vocabulary rdf:resource="http://cb.semsol.org/ns#" />
<void:exampleResource rdf:resource="http://cb.semsol.org/company/yahoo" />
<dcterms:subject rdf:resource="http://dbpedia.org/resource/Database" />
<dcterms:subject rdf:resource="http://dbpedia.org/resource/Business" />
</void:Dataset>
</rdf:RDF>
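Before moving on, it may be worth checking that the converter's output is well-formed XML and still carries every property that was entered. The sketch below does that with Python's standard xml.etree.ElementTree module. It is a plain XML read, not a full RDF parser, and it assumes the flat layout shown above (one void:Dataset element with simple child properties); the function name read_properties is my own.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
VOID_NS = "http://rdfs.org/ns/void#"

rdf_xml = """<?xml version="1.0"?>
<rdf:RDF xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:void="http://rdfs.org/ns/void#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dcterms="http://purl.org/dc/terms/">
<void:Dataset rdf:about="http://cb.semsol.org/">
<foaf:homepage rdf:resource="http://cb.semsol.org/" />
<dcterms:title>Crunchbase</dcterms:title>
<dcterms:description>RDFized Crunchbase entries</dcterms:description>
<dcterms:publisher rdf:resource="http://semsol.com" />
<void:sparqlEndpoint rdf:resource="http://cb.semsol.org/sparql" />
<void:vocabulary rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#" />
<void:vocabulary rdf:resource="http://www.w3.org/2003/01/geo/wgs84_pos#" />
<void:vocabulary rdf:resource="http://cb.semsol.org/ns#" />
<void:exampleResource rdf:resource="http://cb.semsol.org/company/yahoo" />
<dcterms:subject rdf:resource="http://dbpedia.org/resource/Database" />
<dcterms:subject rdf:resource="http://dbpedia.org/resource/Business" />
</void:Dataset>
</rdf:RDF>"""


def read_properties(xml_text):
    """Yield (dataset URI, property tag, value) for each child property.

    ET.fromstring raises ParseError if the XML is not well-formed,
    which is the first thing worth catching.
    """
    root = ET.fromstring(xml_text)
    for dataset in root.findall("{%s}Dataset" % VOID_NS):
        subject = dataset.get("{%s}about" % RDF_NS)
        for prop in dataset:
            # A property is either a link (rdf:resource) or literal text.
            value = prop.get("{%s}resource" % RDF_NS) or (prop.text or "").strip()
            yield subject, prop.tag, value


properties = list(read_properties(rdf_xml))
for subject, tag, value in properties:
    print(subject, tag, value)
```

ElementTree reports tags in the {namespace}name form, so dcterms:title shows up as {http://purl.org/dc/terms/}title.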

Now repeat the above process and create entries for all the datasets listed on linkeddata.org.
Combine the RDF entries into a single file. There is no need to copy the XML declaration and the <rdf:RDF> wrapper each time; copy only the <void:Dataset ...> elements into one shared <rdf:RDF> root,
something like this:
<?xml version="1.0"?>
<rdf:RDF xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:void="http://rdfs.org/ns/void#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dcterms="http://purl.org/dc/terms/">
<void:Dataset rdf:about="http://cb.semsol.org/">
…
…
</void:Dataset>
<void:Dataset rdf:about="http://dbpedia.org/">
….
….
</void:Dataset>
<void:Dataset rdf:about="http://www.geonames.org/">
…
…
</void:Dataset>
</rdf:RDF>
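Copying and pasting works for a handful of entries, but the merge can also be scripted. Here is a minimal sketch using only Python's standard library. The function name merge_catalog is my own, the two input documents are cut down to a title each to keep the example short, and the "DBpedia" title is just an illustrative value rather than the real entry.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "void": "http://rdfs.org/ns/void#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Keep the familiar prefixes in the output instead of ns0, ns1, ...
for prefix, uri in NAMESPACES.items():
    ET.register_namespace(prefix, uri)

CRUNCHBASE = """<?xml version="1.0"?>
<rdf:RDF xmlns:void="http://rdfs.org/ns/void#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dcterms="http://purl.org/dc/terms/">
<void:Dataset rdf:about="http://cb.semsol.org/">
<dcterms:title>Crunchbase</dcterms:title>
</void:Dataset>
</rdf:RDF>"""

# Illustrative second entry; the title is an assumed value.
DBPEDIA = """<?xml version="1.0"?>
<rdf:RDF xmlns:void="http://rdfs.org/ns/void#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dcterms="http://purl.org/dc/terms/">
<void:Dataset rdf:about="http://dbpedia.org/">
<dcterms:title>DBpedia</dcterms:title>
</void:Dataset>
</rdf:RDF>"""


def merge_catalog(documents):
    """Collect every void:Dataset element under a single rdf:RDF root."""
    catalog = ET.Element("{%s}RDF" % NAMESPACES["rdf"])
    for text in documents:
        root = ET.fromstring(text)
        for dataset in root.findall("{%s}Dataset" % NAMESPACES["void"]):
            catalog.append(dataset)
    body = ET.tostring(catalog, encoding="unicode")
    return '<?xml version="1.0"?>\n' + body


merged = merge_catalog([CRUNCHBASE, DBPEDIA])
print(merged)
```

The result has one XML declaration and one rdf:RDF wrapper, with the dataset entries in the order they were passed in.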
And the catalog is ready in RDF/XML with all the datasets added. Save it as a file with the .rdf extension, such as Linked_data_catalog.rdf. Now a machine can understand (if programmed wisely), to some extent, what data are available there, and convert that data into meaningful information and then into knowledge.
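To give an idea of what "programmed wisely" could look like, here is a small sketch that asks a catalog which datasets cover a given topic and where their SPARQL endpoints are. It uses only Python's standard library and assumes the flat RDF/XML layout used in this post. The function name datasets_about is my own, and the GeoNames entry, with its title and Geography topic, is an illustrative value rather than the real catalog entry.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "void": "http://rdfs.org/ns/void#",
    "dcterms": "http://purl.org/dc/terms/",
}

# Cut-down catalog; the second entry is an assumed, illustrative one.
catalog_xml = """<?xml version="1.0"?>
<rdf:RDF xmlns:void="http://rdfs.org/ns/void#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dcterms="http://purl.org/dc/terms/">
<void:Dataset rdf:about="http://cb.semsol.org/">
<dcterms:title>Crunchbase</dcterms:title>
<void:sparqlEndpoint rdf:resource="http://cb.semsol.org/sparql" />
<dcterms:subject rdf:resource="http://dbpedia.org/resource/Business" />
</void:Dataset>
<void:Dataset rdf:about="http://www.geonames.org/">
<dcterms:title>GeoNames</dcterms:title>
<dcterms:subject rdf:resource="http://dbpedia.org/resource/Geography" />
</void:Dataset>
</rdf:RDF>"""


def datasets_about(xml_text, topic_uri):
    """Return title and SPARQL endpoint of datasets tagged with topic_uri."""
    resource = "{%s}resource" % NS["rdf"]
    root = ET.fromstring(xml_text)
    matches = []
    for dataset in root.findall("void:Dataset", NS):
        subjects = [s.get(resource) for s in dataset.findall("dcterms:subject", NS)]
        if topic_uri not in subjects:
            continue
        endpoint = dataset.find("void:sparqlEndpoint", NS)
        matches.append({
            "title": dataset.findtext("dcterms:title", default="", namespaces=NS),
            # Not every dataset publishes an endpoint, so allow None.
            "endpoint": endpoint.get(resource) if endpoint is not None else None,
        })
    return matches


business = datasets_about(catalog_xml, "http://dbpedia.org/resource/Business")
print(business)
```

A program that finds an endpoint this way could go on to send SPARQL queries to it, which is where the catalog starts to pay off.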
You can find the catalog here: Linkeddata datasets catalog (note: not yet ready, come back later)
