Faceted Browsing with RDF

From ActiveArchives


This page is currently being worked on.

What is faceted browsing?

Implementing Faceted Browsing with SPARQL

PREFIX dc:<http://purl.org/dc/elements/1.1/>
SELECT ?doc ?title
WHERE {
  ?doc dc:title ?title .
  {
    {?doc dc:language "Dutch"@nl .}
  }
  {
    {?doc dc:creator <http://www.sarma.be/tags/Jeroen_Peeters> .}
    UNION
    {?doc dc:creator <http://www.sarma.be/tags/Alexander_Baervoets> .}
  }
}
ORDER BY ?title

Implementing Faceted Browsing in Python & SPARQL

This code uses the rdflib library (version 3.1) and a homegrown SparqlQuery convenience class.
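The SparqlQuery class itself is not shown on this page. The following is a hypothetical minimal stand-in, guessed from how the functions below call it; it is not the original class. It assumes that where() adds a pattern that must always match, that where_clause() collects alternatives for one facet, and that where_clause_end() joins those alternatives with UNION, which would produce queries shaped like the SPARQL example above.

```python
class SparqlQuery(object):
    """Hypothetical stand-in for the homegrown query builder (not the original)."""

    def __init__(self):
        self.prefixes = []
        self.variables = ""
        self.distinct = False
        self.patterns = []      # finished WHERE patterns, all of which must match
        self.alternatives = []  # patterns of the UNION group currently being built
        self.filters = []
        self.order = None

    def prefix(self, decl):
        self.prefixes.append(decl)

    def select(self, variables, distinct=False):
        self.variables = variables
        self.distinct = distinct

    def where(self, pattern):
        self.patterns.append(pattern)

    def where_clause(self, pattern):
        # Assumption: each call adds one alternative to the current group.
        self.alternatives.append(pattern)

    def where_clause_end(self):
        # Assumption: alternatives are ORed (UNION); the group itself is ANDed.
        if self.alternatives:
            union = " UNION ".join("{ %s }" % p for p in self.alternatives)
            self.patterns.append("{ %s }" % union)
        self.alternatives = []

    def filter(self, expr):
        self.filters.append(expr)

    def orderby(self, variables):
        self.order = variables

    def render(self):
        lines = ["PREFIX %s" % p for p in self.prefixes]
        distinct = "DISTINCT " if self.distinct else ""
        lines.append("SELECT %s%s" % (distinct, self.variables))
        lines.append("WHERE {")
        lines.extend("  " + p for p in self.patterns)
        lines.extend("  FILTER %s" % f for f in self.filters)
        lines.append("}")
        if self.order:
            lines.append("ORDER BY %s" % self.order)
        return "\n".join(lines)


# Rebuild the creator part of the SPARQL example with the stand-in class.
q = SparqlQuery()
q.prefix("dc:<http://purl.org/dc/elements/1.1/>")
q.select("?doc ?title")
q.where("?doc dc:title ?title .")
q.where_clause("?doc dc:creator <http://www.sarma.be/tags/Jeroen_Peeters> .")
q.where_clause("?doc dc:creator <http://www.sarma.be/tags/Alexander_Baervoets> .")
q.where_clause_end()
q.orderby("?title")
query_text = q.render()
```

Keeping the alternatives of one facet in a single UNION group means that several values chosen within a facet widen the result, while each additional facet narrows it.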

The process is basically:

  1. Get a list of all the relationships for the current set of items.
  2. For each relationship, produce a list of all possible values and their associated counts.
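The two steps above can be sketched without a SPARQL endpoint. The example below is only an illustration: plain (subject, predicate, object) tuples stand in for RDF triples, the sample data is made up, and the function names are hypothetical rather than taken from the code on this page.

```python
from collections import Counter

# Made-up sample data; prefixed names are used as plain strings for brevity.
TRIPLES = [
    ("doc1", "dc:title", "Title A"),
    ("doc1", "dc:creator", "Jeroen_Peeters"),
    ("doc1", "dc:language", "nl"),
    ("doc2", "dc:title", "Title B"),
    ("doc2", "dc:creator", "Alexander_Baervoets"),
    ("doc2", "dc:language", "nl"),
    ("doc3", "dc:title", "Title C"),
    ("doc3", "dc:creator", "Jeroen_Peeters"),
    ("doc3", "dc:language", "en"),
]


def matching_docs(triples, facet):
    """Titled subjects that satisfy every facet (OR within a facet, AND across)."""
    docs = set(s for s, p, o in triples if p == "dc:title")
    for rel, values in facet.items():
        docs &= set(s for s, p, o in triples if p == rel and o in values)
    return docs


def relations(triples, facet):
    """Step 1: every relationship used by the current set of items."""
    docs = matching_docs(triples, facet)
    return sorted(set(p for s, p, o in triples if s in docs))


def value_counts(triples, relation, facet):
    """Step 2: values of one relationship and how many items carry each."""
    # Selections made in the relationship being counted are ignored,
    # so its unselected values stay visible with their counts.
    others = dict((r, v) for r, v in facet.items() if r != relation)
    docs = matching_docs(triples, others)
    return Counter(o for s, p, o in triples if s in docs and p == relation)


facet = {"dc:language": ["nl"]}
rels = relations(TRIPLES, facet)
creator_counts = value_counts(TRIPLES, "dc:creator", facet)
language_counts = value_counts(TRIPLES, "dc:language", facet)
```

With the Dutch-language selection active, the creator counts cover only the two Dutch documents, while the language counts still cover all three.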

In this case the "context" is defined by two things:

def getRelations (http, baseurl, facet=None, norels=None):
    """ Returns the list of all (unique) relations (urirefs) to things that have titles
    norels: optional list of relationship URLs to exclude
    """
 
    q = SparqlQuery()
    q.prefix("dc:<http://purl.org/dc/elements/1.1/>")
    q.select("?rel", distinct=True)
    q.where("?doc dc:title ?title .")
    q.where("?doc ?rel ?obj .")
    if facet:
        for rel, values in facet.items():
            for val in values:
                q.where_clause("?doc <%s> %s ." % (rel, val.n3()))
            q.where_clause_end()
    if norels:
        for relurl in norels:
            q.filter("(?rel != <%s>) ." % relurl)
    q = q.render()
    # print q
    results = sparql_query_list(http, baseurl, q)
    relations = [b['rel'] for b in results]
    return relations

...

def getRelationValueCounts (http, baseurl, relation, facet=None):
    """
    Returns a list of (value, count) tuples, ordered by value, giving the number
    of documents (things with titles) that have each value for the relation.
    """
    q = SparqlQuery()
    q.prefix("dc:<http://purl.org/dc/elements/1.1/>")
    q.select("?obj")
    q.where("?doc dc:title ?title .")
    if facet:
        for rel, values in facet.items():
            # don't filter a category with values in the same category
            # e.g. only filter a count list using selections in the *other* filters
            if rel == relation: continue
            for val in values:
                q.where_clause("?doc <%s> %s ." % (rel, val.n3()))
            q.where_clause_end()
 
    q.where("?doc <%s> ?obj ." % relation)
    q.orderby("?obj")
    results = sparql_query_list(http, baseurl, q.render())
    results = [b['obj'] for b in results]
 
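    # The results are ordered by ?obj, so equal values are adjacent:
    # count the length of each run of identical values.
    ret = []
    curvalue = None
    curcount = 0
    for value in results:
        if curvalue != value:
            # Test against None so that an empty literal is still counted.
            if curvalue is not None:
                ret.append((curvalue, curcount))
            curvalue = value
            curcount = 0
        curcount += 1
    if curvalue is not None:
        ret.append((curvalue, curcount))
 
    return ret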
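As an aside, the run counting at the end of getRelationValueCounts could also be written with itertools.groupby from the standard library. This is an alternative sketch, not the original code, and count_runs is a hypothetical name. Like the loop above, it relies on the values arriving already sorted, which the ORDER BY clause is meant to ensure.

```python
from itertools import groupby


def count_runs(sorted_values):
    """Return (value, count) pairs for each run of equal adjacent values."""
    return [(value, len(list(group))) for value, group in groupby(sorted_values)]


# Plain strings stand in for the rdflib terms returned by the query.
counts = count_runs(["en", "nl", "nl", "nl"])
```

If the input could not be assumed sorted, collections.Counter would be the safer choice, since it does not depend on equal values being adjacent.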