API Documentation

Skos module

This module contains a read-only model of the SKOS specification.

To complement the SKOS specification, some elements were borrowed from the SKOS-THES specification (eg. superordinate and subordinate array).

New in version 0.2.0.

class skosprovider.skos.Collection(id, uri=None, concept_scheme=None, labels=[], notes=[], sources=[], members=[], member_of=[], superordinates=[])[source]

A SKOS Collection.

concept_scheme = None

The ConceptScheme this Collection is a part of.

id = None

An id for this Collection within a vocabulary

label(language=u'any')[source]

Provide a single label for this collection.

This uses the label() function to determine which label to return.

Parameters:language (string) – The preferred language to receive the label in. This should be a valid IANA language tag.
Return type:skosprovider.skos.Label or False if no labels were found.
labels = []

A list of skosprovider.skos.Label instances.

member_of = []

A list of collection ids.

members = []

A list of concept or collection ids.

notes = []

A list of skosprovider.skos.Note instances.

sources = []

A list of skosprovider.skos.Source instances.

superordinates = []

A list of concept ids.

type = u'collection'

The type of this concept or collection.

eg. ‘collection’

uri = None

A proper uri for this Collection

class skosprovider.skos.Concept(id, uri=None, concept_scheme=None, labels=[], notes=[], sources=[], broader=[], narrower=[], related=[], member_of=[], subordinate_arrays=[], matches={})[source]

A SKOS Concept.

broader = []

A list of concept ids.

concept_scheme = None

The ConceptScheme this Concept is a part of.

id = None

An id for this Concept within a vocabulary

eg. 12345

label(language=u'any')[source]

Provide a single label for this concept.

This uses the label() function to determine which label to return.

Parameters:language (string) – The preferred language to receive the label in. This should be a valid IANA language tag.
Return type:skosprovider.skos.Label or False if no labels were found.
labels = []

A list of Label instances.

matches = ({},)

A dictionary. Each key is a matchtype whose value is a list of URIs.

matchtypes = [u'close', u'exact', u'related', u'broad', u'narrow']

Matches with Concepts in other ConceptSchemes.

This dictionary contains a key for each type of Match (close, exact, related, broad, narrow). Attached to each key is a list of URIs.
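
As an illustration of the shape described above, the following is a minimal sketch of a matches dictionary built with plain Python. The target URI is hypothetical and nothing here calls the library itself.

```python
# The match types listed in the matchtypes attribute above.
matchtypes = ['close', 'exact', 'related', 'broad', 'narrow']

# One key per match type, each holding a list of URIs.
matches = {matchtype: [] for matchtype in matchtypes}

# Hypothetical URI of a concept in another concept scheme.
matches['exact'].append('http://id.example.org/other-scheme/42')
```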

member_of = []

A list of collection ids.

narrower = []

A list of concept ids.

notes = []

A list of Note instances.

related = []

A list of concept ids.

sources = []

A list of skosprovider.skos.Source instances.

subordinate_arrays = []

A list of collection ids.

type = u'concept'

The type of this concept or collection.

eg. ‘concept’

uri = None

A proper uri for this Concept

eg. http://id.example.com/skos/trees/1

class skosprovider.skos.ConceptScheme(uri, labels=[], notes=[], sources=[], languages=[])[source]

A SKOS ConceptScheme.

Parameters:
label(language=u'any')[source]

Provide a single label for this conceptscheme.

This uses the label() function to determine which label to return.

Parameters:language (string) – The preferred language to receive the label in. This should be a valid IANA language tag.
Return type:skosprovider.skos.Label or False if no labels were found.
labels = []

A list of skosprovider.skos.Label instances.

languages = []

A list of languages that are being used in the ConceptScheme.

There’s no guarantee that labels or notes in other languages do not exist.

notes = []

A list of skosprovider.skos.Note instances.

sources = []

A list of skosprovider.skos.Source instances.

uri = None

A URI for this conceptscheme.

class skosprovider.skos.Label(label, type=u'prefLabel', language=u'und')[source]

A SKOS Label.

static is_valid_type(type)[source]

Check if the argument is a valid SKOS label type.

Parameters:type (string) – The type to be checked.
label = None

The label itself (eg. churches, trees, Spitfires, …)

language = u'und'

The language the label is in (eg. en, en-US, nl, nl-BE).

type = u'prefLabel'

The type of this label (prefLabel, altLabel, hiddenLabel or sortLabel).

valid_types = [u'prefLabel', u'altLabel', u'hiddenLabel', u'sortLabel']

The valid types for a label

class skosprovider.skos.Note(note, type=u'note', language=u'und', markup=None)[source]

A SKOS Note.

static is_valid_markup(markup)[source]

Check if the argument is a valid type of markup.

Parameters:markup (string) – The type to be checked.
static is_valid_type(type)[source]

Check if the argument is a valid SKOS note type.

Parameters:type (string) – The type to be checked.
language = u'und'

The language the note is in (eg. en, en-US, nl, nl-BE).

markup = None

What kind of markup does the note contain?

If not None, the note should be treated as a certain type of markup. Currently only HTML is allowed.

note = None

The note itself

type = u'note'

The type of this note (note, definition, scopeNote, …).

valid_types = [u'note', u'changeNote', u'definition', u'editorialNote', u'example', u'historyNote', u'scopeNote']

The valid types for a note.

class skosprovider.skos.Source(citation, markup=None)[source]

A Source for a concept, collection or scheme.

citation = None

A bibliographic citation for this source.

static is_valid_markup(markup)[source]

Check if the argument is a valid type of markup.

Parameters:markup (string) – The type to be checked.
markup = None

What kind of markup does the source contain?

If not None, the source should be treated as a certain type of markup. Currently only HTML is allowed.

skosprovider.skos.dict_to_label(dict)[source]

Transform a dict with keys label, type and language into a Label.

Only the label key is mandatory. If type is not present, it will default to prefLabel. If language is not present, it will default to und.

If the argument passed is not a dict, this method just returns the argument.
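
The defaults described above can be sketched as follows. This is an illustration only: Label here is a small stand-in class, and dict_to_label_sketch is a hypothetical name, not the library's implementation.

```python
class Label:
    """Stand-in for skosprovider.skos.Label, for illustration only."""

    def __init__(self, label, type='prefLabel', language='und'):
        self.label = label
        self.type = type
        self.language = language


def dict_to_label_sketch(value):
    """Mimic the documented behaviour: dicts become Labels, anything else passes through."""
    if not isinstance(value, dict):
        return value
    return Label(
        value['label'],                    # the only mandatory key
        value.get('type', 'prefLabel'),    # documented default
        value.get('language', 'und'),      # documented default
    )


lbl = dict_to_label_sketch({'label': 'churches'})
passthrough = dict_to_label_sketch(lbl)
```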

skosprovider.skos.dict_to_note(dict)[source]

Transform a dict with keys note, type, language and markup into a Note.

Only the note key is mandatory. If type is not present, it will default to note. If language is not present, it will default to und. If markup is not present it will default to None.

If the argument passed is already a Note, this method just returns the argument.

skosprovider.skos.dict_to_source(dict)[source]

Transform a dict with key ‘citation’ into a Source.

If the argument passed is already a Source, this method just returns the argument.

skosprovider.skos.filter_labels_by_language(labels, language, broader=False)[source]

Filter a list of labels, leaving only labels of a certain language.

Parameters:
  • labels (list) – A list of Label.
  • language (str) – An IANA language string, eg. nl or nl-BE.
  • broader (boolean) – When true, will also match nl-BE when filtering on nl. When false, only exact matches are considered.
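
The effect of the broader flag can be sketched like this. The sketch assumes that a broader match means comparing only the primary language subtag, which is obtained here with a simple split; the library may parse language tags differently. Label is a stand-in namedtuple.

```python
from collections import namedtuple

# Stand-in for skosprovider.skos.Label, for illustration only.
Label = namedtuple('Label', ['label', 'type', 'language'])


def primary_subtag(tag):
    """Return the primary language subtag, eg. 'nl' for 'nl-BE'."""
    return tag.split('-')[0].lower()


def filter_by_language_sketch(labels, language, broader=False):
    """Keep exact matches only, or any label sharing the primary subtag."""
    if broader:
        wanted = primary_subtag(language)
        return [l for l in labels if primary_subtag(l.language) == wanted]
    return [l for l in labels if l.language == language]


labels = [
    Label('kerken', 'prefLabel', 'nl'),
    Label('kerken', 'prefLabel', 'nl-BE'),
    Label('churches', 'prefLabel', 'en'),
]
exact = filter_by_language_sketch(labels, 'nl')
broad = filter_by_language_sketch(labels, 'nl', broader=True)
```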
skosprovider.skos.find_best_label_for_type(labels, language, labeltype)[source]

Find the best label for a certain labeltype.

Parameters:
  • labels (list) – A list of Label.
  • language (str) – An IANA language string, eg. nl or nl-BE.
  • labeltype (str) – Type of label to look for, eg. prefLabel.
skosprovider.skos.label(labels=[], language=u'any', sortLabel=False)[source]

Provide a label for a list of labels.

The items in the list of labels are assumed to be either instances of Label, or dicts with at least the key label in them. These will be passed to the dict_to_label() function.

This method tries to find a label by looking if there’s a pref label for the specified language. If there’s no pref label, it looks for an alt label. It disregards hidden labels.

While matching languages, preference will be given to exact matches. But, if no exact match is present, an inexact match will be attempted. This might be because a label in language nl-BE is being requested, but only nl or even nl-NL is present. Similarly, when requesting nl, a label with language nl-NL or even nl-Latn-NL will also be considered, providing no label is present that has an exact match with the requested language.

If language ‘any’ was specified, all labels will be considered, regardless of language.

To find a label without a specified language, pass None as language.

If a language or None was specified, and no label could be found, this method will automatically try to find a label in some other language.

Finally, if no label could be found, None is returned.

Parameters:
  • language (string) – The preferred language to receive the label in. This should be a valid IANA language tag.
  • sortLabel (boolean) – Should sortLabels be considered or not? If True, sortLabels will be preferred over prefLabels. Bear in mind that these are still language dependent. So, it’s possible to have a different sortLabel per language.
Return type:

A Label or None if no label could be found.
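
The selection rules above leave some room for interpretation. The sketch below shows one possible reading: per label type, try an exact language match, then an inexact one; move from prefLabel to altLabel; finally retry with any language. Label is a stand-in namedtuple, the helper names are hypothetical, and treating None as und is an assumption.

```python
from collections import namedtuple

# Stand-in for skosprovider.skos.Label, for illustration only.
Label = namedtuple('Label', ['label', 'type', 'language'])


def _primary(tag):
    return tag.split('-')[0].lower()


def _best_for_type(labels, language, labeltype):
    """Exact language match first, then a match on the primary subtag."""
    typed = [l for l in labels if l.type == labeltype]
    if language == 'any':
        return typed[0] if typed else None
    exact = [l for l in typed if l.language == language]
    if exact:
        return exact[0]
    inexact = [l for l in typed if _primary(l.language) == _primary(language)]
    return inexact[0] if inexact else None


def label_sketch(labels, language='any', sort_label=False):
    if language is None:
        language = 'und'  # assumption: None means "undetermined language"
    order = ['prefLabel', 'altLabel']  # hidden labels are never considered
    if sort_label:
        order.insert(0, 'sortLabel')
    for labeltype in order:
        found = _best_for_type(labels, language, labeltype)
        if found:
            return found
    if language != 'any':
        # Nothing in the requested language: retry in any language.
        return label_sketch(labels, 'any', sort_label)
    return None


labels = [
    Label('kerken', 'prefLabel', 'nl-NL'),
    Label('churches', 'prefLabel', 'en'),
    Label('godshuizen', 'altLabel', 'nl-BE'),
    Label('kerk', 'hiddenLabel', 'nl-BE'),
]
dutch = label_sketch(labels, 'nl-BE')
german = label_sketch(labels, 'de')
hidden_only = label_sketch([labels[3]], 'nl-BE')
```

In this reading an inexact prefLabel match wins over an exact altLabel match, so requesting nl-BE yields the nl-NL prefLabel rather than the nl-BE altLabel.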

skosprovider.skos.valid_markup = [None, u'HTML']

Valid types of markup for a note or a source.

Providers module

This module provides an abstraction of controlled vocabularies.

This abstraction allows our application to work with both local and remote vocabs (be they SOAP, REST, XML-RPC or something else).

The basic idea is that we have skos providers. Each provider is an instance of a VocabularyProvider. The same class can thus be reused with different configurations to handle different vocabs. Generally speaking, every instance of a certain VocabularyProvider will deal with concepts and collections from a single conceptscheme.

class skosprovider.providers.VocabularyProvider(metadata, **kwargs)[source]

An interface that all vocabulary providers must follow.

__init__(metadata, **kwargs)[source]

Create a new provider and register some metadata.

Parameters:
  • uri_generator – An object that implements the skosprovider.uri.UriGenerator interface.
  • concept_scheme – A ConceptScheme. If not present, a default ConceptScheme will be created with a uri generated by the DefaultConceptSchemeUrnGenerator in combination with the provider id.
  • metadata (dict) –

    Metadata essential to this provider. Possible metadata:

    • id: A unique identifier for the vocabulary. Required.
    • default_language: Used to determine what language to use when returning labels if no language is specified. Will default to en if not specified.
    • subject: A list of subjects or tags that define what the provider is about or what the provider can handle. This information can then be used when querying a Registry for providers.
    • dataset: A dict detailing the dataset the conceptscheme and all concepts and collections are part of. Currently the contents of the dictionary are undefined except for a uri attribute that must be present.
concept_scheme = None

The ConceptScheme this provider serves.

expand(id)[source]

Expand a concept or collection to all its narrower concepts.

This method should recurse and also return narrower concepts of narrower concepts.

If the id passed belongs to a skosprovider.skos.Concept, the id of the concept itself should be included in the return value.

If the id passed belongs to a skosprovider.skos.Collection, the id of the collection itself must not be present in the return value. In this case the return value includes all the member concepts and their narrower concepts.

Parameters:id – A concept or collection id.
Return type:A list of ids or False if the concept or collection doesn’t exist.
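
The recursion described above could look roughly like the sketch below. It uses two hypothetical in-memory dictionaries instead of a real provider and assumes the hierarchy contains no cycles.

```python
# Hypothetical data: concept id -> narrower concept ids.
CONCEPTS = {'1': ['2', '3'], '2': ['4'], '3': [], '4': []}
# Hypothetical data: collection id -> member ids.
COLLECTIONS = {'10': ['2', '3']}


def _extend_unique(target, items):
    """Append items to target, skipping ids that are already present."""
    for item in items:
        if item not in target:
            target.append(item)


def expand_sketch(id):
    id = str(id)
    if id in CONCEPTS:
        result = [id]  # a concept includes its own id
        for child in CONCEPTS[id]:
            _extend_unique(result, expand_sketch(child) or [])
        return result
    if id in COLLECTIONS:
        result = []  # a collection does not include its own id
        for member in COLLECTIONS[id]:
            _extend_unique(result, expand_sketch(member) or [])
        return result
    return False  # unknown id


from_concept = expand_sketch('1')
from_collection = expand_sketch('10')
unknown = expand_sketch('99')
```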
find(query, **kwargs)[source]

Find concepts that match a certain query.

Currently query is expected to be a dict, so that complex queries can be passed. You can use this dict to search for concepts or collections with a certain label, with a certain type and for concepts that belong to a certain collection.

# Find anything that has a label of church.
provider.find({'label': 'church'})

# Find all concepts that are a part of collection 5.
provider.find({'type': 'concept', 'collection': {'id': 5}})

# Find all concepts, collections or children of these
# that belong to collection 5.
provider.find({'collection': {'id': 5, 'depth': 'all'}})

# Find anything that has a label of church.
# Preferentially display a label in Dutch.
provider.find({'label': 'church'}, language='nl')
Parameters:
  • query

    A dict that can be used to express a query. The following keys are permitted:

    • label: Search for something with this label value. An empty label is equal to searching for all concepts.
    • type: Limit the search to certain SKOS elements. If not present or None, all is assumed:
    • collection: Search only for concepts belonging to a certain collection. This argument should be a dict with two keys:
      • id: The id of a collection. Required.
      • depth: Can be members or all. Optional. If not present, members is assumed, meaning only concepts or collections that are a direct member of the collection should be considered. When set to all, this method should return concepts and collections that are a member of the collection or are a narrower concept of a member of the collection.
  • language (string) – Optional. If present, it should be a language-tag. This language-tag is passed on to the underlying providers and used when selecting the label to display for each concept.
  • sort (string) – Optional. If present, it should be one of id, label or sortlabel. With sortlabel, providers should sort on a sortLabel where one is present and fall back to a regular label otherwise.
  • sort_order (string) – Optional. What order to sort in: asc or desc. Defaults to asc.
Returns:

A list of concepts and collections. Each of these is a dict with the following keys:

  • id: id within the conceptscheme
  • uri: uri of the concept or collection
  • type: concept or collection
  • label: A label to represent the concept or collection. It is determined by looking at the language parameter, the default language of the provider and finally falls back to en.

get_all(**kwargs)[source]

Returns all concepts and collections in this provider.

Parameters:
  • language (string) – Optional. If present, it should be a language-tag. This language-tag is passed on to the underlying providers and used when selecting the label to display for each concept.
  • sort (string) – Optional. If present, it should be one of id, label or sortlabel. With sortlabel, providers should sort on a sortLabel where one is present and fall back to a regular label otherwise.
  • sort_order (string) – Optional. What order to sort in: asc or desc. Defaults to asc.
Returns:

A list of concepts and collections. Each of these is a dict with the following keys:

  • id: id within the conceptscheme
  • uri: uri of the concept or collection
  • type: concept or collection
  • label: A label to represent the concept or collection. It is determined by looking at the language parameter, the default language of the provider and finally falls back to en.
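
The sort and sort_order parameters could behave along the lines of the sketch below. It is not the library's code: the sortlabel key on the result dicts is hypothetical, and comparing labels in lower case is an assumption.

```python
def sort_results_sketch(results, sort='id', sort_order='asc'):
    """Sort result dicts by id, label or sortlabel, ascending or descending."""
    def key(item):
        if sort == 'sortlabel':
            # Fall back to the regular label when no sortlabel is present.
            return (item.get('sortlabel') or item['label']).lower()
        if sort == 'label':
            return item['label'].lower()
        return str(item['id'])

    return sorted(results, key=key, reverse=(sort_order == 'desc'))


results = [
    {'id': '1', 'label': 'The Beech', 'sortlabel': 'beech, the'},
    {'id': '2', 'label': 'Oak'},
]
by_label = sort_results_sketch(results, sort='label')
by_sortlabel = sort_results_sketch(results, sort='sortlabel')
by_id_desc = sort_results_sketch(results, sort='id', sort_order='desc')
```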

get_by_id(id)[source]

Get all information on a concept or collection, based on id.

Providers should assume that all ids passed are strings. If a provider knows that internally it uses numeric identifiers, it’s up to the provider to do the typecasting. Generally, this should not be done by changing the ids themselves (eg. from int to str), but by doing the id comparisons in a type-agnostic way.

Since this method could be used to find both concepts and collections, it’s assumed that there are no id collisions between concepts and collections.

Return type:skosprovider.skos.Concept or skosprovider.skos.Collection or False if the concept or collection is unknown to the provider.
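
A type-agnostic id comparison, as recommended above, might be sketched like this. The records list and the function name are hypothetical; the point is that stored ids keep their original type and only the comparison is normalised.

```python
# Hypothetical internal storage mixing numeric and string ids.
records = [
    {'id': 1, 'label': 'oak'},
    {'id': '2', 'label': 'beech'},
]


def get_by_id_sketch(id):
    wanted = str(id)
    for record in records:
        # Compare string forms without rewriting the stored id.
        if str(record['id']) == wanted:
            return record
    return False


oak = get_by_id_sketch('1')
beech = get_by_id_sketch(2)
missing = get_by_id_sketch('3')
```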
get_by_uri(uri)[source]

Get all information on a concept or collection, based on a URI.

Return type:skosprovider.skos.Concept or skosprovider.skos.Collection or False if the concept or collection is unknown to the provider.
get_children_display(id, **kwargs)[source]

Return a list of concepts or collections that should be displayed under this concept or collection.

Parameters:
  • language (string) – Optional. If present, it should be a language-tag. This language-tag is passed on to the underlying providers and used when selecting the label to display for each concept.
  • sort (string) – Optional. If present, it should be one of id, label or sortlabel. With sortlabel, providers should sort on a sortLabel where one is present and fall back to a regular label otherwise.
  • sort_order (string) – Optional. What order to sort in: asc or desc. Defaults to asc.
  • id (str) – A concept or collection id.
Returns:

A list of concepts and collections. Each of these is a dict with the following keys:

  • id: id within the conceptscheme
  • uri: uri of the concept or collection
  • type: concept or collection
  • label: A label to represent the concept or collection. It is determined by looking at the language parameter, the default language of the provider and finally falls back to en.

get_metadata()[source]

Get some metadata on the provider or the vocab it represents.

Return type:Dict.
get_top_concepts(**kwargs)[source]

Returns all top-level concepts in this provider.

Top-level concepts are concepts that have no broader concepts themselves. They might have narrower concepts, but this is not mandatory.

Parameters:
  • language (string) – Optional. If present, it should be a language-tag. This language-tag is passed on to the underlying providers and used when selecting the label to display for each concept.
  • sort (string) – Optional. If present, it should be one of id, label or sortlabel. With sortlabel, providers should sort on a sortLabel where one is present and fall back to a regular label otherwise.
  • sort_order (string) – Optional. What order to sort in: asc or desc. Defaults to asc.
Returns:

A list of concepts, NOT collections. Each of these is a dict with the following keys:

  • id: id within the conceptscheme
  • uri: uri of the concept or collection
  • type: concept or collection
  • label: A label to represent the concept or collection. It is determined by looking at the language parameter, the default language of the provider and finally falls back to en.
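
The definition of a top-level concept given above can be sketched with a simple filter. The data and URIs below are hypothetical, and label selection by language is left out for brevity.

```python
# Hypothetical concepts, each with a list of broader concept ids.
CONCEPTS = [
    {'id': '1', 'uri': 'urn:example:trees:1', 'label': 'trees', 'broader': []},
    {'id': '2', 'uri': 'urn:example:trees:2', 'label': 'oak', 'broader': ['1']},
    {'id': '3', 'uri': 'urn:example:trees:3', 'label': 'shrubs', 'broader': []},
]


def top_concepts_sketch(concepts):
    """Return result dicts for the concepts that have no broader concept."""
    return [
        {'id': c['id'], 'uri': c['uri'], 'type': 'concept', 'label': c['label']}
        for c in concepts
        if not c['broader']
    ]


top = top_concepts_sketch(CONCEPTS)
```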

get_top_display(**kwargs)[source]

Returns all concepts or collections that form the top-level of a display hierarchy.

As opposed to the get_top_concepts(), this method can possibly return both concepts and collections.

Parameters:
  • language (string) – Optional. If present, it should be a language-tag. This language-tag is passed on to the underlying providers and used when selecting the label to display for each concept.
  • sort (string) – Optional. If present, it should be one of id, label or sortlabel. With sortlabel, providers should sort on a sortLabel where one is present and fall back to a regular label otherwise.
  • sort_order (string) – Optional. What order to sort in: asc or desc. Defaults to asc.
Returns:

A list of concepts and collections. Each of these is a dict with the following keys:

  • id: id within the conceptscheme
  • uri: uri of the concept or collection
  • type: concept or collection
  • label: A label to represent the concept or collection. It is determined by looking at the language parameter, the default language of the provider and finally falls back to en.

get_vocabulary_id()[source]

Get an identifier for the vocabulary.

Return type:String or number.
uri_generator = None

The UriGenerator responsible for generating URIs for this provider.

class skosprovider.providers.MemoryProvider(metadata, list, **kwargs)[source]

A provider that keeps everything in memory.

The data is passed in the constructor of this provider as a list of skosprovider.skos.Concept and skosprovider.skos.Collection instances.

__init__(metadata, list, **kwargs)[source]
Parameters:
case_insensitive = True

Is searching for labels case insensitive?

By default, a search for a label is case insensitive. Older versions of this provider were case sensitive. If the older behaviour is desired, it can be enabled by providing a case_insensitive keyword to the constructor.

class skosprovider.providers.DictionaryProvider(metadata, list, **kwargs)[source]

A simple vocab provider that uses a Python list of dicts.

The provider expects a list with elements that are dicts that represent the concepts.

class skosprovider.providers.SimpleCsvProvider(metadata, reader, **kwargs)[source]

A provider that reads a simple csv format into memory.

The supported csv format looks like this: <id>,<preflabel>,<note>,<source>

This provider essentially provides a flat list of concepts. This is commonly associated with short lookup lists.

New in version 0.2.0.

__init__(metadata, reader, **kwargs)[source]
Parameters:
  • metadata – A metadata dictionary.
  • reader – A csv reader.
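
To make the csv format above concrete, the sketch below builds a reader with the standard library and splits each row into the four documented columns. The sample data is invented, and the row handling is only an illustration of the format, not how the provider processes rows internally.

```python
import csv
import io

# Invented sample data in the <id>,<preflabel>,<note>,<source> format.
raw = io.StringIO(
    "1,oak,A deciduous tree,Field guide\n"
    "2,beech,,\n"
)
reader = csv.reader(raw)

rows = []
for row in reader:
    # Pad short rows so the four columns can always be unpacked.
    id_, preflabel, note, source = (row + [''] * 4)[:4]
    rows.append({'id': id_, 'preflabel': preflabel, 'note': note, 'source': source})
```

A reader built this way is the kind of object the reader parameter expects. Iterating over it consumes it, so a fresh reader would be needed for the provider itself.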

Registry module

This module provides a registry for skos providers.

This registry helps us find providers during runtime. We can also apply some operations to all or several providers at the same time.

class skosprovider.registry.Registry[source]

This registry collects all skos providers.

concept_scheme_uri_map = {}

Dictionary mapping concept scheme URIs to vocabulary ids.

find(query, **kwargs)[source]

Launch a query across all or a selection of providers.

# Find anything that has a label of church in any provider.
registry.find({'label': 'church'})

# Find anything that has a label of church with the BUILDINGS provider.
# Attention, this syntax was deprecated in version 0.3.0
registry.find({'label': 'church'}, providers=['BUILDINGS'])

# Find anything that has a label of church with the BUILDINGS provider.
registry.find({'label': 'church'}, providers={'ids': ['BUILDINGS']})

# Find anything that has a label of church with a provider
# marked with the subject 'architecture'.
registry.find({'label': 'church'}, providers={'subject': 'architecture'})

# Find anything that has a label of church in any provider.
# If possible, display the results with a Dutch label.
registry.find({'label': 'church'}, language='nl')
Parameters:
  • query (dict) – The query parameters that will be passed on to each find() method of the selected providers.
  • providers (dict) – Optional. If present, it should be a dictionary. This dictionary can contain any of the keyword arguments available to the get_providers() method. The query will then only be passed to the providers conforming to these arguments.
  • language (string) – Optional. If present, it should be a language-tag. This language-tag is passed on to the underlying providers and used when selecting the label to display for each concept.
Returns:

A list of dicts. Each dict has two keys: id and concepts.
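
The shape of the return value can be illustrated with a small stand-in. ProviderSketch and registry_find_sketch are hypothetical names, and the substring matching on labels is an assumption made only to keep the example short.

```python
class ProviderSketch:
    """Stand-in for a vocabulary provider, for illustration only."""

    def __init__(self, vocabulary_id, concepts):
        self.vocabulary_id = vocabulary_id
        self.concepts = concepts

    def get_vocabulary_id(self):
        return self.vocabulary_id

    def find(self, query, **kwargs):
        # Assumption: a case-insensitive substring match on the label.
        wanted = query.get('label', '').lower()
        return [c for c in self.concepts if wanted in c['label'].lower()]


def registry_find_sketch(providers, query, **kwargs):
    """Run the query on every provider and group the results per provider."""
    return [
        {'id': p.get_vocabulary_id(), 'concepts': p.find(query, **kwargs)}
        for p in providers
    ]


buildings = ProviderSketch('BUILDINGS', [
    {'id': '1', 'uri': 'urn:example:buildings:1', 'type': 'concept', 'label': 'Church'},
])
trees = ProviderSketch('TREES', [
    {'id': '1', 'uri': 'urn:example:trees:1', 'type': 'concept', 'label': 'Oak'},
])
results = registry_find_sketch([buildings, trees], {'label': 'church'})
```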

get_all(**kwargs)[source]

Get all concepts from all providers.

# get all concepts in all providers.
registry.get_all()

# get all concepts in all providers.
# If possible, display the results with a Dutch label.
registry.get_all(language='nl')
Parameters:language (string) – Optional. If present, it should be a language-tag. This language-tag is passed on to the underlying providers and used when selecting the label to display for each concept.
Returns:A list of dicts. Each dict has two keys: id and concepts.
get_by_uri(uri)[source]

Get a concept or collection by its uri.

Returns a single concept or collection if one exists with this uri. Returns False otherwise.

Parameters:uri (string) – The uri to find a concept or collection for.
Raises:ValueError – The uri is invalid.
Return type:skosprovider.skos.Concept or skosprovider.skos.Collection
get_provider(id)[source]

Get a provider by id or uri.

Parameters:id (str) – The identifier for the provider. This can either be the id with which it was registered or the uri of the conceptscheme that the provider services.
Returns:A skosprovider.providers.VocabularyProvider or False if the id or uri is unknown.
get_providers(**kwargs)[source]

Get all providers registered.

If keyword ids is present, get only the providers with these ids.

If keyword subject is present, get only the providers that have this subject.

# Get all providers with subject 'biology'
registry.get_providers(subject='biology')

# Get all providers with id 1 or 2
registry.get_providers(ids=[1,2])

# Get all providers with id 1 or 2 and subject 'biology'
registry.get_providers(ids=[1,2], subject='biology')
Parameters:
  • ids (list) – Only return providers with one of the Ids or URIs.
  • subject (str) – Only return providers with this subject.
Returns:

A list of providers
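
How the ids and subject keywords narrow the selection can be sketched as below. The providers here are plain dicts with invented metadata, and matching on concept scheme URIs is left out of this sketch.

```python
# Hypothetical registered providers, keyed by id.
PROVIDERS = {
    'TREES': {'id': 'TREES', 'subject': ['biology']},
    'PLANTS': {'id': 'PLANTS', 'subject': ['biology']},
    'BUILDINGS': {'id': 'BUILDINGS', 'subject': ['architecture']},
}


def get_providers_sketch(ids=None, subject=None):
    """Apply the ids filter and the subject filter, when given."""
    selected = list(PROVIDERS.values())
    if ids is not None:
        wanted = {str(i) for i in ids}
        selected = [p for p in selected if str(p['id']) in wanted]
    if subject is not None:
        selected = [p for p in selected if subject in p['subject']]
    return selected


everything = get_providers_sketch()
biology = get_providers_sketch(subject='biology')
both = get_providers_sketch(ids=['TREES', 'BUILDINGS'], subject='biology')
```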

providers = {}

Dictionary containing all providers, keyed by id.

register_provider(provider)[source]

Register a skosprovider.providers.VocabularyProvider.

Parameters:provider (skosprovider.providers.VocabularyProvider) – The provider to register.
Raises:RegistryException – A provider with this id or uri has already been registered.
remove_provider(id)[source]

Remove the provider with the given id or URI.

Parameters:id (str) – The identifier for the provider.
Returns:A skosprovider.providers.VocabularyProvider or False if the id is unknown.

Uri module

This module provides utilities for working with URIs.

New in version 0.3.0.

class skosprovider.uri.DefaultConceptSchemeUrnGenerator[source]

Generate a URN for a conceptscheme specific to skosprovider.

Used for generating default URI for providers that do not have an explicit conceptscheme.

generate(**kwargs)[source]

Generate a URI based on parameters passed.

Parameters:id – The id of the conceptscheme.
Return type:string
class skosprovider.uri.DefaultUrnGenerator(vocabulary_id)[source]

Generate a URN specific to skosprovider.

Used for providers that do not implement a specific UriGenerator.

Parameters:vocabulary_id – An identifier for the vocabulary we’re generating URIs for.
generate(**kwargs)[source]

Generate a URI based on parameters passed.

Parameters:id – The id of the concept or collection.
Return type:string
class skosprovider.uri.TypedUrnGenerator(vocabulary_id)[source]

Generate a URN specific to skosprovider that contains a type.

Parameters:vocabulary_id – An identifier for the vocabulary we’re generating URIs for.
generate(**kwargs)[source]

Generate a URI based on parameters passed.

Parameters:
  • id – The id of the concept or collection.
  • type – What we’re generating a URI for: concept or collection.
Return type:

string

class skosprovider.uri.UriGenerator[source]

An abstract class for generating URIs.

generate(**kwargs)[source]

Generate a URI based on parameters passed.

class skosprovider.uri.UriPatternGenerator(pattern)[source]

Generate a URI based on a simple pattern.

generate(**kwargs)[source]

Generate a URI based on parameters passed.

Parameters:id – The id of the concept or collection.
Return type:string
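
A pattern-based generator can be sketched in a few lines. The classes below are stand-ins written for this example, and the use of a %s placeholder in the pattern is an assumption about the pattern syntax.

```python
class UriGeneratorSketch:
    """Stand-in for the abstract UriGenerator interface."""

    def generate(self, **kwargs):
        raise NotImplementedError


class PatternGeneratorSketch(UriGeneratorSketch):
    """Fill the id into a pattern containing a single %s placeholder."""

    def __init__(self, pattern):
        self.pattern = pattern

    def generate(self, **kwargs):
        return self.pattern % kwargs['id']


generator = PatternGeneratorSketch('http://id.example.com/skos/trees/%s')
uri = generator.generate(id=1)
```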
skosprovider.uri.is_uri(uri)[source]

Check if a string is a valid URI according to RFC 3987.

Parameters:uri (string) –
Return type:boolean

Exceptions module

This module provides custom exceptions for skos providers.

New in version 0.5.0.

exception skosprovider.exceptions.ProviderUnavailableException(message)[source]

This exception can be raised by a provider if it’s unable to provide the thesaurus. This can occur when an underlying resource is unavailable (database connection, webservice, …). The message should contain some more information about the problem.
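
A provider might translate a low-level failure into this kind of exception roughly as sketched below. ProviderUnavailableSketch is a stand-in class defined for the example, and fetch_concepts_sketch and failing_connect are hypothetical helpers.

```python
class ProviderUnavailableSketch(Exception):
    """Stand-in for ProviderUnavailableException, for illustration only."""

    def __init__(self, message):
        super().__init__(message)
        self.message = message


def fetch_concepts_sketch(connect):
    """Call the backing resource and wrap connection failures."""
    try:
        return connect()
    except ConnectionError as error:
        raise ProviderUnavailableSketch(
            'Backing service unavailable: %s' % error
        ) from error


def failing_connect():
    # Simulates an unreachable database or webservice.
    raise ConnectionError('timeout')


try:
    fetch_concepts_sketch(failing_connect)
    outcome = 'ok'
except ProviderUnavailableSketch as exc:
    outcome = exc.message
```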

Utils module

This module contains utility functions for dealing with skos providers.

skosprovider.utils.dict_dumper(provider)[source]

Dump a provider to a format that can be passed to a skosprovider.providers.DictionaryProvider.

Parameters:provider (skosprovider.providers.VocabularyProvider) – The provider that will be turned into a dict.
Return type:A list of dicts.

New in version 0.2.0.