pycldf.orm

Object oriented (read-only) access to CLDF data

To read ORM objects from a pycldf.Dataset, use one of two methods:

  • pycldf.Dataset.objects

  • pycldf.Dataset.get_object

Both will return default implementations of the objects, i.e. instances of the corresponding class defined in this module. To customize these objects,

  1. subclass the default and specify the appropriate component (i.e. the table of the CLDF dataset which holds rows to be transformed to this type):

    from pycldf.orm import Language
    
    class Variety(Language):
        __component__ = 'LanguageTable'
    
        def custom_method(self):
            pass
    
  2. pass the class into the objects or get_object method.
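Step 2 could look like the sketch below. The helper name `load_as` is hypothetical, and the `cls` keyword argument is an assumption about the signatures of `Dataset.objects` and `Dataset.get_object` that should be checked against the installed pycldf version. The dataset is passed in, so the helper does not depend on how it was loaded.

```python
def load_as(dataset, table, cls, object_id=None):
    """Return ORM objects from `table`, instantiated as `cls`.

    `dataset` is expected to be a pycldf.Dataset and `cls` a customized
    subclass such as the Variety class from step 1.
    """
    if object_id is None:
        # All rows of the table, each wrapped in the custom class.
        return dataset.objects(table, cls=cls)
    # A single row, looked up by its CLDF id.
    return dataset.get_object(table, object_id, cls=cls)


# Hypothetical usage:
#   varieties = load_as(dataset, 'LanguageTable', Variety)
#   variety = load_as(dataset, 'LanguageTable', Variety, object_id='some-id')
```

Without the class argument, both methods fall back to the default implementations described above.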

Limitations:

  • We only support foreign key constraints for CLDF reference properties targeting either a component’s CLDF id or its primary key. This is because CSVW does not support unique constraints other than the one implied by the primary key declaration.

  • This functionality comes with the typical “more convenient API vs. less performance and bigger memory footprint” trade-off. If you run into problems with this, you might want to load your data into a SQLite database using the pycldf.db module and access it via SQL. Some numbers (to be interpreted relative to each other): reading ~400,000 rows from a ValueTable of a StructureDataset takes

    • ~2 secs with csvcut, i.e. only making sure it’s valid CSV

    • ~15 secs iterating over pycldf.Dataset['ValueTable']

    • ~35 secs iterating over pycldf.Dataset.objects('ValueTable')
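To get comparable numbers for your own dataset, a small timing helper is enough. This is a minimal sketch: `timed_count` is a hypothetical name, and the usage lines assume a loaded pycldf.Dataset called `dataset`.

```python
import time


def timed_count(rows):
    """Consume an iterable and return (number of items, elapsed seconds)."""
    start = time.perf_counter()
    count = sum(1 for _ in rows)
    return count, time.perf_counter() - start


# Hypothetical usage with a pycldf.Dataset named `dataset`:
#   timed_count(dataset['ValueTable'])          # plain row dicts
#   timed_count(dataset.objects('ValueTable'))  # ORM objects
```

As with the figures above, the two timings are best read relative to each other, not as absolute values.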

The Object base class

class pycldf.orm.Object(dataset, row)[source]

Represents a row of a CLDF component table.

Subclasses of Object are instantiated when calling Dataset.objects or Dataset.get_object.

Variables
  • dataset – Reference to the Dataset instance this object was loaded from.

  • data – An OrderedDict with a copy of the row the object was instantiated with.

  • cldf – A dict with CLDF-specified properties of the row, keyed with CLDF terms.

  • id – The value of the CLDF id property of the row.

  • name – The value of the CLDF name property of the row.

  • description – The value of the CLDF description property of the row.

  • pk – The value of the column specified as primary key for the table. (May differ from id)

Parameters

row (dict) –

aboutUrl(col='id')[source]

The table’s aboutUrl property, expanded with the object’s row as context.

Return type

typing.Optional[str]

all_related(relation)[source]

CLDF reference properties can be list-valued. This method returns all related objects for such a property.

Parameters

relation (str) –

Return type

typing.Union[pycldf.util.DictTuple, list]

property component: str

Name of the CLDF component the object belongs to. Can be used to look up the corresponding table via obj.dataset[obj.component_name()].

Return type

str

propertyUrl(col='id')[source]

The table’s propertyUrl property, expanded with the object’s row as context.

references[source]

pycldf.Reference instances associated with the object.

>>> obj.references[0].source['title']
>>> obj.references[0].fields.title
>>> obj.references[0].description  # The "context", typically cited pages

related(relation)[source]

The CLDF ontology specifies several “reference properties”. This method returns the first related object specified by such a property.

Parameters

relation (str) – a CLDF reference property name.

Returns

related Object instance.
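The difference between following a reference property to its first target and to all of its targets can be sketched as follows. The helper `related_names` is hypothetical. The sketch assumes that the list-valued counterpart of `related` is called `all_related`, and that `related` returns None when the property is empty. Relation names such as 'parameterReference' are CLDF reference property names.

```python
def related_names(obj, relation):
    """Return (first_name, all_names) for the objects linked via `relation`."""
    first = obj.related(relation)           # a single object, assumed None if unset
    everything = obj.all_related(relation)  # a collection, possibly empty
    first_name = first.name if first is not None else None
    return first_name, [o.name for o in everything]


# Hypothetical usage with an ORM object `form` from a FormTable:
#   related_names(form, 'parameterReference')
```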

valueUrl(col='id')[source]

The table’s valueUrl property, expanded with the object’s row as context.

Component-specific object classes

class pycldf.orm.Borrowing(dataset, row)[source]
Parameters

row (dict) –

class pycldf.orm.Code(dataset, row)[source]
Parameters

row (dict) –

class pycldf.orm.Cognateset(dataset, row)[source]
Parameters

row (dict) –

class pycldf.orm.Cognate(dataset, row)[source]
Parameters

row (dict) –

class pycldf.orm.Contribution(dataset, row)[source]
Parameters

row (dict) –

class pycldf.orm.Entry(dataset, row)[source]
Parameters

row (dict) –

class pycldf.orm.Example(dataset, row)[source]
Parameters

row (dict) –

class pycldf.orm.Form(dataset, row)[source]
Parameters

row (dict) –

class pycldf.orm.FunctionalEquivalentset(dataset, row)[source]
Parameters

row (dict) –

class pycldf.orm.FunctionalEquivalent(dataset, row)[source]
Parameters

row (dict) –

class pycldf.orm.Language(dataset, row)[source]
Parameters

row (dict) –

glottolog_languoid(glottolog_api)[source]

Get a Glottolog languoid associated with the Language.

Parameters

glottolog_api – pyglottolog.Glottolog instance or dict mapping glottocodes to pyglottolog.languoids.Languoid instances.

Returns

pyglottolog.languoids.Languoid instance or None.
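When many languages need to be resolved, passing a prebuilt dict avoids repeated API lookups. The sketch below uses a hypothetical helper, `languoids_by_language`, and relies only on the documented behaviour that the method returns a languoid or None.

```python
def languoids_by_language(languages, glottolog_api):
    """Map each language id to its Glottolog languoid, skipping misses.

    `glottolog_api` may be a Glottolog API instance or a dict mapping
    glottocodes to languoids, as accepted by glottolog_languoid().
    """
    result = {}
    for lang in languages:
        languoid = lang.glottolog_languoid(glottolog_api)
        if languoid is not None:
            result[lang.id] = languoid
    return result


# Hypothetical usage:
#   languoids_by_language(dataset.objects('LanguageTable'), glottocode_to_languoid)
```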

property lonlat
Returns

(longitude, latitude) pair
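Because the pair is ordered (longitude, latitude), it maps directly onto GeoJSON point coordinates. The following sketch uses a hypothetical helper, `geojson_features`. It assumes that `lonlat` is None (or contains None) for languages without coordinates, and it converts the values to float in case they are stored as decimals.

```python
def geojson_features(languages):
    """Build a GeoJSON FeatureCollection from Language objects with coordinates."""
    features = []
    for lang in languages:
        lonlat = lang.lonlat
        # Skip languages without usable coordinates (assumed None or partial).
        if not lonlat or None in lonlat:
            continue
        lon, lat = lonlat
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [float(lon), float(lat)]},
            "properties": {"id": lang.id, "name": lang.name},
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical usage:
#   geojson_features(dataset.objects('LanguageTable'))
```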

class pycldf.orm.Media(dataset, row)[source]
Parameters

row (dict) –

class pycldf.orm.Parameter(dataset, row)[source]
Parameters

row (dict) –

concepticon_conceptset(concepticon_api)[source]

Get a Concepticon conceptset associated with the Parameter.

Parameters

concepticon_api – pyconcepticon.Concepticon instance or dict mapping conceptset IDs to pyconcepticon.models.Conceptset instances.

Returns

pyconcepticon.models.Conceptset instance or None.

class pycldf.orm.Sense(dataset, row)[source]
Parameters

row (dict) –

class pycldf.orm.Value(dataset, row)[source]
Parameters

row (dict) –