pycldf.ext.discovery

This module provides a function (get_dataset()) implementing dataset discovery.

The scope of discoverable datasets can be extended by plugins, i.e. Python packages which register additional DatasetResolver subclasses using the entry point pycldf_dataset_resolver.

pycldf itself comes with two resolvers:

  • LocalResolver

  • GenericUrlResolver

Additional resolvers:

  • The cldfzenodo package (>=1.0) provides a dataset resolver for DOI URLs pointing to the Zenodo archive.
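
To check which resolvers are registered in a given environment, the entry point group can be queried with the standard library. The following is a minimal sketch and not part of pycldf: the helper name registered_resolver_names is hypothetical, and only the group name pycldf_dataset_resolver is taken from the description above.

```python
from importlib.metadata import entry_points

# Entry point group named in the module description above.
GROUP = "pycldf_dataset_resolver"


def registered_resolver_names(group=GROUP):
    """Return the sorted names of all entry points registered under `group`.

    Hypothetical helper for inspection only; it does not load the resolvers.
    """
    try:
        # Python 3.10+ accepts selection keywords.
        eps = entry_points(group=group)
    except TypeError:
        # Python 3.8 / 3.9: entry_points() takes no arguments and returns a dict.
        eps = entry_points().get(group, [])
    return sorted(ep.name for ep in eps)


if __name__ == "__main__":
    for name in registered_resolver_names():
        print(name)
```

If no plugin is installed, the list only contains whatever pycldf itself registers, or is empty when pycldf is not installed.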

class pycldf.ext.discovery.DatasetResolver

Virtual base class for dataset resolvers.

Variables:

priority – A number between 0 and 10, determining the call order of registered resolvers. Resolvers with higher priority will be called earlier. Thus, resolvers specifying a high priority should be quick in figuring out whether they apply to a locator.
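
The effect of priority on call order can be illustrated with a small stand-alone sketch. It does not use pycldf's classes: PrefixResolver and resolve are hypothetical names, and the convention that a resolver returns None when it does not apply to a locator is an assumption made for this example.

```python
import pathlib


class PrefixResolver:
    """Hypothetical resolver: applies when the locator starts with `prefix`."""

    def __init__(self, name, prefix, priority):
        self.name = name
        self.prefix = prefix
        self.priority = priority  # assumed range 0..10, as described above

    def __call__(self, locator, download_dir):
        # A cheap string check, so a high priority is justified.
        if locator.startswith(self.prefix):
            return download_dir / self.name  # placeholder result, nothing is downloaded
        return None  # assumption: None means "does not apply"


def resolve(locator, download_dir, resolvers):
    """Call resolvers in order of descending priority; return the first result."""
    for resolver in sorted(resolvers, key=lambda r: r.priority, reverse=True):
        result = resolver(locator, download_dir)
        if result is not None:
            return result
    raise ValueError(f"no resolver applies to {locator!r}")


resolvers = [
    PrefixResolver("generic", "https://", priority=2),
    PrefixResolver("zenodo", "https://zenodo.org/", priority=8),
]
print(resolve("https://zenodo.org/record/1", pathlib.Path("dl"), resolvers))
```

Both resolvers in this sketch match the Zenodo URL, but the more specific one is tried first because of its higher priority.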

pycldf.ext.discovery.get_dataset(locator, download_dir, base=None)
Parameters:
  • locator (str) – Dataset locator as specified in “Dataset discovery”.

  • download_dir (pathlib.Path) – Directory to which to download remote data if necessary.

  • base (typing.Optional[pathlib.Path]) – Optional path relative to which local paths in locator must be resolved.

Return type:

pycldf.dataset.Dataset
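
A call might look like the following sketch, which relies only on the signature documented above. The wrapper name load_dataset is hypothetical; the import is deferred into the function so that the snippet can be loaded even where pycldf is not installed.

```python
import pathlib


def load_dataset(locator, download_dir, base=None):
    """Resolve `locator` to a pycldf Dataset (hypothetical convenience wrapper).

    `download_dir` is created if missing, because remote data may be
    downloaded into it.
    """
    # Deferred import: pycldf is only needed when the function is called.
    from pycldf.ext.discovery import get_dataset

    download_dir = pathlib.Path(download_dir)
    download_dir.mkdir(parents=True, exist_ok=True)
    if base is not None:
        base = pathlib.Path(base)
    return get_dataset(locator, download_dir, base=base)
```

The download directory is passed in by the caller rather than created as a temporary directory, so that downloaded files remain available for as long as the returned dataset is in use.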