pycldf.media

Accessing media associated with a CLDF dataset.

You can iterate over the File objects associated with media using the Media wrapper:

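from pycldf.media import Media

# dataset is a pycldf Dataset; directory is the pathlib.Path to save files into.
for f in Media(dataset):
    # Only keep media whose main type is audio.
    if f.mimetype.type == 'audio':
        f.save(directory)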

or instantiate a File from a pycldf.orm.Object:

from pycldf.media import File

f = File.from_dataset(dataset, dataset.get_object('MediaTable', 'theid'))

class pycldf.media.File(media, row)

A File represents a row in a MediaTable, providing functionality to access the contents.

Variables:
  • id – The ID of the item.

  • url – The URL (as str) to download the content associated with the item.

File supports media files within ZIP archives, as specified in CLDF 1.2. That is,

  • read() will extract the specified file from a downloaded ZIP archive, and

  • save() will write a (deflated) ZIP archive containing the specified file as its single member.
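The ZIP behaviour described above can be illustrated with the standard library alone. This is a minimal sketch, not pycldf's implementation: the helper names `zip_single_member` and `read_member` and the member name `recording.wav` are hypothetical, and the archive is kept in memory rather than downloaded.

```python
import io
import zipfile


def zip_single_member(name: str, content: bytes) -> bytes:
    """Pack content into a deflated ZIP archive with name as its only member."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, content)
    return buf.getvalue()


def read_member(archive: bytes, name: str) -> bytes:
    """Extract the member called name from a ZIP archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return zf.read(name)


# Hypothetical payload standing in for a media file.
payload = b"not really audio data"
archive = zip_single_member("recording.wav", payload)
restored = read_member(archive, "recording.wav")

# Inspect the archive: one member, stored with the deflate method.
with zipfile.ZipFile(io.BytesIO(archive)) as zf:
    members = zf.namelist()
    method = zf.getinfo("recording.wav").compress_type
```

Writing a single member keeps the mapping between a media item and its archive unambiguous: reading back only requires the member name.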

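Parameters:
  • media – The Media wrapper the file belongs to.

  • row – The row of the MediaTable that the file represents.

classmethod from_dataset(ds, row_or_object)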

Factory method to instantiate a File bypassing the Media wrapper.

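Parameters:
  • ds (pycldf.dataset.Dataset) – The dataset the media item belongs to.

  • row_or_object – The MediaTable row, or a pycldf.orm.Object for it.

Return type: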

pycldf.media.File

local_path(d)
Return type:

pathlib.Path

Returns:

The expected path of the file in the directory d.

Parameters:

d (pathlib.Path) – The directory in which the file is expected.

property mimetype: Mimetype

The Mimetype object associated with the item.

Although the CLDF spec requires the mediaType column, this requirement might be disabled. If so, “out-of-band” methods are used to determine a mimetype for the file.
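One possible “out-of-band” method is to guess the media type from the file extension in the URL. The sketch below uses only the standard library. Whether pycldf works this way is an assumption, and `guess_media_type`, its default value and the example URLs are hypothetical.

```python
import mimetypes
from urllib.parse import urlparse


def guess_media_type(url: str, default: str = "application/octet-stream") -> str:
    """Guess a media type from the extension of the URL's path component."""
    # Strip query string and fragment so only the path is inspected.
    path = urlparse(url).path
    guessed, _content_encoding = mimetypes.guess_type(path)
    # Fall back to a generic binary type when the extension is unknown or absent.
    return guessed or default


image_type = guess_media_type("https://example.org/media/picture.png?size=large")
unknown_type = guess_media_type("https://example.org/media/noextension")
```

Extension-based guessing is only a heuristic, so an explicit mediaType value should take precedence whenever the column is present.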

read(d=None)
Parameters:

d – A local directory where the file has been saved before. If None, the content will be read from the file’s URL.

Return type:

typing.Union[None, str, bytes]
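The three-way return type can be pictured with a small stand-alone helper. This is a sketch under stated assumptions, not the library's code: it assumes that text media are decoded to str, that other media are returned as bytes, and that None signals missing content. The name `read_saved` and its `is_text` flag are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Union


def read_saved(path: Path, is_text: bool, encoding: str = "utf-8") -> Union[None, str, bytes]:
    """Read previously saved content as str (text), bytes (binary) or None (missing)."""
    if not path.exists():
        # Assumption: nothing to read is reported as None rather than an error.
        return None
    if is_text:
        return path.read_text(encoding=encoding)
    return path.read_bytes()


with tempfile.TemporaryDirectory() as tmp:
    text_file = Path(tmp) / "notes.txt"
    text_file.write_text("hello", encoding="utf-8")
    as_text = read_saved(text_file, is_text=True)
    as_bytes = read_saved(text_file, is_text=False)
    missing = read_saved(Path(tmp) / "absent.bin", is_text=False)
```

Callers that need one uniform type can check the result with isinstance before processing it.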

save(d)

Saves the content of File in directory d.

Return type:

pathlib.Path

Returns:

Path of the local file where the content has been saved.

Note

The identifier of the media item (i.e. the content of the ID column of the associated row) is used as the stem of the name of the file to be written.

Parameters:

d (pathlib.Path) – The directory in which to save the content.
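The naming rule in the note can be sketched as follows. The helpers `expected_path` and `save_content` are hypothetical stand-ins for local_path() and save(). How the real methods choose a file extension is not stated here, so the suffix is passed in explicitly.

```python
import tempfile
from pathlib import Path


def expected_path(directory: Path, item_id: str, suffix: str = "") -> Path:
    """Build the target path: the item ID is the stem, suffix the extension."""
    return Path(directory) / f"{item_id}{suffix}"


def save_content(directory: Path, item_id: str, content: bytes, suffix: str = "") -> Path:
    """Write content below directory and return the path that was written."""
    target = expected_path(directory, item_id, suffix)
    # Create the directory on demand so a fresh download location works.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(content)
    return target


with tempfile.TemporaryDirectory() as tmp:
    saved = save_content(Path(tmp) / "media", "theid", b"\x00\x01", suffix=".bin")
    saved_name = saved.name
    saved_stem = saved.stem
    round_trip = saved.read_bytes()
```

Because the path depends only on the directory and the ID, the same function can locate a file later without re-downloading it.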

class pycldf.media.MediaTable(ds)

Container class for a Dataset’s media items.

Parameters:

ds (pycldf.dataset.Dataset) – The dataset whose media items are accessed.

class pycldf.media.Mimetype(s)

A media type specification.

Variables:
  • type – The (main) type as str.

  • subtype – The subtype as str.

  • encoding – The encoding specified with a “charset” parameter.
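How a media type string breaks down into these three variables can be shown with a small parser. This is an illustrative sketch, not the Mimetype class itself: `ParsedMediaType` and `parse_media_type` are hypothetical names. What the library does when no “charset” parameter is given is not stated here, so the sketch returns None in that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedMediaType:
    type: str
    subtype: str
    encoding: Optional[str]


def parse_media_type(s: str) -> ParsedMediaType:
    """Split a string such as 'text/plain; charset=UTF-8' into its parts."""
    # The part before the first ';' is type/subtype; the rest are parameters.
    essence, *params = [part.strip() for part in s.split(";")]
    main, _, sub = essence.partition("/")
    encoding = None
    for param in params:
        key, _, value = param.partition("=")
        if key.strip().lower() == "charset":
            # Parameter values may be quoted.
            encoding = value.strip().strip('"')
    return ParsedMediaType(main.lower(), sub.lower(), encoding)


text = parse_media_type("text/plain; charset=UTF-8")
audio = parse_media_type("audio/mpeg")
```

With this split, a check such as the `f.mimetype.type == 'audio'` test in the introductory example only needs the main type.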