Usage

Invenio-Records is a metadata storage module.

A record is a structured collection of fields and values (metadata) that provides information about other data.

A record (and each revision) is identified by a UUID, like most other entities in Invenio.

Invenio-Records is a core component of Invenio that provides a way to create, update and delete records. Records are versioned, which keeps track of modifications and makes it possible to revert to a specific revision.

When creating or updating a record, if the record contains a schema definition, the record data is validated against that schema. The value format of each field can also be validated.

When deleting a record, two options are available:

  • soft deletion: the record is deleted but its identifier and history are kept, which ensures that the record’s identifier cannot be reused and that older revisions can still be retrieved.

  • hard deletion: the record is completely deleted, together with its history.

Record creation and updates can be validated if a schema is provided.

Further documentation is available at https://invenio-records.readthedocs.io/

Initialization

Create a Flask application:

>>> import os
>>> db_url = os.environ.get('SQLALCHEMY_DATABASE_URI', 'sqlite://')
>>> from flask import Flask
>>> app = Flask('myapp')
>>> app.config.update({
...     'SQLALCHEMY_DATABASE_URI': db_url,
...     'SQLALCHEMY_TRACK_MODIFICATIONS': False,
... })

Initialize Invenio-Records dependencies and Invenio-Records itself:

>>> from invenio_db import InvenioDB
>>> ext_db = InvenioDB(app)
>>> from invenio_records import InvenioRecords
>>> ext_records = InvenioRecords(app)

The following examples need to run in a Flask application context, so let’s push one:

>>> app.app_context().push()

For the examples to work, we also need to create the database and tables (note that this example uses an in-memory SQLite database by default):

>>> from invenio_db import db
>>> db.create_all()

CRUD operations

Creation

Let’s create a very simple record:

>>> from invenio_records import Record
>>> record = Record.create({"title": "The title of the record"})
>>> db.session.commit()
>>> assert record.revision_id == 0

A new row has been added to the records_metadata table in the database. It holds the first version of the record metadata (version 1, which corresponds to revision_id 0).

Update

Let’s update the previously created record with new data. This creates a new version of the record with the same UUID but an incremented version/revision id. Update the record and commit the changes to apply them:

>>> record['title'] = 'The title of the 2nd version of the record'
>>> record = record.commit()  # validate new data and store changes
>>> db.session.commit()
>>> assert record.revision_id == 1

A second row has been added (version 2). You can access the different versions like this:

>>> rec_v1 = record.revisions[0]
>>> rec_v2 = record.revisions[1]

Reverting

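To restore the first version of the record, revert to revision 0. The restored data is stored as a new revision, which is why the revision id becomes 2: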

>>> record = record.revert(0)
>>> db.session.commit()
>>> assert record.revision_id == 2
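The revision numbering above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: ToyRecord is a hypothetical class written for illustration, not part of Invenio-Records, and it keeps snapshots in a list instead of database rows.

```python
import copy


class ToyRecord:
    """Hypothetical in-memory stand-in for a versioned record."""

    def __init__(self, data):
        # Every committed state is kept as an independent snapshot.
        self.revisions = [copy.deepcopy(data)]
        self.data = copy.deepcopy(data)

    @property
    def revision_id(self):
        # Zero-based, like the asserts in the examples above.
        return len(self.revisions) - 1

    def commit(self):
        # Store the current data as a new revision.
        self.revisions.append(copy.deepcopy(self.data))
        return self

    def revert(self, revision_id):
        # Reverting does not discard history: the old data is
        # committed again as a brand-new revision.
        self.data = copy.deepcopy(self.revisions[revision_id])
        return self.commit()


record = ToyRecord({"title": "The title of the record"})
record.data["title"] = "The title of the 2nd version of the record"
record.commit()    # revision_id is now 1
record.revert(0)   # revision_id is now 2, data equals revision 0
```

In this model nothing is ever overwritten, so every earlier state stays reachable through the revisions list.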

Patch

It is also possible to patch a record to perform multiple operations in one shot:

>>> record = Record.create({"title": "First title"})
>>> db.session.commit()
>>> assert len(record.revisions) == 1
>>> ops = [
...     {"op": "replace", "path": "/title", "value": "Title first record"},
...     {"op": "add", "path": "/description", "value": "Record description"}
... ]
>>> record = record.patch(ops)
>>> record = record.commit()
>>> db.session.commit()
>>> assert len(record.revisions) == 2

See the JSON Patch documentation for more examples.
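To show what the two operations in the example do to the data, here is a minimal sketch that uses only the standard library. The function apply_simple_patch is hypothetical and handles only "add" and "replace" on top-level members; it is not how Invenio-Records applies patches, and a full JSON Patch implementation supports more operations and nested paths.

```python
import copy


def apply_simple_patch(document, operations):
    """Apply top-level "add" and "replace" operations to a copy of a dict."""
    result = copy.deepcopy(document)
    for operation in operations:
        # Only single-segment paths such as "/title" are handled here.
        key = operation["path"].lstrip("/")
        if operation["op"] == "replace":
            # "replace" requires the target member to exist already.
            if key not in result:
                raise KeyError(f"cannot replace missing member: {key}")
            result[key] = operation["value"]
        elif operation["op"] == "add":
            # "add" creates the member, or overwrites an existing one.
            result[key] = operation["value"]
        else:
            raise ValueError(f"unsupported operation: {operation['op']}")
    return result


original = {"title": "First title"}
ops = [
    {"op": "replace", "path": "/title", "value": "Title first record"},
    {"op": "add", "path": "/description", "value": "Record description"},
]
patched = apply_simple_patch(original, ops)
```

The sketch works on a copy, so the original dictionary is left unchanged.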

Deletion

Let’s create another record and then soft delete it:

>>> record = Record.create({"title": "Record to be deleted"})
>>> db.session.commit()
>>> record['title'] = 'Record to be deleted version 2'
>>> record = record.commit()
>>> db.session.commit()
>>> deleted = record.delete()

Only one row corresponding to this record is left in the database. Notice that the json column is empty but the uuid is still there, which ensures uniqueness. The record can still be retrieved:

>>> deleted = Record.get_record(record.id, with_deleted=True)
>>> assert deleted.id == record.id

Now let’s hard delete it:

>>> deleted = record.delete(force=True)

Trying to retrieve it now raises an exception:

>>> Record.get_record(record.id, with_deleted=True)
Traceback (most recent call last):
  ...
NoResultFound: No row was found for one()
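The difference between the two deletion modes can be sketched with a toy in-memory table. ToyStore and its methods are hypothetical names chosen for illustration; the sketch only mimics the behaviour described above (an emptied row for soft deletion, no row at all for hard deletion) and says nothing about how Invenio-Records implements it.

```python
import uuid


class ToyStore:
    """Hypothetical in-memory table keyed by record identifier."""

    def __init__(self):
        self.rows = {}

    def insert(self, record_id, data):
        # A soft-deleted row still occupies its identifier.
        if record_id in self.rows:
            raise ValueError("identifier already in use")
        self.rows[record_id] = data

    def delete(self, record_id, force=False):
        if force:
            # Hard deletion: the row disappears entirely.
            del self.rows[record_id]
        else:
            # Soft deletion: keep the row, drop the metadata.
            self.rows[record_id] = None

    def get(self, record_id, with_deleted=False):
        data = self.rows[record_id]  # KeyError once hard-deleted
        if data is None and not with_deleted:
            raise LookupError("record is soft-deleted")
        return data


store = ToyStore()
record_id = uuid.uuid4()
store.insert(record_id, {"title": "Record to be deleted"})

store.delete(record_id)  # soft deletion
soft_deleted = store.get(record_id, with_deleted=True)
try:
    store.insert(record_id, {"title": "Reused identifier"})
    reuse_blocked = False
except ValueError:
    reuse_blocked = True

store.delete(record_id, force=True)  # hard deletion
```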

Record validation

When creating or updating a record, the input data can be validated to ensure that it conforms to a specified schema and that value formats are respected. Validation is provided by the jsonschema library.

How jsonschema works

  • Format checker: create a custom format checker (or use one of the available ones), for example to validate that the first letter of a string is uppercase:

    >>> from jsonschema import FormatChecker
    >>> from jsonschema.validators import Draft4Validator
    >>> checker = FormatChecker()
    >>> f = checker.checks("uppercaseFirstLetter")(
    ...     lambda value: value[0].isupper())
    >>> validator = Draft4Validator({"format": "uppercaseFirstLetter"},
    ...                             format_checker=checker)
    

    Now, let’s try it out:

    >>> validator.validate("Title of the record")
    

    No exception is raised, because the data is valid: the first letter is uppercase.

    >>> validator.validate("title of the record")
    Traceback (most recent call last):
      ...
    ValidationError: 'title of the record' is not a 'uppercaseFirstLetter'
      ...
    

    This raises a ValidationError exception, because the first letter is lowercase.

  • Schema validator: create a validator to ensure that the input data structure, fields and types conform to a specific schema.

    >>> schema = {
    ...     'type': 'object',
    ...     'properties': {
    ...         'title': { 'type': 'string' },
    ...         'description': { 'type': 'string' }
    ...     },
    ...     'required': ['title']
    ... }
    

    Try to validate a record without the title field, which is required:

    >>> from jsonschema.validators import validate
    >>> record = {"description": "Description but no title"}
    >>> validate(record, schema)  
    Traceback (most recent call last):
      ...
    ValidationError: 'title' is a required property
      ...
    

If the JSON schema is not defined inside the JSON itself, as in the example, but somewhere else (e.g. a schema provider service), the record should contain a $ref field with the URI of the schema definition. Record provides a method, api.RecordBase.replace_refs(), that resolves the URI in the $ref field and returns a new Record with the schema definition injected.
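The idea of replacing a $ref with the definition it points to can be sketched with a toy resolver. Everything here is an assumption made for illustration: the resolve_refs function, the SCHEMA_REGISTRY dictionary and the example.org URI are hypothetical, and the lookup is done in memory instead of fetching the URI. It is not the implementation behind replace_refs().

```python
import copy

# Hypothetical registry standing in for a schema provider service.
SCHEMA_REGISTRY = {
    "https://example.org/schemas/record.json": {
        "type": "object",
        "properties": {"title": {"type": "string"}},
        "required": ["title"],
    },
}


def resolve_refs(node, registry):
    """Return a copy of ``node`` with each {"$ref": uri} object replaced.

    References inside the looked-up definitions are not followed.
    """
    if isinstance(node, dict):
        if "$ref" in node:
            # Swap the reference for a copy of the referenced definition.
            return copy.deepcopy(registry[node["$ref"]])
        return {key: resolve_refs(value, registry) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_refs(item, registry) for item in node]
    return node


record = {
    "$schema": {"$ref": "https://example.org/schemas/record.json"},
    "title": "The title of the record",
}
resolved = resolve_refs(record, SCHEMA_REGISTRY)
```

As with the method described above, the input is left untouched and a new structure with the definition injected is returned.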

Invenio-Records validation

Let’s put everything together and create a record with validation and format checking: define a schema with a mandatory title field and a validation format for the title field.

>>> from jsonschema import FormatChecker
>>> checker = FormatChecker()
>>> f = checker.checks("uppercaseFirstLetter")(
...     lambda value: value[0].isupper())
>>> schema = {
...     'type':'object',
...     'properties': {
...         'title': {
...             'type':'string',
...             'format': 'uppercaseFirstLetter'
...         },
...         'description': {
...             'type':'string'
...         }
...     },
...     'required': ['title']
... }

Create a new record with an invalid value format for the title field. Notice that the schema must be defined in the record under the $schema field, and that the format checker must be passed as the format_checker keyword argument so that the jsonschema library takes it into account.

>>> record = {
...     "$schema": schema,
...     "title": "title of this record",  # first letter is lowercase
...     "description": "Description of this record"
... }
>>> rec = Record.create(record, format_checker=checker)
Traceback (most recent call last):
  ...
ValidationError: 'title of this record' is not a 'uppercaseFirstLetter'
  ...

Create a new record without the title field:

>>> record = {
...     "$schema": schema,
...     "description": "Description of this record without a title"
... }
>>> rec = Record.create(record, format_checker=checker)
Traceback (most recent call last):
  ...
ValidationError: 'title' is a required property
  ...

Signals

Invenio-Records provides several signals that can be used to react to events, for example to read or modify data before or after an operation.

Events are sent in case of:

  • record creation, before and after

  • record update, before and after

  • record deletion, before and after

  • record revert, before and after

Let’s modify the record before creation and verify, after creation, that the record has been correctly modified:

>>> from invenio_records.signals import (before_record_insert,
...                                      after_record_insert)
>>> def before_record_creation_add_flag(sender, *args, **kwargs):
...     record = kwargs['record']
...     record['created_with'] = 'Invenio'
...
>>> listener = before_record_insert.connect(before_record_creation_add_flag)
>>> def after_record_creation(sender, *args, **kwargs):
...     record = kwargs['record']
...     assert 'created_with' in record
...
>>> listener = after_record_insert.connect(after_record_creation)
>>> rec_events = Record.create({"title": "My new record"})
>>> db.session.commit()
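The reason a "before" receiver can change what gets stored is that it receives the same mutable record object that is written afterwards. The sketch below shows this with a tiny hand-written signal class; ToySignal, create and the receiver names are hypothetical and only imitate the connect/send pattern used above, not the signal implementation that Invenio-Records relies on.

```python
class ToySignal:
    """Hypothetical minimal signal: receivers are called in connection order."""

    def __init__(self):
        self.receivers = []

    def connect(self, receiver):
        self.receivers.append(receiver)
        return receiver

    def send(self, sender, **kwargs):
        for receiver in self.receivers:
            receiver(sender, **kwargs)


before_insert = ToySignal()
after_insert = ToySignal()
seen_after = []


def add_flag(sender, **kwargs):
    # Mutates the record in place, before it is stored.
    kwargs["record"]["created_with"] = "Invenio"


def check_flag(sender, **kwargs):
    # Runs after storage and observes the modification.
    seen_after.append("created_with" in kwargs["record"])


before_insert.connect(add_flag)
after_insert.connect(check_flag)


def create(data):
    record = dict(data)
    before_insert.send("toy-app", record=record)
    stored = dict(record)  # stands in for the database write
    after_insert.send("toy-app", record=record)
    return stored


created = create({"title": "My new record"})
```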

See API Docs for extensive API documentation.