Package development

CoLRev packages are Python packages that extend CoLRev by relying on its shared data structure, standard process, and common interfaces. Packages can support specific endpoints (e.g., search_source, prescreen, pdf_get) or provide complementary functionality (e.g., for ad-hoc data exploration and visualization).

The following guide explains how to develop built-in packages, i.e., packages that reside in the packages directory. Built-in packages should also be registered as a dependency in the pyproject.toml.

Overview package development

Init

To create a new CoLRev package, the following command sets up the necessary directories, files, and code skeleton:

colrev package --init

To check the package structure and metadata, use the following command:

colrev package --check

Install and use

To install a CoLRev package, you can use the following command (pip install <package_name> is also possible):

colrev install <package_name>

Once installed, packages that implement endpoints can be used in the standard process by registering the package's endpoint in the settings.json of a project (e.g., by running colrev search --add <package_name>).

Creating a new CoLRev package

To create a new CoLRev package, the following command sets up the necessary directories, files, and code skeleton:

colrev package --init

Develop, test, document and check

The init command should set up the package structure and metadata. The following sections provide more details on how to develop, test, document, and check the package.

It is recommended to run the following check regularly:

colrev package --check

Package structure

A package contains the following files and directories:

├── pyproject.toml
├── README.md
├── src
│   ├── __init__.py
│   ├── package_functionality.py
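
The layout above can be verified with a few lines of Python. This is a minimal sketch, not the implementation behind colrev package --check: the helper name missing_files is hypothetical, and the list of expected files is simply copied from the tree.

```python
from pathlib import Path

# Expected layout, copied from the directory tree above.
EXPECTED_FILES = (
    "pyproject.toml",
    "README.md",
    "src/__init__.py",
    "src/package_functionality.py",
)


def missing_files(package_dir: Path) -> list[str]:
    """Return the expected files that do not exist under package_dir.

    Hypothetical helper for illustration; it is not part of CoLRev.
    """
    return [
        name for name in EXPECTED_FILES if not (package_dir / name).is_file()
    ]
```

Calling missing_files(Path("path/to/package")) returns an empty list when all four files are present.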

The package metadata is stored in the pyproject.toml file. CoLRev uses the metadata to identify the package and its dependencies. The metadata should include the following fields:

[tool.poetry]
name = "colrev.abi_inform_proquest"
description = "CoLRev package for abi_inform_proquest"
version = "0.1.0"
authors = ["Gerit Wagner <gerit.wagner@uni-bamberg.de>"]
license = "MIT"
repository = "https://github.com/CoLRev-Environment/colrev/blob/main/colrev/packages/abi_inform_proquest"

[tool.colrev]
colrev_doc_description = "Package for abi_inform_proquest"
colrev_doc_link = "README.md"
search_types = ["API", "TOC", "MD"]

[tool.poetry.plugins.colrev]
search_source = "colrev.packages.abi_inform_proquest.src.package_functionality:ABIInformProQuestSearchSource"

Endpoints are specified in the tool.poetry.plugins.colrev section. Each endpoint reference is a string that contains the module path and the class name of the endpoint, separated by a colon. The module path is relative to the package directory.

Develop

Package development is done in the src directory. The package should implement the respective endpoint interface.

Best practices

  • Remember to install CoLRev in editable mode, so that changes are immediately available (run pip install -e /path/to/cloned/colrev)

  • Check the other package implementations to get an idea of how to proceed

  • Use the colrev constants

  • Get paths from review_manager

  • Use the logger and colrev_report_logger to help users examine and validate the process, including links to the docs where instructions for tracing and fixing errors are available.

  • Before committing, run the pre-commit checks

  • Use poetry for dependency management (run poetry add <package_name> to add a new dependency)

  • Once the package development is completed, open a pull request against CoLRev with a brief description of the package.

  • add_endpoint is required only for SearchSources. It is optional for other endpoint types.

Endpoints allow packages to implement functionality that can be called in the standard process if users register the endpoint in the settings.json of a project.

To implement an endpoint, the tool.poetry.plugins.colrev section of pyproject.toml must reference the endpoint class that implements the respective interface. The reference is a string that contains the module path and the class name of the endpoint. The module path is relative to the package directory.
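
The structure of such a reference string can be sketched with importlib.metadata from the standard library (Python 3.9+ for the module and attr attributes). The helper split_reference is hypothetical, and the group name "colrev" is read off the tool.poetry.plugins.colrev section name. The sketch only splits the string; it does not import the module, so it runs without the package being installed.

```python
from importlib.metadata import EntryPoint


def split_reference(endpoint: str, reference: str) -> tuple[str, str]:
    """Split "module.path:ClassName" into (module path, class name).

    Hypothetical helper for illustration; it is not part of CoLRev.
    """
    entry_point = EntryPoint(name=endpoint, value=reference, group="colrev")
    # .module is the part before the colon, .attr the part after it.
    return entry_point.module, entry_point.attr


module_path, class_name = split_reference(
    "search_source",
    "colrev.packages.abi_inform_proquest.src.package_functionality"
    ":ABIInformProQuestSearchSource",
)
print(module_path)  # colrev.packages.abi_inform_proquest.src.package_functionality
print(class_name)   # ABIInformProQuestSearchSource
```

Calling load() on the EntryPoint object would import the module and return the class, which requires the package to be installed in the current environment.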

The following endpoint - interface pairs are available:

Endpoint        Interface
--------        ---------
review_type     ReviewTypeInterface
search_source   SearchSourceInterface
prep            PrepInterface
prep_man        PrepManInterface
dedupe          DedupeInterface
prescreen       PrescreenInterface
pdf_get         PDFGetInterface
pdf_get_man     PDFGetManInterface
pdf_prep        PDFPrepInterface
pdf_prep_man    PDFPrepManInterface
screen          ScreenInterface
data            DataInterface
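
For tooling around packages (e.g., a script that validates endpoint names before editing pyproject.toml), the pairs listed above can be kept in a plain lookup table. This is a sketch only: the interface names are stored as strings copied from the list, nothing is imported from CoLRev, and the helper interface_for is hypothetical.

```python
# Endpoint names and interface names, copied from the list above.
ENDPOINT_INTERFACES = {
    "review_type": "ReviewTypeInterface",
    "search_source": "SearchSourceInterface",
    "prep": "PrepInterface",
    "prep_man": "PrepManInterface",
    "dedupe": "DedupeInterface",
    "prescreen": "PrescreenInterface",
    "pdf_get": "PDFGetInterface",
    "pdf_get_man": "PDFGetManInterface",
    "pdf_prep": "PDFPrepInterface",
    "pdf_prep_man": "PDFPrepManInterface",
    "screen": "ScreenInterface",
    "data": "DataInterface",
}


def interface_for(endpoint: str) -> str:
    """Return the interface name for an endpoint (hypothetical helper)."""
    try:
        return ENDPOINT_INTERFACES[endpoint]
    except KeyError:
        known = ", ".join(sorted(ENDPOINT_INTERFACES))
        raise ValueError(
            f"Unknown endpoint {endpoint!r}; expected one of: {known}"
        ) from None
```

An unknown name such as "pdf-get" (with a hyphen instead of an underscore) raises a ValueError that lists the valid endpoint names.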

Testing

  • Tests for built-in packages are currently located in the tests directory of the CoLRev repository.

  • See tests/README.md for details.

Document

  • Link the documentation (README.md) in the pyproject.toml.

  • See docs/README.md for details on building the CoLRev docs.

  • CLI demonstrations can be recorded with asciinema.

Publish

  • Standalone CoLRev packages are published on PyPI.

  • Built-in packages are not published separately. They are automatically provided with every PyPI-release of CoLRev.

Register

To have a package registered as an official CoLRev package, create a pull request that adds it to packages.json.

To integrate the package documentation into the official CoLRev documentation, the CoLRev team

Package development resources