PDF quality model

The quality model specifies the checks a record must pass before it can transition to pdf_prepared. The functionality for fixing errors is organized in the pdf-prep package endpoints.

As with linters such as pylint, it should be possible to disable selected checks. Failed checks are made transparent by adding the corresponding codes (e.g., author-not-in-pdf) to the notes field of colrev_masterdata_provenance.
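The following is a minimal sketch of how disabling checks and recording failure codes could fit together. It is not the CoLRev implementation: the record is a plain dict, and run_checks, add_notes, the "pdf_text" key, and the "notes" list are hypothetical names chosen for illustration. Only the code author-not-in-pdf is taken from the text above.

```python
from typing import Callable, Dict, Iterable, List

# A check receives a record (a plain dict in this sketch) and returns True
# when the defect is present.
Check = Callable[[dict], bool]


def run_checks(
    record: dict, checks: Dict[str, Check], disabled: Iterable[str] = ()
) -> List[str]:
    """Return the codes of all enabled checks that detect a defect."""
    disabled_codes = set(disabled)
    return [
        code
        for code, has_defect in checks.items()
        if code not in disabled_codes and has_defect(record)
    ]


def add_notes(record: dict, codes: Iterable[str]) -> None:
    """Append defect codes to a hypothetical 'notes' list, without duplicates."""
    notes = record.setdefault("notes", [])
    for code in codes:
        if code not in notes:
            notes.append(code)


# Hypothetical registry with a single, deliberately naive check.
checks = {
    "author-not-in-pdf": lambda r: r["author"].lower() not in r["pdf_text"].lower(),
}

record = {"author": "Hevner", "pdf_text": "Design science research ..."}
failed = run_checks(record, checks)
add_notes(record, failed)
```

Passing disabled=["author-not-in-pdf"] to run_checks skips that check, much like disabling a message code in a linter.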



No text in the PDF, need to apply OCR.
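A check of this kind could be sketched as follows, assuming the text layer has already been extracted into a string by some PDF library. The function name and the min_words threshold are assumptions made for illustration, not values taken from CoLRev.

```python
import re


def has_no_text(pdf_text: str, min_words: int = 20) -> bool:
    """Return True when the extracted text layer is (nearly) empty.

    min_words is an arbitrary threshold for this sketch. Counting only
    Latin-alphabet words is a simplification that would need to be relaxed
    for documents in other scripts.
    """
    words = re.findall(r"[A-Za-z]{2,}", pdf_text)
    return len(words) < min_words
```

A scanned PDF without a text layer typically yields an empty string or a few stray characters, so it would be flagged and could then be passed to OCR.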


PDF incomplete, i.e., the number of pages recorded does not match the number of pages in the PDF document.
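As an illustration, the comparison could look like the sketch below. It assumes the recorded pages are stored as a range such as "123--145" and that the number of pages in the PDF has been obtained elsewhere. Both function names are hypothetical.

```python
import re
from typing import Optional


def expected_page_count(pages: str) -> Optional[int]:
    """Derive the page count from a range like '123--145', '123-145' or '123–145'.

    Returns None when the field is not a plain numeric range (e.g. an
    article number), because no expectation can be derived in that case.
    """
    match = re.fullmatch(r"\s*(\d+)\s*(?:-{1,2}|–)\s*(\d+)\s*", pages)
    if match is None:
        return None
    first, last = int(match.group(1)), int(match.group(2))
    if last < first:
        return None
    return last - first + 1


def is_pdf_incomplete(pages: str, pdf_page_count: int) -> bool:
    """Flag the PDF when its page count differs from the recorded range."""
    expected = expected_page_count(pages)
    if expected is None:
        return False  # cannot decide without a usable page range
    return pdf_page_count != expected
```

This sketch applies a strict equality test. A PDF that contains an extra cover page or last page would also differ from the recorded range, so a real check may need to account for those cases.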


The author names are not in the PDF document.


The title is not in the PDF document.
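The author and title checks can both be sketched as a normalized substring search over the extracted PDF text. The example assumes that the author field uses the BibTeX convention "Last, First and Last, First" and that only last names are searched for. All function names are hypothetical, and the normalization is one possible choice rather than the one CoLRev uses.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Lowercase, strip accents, and drop everything except letters and digits."""
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "", ascii_only.lower())


def title_missing(title: str, pdf_text: str) -> bool:
    """Return True when the normalized title does not occur in the PDF text."""
    return normalize(title) not in normalize(pdf_text)


def author_missing(author_field: str, pdf_text: str) -> bool:
    """Return True when at least one author's last name is absent from the text."""
    text = normalize(pdf_text)
    last_names = [
        name.split(",")[0] for name in author_field.split(" and ") if name.strip()
    ]
    return any(normalize(last_name) not in text for last_name in last_names)
```

Removing whitespace and punctuation makes the comparison robust to line breaks and hyphenation in the extracted text. The cost is that short names can match inside unrelated words.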


A decorative cover page is included in the PDF.


A decorative last page is included in the PDF.