PDF quality model

The quality model specifies the checks that a record must pass before it can transition to pdf_prepared. The functionality for fixing errors is organized in the pdf-prep package endpoints.

Similar to linters such as pylint, it should be possible to disable selected checks. Failed checks are made transparent by adding the corresponding codes (e.g., author-not-in-pdf) to the colrev_masterdata_provenance (notes field).
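The mechanism can be sketched as a small registry of checks keyed by their codes. This is a minimal illustration and not the CoLRev implementation: the names `CHECKS` and `run_checks`, the check signature, and the layout of the provenance entry (a `file` key with a `note` string) are assumptions made for the example, and the two checks are deliberately naive.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical check signature: takes a record dict and the extracted PDF text,
# and returns True when the defect is present.
Check = Callable[[dict, str], bool]


def no_text_in_pdf(record: dict, pdf_text: str) -> bool:
    # Defect if the extracted text is empty or whitespace only.
    return pdf_text.strip() == ""


def title_not_in_pdf(record: dict, pdf_text: str) -> bool:
    # Naive case-insensitive containment test, for illustration only.
    return record.get("title", "").lower() not in pdf_text.lower()


# Registry mapping defect codes to checks (assumed structure).
CHECKS: Dict[str, Check] = {
    "no-text-in-pdf": no_text_in_pdf,
    "title-not-in-pdf": title_not_in_pdf,
}


def run_checks(record: dict, pdf_text: str, disabled: Iterable[str] = ()) -> List[str]:
    """Run all enabled checks and store the failed codes in the provenance note."""
    skip = set(disabled)
    failed = [
        code
        for code, check in CHECKS.items()
        if code not in skip and check(record, pdf_text)
    ]
    # Assumed layout: provenance -> "file" -> "note" holds a comma-separated list.
    provenance = record.setdefault("colrev_masterdata_provenance", {})
    provenance.setdefault("file", {})["note"] = ",".join(failed)
    return failed
```

Passing `disabled=["title-not-in-pdf"]` skips that check in the same way a linter would skip a disabled rule, so its code is never written to the note.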

no-text-in-pdf

The PDF contains no extractable text; OCR needs to be applied.

pdf-incomplete

The PDF is incomplete, i.e., the number of pages recorded does not match the number of pages in the PDF document.
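A rough sketch of such a comparison follows. It assumes that the recorded pages are given as a range string such as `101--120` and that the number of pages in the PDF has already been determined elsewhere; the function names are hypothetical and the actual check may work differently.

```python
import re
from typing import Optional


def expected_page_count(pages: str) -> Optional[int]:
    """Return the page count implied by a range such as '101--120', else None."""
    # Accept "-", "--", or an en dash between the two page numbers.
    match = re.fullmatch(r"\s*(\d+)\s*(?:--?|\u2013)\s*(\d+)\s*", pages)
    if match is None:
        return None
    first, last = int(match.group(1)), int(match.group(2))
    if last < first:
        return None
    return last - first + 1


def pdf_incomplete(pages: str, pdf_page_count: int) -> bool:
    """Flag the PDF when its page count differs from the recorded range."""
    expected = expected_page_count(pages)
    if expected is None:
        # The pages field is not a plain range, so no decision is possible.
        return False
    return pdf_page_count != expected
```

A strict equality test like this would also flag a PDF that carries an extra cover page or last page, so in practice the comparison would need to account for those cases.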

author-not-in-pdf

The author names are not in the PDF document.

title-not-in-pdf

The title is not in the PDF document.
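Both this check and author-not-in-pdf come down to looking for metadata strings in the extracted PDF text. The sketch below shows one way to make that comparison tolerant of case, accents, punctuation, and line breaks. It is an illustration under assumptions: the helper names are hypothetical, authors are assumed to be formatted as `Last, First and Last, First`, and the author check is assumed to require every surname to appear.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Lowercase, strip accents, and reduce runs of other characters to one space."""
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", " ", ascii_only.lower()).strip()


def title_not_in_pdf(title: str, pdf_text: str) -> bool:
    """Flag when the normalized title is not a substring of the normalized text."""
    return normalize(title) not in normalize(pdf_text)


def author_not_in_pdf(author: str, pdf_text: str) -> bool:
    """Flag when at least one surname is missing from the text (assumed rule)."""
    text_words = set(normalize(pdf_text).split())
    surnames = [normalize(name.split(",")[0]) for name in author.split(" and ")]
    return any(
        not all(word in text_words for word in surname.split())
        for surname in surnames
        if surname
    )
```

Words hyphenated across line breaks in the PDF would still defeat this simple matching, so a production check would likely need a fuzzier comparison.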

coverpage-included

A decorative cover page is included in the PDF.

last-page-included

A decorative last page is included in the PDF.