colrev.record.record_prep.PrepRecord

class colrev.record.record_prep.PrepRecord(data)

Bases: Record

The PrepRecord class provides a range of functions for record preparation.

Initialize the instance.

Methods

add_field_provenance

Add a field provenance, including source and note (based on a key).

add_field_provenance_note

Add a field provenance note (based on a key).

add_provenance_all

Add a data provenance (source) to all fields.

align_provenance

Remove unnecessary provenance information and add missing provenance information.

change_entrytype

Change the ENTRYTYPE.

complete_provenance

Complete provenance information for indexing.

copy_prep_rec

Copy the record object (as a PrepRecord).

defects

Get a list of defects for a field.

format_author_field

Format the author field (recognizing first/last names based on HumanName parser).

format_bib_style

Simple formatter for bibliography-style output.

format_if_mostly_upper

Format the field if it is mostly in upper case.

get_citation_format

Get the record as a citation.

get_colrev_id

Return the colrev_id of the record.

get_colrev_pdf_id

Generate the colrev_pdf_id.

get_container_title

Get the record's container title (journal name, booktitle, etc.).

get_data

Get the record data.

get_diff

Get diff between record objects.

get_field_provenance

Get the provenance for a selected field (key).

get_field_provenance_notes

Get field provenance notes based on a key.

get_field_provenance_source

Get the provenance source for a selected field (key).

get_record_change_score

Determine how much records changed.

get_record_similarity

Determine the similarity between two records (their masterdata).

get_tei_filename

Get the TEI filename associated with the file (PDF).

get_toc_key

Get the record's toc-key.

get_value

Get a record value (based on the key parameter).

has_fatal_quality_defects

Check whether a record has fatal quality defects.

has_pdf_defects

Check whether the PDF has quality defects.

has_quality_defects

Check whether a record (or specific field/key) has quality defects.

ignore_defect

Ignore a defect for a field.

ignored_defect

Check whether a defect is ignored for a field.

is_retracted

Check for potential retractions.

masterdata_is_curated

Check whether the record masterdata is curated.

merge

General-purpose record merging for preparation, curated/non-curated records and records with origins.

prescreen_exclude

Prescreen-exclude a record.

print_citation_format

Print the record as a citation.

remove_field

Remove a field.

remove_field_provenance_note

Remove field provenance notes based on a key (also if IGNORE:note).

rename_field

Rename a field.

require_prov

Ensure that provenance fields are available.

reset_pdf_provenance_notes

Reset the PDF (file) provenance notes.

run_pdf_quality_model

Run the PDF quality model.

run_quality_model

Update the masterdata provenance.

set_masterdata_complete

Set the masterdata to complete.

set_masterdata_consistent

Set the masterdata to consistent.

set_masterdata_curated

Set record masterdata to curated.

set_status

Set the record status.

unify_pages_field

Unify the format of the page field.

update_by_record

Update all data of a record object based on another record.

update_field

Update a record field (including provenance information).

Attributes

pp

data

Dictionary containing the record data

add_field_provenance(*, key, source, note='')

Add a field provenance, including source and note (based on a key).

Return type:

None
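
Field provenance can be pictured as a per-field mapping from a key to a source and a note. The following is a minimal, self-contained sketch of that idea, not colrev's implementation: the `provenance` dictionary layout, the overwrite behaviour, the sample source strings, and the helper name `add_field_provenance_sketch` are assumptions made for illustration. Only the keyword-only `key`, `source`, and `note` parameters mirror the signature above.

```python
from typing import Dict


def add_field_provenance_sketch(
    provenance: Dict[str, Dict[str, str]],
    *,
    key: str,
    source: str,
    note: str = "",
) -> None:
    """Record where the value of `key` came from (hypothetical stand-in)."""
    # Assumption: a later call for the same key replaces the earlier entry.
    provenance[key] = {"source": source, "note": note}


# Illustrative data; the source strings are made up for this example.
provenance: Dict[str, Dict[str, str]] = {}
add_field_provenance_sketch(provenance, key="title", source="crossref.org")
add_field_provenance_sketch(provenance, key="year", source="manual", note="verified")
```

As in the real signature, the arguments must be passed by keyword; a positional call such as `add_field_provenance_sketch(provenance, "title", "manual")` raises a `TypeError`.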

add_field_provenance_note(*, key, note)

Add a field provenance note (based on a key).

Return type:

None

add_provenance_all(*, source)

Add a data provenance (source) to all fields.

Return type:

None

align_provenance()

Remove unnecessary provenance information and add missing provenance information.

Return type:

None

change_entrytype(new_entrytype)

Change the ENTRYTYPE.

Return type:

None

complete_provenance(*, source_info)

Complete provenance information for indexing.

Return type:

bool

copy_prep_rec()

Copy the record object (as a PrepRecord).

Return type:

PrepRecord

data

Dictionary containing the record data

defects(key)

Get a list of defects for a field.

Return type:

List[str]

classmethod format_author_field(input_string)

Format the author field (recognizing first/last names based on HumanName parser).

Return type:

str

format_bib_style()

Simple formatter for bibliography-style output.

Return type:

str

format_if_mostly_upper(key, *, case='sentence')

Format the field if it is mostly in upper case.

Return type:

None
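
The "mostly upper case" heuristic can be sketched as follows. This is an illustration under stated assumptions, not the colrev code: the function name, the 0.8 threshold, and the `'title'` option are hypothetical, and the sketch returns the formatted string, whereas the method above updates the record in place and returns None.

```python
def format_if_mostly_upper_sketch(
    value: str, *, case: str = "sentence", threshold: float = 0.8
) -> str:
    """Re-case `value` only if most of its letters are upper case."""
    letters = [char for char in value if char.isalpha()]
    if not letters:
        return value  # nothing to judge, leave the value untouched

    upper_share = sum(char.isupper() for char in letters) / len(letters)
    if upper_share < threshold:  # assumed cut-off for "mostly upper case"
        return value

    if case == "title":  # hypothetical alternative to the 'sentence' default
        return value.title()
    # Sentence case: first character upper case, everything else lower case.
    return value.capitalize()


shouting = format_if_mostly_upper_sketch("A REVIEW OF REVIEWS")
unchanged = format_if_mostly_upper_sketch("Deep Learning for IS")
```

Note that plain sentence casing also lowers acronyms and proper nouns, so a production formatter would likely need additional handling for them.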

get_citation_format()

Get the record as a citation.

Return type:

str

get_colrev_id(*, assume_complete=False)

Return the colrev_id of the record.

Return type:

str

classmethod get_colrev_pdf_id(pdf_path)

Generate the colrev_pdf_id.

Return type:

str

get_container_title(*, na_string='NA')

Get the record’s container title (journal name, booktitle, etc.).

Return type:

str

get_data()

Get the record data.

Return type:

dict

get_diff(other_record, *, identifying_fields_only=True)

Get diff between record objects.

Return type:

list

get_field_provenance(*, key, default_source='ORIGINAL')

Get the provenance for a selected field (key).

Return type:

dict

get_field_provenance_notes(key)

Get field provenance notes based on a key.

Return type:

list

get_field_provenance_source(key)

Get the provenance source for a selected field (key).

Return type:

str

classmethod get_record_change_score(record_a, record_b)

Determine how much records changed.

This method is less sensitive than get_record_similarity, especially when fields are missing. For example, if the journal field is missing in both records, get_record_similarity will not return a perfect score, whereas get_record_change_score will return 0.0 (if all other fields are equal).

Return type:

float
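
The difference between a similarity measure and a change score when fields are missing can be illustrated with a small sketch. It is built on `difflib` from the standard library and is not colrev's algorithm: the field set, both function names, and the scoring rules are assumptions chosen to make the contrast visible, so the concrete numbers will differ from what the real methods return.

```python
from difflib import SequenceMatcher
from typing import Dict

FIELDS = ("author", "title", "journal", "year")  # illustrative field set


def similarity_sketch(rec_a: Dict[str, str], rec_b: Dict[str, str]) -> float:
    """Average per-field similarity; a field missing on either side scores 0."""
    scores = []
    for field in FIELDS:
        val_a, val_b = rec_a.get(field, ""), rec_b.get(field, "")
        if not val_a or not val_b:
            scores.append(0.0)  # missing information counts against the match
        else:
            scores.append(SequenceMatcher(None, val_a, val_b).ratio())
    return sum(scores) / len(FIELDS)


def change_score_sketch(rec_a: Dict[str, str], rec_b: Dict[str, str]) -> float:
    """Average per-field change; equal fields (also both missing) add 0."""
    changes = []
    for field in FIELDS:
        val_a, val_b = rec_a.get(field, ""), rec_b.get(field, "")
        if val_a == val_b:
            changes.append(0.0)
        else:
            changes.append(1.0 - SequenceMatcher(None, val_a, val_b).ratio())
    return sum(changes) / len(FIELDS)


# Two identical records that both lack the journal field.
rec_a = {"author": "Smith, J.", "title": "A study", "year": "2020"}
rec_b = dict(rec_a)

similarity = similarity_sketch(rec_a, rec_b)
change = change_score_sketch(rec_a, rec_b)
```

In this sketch the shared gap in the journal field keeps the similarity away from a perfect score, while the change score stays at 0.0 because nothing differs between the two records.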

classmethod get_record_similarity(record_a, record_b)

Determine the similarity between two records (their masterdata).

Return type:

float

get_tei_filename()

Get the TEI filename associated with the file (PDF).

Return type:

Path

get_toc_key()

Get the record’s toc-key.

Return type:

str

get_value(key, *, default=None)

Get a record value (based on the key parameter).

Return type:

str

has_fatal_quality_defects()

Check whether a record has fatal quality defects.

Return type:

bool

has_pdf_defects()

Check whether the PDF has quality defects.

Return type:

bool

has_quality_defects(*, key='')

Check whether a record (or specific field/key) has quality defects.

Return type:

bool

ignore_defect(*, key, defect)

Ignore a defect for a field.

Return type:

None

ignored_defect(*, key, defect)

Check whether a defect is ignored for a field.

Return type:

bool
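
How ignore_defect and ignored_defect relate can be sketched with a plain dictionary of provenance notes. The `IGNORE:` prefix is taken from the description of remove_field_provenance_note; everything else (the function names, the `notes` layout, and the sample defect code) is a hypothetical stand-in rather than colrev's implementation.

```python
from typing import Dict, List

IGNORE_PREFIX = "IGNORE:"  # prefix named in remove_field_provenance_note


def ignore_defect_sketch(
    notes: Dict[str, List[str]], *, key: str, defect: str
) -> None:
    """Mark `defect` as ignored for the field `key` (assumed note layout)."""
    field_notes = notes.setdefault(key, [])
    if defect in field_notes:
        field_notes.remove(defect)  # the defect is no longer reported as open
    ignored_note = IGNORE_PREFIX + defect
    if ignored_note not in field_notes:
        field_notes.append(ignored_note)


def ignored_defect_sketch(
    notes: Dict[str, List[str]], *, key: str, defect: str
) -> bool:
    """Return True if `defect` was marked as ignored for the field `key`."""
    return IGNORE_PREFIX + defect in notes.get(key, [])


# Illustrative note data; the defect code is only an example value.
notes: Dict[str, List[str]] = {"title": ["mostly-all-caps"]}
ignore_defect_sketch(notes, key="title", defect="mostly-all-caps")
is_ignored = ignored_defect_sketch(notes, key="title", defect="mostly-all-caps")
```

Keeping the ignored defect as a prefixed note, instead of deleting it, preserves the decision so that a later quality check does not raise the same defect again.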

is_retracted()

Check for potential retractions.

Return type:

bool

masterdata_is_curated()

Check whether the record masterdata is curated.

Return type:

bool

merge(merging_record, *, default_source, preferred_masterdata_source_prefixes=None)

General-purpose record merging for preparation, curated/non-curated records and records with origins.

Apply quality heuristics to fuse the best fields of both records.

Return type:

None

prescreen_exclude(*, reason, print_warning=False)

Prescreen-exclude a record.

Return type:

None

print_citation_format()

Print the record as a citation.

Return type:

None

remove_field(*, key, not_missing_note=False, source='')

Remove a field.

Return type:

None

remove_field_provenance_note(*, key, note)

Remove field provenance notes based on a key (also if IGNORE:note).

Return type:

None

rename_field(*, key, new_key)

Rename a field.

Return type:

None

require_prov()

Ensure that provenance fields are available.

Return type:

None

reset_pdf_provenance_notes()

Reset the PDF (file) provenance notes.

Return type:

None

run_pdf_quality_model(pdf_qm, *, set_prepared=False)

Run the PDF quality model.

Return type:

None

run_quality_model(quality_model, *, set_prepared=False)

Update the masterdata provenance.

Return type:

None

set_masterdata_complete(*, source, masterdata_repository, replace_source=True)

Set the masterdata to complete.

Return type:

None

set_masterdata_consistent()

Set the masterdata to consistent.

Return type:

None

set_masterdata_curated(source)

Set record masterdata to curated.

Return type:

None

set_status(target_state, *, force=False)

Set the record status.

Return type:

None

unify_pages_field()

Unify the format of the page field.

Return type:

None
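
One way to unify a pages value is to normalize every dash variant between page numbers to a single separator. The sketch below assumes the BibTeX-style double hyphen (`--`) as the target format; that choice, the function name, and the regular expression are assumptions for illustration and may not match what unify_pages_field actually does.

```python
import re

# Any run of hyphens, en dashes, or em dashes, with optional surrounding spaces.
_DASH_RUN = re.compile(r"\s*[-\u2013\u2014]+\s*")


def unify_pages_sketch(pages: str) -> str:
    """Normalize page-range separators to a double hyphen (assumed target)."""
    return _DASH_RUN.sub("--", pages.strip())


examples = {
    raw: unify_pages_sketch(raw)
    for raw in ("12-34", "12 \u2013 34", "12--34", "e1234")
}
```

Values without a range separator, such as article numbers, pass through unchanged.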

update_by_record(update_record)

Update all data of a record object based on another record.

Return type:

None

update_field(*, key, value, source, note='', keep_source_if_equal=True, append_edit=True)

Update a record field (including provenance information).

Return type:

None