colrev.record.record_prep.PrepRecord
- class colrev.record.record_prep.PrepRecord(data)
Bases: Record
The PrepRecord class provides a range of functions for record preparation.
Methods
add_field_provenance
Add a field provenance, including source and note (based on a key)
add_field_provenance_note
Add a field provenance note (based on a key)
add_provenance_all
Add a data provenance (source) to all fields
align_provenance
Remove unnecessary provenance information and add missing provenance information
change_entrytype
Change the ENTRYTYPE
complete_provenance
Complete provenance information for indexing
copy_prep_rec
Copy the record object (as a PrepRecord)
defects
Get a list of defects for a field
format_author_field
Format the author field (recognizing first/last names based on HumanName parser)
format_bib_style
Simple formatter for bibliography-style output
format_if_mostly_upper
Format the field if it is mostly in upper case
get_citation_format
Get the record as a citation
get_colrev_id
Returns the colrev_id of the Record.
get_colrev_pdf_id
Generate the colrev_pdf_id
get_container_title
Get the record's container title (journal name, booktitle, etc.)
get_data
Get the record data
get_diff
Get diff between record objects
get_field_provenance
Get the provenance for a selected field (key)
get_field_provenance_notes
Get field provenance notes based on a key
get_field_provenance_source
Get the provenance source for a selected field (key)
get_record_change_score
Determine how much records changed
get_record_similarity
Determine the similarity between two records (their masterdata)
get_tei_filename
Get the TEI filename associated with the file (PDF)
get_toc_key
Get the record's toc-key
get_value
Get a record value (based on the key parameter)
has_fatal_quality_defects
Check whether a record has fatal quality defects
has_pdf_defects
Check whether the PDF has quality defects
has_quality_defects
Check whether a record (or specific field/key) has quality defects
ignore_defect
Ignore a defect for a field
ignored_defect
Check whether a defect is ignored for a field
is_retracted
Check for potential retractions
masterdata_is_curated
Check whether the record masterdata is curated
merge
General-purpose record merging for preparation, curated/non-curated records and records with origins
prescreen_exclude
Prescreen-exclude a record
print_citation_format
Print the record as a citation
remove_field
Remove a field
remove_field_provenance_note
Remove field provenance notes based on a key (also if IGNORE:note)
rename_field
Rename a field
require_prov
Ensure that provenance fields are available
reset_pdf_provenance_notes
Reset the PDF (file) provenance notes
run_pdf_quality_model
Run the PDF quality model
run_quality_model
Update the masterdata provenance
set_masterdata_complete
Set the masterdata to complete
set_masterdata_consistent
Set the masterdata to consistent
set_masterdata_curated
Set record masterdata to curated
set_status
Set the record status
unify_pages_field
Unify the format of the page field
update_by_record
Update all data of a record object based on another record
update_field
Update a record field (including provenance information)
Attributes
pp
data
Dictionary containing the record data
- add_field_provenance(*, key, source, note='')
Add a field provenance, including source and note (based on a key)
- Return type:
None
- add_field_provenance_note(*, key, note)
Add a field provenance note (based on a key)
- Return type:
None
- add_provenance_all(*, source)
Add a data provenance (source) to all fields
- Return type:
None
- align_provenance()
Remove unnecessary provenance information and add missing provenance information
- Return type:
None
- change_entrytype(new_entrytype)
Change the ENTRYTYPE
- Return type:
None
- complete_provenance(*, source_info)
Complete provenance information for indexing
- Return type:
bool
- copy_prep_rec()
Copy the record object (as a PrepRecord)
- Return type:
PrepRecord
- data
Dictionary containing the record data
- defects(key)
Get a list of defects for a field
- Return type:
List[str]
- classmethod format_author_field(input_string)
Format the author field (recognizing first/last names based on HumanName parser)
- Return type:
str
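The kind of normalization described above can be sketched as follows. This is a minimal illustration, assuming the target format is BibTeX-style "Last, First" names joined by " and ". The helper name to_last_first is hypothetical, and it uses a naive "last token is the family name" rule instead of the HumanName parser, so it is not the library's implementation.

```python
def to_last_first(authors: str) -> str:
    """Naively convert 'First Last and First Last' to 'Last, First and Last, First'.

    Hypothetical helper for illustration only; it does not use HumanName.
    """
    formatted = []
    for name in authors.split(" and "):
        name = " ".join(name.split())  # collapse repeated whitespace
        if not name:
            continue
        if "," in name:
            formatted.append(name)  # assume it is already "Last, First"
            continue
        parts = name.split(" ")
        if len(parts) == 1:
            formatted.append(parts[0])  # single token: keep as-is
        else:
            # Assumption: the last token is the family name
            formatted.append(f"{parts[-1]}, {' '.join(parts[:-1])}")
    return " and ".join(formatted)


print(to_last_first("Jane Q. Doe and Smith, John"))
```

Names with particles or suffixes (for example "van der Berg" or "Jr.") are where a dedicated name parser is needed; the naive rule above would misplace them.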
- format_bib_style()
Simple formatter for bibliography-style output
- Return type:
str
- format_if_mostly_upper(key, *, case='sentence')
Format the field if it is mostly in upper case
- Return type:
None
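A "mostly upper case" check could look like the sketch below. It is a standalone function on a string, not the method itself (which works on a record field and returns None). The name recase_if_mostly_upper, the threshold of 0.8, and the "title" option are assumptions made for illustration.

```python
def recase_if_mostly_upper(
    value: str, *, case: str = "sentence", threshold: float = 0.8
) -> str:
    """Return value re-cased if most of its letters are upper case.

    Hypothetical helper; the threshold is an assumed value.
    """
    letters = [c for c in value if c.isalpha()]
    if not letters:
        return value
    upper_ratio = sum(c.isupper() for c in letters) / len(letters)
    if upper_ratio < threshold:
        return value  # mixed case: assume it is already formatted
    if case == "title":
        return value.title()
    return value.capitalize()  # sentence case: first letter upper, rest lower


print(recase_if_mostly_upper("A REVIEW OF DIGITAL WORK"))
print(recase_if_mostly_upper("A Review of Digital Work"))
```

Simple re-casing like this lowercases acronyms and proper nouns, so results for titles usually need a manual check.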
- get_citation_format()
Get the record as a citation
- Return type:
str
- get_colrev_id(*, assume_complete=False)
Returns the colrev_id of the Record.
- Return type:
str
- classmethod get_colrev_pdf_id(pdf_path)
Generate the colrev_pdf_id
- Return type:
str
- get_container_title(*, na_string='NA')
Get the record’s container title (journal name, booktitle, etc.)
- Return type:
str
- get_data()
Get the record data
- Return type:
dict
- get_diff(other_record, *, identifying_fields_only=True)
Get diff between record objects
- Return type:
list
- get_field_provenance(*, key, default_source='ORIGINAL')
Get the provenance for a selected field (key)
- Return type:
dict
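The idea of per-field provenance with a default source can be sketched with a small stand-in class. ToyRecord and its internal layout (a dict with "source" and "note" per key) are assumptions for illustration; only the method names and keyword arguments mirror the signatures documented here.

```python
from typing import Dict


class ToyRecord:
    """Hypothetical stand-in for a record that tracks provenance per field."""

    def __init__(self, data: Dict[str, str]) -> None:
        self.data = data
        # Assumed layout: field key -> {"source": ..., "note": ...}
        self.provenance: Dict[str, Dict[str, str]] = {}

    def add_field_provenance(self, *, key: str, source: str, note: str = "") -> None:
        self.provenance[key] = {"source": source, "note": note}

    def get_field_provenance(
        self, *, key: str, default_source: str = "ORIGINAL"
    ) -> Dict[str, str]:
        # Fall back to the default source when nothing was recorded for the key
        return self.provenance.get(key, {"source": default_source, "note": ""})


record = ToyRecord({"title": "Digital work", "year": "2020"})
record.add_field_provenance(key="title", source="import.bib", note="")
print(record.get_field_provenance(key="title"))
print(record.get_field_provenance(key="year"))
```

The second lookup shows the fallback: no provenance was recorded for "year", so the default source is reported.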
- get_field_provenance_notes(key)
Get field provenance notes based on a key
- Return type:
list
- get_field_provenance_source(key)
Get the provenance source for a selected field (key)
- Return type:
str
- classmethod get_record_change_score(record_a, record_b)
Determine how much records changed
This method is less sensitive than get_record_similarity, especially when fields are missing. For example, if the journal field is missing in both records, get_record_similarity will return a value < 1.0, whereas get_record_change_score will return 0.0 (if all other fields are equal).
- Return type:
float
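The difference in sensitivity to missing fields can be illustrated with two toy measures. Both functions, the field list, and the scoring rules are assumptions for illustration and are not the library's algorithms: the toy similarity scores a missing field as 0, while the toy change score only looks at fields present in at least one record.

```python
from difflib import SequenceMatcher

FIELDS = ["author", "title", "journal", "year"]  # hypothetical field set


def field_ratio(a: str, b: str) -> float:
    """String similarity in [0, 1] based on difflib."""
    return SequenceMatcher(None, a, b).ratio()


def toy_similarity(rec_a: dict, rec_b: dict) -> float:
    """Average per-field similarity; a field missing in either record scores 0."""
    scores = [
        field_ratio(rec_a[f], rec_b[f]) if f in rec_a and f in rec_b else 0.0
        for f in FIELDS
    ]
    return sum(scores) / len(FIELDS)


def toy_change_score(rec_a: dict, rec_b: dict) -> float:
    """Average per-field difference over fields present in at least one record."""
    present = [f for f in FIELDS if f in rec_a or f in rec_b]
    if not present:
        return 0.0
    diffs = [1.0 - field_ratio(rec_a.get(f, ""), rec_b.get(f, "")) for f in present]
    return sum(diffs) / len(present)


# Two identical records, both without a journal field
rec_a = {"author": "Smith, J.", "title": "Digital work", "year": "2020"}
rec_b = dict(rec_a)

print(toy_similarity(rec_a, rec_b))    # 0.75: the missing journal lowers the score
print(toy_change_score(rec_a, rec_b))  # 0.0: nothing changed
```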
- classmethod get_record_similarity(record_a, record_b)
Determine the similarity between two records (their masterdata)
- Return type:
float
- get_tei_filename()
Get the TEI filename associated with the file (PDF)
- Return type:
Path
- get_toc_key()
Get the record’s toc-key
- Return type:
str
- get_value(key, *, default=None)
Get a record value (based on the key parameter)
- Return type:
str
- has_fatal_quality_defects()
Check whether a record has fatal quality defects
- Return type:
bool
- has_pdf_defects()
Check whether the PDF has quality defects
- Return type:
bool
- has_quality_defects(*, key='')
Check whether a record (or specific field/key) has quality defects
- Return type:
bool
- ignore_defect(*, key, defect)
Ignore a defect for a field
- Return type:
None
- ignored_defect(*, key, defect)
Check whether a defect is ignored for a field
- Return type:
bool
- is_retracted()
Check for potential retractions
- Return type:
bool
- masterdata_is_curated()
Check whether the record masterdata is curated
- Return type:
bool
- merge(merging_record, *, default_source, preferred_masterdata_source_prefixes=None)
General-purpose record merging for preparation, curated/non-curated records and records with origins
Applies quality heuristics to fuse the best fields of both records
- Return type:
None
- prescreen_exclude(*, reason, print_warning=False)
Prescreen-exclude a record
- Return type:
None
- print_citation_format()
Print the record as a citation
- Return type:
None
- remove_field(*, key, not_missing_note=False, source='')
Remove a field
- Return type:
None
- remove_field_provenance_note(*, key, note)
Remove field provenance notes based on a key (also if IGNORE:note)
- Return type:
None
- rename_field(*, key, new_key)
Rename a field
- Return type:
None
- require_prov()
Ensure that provenance fields are available
- Return type:
None
- reset_pdf_provenance_notes()
Reset the PDF (file) provenance notes
- Return type:
None
- run_pdf_quality_model(pdf_qm, *, set_prepared=False)
Run the PDF quality model
- Return type:
None
- run_quality_model(quality_model, *, set_prepared=False)
Update the masterdata provenance
- Return type:
None
- set_masterdata_complete(*, source, masterdata_repository, replace_source=True)
Set the masterdata to complete
- Return type:
None
- set_masterdata_consistent()
Set the masterdata to consistent
- Return type:
None
- set_masterdata_curated(source)
Set record masterdata to curated
- Return type:
None
- set_status(target_state, *, force=False)
Set the record status
- Return type:
None
- update_by_record(update_record)
Update all data of a record object based on another record
- Return type:
None
- update_field(*, key, value, source, note='', keep_source_if_equal=True, append_edit=True)
Update a record field (including provenance information)
- Return type:
None