Metadata quality model¶
The quality model specifies the checks that a record must pass before it can transition to md_prepared. The functionality for fixing errors is organized in the prep package endpoints.
Similar to linters such as pylint, it should be possible to disable selected checks. Failed checks are made transparent by adding the corresponding codes (e.g., mostly-all-caps) to the colrev_masterdata_provenance (notes field).
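The interplay between disabled checks and provenance notes could look roughly like the following sketch. The function name `record_failed_checks`, the `disabled` parameter, and the dictionary layout of the provenance entry are assumptions made for illustration; they are not taken from the CoLRev API.

```python
def record_failed_checks(provenance, field, failed_codes, disabled=frozenset()):
    """Add the codes of failed checks to the note of a field, skipping disabled checks."""
    # Hypothetical layout: one entry per field with a comma-separated note.
    entry = provenance.setdefault(field, {"source": "", "note": ""})
    notes = [note for note in entry["note"].split(",") if note]
    for code in failed_codes:
        if code not in disabled and code not in notes:
            notes.append(code)
    entry["note"] = ",".join(notes)
    return provenance


provenance = record_failed_checks(
    {},
    "title",
    ["mostly-all-caps", "incomplete-field"],
    disabled={"incomplete-field"},
)
```

In this sketch, a disabled check is simply never written to the note, so the record is not flagged for that code.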
Format¶
mostly-all-caps¶
Fields should not contain mostly upper case letters.
Problematic value
title = {AN EMPIRICAL STUDY OF PLATFORM EXIT}
Correct value
title = {An empirical study of platform exit}
Fields checked |
---|
author |
title |
editor |
journal |
booktitle |
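A check of this kind can be sketched by measuring the share of upper-case letters in a value. The function name and the threshold of 0.8 are assumptions for illustration; the actual cut-off is not stated here.

```python
def is_mostly_all_caps(value, threshold=0.8):
    """Return True if the share of upper-case letters exceeds the assumed threshold."""
    letters = [char for char in value if char.isalpha()]
    if not letters:
        return False
    upper_share = sum(char.isupper() for char in letters) / len(letters)
    return upper_share > threshold
```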
name-format-titles¶
Names should not contain titles, such as “MD”, “Dr”, “PhD”, “Prof”, or “Dipl Ing”.
Problematic value
@phdthesis{Smith2022,
...
author = {Prof. Smith, M. PhD.},
...
}
Correct value
@phdthesis{Smith2022,
...
author = {Smith, M.},
...
}
Fields checked |
---|
author |
editor |
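As a minimal sketch, the titles listed above could be detected with a regular expression. The names `TITLES` and `contains_title` are hypothetical, and matching whole words case-sensitively is an assumption.

```python
import re

# Titles named in the description of this check.
TITLES = ["MD", "Dr", "PhD", "Prof", "Dipl Ing"]
TITLE_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(title) for title in TITLES) + r")\b"
)


def contains_title(names):
    """Return True if the name field contains one of the listed titles as a whole word."""
    return TITLE_PATTERN.search(names) is not None
```

Matching on word boundaries avoids flagging names that merely start with the same letters, such as "Drake".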
name-format-separators¶
Names should be correctly separated.
Problematic value
author = {Smith, W.; Thompson, U.}
Correct value
author = {Smith, W. and Thompson, U.}
Author names are separated by “ and ”.
Each name must contain at least two capital letters and should consist of letters only.
Name parts should be separated by “,”.
Each name must be longer than 5 characters.
Fields checked |
---|
author |
editor |
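The rules for this check could be combined as in the following sketch. The function name is hypothetical, and the set of punctuation accepted alongside letters is an assumption, because the rules do not spell it out.

```python
# Characters accepted in addition to letters (assumption).
ALLOWED_PUNCTUATION = " ,.-'{}"


def has_valid_name_separators(names):
    """Check each name in a field whose names are joined by ' and '."""
    for name in names.split(" and "):
        if len(name) <= 5:
            return False  # must be longer than 5 characters
        if "," not in name:
            return False  # name parts are separated by a comma
        if sum(char.isupper() for char in name) < 2:
            return False  # at least two capital letters
        if not all(char.isalpha() or char in ALLOWED_PUNCTUATION for char in name):
            return False  # e.g., a semicolon used as separator
    return True
```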
name-particles¶
Name particles should be formatted correctly and protected.
Problematic value
author = {Brocke, Jan vom}
Correct value
author = {{vom Brocke}, Jan}
Fields checked |
---|
author |
editor |
year-format¶
The year field should contain only the full year.
Problematic value
year = {2023-01-03}
Correct value
year = {2023}
Fields checked |
---|
year |
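A simple way to sketch this check is a regular expression that accepts exactly four digits. The function name is hypothetical, and treating "full year" as four digits is an assumption based on the example.

```python
import re


def is_full_year(value):
    """Return True if the value consists of exactly four ASCII digits."""
    return re.fullmatch(r"[0-9]{4}", value) is not None
```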
doi-not-matching-pattern¶
The doi field should follow a predefined pattern. A valid doi does not start with http… and is in upper case.
Problematic value
doi = {https://doi.org/10.1016/j.jsis. 2021.101694}
Correct value
doi = {10.1016/j.jsis.2021.101694}
Fields checked |
---|
doi |
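The following sketch shows how such a pattern check might work. The regular expression is an assumption (prefix "10.", a registrant code of four to nine digits, a slash, and a suffix without whitespace), not the pattern used by the tool. Letter case is not checked in this sketch.

```python
import re

# Assumed pattern, for illustration only.
DOI_PATTERN = re.compile(r"10\.[0-9]{4,9}/\S+")


def doi_matches_pattern(doi):
    """Return True if the whole value matches the assumed DOI pattern."""
    return DOI_PATTERN.fullmatch(doi) is not None
```

Because the whole value must match, a leading resolver URL or an embedded space makes the check fail.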
isbn-not-matching-pattern¶
ISBN should be valid.
Problematic value
isbn = {978316}
Correct value
isbn = {978-3-16-148410-0}
TODO : ISBN-10/ISBN13, how multiple ISBNs are stored
Fields checked |
---|
isbn |
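One possible validity test is the standard ISBN-13 check-digit rule, sketched below. The function name is hypothetical, ISBN-10 and fields with multiple ISBNs are not covered, and whether the tool validates the check digit at all is an assumption.

```python
def is_valid_isbn13(value):
    """Validate an ISBN-13 by length, digits, and the weighted checksum."""
    digits = value.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    # Weights alternate between 1 and 3; the total must be divisible by 10.
    total = sum(
        int(digit) * (1 if index % 2 == 0 else 3)
        for index, digit in enumerate(digits)
    )
    return total % 10 == 0
```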
pubmedid-not-matching-pattern¶
PubMed IDs should be formatted correctly (7 or 8 digits).
Problematic value
colrev.pubmed.pubmedid = {PMID: 1498274774},
Correct value
colrev.pubmed.pubmedid = {33044175},
Fields checked |
---|
colrev.pubmed.pubmedid |
[PMID specification](https://www.nlm.nih.gov/bsd/mms/medlineelements.html#pmid)
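The 7-or-8-digit rule can be sketched with a single regular expression. The function name is hypothetical.

```python
import re


def pubmedid_matches_pattern(value):
    """Return True if the value consists of exactly 7 or 8 ASCII digits."""
    return re.fullmatch(r"[0-9]{7,8}", value) is not None
```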
language-format-error¶
The ISO 639-3 language code should be valid.
Problematic value
language = {en}
Correct value
language = {eng}
Fields checked |
---|
language |
See language_service.
language-unknown¶
Records should contain an ISO 639-3 language code.
Problematic value
language = {American English}
Correct value
language = {eng}
Fields checked |
---|
language |
See language_service.
Completeness¶
missing-field¶
Records should contain all required fields for the respective ENTRYTYPE.
Problematic value
@article{Webster2002,
title = {Analyzing the past to prepare for the future: Writing a literature review},
author = {Webster, Jane and Watson, Richard T},
journal = {MIS quarterly},
}
Correct value
@article{Webster2002,
title = {Analyzing the past to prepare for the future: Writing a literature review},
author = {Webster, Jane and Watson, Richard T},
journal = {MIS quarterly},
volume = {26},
number = {2},
pages = {xiii-xxiii},
year = {2002},
}
See: inconsistent-with-entrytype
ENTRYTYPE | Required fields |
---|---|
article | author, title, journal, year, volume, number |
inproceedings | author, title, booktitle, year |
incollection | author, title, booktitle, publisher, year |
inbook | author, title, chapter, publisher, year |
proceedings | booktitle, editor, year |
conference | booktitle, editor, year |
book | author, title, publisher, year |
phdthesis | author, title, school, year |
bachelorthesis | author, title, school, year |
thesis | author, title, school, year |
masterthesis | author, title, school, year |
techreport | author, title, institution, year |
unpublished | title, author, year |
misc | author, title, year |
software | author, title, url |
online | author, title, url |
other | author, title, year |
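The table of required fields lends itself to a lookup, as in the following sketch. Only a few rows are reproduced, and the representation of a record as a dictionary with an `ENTRYTYPE` key, as well as the function name, are assumptions for illustration.

```python
# Excerpt of the required fields listed in the table above.
REQUIRED_FIELDS = {
    "article": ["author", "title", "journal", "year", "volume", "number"],
    "inproceedings": ["author", "title", "booktitle", "year"],
    "book": ["author", "title", "publisher", "year"],
}


def missing_fields(record):
    """Return the required fields that are absent or empty in the record."""
    required = REQUIRED_FIELDS.get(record["ENTRYTYPE"], [])
    return [field for field in required if not record.get(field)]


record = {
    "ENTRYTYPE": "article",
    "title": "Analyzing the past to prepare for the future: Writing a literature review",
    "author": "Webster, Jane and Watson, Richard T",
    "journal": "MIS quarterly",
}
print(missing_fields(record))  # ['year', 'volume', 'number']
```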
incomplete-field¶
Fields should be complete. Fields are considered incomplete (truncated) if they have ... at the end.
Problematic value
title = {A commentary on ...}
Correct value
title = {A commentary on microsourcing}
Fields checked |
---|
title |
journal |
booktitle |
author |
abstract |
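The truncation rule can be sketched as a suffix test. The function name is hypothetical, and ignoring trailing whitespace is an assumption.

```python
def is_incomplete(value):
    """Return True if the value ends with three dots, ignoring trailing whitespace."""
    return value.rstrip().endswith("...")
```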
container-title-abbreviated¶
Containers should not be abbreviated.
Problematic value
journal = {MISQ}
Correct value
journal = {MIS Quarterly}
A container is considered abbreviated if it has fewer than 6 characters and is all upper case.
Fields checked |
---|
journal |
booktitle |
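The abbreviation rule for containers (fewer than 6 characters and all upper case) translates directly into code. The function name is hypothetical.

```python
def is_container_abbreviated(value):
    """Return True for short, all-upper-case container titles."""
    return len(value) < 6 and value.isupper()
```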
name-abbreviated¶
Names should not be abbreviated.
Problematic value
author = {Smith, W. et. al.}
Correct value
author = {Smith, W. and Thompson, U.}
Fields checked |
---|
author |
editor |
Within-record consistency¶
inconsistent-with-entrytype¶
Records should not contain fields that are inconsistent with the respective ENTRYTYPE.
Problematic value
@article{SmithParkerWeber2003,
...
booktitle = {First Workshop on ...},
...
}
Correct value
@inproceedings{SmithParkerWeber2003,
...
booktitle = {First Workshop on ...},
...
}
ENTRYTYPE | Inconsistent fields |
---|---|
article | booktitle |
inproceedings | issue, number, journal |
incollection | |
inbook | journal |
book | volume, issue, number, journal |
phdthesis | volume, issue, number, journal, booktitle |
masterthesis | volume, issue, number, journal, booktitle |
techreport | volume, issue, number, journal, booktitle |
unpublished | volume, issue, number, journal, booktitle |
online | journal, booktitle |
misc | journal, booktitle |
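A lookup based on the table above could flag the offending fields, as sketched here. Only a few rows are reproduced, and the dictionary-based record with an `ENTRYTYPE` key and the function name are assumptions for illustration.

```python
# Excerpt of the inconsistent fields listed in the table above.
INCONSISTENT_FIELDS = {
    "article": ["booktitle"],
    "inproceedings": ["issue", "number", "journal"],
    "book": ["volume", "issue", "number", "journal"],
}


def fields_inconsistent_with_entrytype(record):
    """Return the fields of the record that do not fit its ENTRYTYPE."""
    candidates = INCONSISTENT_FIELDS.get(record["ENTRYTYPE"], [])
    return [field for field in candidates if field in record]
```

The sketch only reports the mismatch; whether to change the ENTRYTYPE or drop the field is left to the preparation step.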
page-range¶
Page range should be valid, i.e., the first page should be lower than the last page if the pages are numerical.
Problematic value
pages = {11--9}
Correct value
pages = {11--19}
Fields checked |
---|
pages |
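The comparison of first and last page can be sketched as follows. The function name is hypothetical, and accepting both "-" and "--" as range separators is an assumption. Non-numerical values are not compared, in line with the rule above.

```python
import re


def is_valid_page_range(pages):
    """Return False only if a numerical range has a first page that is not lower than the last."""
    match = re.fullmatch(r"([0-9]+)--?([0-9]+)", pages)
    if match is None:
        return True  # not a numerical range, nothing to compare
    return int(match.group(1)) < int(match.group(2))
```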
identical-values-between-title-and-container¶
Title and containers (booktitle, journal) should not contain identical values.
Problematic value
title = {MIS Quarterly},
journal = {MIS Quarterly},
Correct value
title = {A commentary on microsourcing},
journal = {MIS Quarterly},
inconsistent-content¶
Fields should not contain inconsistent values: the journal field should not refer to a conference or workshop, and the booktitle field should not refer to a journal.
Problematic value
journal = {Proceedings of the 32nd Conference on ...}
Correct value
booktitle = {Proceedings of the 32nd Conference on ...}
Fields checked | Erroneous values |
---|---|
journal | conference, workshop |
booktitle | journal |
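The erroneous values from the table can be checked with a substring search, sketched below. The function name is hypothetical, and case-insensitive matching is an assumption.

```python
# Erroneous values per field, as listed in the table above.
ERRONEOUS_VALUES = {
    "journal": ["conference", "workshop"],
    "booktitle": ["journal"],
}


def fields_with_inconsistent_content(record):
    """Return the fields whose value contains one of the erroneous terms."""
    return [
        field
        for field, terms in ERRONEOUS_VALUES.items()
        if any(term in record.get(field, "").lower() for term in terms)
    ]
```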
Origin consistency¶
inconsistent-with-doi-metadata¶
Record content needs to be consistent with the DOI metadata.
Problematic value
@article{wagner2021exploring,
title = {Analyzing the past to prepare for the future: Writing a literature review},
author = {Webster, Jane and Watson, Richard T},
journal = {MIS quarterly},
volume = {30},
number = {4},
pages = {101694},
year = {2021},
doi = {10.1016/j.jsis.2021.101694}
}
# metadata at crossref:
# https://api.crossref.org/works/10.1016/j.jsis.2021.101694
@article{wagner2021exploring,
title = {Exploring the boundaries and processes of digital platforms for knowledge work: A review of information systems research},
author = {Wagner, Gerit and Prester, Julian and Paré, Guy},
journal = {The Journal of Strategic Information Systems},
volume = {30},
number = {4},
pages = {101694},
year = {2021},
doi = {10.1016/j.jsis.2021.101694}
}
Correct value
@article{wagner2021exploring,
title = {Exploring the boundaries and processes of digital platforms for knowledge work: A review of information systems research},
author = {Wagner, Gerit and Prester, Julian and Paré, Guy},
journal = {The Journal of Strategic Information Systems},
volume = {30},
number = {4},
pages = {101694},
year = {2021},
doi = {10.1016/j.jsis.2021.101694}
}
# metadata at crossref:
# https://api.crossref.org/works/10.1016/j.jsis.2021.101694
@article{wagner2021exploring,
title = {Exploring the boundaries and processes of digital platforms for knowledge work: A review of information systems research},
author = {Wagner, Gerit and Prester, Julian and Paré, Guy},
journal = {The Journal of Strategic Information Systems},
volume = {30},
number = {4},
pages = {101694},
year = {2021},
doi = {10.1016/j.jsis.2021.101694}
}
Fields checked |
---|
title |
journal |
author |
inconsistent-with-url-metadata¶
Record content should be consistent with the metadata that Zotero generates for the url.
Problematic value
@article{wagner2021exploring,
title = {Analyzing the past to prepare for the future: Writing a literature review},
author = {Webster, Jane and Watson, Richard T},
journal = {MIS quarterly},
volume = {30},
number = {4},
pages = {101694},
year = {2021},
url = {https://www.sciencedirect.com/science/article/abs/pii/S096386872100041X}
}
# metadata from the url:
@article{wagner2021exploring,
title = {Exploring the boundaries and processes of digital platforms for knowledge work: A review of information systems research},
author = {Wagner, Gerit and Prester, Julian and Paré, Guy},
journal = {The Journal of Strategic Information Systems},
volume = {30},
number = {4},
pages = {101694},
year = {2021},
url = {https://www.sciencedirect.com/science/article/abs/pii/S096386872100041X}
}
Correct value
@article{wagner2021exploring,
title = {Exploring the boundaries and processes of digital platforms for knowledge work: A review of information systems research},
author = {Wagner, Gerit and Prester, Julian and Paré, Guy},
journal = {The Journal of Strategic Information Systems},
volume = {30},
number = {4},
pages = {101694},
year = {2021},
url = {https://www.sciencedirect.com/science/article/abs/pii/S096386872100041X}
}
# metadata from the url:
@article{wagner2021exploring,
title = {Exploring the boundaries and processes of digital platforms for knowledge work: A review of information systems research},
author = {Wagner, Gerit and Prester, Julian and Paré, Guy},
journal = {The Journal of Strategic Information Systems},
volume = {30},
number = {4},
pages = {101694},
year = {2021},
url = {https://www.sciencedirect.com/science/article/abs/pii/S096386872100041X}
}
Fields checked |
---|
author |
title |
year |
journal |
volume |
number |
record-not-in-toc¶
The record should be found in the relevant table of contents (toc) if a toc is available.
Problematic value
@article{wagner2021exploring,
title = {A breakthrough paper on microsourcing},
author = {Wagner, Gerit},
journal = {The Journal of Strategic Information Systems},
volume = {30},
number = {4},
year = {2021},
}
# Table-of-contents (based on crossref):
# The Journal of Strategic Information Systems, 30-4
Gable, G. and Chan, Y. - Welcome to this 4th issue of Volume 30 of The Journal of Strategic Information Systems
Mamonov, S. and Peterson, R. - The role of IT in organizational innovation – A systematic literature review
Eismann, K. and Posegga, O. and Fischbach, K. - Opening organizational learning in crisis management: On the affordances of social media
Dhillon, G. and Smith, K. and Dissanayaka, I. - Information systems security research agenda: Exploring the gap between research and practice
Wagner, G. and Prester, J. and Pare, G. - Exploring the boundaries and processes of digital platforms for knowledge work: A review of information systems research
Hund, A. and Wagner, H. T. and Beimborn, D. and Weitzel, T. - Digital innovation: Review and novel perspective
Correct value
@article{wagner2021exploring,
title = {Exploring the boundaries and processes of digital platforms for knowledge work: A review of information systems research},
author = {Wagner, Gerit and Prester, Julian and Paré, Guy},
journal = {The Journal of Strategic Information Systems},
volume = {30},
number = {4},
pages = {101694},
year = {2021},
}
# Table-of-contents (based on crossref):
# The Journal of Strategic Information Systems, 30-4
Gable, G. and Chan, Y. - Welcome to this 4th issue of Volume 30 of The Journal of Strategic Information Systems
Mamonov, S. and Peterson, R. - The role of IT in organizational innovation – A systematic literature review
Eismann, K. and Posegga, O. and Fischbach, K. - Opening organizational learning in crisis management: On the affordances of social media
Dhillon, G. and Smith, K. and Dissanayaka, I. - Information systems security research agenda: Exploring the gap between research and practice
Wagner, G. and Prester, J. and Pare, G. - Exploring the boundaries and processes of digital platforms for knowledge work: A review of information systems research
Hund, A. and Wagner, H. T. and Beimborn, D. and Weitzel, T. - Digital innovation: Review and novel perspective
Common defects¶
erroneous-symbol-in-field¶
Fields should not contain invalid symbols.
Problematic value
author = {M�ller, U.}
Correct value
author = {Müller, U.}
Symbols considered erroneous: “�”, “™”
Fields checked |
---|
author |
title |
editor |
journal |
booktitle |
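Detecting the two symbols named above amounts to a membership test, as in this sketch. The function name is hypothetical, and the symbols are written as Unicode escapes to keep the source readable.

```python
# U+FFFD (replacement character) and U+2122 (trade mark sign).
ERRONEOUS_SYMBOLS = ["\ufffd", "\u2122"]


def contains_erroneous_symbol(value):
    """Return True if the value contains one of the erroneous symbols."""
    return any(symbol in value for symbol in ERRONEOUS_SYMBOLS)
```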
erroneous-term-in-field¶
Fields should not contain any erroneous terms.
Problematic value
author = {Smith, F. orcid-0012393}
Correct value
author = {Smith, F.}
Field | Erroneous terms |
---|---|
author | http, University, orcid, student, Harvard, Conference, Mrs, Hochschule |
title | research paper, completed research, research in progress, full research paper |
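The term lists above could be applied as in the following sketch. The function name is hypothetical, and matching the terms case-insensitively as substrings is an assumption.

```python
# Erroneous terms per field, as listed in the table above.
ERRONEOUS_TERMS = {
    "author": [
        "http", "University", "orcid", "student",
        "Harvard", "Conference", "Mrs", "Hochschule",
    ],
    "title": [
        "research paper", "completed research",
        "research in progress", "full research paper",
    ],
}


def erroneous_terms_in_field(field, value):
    """Return the erroneous terms found in the value of the given field."""
    lowered = value.lower()
    return [term for term in ERRONEOUS_TERMS.get(field, []) if term.lower() in lowered]
```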
erroneous-title-field¶
Title should not contain typical defects.
Problematic value
title = {A I S ssociation for nformation ystems}
Correct value
title = {An empirical study of platform exit}
Fields checked |
---|
title |