colrev.env.tei_parser.TEIParser

class TEIParser(*, pdf_path=None, tei_path=None)

Bases: object

Environment service for TEI parsing

Creates or reads a TEI file. Modes of operation:

- pdf_path: create the TEI and store it temporarily in self.data
- pdf_path and tei_path: create the TEI and save it to tei_path
- tei_path: read the TEI from file
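The three modes can be summarised as a small decision function. This is only an illustrative sketch: `describe_mode` is a hypothetical helper that is not part of colrev, and raising `ValueError` when neither path is given is an assumption, since the documentation does not say how that case is handled.

```python
from pathlib import Path
from typing import Optional


def describe_mode(
    *, pdf_path: Optional[Path] = None, tei_path: Optional[Path] = None
) -> str:
    """Hypothetical helper: name the documented mode for an argument combination."""
    if pdf_path is not None and tei_path is not None:
        return "create TEI from the PDF and save it to tei_path"
    if pdf_path is not None:
        return "create TEI from the PDF and keep it in memory"
    if tei_path is not None:
        return "read existing TEI from tei_path"
    # Assumption: the docs do not describe this case, so the sketch rejects it.
    raise ValueError("provide pdf_path, tei_path, or both")


print(describe_mode(pdf_path=Path("paper.pdf")))
print(describe_mode(pdf_path=Path("paper.pdf"), tei_path=Path("paper.tei.xml")))
print(describe_mode(tei_path=Path("paper.tei.xml")))
```

Both constructor arguments are keyword-only (note the `*` in the signature), so the sketch mirrors that calling convention.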

Methods

`get_abstract`	Get the abstract
`get_author_details`	Get the author details
`get_citations_per_section`	Get a dict of section-names and list-of-citations
`get_grobid_version`	Get the GROBID version used for TEI creation
`get_metadata`	Get the metadata of the PDF (title, author, ...) as a dict
`get_paper_keywords`	Get the keywords
`get_references`	Get the bibliography (references section) as a list of record dicts
`get_tei_str`	Get the TEI string
`mark_references`	Mark references with the additional record ID

Attributes

`ns`
`nsmap`

get_abstract()

Get the abstract

Return type: str
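To show roughly what reading an abstract out of a TEI document involves, here is a minimal standard-library sketch. It is not colrev's implementation: `extract_abstract` and `SAMPLE_TEI` are hypothetical names, and the sample assumes the abstract sits in an `<abstract>` element under the TEI namespace `http://www.tei-c.org/ns/1.0`.

```python
import xml.etree.ElementTree as ET

# TEI namespace; the prefix "tei" is an arbitrary local choice.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hand-written sample, assumed to resemble a TEI header with an abstract.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <profileDesc>
      <abstract><div><p>We review TEI parsing.</p></div></abstract>
    </profileDesc>
  </teiHeader>
</TEI>"""


def extract_abstract(tei_str: str) -> str:
    """Return the abstract text, or an empty string if there is none."""
    root = ET.fromstring(tei_str)
    abstract = root.find(".//tei:abstract", TEI_NS)
    if abstract is None:
        return ""
    # Join all nested text nodes and collapse runs of whitespace.
    return " ".join("".join(abstract.itertext()).split())


print(extract_abstract(SAMPLE_TEI))
```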

get_author_details()

Get the author details

Return type: list

get_citations_per_section()

Get a dict of section-names and list-of-citations

Return type: dict
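The shape of such a result, section names mapped to lists of citations, can be illustrated with a standard-library sketch. This is an assumption-laden example, not colrev's code: `citations_per_section` and `SAMPLE_TEI` are hypothetical, the sample assumes sections are `<div>` elements with a `<head>` and citations are `<ref type="bibr">` elements, and the exact values colrev stores in each list may differ.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hand-written sample body with two sections (assumed structure).
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div><head>Introduction</head>
      <p>Prior work <ref type="bibr" target="#b0">[1]</ref> and
         <ref type="bibr" target="#b1">[2]</ref>.</p></div>
    <div><head>Method</head>
      <p>We follow <ref type="bibr" target="#b0">[1]</ref>; see
         <ref type="figure" target="#fig_0">Fig. 1</ref>.</p></div>
  </body></text>
</TEI>"""


def citations_per_section(tei_str: str) -> dict:
    """Map each section heading to the bibliography targets cited in it."""
    root = ET.fromstring(tei_str)
    result = {}
    for div in root.findall(".//tei:body/tei:div", TEI_NS):
        head = div.find("tei:head", TEI_NS)
        if head is None or not head.text:
            continue  # skip sections without a heading
        # Keep only bibliographic references, not figure or table references.
        targets = [
            ref.get("target", "").lstrip("#")
            for ref in div.findall(".//tei:ref[@type='bibr']", TEI_NS)
        ]
        result[head.text.strip()] = targets
    return result


print(citations_per_section(SAMPLE_TEI))
```

Filtering on `type='bibr'` keeps figure and table cross-references out of the citation lists.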

get_grobid_version()

Get the GROBID version used for TEI creation

Return type: str

get_metadata()

Get the metadata of the PDF (title, author, …) as a dict

Return type: dict

get_paper_keywords()

Get the keywords

Return type: list

get_references(*, add_intext_citation_count=False)

Get the bibliography (references section) as a list of record dicts

Return type: list

get_tei_str()

Get the TEI string

Return type: str

mark_references(*, records)

Mark references with the additional record ID