search_query.pubmed package

Submodules

search_query.pubmed.constants module

Constants for PubMed.

search_query.pubmed.constants.generic_search_field_to_syntax_field(generic_search_field: str) str

Convert a generic search field to the corresponding PubMed syntax field string.

search_query.pubmed.constants.map_to_standard(syntax_str: str) str

Map a syntax string to a standard syntax string.

search_query.pubmed.constants.syntax_str_to_generic_search_field_set(field_value: str) set

Translate a PubMed search field string into a set of generic search fields.

search_query.pubmed.linter module

PubMed query linter.

class search_query.pubmed.linter.PubmedQueryListLinter(parser: PubmedListParser, string_parser_class: Type[QueryStringParser])

Bases: QueryListLinter

Linter for PubMed query lists.

check_invalid_list_reference() None

Check for invalid list references.

check_operator_node_token_sequence() None

Check the token sequence of operator nodes.

validate_tokens() None

Validate the token list.

class search_query.pubmed.linter.PubmedQueryStringLinter(query_str: str = '')

Bases: QueryStringLinter

Linter for PubMed query strings.

PLATFORM: <PLATFORM.PUBMED: 'pubmed'> = 'pubmed'
PROXIMITY_REGEX = re.compile('^\\[(.+):~(.*)\\]$')
VALID_FIELDS_REGEX: re.Pattern
VALID_TOKEN_SEQUENCES: Dict[TokenTypes, List[TokenTypes]] = {
    TokenTypes.FIELD: [TokenTypes.LOGIC_OPERATOR, TokenTypes.PARENTHESIS_CLOSED, TokenTypes.RANGE_OPERATOR],
    TokenTypes.LOGIC_OPERATOR: [TokenTypes.SEARCH_TERM, TokenTypes.PARENTHESIS_OPEN],
    TokenTypes.PARENTHESIS_CLOSED: [TokenTypes.LOGIC_OPERATOR, TokenTypes.PARENTHESIS_CLOSED],
    TokenTypes.PARENTHESIS_OPEN: [TokenTypes.SEARCH_TERM, TokenTypes.PARENTHESIS_OPEN],
    TokenTypes.RANGE_OPERATOR: [TokenTypes.SEARCH_TERM],
    TokenTypes.SEARCH_TERM: [TokenTypes.FIELD, TokenTypes.LOGIC_OPERATOR, TokenTypes.PARENTHESIS_CLOSED],
}
check_character_replacement_in_search_term(query: Query) None

Check a search term for invalid characters.

check_general_search_field_mismatch() None

Check for a mismatch between the general search field and the search fields used in the query.

check_invalid_proximity_operator() None

Check search field for invalid proximity operator
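
A minimal sketch of how the documented PROXIMITY_REGEX splits a search field, using only the standard library. The pattern is copied from the attribute listing above; the helper name split_proximity_field is hypothetical and not part of the package, and the linter's actual validation logic is not shown here.

```python
import re
from typing import Optional, Tuple

# Pattern copied from the PROXIMITY_REGEX attribute documented above.
PROXIMITY_REGEX = re.compile(r"^\[(.+):~(.*)\]$")


def split_proximity_field(field: str) -> Optional[Tuple[str, str]]:
    """Return (field name, distance text) for a proximity field, else None.

    Hypothetical helper for illustration only.
    """
    match = PROXIMITY_REGEX.match(field)
    if match is None:
        return None
    return match.group(1), match.group(2)


with_distance = split_proximity_field("[tiab:~2]")
without_distance = split_proximity_field("[tiab:~]")
plain_field = split_proximity_field("[tiab]")
```

Because the second group is `(.*)`, the pattern also matches a field with an empty distance such as `[tiab:~]`, so a check of this kind would still need to inspect the captured distance separately.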

check_invalid_syntax() None

Check for invalid syntax in the query string.

check_invalid_token_sequences() None

Check token list for invalid token sequences.
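
The VALID_TOKEN_SEQUENCES table can be read as "for each token type, which types may follow it". The sketch below illustrates that reading with a stand-in enum; the TokenTypes class and the find_invalid_pairs function here are assumptions made for the example and do not reproduce the package's implementation.

```python
from enum import Enum
from typing import Dict, List, Tuple


class TokenTypes(Enum):
    """Stand-in for the library's TokenTypes enum (assumed member names)."""

    FIELD = "FIELD"
    LOGIC_OPERATOR = "LOGIC_OPERATOR"
    PARENTHESIS_OPEN = "PARENTHESIS_OPEN"
    PARENTHESIS_CLOSED = "PARENTHESIS_CLOSED"
    RANGE_OPERATOR = "RANGE_OPERATOR"
    SEARCH_TERM = "SEARCH_TERM"


T = TokenTypes

# Table copied from the VALID_TOKEN_SEQUENCES attribute documented above.
VALID_TOKEN_SEQUENCES: Dict[TokenTypes, List[TokenTypes]] = {
    T.FIELD: [T.LOGIC_OPERATOR, T.PARENTHESIS_CLOSED, T.RANGE_OPERATOR],
    T.LOGIC_OPERATOR: [T.SEARCH_TERM, T.PARENTHESIS_OPEN],
    T.PARENTHESIS_CLOSED: [T.LOGIC_OPERATOR, T.PARENTHESIS_CLOSED],
    T.PARENTHESIS_OPEN: [T.SEARCH_TERM, T.PARENTHESIS_OPEN],
    T.RANGE_OPERATOR: [T.SEARCH_TERM],
    T.SEARCH_TERM: [T.FIELD, T.LOGIC_OPERATOR, T.PARENTHESIS_CLOSED],
}


def find_invalid_pairs(
    token_types: List[TokenTypes],
) -> List[Tuple[int, TokenTypes, TokenTypes]]:
    """Return (index, current, following) for each disallowed adjacent pair."""
    invalid = []
    for index in range(len(token_types) - 1):
        current, following = token_types[index], token_types[index + 1]
        if following not in VALID_TOKEN_SEQUENCES[current]:
            invalid.append((index, current, following))
    return invalid


# cancer[tiab] OR therapy
valid_sequence = [T.SEARCH_TERM, T.FIELD, T.LOGIC_OPERATOR, T.SEARCH_TERM]
# cancer therapy (two search terms with no operator between them)
invalid_sequence = [T.SEARCH_TERM, T.SEARCH_TERM]

valid_result = find_invalid_pairs(valid_sequence)
invalid_result = find_invalid_pairs(invalid_sequence)
```

Under this table, two adjacent search terms are reported because SEARCH_TERM does not list itself as an allowed follower.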

check_invalid_wildcard(query: Query) None

Check a search term for an invalid wildcard (*).

check_unsupported_pubmed_search_fields() None

Check for the correct format of fields.

messages: List[dict]
query: Optional[Query]
syntax_str_to_generic_search_field_set(field_value: str) set

Translate a PubMed search field string into a set of generic search fields.

tokens: List[Token]
validate_platform_query(query: Query) None

Validate the query for the PubMed platform.

validate_query_tree(query: Query) None

Validate the query tree.

validate_tokens(*, tokens: List[Token], query_str: str, search_field_general: str = '') List[Token]

Validate the token list.

search_query.pubmed.parser module

PubMed query parser.

class search_query.pubmed.parser.PubmedListParser(query_list: str, *, search_field_general: str = '', mode: str = 'non-strict')

Bases: QueryListParser

Parser for PubMed queries in list format.

LIST_ITEM_REF = re.compile('#\\d+')
LIST_ITEM_REGEX: Pattern = re.compile('^(\\d+).\\s+(.*)$')
OPERATOR_NODE_REGEX = re.compile('#\\d|AND|OR|NOT')
get_operator_node_tokens(token_nr: int) list

Get the tokens of the operator node with the given token number.

parse() Query

Parse the query in list format.
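
To illustrate what the list-format patterns recognize, the sketch below applies LIST_ITEM_REGEX and LIST_ITEM_REF (both copied from the attributes above) to a small numbered query list. The function split_list_query and the sample query are assumptions made for this example; they do not show how parse() builds the Query object.

```python
import re
from typing import Dict, List

# Patterns copied from the PubmedListParser attributes documented above.
LIST_ITEM_REGEX = re.compile(r"^(\d+).\s+(.*)$")
LIST_ITEM_REF = re.compile(r"#\d+")


def split_list_query(query_list: str) -> Dict[str, str]:
    """Map each list item number to its query text (hypothetical helper)."""
    items: Dict[str, str] = {}
    for line in query_list.splitlines():
        match = LIST_ITEM_REGEX.match(line.strip())
        if match:
            items[match.group(1)] = match.group(2)
    return items


sample = "1. cancer[tiab]\n2. therapy[tiab]\n3. #1 AND #2"
items = split_list_query(sample)

# References to earlier list items found in the combining line.
references: List[str] = LIST_ITEM_REF.findall(items["3"])
```

Note that the `.` after the item number in LIST_ITEM_REGEX is unescaped, so it matches any single character rather than only a literal period.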

query_dict: dict
class search_query.pubmed.parser.PubmedParser(query_str: str, search_field_general: str = '', mode: str = 'non-strict')

Bases: QueryStringParser

Parser for PubMed queries.

OPERATOR_REGEX: re.Pattern = re.compile('(\\||&|\\b(?:AND|OR|NOT|:)\\b)(?!\\s?\\[[^\\[]*?\\])')
PARENTHESIS_REGEX = re.compile('[\\(\\)]')
PROXIMITY_REGEX = re.compile('^\\[(.+):~(.*)\\]$')
SEARCH_FIELD_REGEX = re.compile('\\[[^\\[]*?\\]')
SEARCH_PHRASE_REGEX = re.compile('\\".*?\\"')
SEARCH_TERM_REGEX = re.compile('[^\\s\\[\\]()\\|&]+')
linter: QueryStringLinter
parse() Query

Parse a query string.

parse_query_tree(tokens: list) Query

Parse a query from a list of tokens.

pattern = re.compile('\\[[^\\[]*?\\]|(\\||&|\\b(?:AND|OR|NOT|:)\\b)(?!\\s?\\[[^\\[]*?\\])|[\\(\\)]|\\".*?\\"|[^\\s\\[\\]()\\|&]+', re.IGNORECASE)
tokenize() None

Tokenize the query_str
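
The combined `pattern` attribute above joins the field, operator, parenthesis, phrase, and term patterns into one alternation. The sketch below runs that pattern over a sample query with the standard library only; the function tokenize_sketch and the sample queries are assumptions for illustration, and the real tokenize() additionally assigns token types and positions, which is not shown.

```python
import re
from typing import List

# Pattern copied from the `pattern` attribute of PubmedParser documented above.
PATTERN = re.compile(
    r"\[[^\[]*?\]"  # search field, e.g. [tiab]
    r"|(\||&|\b(?:AND|OR|NOT|:)\b)(?!\s?\[[^\[]*?\])"  # operator not followed by a field
    r"|[\(\)]"  # parenthesis
    r"|\".*?\""  # quoted phrase
    r"|[^\s\[\]()\|&]+",  # plain search term
    re.IGNORECASE,
)


def tokenize_sketch(query_str: str) -> List[str]:
    """Return the matched substrings in order (hypothetical helper).

    finditer is used instead of findall because the pattern contains a
    capturing group, which would change what findall returns.
    """
    return [match.group(0) for match in PATTERN.finditer(query_str)]


tokens = tokenize_sketch('("heart attack"[tiab] OR stroke[mh]) AND aspirin')
word_with_field = tokenize_sketch("or[tiab]")
```

The negative lookahead after the operator group means that a word such as `or` directly followed by a field tag is not matched as an operator; in the second call it falls through to the plain search term alternative instead.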

tokens: list

search_query.pubmed.serializer module

PubMed serializer.

search_query.pubmed.serializer.to_string_pubmed(query: Query) str

Serialize the Query tree into a PubMed search string.

search_query.pubmed.translator module

PubMed query translator.

class search_query.pubmed.translator.PubmedTranslator

Bases: QueryTranslator

Translator for PubMed queries.

classmethod to_generic_syntax(query: Query) Query

Convert the query from PubMed syntax to the generic syntax.

classmethod to_specific_syntax(query: Query) Query

Convert the query from the generic syntax to PubMed-specific syntax.

classmethod translate_search_fields_to_generic(query: Query) None

Translate the query's search fields to generic search fields.

Module contents

Top-level package for PubMed.