search_query.pubmed package

Submodules

search_query.pubmed.constants module

Constants for PubMed.

search_query.pubmed.constants.generic_field_to_syntax_field(generic_field: str) str

Convert a generic search field to the corresponding PubMed syntax string.

search_query.pubmed.constants.map_to_standard(syntax_str: str) str

Map a syntax string to a standard syntax string.

search_query.pubmed.constants.syntax_str_to_generic_field_set(field_value: str) set

Translate a search field syntax string into a set of generic fields.

search_query.pubmed.linter module

PubMed query linter.

class search_query.pubmed.linter.PubmedQueryListLinter(parser: PubmedListParser, string_parser_class: Type[QueryStringParser])

Bases: QueryListLinter

Linter for PubMed query lists.

OPERATOR_NODE_REGEX = re.compile('#?\\d+|AND|OR|NOT')
last_read_index: Dict[int, int]
messages: dict
parser: PubmedListParser
validate_tokens() None

Validate the token list.
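As a rough illustration of what OPERATOR_NODE_REGEX matches, the sketch below copies the pattern from the attribute listing and checks whether a list entry consists only of numeric references and operators. The helper name is_operator_node is hypothetical, and splitting on whitespace is an assumption; the linter's actual use of the pattern may differ.

```python
import re

# Pattern copied from the OPERATOR_NODE_REGEX attribute documented above.
OPERATOR_NODE_REGEX = re.compile(r"#?\d+|AND|OR|NOT")


def is_operator_node(node: str) -> bool:
    """Hypothetical helper: True if every whitespace-separated part of the
    node is a numeric reference (e.g. "#1" or "2") or an uppercase operator."""
    parts = node.split()
    return bool(parts) and all(OPERATOR_NODE_REGEX.fullmatch(p) for p in parts)


print(is_operator_node("#1 AND #2"))     # entry that combines earlier entries
print(is_operator_node("cancer[tiab]"))  # ordinary search term
```

Note that the pattern is case-sensitive as written, so a lowercase "and" would not count as an operator in this sketch.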

class search_query.pubmed.linter.PubmedQueryStringLinter(query_str: str = '', *, original_str: str | None = None, silent: bool = False)

Bases: QueryStringLinter

Linter for PubMed query strings.

PLATFORM: PLATFORM = 'pubmed'
PROXIMITY_REGEX = re.compile('^\\[(.+):~(.*)\\]$')
VALID_TOKEN_SEQUENCES: Dict[TokenTypes, List[TokenTypes]] = {TokenTypes.FIELD: [TokenTypes.LOGIC_OPERATOR, TokenTypes.PARENTHESIS_CLOSED, TokenTypes.RANGE_OPERATOR], TokenTypes.LOGIC_OPERATOR: [TokenTypes.TERM, TokenTypes.PARENTHESIS_OPEN], TokenTypes.PARENTHESIS_CLOSED: [TokenTypes.LOGIC_OPERATOR, TokenTypes.PARENTHESIS_CLOSED], TokenTypes.PARENTHESIS_OPEN: [TokenTypes.TERM, TokenTypes.PARENTHESIS_OPEN], TokenTypes.RANGE_OPERATOR: [TokenTypes.TERM], TokenTypes.TERM: [TokenTypes.FIELD, TokenTypes.LOGIC_OPERATOR, TokenTypes.PARENTHESIS_CLOSED]}
VALID_fieldS_REGEX: re.Pattern
YEAR_VALUE_REGEX = re.compile('^"?(?P<year>\\d{4})(?P<month>\\/(0[1-9]|1[0-2]))?(?P<day>\\/(0[1-9]|[12]\\d|3[01]))?(\\:(?P<year2>(\\d{4})(?P<month2>\\/(0[1-9]|1[0-2]))?(?P<day2>\\/(0[1-9]|[12]\\d|3[01]))?))?"?$', re.VERBOSE)
add_artificial_parentheses_for_operator_precedence(index: int = 0, output: list | None = None) tuple[int, list[Token]]

Adds artificial parentheses with position (-1, -1) to enforce PubMed operator precedence.
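To make the idea of explicit precedence concrete, here is a minimal sketch that fully parenthesises a flat operator chain from left to right. It assumes all Boolean operators have equal precedence and are evaluated left to right, which is how the PubMed user guide describes query processing. The function group_left_to_right is hypothetical, works on plain strings rather than Token objects, and is not the method's actual algorithm.

```python
from typing import List


def group_left_to_right(tokens: List[str]) -> str:
    """Hypothetical helper: wrap a flat [term, op, term, op, term, ...] list
    in parentheses so that operators are applied strictly left to right."""
    if not tokens:
        return ""
    expression = tokens[0]
    # Step through (operator, term) pairs and nest the expression built so far.
    for i in range(1, len(tokens) - 1, 2):
        operator, term = tokens[i], tokens[i + 1]
        expression = f"({expression} {operator} {term})"
    return expression


print(group_left_to_right(["A", "OR", "B", "AND", "C"]))
```

Under that assumption, A OR B AND C is read as (A OR B) AND C, not as A OR (B AND C).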

check_character_replacement_in_term(query: Query) None

Check a search term for invalid characters.

check_implicit_fields() None

Check the general search field.

check_invalid_proximity_operator() None

Check the search field for an invalid proximity operator.
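The PROXIMITY_REGEX attribute listed above can be tried in isolation. The sketch below copies the pattern and splits a bracketed field such as [tiab:~2] into the field name and the distance. The helper split_proximity_field is hypothetical and only demonstrates the pattern; it does not reproduce the checks this method performs.

```python
import re
from typing import Optional, Tuple

# Pattern copied from the PROXIMITY_REGEX attribute documented above.
PROXIMITY_REGEX = re.compile(r"^\[(.+):~(.*)\]$")


def split_proximity_field(field: str) -> Optional[Tuple[str, str]]:
    """Hypothetical helper: return (field_name, distance) for a proximity
    field, or None if the field carries no proximity operator."""
    match = PROXIMITY_REGEX.match(field)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(split_proximity_field("[tiab:~2]"))
print(split_proximity_field("[tiab]"))
```

Because the second group is (.*), the pattern also matches a field with an empty distance such as [ti:~], so any validation of the distance value has to happen after the match.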

check_invalid_syntax() None

Check for invalid syntax in the query string.

check_invalid_token_sequences() None

Check token list for invalid token sequences.
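The VALID_TOKEN_SEQUENCES table above can be read as "for each token type, which types may follow it". A minimal sketch of such a check is shown below. The TokenTypes enum here is a local stand-in for the library's own enum, the table is copied from the attribute listing, and invalid_transitions is a hypothetical helper, not the method's implementation.

```python
from enum import Enum
from typing import Dict, List, Tuple


class TokenTypes(Enum):
    """Local stand-in for the library's TokenTypes enum (assumption)."""
    FIELD = "FIELD"
    LOGIC_OPERATOR = "LOGIC_OPERATOR"
    PARENTHESIS_CLOSED = "PARENTHESIS_CLOSED"
    PARENTHESIS_OPEN = "PARENTHESIS_OPEN"
    RANGE_OPERATOR = "RANGE_OPERATOR"
    TERM = "TERM"


T = TokenTypes

# Allowed followers per token type, copied from VALID_TOKEN_SEQUENCES above.
VALID_TOKEN_SEQUENCES: Dict[TokenTypes, List[TokenTypes]] = {
    T.FIELD: [T.LOGIC_OPERATOR, T.PARENTHESIS_CLOSED, T.RANGE_OPERATOR],
    T.LOGIC_OPERATOR: [T.TERM, T.PARENTHESIS_OPEN],
    T.PARENTHESIS_CLOSED: [T.LOGIC_OPERATOR, T.PARENTHESIS_CLOSED],
    T.PARENTHESIS_OPEN: [T.TERM, T.PARENTHESIS_OPEN],
    T.RANGE_OPERATOR: [T.TERM],
    T.TERM: [T.FIELD, T.LOGIC_OPERATOR, T.PARENTHESIS_CLOSED],
}


def invalid_transitions(
    types: List[TokenTypes],
) -> List[Tuple[int, TokenTypes, TokenTypes]]:
    """Hypothetical helper: list (index, previous, current) for every token
    whose type is not an allowed follower of the preceding token's type."""
    return [
        (i, prev, cur)
        for i, (prev, cur) in enumerate(zip(types, types[1:]), start=1)
        if cur not in VALID_TOKEN_SEQUENCES[prev]
    ]


# term[field] AND term[field] is a valid sequence; two adjacent terms are not.
print(invalid_transitions([T.TERM, T.FIELD, T.LOGIC_OPERATOR, T.TERM, T.FIELD]))
print(invalid_transitions([T.TERM, T.TERM]))
```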

check_invalid_wildcard(query: Query) None

Check a search term for an invalid wildcard (*).

check_unsupported_pubmed_fields() None

Check for the correct format of fields.

check_year_format(query: Query) None

Check for the correct format of year.
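The accepted date shapes can be explored with the YEAR_VALUE_REGEX attribute listed above. The sketch below copies that pattern and returns the named groups that matched. The helper match_year_value is hypothetical and shows only what the pattern accepts, not how this method reports problems.

```python
import re
from typing import Dict, Optional

# Pattern copied from the YEAR_VALUE_REGEX attribute documented above,
# split across string literals for readability only.
YEAR_VALUE_REGEX = re.compile(
    r'^"?(?P<year>\d{4})(?P<month>\/(0[1-9]|1[0-2]))?'
    r'(?P<day>\/(0[1-9]|[12]\d|3[01]))?'
    r'(\:(?P<year2>(\d{4})(?P<month2>\/(0[1-9]|1[0-2]))?'
    r'(?P<day2>\/(0[1-9]|[12]\d|3[01]))?))?"?$',
    re.VERBOSE,
)


def match_year_value(value: str) -> Optional[Dict[str, str]]:
    """Hypothetical helper: return the named groups that matched,
    or None if the value does not fit the pattern at all."""
    match = YEAR_VALUE_REGEX.match(value)
    if match is None:
        return None
    return {name: text for name, text in match.groupdict().items() if text is not None}


print(match_year_value("2020/01/15"))  # single date
print(match_year_value("2015:2020"))   # date range
print(match_year_value("next year"))   # not a date
```

Two details follow from the pattern as written: the month and day groups include their leading slash, and the year2 group spans the whole end date, so for 2015:2020/06 it captures 2020/06 and not just 2020.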

messages: List[dict]
query: Optional[Query]
syntax_str_to_generic_field_set(field_value: str) set

Translate a search field syntax string into a set of generic fields.

tokens: List[Token]
validate_platform_query(query: Query) None

Validate the query for the PubMed platform.

validate_query_tree(query: Query) None

Validate the query tree.

validate_tokens(*, tokens: List[Token], query_str: str, field_general: str = '') List[Token]

Validate the token list.

search_query.pubmed.parser module

PubMed query parser.

class search_query.pubmed.parser.PubmedListParser(query_list: str, *, field_general: str = '')

Bases: QueryListParser

Parser for PubMed (list format) queries.

parse() Query

Parse the query in list format.

class search_query.pubmed.parser.PubmedParser(query_str: str, *, field_general: str = '', offset: dict | None = None, original_str: str | None = None, silent: bool = False)

Bases: QueryStringParser

Parser for PubMed queries.

FIELD_REGEX = re.compile('\\[[^\\[]*?\\]')
LOGIC_OPERATOR_REGEX = re.compile('(\\||&|\\b(?:AND|OR|NOT|:)\\b)(?!\\s?\\[[^\\[]*?\\])')
PARENTHESIS_REGEX = re.compile('[\\(\\)]')
PROXIMITY_REGEX = re.compile('^\\[(.+):~(.*)\\]$')
SEARCH_PHRASE_REGEX = re.compile('\\".*?\\"')
TERM_REGEX = re.compile('[^\\s\\[\\]()\\|&]+')
linter: QueryStringLinter
parse() Query

Parse a query string.

parse_query_tree(tokens: list) Query

Parse a query from a list of tokens.

pattern = re.compile('\\[[^\\[]*?\\]|(\\||&|\\b(?:AND|OR|NOT|:)\\b)(?!\\s?\\[[^\\[]*?\\])|[\\(\\)]|\\".*?\\"|[^\\s\\[\\]()\\|&]+', re.IGNORECASE)
tokenize() None

Tokenize the query string (query_str).
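The combined pattern attribute listed above gives a feel for how a query string is split. The sketch below copies that pattern and collects the raw matches with re.finditer. The function raw_tokens is hypothetical; the actual tokenize() method stores Token objects and may post-process the matches (for example by assigning types and positions), which this sketch does not attempt.

```python
import re
from typing import List

# Combined pattern copied from the `pattern` attribute documented above:
# field | logic operator | parenthesis | quoted phrase | bare term.
PATTERN = re.compile(
    r'\[[^\[]*?\]'
    r'|(\||&|\b(?:AND|OR|NOT|:)\b)(?!\s?\[[^\[]*?\])'
    r'|[\(\)]'
    r'|\".*?\"'
    r'|[^\s\[\]()\|&]+',
    re.IGNORECASE,
)


def raw_tokens(query_str: str) -> List[str]:
    """Hypothetical helper: return the substrings matched by the pattern,
    in order of appearance. Whitespace between matches is skipped."""
    return [match.group(0) for match in PATTERN.finditer(query_str)]


print(raw_tokens('cancer[tiab] AND ("heart attack"[Title] OR stroke)'))
```

The negative lookahead in the operator alternative keeps a word such as OR from being read as an operator when it is directly followed by a bracketed field, so it can still be searched as a term.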

tokens: list

search_query.pubmed.serializer module

PubMed serializer.

search_query.pubmed.serializer.to_string_pubmed(query: Query) str

Serialize the Query tree into a PubMed search string.

search_query.pubmed.translator module

PubMed query translator.

class search_query.pubmed.translator.PubmedTranslator

Bases: QueryTranslator

Translator for PubMed queries.

classmethod to_generic_syntax(query: Query) Query

Convert the query to a generic syntax.

classmethod to_specific_syntax(query: Query) Query

Convert the query to the PubMed-specific syntax.

classmethod translate_fields_to_generic(query: Query) Query

Translate search fields to generic fields.

Module contents

Top-level package for PubMed.