Linter

Linters validate query strings or query lists before execution. They analyze token sequences, syntax, search fields, and operator use to identify errors or ambiguities and print meaningful messages (documented in the messages section). Each platform implements its own linter, which inherits from one of the base classes defined in linter_base.py. Linters are invoked by the parser methods.

Base Classes

Use the appropriate base class when developing a new linter:

  • QueryStringLinter: for single query strings

  • QueryListLinter: for list-based query formats

Each linter must override the validate_tokens() and validate_query_tree() methods. validate_tokens() is called when the query is parsed, and validate_query_tree() is called when the query tree is built (i.e., at the end of the parsing process and when the query is constructed programmatically).

Best Practices

  • Use standardized linter messages defined in constants.QueryErrorCode.

  • Add details in messages for user guidance (e.g., invalid format, missing logic).

  • Ensure valid token sequences using the VALID_TOKEN_SEQUENCES dictionary.

  • Consider using utility methods provided by linter_base.py, such as:

      - check_unbalanced_parentheses()
      - check_unknown_token_types()
      - add_artificial_parentheses_for_operator_precedence()
      - check_invalid_characters_in_search_term(chars)
      - check_operator_capitalization()

  • For search field validation, use a corresponding field mapping and helper functions like map_to_standard().

import typing

from search_query.constants import QueryErrorCode
from search_query.constants import Token
from search_query.constants import TokenTypes
from search_query.linter_base import QueryStringLinter

if typing.TYPE_CHECKING:
    from search_query.query import Query


class XYQueryStringLinter(QueryStringLinter):
    """Linter for XY query strings"""

    VALID_TOKEN_SEQUENCES = {
        TokenTypes.FIELD: [TokenTypes.SEARCH_TERM],
        TokenTypes.SEARCH_TERM: [
            TokenTypes.LOGIC_OPERATOR,
            TokenTypes.PARENTHESIS_CLOSED,
        ],
        TokenTypes.LOGIC_OPERATOR: [
            TokenTypes.SEARCH_TERM,
            TokenTypes.PARENTHESIS_OPEN,
        ],
        # ...
    }

    def validate_tokens(
        self,
        *,
        tokens: typing.List[Token],
        query_str: str,
        search_field_general: str = "",
    ) -> typing.List[Token]:
        """Main validation routine"""

        self.tokens = tokens
        self.query_str = query_str
        self.search_field_general = search_field_general

        self.check_unbalanced_parentheses()
        self.check_unknown_token_types()
        self.check_invalid_token_sequences()
        self.check_operator_capitalization()

        # custom validation

        return self.tokens

    def check_invalid_token_sequences(self) -> None:
        """Report every token that may not follow its predecessor"""
        # Use the tokens stored by validate_tokens()
        for i, token in enumerate(self.tokens[:-1]):
            expected = self.VALID_TOKEN_SEQUENCES.get(token.type, [])
            if self.tokens[i + 1].type not in expected:
                self.add_linter_message(
                    QueryErrorCode.INVALID_TOKEN_SEQUENCE,
                    position=self.tokens[i + 1].position,
                    details=f"Unexpected token after {token.type}",
                )

    def validate_query_tree(self, query: "Query") -> None:
        """
        Validate the query tree.
        This method is called after the query tree has been built.
        """

        self.check_quoted_search_terms_query(query)
        self.check_operator_capitalization_query(query)
        self.check_invalid_characters_in_search_term_query(query, "@&%$^~\\<>{}()[]#")
        self.check_unsupported_search_fields_in_query(query)
        # term_field_query = self.get_query_with_fields_at_terms(query)
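The token-sequence check in the example above can also be tried in isolation. The following is a minimal sketch that does not import search_query: the TokenTypes enum and Token dataclass are simplified stand-ins, the entries for parentheses in VALID_TOKEN_SEQUENCES are assumptions added for illustration, and find_invalid_sequences() is a hypothetical helper that returns its findings instead of registering linter messages.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Tuple


class TokenTypes(Enum):
    """Simplified stand-in for the library's token types (illustrative only)."""

    FIELD = "FIELD"
    SEARCH_TERM = "SEARCH_TERM"
    LOGIC_OPERATOR = "LOGIC_OPERATOR"
    PARENTHESIS_OPEN = "PARENTHESIS_OPEN"
    PARENTHESIS_CLOSED = "PARENTHESIS_CLOSED"


@dataclass
class Token:
    """Simplified stand-in for a parsed token."""

    value: str
    type: TokenTypes
    position: Tuple[int, int]


# Maps a token type to the token types that may directly follow it.
# The two parenthesis entries are assumptions for this sketch.
VALID_TOKEN_SEQUENCES: Dict[TokenTypes, List[TokenTypes]] = {
    TokenTypes.FIELD: [TokenTypes.SEARCH_TERM],
    TokenTypes.SEARCH_TERM: [
        TokenTypes.LOGIC_OPERATOR,
        TokenTypes.PARENTHESIS_CLOSED,
    ],
    TokenTypes.LOGIC_OPERATOR: [
        TokenTypes.SEARCH_TERM,
        TokenTypes.PARENTHESIS_OPEN,
    ],
    TokenTypes.PARENTHESIS_OPEN: [
        TokenTypes.SEARCH_TERM,
        TokenTypes.PARENTHESIS_OPEN,
    ],
    TokenTypes.PARENTHESIS_CLOSED: [
        TokenTypes.LOGIC_OPERATOR,
        TokenTypes.PARENTHESIS_CLOSED,
    ],
}


def find_invalid_sequences(tokens: List[Token]) -> List[Tuple[Tuple[int, int], str]]:
    """Return (position, details) for each token that may not follow its predecessor."""
    problems = []
    for current, following in zip(tokens, tokens[1:]):
        expected = VALID_TOKEN_SEQUENCES.get(current.type, [])
        if following.type not in expected:
            problems.append(
                (following.position, f"Unexpected token after {current.type.name}")
            )
    return problems


# "digital AND OR health": two logic operators in a row
tokens = [
    Token("digital", TokenTypes.SEARCH_TERM, (0, 7)),
    Token("AND", TokenTypes.LOGIC_OPERATOR, (8, 11)),
    Token("OR", TokenTypes.LOGIC_OPERATOR, (12, 14)),
    Token("health", TokenTypes.SEARCH_TERM, (15, 21)),
]
problems = find_invalid_sequences(tokens)
print(problems)
```

Only the second operator is reported, because it is the one token whose type is not listed as a valid successor of the token before it.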

Strict vs. Non-Strict Mode

In non-strict mode (mode="lenient"), linters report errors but do not raise exceptions. In strict mode (mode="strict"), any linter message causes an exception to be raised, which is useful for automated pipelines or validation gates.
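The difference between the two modes can be sketched with a small message collector. This is an illustration of the behavior described above, not the library's implementation: MessageCollector, LinterError, add() and finish() are hypothetical names; only the mode values "lenient" and "strict" come from the text.

```python
from dataclasses import dataclass, field
from typing import List


class LinterError(Exception):
    """Hypothetical exception raised in strict mode."""


@dataclass
class MessageCollector:
    """Hypothetical collector that gathers linter messages for one query."""

    mode: str = "lenient"
    messages: List[str] = field(default_factory=list)

    def add(self, message: str) -> None:
        # Messages are always recorded, regardless of the mode
        self.messages.append(message)

    def finish(self) -> List[str]:
        # Strict mode: any recorded message aborts with an exception
        if self.mode == "strict" and self.messages:
            raise LinterError("; ".join(self.messages))
        # Lenient mode: the messages are only reported
        return self.messages


lenient = MessageCollector(mode="lenient")
lenient.add("unbalanced-parentheses")
reported = lenient.finish()  # returns the message list, no exception

strict = MessageCollector(mode="strict")
strict.add("unbalanced-parentheses")
try:
    strict.finish()
    strict_raised = False
except LinterError:
    strict_raised = True
```

In a validation gate, the strict variant lets the pipeline stop on the first query that produces any message, while the lenient variant is better suited to interactive use where the messages are shown to the user.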

Search Field Validation in Strict vs. Non-Strict Modes

| Search-Field required | Search String        | Search-Field                 | Mode: Strict                       | Mode: Non-Strict                                                       |
|-----------------------|----------------------|------------------------------|------------------------------------|------------------------------------------------------------------------|
| Yes                   | With Search-Field    | Empty                        | ok                                 | ok                                                                     |
| Yes                   | With Search-Field    | Equal to Search-String       | ok - search-field-redundant        | ok                                                                     |
| Yes                   | With Search-Field    | Different from Search-String | error: search-field-contradiction  | ok - search-field-contradiction. Parser uses Search-String per default |
| Yes                   | Without Search-Field | Empty                        | error: search-field-missing        | ok - search-field-missing. Parser adds title as the default            |
| Yes                   | Without Search-Field | Given                        | ok - search-field-extracted        | ok                                                                     |
| No                    | With Search-Field    | Empty                        | ok                                 | ok                                                                     |
| No                    | With Search-Field    | Equal to Search-String       | ok - search-field-redundant        | ok                                                                     |
| No                    | With Search-Field    | Different from Search-String | error: search-field-contradiction  | ok - search-field-contradiction. Parser uses Search-String per default |
| No                    | Without Search-Field | Empty                        | ok - search-field-not-specified    | ok - Parser uses default of database                                   |
| No                    | Without Search-Field | Given                        | ok - search-field-extracted        | ok                                                                     |
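The ten cases above reduce to a few nested decisions, which the following sketch encodes as a plain function. It is an illustration of the documented outcomes only: lint_search_field(), Outcome and the parameter names are hypothetical and not part of the library, and the message strings are the labels used in the listing.

```python
from typing import NamedTuple, Optional


class Outcome(NamedTuple):
    ok: bool                 # False corresponds to "error" in the listing
    message: Optional[str]   # linter message label, if any


def lint_search_field(
    *,
    required: bool,                  # does the platform require a search field?
    field_in_string: Optional[str],  # field found inside the search string
    general_field: Optional[str],    # field supplied separately from the string
    strict: bool,
) -> Outcome:
    """Hypothetical helper reproducing the documented outcomes."""
    if field_in_string:
        if not general_field:
            return Outcome(True, None)
        if general_field == field_in_string:
            # Reported only in strict mode
            return Outcome(True, "search-field-redundant" if strict else None)
        # Contradiction: error in strict mode, otherwise the field
        # from the search string is used
        return Outcome(not strict, "search-field-contradiction")

    # The search string contains no search field
    if general_field:
        return Outcome(True, "search-field-extracted" if strict else None)
    if required:
        # Non-strict mode falls back to title as the default
        return Outcome(not strict, "search-field-missing")
    # Non-strict mode relies on the default of the database
    return Outcome(True, "search-field-not-specified" if strict else None)


contradiction = lint_search_field(
    required=True, field_in_string="ti", general_field="ab", strict=True
)
missing_lenient = lint_search_field(
    required=True, field_in_string=None, general_field=None, strict=False
)
print(contradiction, missing_lenient)
```

Whether a field is required only matters when neither the search string nor the separate setting provides one; in all other cases the outcome depends solely on how the two values compare.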