CodeChunkTypes = ImportChunk | FunctionChunk | ClassChunk module-attribute

logger = logging.getLogger(__name__) module-attribute

DatabaseLoaderProtocol

Bases: Protocol

Load additional information from the database for the search results.

For example, the content when a result stores only a location, or additional metadata.

__call__(uow, results) async
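As a rough illustration of how an async callable can satisfy a protocol like this one, the sketch below defines a hypothetical `DatabaseLoader` protocol and a loader that fills in missing content. The result shape (plain dicts with `location` and `content` keys), the return type, and the dict used in place of a unit of work are assumptions made for the example, not the library's actual types.

```python
import asyncio
from typing import Any, Protocol


class DatabaseLoader(Protocol):
    """Hypothetical stand-in for DatabaseLoaderProtocol (return type assumed)."""

    async def __call__(self, uow: Any, results: list[dict]) -> list[dict]: ...


async def load_content_from_locations(uow: Any, results: list[dict]) -> list[dict]:
    """Fill in `content` for results that only carry a `location`."""
    loaded = []
    for result in results:
        if result.get("content") is None:
            # The unit of work is modelled as a dict mapping location -> text.
            result = {**result, "content": uow[result["location"]]}
        loaded.append(result)
    return loaded


# A static type checker accepts this because the call signatures match.
loader: DatabaseLoader = load_content_from_locations

fake_uow = {"src/app.py:1-3": "import os"}
loaded_results = asyncio.run(
    loader(fake_uow, [{"location": "src/app.py:1-3", "content": None}])
)
```

Because the protocol is structural, any async function or object with a matching `__call__` can be passed in without inheriting from it.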

LoadContentMode

Bases: Enum

The mode to load content in for search results.

CHUNK = auto() class-attribute instance-attribute

FULL_FILE = auto() class-attribute instance-attribute

NONE = auto() class-attribute instance-attribute
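A minimal sketch of how a mode enum like this might be dispatched on. The enum is redefined locally so the example runs on its own, and the meaning given to each member (chunk text, whole file text, or nothing) is an interpretation of the member names rather than confirmed behavior; `content_for` is a hypothetical helper.

```python
from enum import Enum, auto
from typing import Optional


class LoadContentMode(Enum):
    """Local copy of the documented members, for illustration only."""

    CHUNK = auto()
    FULL_FILE = auto()
    NONE = auto()


def content_for(mode: LoadContentMode, chunk_text: str, file_text: str) -> Optional[str]:
    """Pick which text to attach to a result (assumed semantics)."""
    if mode is LoadContentMode.CHUNK:
        return chunk_text
    if mode is LoadContentMode.FULL_FILE:
        return file_text
    return None  # LoadContentMode.NONE: attach no content


chunk_choice = content_for(LoadContentMode.CHUNK, "def f(): ...", "import os\ndef f(): ...")
none_choice = content_for(LoadContentMode.NONE, "def f(): ...", "import os\ndef f(): ...")
```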

QueryValidationProtocol

Bases: Protocol

__call__(query_text, max_results, query_type, offset)

ResultFilterProtocol

Bases: Protocol

__call__(results)
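The following sketch shows one way a plain function could act as a result filter. The `ResultFilter` protocol, the dict-based result shape, the `score` key, and the 0.5 threshold are all hypothetical; the documented signature only tells us that the callable takes `results` and is synchronous.

```python
from typing import Protocol


class ResultFilter(Protocol):
    """Hypothetical stand-in for ResultFilterProtocol (return type assumed)."""

    def __call__(self, results: list[dict]) -> list[dict]: ...


def drop_low_scores(results: list[dict]) -> list[dict]:
    """Keep only results whose score reaches an arbitrary example threshold."""
    return [r for r in results if r["score"] >= 0.5]


result_filter: ResultFilter = drop_low_scores
kept = result_filter([{"id": "p1", "score": 0.9}, {"id": "p2", "score": 0.2}])
```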

ValidatedQueryArgs

Bases: BaseModel

max_results instance-attribute

offset instance-attribute

query_text instance-attribute

query_type instance-attribute

query_vector_name instance-attribute
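To make the relationship between the validation callable and the validated arguments more concrete, here is a sketch that uses a frozen dataclass in place of the real BaseModel. The field types, the default of 10 results, the specific checks, and the `"<query_type>_vector"` naming scheme are assumptions chosen for the example; the source lists only the field names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ValidatedArgs:
    """Dataclass stand-in for ValidatedQueryArgs (field types assumed)."""

    query_text: str
    max_results: int
    query_type: str
    offset: int
    query_vector_name: str


def validate_query(
    query_text: str, max_results: Optional[int], query_type: str, offset: int
) -> ValidatedArgs:
    """Hypothetical validator matching the documented argument names."""
    text = query_text.strip()
    if not text:
        raise ValueError("query_text must not be empty")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    limit = 10 if max_results is None else max_results  # default is an assumption
    if limit <= 0:
        raise ValueError("max_results must be positive")
    return ValidatedArgs(
        query_text=text,
        max_results=limit,
        query_type=query_type,
        offset=offset,
        query_vector_name=f"{query_type}_vector",  # naming scheme is an assumption
    )


args = validate_query("  parse config ", None, "code", 0)
```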

fill_location_contents(read_only_uow, locations) async

Load the contents of the given locations from the database and return only the text contents.

get_num_points(uow, filter_args) async

Get the number of points that match the given filter.

load_full_chunk_infos_for_results(uow, results) async

Helper function that loads additional chunk info from the db for Qdrant search results.

Note: Works even if multiple search results share the same chunk (e.g., different embedded parts of the same function).
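The note about shared chunks can be illustrated with a small sketch: collect the distinct chunk ids, fetch each once, then attach the info to every result that refers to it. The `fetch_chunk_infos` lookup, the `chunk_id` and `point_id` keys, and the dict-based results are hypothetical and only stand in for whatever the real helper does internally.

```python
import asyncio


async def fetch_chunk_infos(chunk_ids: set[str]) -> dict[str, dict]:
    """Hypothetical db lookup returning one record per distinct chunk id."""
    return {cid: {"chunk_id": cid, "name": f"func_{cid}"} for cid in chunk_ids}


async def attach_chunk_infos(results: list[dict]) -> list[dict]:
    # Collect distinct ids so a chunk shared by several results is fetched once.
    chunk_ids = {r["chunk_id"] for r in results}
    infos = await fetch_chunk_infos(chunk_ids)
    # Every result gets its chunk's info, including results that share a chunk.
    return [{**r, "chunk_info": infos[r["chunk_id"]]} for r in results]


results = [
    {"point_id": "p1", "chunk_id": "c1"},  # e.g. one embedded part of a function
    {"point_id": "p2", "chunk_id": "c1"},  # e.g. another part of the same function
    {"point_id": "p3", "chunk_id": "c2"},
]
enriched = asyncio.run(attach_chunk_infos(results))
```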

scroll_all_points(uow, filter_args=None, num_per_iteration=100, load_additional_chunk_info=False) async

Scroll through all points that match the given filter.

PARAMETER DESCRIPTION
uow

The unit of work to use for the search.

TYPE: GithubSearchUOW

filter_args

The filter to use for the scroll.

TYPE: GithubFilterArgs | None DEFAULT: None

num_per_iteration

The number of results to return in each iteration.

TYPE: int DEFAULT: 100

load_additional_chunk_info

Whether to load additional metadata from the db for the search results.

TYPE: bool DEFAULT: False
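One plausible way to consume a batched scroll is with `async for`, sketched below. This assumes the function behaves like an async generator that yields one batch per iteration, which the source does not confirm; `scroll_all` and the in-memory `POINTS` list are hypothetical stand-ins for the real function and vector store.

```python
import asyncio
from typing import AsyncIterator

POINTS = [f"point-{i}" for i in range(250)]  # stand-in for the vector store


async def scroll_all(num_per_iteration: int = 100) -> AsyncIterator[list[str]]:
    """Hypothetical stand-in: yield matching points one batch at a time."""
    for start in range(0, len(POINTS), num_per_iteration):
        await asyncio.sleep(0)  # placeholder for the real network round trip
        yield POINTS[start : start + num_per_iteration]


async def collect_batch_sizes() -> list[int]:
    sizes = []
    async for batch in scroll_all(num_per_iteration=100):
        sizes.append(len(batch))
    return sizes


batch_sizes = asyncio.run(collect_batch_sizes())
```

Processing each batch as it arrives keeps memory bounded by `num_per_iteration` rather than by the total number of matching points.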

scroll_points(uow, filter_args=None, limit=100, from_point_id=None, load_additional_chunk_info=False) async

Scroll through the points that match the given filter, returning at most limit points per call and starting after from_point_id.

PARAMETER DESCRIPTION
uow

The unit of work to use for the search.

TYPE: GithubSearchUOW

filter_args

The filter to use for the scroll.

TYPE: GithubFilterArgs | None DEFAULT: None

limit

The maximum number of results to return.

TYPE: int DEFAULT: 100

from_point_id

The id of the last point from the previous scroll; results start after it (the point itself is not included).

TYPE: str | None DEFAULT: None

load_additional_chunk_info

Whether to load additional metadata from the db for the search results.

TYPE: bool DEFAULT: False
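The cursor semantics of from_point_id (the cursor point itself is excluded) suggest a paging loop like the one sketched here. `scroll_page` is a hypothetical in-memory stand-in that mimics only the limit and from_point_id parameters; the real function's return type and ordering guarantees are not described in the source.

```python
import asyncio
from typing import Optional

POINT_IDS = ["a", "b", "c", "d", "e"]  # stand-in for stored point ids


async def scroll_page(limit: int = 100, from_point_id: Optional[str] = None) -> list[str]:
    """Hypothetical single scroll call; the cursor point is not returned again."""
    start = 0 if from_point_id is None else POINT_IDS.index(from_point_id) + 1
    return POINT_IDS[start : start + limit]


async def scroll_everything(limit: int) -> list[list[str]]:
    pages, cursor = [], None
    while True:
        page = await scroll_page(limit=limit, from_point_id=cursor)
        if not page:
            break  # an empty page means nothing is left after the cursor
        pages.append(page)
        cursor = page[-1]  # last id seen; the next page starts after it
    return pages


pages = asyncio.run(scroll_everything(limit=2))
```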

search(uow, query_text, query_type, filter_args, max_results=None, result_index_offset=0, load_full_chunk_info=False) async

Direct search results for GitHub content.

Returns the search results found in Qdrant, with contents loaded from the db where necessary. Does not return additional metadata for the search results that is stored only in the db.

PARAMETER DESCRIPTION
uow

The unit of work to use for the search.

TYPE: GithubSearchUOW

query_text

The text to search for.

TYPE: str

query_type

The type of query to perform (e.g. "text" or "code"). Determines which vectors to search against.

TYPE: LiteralQueryType

filter_args

The filter arguments to use for the search.

TYPE: GithubFilterArgs

max_results

The maximum number of results to return.

TYPE: int | None DEFAULT: None

result_index_offset

The offset to start returning results from (for paginating results).

TYPE: int DEFAULT: 0

load_full_chunk_info

Whether to load additional metadata from the db for the search results.

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
GithubSearchResults

The search results as found in Qdrant, i.e., the content and metadata for each point without additional processing.
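Offset-based pagination with max_results and result_index_offset could look roughly like the sketch below. `fake_search` is a hypothetical in-memory stand-in that models only those two parameters; in particular, treating max_results=None as "return everything from the offset onward" is an assumption, not documented behavior.

```python
import asyncio
from typing import Optional

RANKED_HITS = [f"hit-{i}" for i in range(7)]  # stand-in for ranked search hits


async def fake_search(
    max_results: Optional[int] = None, result_index_offset: int = 0
) -> list[str]:
    """Hypothetical stand-in modelling only the two pagination parameters."""
    end = None if max_results is None else result_index_offset + max_results
    return RANKED_HITS[result_index_offset:end]


async def fetch_pages(page_size: int) -> list[list[str]]:
    pages, offset = [], 0
    while True:
        page = await fake_search(max_results=page_size, result_index_offset=offset)
        if not page:
            break
        pages.append(page)
        offset += len(page)  # the next page starts where this one ended
    return pages


search_pages = asyncio.run(fetch_pages(page_size=3))
```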

search_file_results(uow, query_text, query_type, filter_args, max_results=None, load_full_chunk_info=False) async

Return whole files of any search results.

Note: This does not return additional metadata for the search results that is stored only in the db (only the basic metadata stored in Qdrant).

PARAMETER DESCRIPTION
uow

The unit of work to use for the search.

TYPE: GithubSearchUOW

query_text

The text to search for.

TYPE: str

query_type

The type of query to perform (e.g. "text" or "code"). Determines which vectors to search against.

TYPE: LiteralQueryType

filter_args

The filter arguments to use for the search.

TYPE: GithubFilterArgs

max_results

The maximum number of results to return.

TYPE: int | None DEFAULT: None

load_full_chunk_info

Whether to load additional metadata from the db for the search results.

TYPE: bool DEFAULT: False
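Returning whole files for chunk-level hits raises the question of what happens when several hits come from the same file. The sketch below shows one possible approach, returning each file once in the order of its best-ranked hit. Whether the real function deduplicates this way is not stated in the source; `files_in_rank_order`, `load_files`, the `file_path` key, and the dict-based file store are hypothetical.

```python
def files_in_rank_order(hits: list[dict]) -> list[str]:
    """Return each hit's file path once, at the position of its best-ranked hit."""
    # dict.fromkeys keeps first-seen order and drops later duplicates.
    return list(dict.fromkeys(hit["file_path"] for hit in hits))


def load_files(paths: list[str], file_store: dict[str, str]) -> list[dict]:
    """Pair each path with its full text from a stand-in file store."""
    return [{"file_path": p, "content": file_store[p]} for p in paths]


hits = [
    {"file_path": "src/a.py", "chunk": "def parse(): ..."},
    {"file_path": "src/b.py", "chunk": "class Config: ..."},
    {"file_path": "src/a.py", "chunk": "import json"},
]
file_store = {"src/a.py": "import json\ndef parse(): ...", "src/b.py": "class Config: ..."}
file_results = load_files(files_in_rank_order(hits), file_store)
```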

vector_stats_for_repo(uow, repo_id, user_id, iter_size=100) async

Get some basic stats about what is vectorized for the given repository.

Note: This streams back the results as they are found, updating a single result object in place.

That is, it can show updates as they are found, but collecting the yielded results is not useful because they are all the same object.

PARAMETER DESCRIPTION
uow

The unit of work to use for the search.

TYPE: GithubSearchUOW

repo_id

The repository to get the stats for.

TYPE: RepoID

user_id

The user (for permissions to access repo info).

TYPE: UserID | Literal['public']

iter_size

The number of results to scroll through at a time.

TYPE: int DEFAULT: 100
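The in-place streaming behavior described in the note above has a practical consequence that a small sketch can make visible: appending each yielded object to a list gives a list of identical references, so a copy is needed to keep intermediate values. The `Stats` dataclass, its `num_points` field, and `stream_stats` are hypothetical stand-ins; only the "same object, mutated in place" pattern is taken from the note.

```python
import asyncio
import copy
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class Stats:
    """Hypothetical stats object with a single running counter."""

    num_points: int = 0


async def stream_stats(batches: list[int]) -> AsyncIterator[Stats]:
    stats = Stats()
    for batch_size in batches:
        stats.num_points += batch_size  # update the one shared object in place
        yield stats  # the same object is yielded every time


async def demo() -> tuple[list[int], list[int]]:
    collected, snapshots = [], []
    async for stats in stream_stats([100, 100, 50]):
        collected.append(stats)  # every entry is the same object
        snapshots.append(copy.copy(stats))  # a copy freezes the current values
    return [s.num_points for s in collected], [s.num_points for s in snapshots]


collected_counts, snapshot_counts = asyncio.run(demo())
```

If only the final totals matter, iterating to the end and reading the last yielded object is enough; copies are needed only to keep a history of intermediate values.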