Combined github services
Composes the services related to github.
I.e., combines the parsing services (which interact with the github API, file parsing, and the database) with the github search services (which interact with qdrant and the database).
2024-12-12 -- Ideally, this is where an event bus and message passing should be used instead. I.e., the individual services should be able to communicate with each other and coordinate via message passing, but for now we'll just use this to coordinate actions between the services.
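As a rough illustration of the event bus idea in the note above, here is a minimal in-process sketch. Everything in it is hypothetical: `EventBus`, `RepoParsed`, and `index_for_search` are illustrative names that do not exist in this module, and a real setup would likely use a proper message broker rather than an in-memory dictionary.

```python
import asyncio
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Awaitable, Callable


@dataclass(frozen=True)
class RepoParsed:
    """Hypothetical event: the parsing service finished a repository."""

    repo_id: str
    file_count: int


Handler = Callable[[Any], Awaitable[None]]


class EventBus:
    """Tiny in-process bus: handlers subscribe by event type."""

    def __init__(self) -> None:
        self._handlers: dict[type, list[Handler]] = defaultdict(list)

    def subscribe(self, event_type: type, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    async def publish(self, event: Any) -> None:
        # Run every handler registered for this exact event type concurrently.
        await asyncio.gather(*(h(event) for h in self._handlers[type(event)]))


async def demo() -> list[str]:
    bus = EventBus()
    indexed: list[str] = []

    async def index_for_search(event: RepoParsed) -> None:
        # Stand-in for the search service reacting to the parsing service.
        indexed.append(f"{event.repo_id}:{event.file_count}")

    bus.subscribe(RepoParsed, index_for_search)
    await bus.publish(RepoParsed(repo_id="owner/repo", file_count=3))
    return indexed
```

With this shape, the parsing service would only publish events, and the search service would only subscribe to them, so neither needs to import the other.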
delete_github_repo_from_search(github_parsing_uow, github_search_uow, repo_id, user_id, wait=False)
async
Remove a github repository from the search database.
| PARAMETER | DESCRIPTION |
|---|---|
| github_parsing_uow | Unit of work for the database operations. |
| github_search_uow | Unit of work for github searching (i.e. qdrant parts). |
| repo_id | The id of the repository to be removed. |
| user_id | ID of user for this repository (or None if it's a public repo). |
| wait | Whether to wait for the qdrant vectorization to finish before returning. |
| RETURNS | DESCRIPTION |
|---|---|
| tuple[list[ParsedGithubFile], UpdateResult] | The status of the deletion. |
initialize_github_repo_for_search(github_parsing_uow, github_search_uow, repo_id, user_id, wait=False, as_repo_id=None)
async
First time setup for a github repository to be searched.
I.e., loads the whole repo, vectorizes all (or a specific subset) of its files, and adds them to the database and the qdrant vectorstore.
| PARAMETER | DESCRIPTION |
|---|---|
| github_parsing_uow | Unit of work for combined GitHub adapter/parsing/database operations. |
| github_search_uow | Unit of work for github searching (i.e. qdrant parts). |
| repo_id | The id of the repository to be initialized. |
| user_id | ID of user for this repository (or None if it's a public repo). |
| wait | Whether to wait for the qdrant vectorization to finish before returning. |
| as_repo_id | If provided, replace the repo_id with this value in the database. |
| RETURNS | DESCRIPTION |
|---|---|
| tuple[UpdateStatus, list[ParsedGithubFile]] | Tuple of status and parsed github files: The files that were saved to the main database. Note: Data is further chunked before adding to qdrant. |
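The note about chunking means the number of returned files will generally differ from the number of entries stored in qdrant. The actual chunking strategy is not documented here; the sketch below assumes a simple fixed-size character window with overlap, purely for illustration. `chunk_text` and its parameters are hypothetical names, not part of this module.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping fixed-size character windows (assumed strategy)."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks
```

Under this assumption, one parsed file yields one database row but possibly many vectors, one per chunk.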
update_github_repo_for_search(github_parsing_uow, github_search_uow, repo_id, user_id, from_repo_id=None, as_repo_id=None, max_fill_size=10000, wait=False)
async
Update an already initialized github repository for search.
I.e., finds any changes from new commits and updates the database and vectorstore entries accordingly.
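How changes are detected is not described here. One common approach, shown below as a sketch under that assumption, is to compare a mapping of file path to blob SHA from the previously indexed commit against the same mapping from the latest commit. `FileChanges` and `diff_snapshots` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class FileChanges:
    """Paths grouped by what an update step would have to do with them."""

    added: list[str] = field(default_factory=list)
    modified: list[str] = field(default_factory=list)
    removed: list[str] = field(default_factory=list)


def diff_snapshots(old: dict[str, str], new: dict[str, str]) -> FileChanges:
    """Compare two {path: blob_sha} snapshots of a repository."""
    changes = FileChanges()
    for path, sha in new.items():
        if path not in old:
            changes.added.append(path)      # needs parsing and vectorizing
        elif old[path] != sha:
            changes.modified.append(path)   # needs re-parsing and re-vectorizing
    # Paths that disappeared must be deleted from both stores.
    changes.removed = [path for path in old if path not in new]

    changes.added.sort()
    changes.modified.sort()
    changes.removed.sort()
    return changes
```

For example, `diff_snapshots({"a.py": "1", "b.py": "2"}, {"a.py": "1", "b.py": "9"})` reports only `b.py` as modified, so unchanged files would not be re-vectorized.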