init

Parsing of whole repositories of data.

That is, this module combines file parsing with supporting components such as the GitHub adapter and database storage.

Does NOT include storing or searching vectorized data (i.e. the Qdrant integration).

__all__ = ['AbstractGithubParsingUnitOfWork', 'ParsedCodeFileID', 'ParsedFileID', 'ParsedTextFileID', 'SQLGithubParsingUnitOfWork'] module-attribute

AbstractGithubParsingUnitOfWork

Bases: AbstractDatabaseUnitOfWork, ABC

github_adapter abstractmethod property

repository instance-attribute

ParsedCodeFileID

Bases: ParsedFileID

ParsedFileID

Bases: UniqueIDBase[ParsedGithubFile], ABC

branch instance-attribute

node_id instance-attribute

owner instance-attribute

path instance-attribute

repo instance-attribute

equal_excluding_version_and_deletion(other)

from_schema(schema, exact_version=False) classmethod
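
The listing above gives only names, so the following is a minimal sketch of how an identifier with these attributes might compare while ignoring version and deletion state. The class name `FileIDSketch` and the fields `version` and `deleted` are assumptions inferred from the method name; they are not taken from the library, and the real `ParsedFileID` may store or compare these values differently.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class FileIDSketch:
    """Hypothetical stand-in for ParsedFileID; not the library's class."""

    # Attributes listed in the reference above.
    owner: str
    repo: str
    branch: str
    path: str
    node_id: str
    # Assumed fields, inferred only from the method name.
    version: Optional[int] = None
    deleted: bool = False

    def equal_excluding_version_and_deletion(self, other: "FileIDSketch") -> bool:
        """Compare every field except the assumed version and deletion fields."""
        if type(self) is not type(other):
            return False
        ignored = {"version", "deleted"}
        return all(
            getattr(self, f.name) == getattr(other, f.name)
            for f in fields(self)
            if f.name not in ignored
        )


# Two IDs for the same file at different versions, one marked deleted.
first = FileIDSketch("octo", "demo", "main", "src/app.py", "n1", version=1)
second = FileIDSketch("octo", "demo", "main", "src/app.py", "n1", version=2, deleted=True)
other_path = FileIDSketch("octo", "demo", "main", "src/lib.py", "n2", version=1)

strict_equal = first == second
loose_equal = first.equal_excluding_version_and_deletion(second)
different_file = first.equal_excluding_version_and_deletion(other_path)
```

Under these assumptions, strict equality distinguishes the two versions, while the relaxed comparison treats them as the same file.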

ParsedTextFileID

Bases: ParsedFileID

SQLGithubParsingUnitOfWork

Bases: SessionDatabaseUnitOfWork, AbstractGithubParsingUnitOfWork

github_adapter property

repository_class property

__init__(session_maker, github_adapter)
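
The constructor signature suggests a unit of work that is handed a session factory and a GitHub adapter. As a rough illustration of that pattern, the sketch below uses stand-in classes only. `FakeSession`, `FakeGithubAdapter`, `fetch_file`, and the context-manager behaviour are all assumptions made for illustration; the reference above does not confirm how `SQLGithubParsingUnitOfWork` manages its session or what methods the adapter exposes.

```python
from abc import ABC, abstractmethod


class FakeSession:
    """Hypothetical stand-in for a database session."""

    def __init__(self):
        self.committed = False
        self.rolled_back = False
        self.closed = False

    def commit(self):
        self.committed = True

    def rollback(self):
        self.rolled_back = True

    def close(self):
        self.closed = True


class FakeGithubAdapter:
    """Hypothetical adapter; fetch_file is an invented method."""

    def fetch_file(self, path):
        return f"contents of {path}"


class AbstractUnitOfWorkSketch(ABC):
    @property
    @abstractmethod
    def github_adapter(self):
        """Concrete subclasses decide how the adapter is supplied."""


class SQLUnitOfWorkSketch(AbstractUnitOfWorkSketch):
    def __init__(self, session_maker, github_adapter):
        # session_maker is any zero-argument callable returning a session.
        self._session_maker = session_maker
        self._github_adapter = github_adapter
        self.session = None

    @property
    def github_adapter(self):
        return self._github_adapter

    def __enter__(self):
        # Assumed behaviour: open one session per unit of work.
        self.session = self._session_maker()
        return self

    def __exit__(self, exc_type, exc, tb):
        try:
            if exc_type is None:
                self.session.commit()
            else:
                self.session.rollback()
        finally:
            self.session.close()
        return False  # never swallow exceptions


uow = SQLUnitOfWorkSketch(FakeSession, FakeGithubAdapter())
with uow:
    text = uow.github_adapter.fetch_file("README.md")
```

Passing the adapter in through the constructor keeps the GitHub access and the database session replaceable in tests, which is the usual motivation for this arrangement.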