init

__all__ = ['Changes', 'GithubAdapter', 'MiniBlob', 'Repositories', 'RepositoryInfo', 'Tree', 'TreeEntry', 'check_api_key_valid', 'get_files_for_node_ids', 'get_filled_blobs', 'get_full_repo', 'get_full_repo_structure', 'get_github_adapter', 'get_owner_repositories', 'get_repo_branches', 'get_repo_changes', 'get_specific_file'] module-attribute

Changes

Bases: BaseModel

changed_blobs = Field(default_factory=list) class-attribute instance-attribute

deleted_blobs = Field(default_factory=list) class-attribute instance-attribute

new_blobs = Field(default_factory=list) class-attribute instance-attribute

renamed_blobs = Field(default_factory=list) class-attribute instance-attribute

unchanged = Field(default_factory=list) class-attribute instance-attribute

__add__(other)

__str__()

clear()

GithubAdapter

Adapter around the githubkit.GitHub class.

The githubkit.GitHub client is capable, but it exposes far more than this project needs. This adapter defines the boundary of what code internal to the project may use. It also protects against future changes to the client and provides a place for helper methods.

client = client instance-attribute

__init__(client)

get_basic_owner_info(login) async

Get basic info about an owner (user or org) by their login.

Note: login is case-insensitive, but the returned login will have the correct case.

get_branch_comparison(repo_id, base_ref, head_ref=None) async

Compare two branches in a repository (via graphQL).

Note: GraphQL does not support comparing two commits. It is, however, possible to compare a base branch to a later commit.

PARAMETER DESCRIPTION
repo_id

The owner/repo for the comparison (its branch is used as head_ref if head_ref is None).

TYPE: RepoID

base_ref

The ref to compare against. Note: this MUST be a branch or tag, not a commit SHA.

TYPE: str

head_ref

The later ref, which can be a commit SHA (defaults to the branch in repo_id).

TYPE: str | None DEFAULT: None
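
The ref rules above suggest a small guard before picking a comparison path. The following is a minimal sketch, not part of the documented API: `looks_like_commit_sha` and `choose_comparison` are hypothetical helpers, and the hex-length heuristic is an assumption (a branch that happens to be named like a hex string, such as "deadbeef", would be misclassified).

```python
import re

# Heuristic only: 7 to 40 hex characters is treated as a (possibly truncated) SHA.
_SHA_PATTERN = re.compile(r"[0-9a-fA-F]{7,40}")


def looks_like_commit_sha(ref: str) -> bool:
    """Return True if the ref looks like a full or truncated commit SHA."""
    return _SHA_PATTERN.fullmatch(ref) is not None


def choose_comparison(base_ref: str) -> str:
    """Pick a comparison path based on the base ref.

    The GraphQL comparison needs a branch or tag as its base, so a base that
    looks like a commit SHA is routed to the REST comparison instead.
    """
    return "rest" if looks_like_commit_sha(base_ref) else "graphql"


choices = {ref: choose_comparison(ref) for ref in ("main", "v1.2.0", "3f2a9c1")}
```

A caller could use the result to decide between get_branch_comparison and get_rest_comparison; how the project itself makes that choice is not stated in this reference.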

get_client_user() async

Get the user that is associated with the GitHub client.

I.e., this is the user that the api is authenticated as.

get_full_blob_by_path(repo_id, file_path) async

Get the full blob at the given path.

get_full_blobs_from_entries(blob_entries, size_fill_threshold=10000) async

Get full blobs from a list of TreeEntries containing unfilled blobs.

Note: Returns ALL blobs, even those that are not filled because they exceed the fill threshold.

Note: Fills miniblobs in place in the TreeEntry, but does not otherwise change the TreeEntry.
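
The fill behaviour described in the notes above can be illustrated with a small stand-in. This is a sketch under stated assumptions: `BlobStub` and `fill_blobs` are hypothetical names, contents come from an in-memory dict rather than GitHub, and only the threshold rule from this page is modelled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class BlobStub:
    """Hypothetical stand-in for MiniBlob with only the fields used here."""

    node_id: str
    byte_size: int
    text: Optional[str] = None


def fill_blobs(
    blobs: List[BlobStub],
    contents: Dict[str, str],
    size_fill_threshold: int = 10000,
) -> List[BlobStub]:
    """Fill text in place for blobs within the threshold; return ALL blobs."""
    for blob in blobs:
        # Blobs larger than the threshold keep text=None but are still returned.
        if blob.byte_size <= size_fill_threshold:
            blob.text = contents[blob.node_id]
    return blobs


blobs = [BlobStub("B1", 120), BlobStub("B2", 50_000)]
contents = {"B1": "print('hi')\n", "B2": "x" * 50_000}
result = fill_blobs(blobs, contents)
```

After the call, both blobs are in `result`; only the small one has its text set, and the objects are the same instances that were passed in.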

get_full_blobs_from_node_ids(node_ids, tree, size_fill_threshold=10000) async

Get full blobs from a list of node_ids.

Basically the same as get_full_blobs_from_entries but with a more convenient interface for direct calling.

PARAMETER DESCRIPTION
node_ids

The node_ids of the blobs to load.

TYPE: Sequence[str]

tree

The tree that contains the node_ids (used to populate path, name, etc.).

TYPE: Tree

size_fill_threshold

The threshold for filling the text field of the blobs.

TYPE: int DEFAULT: 10000

get_full_repo(repo_id, size_fill_threshold=10000) async

Get the full repository including all content.

Same as get_full_repo_structure, but includes the content of all blobs.

get_full_repo_structure(repo_id, depth=None) async

Get the full repository structure for the given owner/repo/branch.

Structure means getting all the TreeEntries/Blobs (name, path, node_ids), but not the content.

get_rate_limits() async

Get the rate limits for the current api key.

get_recent_commits(repo_id, last=10, cursor=None) async

Get the most recent commits for the given repo; last sets how many to return.

get_repositories(owner, previous_cursor=None) async

Get the repositories for the given owner.

PARAMETER DESCRIPTION
owner

The owner of the repositories to get.

TYPE: str

previous_cursor

The cursor to get the next page of results (returned in repositories.end_cursor)

TYPE: str | None DEFAULT: None

get_repository_info(owner, name, previous_cursor=None) async

Get the repository info for the given owner/repo.

get_rest_comparison(repo_id, base, head) async

Use the REST API to compare two branches/commits.

Note: The REST API DOES let you compare specific commits via the truncated SHA.

Note: Two-dot and three-dot comparisons differ; see https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/about-comparing-branches-in-pull-requests#three-dot-and-two-dot-git-diff-comparisons
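
To make the two-dot versus three-dot distinction concrete, here is a toy model in plain Python. It does not call GitHub or git; the commit graph, file snapshots, and helper names are all invented for illustration. A two-dot diff compares the two tips directly, while a three-dot diff compares the head against the merge base of both refs.

```python
from collections import deque

# Toy history: commit -> (parents, snapshot of path -> content).
COMMITS = {
    "c0": ((), {"a.txt": "1"}),
    "c1": (("c0",), {"a.txt": "1", "m.txt": "1"}),  # tip of main
    "c2": (("c0",), {"a.txt": "2"}),                # tip of feature
}


def ancestors(commit):
    """Return the commit and everything reachable through its parents."""
    seen, queue = set(), deque([commit])
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        queue.extend(COMMITS[current][0])
    return seen


def merge_base(base, head):
    """Find the nearest commit reachable from head that is also an ancestor of base."""
    base_ancestors = ancestors(base)
    visited, queue = set(), deque([head])
    while queue:
        current = queue.popleft()
        if current in base_ancestors:
            return current
        if current in visited:
            continue
        visited.add(current)
        queue.extend(COMMITS[current][0])
    raise ValueError("no common ancestor")


def changed_paths(old, new):
    """List paths whose content differs between two snapshots."""
    old_files, new_files = COMMITS[old][1], COMMITS[new][1]
    all_paths = old_files.keys() | new_files.keys()
    return sorted(p for p in all_paths if old_files.get(p) != new_files.get(p))


two_dot = changed_paths("c1", "c2")                       # main..feature
three_dot = changed_paths(merge_base("c1", "c2"), "c2")   # main...feature
```

In this toy history the two-dot result includes m.txt, which only changed on main, whereas the three-dot result lists only what feature changed since it diverged.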

get_tree(repo_id, tree_entry=None, include_text=False, size_fill_threshold=10000, recursive=False) async

Gets the git tree of the repo or subtree.

PARAMETER DESCRIPTION
repo_id

The owner/repo/branch to get the tree for (the branch is only used if tree_entry is None).

tree_entry

TreeEntry object that contains the node_id of the tree to further fill in. Note: the `fill_tree_entries` method can fill multiple subtrees in parallel.

DEFAULT: None

include_text

Whether to include the text contents of the blobs returned.

DEFAULT: False

size_fill_threshold

If the size of the blob is greater than this (in bytes), don't fill the text.

DEFAULT: 10000

recursive

Whether to recursively fill nested trees that were too deep the first time around.

DEFAULT: False

MiniBlob

Bases: GraphqlQuerySchema

The minimal Blob info extracted via graphql queries.

byte_size = Field(alias='byteSize', validation_alias=AliasChoices('byteSize', 'byte_size')) class-attribute instance-attribute

is_binary = Field(alias='isBinary', validation_alias=AliasChoices('isBinary', 'is_binary')) class-attribute instance-attribute

node_id = Field(description='GH node_id') class-attribute instance-attribute

text = None class-attribute instance-attribute

Repositories

Bases: GraphqlQuerySchema

end_cursor instance-attribute

has_next_page instance-attribute

nodes instance-attribute

from_query_result(result) classmethod

RepositoryInfo

Bases: GraphqlQuerySchema

branch_refs instance-attribute

branches_count instance-attribute

created_at = Field(alias='createdAt') class-attribute instance-attribute

description instance-attribute

disk_usage = Field(alias='diskUsage') class-attribute instance-attribute

end_cursor_branches = None class-attribute instance-attribute

fork_count = Field(alias='forkCount') class-attribute instance-attribute

has_next_page_branches = False class-attribute instance-attribute

is_fork = Field(alias='isFork') class-attribute instance-attribute

is_in_organization = Field(alias='isInOrganization') class-attribute instance-attribute

is_private = Field(alias='isPrivate') class-attribute instance-attribute

issues_count instance-attribute

languages instance-attribute

languages_count instance-attribute

license_name instance-attribute

name instance-attribute

owner instance-attribute

primary_language instance-attribute

pull_requests_count instance-attribute

stargazer_count = Field(alias='stargazerCount') class-attribute instance-attribute

updated_at = Field(alias='updatedAt') class-attribute instance-attribute

url instance-attribute

from_query_result(result) classmethod

Tree

Bases: GraphqlQuerySchema

The minimal Tree info extracted via graphql queries.

entries = None class-attribute instance-attribute

node_id = Field(description='GH node_id') class-attribute instance-attribute

num_nodes property

walk_all_blobs()

walk_all_miniblobs()
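
The walk methods are listed without descriptions. As a rough illustration of what walking blobs in a nested tree could look like, here is a sketch using dataclass stand-ins. The stub classes are hypothetical, and the behaviour shown (depth-first traversal that yields entries whose object is a blob and skips unfilled subtrees) is an assumption, not the documented behaviour of Tree.walk_all_blobs.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterator, List, Optional, Union


@dataclass
class BlobStub:
    """Hypothetical stand-in for MiniBlob."""

    node_id: str


@dataclass
class EntryStub:
    """Hypothetical stand-in for TreeEntry."""

    name: str
    path: str
    object: Union[BlobStub, TreeStub]


@dataclass
class TreeStub:
    """Hypothetical stand-in for Tree; entries=None means not yet filled."""

    node_id: str
    entries: Optional[List[EntryStub]] = None

    def walk_all_blobs(self) -> Iterator[EntryStub]:
        """Yield blob entries depth-first, descending into filled subtrees."""
        for entry in self.entries or []:
            if isinstance(entry.object, TreeStub):
                yield from entry.object.walk_all_blobs()
            else:
                yield entry


tree = TreeStub("T0", [
    EntryStub("README.md", "README.md", BlobStub("B1")),
    EntryStub("src", "src", TreeStub("T1", [
        EntryStub("app.py", "src/app.py", BlobStub("B2")),
        EntryStub("pkg", "src/pkg", TreeStub("T2")),  # unfilled subtree
    ])),
])
blob_paths = [entry.path for entry in tree.walk_all_blobs()]
```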

TreeEntry

Bases: GraphqlQuerySchema

Additional info related to Tree or Blob objects in a Tree.

name instance-attribute

object instance-attribute

oid = Field(description='GH object id') class-attribute instance-attribute

path = None class-attribute instance-attribute

type instance-attribute

check_api_key_valid(api_key) async

Checks if a github api key is valid.

If valid, returns the name of the api key owner.

get_files_for_node_ids(adapter, tree, node_ids) async

Get the files for the given node_ids directly from GitHub.

Can efficiently load multiple blobs at once in a single request.

get_filled_blobs(adapter, blob_entries, max_size_fill=10000, all_in_one=False) async

Fill text of the blobs in the given TreeEntries.

Useful when updating repository.

get_full_repo(adapter, repo_id, max_size_fill=10000) async

Get the full repository including all content.

Same as get_full_repo_structure, but includes the content of all blobs.

PARAMETER DESCRIPTION
adapter

The GithubAdapter instance to use.

TYPE: GithubAdapter

repo_id

The owner/repo/branch to get the tree for.

TYPE: RepoID

max_size_fill

If the size of the blob is greater than this (in bytes), the text is not filled.

TYPE: int DEFAULT: 10000

get_full_repo_structure(adapter, repo_id, depth=None) async

Get the full repository structure for the given owner/repo/branch.

Structure means getting all the TreeEntries/Blobs (name, path, node_ids), but not the content.

get_github_adapter(api_key=settings.GITHUB_API_KEY)

Get the github adapter with the given api_key.

get_owner_repositories(adapter, owner, previous_cursor=None) async

Get an owner's repositories.

The owner can be a user or an organization. The first 100 results are returned.

PARAMETER DESCRIPTION
adapter

The GithubAdapter instance to use.

TYPE: GithubAdapter

owner

The owner of the repositories to get.

TYPE: str

previous_cursor

The end_cursor of the last repository in the previous call (for pagination).

TYPE: str | None DEFAULT: None
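
Because each call returns a single page, collecting every repository means feeding end_cursor back in as previous_cursor until has_next_page is false. The loop below is a minimal sketch of that pattern. `PageStub` and `FakeAdapter` are hypothetical in-memory stand-ins (with an invented page size of two and integer cursors); real code would call get_owner_repositories(adapter, owner, previous_cursor=cursor) instead.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PageStub:
    """Hypothetical stand-in for Repositories (nodes, end_cursor, has_next_page)."""

    nodes: List[str]
    end_cursor: Optional[str]
    has_next_page: bool


class FakeAdapter:
    """Hypothetical adapter that serves repository names from memory."""

    def __init__(self, names: List[str], page_size: int = 2) -> None:
        self.names = names
        self.page_size = page_size

    async def get_repositories(self, owner: str, previous_cursor: Optional[str] = None) -> PageStub:
        start = int(previous_cursor) if previous_cursor else 0
        end = start + self.page_size
        return PageStub(
            nodes=self.names[start:end],
            end_cursor=str(end),
            has_next_page=end < len(self.names),
        )


async def collect_all_repositories(adapter: FakeAdapter, owner: str) -> List[str]:
    """Follow the cursor until the last page has been read."""
    collected: List[str] = []
    cursor: Optional[str] = None
    while True:
        page = await adapter.get_repositories(owner, previous_cursor=cursor)
        collected.extend(page.nodes)
        if not page.has_next_page:
            return collected
        cursor = page.end_cursor


all_names = asyncio.run(
    collect_all_repositories(FakeAdapter(["a", "b", "c", "d", "e"]), "octocat")
)
```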

get_repo_branches(adapter, owner, repo_name, previous_cursor=None) async

Get only the branches of the given repository.

Basically just returns a subset of the information from the get_repository_info method.

PARAMETER DESCRIPTION
adapter

The GithubAdapter instance to use.

TYPE: GithubAdapter

owner

The owner of the repository.

TYPE: str

repo_name

The name of the repository.

TYPE: str

previous_cursor

The id of the last branch returned by the previous call (for pagination).

TYPE: str | None DEFAULT: None

get_repo_changes(adapter, repo_id, existing_metadatas) async

Get the changes between the existing metadatas and the current repository structure.

Note: For now this works from the full repo structure straight away, but it could later be optimized to only recurse into changed trees. (It makes sense for this to be a github_service, since it could later rely on many calls to the GitHub API.)
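
One way to picture the comparison is as a diff between two path-to-oid mappings. The sketch below is an assumption-laden illustration, not the project's algorithm: `ChangesStub` mirrors the field names of Changes as a plain dataclass, `diff_structure` is a hypothetical helper, and treating "same oid at a new path" as a rename is a guess about how renames might be detected.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ChangesStub:
    """Hypothetical stand-in mirroring the field names of Changes."""

    changed_blobs: List[str] = field(default_factory=list)
    deleted_blobs: List[str] = field(default_factory=list)
    new_blobs: List[str] = field(default_factory=list)
    renamed_blobs: List[Tuple[str, str]] = field(default_factory=list)
    unchanged: List[str] = field(default_factory=list)


def diff_structure(existing: Dict[str, str], current: Dict[str, str]) -> ChangesStub:
    """Classify paths by comparing path -> oid maps of the old and new structure."""
    changes = ChangesStub()
    old_path_by_oid = {oid: path for path, oid in existing.items()}

    for path, oid in current.items():
        if path in existing:
            # Same path: content is unchanged only if the oid matches.
            target = changes.unchanged if existing[path] == oid else changes.changed_blobs
            target.append(path)
        elif oid in old_path_by_oid and old_path_by_oid[oid] not in current:
            # Same content under a new path, and the old path is gone: a rename.
            changes.renamed_blobs.append((old_path_by_oid[oid], path))
        else:
            changes.new_blobs.append(path)

    renamed_from = {old for old, _ in changes.renamed_blobs}
    for path in existing:
        if path not in current and path not in renamed_from:
            changes.deleted_blobs.append(path)
    return changes


existing = {"README.md": "aaa", "src/app.py": "bbb", "src/old.py": "ccc", "docs/a.md": "ddd"}
current = {"README.md": "aaa", "src/app.py": "bb2", "src/new.py": "ccc", "tests/t.py": "eee"}
changes = diff_structure(existing, current)
```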

get_specific_file(adapter, repo_id, file_path) async

Get the most up-to-date contents of the given file directly from GitHub.

Note: This only works for a single path per request. For loading many files, prefer get_files_for_node_ids.