init
__all__ = ['Changes', 'GithubAdapter', 'MiniBlob', 'Repositories', 'RepositoryInfo', 'Tree', 'TreeEntry', 'check_api_key_valid', 'get_files_for_node_ids', 'get_filled_blobs', 'get_full_repo', 'get_full_repo_structure', 'get_github_adapter', 'get_owner_repositories', 'get_repo_branches', 'get_repo_changes', 'get_specific_file']
module-attribute
Changes
Bases: BaseModel
changed_blobs = Field(default_factory=list)
class-attribute
instance-attribute
deleted_blobs = Field(default_factory=list)
class-attribute
instance-attribute
new_blobs = Field(default_factory=list)
class-attribute
instance-attribute
renamed_blobs = Field(default_factory=list)
class-attribute
instance-attribute
unchanged = Field(default_factory=list)
class-attribute
instance-attribute
__add__(other)
__str__()
clear()
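The reference lists `__add__` and `clear` without describing them. A minimal sketch of one plausible reading follows, assuming `__add__` concatenates each bucket into a new object and `clear` empties every bucket in place. `ChangesSketch` is a hypothetical dataclass stand-in, not the real pydantic model, and the bucket contents (plain path strings) are also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ChangesSketch:
    """Hypothetical stand-in for Changes (the real class is a pydantic BaseModel)."""

    new_blobs: list = field(default_factory=list)
    changed_blobs: list = field(default_factory=list)
    deleted_blobs: list = field(default_factory=list)
    renamed_blobs: list = field(default_factory=list)
    unchanged: list = field(default_factory=list)

    def _buckets(self):
        # All five buckets, in a fixed order.
        return (
            self.new_blobs,
            self.changed_blobs,
            self.deleted_blobs,
            self.renamed_blobs,
            self.unchanged,
        )

    def __add__(self, other: "ChangesSketch") -> "ChangesSketch":
        # Assumed semantics: concatenate bucket by bucket into a new object.
        merged = [mine + theirs for mine, theirs in zip(self._buckets(), other._buckets())]
        return ChangesSketch(*merged)

    def clear(self) -> None:
        # Assumed semantics: empty every bucket in place.
        for bucket in self._buckets():
            bucket.clear()


first = ChangesSketch(new_blobs=["a.py"])
second = ChangesSketch(new_blobs=["b.py"], deleted_blobs=["c.py"])
combined = first + second
```

Because list concatenation builds new lists, clearing `combined` in this sketch leaves `first` and `second` untouched.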
GithubAdapter
Adapter around the githubkit.GitHub class.
The githubkit.GitHub client is capable, but its surface is far larger than this project needs. This adapter defines the boundary of what project-internal code may use. It also protects against future changes and provides a place for helper methods.
client = client
instance-attribute
__init__(client)
get_basic_owner_info(login)
async
Get basic info about an owner (user or org) by their login.
Note: login is case-insensitive, but the returned login will have the correct case.
get_branch_comparison(repo_id, base_ref, head_ref=None)
async
Compare two branches in a repository (via GraphQL).
Note: GraphQL does not support comparing two commits, although it is possible to compare a base branch to a later commit.
| PARAMETER | DESCRIPTION |
|---|---|
| repo_id | The owner/repo for the comparison (its branch is used as head_ref if head_ref is None). |
| base_ref | The ref to compare against. Note: MUST be a branch or tag, not a commit SHA. |
| head_ref | The later ref, which can be a commit SHA (defaults to the branch in repo_id). |
get_client_user()
async
Get the user that is associated with the GitHub client.
I.e., this is the user that the api is authenticated as.
get_full_blob_by_path(repo_id, file_path)
async
Get the full blob at the given path.
get_full_blobs_from_entries(blob_entries, size_fill_threshold=10000)
async
Get full blobs from a list of TreeEntries containing unfilled blobs.
Note: Returns ALL blobs, even those left unfilled because they exceed the fill threshold.
Note: Fills the miniblobs in place in each TreeEntry, but does not otherwise change the TreeEntry.
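The threshold behaviour described in these notes can be sketched as follows. This is an illustration under stated assumptions, not the adapter's implementation: `MiniBlobSketch` is a hypothetical dataclass stand-in for MiniBlob, `fill_blobs` is a hypothetical helper, and `contents` is a hypothetical dict that stands in for the text GitHub would return per node_id.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class MiniBlobSketch:
    """Hypothetical stand-in for MiniBlob."""

    node_id: str
    byte_size: int
    is_binary: bool = False
    text: Optional[str] = None


def fill_blobs(
    blobs: List[MiniBlobSketch],
    contents: Dict[str, str],
    size_fill_threshold: int = 10000,
) -> List[MiniBlobSketch]:
    """Fill text in place for blobs at or under the threshold; return ALL blobs."""
    for blob in blobs:
        # Blobs larger than the threshold (in bytes) keep text=None.
        if blob.byte_size <= size_fill_threshold:
            blob.text = contents.get(blob.node_id)
    return blobs


blobs = [MiniBlobSketch("small", 5), MiniBlobSketch("large", 20000)]
result = fill_blobs(blobs, {"small": "hello", "large": "x" * 20000})
```

Callers that need only the filled blobs would filter the result on `text is not None`, since oversized blobs are still returned.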
get_full_blobs_from_node_ids(node_ids, tree, size_fill_threshold=10000)
async
Get full blobs from a list of node_ids.
Basically the same as get_full_blobs_from_entries but with a more convenient interface for direct calling.
| PARAMETER | DESCRIPTION |
|---|---|
| node_ids | The node_ids of the blobs to load. |
| tree | The tree that contains the node_ids (used to populate path, name, etc.). |
| size_fill_threshold | The threshold for filling the text field of the blobs. |
get_full_repo(repo_id, size_fill_threshold=10000)
async
Get the full repository including all content.
Same as get_full_repo_structure, but includes the content of all blobs.
get_full_repo_structure(repo_id, depth=None)
async
Get the full repository structure for the given owner/repo/branch.
Structure means getting all the TreeEntries/Blobs (name, path, node_ids), but not the content.
get_rate_limits()
async
Get the rate limits for the current api key.
get_recent_commits(repo_id, last=10, cursor=None)
async
Get the most recent `last` commits for the given repo.
get_repositories(owner, previous_cursor=None)
async
Get the repositories for the given owner.
| PARAMETER | DESCRIPTION |
|---|---|
| owner | The owner of the repositories to get. |
| previous_cursor | The cursor to get the next page of results (returned in repositories.end_cursor). |
get_repository_info(owner, name, previous_cursor=None)
async
Get the repository info for the given owner/repo.
get_rest_comparison(repo_id, base, head)
async
Use the REST API to compare two branches or commits.
Note: The REST API does let you compare specific commits via their truncated SHAs.
Note: Two-dot and three-dot comparisons give different results; see https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/about-comparing-branches-in-pull-requests#three-dot-and-two-dot-git-diff-comparisons
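To make the two-dot versus three-dot distinction concrete, here is a small sketch that models each commit as a `{path: content}` snapshot. A two-dot diff compares the two tips directly, while a three-dot diff compares the head tip against the merge base (the latest commit both branches share). The snapshots and the `changed_paths` helper are hypothetical; the sketch says nothing about which form get_rest_comparison requests.

```python
def changed_paths(old: dict, new: dict) -> set:
    """Return the paths whose content differs between two snapshots."""
    return {path for path in old.keys() | new.keys() if old.get(path) != new.get(path)}


# Hypothetical snapshots: the branches diverged at merge_base.
merge_base = {"a.txt": "1", "b.txt": "1"}
base_tip = {"a.txt": "2", "b.txt": "1"}  # base branch edited a.txt afterwards
head_tip = {"a.txt": "1", "b.txt": "2"}  # head branch edited b.txt

# Two-dot: tip against tip, so work done on the base branch shows up too.
two_dot = changed_paths(base_tip, head_tip)

# Three-dot: merge base against head tip, so only the head branch's work shows up.
three_dot = changed_paths(merge_base, head_tip)
```

Here the two-dot result contains both files, whereas the three-dot result contains only `b.txt`.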
get_tree(repo_id, tree_entry=None, include_text=False, size_fill_threshold=10000, recursive=False)
async
Gets the git tree of the repo or subtree.
| PARAMETER | DESCRIPTION |
|---|---|
| repo_id | The owner/repo/branch to get the tree for (the branch is only used if tree_entry is None). |
| tree_entry | TreeEntry object that contains the node_id of the tree to further fill in. Note: the `fill_tree_entries` method can fill multiple subtrees in parallel. |
| include_text | Whether to include the text contents of the blobs returned. |
| size_fill_threshold | If the size of the blob is greater than this (in bytes), don't fill the text. |
| recursive | Whether to recursively fill nested trees that were too deep the first time around. |
MiniBlob
Bases: GraphqlQuerySchema
The minimal Blob info extracted via graphql queries.
byte_size = Field(alias='byteSize', validation_alias=AliasChoices('byteSize', 'byte_size'))
class-attribute
instance-attribute
is_binary = Field(alias='isBinary', validation_alias=AliasChoices('isBinary', 'is_binary'))
class-attribute
instance-attribute
node_id = Field(description='GH node_id')
class-attribute
instance-attribute
text = None
class-attribute
instance-attribute
Repositories
Bases: GraphqlQuerySchema
end_cursor
instance-attribute
has_next_page
instance-attribute
nodes
instance-attribute
from_query_result(result)
classmethod
RepositoryInfo
Bases: GraphqlQuerySchema
branch_refs
instance-attribute
branches_count
instance-attribute
created_at = Field(alias='createdAt')
class-attribute
instance-attribute
description
instance-attribute
disk_usage = Field(alias='diskUsage')
class-attribute
instance-attribute
end_cursor_branches = None
class-attribute
instance-attribute
fork_count = Field(alias='forkCount')
class-attribute
instance-attribute
has_next_page_branches = False
class-attribute
instance-attribute
is_fork = Field(alias='isFork')
class-attribute
instance-attribute
is_in_organization = Field(alias='isInOrganization')
class-attribute
instance-attribute
is_private = Field(alias='isPrivate')
class-attribute
instance-attribute
issues_count
instance-attribute
languages
instance-attribute
languages_count
instance-attribute
license_name
instance-attribute
name
instance-attribute
owner
instance-attribute
primary_language
instance-attribute
pull_requests_count
instance-attribute
stargazer_count = Field(alias='stargazerCount')
class-attribute
instance-attribute
updated_at = Field(alias='updatedAt')
class-attribute
instance-attribute
url
instance-attribute
from_query_result(result)
classmethod
Tree
Bases: GraphqlQuerySchema
The minimal Tree info extracted via graphql queries.
entries = None
class-attribute
instance-attribute
node_id = Field(description='GH node_id')
class-attribute
instance-attribute
num_nodes
property
walk_all_blobs()
walk_all_miniblobs()
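The walk methods are undocumented here. A minimal sketch of a depth-first walk over nested entries follows, assuming that a TreeEntry's `object` is either a nested tree or a blob and that `type` takes the values "blob" and "tree". All three `*Sketch` classes are hypothetical dataclass stand-ins, and whether the real method yields entries or bare blobs is not stated in this reference.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Union


@dataclass
class BlobSketch:
    """Hypothetical stand-in for MiniBlob."""

    node_id: str


@dataclass
class EntrySketch:
    """Hypothetical stand-in for TreeEntry."""

    name: str
    type: str  # "blob" or "tree" (assumed values)
    object: Union["TreeSketch", BlobSketch]
    path: Optional[str] = None


@dataclass
class TreeSketch:
    """Hypothetical stand-in for Tree."""

    node_id: str
    entries: Optional[List[EntrySketch]] = None

    def walk_all_blobs(self) -> Iterator[EntrySketch]:
        # entries may be None when a subtree has not been filled in yet.
        for entry in self.entries or []:
            if isinstance(entry.object, TreeSketch):
                # Recurse into subtrees, depth first.
                yield from entry.object.walk_all_blobs()
            else:
                yield entry


src = TreeSketch("t1", [EntrySketch("main.py", "blob", BlobSketch("b2"), "src/main.py")])
root = TreeSketch(
    "root",
    [
        EntrySketch("README.md", "blob", BlobSketch("b1"), "README.md"),
        EntrySketch("src", "tree", src, "src"),
        EntrySketch("unfilled", "tree", TreeSketch("t2"), "unfilled"),
    ],
)
blob_paths = [entry.path for entry in root.walk_all_blobs()]
```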
TreeEntry
Bases: GraphqlQuerySchema
Additional info related to Tree or Blob objects in a Tree.
name
instance-attribute
object
instance-attribute
oid = Field(description='GH object id')
class-attribute
instance-attribute
path = None
class-attribute
instance-attribute
type
instance-attribute
check_api_key_valid(api_key)
async
Check whether a GitHub API key is valid.
If valid, returns the name of the API key's owner.
get_files_for_node_ids(adapter, tree, node_ids)
async
Get the files for the given node_ids directly from GitHub.
Can efficiently load multiple blobs at once in a single request.
get_filled_blobs(adapter, blob_entries, max_size_fill=10000, all_in_one=False)
async
Fill text of the blobs in the given TreeEntries.
Useful when updating a repository.
get_full_repo(adapter, repo_id, max_size_fill=10000)
async
Get the full repository including all content.
Same as get_full_repo_structure, but includes the content of all blobs.
| PARAMETER | DESCRIPTION |
|---|---|
| adapter | The GithubAdapter instance to use. |
| repo_id | The owner/repo/branch to get the tree for. |
| max_size_fill | If the size of the blob is greater than this (in bytes), don't fill the text. |
get_full_repo_structure(adapter, repo_id, depth=None)
async
Get the full repository structure for the given owner/repo/branch.
Structure means getting all the TreeEntries/Blobs (name, path, node_ids), but not the content.
get_github_adapter(api_key=settings.GITHUB_API_KEY)
Get the github adapter with the given api_key.
get_owner_repositories(adapter, owner, previous_cursor=None)
async
Get an owner's repositories.
The owner can be a user or an organization. The first 100 results will be returned.
| PARAMETER | DESCRIPTION |
|---|---|
| adapter | The GithubAdapter instance to use. |
| owner | The owner of the repositories to get. |
| previous_cursor | The end_cursor of the last repository in the previous call (for pagination). |
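Cursor pagination of this kind is usually driven by a loop that feeds each page's end_cursor back in as previous_cursor until has_next_page is false. The sketch below shows that loop against an in-memory fake, so it runs without network access. `RepositoriesPage`, `FakeAdapter`, and `list_all_repositories` are hypothetical names; only the field names (nodes, has_next_page, end_cursor) and the `get_repositories(owner, previous_cursor)` signature are taken from this reference.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RepositoriesPage:
    """Hypothetical stand-in mirroring the Repositories fields."""

    nodes: List[str]
    has_next_page: bool
    end_cursor: Optional[str]


class FakeAdapter:
    """Hypothetical in-memory adapter that serves a fixed number of names per page."""

    def __init__(self, names: List[str], page_size: int = 2) -> None:
        self._names = names
        self._page_size = page_size

    async def get_repositories(self, owner: str, previous_cursor: Optional[str] = None):
        # The fake encodes the next start index as the cursor string.
        start = int(previous_cursor) if previous_cursor else 0
        end = start + self._page_size
        return RepositoriesPage(
            nodes=self._names[start:end],
            has_next_page=end < len(self._names),
            end_cursor=str(end),
        )


async def list_all_repositories(adapter, owner: str) -> List[str]:
    """Follow end_cursor from page to page until has_next_page is False."""
    collected: List[str] = []
    cursor = None
    while True:
        page = await adapter.get_repositories(owner, previous_cursor=cursor)
        collected.extend(page.nodes)
        if not page.has_next_page:
            return collected
        cursor = page.end_cursor


all_names = asyncio.run(
    list_all_repositories(FakeAdapter(["a", "b", "c", "d", "e"]), "some-owner")
)
```

With a real adapter each iteration costs one API request, so checking get_rate_limits before walking a large owner may be worthwhile.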
get_repo_branches(adapter, owner, repo_name, previous_cursor=None)
async
Get only the branches of the given repository.
Basically just returns a subset of the information from the get_repository_info method.
| PARAMETER | DESCRIPTION |
|---|---|
| adapter | The GithubAdapter instance to use. |
| owner | The owner of the repository. |
| repo_name | The name of the repository. |
| previous_cursor | The id of the last branch in the previous call (for pagination). |
get_repo_changes(adapter, repo_id, existing_metadatas)
async
Get the changes between the existing metadatas and the current repository structure.
Note: For now this works from the full repo structure straight away, but it could later be optimized to recurse only into changed trees. (It makes sense for this to be a github_service, since it could later rely on many calls to the GitHub API.)
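One way to picture the comparison is as a diff between two `{path: oid}` mappings, bucketed the same way as the Changes fields. The sketch below is an assumption-laden illustration, not the function's actual logic: `classify_changes` is a hypothetical helper, the shape of the metadata is assumed, and treating identical content at a new path as a rename is a guess.

```python
def classify_changes(existing: dict, current: dict) -> dict:
    """Bucket paths by comparing two {path: oid} mappings."""
    new, changed, renamed, unchanged = [], [], [], []
    # Content whose old path no longer exists: candidates for rename or delete.
    vanished = {oid: path for path, oid in existing.items() if path not in current}
    for path, oid in current.items():
        if path in existing:
            # Same path: compare object ids to detect edits.
            (unchanged if existing[path] == oid else changed).append(path)
        elif oid in vanished:
            # Same content under a new path: treated as a rename (assumption).
            renamed.append((vanished.pop(oid), path))
        else:
            new.append(path)
    return {
        "new_blobs": new,
        "changed_blobs": changed,
        "deleted_blobs": sorted(vanished.values()),
        "renamed_blobs": renamed,
        "unchanged": unchanged,
    }


existing = {"a.py": "1", "b.py": "2", "c.py": "3", "d.py": "4"}
current = {"a.py": "1", "b.py": "9", "moved/c.py": "3", "e.py": "5"}
changes = classify_changes(existing, current)
```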
get_specific_file(adapter, repo_id, file_path)
async
Get the most up-to-date contents of the given file directly from GitHub.
Note: This only works for a single path per request. For loading many files, prefer get_files_for_node_ids.