The Repository Interface

The kit.Repository object is the backbone of the library. It serves as your primary interface for accessing, analyzing, and understanding codebases, regardless of their language or location (local path or remote Git URL).

Working directly with code that spans multiple languages, file layouts, and locations (local or remote) is cumbersome. The Repository object hides that complexity behind a single, consistent abstraction layer.

Key benefits include:

  • Unified Access: Provides a single entry point to read files, extract code structures (symbols), perform searches, and more.
  • Location Agnostic: Accepts both local file paths and remote Git repository URLs, cloning and caching remote repositories automatically when needed.
  • Language Abstraction: Leverages tree-sitter parsers under the hood to understand the syntax of various programming languages, allowing you to work with symbols (functions, classes, etc.) in a standardized way.
  • Foundation for Tools: Acts as the foundation upon which you can build higher-level developer tools and workflows, such as documentation generators, AI code reviewers, or semantic search engines.
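To make the "Location Agnostic" point concrete, here is a minimal sketch of how a tool might tell a remote Git URL from a local path before deciding whether to clone. This is an illustration only: `is_remote_target` and `REMOTE_SCHEMES` are hypothetical names, and nothing here is taken from kit's actual detection logic, which this page does not describe.

```python
from urllib.parse import urlparse

# Hypothetical set of URL schemes treated as "remote" in this sketch.
REMOTE_SCHEMES = {"http", "https", "ssh", "git"}


def is_remote_target(target: str) -> bool:
    """Return True if the target looks like a remote URL, False otherwise.

    Anything without one of the schemes above (for example a plain
    filesystem path) is treated as local.
    """
    return urlparse(target).scheme.lower() in REMOTE_SCHEMES


remote = is_remote_target("https://github.com/owner/repo-name")
local = is_remote_target("/path/to/local/project")
```

A real implementation would also need to handle scp-style addresses and cache cloned repositories; the sketch covers only the dispatch decision.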

Start by instantiating a Repository object that points to your target codebase:

from kit import Repository
# Point to a local project
my_repo = Repository("/path/to/local/project")
# Or point to a remote GitHub repo
# github_repo = Repository("https://github.com/owner/repo-name")

With a Repository instance, you can perform various code intelligence tasks:

  • Explore Structure: Get the file tree (.get_file_tree()).
  • Read Content: Access the raw content of specific files (.get_file_content()).
  • Understand Code: Extract detailed information about functions, classes, and other symbols (.extract_symbols()).
  • Analyze Dependencies: Find where symbols are defined and used (.find_symbol_usages()).
  • Search: Perform literal text searches (.search_text()) or semantic searches (.search_semantic()).
  • Prepare for LLMs: Chunk code intelligently by lines or symbols (.chunk_file_by_lines(), .chunk_file_by_symbols()) and get code context around specific lines (.extract_context_around_line()).
  • Integrate with AI: Obtain configured summarizers (.get_summarizer()) or vector searchers (.get_vector_searcher()) for advanced AI workflows.
  • Export Data: Save the file tree, symbol information, or full repository index to structured formats like JSON (.write_index(), .write_symbols(), etc.).
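A few of the tasks above can be combined into a small survey function. The sketch below uses only method names from the list (get_file_tree, extract_symbols, search_text); the argument and return shapes are assumptions, not something this page confirms, so check them against the library before relying on them. `survey_repository` and `FakeRepository` are hypothetical names, and the fake exists only so the example runs without kit installed.

```python
from typing import Any, Dict, List


def survey_repository(repo: Any, query: str) -> Dict[str, Any]:
    """Collect a small overview of a repository.

    Assumed shapes (not confirmed by this page):
      - get_file_tree() returns a list of dicts with "path" and "is_dir"
      - extract_symbols(path) returns a list of dicts with "name" and "type"
      - search_text(query) returns a list with one entry per match
    """
    tree = repo.get_file_tree()
    python_files = [
        entry["path"]
        for entry in tree
        if not entry.get("is_dir") and entry["path"].endswith(".py")
    ]

    symbols_by_file: Dict[str, List[str]] = {}
    for path in python_files:
        symbols = repo.extract_symbols(path)
        symbols_by_file[path] = [f'{s["type"]} {s["name"]}' for s in symbols]

    return {
        "python_files": python_files,
        "symbols": symbols_by_file,
        "match_count": len(repo.search_text(query)),
    }


class FakeRepository:
    """Hypothetical stand-in that mimics the assumed return shapes."""

    def get_file_tree(self) -> List[Dict[str, Any]]:
        return [
            {"path": "src", "is_dir": True},
            {"path": "src/app.py", "is_dir": False},
            {"path": "README.md", "is_dir": False},
        ]

    def extract_symbols(self, path: str) -> List[Dict[str, str]]:
        return [{"name": "main", "type": "function"}]

    def search_text(self, query: str) -> List[Dict[str, Any]]:
        return [{"file": "src/app.py", "line": "# TODO: handle errors"}]


overview = survey_repository(FakeRepository(), "TODO")
```

Because the function only relies on duck typing, the same call should work with a real Repository instance if the assumed shapes hold, for example survey_repository(my_repo, "TODO").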

The following table lists some of the key classes and tools you can access through the Repository object:

Class/Tool       | Description
Summarizer       | Generate summaries of code using LLMs
VectorSearcher   | Query a vector index of code for semantic search
DocstringIndexer | Build a vector index of LLM-generated summaries
SummarySearcher  | Query the summary index built by DocstringIndexer
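The relationship between the last two rows (build an index of summaries, then query it) can be illustrated with a toy example. The sketch below is not kit's API: `embed`, `build_summary_index`, and `search_summaries` are hypothetical helpers, the summaries are hand-written stand-ins for LLM output, and word counts with cosine similarity replace a real embedding model.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real system would use a model)."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_summary_index(summaries: Dict[str, str]) -> Dict[str, Counter]:
    """Indexing step: map each file path to the vector of its summary."""
    return {path: embed(text) for path, text in summaries.items()}


def search_summaries(
    index: Dict[str, Counter], query: str, top_k: int = 1
) -> List[Tuple[str, float]]:
    """Search step: rank indexed summaries by similarity to the query."""
    query_vec = embed(query)
    scored = [(path, cosine(query_vec, vec)) for path, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hand-written summaries standing in for LLM-generated ones.
summaries = {
    "auth/login.py": "Validates user credentials and issues a session token",
    "db/pool.py": "Manages a pool of reusable database connections",
}
index = build_summary_index(summaries)
best = search_summaries(index, "database connections pool")
```

Searching over summaries rather than raw code lets a natural-language query match what a file does even when the code itself never uses the query's words.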