Text & Regex Search
Text search provides fast, precise searching across your codebase using regex patterns. It's the foundation for finding specific strings, identifiers, or patterns in your code.
Quick Start
from kit import Repository

repo = Repository("/path/to/codebase")

# Simple text search
results = repo.search_text("TODO")

# Search with file pattern
results = repo.search_text("authenticate", file_pattern="**/*.py")

# Regex search
results = repo.search_text(r"def \w+_handler", file_pattern="*.py")

for match in results:
    print(f"{match['file']}:{match['line_number']}: {match['line']}")
Basic Usage
The search_text() method searches for text patterns across files:
repo.search_text(
    query="pattern",       # String or regex pattern
    file_pattern="*.py"    # Glob pattern for files
)
Returns: List of matches, each containing:
- file: Relative path to the file
- line_number: Line number (1-indexed)
- line: Content of the matching line
- context_before: Lines before the match
- context_after: Lines after the match
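The match shape listed above can be written down as a typed dictionary, which makes downstream formatting code easier to check. This is a minimal sketch: the names `TextMatch` and `format_match` are hypothetical, and the field types (strings for paths and lines, lists of strings for context) are assumptions inferred from the descriptions, not taken from the library.

```python
from typing import List, TypedDict


class TextMatch(TypedDict):
    """Hypothetical name for the match shape described above (types are assumed)."""
    file: str
    line_number: int
    line: str
    context_before: List[str]
    context_after: List[str]


def format_match(match: TextMatch) -> str:
    """Render one match: a 'file:line' header, context indented, the hit marked with '>'."""
    header = f"{match['file']}:{match['line_number']}"
    before = [f"  {text.rstrip()}" for text in match["context_before"]]
    hit = [f"> {match['line'].rstrip()}"]
    after = [f"  {text.rstrip()}" for text in match["context_after"]]
    return "\n".join([header, *before, *hit, *after])


# Hand-written sample data standing in for one element of repo.search_text(...)
sample: TextMatch = {
    "file": "src/app.py",
    "line_number": 12,
    "line": "    # TODO: handle timeouts\n",
    "context_before": ["def fetch(url):\n"],
    "context_after": ["    return get(url)\n"],
}
print(format_match(sample))
```

With real results, the same helper could be applied to each element of the list returned by `repo.search_text(...)`.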
File Patterns
Use glob patterns to filter which files to search:
# Single extension
repo.search_text("TODO", file_pattern="*.py")

# Multiple extensions
repo.search_text("TODO", file_pattern="*.{py,js,ts}")

# Specific directory
repo.search_text("authenticate", file_pattern="src/auth/**/*.py")

# All files
repo.search_text("config", file_pattern="**/*")
Regex Patterns
Use Python regex for powerful pattern matching:
# Find all function definitions
repo.search_text(r"def \w+\(", file_pattern="*.py")

# Find API endpoints
repo.search_text(r"@app\.(get|post|put|delete)", file_pattern="*.py")

# Find TODO/FIXME comments
repo.search_text(r"(TODO|FIXME|HACK):", file_pattern="*.py")

# Find imports
repo.search_text(r"^import |^from .* import", file_pattern="*.py")

# Find SQL queries
repo.search_text(r"SELECT .* FROM", file_pattern="*.py")
Advanced Options
Control search behavior with SearchOptions:
from kit.code_searcher import SearchOptions
# Case-insensitive search
options = SearchOptions(case_sensitive=False)
results = repo.search_text("todo", options=options)

# Include context lines
options = SearchOptions(
    context_lines_before=2,
    context_lines_after=2
)
results = repo.search_text("error", options=options)

for match in results:
    print(f"\n{match['file']}:{match['line_number']}")
    for line in match['context_before']:
        print(f"  {line}")
    print(f"> {match['line']}")
    for line in match['context_after']:
        print(f"  {line}")

# Ignore .gitignore rules (search all files)
options = SearchOptions(use_gitignore=False)
results = repo.search_text("secret", options=options)
Common Use Cases
Finding Security Issues
# Search for hardcoded credentials
security_patterns = [
    r"password\s*=\s*['\"]",
    r"api_key\s*=\s*['\"]",
    r"secret\s*=\s*['\"]",
    r"token\s*=\s*['\"]"
]

for pattern in security_patterns:
    results = repo.search_text(pattern, file_pattern="*.py")
    if results:
        print(f"Found {len(results)} potential security issues:")
        for match in results:
            print(f"  {match['file']}:{match['line_number']}")
Finding Code Patterns
# Find all async functions
async_functions = repo.search_text(r"async def \w+", file_pattern="*.py")

# Find error handling
error_handlers = repo.search_text(r"except \w+Error", file_pattern="*.py")

# Find deprecated code
deprecated = repo.search_text(r"@deprecated", file_pattern="**/*.py")
Code Cleanup Tasks
# Find all TODOs grouped by file
todos = repo.search_text(r"TODO:|FIXME:|HACK:", file_pattern="**/*.py")

todo_by_file = {}
for match in todos:
    file = match['file']
    if file not in todo_by_file:
        todo_by_file[file] = []
    todo_by_file[file].append(match)

for file, matches in todo_by_file.items():
    print(f"\n{file} ({len(matches)} items):")
    for match in matches:
        print(f"  Line {match['line_number']}: {match['line'].strip()}")
CLI Usage
Text search is available via the kit search-text command:
# Basic search
kit search-text /path/to/repo "TODO"

# With file pattern
kit search-text /path/to/repo "authenticate" --pattern "**/*.py"

# Case-insensitive
kit search-text /path/to/repo "todo" --case-insensitive

# With context
kit search-text /path/to/repo "error" --context 2

# Output as JSON
kit search-text /path/to/repo "TODO" --format json > todos.json
Performance Considerations
Fast for:
- Exact string matches
- Simple regex patterns
- Small to medium codebases
Slower for:
- Complex regex patterns
- Very large repositories
- Searching all files (**/*)
Tips:
- Use specific file patterns to reduce search scope
- Respect .gitignore (enabled by default)
- Use simpler patterns when possible
- Consider symbol search for structural queries
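One way to keep a pattern simple is to search for an exact string when that is all you need. Because the query is treated as a regex, characters such as `[`, `.`, or `(` in a literal need escaping first. The sketch below uses only the standard library; `literal_query` is a hypothetical helper name, and passing its result to `search_text()` assumes the query is interpreted with Python regex syntax, as described in the Regex Patterns section.

```python
import re


def literal_query(text: str) -> str:
    """Escape regex metacharacters so the text is matched as an exact string."""
    return re.escape(text)


needle = "items[0].name"
line = "value = items[0].name"

# Unescaped, "[0]" is a character class and "." matches any character,
# so the pattern does not find the literal text in this line.
unescaped_hit = re.search(needle, line) is not None

# Escaped, the pattern matches the text exactly as written.
escaped_hit = re.search(literal_query(needle), line) is not None

print(unescaped_hit, escaped_hit)

# Assumed usage with a repository object:
# results = repo.search_text(literal_query(needle), file_pattern="*.py")
```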
When to Use Text Search
Good for:
- Finding specific strings or identifiers
- Pattern-based code analysis
- Quick lookups of known terms
- Security audits for specific patterns
Consider alternatives:
- Symbol search: Finding functions/classes by name
- Semantic search: Finding code by meaning
- AST search: Finding code by structure
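Before running a pattern-based audit across a whole repository, it can help to check each regex against a few lines whose outcome you already know. The sketch below does this with the standard `re` module and two of the credential patterns from the security example; the sample lines and the helper name `first_matching_pattern` are made up for illustration.

```python
import re

# Two of the credential patterns from the security example
patterns = [
    r"password\s*=\s*['\"]",
    r"api_key\s*=\s*['\"]",
]


def first_matching_pattern(line, patterns):
    """Return the first pattern that matches the line, or None if none match."""
    for pattern in patterns:
        if re.search(pattern, line):
            return pattern
    return None


# Illustrative lines mapped to whether they should be flagged
samples = {
    'password = "hunter2"': True,                    # hardcoded string literal
    "api_key='abc123'": True,                        # no spaces around '='
    'password = os.environ["DB_PASSWORD"]': False,   # read from the environment
    "# rotate the api_key monthly": False,           # mention without assignment
}

for line, expected in samples.items():
    flagged = first_matching_pattern(line, patterns) is not None
    status = "ok" if flagged == expected else "UNEXPECTED"
    print(f"{status}: {line}")
```

If a sample behaves unexpectedly, adjust the pattern here first, then pass the corrected version to `search_text()`.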
Examples
Find All API Routes
# Find Flask routes
flask_routes = repo.search_text(r"@app\.route\(", file_pattern="*.py")

# Find FastAPI endpoints
fastapi_routes = repo.search_text(r"@(app|router)\.(get|post|put|delete)", file_pattern="*.py")

for route in fastapi_routes:
    print(f"{route['file']}:{route['line_number']}")
    print(f"  {route['line'].strip()}")
Find Database Queries
# Find SQL statements. INSERT and UPDATE have no FROM clause,
# so each statement type gets its own alternative.
queries = repo.search_text(r"SELECT .* FROM|INSERT INTO|UPDATE \w+ SET|DELETE FROM", file_pattern="**/*.py")

print(f"Found {len(queries)} SQL queries:")
for query in queries:
    print(f"\n{query['file']}:{query['line_number']}")
    print(f"  {query['line'].strip()}")
Find Configuration Usage
# Find all config references
configs = repo.search_text(r"config\[.+\]|config\..+", file_pattern="**/*.py")

# Group by file
by_file = {}
for match in configs:
    by_file.setdefault(match['file'], []).append(match)

for file, matches in sorted(by_file.items()):
    print(f"\n{file}: {len(matches)} config usages")
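Beyond counting usages per file, the matched lines can be post-processed to see which config keys are referenced most often. The following is a rough sketch using only the standard library: the regex `CONFIG_KEY`, the helper `count_config_keys`, and the sample lines are all illustrative assumptions, and the pattern deliberately handles only the `config["key"]`, `config['key']`, and `config.key` forms.

```python
import re
from collections import Counter

# Captures the key in config["key"] / config['key'] (group 1) or config.key (group 2).
# Note: method calls such as config.get(...) would be counted as the key "get".
CONFIG_KEY = re.compile(r"""\bconfig\[['"](\w+)['"]\]|\bconfig\.(\w+)""")


def count_config_keys(lines):
    """Count how often each config key appears across the given lines."""
    counts = Counter()
    for line in lines:
        for bracket_key, attr_key in CONFIG_KEY.findall(line):
            counts[bracket_key or attr_key] += 1
    return counts


# Hand-written lines standing in for match['line'] values
sample_lines = [
    'db = connect(config["db_url"])',
    "timeout = config.timeout",
    "retry = config['retries'] or config.timeout",
]

key_counts = count_config_keys(sample_lines)
for key, count in key_counts.most_common():
    print(f"{key}: {count}")
```

With real results, the input could be built as `[match['line'] for match in configs]`.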