Improve repo onboarding and agent guidance

root 2026-05-05 01:05:52 +00:00
parent 8708ae55fe
commit 3a664b5a79
6 changed files with 461 additions and 194 deletions

9
.gitignore vendored

@@ -27,3 +27,12 @@ Thumbs.db
# Temporary files
*.tmp
*.temp
# Local environment and secret-bearing files
.env
*.env
# Generated outputs and local data caches
supabase monitor/output/
youtube/captions/
github_repo_scanners/contributor_repos/

86
AGENTS.md Normal file

@@ -0,0 +1,86 @@
# AGENTS.md
Guidance for AI coding agents working in this repository.
## Purpose
This repo is a mixed operational workspace, not a single product codebase. Your first job is orientation:
- identify whether the task targets root Bash tooling, a subproject, or generated data
- avoid treating tracked JSON or output files as canonical source without checking context
- prefer doc and safety improvements unless the requested task clearly requires behavioral changes
## Canonical Docs
Read these first:
1. `README.md`
2. `docs/getting-started.md`
3. `docs/repo-map.md`
4. `CLAUDE.md` for Claude-specific guidance
## High-Signal Areas
### Primary
- `bytelyst-cli.sh`
- `remove_user_interactive.sh`
- `remove_user_guided.sh`
- `remove_user_from_repos.sh`
- `git-work-safety-tools/`
- `github_access_scripts/`
### Secondary Or Self-Contained
- `Slack Message/`
- `youtube/`
- `supabase monitor/`
Do not assume these subprojects share the same dependencies or conventions as the root scripts.
## Sensitive And Generated Files
Handle these carefully:
- `accounts.json`
- account or user snapshot JSON files in the repo root
- `github_repo_scanners/contributor_repos/*.json`
- `.env` files
- generated outputs under `supabase monitor/output/`
These may contain secrets, usernames, or operational snapshots. Avoid printing contents unless necessary for the task.
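One way to inspect a secret-bearing file without printing its contents is to list only the variable names. The sketch below is an illustration, not something this repo ships: `env_keys` is a hypothetical helper name, and it assumes a conventional `KEY=value` or `export KEY=value` layout.

```bash
# Hypothetical helper: print only the variable names defined in an env-style
# file, never the values. Assumes KEY=value or "export KEY=value" lines.
env_keys() {
  sed -n 's/^[[:space:]]*\(export[[:space:]]\{1,\}\)\{0,1\}\([A-Za-z_][A-Za-z0-9_]*\)=.*/\2/p' "$1"
}
```

Calling `env_keys .env` shows which settings exist while keeping token values out of terminal output and logs.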
## Editing Rules
- Prefer small, targeted changes.
- Preserve operational scripts' current interface unless the task asks for a breaking change.
- When reorganizing docs, keep old filenames if external references may depend on them.
- Do not move or delete generated/tracked data files unless explicitly asked.
- If you add a new directory or script, update `docs/repo-map.md`.
## Recommended Workflow
1. Determine whether the task is about:
- GitHub admin Bash scripts
- multi-repo git safety tooling
- a Python side project
- repo documentation or cleanup
2. Read only the files relevant to that surface.
3. Check whether there is an existing README in that subdirectory.
4. Make the smallest coherent change set.
5. If docs or discoverability changed, update the canonical docs listed above.
## Good First Checks
```bash
git status --short --branch
rg --files
sed -n '1,220p' README.md
sed -n '1,220p' docs/repo-map.md
```
## Pitfalls
- The repo contains legacy one-off scripts and tracked local artifacts.
- Directory names do not always reflect actual function. `supabase monitor/` is a YouTube-processing workflow project.
- The root README should stay broad; detailed usage belongs in `docs/` or a subdirectory README.

102
CLAUDE.md

@@ -1,85 +1,55 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Claude-specific working notes for this repository.
## Repository Overview
## Read First
This is a collection of Bash-based DevOps tools for GitHub repository management, designed for the ByteLyst AI organization. The tools focus on repository security, user management, and multi-repository operations.
Use these as the canonical orientation docs:
## Core Components
1. `README.md`
2. `docs/getting-started.md`
3. `docs/repo-map.md`
4. `AGENTS.md`
### Main CLI Interface
- `bytelyst-cli.sh` - Unified command-line interface for all tools with both CLI and interactive modes
- Requires `jq` and `curl` dependencies
- Requires `GITHUB_TOKEN` environment variable for GitHub API access
This file is intentionally short and should not become a second full repo README.
### Git Safety Tools (`git-work-safety-tools/`)
- `git_repos_status.sh` - Recursively scans directories for git repositories and shows status (commits, uncommitted files, unpushed changes)
- `git_repos_rebase_commit_push.sh` - Safely rebases, commits, and pushes changes across multiple repos
- `multi_repo_*.sh` - Additional multi-repository management utilities
## Repository Shape
### User Management Tools
- `remove_user_interactive.sh` - Interactive user removal with wildcard repository matching
- Supports patterns like `*-go-api*`, `frontend-*`
- Handles both root account and organization repositories
- Rich visual logging with progress tracking
- Flexible confirmation system (yes/no/all/skip/quit)
The repo is a mixed workspace:
### Individual Tools
- `list_all_public_repos.sh` - Lists public repositories for a user
- `list_orgs_teams_members.sh` - Lists organization teams and members
- `list_prs_by_user.sh` - Lists pull requests by specific user
- `list_repos_contributors.sh` - Lists contributors for repositories
- `make_repos_private.sh` - Converts public repos to private
- `delete_team_interactive.sh` - Interactive team deletion tool
- root Bash scripts for GitHub administration and maintenance
- `git-work-safety-tools/` for multi-repo git workflows
- `github_access_scripts/` and `github_repo_scanners/` for access scanning
- self-contained Python utilities in `Slack Message/`, `youtube/`, and `supabase monitor/`
## Development Commands
Do not assume a single dependency graph or runtime model across the whole repo.
## High-Value Files
- `bytelyst-cli.sh`
- `remove_user_interactive.sh`
- `remove_user_guided.sh`
- `remove_user_from_repos.sh`
- `git-work-safety-tools/*.sh`
## Safety Notes
- Treat `accounts.json`, account snapshot JSON files, `.env` files, and generated contributor/output data as sensitive.
- Avoid echoing secret-bearing file contents unless the task requires it.
- Many scripts are operational and potentially destructive. Preserve prompts, dry-run behavior, and confirmations unless asked to change them.
## Developer Commands
### Setup
```bash
./setup.sh
```
This installs pre-commit hooks and sets up the development environment.
### Code Quality
```bash
pre-commit run --all-files
```
Runs all pre-commit hooks including shellcheck for shell scripts.
### Testing
```bash
./test.sh
```
Basic test script (currently minimal).
## Architecture Notes
## Documentation Rule
### Authentication
All tools use GitHub Personal Access Tokens via `GITHUB_TOKEN` environment variable. The token can be set in:
- Environment variables
- `.env` file (loaded automatically by `bytelyst-cli.sh`)
If you change repo structure, onboarding, or navigation, update:
### Configuration Files
- `github_repos.json` - Repository configuration for batch operations
- `github_acc_input.json` - Account input configuration
- `users_white_list.json` / `users_black_list.json` - User access control lists
- `repos.json` / `repos.txt` - Repository lists for operations
### Error Handling
- Tools validate required dependencies (`jq`, `curl`) before execution
- GitHub API responses are checked for HTTP status codes
- Interactive confirmations for destructive operations
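A minimal sketch of the status-code check described above, assuming `curl` and a `GITHUB_TOKEN` in the environment. The function names `github_get` and `classify_status` are hypothetical and are not taken from the repo's scripts, and the status groupings are one possible choice.

```bash
# Hypothetical helper: map an HTTP status code to a coarse outcome.
classify_status() {
  case "$1" in
    2??)     echo ok ;;
    401|403) echo auth ;;
    404)     echo not_found ;;
    *)       echo error ;;
  esac
}

# Hypothetical helper: GET a GitHub API path (for example /user), write the
# response body to the file named in $2, and print only the HTTP status code.
github_get() {
  curl -sS -o "$2" -w '%{http_code}' \
    -H "Authorization: Bearer ${GITHUB_TOKEN}" \
    -H "Accept: application/vnd.github+json" \
    "https://api.github.com$1"
}
```

A caller could run `status=$(github_get /user /tmp/body.json)` and branch on `classify_status "$status"` before reading the body.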
### Multi-Repository Operations
The git safety tools are designed to work across multiple repositories by:
1. Recursively finding all `.git` directories
2. Operating on each repository independently
3. Providing summary reports of operations
## Security Considerations
- All tools require explicit confirmation for destructive operations
- GitHub tokens are validated before operations
- Pre-commit hooks include basic hygiene checks (large file detection, trailing whitespace)
- Repository access is controlled via whitelist/blacklist mechanisms
- `README.md`
- `docs/repo-map.md`
- `AGENTS.md` if agent navigation is affected

234
README.md

@@ -1,159 +1,137 @@
# ByteLyst DevOps Tools
A comprehensive collection of Bash-based DevOps tools for GitHub repository management, designed for the ByteLyst AI organization. The tools focus on repository security, user management, and multi-repository operations.
Internal repository for GitHub administration scripts, multi-repo safety helpers, and a few adjacent utility projects used by ByteLyst.
## 🚀 Quick Start
This repo is not a single application. It is a workspace of operational tools with three main characteristics:
1. **Set up the environment:**
```bash
./setup.sh
```
- The primary surface area is Bash scripts for GitHub and repository operations.
- Some subdirectories are self-contained Python utilities with their own setup and runtime expectations.
- A number of JSON files and outputs are generated artifacts or operational inputs, not source code to edit casually.
2. **Set your GitHub token:**
```bash
export GITHUB_TOKEN=your_github_token_here
```
## Start Here
3. **Run the main CLI:**
```bash
./bytelyst-cli.sh
```
If you are new to the repo, read these in order:
## 📋 Available Tools
1. [docs/getting-started.md](docs/getting-started.md)
2. [docs/repo-map.md](docs/repo-map.md)
3. [AGENTS.md](AGENTS.md) if you are working through an AI coding agent
4. [CLAUDE.md](CLAUDE.md) if you are using Claude Code specifically
### 🎯 Main CLI Interface
- **`bytelyst-cli.sh`** - Unified command-line interface for all tools
- Interactive menu mode
- Direct command execution
- Supports both user and organization operations
## Primary Entry Points
### 🔐 User Management Tools
- **`remove_user_interactive.sh`** - Interactive user removal with wildcard repository matching
- Supports patterns like `*-go-api*`, `frontend-*`
- Handles both root account and organization repositories
- Rich visual logging with progress tracking
- Flexible confirmation system (yes/no/all/skip/quit)
### GitHub Operations
### 🛡️ Git Safety Tools (`git-work-safety-tools/`)
- **`git_repos_status.sh`** - Recursively scan directories and report the status of each git repository
- **`git_repos_rebase_commit_push.sh`** - Safely rebase, commit, and push changes
- **`multi_repo_*.sh`** - Additional multi-repository management utilities
- `./bytelyst-cli.sh`
- Main unified CLI for common GitHub admin operations.
- Requires `curl`, `jq`, and `GITHUB_TOKEN`.
- `./remove_user_interactive.sh`
- Interactive collaborator-removal workflow with repository pattern matching.
- `./remove_user_guided.sh`
- Guided wrapper around the same removal flow with a more opinionated interactive UX.
- `./remove_user_from_repos.sh`
- Scripted removal flow suitable for repeatable or semi-automated use.
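Repository pattern matching of the kind mentioned for `remove_user_interactive.sh` (patterns such as `*-go-api*` or `frontend-*`) can be sketched with plain shell globs. This illustrates the idea and is not the script's actual implementation; `matches_pattern` and `filter_repos` are hypothetical names.

```bash
# Hypothetical helper: succeed if repository name $1 matches glob pattern $2.
matches_pattern() {
  case "$1" in
    # $2 is deliberately unquoted so the shell treats it as a pattern.
    $2) return 0 ;;
    *)  return 1 ;;
  esac
}

# Hypothetical helper: read repository names on stdin, print those matching $1.
filter_repos() {
  pattern=$1
  while IFS= read -r repo; do
    if matches_pattern "$repo" "$pattern"; then
      printf '%s\n' "$repo"
    fi
  done
}
```

For example, `printf '%s\n' billing-go-api-v2 frontend-web | filter_repos 'frontend-*'` keeps only `frontend-web`. Previewing the matched set this way before any removal step is a cheap safety check.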
### 📊 Repository Analysis Tools
- **`list_all_public_repos.sh`** - List all public repositories for a user
- **`list_orgs_teams_members.sh`** - List organization teams and members
- **`list_prs_by_user.sh`** - List pull requests by specific user
- **`list_repos_contributors.sh`** - List contributors for repositories
- **`list_repos_contributors_by_user.sh`** - List contributors filtered by user
### Multi-Repo Git Safety
### 🔧 Repository Management Tools
- **`make_repos_private.sh`** - Convert public repositories to private
- **`delete_team_interactive.sh`** - Interactive team deletion tool
- **`cleanup.sh`** - General cleanup utilities
- `git-work-safety-tools/git_repos_status.sh`
- `git-work-safety-tools/git_repos_rebase_commit_push.sh`
- `git-work-safety-tools/multi_repo_safe_push.sh`
- `git-work-safety-tools/multi_repo_status.sh`
## 🛠️ Prerequisites
These are for scanning many repositories, checking dirty state, and performing safer batch git workflows.
- **Required tools:** `curl`, `jq`
- **GitHub Personal Access Token** with permissions:
- `repo` (for repository access)
- `admin:org` (for organization management)
## Repository Layout
## 📝 Usage Examples
### Core Operational Scripts
- Root `*.sh` files
- Main Bash-based GitHub and maintenance utilities.
- `git-work-safety-tools/`
- Safer multi-repo git helpers.
- `github_access_scripts/`
- Focused access checks and repo listing utilities.
- `github_repo_scanners/`
- Scripts plus generated repo/contributor JSON outputs.
### Side Projects
- `Slack Message/`
- Python CLI for Slack posting and AI-assisted chat.
- `youtube/`
- YouTube transcript and summarization helpers.
- `supabase monitor/`
- Separate Python workflow project for YouTube processing despite the directory name.
### Documentation
- `docs/`
- Canonical onboarding and repo-orientation docs.
- Legacy root docs:
- `README_interactive_script.md`
- `README_remove_user_script.md`
These older docs are still useful but are no longer the best starting point.
## Setup
### Root Tooling
### Interactive User Removal
```bash
./setup.sh
```
This installs the local development hooks and prepares the shell-based workflow.
### Required Dependencies
- `bash`
- `curl`
- `jq`
### Authentication
Most GitHub-facing scripts require:
```bash
export GITHUB_TOKEN=your_token_here
```
Use a token with the minimum permissions required for the task. Many admin flows assume `repo` and `admin:org`.
## Common Commands
```bash
./bytelyst-cli.sh help
./bytelyst-cli.sh list-public-repos --user <username>
./bytelyst-cli.sh list-private-repos --org <orgname>
./remove_user_interactive.sh
# Follow the prompts to:
# 1. Enter GitHub token
# 2. Specify root user/organization
# 3. Enter user to remove
# 4. Define repository pattern (e.g., *-go-api*)
# 5. Confirm removals interactively
```
### CLI Commands
```bash
# List public repositories
./bytelyst-cli.sh list-public-repos --user username
# List private repositories
./bytelyst-cli.sh list-private-repos --org orgname
# Check collaborators
./bytelyst-cli.sh check-collaborators --input input.json
# Remove user from all matching repositories
./bytelyst-cli.sh remove-user-from-all-repos --user username
```
### Git Repository Status
```bash
# Check status of all git repositories in current directory tree
./git-work-safety-tools/git_repos_status.sh
# Safe multi-repository operations
./git-work-safety-tools/multi_repo_safe_push.sh
pre-commit run --all-files
```
## 📁 Configuration Files
## Operational Safety
- **`github_repos.json`** - Repository configuration for batch operations
- **`github_acc_input.json`** - Account input configuration
- **`users_white_list.json`** / **`users_black_list.json`** - User access control lists
- **`repos.json`** / **`repos.txt`** - Repository lists for operations
- Treat `accounts.json`, `*.json` account snapshots, `.env` files, and generated collaborator data as potentially sensitive.
- Prefer dry runs or interactive confirmation flows before bulk removal or visibility changes.
- Do not assume every tracked JSON file is a stable source file; many are data snapshots or inputs.
- Review scripts before reuse in automation. Some are one-off operational helpers and may encode assumptions about ByteLyst org structure.
## 🔒 Security Features
## Notes On Tracked Secrets And Outputs
- **Token validation** before operations
- **Interactive confirmations** for destructive operations
- **Access control** via whitelist/blacklist mechanisms
- **Pre-commit hooks** with security checks
- **Error handling** for API failures and network issues
This repo currently contains some tracked local environment files and generated outputs from older workflows. The `.gitignore` now protects against future accidental additions, but tracked files remain tracked until they are intentionally removed from version control in a separate change.
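To see which files are still tracked even though an ignore rule now covers them, something like the following can help. It is a sketch using standard `git` commands; the function names are hypothetical and nothing here is part of the repo's tooling.

```bash
# Hypothetical helper: list files that are in the index but match a current
# ignore rule. $1 is the repository directory (defaults to the current one).
list_tracked_ignored() {
  git -C "${1:-.}" ls-files --cached --ignored --exclude-standard
}

# Hypothetical helper: stop tracking path $2 in repository $1 while leaving
# the working-tree copy on disk.
untrack_keep_local() {
  git -C "$1" rm --cached --quiet -- "$2"
}
```

Running `list_tracked_ignored .` is read-only. `untrack_keep_local` stages a deletion, so it belongs in the separate, intentional cleanup change described above.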
## 🎨 Visual Features
## Contributing
- **Rich logging** with colors and emojis
- **Progress tracking** with percentage indicators
- **Summary reports** with success/failure statistics
- **Interactive menus** for ease of use
- Keep new docs in `docs/` unless they are tightly scoped to a subproject.
- Prefer adding a short README to a subdirectory instead of expanding the root README with niche workflow details.
- Validate shell scripts with:
## 📋 Pre-commit Hooks
This project uses [pre-commit](https://pre-commit.com/) to ensure code quality and consistent formatting for all contributors.
### First-time setup
1. Run the setup script (recommended):
```bash
./setup.sh
```
This will install pre-commit (if not already installed) and set up the git hooks for you.
2. Or, manually install pre-commit and the hooks:
```bash
pip install pre-commit
pre-commit install
```
### Running hooks manually
To check all files with pre-commit hooks:
```bash
pre-commit run --all-files
```
### CI Enforcement
All commits and pull requests are checked with pre-commit hooks in CI. You must pass these checks to merge code.
## 🤝 Contributing
1. Ensure all scripts pass shellcheck validation
2. Follow existing code conventions and patterns
3. Add appropriate error handling and logging
4. Test scripts thoroughly before committing
5. Update documentation for new features
## 📄 License
This project is designed for internal use by the ByteLyst AI organization.
- When adding new operational scripts, document:
- required environment variables
- destructive behavior
- expected input files
- example usage

86
docs/getting-started.md Normal file

@@ -0,0 +1,86 @@
# Getting Started
This repository is easiest to work with if you first identify which of its three modes you are dealing with:
1. Root Bash tooling for GitHub and repo administration
2. Self-contained Python utilities in subdirectories
3. Generated data, snapshots, and operational inputs
## Quick Orientation
### If You Need GitHub Admin Scripts
Start with:
- `bytelyst-cli.sh`
- `remove_user_interactive.sh`
- `remove_user_guided.sh`
- `remove_user_from_repos.sh`
### If You Need Multi-Repo Git Helpers
Start with:
- `git-work-safety-tools/git_repos_status.sh`
- `git-work-safety-tools/git_repos_rebase_commit_push.sh`
- `git-work-safety-tools/multi_repo_safe_push.sh`
### If You Need A Side Project
Use the local README in:
- `Slack Message/`
- `youtube/`
- `supabase monitor/`
## Root Setup
### Prerequisites
- `bash`
- `curl`
- `jq`
### Setup Command
```bash
./setup.sh
```
### GitHub Authentication
Most root scripts expect:
```bash
export GITHUB_TOKEN=your_token_here
```
Some workflows also rely on repo-local JSON input files such as `github_repos.json` or `github_acc_input.json`.
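A preflight check for these prerequisites might look like the sketch below. The function names `check_deps` and `check_token` are hypothetical; the root scripts may already perform their own checks in a different way.

```bash
# Hypothetical helper: report any command in "$@" that is not on PATH.
# Returns non-zero if at least one is missing.
check_deps() {
  missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      printf 'missing dependency: %s\n' "$cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Hypothetical helper: fail if GITHUB_TOKEN is unset or empty.
# The token value itself is never printed.
check_token() {
  if [ -z "${GITHUB_TOKEN:-}" ]; then
    printf 'GITHUB_TOKEN is not set\n' >&2
    return 1
  fi
}
```

A wrapper could start with `check_deps bash curl jq || exit 1` followed by `check_token || exit 1` so failures surface before any API call is made.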
## Recommended First Commands
```bash
git status --short --branch
./bytelyst-cli.sh help
pre-commit run --all-files
```
## Operational Safety
- Use interactive or dry-run flows before bulk updates.
- Read the script you are about to run if it modifies collaborators, teams, or repository visibility.
- Treat tracked `.env` and JSON account files as sensitive operational data.
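Where a script has no dry-run mode of its own, a small wrapper can provide one. This is a sketch under stated assumptions: `DRY_RUN` and `run` are hypothetical names, not flags or functions that the existing scripts are known to support.

```bash
# Hypothetical wrapper: print the command instead of executing it unless
# DRY_RUN is explicitly set to 0. Defaulting to dry-run is the safer choice
# for bulk or destructive operations.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '[dry-run] %s\n' "$*"
  else
    "$@"
  fi
}
```

With this in place, `run ./some_script.sh` only prints the command, and `DRY_RUN=0` must be set deliberately to execute it.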
## Where To Put New Things
- New repo-level onboarding or orientation docs: `docs/`
- New shell tools for GitHub admin: repo root, or `git-work-safety-tools/` if they are multi-repo specific
- New focused subproject docs: inside that subproject directory
## Before Opening A PR
```bash
pre-commit run --all-files
```
If you changed navigation or documentation, also verify that `README.md`, `docs/repo-map.md`, and `AGENTS.md` still agree.

138
docs/repo-map.md Normal file

@@ -0,0 +1,138 @@
# Repository Map
High-level map of the repository for contributors and AI agents.
## Top Level
### Core Bash Tooling
- `bytelyst-cli.sh`
- Main CLI entry point for common GitHub operations.
- `remove_user_interactive.sh`
- Interactive collaborator removal across matching repositories.
- `remove_user_guided.sh`
- Guided UX wrapper for the removal workflow.
- `remove_user_from_repos.sh`
- Non-guided removal script.
- `delete_team_interactive.sh`
- Interactive team deletion.
- `make_repos_private.sh`
- Repository visibility changes.
- `list_*.sh`
- Read-oriented listing and reporting utilities.
### Supporting Data And Inputs
- `github_repos.json`
- `github_acc_input.json`
- `repos.json`
- `repos.txt`
- `users_white_list.json`
- `users_black_list.json`
- account snapshot JSON files in the root
These are operational inputs or snapshots, not general-purpose source files.
## Directories
### `docs/`
Canonical onboarding and repo navigation docs.
Current key files:
- `docs/getting-started.md`
- `docs/repo-map.md`
- `docs/remove_user_interactive.md`
### `git-work-safety-tools/`
Scripts for scanning many git repositories and performing safer bulk workflows.
Key files:
- `git_repos_status.sh`
- `git_repos_rebase_commit_push.sh`
- `multi_repo_safe_push.sh`
- `multi_repo_status.sh`
- `multi_repo_interactive_fix.sh`
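The scanning step behind these helpers can be sketched as follows. This is an illustration of the general approach (find every `.git` directory, then inspect each repository independently), not a copy of the listed scripts; all function names are hypothetical.

```bash
# Hypothetical helper: print the root of every git repository under $1
# (defaults to the current directory), one per line, sorted.
find_git_repos() {
  find "${1:-.}" -type d -name .git -prune -print | sed 's|/\.git$||' | sort
}

# Hypothetical helper: count uncommitted or untracked paths in repository $1.
count_dirty() {
  git -C "$1" status --porcelain | wc -l | tr -d ' '
}

# Hypothetical helper: one summary line per repository found under $1.
report_repos() {
  find_git_repos "$1" | while IFS= read -r repo; do
    printf '%s\t%s uncommitted\n' "$repo" "$(count_dirty "$repo")"
  done
}
```

`report_repos ~/code` would print one line per repository. Because the functions only read state, this kind of scan is safe to run before any rebase or push step.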
### `github_access_scripts/`
Focused GitHub access checks.
Key files:
- `check_repo_access.sh`
- `list_user_repos.sh`
### `github_repo_scanners/`
Repo scanning scripts plus generated JSON outputs.
Key files:
- `create_user_repo_lists.sh`
- `create_contributor_repo_lists.sh`
- `run_contributor_json_creation.sh`
- `contributor_repos/*.json`
### `Slack Message/`
Python Slack CLI utility.
Key files:
- `slack_poster.py`
- `README.md`
- `requirements.txt`
### `youtube/`
YouTube transcript and summarization utilities.
Key files:
- `transcribe_yt_video.py`
- `enhanced_yt_transcript.py`
- `summarize_with_perplexity.py`
- `README_youtube_transcripts.md`
- `captions/`
- `prompts/`
### `supabase monitor/`
Separate Python workflow project for YouTube processing. The folder name is misleading; inspect local docs before changing code.
Key files:
- `README.md`
- `EXECUTION_GUIDE.md`
- `workflow.py`
- `api.py`
- `agents/`
- `utils/`
- `output/`
### `_AZURE/`
Account-specific notes and operational docs.
Treat contents as environment-specific documentation rather than core project source.
## Legacy Docs
These still exist but are no longer the best starting point:
- `README_interactive_script.md`
- `README_remove_user_script.md`
Prefer `docs/` for new canonical documentation.
## Cleanup Guidance
When cleaning this repo:
- do not delete tracked outputs or snapshots unless explicitly asked
- prefer adding orientation docs over moving operational files
- avoid renaming legacy scripts unless you also provide compatibility shims or update all references