From 3a664b5a795a09628b8ebd8e6682caa799df9cd0 Mon Sep 17 00:00:00 2001
From: root
Date: Tue, 5 May 2026 01:05:52 +0000
Subject: [PATCH] Improve repo onboarding and agent guidance

---
 .gitignore              |   9 ++
 AGENTS.md               |  86 +++++++++++++++
 CLAUDE.md               | 102 +++++++-----------
 README.md               | 234 ++++++++++++++++++----------------------
 docs/getting-started.md |  86 +++++++++++++++
 docs/repo-map.md        | 138 ++++++++++++++++++++++++
 6 files changed, 461 insertions(+), 194 deletions(-)
 create mode 100644 AGENTS.md
 create mode 100644 docs/getting-started.md
 create mode 100644 docs/repo-map.md

diff --git a/.gitignore b/.gitignore
index a36238b..458bcfb 100644
--- a/.gitignore
+++ b/.gitignore
@@ -27,3 +27,12 @@ Thumbs.db
 # Temporary files
 *.tmp
 *.temp
+
+# Local environment and secret-bearing files
+.env
+*.env
+
+# Generated outputs and local data caches
+supabase monitor/output/
+youtube/captions/
+github_repo_scanners/contributor_repos/
diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000..3f94bec
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,86 @@
+# AGENTS.md
+
+Guidance for AI coding agents working in this repository.
+
+## Purpose
+
+This repo is a mixed operational workspace, not a single product codebase. Your first job is orientation:
+
+- identify whether the task targets root Bash tooling, a subproject, or generated data
+- avoid treating tracked JSON or output files as canonical source without checking context
+- prefer doc and safety improvements unless the requested task clearly requires behavioral changes
+
+## Canonical Docs
+
+Read these first:
+
+1. `README.md`
+2. `docs/getting-started.md`
+3. `docs/repo-map.md`
+4. `CLAUDE.md` for Claude-specific guidance
+
+## High-Signal Areas
+
+### Primary
+
+- `bytelyst-cli.sh`
+- `remove_user_interactive.sh`
+- `remove_user_guided.sh`
+- `remove_user_from_repos.sh`
+- `git-work-safety-tools/`
+- `github_access_scripts/`
+
+### Secondary Or Self-Contained
+
+- `Slack Message/`
+- `youtube/`
+- `supabase monitor/`
+
+Do not assume these subprojects share the same dependencies or conventions as the root scripts.
+
+## Sensitive And Generated Files
+
+Handle these carefully:
+
+- `accounts.json`
+- account or user snapshot JSON files in the repo root
+- `github_repo_scanners/contributor_repos/*.json`
+- `.env` files
+- generated outputs under `supabase monitor/output/`
+
+These may contain secrets, usernames, or operational snapshots. Avoid printing contents unless necessary for the task.
+
+## Editing Rules
+
+- Prefer small, targeted changes.
+- Preserve operational scripts' current interface unless the task asks for a breaking change.
+- When reorganizing docs, keep old filenames if external references may depend on them.
+- Do not move or delete generated/tracked data files unless explicitly asked.
+- If you add a new directory or script, update `docs/repo-map.md`.
+
+## Recommended Workflow
+
+1. Determine whether the task is about:
+   - GitHub admin Bash scripts
+   - multi-repo git safety tooling
+   - a Python side project
+   - repo documentation or cleanup
+2. Read only the files relevant to that surface.
+3. Check whether there is an existing README in that subdirectory.
+4. Make the smallest coherent change set.
+5. If docs or discoverability changed, update the canonical docs listed above.
+
+## Good First Checks
+
+```bash
+git status --short --branch
+rg --files
+sed -n '1,220p' README.md
+sed -n '1,220p' docs/repo-map.md
+```
+
+## Pitfalls
+
+- The repo contains legacy one-off scripts and tracked local artifacts.
+- Directory names do not always reflect actual function. `supabase monitor/` is a YouTube-processing workflow project.
+- The root README should stay broad; detailed usage belongs in `docs/` or a subdirectory README.
diff --git a/CLAUDE.md b/CLAUDE.md
index 6661efb..7873e7c 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -1,85 +1,55 @@
 # CLAUDE.md
 
-This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+Claude-specific working notes for this repository.
 
-## Repository Overview
+## Read First
 
-This is a collection of Bash-based DevOps tools for GitHub repository management, specifically designed for ByteLyst AI organization. The tools focus on repository security, user management, and multi-repository operations.
+Use these as the canonical orientation docs:
 
-## Core Components
+1. `README.md`
+2. `docs/getting-started.md`
+3. `docs/repo-map.md`
+4. `AGENTS.md`
 
-### Main CLI Interface
-- `bytelyst-cli.sh` - Unified command-line interface for all tools with both CLI and interactive modes
-- Requires `jq` and `curl` dependencies
-- Requires `GITHUB_TOKEN` environment variable for GitHub API access
+This file is intentionally short and should not become a second full repo README.
 
-### Git Safety Tools (`git-work-safety-tools/`)
-- `git_repos_status.sh` - Recursively scans directories for git repositories and shows status (commits, uncommitted files, unpushed changes)
-- `git_repos_rebase_commit_push.sh` - Safely rebases, commits, and pushes changes across multiple repos
-- `multi_repo_*.sh` - Additional multi-repository management utilities
+## Repository Shape
 
-### User Management Tools
-- `remove_user_interactive.sh` - Interactive user removal with wildcard repository matching
-  - Supports patterns like `*-go-api*`, `frontend-*`
-  - Handles both root account and organization repositories
-  - Rich visual logging with progress tracking
-  - Flexible confirmation system (yes/no/all/skip/quit)
+The repo is a mixed workspace:
 
-### Individual Tools
-- `list_all_public_repos.sh` - Lists public repositories for a user
-- `list_orgs_teams_members.sh` - Lists organization teams and members
-- `list_prs_by_user.sh` - Lists pull requests by specific user
-- `list_repos_contributors.sh` - Lists contributors for repositories
-- `make_repos_private.sh` - Converts public repos to private
-- `delete_team_interactive.sh` - Interactive team deletion tool
+- root Bash scripts for GitHub administration and maintenance
+- `git-work-safety-tools/` for multi-repo git workflows
+- `github_access_scripts/` and `github_repo_scanners/` for access scanning
+- self-contained Python utilities in `Slack Message/`, `youtube/`, and `supabase monitor/`
 
-## Development Commands
+Do not assume a single dependency graph or runtime model across the whole repo.
+
+## High-Value Files
+
+- `bytelyst-cli.sh`
+- `remove_user_interactive.sh`
+- `remove_user_guided.sh`
+- `remove_user_from_repos.sh`
+- `git-work-safety-tools/*.sh`
+
+## Safety Notes
+
+- Treat `accounts.json`, account snapshot JSON files, `.env` files, and generated contributor/output data as sensitive.
+- Avoid echoing secret-bearing file contents unless the task requires it.
+- Many scripts are operational and potentially destructive. Preserve prompts, dry-run behavior, and confirmations unless asked to change them.
+
+## Developer Commands
 
-### Setup
 ```bash
 ./setup.sh
-```
-This installs pre-commit hooks and sets up the development environment.
-
-### Code Quality
-```bash
 pre-commit run --all-files
-```
-Runs all pre-commit hooks including shellcheck for shell scripts.
-
-### Testing
-```bash
 ./test.sh
 ```
-Basic test script (currently minimal).
 
-## Architecture Notes
+## Documentation Rule
 
-### Authentication
-All tools use GitHub Personal Access Tokens via `GITHUB_TOKEN` environment variable. The token can be set in:
-- Environment variables
-- `.env` file (loaded automatically by `bytelyst-cli.sh`)
+If you change repo structure, onboarding, or navigation, update:
 
-### Configuration Files
-- `github_repos.json` - Repository configuration for batch operations
-- `github_acc_input.json` - Account input configuration
-- `users_white_list.json` / `users_black_list.json` - User access control lists
-- `repos.json` / `repos.txt` - Repository lists for operations
-
-### Error Handling
-- Tools validate required dependencies (`jq`, `curl`) before execution
-- GitHub API responses are checked for HTTP status codes
-- Interactive confirmations for destructive operations
-
-### Multi-Repository Operations
-The git safety tools are designed to work across multiple repositories by:
-1. Recursively finding all `.git` directories
-2. Operating on each repository independently
-3. Providing summary reports of operations
-
-## Security Considerations
-
-- All tools require explicit confirmation for destructive operations
-- GitHub tokens are validated before operations
-- Pre-commit hooks include security checks (large file detection, trailing whitespace)
-- Repository access is controlled via whitelist/blacklist mechanisms
\ No newline at end of file
+- `README.md`
+- `docs/repo-map.md`
+- `AGENTS.md` if agent navigation is affected
diff --git a/README.md b/README.md
index 9046166..8df7a5f 100644
--- a/README.md
+++ b/README.md
@@ -1,159 +1,137 @@
 # ByteLyst DevOps Tools
 
-A comprehensive collection of Bash-based DevOps tools for GitHub repository management, specifically designed for ByteLyst AI organization. The tools focus on repository security, user management, and multi-repository operations.
+Internal repository for GitHub administration scripts, multi-repo safety helpers, and a few adjacent utility projects used by ByteLyst.
 
-## 🚀 Quick Start
+This repo is not a single application. It is a workspace of operational tools with three main characteristics:
 
-1. **Setup the environment:**
-   ```bash
-   ./setup.sh
-   ```
+- The primary surface area is Bash scripts for GitHub and repository operations.
+- Some subdirectories are self-contained Python utilities with their own setup and runtime expectations.
+- A number of JSON files and outputs are generated artifacts or operational inputs, not source code to edit casually.
 
-2. **Set your GitHub token:**
-   ```bash
-   export GITHUB_TOKEN=your_github_token_here
-   ```
+## Start Here
 
-3. **Run the main CLI:**
-   ```bash
-   ./bytelyst-cli.sh
-   ```
+If you are new to the repo, read these in order:
 
-## 📋 Available Tools
+1. [docs/getting-started.md](docs/getting-started.md)
+2. [docs/repo-map.md](docs/repo-map.md)
+3. [AGENTS.md](AGENTS.md) if you are working through an AI coding agent
+4. [CLAUDE.md](CLAUDE.md) if you are using Claude Code specifically
 
-### 🎯 Main CLI Interface
-- **`bytelyst-cli.sh`** - Unified command-line interface for all tools
-  - Interactive menu mode
-  - Direct command execution
-  - Supports both user and organization operations
+## Primary Entry Points
 
-### 🔐 User Management Tools
-- **`remove_user_interactive.sh`** - Interactive user removal with wildcard repository matching
-  - Supports patterns like `*-go-api*`, `frontend-*`
-  - Handles both root account and organization repositories
-  - Rich visual logging with progress tracking
-  - Flexible confirmation system (yes/no/all/skip/quit)
+### GitHub Operations
 
-### 🛡️ Git Safety Tools (`git-work-safety-tools/`)
-- **`git_repos_status.sh`** - Recursively scan directories for git repositories status
-- **`git_repos_rebase_commit_push.sh`** - Safely rebase, commit, and push changes
-- **`multi_repo_*.sh`** - Additional multi-repository management utilities
+- `./bytelyst-cli.sh`
+  - Main unified CLI for common GitHub admin operations.
+  - Requires `curl`, `jq`, and `GITHUB_TOKEN`.
+- `./remove_user_interactive.sh`
+  - Interactive collaborator-removal workflow with repository pattern matching.
+- `./remove_user_guided.sh`
+  - Guided wrapper around the same removal flow with a more opinionated interactive UX.
+- `./remove_user_from_repos.sh`
+  - Scripted removal flow suitable for repeatable or semi-automated use.
 
-### 📊 Repository Analysis Tools
-- **`list_all_public_repos.sh`** - List all public repositories for a user
-- **`list_orgs_teams_members.sh`** - List organization teams and members
-- **`list_prs_by_user.sh`** - List pull requests by specific user
-- **`list_repos_contributors.sh`** - List contributors for repositories
-- **`list_repos_contributors_by_user.sh`** - List contributors filtered by user
+### Multi-Repo Git Safety
 
-### 🔧 Repository Management Tools
-- **`make_repos_private.sh`** - Convert public repositories to private
-- **`delete_team_interactive.sh`** - Interactive team deletion tool
-- **`cleanup.sh`** - General cleanup utilities
+- `git-work-safety-tools/git_repos_status.sh`
+- `git-work-safety-tools/git_repos_rebase_commit_push.sh`
+- `git-work-safety-tools/multi_repo_safe_push.sh`
+- `git-work-safety-tools/multi_repo_status.sh`
 
-## 🛠️ Prerequisites
+These are for scanning many repositories, checking dirty state, and performing safer batch git workflows.
 
-- **Required tools:** `curl`, `jq`
-- **GitHub Personal Access Token** with permissions:
-  - `repo` (for repository access)
-  - `admin:org` (for organization management)
+## Repository Layout
 
-## 📝 Usage Examples
+### Core Operational Scripts
+
+- Root `*.sh` files
+  - Main Bash-based GitHub and maintenance utilities.
+- `git-work-safety-tools/`
+  - Safer multi-repo git helpers.
+- `github_access_scripts/`
+  - Focused access checks and repo listing utilities.
+- `github_repo_scanners/`
+  - Scripts plus generated repo/contributor JSON outputs.
+
+### Side Projects
+
+- `Slack Message/`
+  - Python CLI for Slack posting and AI-assisted chat.
+- `youtube/`
+  - YouTube transcript and summarization helpers.
+- `supabase monitor/`
+  - Separate Python workflow project for YouTube processing despite the directory name.
+
+### Documentation
+
+- `docs/`
+  - Canonical onboarding and repo-orientation docs.
+- Legacy root docs:
+  - `README_interactive_script.md`
+  - `README_remove_user_script.md`
+
+These older docs are still useful but are no longer the best starting point.
+
+## Setup
+
+### Root Tooling
 
-### Interactive User Removal
 ```bash
+./setup.sh
+```
+
+This installs the local development hooks and prepares the shell-based workflow.
+
+### Required Dependencies
+
+- `bash`
+- `curl`
+- `jq`
+
+### Authentication
+
+Most GitHub-facing scripts require:
+
+```bash
+export GITHUB_TOKEN=your_token_here
+```
+
+Use a token with the minimum permissions required for the task. Many admin flows assume `repo` and `admin:org`.
+
+## Common Commands
+
+```bash
+./bytelyst-cli.sh help
+./bytelyst-cli.sh list-public-repos --user <username>
+./bytelyst-cli.sh list-private-repos --org <orgname>
 ./remove_user_interactive.sh
-# Follow the prompts to:
-# 1. Enter GitHub token
-# 2. Specify root user/organization
-# 3. Enter user to remove
-# 4. Define repository pattern (e.g., *-go-api*)
-# 5. Confirm removals interactively
-```
-
-### CLI Commands
-```bash
-# List public repositories
-./bytelyst-cli.sh list-public-repos --user username
-
-# List private repositories
-./bytelyst-cli.sh list-private-repos --org orgname
-
-# Check collaborators
-./bytelyst-cli.sh check-collaborators --input input.json
-
-# Remove user from all matching repositories
-./bytelyst-cli.sh remove-user-from-all-repos --user username
-```
-
-### Git Repository Status
-```bash
-# Check status of all git repositories in current directory tree
 ./git-work-safety-tools/git_repos_status.sh
-
-# Safe multi-repository operations
-./git-work-safety-tools/multi_repo_safe_push.sh
+pre-commit run --all-files
 ```
 
-## 📁 Configuration Files
+## Operational Safety
 
-- **`github_repos.json`** - Repository configuration for batch operations
-- **`github_acc_input.json`** - Account input configuration
-- **`users_white_list.json`** / **`users_black_list.json`** - User access control lists
-- **`repos.json`** / **`repos.txt`** - Repository lists for operations
+- Treat `accounts.json`, `*.json` account snapshots, `.env` files, and generated collaborator data as potentially sensitive.
+- Prefer dry runs or interactive confirmation flows before bulk removal or visibility changes.
+- Do not assume every tracked JSON file is a stable source file; many are data snapshots or inputs.
+- Review scripts before reuse in automation. Some are one-off operational helpers and may encode assumptions about ByteLyst org structure.
 
-## 🔒 Security Features
+## Notes On Tracked Secrets And Outputs
 
-- **Token validation** before operations
-- **Interactive confirmations** for destructive operations
-- **Access control** via whitelist/blacklist mechanisms
-- **Pre-commit hooks** with security checks
-- **Error handling** for API failures and network issues
+This repo currently contains some tracked local environment files and generated outputs from older workflows. The `.gitignore` now protects against future accidental additions, but tracked files remain tracked until they are intentionally removed from version control in a separate change.
 
-## 🎨 Visual Features
+## Contributing
 
-- **Rich logging** with colors and emojis
-- **Progress tracking** with percentage indicators
-- **Summary reports** with success/failure statistics
-- **Interactive menus** for ease of use
+- Keep new docs in `docs/` unless they are tightly scoped to a subproject.
+- Prefer adding a short README to a subdirectory instead of expanding the root README with niche workflow details.
+- Validate shell scripts with:
 
-## 📋 Pre-commit Hooks
-
-This project uses [pre-commit](https://pre-commit.com/) to ensure code quality and consistent formatting for all contributors.
-
-### First-time setup
-
-1. Run the setup script (recommended):
-   ```bash
-   ./setup.sh
-   ```
-   This will install pre-commit (if not already installed) and set up the git hooks for you.
-
-2. Or, manually install pre-commit and the hooks:
-   ```bash
-   pip install pre-commit
-   pre-commit install
-   ```
-
-### Running hooks manually
-
-To check all files with pre-commit hooks:
 ```bash
 pre-commit run --all-files
 ```
 
-### CI Enforcement
-
-All commits and pull requests are checked with pre-commit hooks in CI. You must pass these checks to merge code.
-
-## 🤝 Contributing
-
-1. Ensure all scripts pass shellcheck validation
-2. Follow existing code conventions and patterns
-3. Add appropriate error handling and logging
-4. Test scripts thoroughly before committing
-5. Update documentation for new features
-
-## 📄 License
-
-This project is designed for internal use by ByteLyst AI organization.
\ No newline at end of file
+- When adding new operational scripts, document:
+  - required environment variables
+  - destructive behavior
+  - expected input files
+  - example usage
diff --git a/docs/getting-started.md b/docs/getting-started.md
new file mode 100644
index 0000000..8c2a460
--- /dev/null
+++ b/docs/getting-started.md
@@ -0,0 +1,86 @@
+# Getting Started
+
+This repository is easiest to work with if you first identify which of its three modes you are dealing with:
+
+1. Root Bash tooling for GitHub and repo administration
+2. Self-contained Python utilities in subdirectories
+3. Generated data, snapshots, and operational inputs
+
+## Quick Orientation
+
+### If You Need GitHub Admin Scripts
+
+Start with:
+
+- `bytelyst-cli.sh`
+- `remove_user_interactive.sh`
+- `remove_user_guided.sh`
+- `remove_user_from_repos.sh`
+
+### If You Need Multi-Repo Git Helpers
+
+Start with:
+
+- `git-work-safety-tools/git_repos_status.sh`
+- `git-work-safety-tools/git_repos_rebase_commit_push.sh`
+- `git-work-safety-tools/multi_repo_safe_push.sh`
+
+### If You Need A Side Project
+
+Use the local README in:
+
+- `Slack Message/`
+- `youtube/`
+- `supabase monitor/`
+
+## Root Setup
+
+### Prerequisites
+
+- `bash`
+- `curl`
+- `jq`
+
+### Setup Command
+
+```bash
+./setup.sh
+```
+
+### GitHub Authentication
+
+Most root scripts expect:
+
+```bash
+export GITHUB_TOKEN=your_token_here
+```
+
+Some workflows also rely on repo-local JSON input files such as `github_repos.json` or `github_acc_input.json`.
+
+## Recommended First Commands
+
+```bash
+git status --short --branch
+./bytelyst-cli.sh help
+pre-commit run --all-files
+```
+
+## Operational Safety
+
+- Use interactive or dry-run flows before bulk updates.
+- Read the script you are about to run if it modifies collaborators, teams, or repository visibility.
+- Treat tracked `.env` and JSON account files as sensitive operational data.
+
+## Where To Put New Things
+
+- New repo-level onboarding or orientation docs: `docs/`
+- New shell tools for GitHub admin: repo root or `git-work-safety-tools/` if multi-repo specific
+- New focused subproject docs: inside that subproject directory
+
+## Before Opening A PR
+
+```bash
+pre-commit run --all-files
+```
+
+If you changed navigation or documentation, also verify that `README.md`, `docs/repo-map.md`, and `AGENTS.md` still agree.
diff --git a/docs/repo-map.md b/docs/repo-map.md
new file mode 100644
index 0000000..8e30af2
--- /dev/null
+++ b/docs/repo-map.md
@@ -0,0 +1,138 @@
+# Repository Map
+
+High-level map of the repository for contributors and AI agents.
+
+## Top Level
+
+### Core Bash Tooling
+
+- `bytelyst-cli.sh`
+  - Main CLI entry point for common GitHub operations.
+- `remove_user_interactive.sh`
+  - Interactive collaborator removal across matching repositories.
+- `remove_user_guided.sh`
+  - Guided UX wrapper for the removal workflow.
+- `remove_user_from_repos.sh`
+  - Non-guided removal script.
+- `delete_team_interactive.sh`
+  - Interactive team deletion.
+- `make_repos_private.sh`
+  - Repository visibility changes.
+- `list_*.sh`
+  - Read-oriented listing and reporting utilities.
+
+### Supporting Data And Inputs
+
+- `github_repos.json`
+- `github_acc_input.json`
+- `repos.json`
+- `repos.txt`
+- `users_white_list.json`
+- `users_black_list.json`
+- account snapshot JSON files in the root
+
+These are operational inputs or snapshots, not general-purpose source files.
+
+## Directories
+
+### `docs/`
+
+Canonical onboarding and repo navigation docs.
+
+Current key files:
+
+- `docs/getting-started.md`
+- `docs/repo-map.md`
+- `docs/remove_user_interactive.md`
+
+### `git-work-safety-tools/`
+
+Scripts for scanning many git repositories and performing safer bulk workflows.
+
+Key files:
+
+- `git_repos_status.sh`
+- `git_repos_rebase_commit_push.sh`
+- `multi_repo_safe_push.sh`
+- `multi_repo_status.sh`
+- `multi_repo_interactive_fix.sh`
+
+### `github_access_scripts/`
+
+Focused GitHub access checks.
+
+Key files:
+
+- `check_repo_access.sh`
+- `list_user_repos.sh`
+
+### `github_repo_scanners/`
+
+Repo scanning scripts plus generated JSON outputs.
+
+Key files:
+
+- `create_user_repo_lists.sh`
+- `create_contributor_repo_lists.sh`
+- `run_contributor_json_creation.sh`
+- `contributor_repos/*.json`
+
+### `Slack Message/`
+
+Python Slack CLI utility.
+
+Key files:
+
+- `slack_poster.py`
+- `README.md`
+- `requirements.txt`
+
+### `youtube/`
+
+YouTube transcript and summarization utilities.
+
+Key files:
+
+- `transcribe_yt_video.py`
+- `enhanced_yt_transcript.py`
+- `summarize_with_perplexity.py`
+- `README_youtube_transcripts.md`
+- `captions/`
+- `prompts/`
+
+### `supabase monitor/`
+
+Separate Python workflow project for YouTube processing. The folder name is misleading; inspect local docs before changing code.
+
+Key files:
+
+- `README.md`
+- `EXECUTION_GUIDE.md`
+- `workflow.py`
+- `api.py`
+- `agents/`
+- `utils/`
+- `output/`
+
+### `_AZURE/`
+
+Account-specific notes and operational docs.
+
+Treat contents as environment-specific documentation rather than core project source.
+
+## Legacy Docs
+
+These still exist but are no longer the best starting point:
+
+- `README_interactive_script.md`
+- `README_remove_user_script.md`
+
+Prefer `docs/` for new canonical documentation.
+
+## Cleanup Guidance
+
+When cleaning this repo:
+
+- do not delete tracked outputs or snapshots unless explicitly asked
+- prefer adding orientation docs over moving operational files
+- avoid renaming legacy scripts unless you also provide compatibility shims or update all references