Multi-Agent YouTube Processing Workflow

A comprehensive multi-agent workflow built with CrewAI that processes YouTube videos through transcription, translation, summarization, and local output.

🎯 Overview

This project demonstrates a complete end-to-end workflow using CrewAI agents to:

  1. Transcribe YouTube videos using OpenAI Whisper
  2. Translate transcripts to target languages using LLM APIs
  3. Summarize translated content based on custom prompts
  4. Save final summaries to local files

🏗️ Architecture

Agents

  1. Transcriber Agent - Extracts audio from YouTube videos and generates transcripts
  2. Translator Agent - Translates transcripts between languages
  3. Summarizer Agent - Creates summaries based on custom prompts
  4. Publisher Agent - Saves final content to local files

Workflow

graph TD
    A[YouTube URL] --> B[Transcriber Agent]
    B --> C[Transcript]
    C --> D[Translator Agent]
    D --> E[Translated Text]
    E --> F[Summarizer Agent]
    F --> G[Summary]
    G --> H[Publisher Agent]
    H --> I[Local Files]
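
The diagram above can be read as a plain chain of stages, each consuming the previous stage's output. The following is a minimal sketch of that ordering only; the functions `transcribe`, `translate`, `summarize`, and `run_pipeline`, and the result keys, are hypothetical stand-ins and not the project's actual agents.

```python
# Hypothetical stand-ins for the agents. The real agents call Whisper,
# an LLM API, and the file system; these only illustrate the data flow.
def transcribe(url: str) -> str:
    return f"transcript of {url}"


def translate(text: str, language: str) -> str:
    return f"[{language}] {text}"


def summarize(text: str, prompt: str) -> str:
    return f"summary ({prompt}): {text}"


def run_pipeline(url: str, language: str, prompt: str) -> dict:
    """Run the stages in diagram order and keep every intermediate result."""
    transcript = transcribe(url)
    translated = translate(transcript, language)
    summary = summarize(translated, prompt)
    return {
        "transcript": transcript,
        "translated_text": translated,
        "summary": summary,
    }
```

Keeping the intermediate results makes it easier to see which stage produced a bad output when a run goes wrong.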

🚀 Quick Start

Prerequisites

  • Python 3.8+
  • FFmpeg installed on your system
  • Valid API keys (see Configuration section)

Installation

  1. Clone the repository

    git clone <repository-url>
    cd multi-agent-workflow
    
  2. Install dependencies

    pip install -r requirements.txt
    
  3. Install FFmpeg (required for audio processing)

    • Windows: Download from ffmpeg.org
    • macOS: brew install ffmpeg
    • Ubuntu: sudo apt update && sudo apt install ffmpeg

  4. Configure environment variables

    cp env.example .env
    # Edit .env with your API keys
    

Configuration

Create a .env file with the following variables:

# Perplexity API Configuration
PERPLEXITY_API_KEY=your_perplexity_api_key_here

# Local output will be saved to ./output/ directory

# Optional: OpenAI API Key (as backup LLM)
OPENAI_API_KEY=your_openai_api_key_here
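
As an illustration of how these variables might be read at startup, here is a minimal sketch. It assumes the values from .env have already been exported into the process environment (how the project loads the file is not stated here), and `load_api_keys` is a hypothetical helper, not part of config.py.

```python
import os


def load_api_keys(env=None):
    """Return the API keys found in the environment.

    PERPLEXITY_API_KEY is treated as required and OPENAI_API_KEY as
    optional, mirroring the template above. This is an assumption about
    how the keys are used, not the project's confirmed behavior.
    """
    env = os.environ if env is None else env
    perplexity_key = env.get("PERPLEXITY_API_KEY", "").strip()
    if not perplexity_key:
        raise RuntimeError("PERPLEXITY_API_KEY not found")
    # An empty or missing backup key is reported as None.
    openai_key = env.get("OPENAI_API_KEY", "").strip() or None
    return {"perplexity": perplexity_key, "openai": openai_key}
```

Failing early on a missing required key gives a clearer error than a rejected API call later in the workflow.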

API Keys Setup

Perplexity API

  1. Visit Perplexity AI
  2. Sign up and get your API key
  3. Add it to your .env file as PERPLEXITY_API_KEY

OpenAI API (Backup)

  1. Visit OpenAI Platform
  2. Create an API key
  3. Add it to your .env file as OPENAI_API_KEY

Local Output

Output files will be automatically saved to the ./output/ directory in JSON and TXT formats.
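
The following sketch shows one way a summary could be written in both formats. The function name `save_summary`, the file stem, and the JSON layout are assumptions made for illustration; the actual publisher agent may name and structure its files differently.

```python
import json
from pathlib import Path


def save_summary(summary: str, metadata: dict,
                 output_dir: str = "output", stem: str = "summary"):
    """Write the summary as <stem>.json and <stem>.txt inside output_dir."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)  # create ./output/ on first use

    json_path = out / f"{stem}.json"
    txt_path = out / f"{stem}.txt"

    # ensure_ascii=False keeps translated, non-ASCII text readable in the file.
    json_path.write_text(
        json.dumps({"summary": summary, "metadata": metadata},
                   ensure_ascii=False, indent=2),
        encoding="utf-8",
    )
    txt_path.write_text(summary, encoding="utf-8")
    return json_path, txt_path
```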

📖 Usage Examples

Command Line Interface

Process a complete YouTube video:

python workflow.py \
  "https://www.youtube.com/watch?v=example" \
  "Spanish" \
  "Summarize in 5 bullet points for students to revise quickly"

Python Script Usage

from workflow import YouTubeProcessingWorkflow

# Initialize workflow
workflow = YouTubeProcessingWorkflow()

# Process video
results = workflow.process_youtube_video(
    youtube_url="https://www.youtube.com/watch?v=example",
    target_language="Spanish", 
    summarization_prompt="Summarize in 5 bullet points for students to revise quickly"
)

# Print results
workflow.print_workflow_summary(results)

REST API Usage

Start the API Server

python api.py

The server will start on http://localhost:5000

Process Video via API

curl -X POST http://localhost:5000/process \
  -H "Content-Type: application/json" \
  -d '{
    "youtube_url": "https://www.youtube.com/watch?v=example",
    "target_language": "Spanish",
    "summarization_prompt": "Summarize in 5 bullet points for students to revise quickly",
    "metadata": {
      "user_id": "student_123",
      "course": "Data Science 101"
    }
  }'
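
The same request can be built from Python using only the standard library. This is a sketch under the assumption that the server accepts exactly the JSON fields shown in the curl example; `build_process_request` is a hypothetical helper, and the shape of the response is not documented here, so it is left unparsed beyond JSON decoding.

```python
import json
import urllib.request


def build_process_request(youtube_url, target_language, summarization_prompt,
                          metadata=None, base_url="http://localhost:5000"):
    """Build a POST request for /process with a JSON body."""
    payload = {
        "youtube_url": youtube_url,
        "target_language": target_language,
        "summarization_prompt": summarization_prompt,
    }
    if metadata is not None:
        payload["metadata"] = metadata
    return urllib.request.Request(
        f"{base_url}/process",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Sending the request requires the API server to be running:
# request = build_process_request(
#     "https://www.youtube.com/watch?v=example", "Spanish",
#     "Summarize in 5 bullet points for students to revise quickly")
# with urllib.request.urlopen(request) as response:
#     results = json.loads(response.read())
```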

Individual Operations

Transcribe only:

curl -X POST http://localhost:5000/transcribe \
  -H "Content-Type: application/json" \
  -d '{"youtube_url": "https://www.youtube.com/watch?v=example"}'

Translate text:

curl -X POST http://localhost:5000/translate \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Your text here",
    "target_language": "Spanish"
  }'

Summarize text:

curl -X POST http://localhost:5000/summarize \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Your text here",
    "summarization_prompt": "Summarize in 5 bullet points"
  }'

📁 Project Structure

multi-agent-workflow/
├── agents/                    # Agent implementations
│   ├── __init__.py
│   ├── transcriber_agent.py   # YouTube transcription
│   ├── translator_agent.py    # Language translation  
│   ├── summarizer_agent.py    # Content summarization
│   └── publisher_agent.py     # Local file output
├── utils/                     # Utility modules
│   ├── __init__.py
│   └── speech_processing.py  # Audio processing utilities
├── config.py                  # Configuration management
├── workflow.py                # Main workflow orchestration
├── api.py                     # REST API interface
├── requirements.txt           # Python dependencies
├── env.example               # Environment variables template
└── README.md                 # This file

🔧 Customization

Adding Custom Prompts

You can customize summarization prompts for different use cases:

# Educational summary
educational_prompt = "Summarize in 5 bullet points for students to revise quickly"

# Business summary  
business_prompt = "Create a 3-point executive summary highlighting key business insights"

# Creative summary
creative_prompt = "Rewrite as an engaging story with dialogue and vivid descriptions"
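
If several prompts are in use, a small lookup table keeps them in one place. The names `PROMPTS` and `get_prompt` below are hypothetical and only illustrate the idea; the resulting string is what would be passed as summarization_prompt.

```python
# Hypothetical registry of named prompts, using the examples above.
PROMPTS = {
    "educational": "Summarize in 5 bullet points for students to revise quickly",
    "business": "Create a 3-point executive summary highlighting key business insights",
    "creative": "Rewrite as an engaging story with dialogue and vivid descriptions",
}


def get_prompt(use_case: str, default: str = "educational") -> str:
    """Return the prompt for a use case, falling back to the default entry."""
    return PROMPTS.get(use_case.strip().lower(), PROMPTS[default])
```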

Modifying Agent Behavior

Each agent can be customized in its respective file:

  • Transcriber: Modify the YouTubeTranscriber class in utils/speech_processing.py
  • Translator: Update the translation logic in agents/translator_agent.py
  • Summarizer: Customize the summarization prompts in agents/summarizer_agent.py
  • Publisher: Modify the local file output in agents/publisher_agent.py

Adding New Languages

The translation system supports 100+ languages. Pass the language by name as the target_language argument, for example:

supported_languages = [
    "Spanish", "French", "German", "Italian", "Portuguese",
    "Chinese", "Japanese", "Korean", "Arabic", "Russian", 
    "Dutch", "Swedish", "Norwegian", "Danish", "Finnish"
]

🐛 Troubleshooting

Common Issues

FFmpeg Not Found

Error: ffmpeg not found

Solution: Install FFmpeg and ensure it's in your system PATH.
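
To confirm whether FFmpeg is visible on PATH from Python, a quick check along these lines can help. `find_ffmpeg` and `require_ffmpeg` are hypothetical helpers written for this example; `shutil.which` is the standard-library lookup they rely on.

```python
import shutil


def find_ffmpeg(executable: str = "ffmpeg"):
    """Return the full path to the executable if it is on PATH, else None."""
    return shutil.which(executable)


def require_ffmpeg() -> str:
    """Raise a descriptive error when FFmpeg cannot be located."""
    path = find_ffmpeg()
    if path is None:
        raise RuntimeError(
            "ffmpeg not found: install FFmpeg and add it to your PATH"
        )
    return path
```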

Whisper Model Download Issues

Error downloading Whisper model

Solution: Check your internet connection and ensure sufficient disk space. Model files range from under 100 MB for tiny to roughly 3 GB for large.

API Key Errors

Error: PERPLEXITY_API_KEY not found

Solution: Verify that your .env file defines PERPLEXITY_API_KEY with a valid key.

YouTube Access Issues

Error extracting audio from YouTube video

Solution:

  • Ensure the video is public and accessible
  • Check if the video has age restrictions
  • Verify the URL format is correct
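
A URL format check can be done before any download is attempted. The sketch below covers only two common shapes (youtube.com/watch?v=... and youtu.be/...); other forms such as embed or shorts links are deliberately not handled, and `extract_video_id` is a hypothetical helper, not part of the project.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str):
    """Return the video ID for a recognized YouTube URL, or None."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    if host == "youtu.be":
        # Short links carry the ID in the path: https://youtu.be/<id>
        video_id = parsed.path.lstrip("/")
    elif host in ("youtube.com", "m.youtube.com") and parsed.path == "/watch":
        # Standard links carry the ID in the "v" query parameter.
        video_id = parse_qs(parsed.query).get("v", [""])[0]
    else:
        return None
    return video_id or None
```

A None result means the URL is not in one of the recognized forms; it says nothing about whether the video is public or age-restricted.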

Debug Mode

Enable debug logging for detailed error information:

import logging
logging.basicConfig(level=logging.DEBUG)

📊 Performance Tips

  1. Model Selection: Use smaller Whisper models (tiny, base) for faster processing
  2. Batch Processing: Process multiple videos using the API for better throughput
  3. Caching: Implement caching for repeated transcriptions of the same video
  4. Async Processing: Use async/await patterns for large-scale deployments
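
As an illustration of the caching tip, the sketch below memoizes transcripts in memory, keyed by URL. `TranscriptCache` is a hypothetical class and the transcription function is passed in by the caller; a real deployment would likely want a persistent store and a key based on the video ID instead.

```python
from typing import Callable, Dict


class TranscriptCache:
    """In-memory cache that transcribes each URL at most once."""

    def __init__(self, transcribe: Callable[[str], str]):
        self._transcribe = transcribe
        self._store: Dict[str, str] = {}

    def get(self, youtube_url: str) -> str:
        # Only call the (expensive) transcription function on a cache miss.
        if youtube_url not in self._store:
            self._store[youtube_url] = self._transcribe(youtube_url)
        return self._store[youtube_url]
```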

🧪 Testing

Run the test suite:

# Test individual components
python -m pytest tests/

# Test complete workflow
python test_workflow.py

📄 API Reference

Main Workflow Class

YouTubeProcessingWorkflow

Methods:

  • process_youtube_video(youtube_url, target_language, summarization_prompt, metadata=None)
  • print_workflow_summary(results)

REST API Endpoints

POST /process

Complete video processing workflow

POST /transcribe

YouTube video transcription only

POST /translate

Text translation to target language

POST /summarize

Text summarization based on prompt

GET /health

Health check endpoint

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature-name
  3. Commit changes: git commit -am 'Add feature'
  4. Push to branch: git push origin feature-name
  5. Submit a Pull Request

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

📞 Support

For support and questions:

  • Create an issue on GitHub
  • Contact the development team
  • Check the troubleshooting section above