YouTube Processing Workflow - Complete Execution Guide

📋 Table of Contents

  1. Overview
  2. Prerequisites
  3. Installation
  4. Configuration
  5. Execution Methods
  6. Expected Output
  7. Troubleshooting
  8. Examples

🎯 Overview

This is a multi-agent YouTube processing workflow that:

  • Transcribes YouTube videos using OpenAI Whisper
  • Translates transcripts to target languages using LLM APIs
  • Summarizes content based on custom prompts
  • Saves results to local files in JSON and TXT formats

🔧 Prerequisites

System Requirements

  • Python 3.8+ installed
  • FFmpeg installed and in system PATH
  • Internet connection for API calls and video processing
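The two local requirements above can be checked from Python before anything else is installed. The following is a minimal sketch rather than part of the project; the function name `check_prerequisites` is made up for illustration, and it only confirms that an `ffmpeg` executable is on PATH, not that it works.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in this guide


def check_prerequisites():
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    found = tuple(sys.version_info[:2])
    if found < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {found[0]}.{found[1]}"
        )
    # shutil.which searches the directories on PATH, as the shell would.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print("Problem:", issue)
    if not issues:
        print("Python version and FFmpeg look fine.")
```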

API Keys Required

  • Perplexity API Key (primary LLM)
  • OpenAI API Key (backup LLM)

📦 Installation

1. Install Python Dependencies

pip install -r requirements.txt

2. Install FFmpeg

  • Windows: Download from ffmpeg.org and add to PATH
  • macOS: brew install ffmpeg
  • Ubuntu: sudo apt update && sudo apt install ffmpeg

3. Verify Installation

python -c "import crewai, openai, whisper, yt_dlp; print('All dependencies installed successfully!')"
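The one-liner above stops at the first failed import, so it reports only one missing package at a time. As an alternative sketch, the helper below (a hypothetical `missing_modules`, not shipped with the project) lists every module from that command that cannot be located, without importing any of them.

```python
import importlib.util

# Module names taken from the verification command in this guide.
REQUIRED_MODULES = ["crewai", "openai", "whisper", "yt_dlp"]


def missing_modules(names):
    """Return the names that cannot be found on the import path."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All dependencies installed successfully!")
```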

⚙️ Configuration

1. Create Environment File

Copy the example environment file:

cp env.example .env

2. Add API Keys

Edit the .env file with your API keys:

# Perplexity API Configuration
PERPLEXITY_API_KEY=your_perplexity_api_key_here

# Optional: OpenAI API Key (as backup LLM)
OPENAI_API_KEY=your_openai_api_key_here

3. Get API Keys

Perplexity API

  1. Visit Perplexity AI
  2. Sign up and get your API key
  3. Add it to your .env file

OpenAI API (Backup)

  1. Visit OpenAI Platform
  2. Create an API key
  3. Add it to your .env file
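This guide describes Perplexity as the primary LLM and OpenAI as the backup. How the workflow actually chooses between them is not documented here, so the following is only an illustrative sketch of one possible fallback rule; `pick_llm_provider` and the returned labels are assumptions, not project code.

```python
import os


def pick_llm_provider(env=None):
    """Return 'perplexity' or 'openai' depending on which key is set.

    Hypothetical rule: prefer the primary key, fall back to the backup.
    """
    env = os.environ if env is None else env
    if env.get("PERPLEXITY_API_KEY", "").strip():
        return "perplexity"
    if env.get("OPENAI_API_KEY", "").strip():
        return "openai"
    raise RuntimeError(
        "No LLM API key configured: set PERPLEXITY_API_KEY or OPENAI_API_KEY"
    )
```

Passing a plain dictionary instead of `os.environ` makes the rule easy to try out without touching the real environment.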

🚀 Execution Methods

Method 1: Simple Example

python example.py

What it does:

  • Uses a demo YouTube video (Rick Roll)
  • Processes in English
  • Creates a 5-bullet-point summary
  • Saves output to ./output/ directory

Method 2: Command Line Workflow

python workflow.py "https://www.youtube.com/watch?v=VIDEO_ID" "English" "Summarize in 5 bullet points for students to revise quickly"

Parameters:

  • youtube_url: Full YouTube video URL
  • target_language: Language for translation (e.g., "English", "Spanish", "French")
  • summarization_prompt: Custom prompt for summary generation
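To make the order of the three positional parameters concrete, here is a sketch of how they could be parsed with `argparse`. It is an assumption about the interface only; the real `workflow.py` may read its arguments differently.

```python
import argparse


def build_parser():
    """Build a parser for the three positional arguments listed above."""
    parser = argparse.ArgumentParser(
        prog="workflow.py",
        description="Transcribe, translate and summarize a YouTube video.",
    )
    parser.add_argument("youtube_url", help="Full YouTube video URL")
    parser.add_argument("target_language", help='e.g. "English", "Spanish"')
    parser.add_argument("summarization_prompt", help="Prompt for the summary")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.youtube_url, args.target_language, args.summarization_prompt)
```

Because all three values are positional, a prompt containing spaces has to be quoted on the command line, as in the examples in this guide.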

Example:

python workflow.py "https://www.youtube.com/watch?v=dQw4w9WgXcQ" "Spanish" "Create a 3-point executive summary"

Method 3: REST API Server

python api.py

Server starts on: http://localhost:5000

API Endpoints:

  • POST /process - Complete video processing
  • POST /transcribe - Transcription only
  • POST /translate - Translation only
  • POST /summarize - Summarization only
  • GET /health - Health check

Method 4: Demo with Multiple Examples

python demo.py

Method 5: Run Tests

python test.py

📊 Expected Output

File Structure

output/
├── YYYYMMDD_HHMMSS_VIDEOID_LANGUAGE.json
└── YYYYMMDD_HHMMSS_VIDEOID_LANGUAGE.txt
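The naming pattern above can be reproduced with `datetime.strftime`. This is a sketch of the pattern only; `output_paths` is a hypothetical helper, and the project may assemble its file names differently.

```python
from datetime import datetime
from pathlib import Path


def output_paths(video_id, language, when=None, output_dir="output"):
    """Return (json_path, txt_path) following YYYYMMDD_HHMMSS_VIDEOID_LANGUAGE."""
    when = datetime.now() if when is None else when
    stem = f"{when.strftime('%Y%m%d_%H%M%S')}_{video_id}_{language}"
    base = Path(output_dir)
    return base / f"{stem}.json", base / f"{stem}.txt"


if __name__ == "__main__":
    json_path, txt_path = output_paths(
        "dQw4w9WgXcQ", "English", datetime(2025, 1, 29, 14, 30, 22)
    )
    print(json_path.name)
    print(txt_path.name)
```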

JSON Output Format

{
  "summary": "Complete summary based on your prompt...",
  "metadata": {
    "youtube_url": "https://www.youtube.com/watch?v=example",
    "target_language": "English",
    "original_transcript_length": 1848,
    "translated_text_length": 1848,
    "workflow_timestamp": "1759118170.3120418",
    "example_run": true,
    "source": "example.py"
  },
  "timestamp": "20250129_143022",
  "type": "youtube_summary",
  "workflow_version": "1.0"
}
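A downstream script can read this file with the standard `json` module. The sketch below assumes the field names shown in the sample above and an `output` directory relative to the working directory; `latest_json` and `load_summary` are illustrative names.

```python
import json
from pathlib import Path


def latest_json(output_dir="output"):
    """Return the newest *.json file by name, or None if there is none."""
    # The YYYYMMDD_HHMMSS prefix makes alphabetical order chronological.
    candidates = sorted(Path(output_dir).glob("*.json"))
    return candidates[-1] if candidates else None


def load_summary(path):
    """Read one result file and return the fields most callers need."""
    with open(path, encoding="utf-8") as handle:
        record = json.load(handle)
    metadata = record.get("metadata", {})
    return {
        "summary": record["summary"],
        "language": metadata.get("target_language"),
        "youtube_url": metadata.get("youtube_url"),
        "timestamp": record.get("timestamp"),
    }


if __name__ == "__main__":
    newest = latest_json()
    if newest is None:
        print("No JSON results found in ./output/")
    else:
        print(load_summary(newest)["summary"])
```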

TXT Output Format

# YouTube Video Summary
**Video:** https://www.youtube.com/watch?v=example
**Language:** English
**Generated:** 20250129_143022

---

Summary based on prompt 'Summarize in 5 bullet points for students to revise quickly':

• Point 1: Key insight or main topic
• Point 2: Important detail or concept
• Point 3: Supporting information
• Point 4: Additional context
• Point 5: Conclusion or takeaway

---

**Metadata:**
{
  "youtube_url": "https://www.youtube.com/watch?v=example",
  "target_language": "English",
  "original_transcript_length": 1848,
  "translated_text_length": 1848,
  "workflow_timestamp": "1759118170.3120418",
  "example_run": true,
  "source": "example.py"
}

Console Output

YouTube Processing Workflow - Simple Example
=======================================================

Configuration looks good!

Processing: https://www.youtube.com/watch?v=dQw4w9WgXcQ
Target Language: English
Summary Prompt: Summarize in 5 bullet points for students to revise quickly

Running workflow...
Starting transcription...
Starting translation to English...
Starting summarization...
Starting local file publishing...
Workflow completed!

================================================================================
YOUTUBE PROCESSING WORKFLOW SUMMARY
================================================================================
YouTube URL: https://www.youtube.com/watch?v=dQw4w9WgXcQ
Target Language: English
Summary Prompt: Summarize in 5 bullet points for students to revise quickly
Overall Success: True

STAGE DETAILS:

TRANSCRIPTION:
  Success: True
  Content Preview: Never gonna give you up, never gonna let you down...

TRANSLATION:
  Success: True
  Content Preview: Never gonna give you up, never gonna let you down...

SUMMARIZATION:
  Success: True
  Content Preview: • This is a famous song by Rick Astley
• The song is about commitment and loyalty in relationships...

PUBLISHING:
  Success: True
  Output Files:
    - JSON: ./output/20250129_143022_dQw4w9WgXcQ_English.json
    - TXT: ./output/20250129_143022_dQw4w9WgXcQ_English.txt

================================================================================

Example completed successfully!

🐛 Troubleshooting

Common Issues

1. FFmpeg Not Found

Error: ffmpeg not found

Solution: Install FFmpeg and ensure it's in your system PATH.

2. API Key Errors

Error: PERPLEXITY_API_KEY not found

Solution:

  • Check your .env file exists
  • Verify API keys are correctly formatted
  • Ensure no extra spaces or quotes around the keys
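The checks in this list can be automated with a small linter for the `.env` file. This is a sketch using only the standard library; `lint_env_text` is a hypothetical helper, and it applies the strict rule from this guide (no spaces, no quotes) even though some `.env` loaders tolerate both.

```python
def lint_env_text(text, required=("PERPLEXITY_API_KEY",)):
    """Return a list of problems found in the contents of a .env file."""
    problems = []
    seen = set()
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments are fine
        if "=" not in stripped:
            problems.append(f"line {number}: expected KEY=value")
            continue
        key, value = stripped.split("=", 1)
        if key != key.strip() or value != value.strip():
            problems.append(f"line {number}: spaces around '=' for {key.strip()}")
        key, value = key.strip(), value.strip()
        if value.startswith(('"', "'")) or value.endswith(('"', "'")):
            problems.append(f"line {number}: quotes around value of {key}")
        if value.strip("\"'"):
            seen.add(key)  # key has a non-empty value
    for key in required:
        if key not in seen:
            problems.append(f"{key} is missing or empty")
    return problems


if __name__ == "__main__":
    try:
        with open(".env", encoding="utf-8") as handle:
            found = lint_env_text(handle.read())
    except FileNotFoundError:
        found = [".env file does not exist"]
    for problem in found:
        print("Problem:", problem)
```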

3. YouTube Access Issues

Error extracting audio from YouTube video

Solution:

  • Ensure the video is public and accessible
  • Check if the video has age restrictions
  • Verify the URL format is correct
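The URL format can be checked offline before the workflow is started. The sketch below accepts the `watch?v=` form used throughout this guide and the `youtu.be` short form; whether the workflow itself accepts short links is not stated here, and `extract_video_id` is an illustrative name. It cannot detect private or age-restricted videos.

```python
from urllib.parse import parse_qs, urlparse

WATCH_HOSTS = ("www.youtube.com", "youtube.com", "m.youtube.com")


def extract_video_id(url):
    """Return the video ID from a YouTube URL, or None if the format is wrong."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return None
    host = (parsed.hostname or "").lower()
    if host in WATCH_HOSTS and parsed.path == "/watch":
        # parse_qs maps each query key to a list of values.
        values = parse_qs(parsed.query).get("v", [])
        return values[0] if values else None
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None
    return None
```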

4. Whisper Model Download Issues

Error downloading Whisper model

Solution:

  • Check internet connection
  • Ensure sufficient disk space (Whisper checkpoints range from under 100 MB for the tiny model to roughly 3 GB for the large model)
  • Try running again (models are cached after first download)

5. Import Errors

ImportError: No module named 'crewai'

Solution:

pip install -r requirements.txt

Debug Mode

Enable debug logging for detailed error information:

import logging
logging.basicConfig(level=logging.DEBUG)

📝 Examples

Example 1: Educational Summary

python workflow.py "https://www.youtube.com/watch?v=example" "English" "Summarize in 5 bullet points for students to revise quickly"

Example 2: Business Summary

python workflow.py "https://www.youtube.com/watch?v=example" "English" "Create a 3-point executive summary highlighting key business insights"

Example 3: Creative Summary

python workflow.py "https://www.youtube.com/watch?v=example" "English" "Rewrite as an engaging story with dialogue and vivid descriptions"

Example 4: Multi-language Processing

# Spanish
python workflow.py "https://www.youtube.com/watch?v=example" "Spanish" "Resumir en 5 puntos clave"

# French
python workflow.py "https://www.youtube.com/watch?v=example" "French" "Résumer en 5 points principaux"

# German
python workflow.py "https://www.youtube.com/watch?v=example" "German" "In 5 Hauptpunkten zusammenfassen"
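The three commands above can also be driven from one script. This sketch builds the same argument lists and runs them one after another with `subprocess.run`; it assumes `workflow.py` is in the current directory, and what its exit code means is not documented here, so the code is only printed.

```python
import subprocess
import sys

# Language and prompt pairs copied from the commands above.
PROMPTS = {
    "Spanish": "Resumir en 5 puntos clave",
    "French": "Résumer en 5 points principaux",
    "German": "In 5 Hauptpunkten zusammenfassen",
}


def build_commands(url, prompts):
    """Return one argument list per language, ready for subprocess.run."""
    return [
        [sys.executable, "workflow.py", url, language, prompt]
        for language, prompt in prompts.items()
    ]


def run_all(url, prompts):
    """Run the workflow once per language, continuing after failures."""
    for command in build_commands(url, prompts):
        # Passing a list avoids shell quoting issues with the prompt text.
        result = subprocess.run(command, check=False)
        print(command[3], "exit code:", result.returncode)


if __name__ == "__main__":
    run_all("https://www.youtube.com/watch?v=example", PROMPTS)
```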

Example 5: API Usage

# Start server
python api.py

# Process video via API
curl -X POST http://localhost:5000/process \
  -H "Content-Type: application/json" \
  -d '{
    "youtube_url": "https://www.youtube.com/watch?v=example",
    "target_language": "English",
    "summarization_prompt": "Summarize in 5 bullet points",
    "metadata": {
      "user_id": "student_123",
      "course": "Data Science 101"
    }
  }'
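The same request can be sent from Python with the standard library. This is a sketch that mirrors the curl call above; the shape of the response is not documented in this guide, so decoding it as JSON is an assumption, and both function names are illustrative.

```python
import json
import urllib.request


def build_process_request(youtube_url, target_language, summarization_prompt,
                          metadata=None, base_url="http://localhost:5000"):
    """Build the POST /process request without sending it."""
    payload = {
        "youtube_url": youtube_url,
        "target_language": target_language,
        "summarization_prompt": summarization_prompt,
    }
    if metadata:
        payload["metadata"] = metadata
    return urllib.request.Request(
        base_url.rstrip("/") + "/process",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def process_video(*args, timeout=600, **kwargs):
    """Send the request and decode the body, assuming the server returns JSON."""
    request = build_process_request(*args, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Transcription can take a while, so the sketch uses a generous timeout; the value of 600 seconds is a guess, not a documented limit.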

📈 Performance Tips

  1. Model Selection: Use smaller Whisper models for faster processing
  2. Batch Processing: Process multiple videos using the API
  3. Caching: Models are cached after first download
  4. Async Processing: Use async/await patterns for large-scale deployments
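As one way to approach the batch processing tip, the sketch below fans a list of URLs out over a small thread pool. `process_one` stands for any callable that handles a single URL, for example a function that posts to the `/process` endpoint; it is a placeholder, not a function provided by the project.

```python
from concurrent.futures import ThreadPoolExecutor


def process_batch(urls, process_one, max_workers=2):
    """Run process_one(url) for each URL and collect results per URL."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {url: pool.submit(process_one, url) for url in urls}
        for url, future in futures.items():
            try:
                results[url] = ("ok", future.result())
            except Exception as error:  # keep going if one video fails
                results[url] = ("error", str(error))
    return results
```

A low `max_workers` value is a cautious default here, since each job may download audio and call rate-limited APIs.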

🔍 Supported Languages

The system supports 100+ languages including:

  • European: English, Spanish, French, German, Italian, Portuguese, Dutch, Swedish, Norwegian, Danish, Finnish
  • Asian: Chinese, Japanese, Korean, Hindi, Thai, Vietnamese
  • Middle Eastern: Arabic, Hebrew, Turkish
  • Others: Russian, Polish, Czech, Hungarian, Romanian

📞 Support

For issues and questions:

  1. Check the troubleshooting section above
  2. Verify all prerequisites are met
  3. Check API key configuration
  4. Review console output for specific error messages

🎉 Success Indicators

Your workflow is working correctly when you see:

  • "Configuration looks good!" message
  • All stages show "Success: True"
  • Output files created in ./output/ directory
  • "Example completed successfully!" message
  • Full summaries (not truncated text)
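The "Success: True" indicator can be checked mechanically against captured console output. This sketch assumes the layout shown in the Console Output section, where an upper-case stage name on its own line is followed by an indented `Success:` line; `stage_results` is an illustrative helper.

```python
import re

STAGE_HEADER = re.compile(r"^([A-Z]+):\s*$")  # e.g. "TRANSCRIPTION:"
SUCCESS_LINE = re.compile(r"^\s*Success:\s*(True|False)\s*$")


def stage_results(console_text):
    """Map each stage name to True or False based on its Success line."""
    results = {}
    current = None
    for line in console_text.splitlines():
        header = STAGE_HEADER.match(line)
        if header:
            current = header.group(1)
            continue
        success = SUCCESS_LINE.match(line)
        if success and current is not None:
            results[current] = success.group(1) == "True"
            current = None  # only the first Success line counts per stage
    return results


def all_stages_ok(console_text):
    """Return True only if at least one stage was found and none failed."""
    results = stage_results(console_text)
    return bool(results) and all(results.values())
```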

Last Updated: January 29, 2025
Version: 1.0
Author: YouTube Processing Workflow Team