# YouTube Processing Workflow - Complete Execution Guide

## 📋 Table of Contents

1. [Overview](#overview)
2. [Prerequisites](#prerequisites)
3. [Installation](#installation)
4. [Configuration](#configuration)
5. [Execution Methods](#execution-methods)
6. [Expected Output](#expected-output)
7. [Troubleshooting](#troubleshooting)
8. [Examples](#examples)

## 🎯 Overview

This is a multi-agent YouTube processing workflow that:

- **Transcribes** YouTube videos using OpenAI Whisper
- **Translates** transcripts to target languages using LLM APIs
- **Summarizes** content based on custom prompts
- **Saves** results to local files in JSON and TXT formats

## 🔧 Prerequisites

### System Requirements

- **Python 3.8+** installed
- **FFmpeg** installed and in system PATH
- **Internet connection** for API calls and video processing

### API Keys Required

- **Perplexity API Key** (primary LLM)
- **OpenAI API Key** (backup LLM, optional)

## 📦 Installation

### 1. Install Python Dependencies

```bash
pip install -r requirements.txt
```

### 2. Install FFmpeg

- **Windows**: Download from [ffmpeg.org](https://ffmpeg.org/download.html) and add to PATH
- **macOS**: `brew install ffmpeg`
- **Ubuntu**: `sudo apt update && sudo apt install ffmpeg`

### 3. Verify Installation

```bash
python -c "import crewai, openai, whisper, yt_dlp; print('All dependencies installed successfully!')"
```

## ⚙️ Configuration

### 1. Create Environment File

Copy the example environment file:

```bash
cp env.example .env
```

### 2. Add API Keys

Edit the `.env` file with your API keys:

```env
# Perplexity API Configuration
PERPLEXITY_API_KEY=your_perplexity_api_key_here

# Optional: OpenAI API Key (as backup LLM)
OPENAI_API_KEY=your_openai_api_key_here
```

### 3. Get API Keys

#### Perplexity API

1. Visit [Perplexity AI](https://perplexity.ai/)
2. Sign up and get your API key
3. Add it to your `.env` file

#### OpenAI API (Backup)

1. Visit [OpenAI Platform](https://platform.openai.com/)
2. Create an API key
3. Add it to your `.env` file

## 🚀 Execution Methods

### Method 1: Simple Example (Recommended for Testing)

```bash
python example.py
```

**What it does:**

- Uses a demo YouTube video (Rick Roll)
- Processes in English
- Creates a 5-bullet-point summary
- Saves output to `./output/` directory

### Method 2: Command Line Workflow

```bash
python workflow.py "https://www.youtube.com/watch?v=VIDEO_ID" "English" "Summarize in 5 bullet points for students to revise quickly"
```

**Parameters:**

- `youtube_url`: Full YouTube video URL
- `target_language`: Language for translation (e.g., "English", "Spanish", "French")
- `summarization_prompt`: Custom prompt for summary generation

**Example:**

```bash
python workflow.py "https://www.youtube.com/watch?v=dQw4w9WgXcQ" "Spanish" "Create a 3-point executive summary"
```

### Method 3: REST API Server

```bash
python api.py
```

**Server starts on:** `http://localhost:5000`

**API Endpoints:**

- `POST /process` - Complete video processing
- `POST /transcribe` - Transcription only
- `POST /translate` - Translation only
- `POST /summarize` - Summarization only
- `GET /health` - Health check

### Method 4: Demo with Multiple Examples

```bash
python demo.py
```

### Method 5: Run Tests

```bash
python test.py
```

## 📊 Expected Output

### File Structure

```
output/
├── YYYYMMDD_HHMMSS_VIDEOID_LANGUAGE.json
└── YYYYMMDD_HHMMSS_VIDEOID_LANGUAGE.txt
```

### JSON Output Format

```json
{
  "summary": "Complete summary based on your prompt...",
  "metadata": {
    "youtube_url": "https://www.youtube.com/watch?v=example",
    "target_language": "English",
    "original_transcript_length": 1848,
    "translated_text_length": 1848,
    "workflow_timestamp": "1759118170.3120418",
    "example_run": true,
    "source": "example.py"
  },
  "timestamp": "20250129_143022",
  "type": "youtube_summary",
  "workflow_version": "1.0"
}
```

### TXT Output Format

```
# YouTube Video Summary

**Video:** https://www.youtube.com/watch?v=example
**Language:** English
**Generated:** 20250129_143022

---

Summary based on prompt 'Summarize in 5 bullet points for students to revise quickly':

• Point 1: Key insight or main topic
• Point 2: Important detail or concept
• Point 3: Supporting information
• Point 4: Additional context
• Point 5: Conclusion or takeaway

---

**Metadata:**
{
  "youtube_url": "https://www.youtube.com/watch?v=example",
  "target_language": "English",
  "original_transcript_length": 1848,
  "translated_text_length": 1848,
  "workflow_timestamp": "1759118170.3120418",
  "example_run": true,
  "source": "example.py"
}
```

### Console Output

```
YouTube Processing Workflow - Simple Example
=======================================================
Configuration looks good!
Processing: https://www.youtube.com/watch?v=dQw4w9WgXcQ
Target Language: English
Summary Prompt: Summarize in 5 bullet points for students to revise quickly
Running workflow...
Starting transcription...
Starting translation to English...
Starting summarization...
Starting local file publishing...
Workflow completed!
================================================================================
YOUTUBE PROCESSING WORKFLOW SUMMARY
================================================================================
YouTube URL: https://www.youtube.com/watch?v=dQw4w9WgXcQ
Target Language: English
Summary Prompt: Summarize in 5 bullet points for students to revise quickly
Overall Success: True
STAGE DETAILS:
TRANSCRIPTION:
  Success: True
  Content Preview: Never gonna give you up, never gonna let you down...
TRANSLATION:
  Success: True
  Content Preview: Never gonna give you up, never gonna let you down...
SUMMARIZATION:
  Success: True
  Content Preview: • This is a famous song by Rick Astley • The song is about commitment and loyalty in relationships...
PUBLISHING:
  Success: True
  Output Files:
    - JSON: ./output/20250129_143022_dQw4w9WgXcQ_English.json
    - TXT: ./output/20250129_143022_dQw4w9WgXcQ_English.txt
================================================================================
Example completed successfully!
```

## 🐛 Troubleshooting

### Common Issues

#### 1. FFmpeg Not Found

```
Error: ffmpeg not found
```

**Solution:** Install FFmpeg and ensure it's in your system PATH.

#### 2. API Key Errors

```
Error: PERPLEXITY_API_KEY not found
```

**Solution:**

- Check that your `.env` file exists
- Verify API keys are correctly formatted
- Ensure no extra spaces or quotes around the keys

#### 3. YouTube Access Issues

```
Error extracting audio from YouTube video
```

**Solution:**

- Ensure the video is public and accessible
- Check whether the video is age-restricted
- Verify the URL format is correct

#### 4. Whisper Model Download Issues

```
Error downloading Whisper model
```

**Solution:**

- Check internet connection
- Ensure sufficient disk space (model files range from under 100 MB for `tiny` to roughly 3 GB for `large`)
- Try running again (models are cached after first download)

#### 5. Import Errors

```
ModuleNotFoundError: No module named 'crewai'
```

**Solution:**

```bash
pip install -r requirements.txt
```

### Debug Mode

Enable debug logging for detailed error information:

```python
import logging

logging.basicConfig(level=logging.DEBUG)
```

## 📝 Examples

### Example 1: Educational Summary

```bash
python workflow.py "https://www.youtube.com/watch?v=example" "English" "Summarize in 5 bullet points for students to revise quickly"
```

### Example 2: Business Summary

```bash
python workflow.py "https://www.youtube.com/watch?v=example" "English" "Create a 3-point executive summary highlighting key business insights"
```

### Example 3: Creative Summary

```bash
python workflow.py "https://www.youtube.com/watch?v=example" "English" "Rewrite as an engaging story with dialogue and vivid descriptions"
```

### Example 4: Multi-language Processing

```bash
# Spanish
python workflow.py "https://www.youtube.com/watch?v=example" "Spanish" "Resumir en 5 puntos clave"

# French
python workflow.py "https://www.youtube.com/watch?v=example" "French" "Résumer en 5 points principaux"

# German
python workflow.py "https://www.youtube.com/watch?v=example" "German" "In 5 Hauptpunkten zusammenfassen"
```

### Example 5: API Usage

```bash
# Start server (leave it running and send the request from a second terminal)
python api.py

# Process video via API
curl -X POST http://localhost:5000/process \
  -H "Content-Type: application/json" \
  -d '{
    "youtube_url": "https://www.youtube.com/watch?v=example",
    "target_language": "English",
    "summarization_prompt": "Summarize in 5 bullet points",
    "metadata": {
      "user_id": "student_123",
      "course": "Data Science 101"
    }
  }'
```

## 📈 Performance Tips

1. **Model Selection**: Use smaller Whisper models for faster processing
2. **Batch Processing**: Process multiple videos using the API
3. **Caching**: Models are cached after first download
4. **Async Processing**: Use async/await patterns for large-scale deployments

## 🔍 Supported Languages

The system supports 100+ languages including:

- **European**: English, Spanish, French, German, Italian, Portuguese, Dutch, Swedish, Norwegian, Danish, Finnish
- **Asian**: Chinese, Japanese, Korean, Hindi, Thai, Vietnamese
- **Middle Eastern**: Arabic, Hebrew, Turkish
- **Others**: Russian, Polish, Czech, Hungarian, Romanian

## 📞 Support

For issues and questions:

1. Check the troubleshooting section above
2. Verify all prerequisites are met
3. Check API key configuration
4. Review console output for specific error messages

## 🎉 Success Indicators

Your workflow is working correctly when you see:

- ✅ "Configuration looks good!" message
- ✅ All stages show "Success: True"
- ✅ Output files created in `./output/` directory
- ✅ "Example completed successfully!" message
- ✅ Full summaries (not truncated text)

---

**Last Updated:** January 29, 2025
**Version:** 1.0
**Author:** YouTube Processing Workflow Team
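The checks listed under Prerequisites and under Troubleshooting items 1 and 2 can be run automatically before starting a job. The following is a minimal sketch, not part of the project: `check_prerequisites` is a hypothetical helper, and it assumes the keys from `.env` have already been loaded into the process environment. It uses only the Python standard library.

```python
import os
import shutil
import sys


def check_prerequisites(env=None, which=shutil.which, version=None):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper for illustration. `env`, `which`, and `version`
    are injectable so the checks can be exercised without touching the
    real system.
    """
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else version
    problems = []

    # The guide requires Python 3.8 or newer.
    if tuple(version) < (3, 8):
        problems.append("Python 3.8+ is required")

    # FFmpeg must be discoverable on the system PATH.
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found in PATH")

    # Perplexity is the primary LLM, so its key is mandatory.
    key = env.get("PERPLEXITY_API_KEY", "")
    if not key:
        problems.append("PERPLEXITY_API_KEY not found")
    elif key != key.strip() or key[0] in "'\"" or key[-1] in "'\"":
        # Stray spaces or quotes are a common .env formatting mistake.
        problems.append("PERPLEXITY_API_KEY has extra spaces or quotes")

    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print(f"Problem: {issue}")
    print("Configuration looks good!" if not issues else "Fix the problems above.")
```

`OPENAI_API_KEY` is deliberately not checked, since the configuration section marks it as optional.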
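The `curl` request from Example 5 can also be sent from Python, which is convenient for the batch processing mentioned in the Performance Tips. The sketch below uses only the standard library and mirrors the request body shown in that example. `build_process_request` and `process_video` are hypothetical names, and because this guide does not document the server's response format, the reply is assumed to be JSON and is returned as-is.

```python
import json
import urllib.request

# Address taken from "Method 3: REST API Server" in this guide.
API_BASE = "http://localhost:5000"


def build_process_request(youtube_url, target_language, summarization_prompt,
                          metadata=None, base_url=API_BASE):
    """Build (but do not send) a POST /process request."""
    payload = {
        "youtube_url": youtube_url,
        "target_language": target_language,
        "summarization_prompt": summarization_prompt,
    }
    if metadata:
        # Optional free-form metadata, as in the curl example.
        payload["metadata"] = metadata
    return urllib.request.Request(
        f"{base_url}/process",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def process_video(*args, timeout=600, **kwargs):
    """Send the request and return the decoded JSON body.

    Assumes the server replies with JSON. The timeout is an arbitrary,
    generous default because transcription can take a while.
    """
    request = build_process_request(*args, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server from `python api.py` running, several videos can be processed by calling `process_video(url, "English", "Summarize in 5 bullet points")` in a loop over a list of URLs.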
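The output naming scheme from the File Structure section (`YYYYMMDD_HHMMSS_VIDEOID_LANGUAGE`) makes it possible to locate the newest result for a given video. This is a rough sketch under stated assumptions: `parse_output_name` and `latest_summary` are hypothetical helpers, and the parsing assumes that the language label contains no underscore. YouTube video IDs can contain `_` and `-`, so the date and time are split off from the left and the language from the right.

```python
from datetime import datetime
from pathlib import Path
from typing import NamedTuple


class OutputName(NamedTuple):
    created: datetime
    video_id: str
    language: str


def parse_output_name(path):
    """Split 'YYYYMMDD_HHMMSS_VIDEOID_LANGUAGE.ext' into its parts.

    Raises ValueError if the name does not follow the pattern.
    """
    stem = Path(path).stem
    date_part, time_part, rest = stem.split("_", 2)
    video_id, language = rest.rsplit("_", 1)
    created = datetime.strptime(f"{date_part}_{time_part}", "%Y%m%d_%H%M%S")
    return OutputName(created, video_id, language)


def latest_summary(output_dir="./output", video_id=None):
    """Return the newest JSON output file, optionally for one video ID."""
    candidates = []
    for path in Path(output_dir).glob("*.json"):
        try:
            info = parse_output_name(path)
        except ValueError:
            continue  # Not a workflow output file; skip it.
        if video_id is None or info.video_id == video_id:
            candidates.append((info.created, path))
    if not candidates:
        return None
    # Compare on the timestamp only and return the matching path.
    return max(candidates, key=lambda item: item[0])[1]
```

For example, `latest_summary(video_id="dQw4w9WgXcQ")` returns the path of the most recent JSON file for the demo video, or `None` if no such file exists.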