YouTube Processing Workflow - Complete Execution Guide
📋 Table of Contents
- Overview
- Prerequisites
- Installation
- Configuration
- Execution Methods
- Expected Output
- Troubleshooting
- Examples
🎯 Overview
This is a multi-agent YouTube processing workflow that:
- Transcribes YouTube videos using OpenAI Whisper
- Translates transcripts to target languages using LLM APIs
- Summarizes content based on custom prompts
- Saves results to local files in JSON and TXT formats
🔧 Prerequisites
System Requirements
- Python 3.8+ installed
- FFmpeg installed and in system PATH
- Internet connection for API calls and video processing
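The system requirements above can be checked from Python before a first run. This is a minimal sketch using only the standard library; `check_prerequisites` is an illustrative name and is not part of the project.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 8), which=shutil.which):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    found = sys.version_info[:2]
    if found < tuple(min_python):
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ required, "
            f"found {found[0]}.{found[1]}"
        )
    # shutil.which searches the directories listed in PATH, like the shell does.
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    return problems
```

Calling `check_prerequisites()` with no arguments inspects the current interpreter and PATH. The `which` parameter exists only so the lookup can be swapped out when testing.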
API Keys Required
- Perplexity API Key (primary LLM)
- OpenAI API Key (optional backup LLM)
📦 Installation
1. Install Python Dependencies
pip install -r requirements.txt
2. Install FFmpeg
- Windows: Download from ffmpeg.org and add to PATH
- macOS:
brew install ffmpeg
- Ubuntu:
sudo apt update && sudo apt install ffmpeg
3. Verify Installation
python -c "import crewai, openai, whisper, yt_dlp; print('All dependencies installed successfully!')"
⚙️ Configuration
1. Create Environment File
Copy the example environment file:
cp env.example .env
2. Add API Keys
Edit the .env file with your API keys:
# Perplexity API Configuration
PERPLEXITY_API_KEY=your_perplexity_api_key_here
# Optional: OpenAI API Key (as backup LLM)
OPENAI_API_KEY=your_openai_api_key_here
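To catch a missing or malformed key before running the workflow, the .env contents can be inspected with a few lines of Python. The sketch below is a hand-rolled parser for illustration only. How the project itself loads the file is not stated here, and the function names `parse_env` and `check_keys` are assumptions.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        stripped = raw.strip()
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        key, _, value = raw.partition("=")
        # The value is kept untouched so stray spaces or quotes stay visible.
        values[key.strip()] = value
    return values


def check_keys(values, required=("PERPLEXITY_API_KEY",)):
    """Return a list of problems found in the required keys."""
    problems = []
    for name in required:
        value = values.get(name, "")
        if not value.strip():
            problems.append(f"{name} is missing or empty")
        elif value != value.strip() or value[0] in "'\"" or value[-1] in "'\"":
            problems.append(f"{name} has surrounding spaces or quotes")
        elif value.startswith("your_"):
            problems.append(f"{name} still holds the placeholder value")
    return problems
```

A typical call would be `check_keys(parse_env(open(".env").read()))`. Only the Perplexity key is treated as required by default, since the OpenAI key is marked optional above.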
3. Get API Keys
Perplexity API
- Visit Perplexity AI
- Sign up and get your API key
- Add it to your .env file
OpenAI API (Backup)
- Visit OpenAI Platform
- Create an API key
- Add it to your .env file
🚀 Execution Methods
Method 1: Simple Example (Recommended for Testing)
python example.py
What it does:
- Uses a demo YouTube video (Rick Roll)
- Processes in English
- Creates a 5-bullet point summary
- Saves output to the ./output/ directory
Method 2: Command Line Workflow
python workflow.py "https://www.youtube.com/watch?v=VIDEO_ID" "English" "Summarize in 5 bullet points for students to revise quickly"
Parameters:
- youtube_url: Full YouTube video URL
- target_language: Language for translation (e.g., "English", "Spanish", "French")
- summarization_prompt: Custom prompt for summary generation
Example:
python workflow.py "https://www.youtube.com/watch?v=dQw4w9WgXcQ" "Spanish" "Create a 3-point executive summary"
Method 3: REST API Server
python api.py
Server starts on: http://localhost:5000
API Endpoints:
- POST /process - Complete video processing
- POST /transcribe - Transcription only
- POST /translate - Translation only
- POST /summarize - Summarization only
- GET /health - Health check
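A Python client for the /process endpoint might look like the following sketch. The JSON field names are taken from the curl example in the Examples section. The helper names, the timeout value, and the assumption that the server replies with JSON are illustrative and not confirmed by this guide.

```python
import json
import urllib.request


def build_process_request(youtube_url, target_language, summarization_prompt,
                          base_url="http://localhost:5000"):
    """Build (but do not send) a POST request for the /process endpoint."""
    payload = {
        "youtube_url": youtube_url,
        "target_language": target_language,
        "summarization_prompt": summarization_prompt,
    }
    return urllib.request.Request(
        base_url + "/process",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=600):
    """Send the request and decode the reply, assuming the body is JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server from `python api.py` running, `send(build_process_request(url, "English", "Summarize in 5 bullet points"))` would submit one video. The generous timeout reflects that transcription can take a while; adjust it to your videos.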
Method 4: Demo with Multiple Examples
python demo.py
Method 5: Run Tests
python test.py
📊 Expected Output
File Structure
output/
├── YYYYMMDD_HHMMSS_VIDEOID_LANGUAGE.json
└── YYYYMMDD_HHMMSS_VIDEOID_LANGUAGE.txt
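The naming pattern above can be reproduced with `datetime.strftime`, which is handy when a script needs to predict or match output file names. This is a sketch of the pattern only. `output_stem` is a hypothetical helper, and whether the workflow uses local time or UTC for the timestamp is an assumption (local time here).

```python
from datetime import datetime


def output_stem(video_id, language, when=None):
    """Return the file name stem YYYYMMDD_HHMMSS_VIDEOID_LANGUAGE."""
    when = when or datetime.now()  # assumption: local time
    return f"{when.strftime('%Y%m%d_%H%M%S')}_{video_id}_{language}"
```

Appending `.json` or `.txt` to the returned stem gives the two file names shown in the tree.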
JSON Output Format
{
"summary": "Complete summary based on your prompt...",
"metadata": {
"youtube_url": "https://www.youtube.com/watch?v=example",
"target_language": "English",
"original_transcript_length": 1848,
"translated_text_length": 1848,
"workflow_timestamp": "1759118170.3120418",
"example_run": true,
"source": "example.py"
},
"timestamp": "20250129_143022",
"type": "youtube_summary",
"workflow_version": "1.0"
}
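Because the JSON file is plain structured data, downstream scripts can read it directly. The sketch below loads the newest result from the output directory. It relies on the timestamp prefix making file names sort chronologically, and `load_latest_summary` is an illustrative name rather than a project function.

```python
import json
from pathlib import Path


def load_latest_summary(output_dir="./output"):
    """Return the parsed JSON of the newest result file, or None if none exist."""
    # The YYYYMMDD_HHMMSS prefix means alphabetical order is chronological order.
    files = sorted(Path(output_dir).glob("*.json"))
    if not files:
        return None
    return json.loads(files[-1].read_text(encoding="utf-8"))
```

The returned dictionary exposes the fields shown above, for example `result["summary"]` and `result["metadata"]["target_language"]`.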
TXT Output Format
# YouTube Video Summary
**Video:** https://www.youtube.com/watch?v=example
**Language:** English
**Generated:** 20250129_143022
---
Summary based on prompt 'Summarize in 5 bullet points for students to revise quickly':
• Point 1: Key insight or main topic
• Point 2: Important detail or concept
• Point 3: Supporting information
• Point 4: Additional context
• Point 5: Conclusion or takeaway
---
**Metadata:**
{
"youtube_url": "https://www.youtube.com/watch?v=example",
"target_language": "English",
"original_transcript_length": 1848,
"translated_text_length": 1848,
"workflow_timestamp": "1759118170.3120418",
"example_run": true,
"source": "example.py"
}
Console Output
YouTube Processing Workflow - Simple Example
=======================================================
Configuration looks good!
Processing: https://www.youtube.com/watch?v=dQw4w9WgXcQ
Target Language: English
Summary Prompt: Summarize in 5 bullet points for students to revise quickly
Running workflow...
Starting transcription...
Starting translation to English...
Starting summarization...
Starting local file publishing...
Workflow completed!
================================================================================
YOUTUBE PROCESSING WORKFLOW SUMMARY
================================================================================
YouTube URL: https://www.youtube.com/watch?v=dQw4w9WgXcQ
Target Language: English
Summary Prompt: Summarize in 5 bullet points for students to revise quickly
Overall Success: True
STAGE DETAILS:
TRANSCRIPTION:
Success: True
Content Preview: Never gonna give you up, never gonna let you down...
TRANSLATION:
Success: True
Content Preview: Never gonna give you up, never gonna let you down...
SUMMARIZATION:
Success: True
Content Preview: • This is a famous song by Rick Astley
• The song is about commitment and loyalty in relationships...
PUBLISHING:
Success: True
Output Files:
- JSON: ./output/20250129_143022_dQw4w9WgXcQ_English.json
- TXT: ./output/20250129_143022_dQw4w9WgXcQ_English.txt
================================================================================
Example completed successfully!
🐛 Troubleshooting
Common Issues
1. FFmpeg Not Found
Error: ffmpeg not found
Solution: Install FFmpeg and ensure it's in your system PATH.
2. API Key Errors
Error: PERPLEXITY_API_KEY not found
Solution:
- Check that your .env file exists
- Verify API keys are correctly formatted
- Ensure no extra spaces or quotes around the keys
3. YouTube Access Issues
Error extracting audio from YouTube video
Solution:
- Ensure the video is public and accessible
- Check if the video has age restrictions
- Verify the URL format is correct
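The URL format check can be automated for the two most common link shapes. This is a rough sketch under stated assumptions: it recognizes only `youtube.com/watch?v=...` and `youtu.be/...` links, the function name is hypothetical, and it cannot tell whether a video is public or age-restricted. The downloader used by the workflow may accept more URL forms than this check does.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(url):
    """Return the video ID from a common YouTube URL form, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host in ("youtube.com", "www.youtube.com", "m.youtube.com"):
        if parsed.path == "/watch":
            ids = parse_qs(parsed.query).get("v")
            return ids[0] if ids else None
        return None
    if host == "youtu.be":
        # Short links carry the ID as the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    return None
```

A return value of `None` suggests the URL is worth double-checking before passing it to workflow.py.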
4. Whisper Model Download Issues
Error downloading Whisper model
Solution:
- Check internet connection
- Ensure sufficient disk space (Whisper model files range from under 100 MB for the smallest model to about 3 GB for the largest)
- Try running again (models are cached after first download)
5. Import Errors
ImportError: No module named 'crewai'
Solution:
pip install -r requirements.txt
Debug Mode
Enable debug logging for detailed error information:
import logging
logging.basicConfig(level=logging.DEBUG)
📝 Examples
Example 1: Educational Summary
python workflow.py "https://www.youtube.com/watch?v=example" "English" "Summarize in 5 bullet points for students to revise quickly"
Example 2: Business Summary
python workflow.py "https://www.youtube.com/watch?v=example" "English" "Create a 3-point executive summary highlighting key business insights"
Example 3: Creative Summary
python workflow.py "https://www.youtube.com/watch?v=example" "English" "Rewrite as an engaging story with dialogue and vivid descriptions"
Example 4: Multi-language Processing
# Spanish
python workflow.py "https://www.youtube.com/watch?v=example" "Spanish" "Resumir en 5 puntos clave"
# French
python workflow.py "https://www.youtube.com/watch?v=example" "French" "Résumer en 5 points principaux"
# German
python workflow.py "https://www.youtube.com/watch?v=example" "German" "In 5 Hauptpunkten zusammenfassen"
Example 5: API Usage
# Start server
python api.py
# Process video via API
curl -X POST http://localhost:5000/process \
-H "Content-Type: application/json" \
-d '{
"youtube_url": "https://www.youtube.com/watch?v=example",
"target_language": "English",
"summarization_prompt": "Summarize in 5 bullet points",
"metadata": {
"user_id": "student_123",
"course": "Data Science 101"
}
}'
📈 Performance Tips
- Model Selection: Use smaller Whisper models for faster processing
- Batch Processing: Process multiple videos using the API
- Caching: Models are cached after first download
- Async Processing: Use async/await patterns for large-scale deployments
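For batch processing, one simple approach is a small thread pool that submits each video and collects per-video results or errors. The sketch below is generic: `process_one` is a hypothetical callable supplied by you (for example, a function that POSTs one URL to the /process endpoint). Whether the API server handles concurrent requests well is not stated in this guide, so a small `max_workers` is a cautious default.

```python
from concurrent.futures import ThreadPoolExecutor


def process_batch(urls, process_one, max_workers=2):
    """Run process_one(url) for each URL; return {url: result or exception}.

    Duplicate URLs collapse into one entry because results are keyed by URL.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {url: pool.submit(process_one, url) for url in urls}
        for url, future in futures.items():
            try:
                results[url] = future.result()
            except Exception as exc:  # one failed video should not stop the batch
                results[url] = exc
    return results
```

Storing the exception instead of raising it lets the caller inspect which videos failed after the whole batch has finished.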
🔍 Supported Languages
The system supports 100+ languages including:
- European: English, Spanish, French, German, Italian, Portuguese, Dutch, Swedish, Norwegian, Danish, Finnish
- Asian: Chinese, Japanese, Korean, Hindi, Thai, Vietnamese
- Middle Eastern: Arabic, Hebrew, Turkish
- Others: Russian, Polish, Czech, Hungarian, Romanian
📞 Support
For issues and questions:
- Check the troubleshooting section above
- Verify all prerequisites are met
- Check API key configuration
- Review console output for specific error messages
🎉 Success Indicators
Your workflow is working correctly when you see:
- ✅ "Configuration looks good!" message
- ✅ All stages show "Success: True"
- ✅ Output files created in the ./output/ directory
- ✅ "Example completed successfully!" message
- ✅ Full summaries (not truncated text)
Last Updated: January 29, 2025
Version: 1.0
Author: YouTube Processing Workflow Team