# Multi-Agent YouTube Processing Workflow

A multi-agent workflow built with CrewAI that transcribes YouTube videos, translates the transcripts, summarizes the translations, and saves the results to local files.

## 🎯 Overview

This project demonstrates a complete end-to-end workflow using CrewAI agents to:
1. **Transcribe** YouTube videos using OpenAI Whisper
2. **Translate** transcripts to target languages using LLM APIs
3. **Summarize** translated content based on custom prompts
4. **Save** final summaries to local files

## 🏗️ Architecture

### Agents

1. **Transcriber Agent** - Extracts audio from YouTube videos and generates transcripts
2. **Translator Agent** - Translates transcripts between languages
3. **Summarizer Agent** - Creates summaries based on custom prompts
4. **Publisher Agent** - Saves final content to local files

### Workflow

```mermaid
graph TD
    A[YouTube URL] --> B[Transcriber Agent]
    B --> C[Transcript]
    C --> D[Translator Agent]
    D --> E[Translated Text]
    E --> F[Summarizer Agent]
    F --> G[Summary]
    G --> H[Publisher Agent]
    H --> I[Local Files]
```
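
The diagram describes a linear pipeline in which each stage consumes the previous stage's output. A minimal sketch of that idea in plain Python, independent of CrewAI, might look like this. `PipelineState`, `run_pipeline`, and the placeholder stage functions are hypothetical names for illustration, not the project's actual agents:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PipelineState:
    """Carries inputs and intermediate results between stages (illustrative)."""
    youtube_url: str
    target_language: str
    summarization_prompt: str
    outputs: Dict[str, str] = field(default_factory=dict)


Stage = Callable[[PipelineState], PipelineState]


def run_pipeline(state: PipelineState, stages: List[Stage]) -> PipelineState:
    """Run each stage in order, handing the state from one to the next."""
    for stage in stages:
        state = stage(state)
    return state


# Placeholder stages: real ones would call Whisper and an LLM API.
def transcribe(state: PipelineState) -> PipelineState:
    state.outputs["transcript"] = f"<transcript of {state.youtube_url}>"
    return state


def translate(state: PipelineState) -> PipelineState:
    state.outputs["translation"] = (
        f"[{state.target_language}] {state.outputs['transcript']}"
    )
    return state


def summarize(state: PipelineState) -> PipelineState:
    state.outputs["summary"] = (
        f"{state.summarization_prompt}: {state.outputs['translation']}"
    )
    return state
```

Keeping every stage to the same signature means a stage can be swapped or skipped without touching the others.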

## 🚀 Quick Start

### Prerequisites

- Python 3.10+ (current CrewAI releases do not support older versions)
- FFmpeg installed on your system
- Valid API keys (see Configuration section)

### Installation

1. **Clone the repository**
   ```bash
   git clone <repository-url>
   cd multi-agent-workflow
   ```

2. **Install dependencies**
   ```bash
   pip install -r requirements.txt
   ```

3. **Install FFmpeg** (required for audio processing)
   - **Windows**: Download from [ffmpeg.org](https://ffmpeg.org/download.html)
   - **macOS**: `brew install ffmpeg`
   - **Ubuntu**: `sudo apt update && sudo apt install ffmpeg`

4. **Configure environment variables**
   ```bash
   cp env.example .env
   # Edit .env with your API keys
   ```

### Configuration

Create a `.env` file with the following variables:

```env
# Perplexity API Configuration
PERPLEXITY_API_KEY=your_perplexity_api_key_here

# Local output will be saved to ./output/ directory

# Optional: OpenAI API Key (as backup LLM)
OPENAI_API_KEY=your_openai_api_key_here
```
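
To fail fast when a key is missing, the variables can be checked once at startup. The sketch below uses only the standard library and assumes the `.env` values have already been loaded into the process environment (for example by a loader such as python-dotenv). `load_settings` and the keys of the returned dictionary are hypothetical and may not match the project's `config.py`:

```python
import os
from typing import Dict, Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, Optional[str]]:
    """Read API keys from the environment; raise if the required one is missing."""
    env = os.environ if env is None else env

    perplexity_key = env.get("PERPLEXITY_API_KEY", "").strip()
    if not perplexity_key:
        raise RuntimeError("PERPLEXITY_API_KEY not found; add it to your .env file")

    # OPENAI_API_KEY is optional (backup LLM), so a missing value becomes None.
    openai_key = env.get("OPENAI_API_KEY", "").strip() or None

    return {"perplexity_api_key": perplexity_key, "openai_api_key": openai_key}
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise in tests.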

### API Keys Setup

#### Perplexity API
1. Visit [Perplexity AI](https://perplexity.ai/)
2. Sign up and get your API key
3. Add it to your `.env` file as `PERPLEXITY_API_KEY`

#### OpenAI API (Backup)
1. Visit [OpenAI Platform](https://platform.openai.com/)
2. Create an API key
3. Add it to your `.env` file as `OPENAI_API_KEY`

#### Local Output
Output files will be automatically saved to the `./output/` directory in JSON and TXT formats.
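
As a rough illustration of what writing both formats involves, here is a sketch using only the standard library. The function name `save_summary`, the file naming scheme, and the JSON layout are assumptions and may differ from what the Publisher Agent actually produces:

```python
import json
from pathlib import Path
from typing import Tuple


def save_summary(
    summary: str, metadata: dict, name: str, output_dir: str = "output"
) -> Tuple[Path, Path]:
    """Write the summary as <name>.json and <name>.txt; return both paths."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)  # create ./output/ on first use

    json_path = out / f"{name}.json"
    txt_path = out / f"{name}.txt"

    # ensure_ascii=False keeps translated (non-ASCII) text readable in the file.
    json_path.write_text(
        json.dumps({"summary": summary, "metadata": metadata},
                   ensure_ascii=False, indent=2),
        encoding="utf-8",
    )
    txt_path.write_text(summary, encoding="utf-8")
    return json_path, txt_path
```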

## 📖 Usage Examples

### Command Line Interface

Process a complete YouTube video:

```bash
python workflow.py \
  "https://www.youtube.com/watch?v=example" \
  "Spanish" \
  "Summarize in 5 bullet points for students to revise quickly"
```

### Python Script Usage

```python
from workflow import YouTubeProcessingWorkflow

# Initialize workflow
workflow = YouTubeProcessingWorkflow()

# Process video
results = workflow.process_youtube_video(
    youtube_url="https://www.youtube.com/watch?v=example",
    target_language="Spanish", 
    summarization_prompt="Summarize in 5 bullet points for students to revise quickly"
)

# Print results
workflow.print_workflow_summary(results)
```

### REST API Usage

#### Start the API Server

```bash
python api.py
```

The server starts on `http://localhost:5000`.

#### Process Video via API

```bash
curl -X POST http://localhost:5000/process \
  -H "Content-Type: application/json" \
  -d '{
    "youtube_url": "https://www.youtube.com/watch?v=example",
    "target_language": "Spanish",
    "summarization_prompt": "Summarize in 5 bullet points for students to revise quickly",
    "metadata": {
      "user_id": "student_123",
      "course": "Data Science 101"
    }
  }'
```
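
The same request can be sent from Python. The sketch below uses the standard library's `urllib.request` and mirrors the JSON body of the curl call. The helper names `build_process_request` and `process_video` are hypothetical, and the sketch assumes the server replies with a JSON body:

```python
import json
import urllib.request
from typing import Optional


def build_process_request(
    youtube_url: str,
    target_language: str,
    summarization_prompt: str,
    metadata: Optional[dict] = None,
    base_url: str = "http://localhost:5000",
) -> urllib.request.Request:
    """Build the POST /process request without sending it."""
    payload = {
        "youtube_url": youtube_url,
        "target_language": target_language,
        "summarization_prompt": summarization_prompt,
    }
    if metadata is not None:
        payload["metadata"] = metadata
    return urllib.request.Request(
        f"{base_url}/process",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def process_video(**kwargs) -> dict:
    """Send the request and decode the response (assumed to be JSON)."""
    with urllib.request.urlopen(build_process_request(**kwargs)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting request construction from sending keeps the payload logic testable without a running server.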

#### Individual Operations

**Transcribe only:**
```bash
curl -X POST http://localhost:5000/transcribe \
  -H "Content-Type: application/json" \
  -d '{"youtube_url": "https://www.youtube.com/watch?v=example"}'
```

**Translate text:**
```bash
curl -X POST http://localhost:5000/translate \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Your text here",
    "target_language": "Spanish"
  }'
```

**Summarize text:**
```bash
curl -X POST http://localhost:5000/summarize \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Your text here",
    "summarization_prompt": "Summarize in 5 bullet points"
  }'
```

## 📁 Project Structure

```
multi-agent-workflow/
├── agents/                    # Agent implementations
│   ├── __init__.py
│   ├── transcriber_agent.py   # YouTube transcription
│   ├── translator_agent.py    # Language translation  
│   ├── summarizer_agent.py    # Content summarization
│   └── publisher_agent.py     # Local file output
├── utils/                     # Utility modules
│   ├── __init__.py
│   └── speech_processing.py  # Audio processing utilities
├── config.py                  # Configuration management
├── workflow.py                # Main workflow orchestration
├── api.py                     # REST API interface
├── requirements.txt           # Python dependencies
├── env.example               # Environment variables template
└── README.md                 # This file
```

## 🔧 Customization

### Adding Custom Prompts

You can customize summarization prompts for different use cases:

```python
# Educational summary
educational_prompt = "Summarize in 5 bullet points for students to revise quickly"

# Business summary  
business_prompt = "Create a 3-point executive summary highlighting key business insights"

# Creative summary
creative_prompt = "Rewrite as an engaging story with dialogue and vivid descriptions"
```

### Modifying Agent Behavior

Each agent can be customized in its respective file:

- **Transcriber**: Modify `YouTubeTranscriber` class in `utils/speech_processing.py`
- **Translator**: Update translation logic in `agents/translator_agent.py`
- **Summarizer**: Customize summarization prompts in `agents/summarizer_agent.py`
- **Publisher**: Modify local file output in `agents/publisher_agent.py`

### Adding New Languages

The translation system supports 100+ languages. No extra setup is needed for a new language: pass its English name as `target_language`. Some examples:

```python
supported_languages = [
    "Spanish", "French", "German", "Italian", "Portuguese",
    "Chinese", "Japanese", "Korean", "Arabic", "Russian", 
    "Dutch", "Swedish", "Norwegian", "Danish", "Finnish"
]
```

## 🐛 Troubleshooting

### Common Issues

#### FFmpeg Not Found
```
Error: ffmpeg not found
```
**Solution**: Install FFmpeg and ensure it's in your system PATH.
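
To confirm from Python whether FFmpeg is visible on the PATH, the standard library's `shutil.which` can be used. This is a small diagnostic sketch; `find_ffmpeg` and `require_ffmpeg` are hypothetical helper names, not part of the project:

```python
import shutil
from typing import Optional


def find_ffmpeg() -> Optional[str]:
    """Return the full path of the ffmpeg executable, or None if not on PATH."""
    return shutil.which("ffmpeg")


def require_ffmpeg() -> str:
    """Return the ffmpeg path, or raise with an actionable message."""
    path = find_ffmpeg()
    if path is None:
        raise RuntimeError("ffmpeg not found; install it and add it to your PATH")
    return path
```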

#### Whisper Model Download Issues
```
Error downloading Whisper model
```
**Solution**: Check your internet connection and make sure there is enough free disk space. Model size varies widely, from under 100 MB for `tiny` to about 3 GB for `large`.

#### API Key Errors
```
Error: PERPLEXITY_API_KEY not found
```
**Solution**: Verify your `.env` file contains valid API keys.

#### YouTube Access Issues
```
Error extracting audio from YouTube video
```
**Solution**: 
- Ensure the video is public and accessible
- Check if the video has age restrictions
- Verify the URL format is correct
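
A quick way to check the URL format before starting a download is to try to extract the video ID. The sketch below covers a few common URL shapes (`watch?v=`, `youtu.be/`, `/shorts/`, `/embed/`) using only the standard library. It is an illustrative helper, not the validation the project itself performs, and it does not check that the video exists:

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID for common YouTube URL shapes, else None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()

    # Short links: https://youtu.be/<id>
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None

    if host in {"youtube.com", "www.youtube.com", "m.youtube.com"}:
        # Standard links: https://www.youtube.com/watch?v=<id>
        if parsed.path == "/watch":
            ids = parse_qs(parsed.query).get("v")
            return ids[0] if ids else None
        # Shorts and embeds: /shorts/<id>, /embed/<id>
        for prefix in ("/shorts/", "/embed/"):
            if parsed.path.startswith(prefix):
                return parsed.path[len(prefix):].split("/")[0] or None

    return None
```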

### Debug Mode

Enable debug logging for detailed error information:

```python
import logging
logging.basicConfig(level=logging.DEBUG)
```

## 📊 Performance Tips

1. **Model Selection**: Use smaller Whisper models (`tiny`, `base`) for faster processing
2. **Batch Processing**: Process multiple videos using the API for better throughput
3. **Caching**: Implement caching for repeated transcriptions of the same video
4. **Async Processing**: Use async/await patterns for large-scale deployments
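
For the caching tip, one simple approach is an on-disk cache keyed by a hash of the video URL, so that a repeated request skips the download and transcription steps. The sketch below is one possible shape. `cached_transcript`, the `.cache` directory, and the `transcribe` callable are assumptions; the callable stands in for whatever function actually produces the transcript:

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def cached_transcript(
    youtube_url: str,
    transcribe: Callable[[str], str],
    cache_dir: str = ".cache",
) -> str:
    """Return a cached transcript if present; otherwise transcribe and store it."""
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)

    # Hash the URL so it is safe to use as a file name.
    key = hashlib.sha256(youtube_url.encode("utf-8")).hexdigest()
    path = cache / f"{key}.json"

    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))["transcript"]

    transcript = transcribe(youtube_url)
    path.write_text(
        json.dumps({"url": youtube_url, "transcript": transcript}),
        encoding="utf-8",
    )
    return transcript
```

Note that different URLs pointing at the same video produce different keys here; normalizing to the video ID first would avoid duplicate entries.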

## 🧪 Testing

Run the test suite:

```bash
# Test individual components
python -m pytest tests/

# Test complete workflow
python test_workflow.py
```

## 📄 API Reference

### Main Workflow Class

#### `YouTubeProcessingWorkflow`

**Methods:**
- `process_youtube_video(youtube_url, target_language, summarization_prompt, metadata=None)`
- `print_workflow_summary(results)`

### REST API Endpoints

#### `POST /process`
Complete video processing workflow

#### `POST /transcribe`
YouTube video transcription only

#### `POST /translate` 
Text translation to target language

#### `POST /summarize`
Text summarization based on prompt

#### `GET /health`
Health check endpoint

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch: `git checkout -b feature-name`
3. Commit changes: `git commit -am 'Add feature'`
4. Push to branch: `git push origin feature-name`
5. Submit a Pull Request

## 📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

## 🙏 Acknowledgments

- [CrewAI](https://crewai.com/) for the agent orchestration framework
- [OpenAI Whisper](https://github.com/openai/whisper) for speech recognition
- [yt-dlp](https://github.com/yt-dlp/yt-dlp) for YouTube video downloading
- [Flask](https://flask.palletsprojects.com/) for the REST API framework

## 📞 Support

For support and questions:
- Create an issue on GitHub
- Contact the development team
- Check the troubleshooting section above