## AI Agents and Workflows
### Overview
AI agents are gaining popularity, but some argue they are overhyped and have limited real-world applications. Andrew Ng's multi-agent translation example shows how three agents (direct translation, reflection, paraphrasing) can improve translation quality. However, similar results can be achieved without separate agents by guiding an LLM through the same steps with a "Chain of Thought" (CoT) prompt.
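
The three translation steps can be folded into one CoT prompt instead of three agents. The sketch below is a minimal illustration under stated assumptions, not the original author's prompt: the prompt wording, the `### FINAL` marker, and the `complete` callable (any function that sends a prompt to an LLM and returns its reply as text) are all hypothetical.

```python
from typing import Callable

# Hypothetical marker that separates the model's reasoning from the final answer.
FINAL_MARKER = "### FINAL"

# Hypothetical prompt wording; the three steps mirror the three agents described above.
COT_TRANSLATION_PROMPT = """Translate the text below into {target_language} in three steps.

Step 1 (direct translation): translate literally, keeping the original structure.
Step 2 (reflection): list concrete problems in the Step 1 draft, such as awkward phrasing or mistranslated terms.
Step 3 (paraphrase): rewrite the draft to fix those problems so it reads naturally.

Put only the Step 3 result after a line that reads exactly: {marker}

Text:
{text}
"""


def build_prompt(text: str, target_language: str) -> str:
    """Fill the three-step template with the text and target language."""
    return COT_TRANSLATION_PROMPT.format(
        target_language=target_language, marker=FINAL_MARKER, text=text
    )


def extract_final(reply: str) -> str:
    """Return the text after the last marker, or the whole reply if the marker is missing."""
    _, found, tail = reply.rpartition(FINAL_MARKER)
    return tail.strip() if found else reply.strip()


def translate(text: str, target_language: str, complete: Callable[[str], str]) -> str:
    """Run all three steps in a single LLM call and keep only the final translation."""
    return extract_final(complete(build_prompt(text, target_language)))
```

Passing the LLM call in as `complete` keeps the sketch independent of any particular provider; a real client call would be wrapped in a function with that shape.
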
### Key Points
- **Using AI Effectively**: The core principle is designing an AI-friendly workflow rather than relying solely on AI agents.
- **Avoid Anthropomorphizing AI**: Instead of applying human methods to AI, leverage the strengths of LLMs using CoT processes.
- **AI as a Copilot**: Use AI to assist humans in decision-making and to handle well-defined tasks that do not require complex decisions.
### Designing an AI-Friendly Workflow
1. **Avoid Limiting AI Solutions to Human Methods**:
- Break down tasks into steps suitable for AI, like direct translation, reflection, and paraphrasing.
- Avoid mimicking human job roles (project managers, developers); those roles do not map well onto how AI works effectively.
2. **Don't Rely Solely on AI for Decisions—Use AI to Assist**:
- Use AI for simpler tasks, such as sentiment analysis and response generation when handling customer reviews.
- GitHub Copilot is one example: it generates code and improves software workflows without being asked to make complex decisions.
3. **Combine AI Models and Tools Across Domains for Better Solutions**:
- Integrate LLMs with domain-specific tools to create efficient solutions.
4. **Return to the Root Problem—AI Is Just a Tool**:
- Apply first principles thinking: define the core problem, break it down into essential components, and create a solution from the ground up.
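
The customer-review case from point 2 can be sketched as two narrow AI tasks, with the decision left to a person. This is only an illustration under assumptions: the `complete` callable, the prompt wording, the `ReviewDraft` type, and the rule that negative or unclassifiable reviews go to a human are all hypothetical, not taken from the source.

```python
from dataclasses import dataclass
from typing import Callable

SENTIMENTS = ("positive", "neutral", "negative")


@dataclass
class ReviewDraft:
    review: str
    sentiment: str     # one of SENTIMENTS, or "unknown"
    reply: str         # AI-drafted reply, not yet sent
    needs_human: bool  # True when a person should decide what to do


def handle_review(review: str, complete: Callable[[str], str]) -> ReviewDraft:
    """Classify a review and draft a reply; never send anything automatically."""
    # Task 1: a narrow, well-defined classification.
    raw = complete(
        "Classify the sentiment of this customer review as positive, "
        "neutral, or negative. Answer with one word.\n\n" + review
    ).strip().lower()
    sentiment = raw if raw in SENTIMENTS else "unknown"

    # Task 2: draft a reply for a person to approve or edit.
    reply = complete(
        "Write a short, polite reply to this customer review "
        f"(sentiment: {sentiment}):\n\n{review}"
    ).strip()

    # Assumed policy: anything negative or unclear is escalated to a human.
    needs_human = sentiment in ("negative", "unknown")
    return ReviewDraft(review, sentiment, reply, needs_human)
```

The model only labels and drafts; whether to refund, escalate, or send the reply stays a human decision, which is the copilot pattern described above.
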
### Examples of AI-Optimized Workflows
1. **Converting PDFs to Markdown**
- Use PyMuPDF to detect images, figures, and tables.
- Convert pages into images with red boxes marking figures and tables.
- Have GPT-4 interpret the marked images and generate the corresponding Markdown content.
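
The first two steps above might look like the following sketch. It assumes a recent PyMuPDF (`find_tables` needs version 1.23 or later) and only boxes embedded raster images and detected tables; vector-drawn figures would need extra handling that is not shown. The function names, output file naming, and prompt wording are hypothetical, and the call that sends each image to GPT-4 is left out.

```python
from pathlib import Path

RED = (1, 0, 0)  # RGB as floats in the range 0..1, as PyMuPDF expects


def mark_pdf_pages(pdf_path: str, out_dir: str, dpi: int = 150) -> list[str]:
    """Render each page to a PNG with red boxes around images and detected tables."""
    import fitz  # PyMuPDF; imported here so the rest of the sketch loads without it

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            boxes = []
            # Embedded images: the first field of each entry is the image xref.
            for img in page.get_images(full=True):
                boxes.extend(page.get_image_rects(img[0]))
            # Tables found by PyMuPDF's built-in table finder.
            boxes.extend(fitz.Rect(t.bbox) for t in page.find_tables().tables)
            for box in boxes:
                if box.is_empty or box.is_infinite:
                    continue  # skip rectangles that cannot be drawn
                page.draw_rect(box, color=RED, width=2)
            target = out / f"page-{page.number + 1:03d}.png"
            # The boxes are drawn in memory only; the source PDF is not modified.
            page.get_pixmap(dpi=dpi).save(str(target))
            saved.append(str(target))
    return saved


def build_page_prompt(page_number: int) -> str:
    """Hypothetical instruction to send to the vision model with each marked image."""
    return (
        f"This is page {page_number} of a PDF rendered as an image. "
        "Red boxes mark figures and tables. Convert the page to Markdown: "
        "transcribe the text, rebuild each boxed table as a Markdown table, "
        "and describe each boxed figure in one line where it appears."
    )
```

Each saved PNG would then be sent to the vision model together with `build_page_prompt(n)`, and the per-page Markdown replies concatenated in page order.
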
2. **Translating Comics**
- Detect speech bubbles using a specialized model.
- Extract text from these bubbles using OCR.
- Remove the original text, translate it using GPT-4, and insert the translation back into the original layout.
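
The comic-translation steps can be wired together as a small pipeline. In this sketch every stage is passed in as a function, because the source does not name the detection model, OCR engine, or drawing code; `detect_bubbles`, `read_text`, `translate`, `erase_text`, `draw_text`, and the `Bubble` type are all hypothetical placeholders. Translating all bubbles in one call is also an assumption, made so the model can see the page's dialogue in context.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class Bubble:
    box: Box
    source_text: str
    translated_text: str = ""


def translate_page(
    image: Any,
    detect_bubbles: Callable[[Any], Sequence[Box]],
    read_text: Callable[[Any, Box], str],
    translate: Callable[[List[str]], List[str]],
    erase_text: Callable[[Any, Box], Any],
    draw_text: Callable[[Any, Box, str], Any],
) -> Tuple[Any, List[Bubble]]:
    """Detect bubbles, OCR them, translate the text, and redraw it in place."""
    # Steps 1 and 2: find speech bubbles and read their text, skipping empty ones.
    bubbles = []
    for box in detect_bubbles(image):
        text = read_text(image, box).strip()
        if text:
            bubbles.append(Bubble(box, text))

    # Step 3a: translate every bubble on the page in a single call.
    translations = translate([b.source_text for b in bubbles]) if bubbles else []
    if len(translations) != len(bubbles):
        raise ValueError("translator returned a different number of texts")

    # Step 3b: erase the original text and draw the translation in the same box.
    for bubble, translated in zip(bubbles, translations):
        bubble.translated_text = translated
        image = erase_text(image, bubble.box)
        image = draw_text(image, bubble.box, translated)
    return image, bubbles
```

Keeping the stages as plain functions lets each one be replaced or tested on its own, for example by trying a different OCR engine without touching the translation step.
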
### Conclusion
To maximize AI's potential, design workflows that fit its strengths. Whether using an AI agent or LLMs, focus on solving core problems effectively by leveraging AI as a tool rather than an end in itself.
Source: https://baoyu.io/blog/ai-agent/what-you-need-is-ai-friendly-workflow