# Lemontropia Suite - OCR Setup Guide
This guide helps you set up OCR (Optical Character Recognition) for game text detection in Lemontropia Suite.
## Quick Start (No Installation Required)
The system works **out of the box** with OpenCV EAST text detection. Beyond `opencv-python`, no additional packages are needed; the EAST model (~95MB) is downloaded automatically on first use.
```python
from modules.game_vision_ai import GameVisionAI
# Initialize (auto-detects best available backend)
vision = GameVisionAI()
# Process screenshot
result = vision.process_screenshot("screenshot.png")
print(f"Detected {len(result.text_regions)} text regions")
```
## OCR Backend Options
The system supports multiple OCR backends, automatically selecting the best available one:
| Backend | Speed | Accuracy | Dependencies | Windows Store Python |
|---------|-------|----------|--------------|---------------------|
| **OpenCV EAST** | ⚡ Fastest | ⭐ Detection only | None (included) | ✅ Works |
| **EasyOCR** | 🚀 Fast | ⭐⭐⭐ Good | PyTorch | ❌ May fail |
| **Tesseract** | 🐢 Slow | ⭐⭐ Medium | Tesseract binary | ✅ Works |
| **PaddleOCR** | 🚀 Fast | ⭐⭐⭐⭐⭐ Best | PaddlePaddle | ❌ May fail |
## Windows Store Python Compatibility
⚠️ **Important**: If you're using Python from the Microsoft Store, OCR backends built on native deep-learning frameworks (EasyOCR on PyTorch, PaddleOCR on PaddlePaddle) may fail with DLL errors like:
```
OSError: [WinError 126] The specified module could not be found (c10.dll)
```
**Solutions:**
1. Use **OpenCV EAST** (works out of the box)
2. Use **Tesseract OCR** (install Tesseract binary)
3. Switch to Python from [python.org](https://python.org)
4. Use Anaconda/Miniconda instead
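To decide which of the solutions above applies, it can help to check whether the running interpreter is a Microsoft Store install. The sketch below is a heuristic only: it assumes that Store installs of Python live under a `WindowsApps` directory, and the helper name `looks_like_store_python` is hypothetical (it is not part of Lemontropia Suite).

```python
import sys


def looks_like_store_python(executable=sys.executable):
    """Heuristic (assumption): Store installs of Python live under 'WindowsApps'."""
    # sys.executable may be None or empty in embedded interpreters.
    path = executable or ""
    # Normalise separators and case so any spelling of the path matches.
    normalized = path.replace("/", "\\").lower()
    return "\\windowsapps\\" in normalized


if looks_like_store_python():
    print("Microsoft Store Python detected: prefer OpenCV EAST or Tesseract.")
else:
    print("This does not look like a Microsoft Store Python install.")
```

If the check reports a Store install and a DLL error appears, options 1 to 4 above are the places to start.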
## Installation Options
### Option 1: OpenCV EAST Only (Recommended for Windows Store Python)
No separate OCR engine is needed for this option; only the base packages are required. Install them if they are missing:
```bash
pip install opencv-python numpy pillow
```
### Option 2: With EasyOCR (Better accuracy, requires PyTorch)
```bash
# Install PyTorch first (see pytorch.org for your CUDA version)
pip install torch torchvision
# Then install EasyOCR
pip install easyocr
# Install remaining dependencies
pip install opencv-python numpy pillow
```
### Option 3: With Tesseract (Most stable)
1. **Install Tesseract OCR:**
   - Windows: `choco install tesseract` or download from [UB Mannheim](https://github.com/UB-Mannheim/tesseract/wiki)
   - Linux: `sudo apt-get install tesseract-ocr`
   - macOS: `brew install tesseract`
2. **Install Python package:**
   ```bash
   pip install pytesseract opencv-python numpy pillow
   ```
3. **(Windows only) Add Tesseract to PATH** or set path in code:
   ```python
   from modules.ocr_backends import TesseractBackend
   backend = TesseractBackend(tesseract_cmd=r"C:\Program Files\Tesseract-OCR\tesseract.exe")
   ```
### Option 4: Full Installation (All Backends)
```bash
# Install all OCR backends
pip install -r requirements-ocr.txt
# Or selectively (PaddleOCR also needs the PaddlePaddle framework itself):
pip install opencv-python numpy pillow easyocr pytesseract paddlepaddle paddleocr
```
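After installing, a quick way to see which of the optional Python packages can be found is to query the import system without importing anything. This is a minimal sketch, not part of the suite: the mapping `OCR_PACKAGES` and the helper `installed_ocr_packages` are hypothetical names, and the backend key `paddleocr` is an assumption (only `tesseract`, `easyocr`, and `opencv_east` appear as identifiers in this guide).

```python
from importlib.util import find_spec

# Backend name -> import name of its Python package.
# Note that opencv-python is imported as "cv2".
OCR_PACKAGES = {
    "opencv_east": "cv2",
    "easyocr": "easyocr",
    "tesseract": "pytesseract",
    "paddleocr": "paddleocr",  # key name is an assumption
}


def installed_ocr_packages(packages=OCR_PACKAGES):
    """Map each backend name to True if its Python package can be located."""
    return {name: find_spec(module) is not None for name, module in packages.items()}


for name, found in installed_ocr_packages().items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

A package being found does not guarantee that it loads: a PyTorch DLL error, for example, only shows up on the actual import. The test script in the next section remains the authoritative check.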
## Testing Your Setup
Run the test script to verify everything works:
```bash
python test_ocr_system.py
```
Expected output:
```
============================================================
HARDWARE DETECTION TEST
============================================================
GPU Backend: CPU
...
📋 Recommended OCR backend: opencv_east
============================================================
OCR BACKEND TESTS
============================================================
OPENCV_EAST:
Status: ✅ Available
GPU: 💻 CPU
...
🎉 All tests passed! OCR system is working correctly.
```
## Usage Examples
### Basic Text Detection
```python
from modules.game_vision_ai import GameVisionAI
# Initialize with auto-selected backend
vision = GameVisionAI()
# Process image
result = vision.process_screenshot("game_screenshot.png")
# Print detected text
for region in result.text_regions:
    print(f"Text: {region.text} (confidence: {region.confidence:.2f})")
    print(f"Location: {region.bbox}")
```
### Force Specific Backend
```python
from modules.game_vision_ai import GameVisionAI
# Use specific backend
vision = GameVisionAI(ocr_backend='tesseract')
# Or switch at runtime
vision.switch_ocr_backend('easyocr')
```
### Check Available Backends
```python
from modules.game_vision_ai import GameVisionAI
vision = GameVisionAI()
# List all backends
for backend in vision.get_ocr_backends():
    print(f"{backend['name']}: {'Available' if backend['available'] else 'Not available'}")
```
### Hardware Diagnostics
```python
from modules.hardware_detection import print_hardware_summary
from modules.game_vision_ai import GameVisionAI
# Print hardware info
print_hardware_summary()
# Run full diagnostic
diag = GameVisionAI.diagnose()
print(diag)
```
## Troubleshooting
### "No OCR backend available"
- Make sure `opencv-python` is installed: `pip install opencv-python`
- The EAST model will auto-download on first use (~95MB)
### "PyTorch DLL error"
- You're likely using Windows Store Python
- Use OpenCV EAST or Tesseract instead
- Or install Python from [python.org](https://python.org)
### "Tesseract not found"
- Install Tesseract OCR binary (see Option 3 above)
- Add to PATH or specify path in code
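To find out whether the binary is actually reachable, the standard library can search PATH and then fall back to known install locations. The following is a sketch under stated assumptions: `find_tesseract` is a hypothetical helper, and the fallback path is simply the Windows location used in Option 3.

```python
import os
import shutil

# Windows install location shown in Option 3; adjust if you installed elsewhere.
DEFAULT_WINDOWS_PATH = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(extra_paths=(DEFAULT_WINDOWS_PATH,)):
    """Return a path to the tesseract executable, or None if it cannot be found."""
    # First look on PATH, the same way a shell would.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Then try explicit candidate locations.
    for candidate in extra_paths:
        if os.path.isfile(candidate):
            return candidate
    return None


path = find_tesseract()
print(f"Tesseract found at: {path}" if path else "Tesseract not found")
```

If a path is returned, it can be passed as `tesseract_cmd` to `TesseractBackend`, as shown in Option 3.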
### Low detection accuracy
- OpenCV EAST only detects text regions; it does not recognize the text inside them
- For text recognition, use EasyOCR, PaddleOCR, or Tesseract
- Ensure good screenshot quality and contrast
## Backend Priority
The system automatically selects backends in this priority order:
1. **PaddleOCR** - If PyTorch works and Paddle is installed
2. **EasyOCR** - If PyTorch works and EasyOCR is installed
3. **Tesseract** - If Tesseract binary is available
4. **OpenCV EAST** - Always works (ultimate fallback)
You can customize priority:
```python
from modules.game_vision_ai import UnifiedOCRProcessor
processor = UnifiedOCRProcessor(
    backend_priority=['tesseract', 'opencv_east']  # Custom order
)
```
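The priority-based selection described in this section amounts to picking the first available backend in an ordered list. The sketch below illustrates that idea only; it is not the suite's implementation. `pick_backend` and `DEFAULT_PRIORITY` are hypothetical names, and the identifier `paddleocr` is an assumption.

```python
# Default order from the list above; "paddleocr" as an identifier is an assumption.
DEFAULT_PRIORITY = ["paddleocr", "easyocr", "tesseract", "opencv_east"]


def pick_backend(available, priority=DEFAULT_PRIORITY):
    """Return the first backend in `priority` that is in `available`, else None."""
    for name in priority:
        if name in available:
            return name
    return None


# Default order: Tesseract wins over the OpenCV EAST fallback.
print(pick_backend({"tesseract", "opencv_east"}))  # tesseract

# Custom order: EasyOCR is installed but not listed, so it is never chosen.
print(pick_backend({"easyocr", "opencv_east"}, priority=["tesseract", "opencv_east"]))  # opencv_east
```

Keeping `opencv_east` at the end of any custom list preserves the fallback behaviour described above.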
## Performance Tips
- **OpenCV EAST**: Fastest, use for real-time detection
- **GPU acceleration**: Significant speedup for EasyOCR/PaddleOCR
- **Preprocessing**: Better contrast = better OCR accuracy
- **Region of interest**: Crop to relevant areas for faster processing
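For the region-of-interest tip, the requested rectangle should be clamped to the screenshot bounds before cropping. The helper below is a minimal sketch in plain Python: `clamp_roi` is a hypothetical name, and the chat-panel coordinates are made up for illustration.

```python
def clamp_roi(x, y, width, height, image_width, image_height):
    """Clamp a requested region to the image bounds.

    Returns a (left, top, right, bottom) box, the layout Pillow's Image.crop expects.
    """
    left = max(0, min(x, image_width))
    top = max(0, min(y, image_height))
    right = max(left, min(x + width, image_width))
    bottom = max(top, min(y + height, image_height))
    return (left, top, right, bottom)


# Example: a panel in the lower-left corner of a 1920x1080 screenshot
# whose requested height runs past the bottom edge.
box = clamp_roi(0, 780, 600, 400, image_width=1920, image_height=1080)
print(box)  # (0, 780, 600, 1080)
```

With Pillow installed, `Image.open("screenshot.png").crop(box).save("roi.png")` writes the cropped region to a file that can then be passed to `process_screenshot`.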