Lemontropia Suite - OCR Setup Guide

This guide helps you set up OCR (Optical Character Recognition) for game text detection in Lemontropia Suite.

Quick Start (No Installation Required)

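The system works out of the box with OpenCV EAST text detection; no additional OCR dependencies are needed. Note that EAST only locates text regions. Reading the text itself requires one of the recognition backends listed under OCR Backend Options.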

from modules.game_vision_ai import GameVisionAI

# Initialize (auto-detects best available backend)
vision = GameVisionAI()

# Process screenshot
result = vision.process_screenshot("screenshot.png")
print(f"Detected {len(result.text_regions)} text regions")

OCR Backend Options

The system supports multiple OCR backends, automatically selecting the best available one:

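| Backend     | Speed   | Accuracy       | Dependencies     | Windows Store Python |
|-------------|---------|----------------|------------------|----------------------|
| OpenCV EAST | Fastest | Detection only | None (included)  | Works                |
| EasyOCR     | 🚀 Fast | Good           | PyTorch          | May fail             |
| Tesseract   | 🐢 Slow | Medium         | Tesseract binary | Works                |
| PaddleOCR   | 🚀 Fast | Best           | PaddlePaddle     | May fail             |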

Windows Store Python Compatibility

⚠️ Important: If you're using Python from the Microsoft Store, the deep-learning OCR backends (EasyOCR, which depends on PyTorch, and PaddleOCR, which depends on PaddlePaddle) may fail with DLL errors like:

OSError: [WinError 126] The specified module could not be found (c10.dll)

Solutions:

  1. Use OpenCV EAST (works out of the box)
  2. Use Tesseract OCR (install Tesseract binary)
  3. Switch to Python from python.org
  4. Use Anaconda/Miniconda instead
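
Before choosing one of these solutions, it can help to confirm which kind of interpreter is actually running. The following is a minimal sketch, not part of Lemontropia Suite: the helper name `is_windows_store_python` is hypothetical, and the check relies on the assumption that Microsoft Store builds of Python run from a directory named `WindowsApps`.

```python
import sys
from typing import Optional


def is_windows_store_python(executable: Optional[str] = None) -> bool:
    """Heuristic check: Store builds of Python run from a 'WindowsApps' folder."""
    path = executable if executable is not None else (sys.executable or "")
    # Normalise separators and case so any spelling of the path is handled.
    parts = path.replace("\\", "/").lower().split("/")
    return "windowsapps" in parts


if __name__ == "__main__":
    if is_windows_store_python():
        print("Microsoft Store Python detected: prefer OpenCV EAST or Tesseract.")
    else:
        print("Interpreter does not look like a Microsoft Store build.")
```

This is only a heuristic. A negative result does not guarantee that PyTorch will load, and a positive result does not guarantee that it will fail.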

Installation Options

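Option 1: OpenCV EAST Only (Fastest, no extra OCR dependencies)

No OCR-specific installation is needed for this option. If the core packages are not already present in your environment, install them with: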

pip install opencv-python numpy pillow

Option 2: With EasyOCR (Better accuracy, requires PyTorch)

# Install PyTorch first (see pytorch.org for your CUDA version)
pip install torch torchvision

# Then install EasyOCR
pip install easyocr

# Install remaining dependencies
pip install opencv-python numpy pillow

Option 3: With Tesseract (Most stable)

  1. Install Tesseract OCR:

    • Windows: choco install tesseract or download from UB Mannheim
    • Linux: sudo apt-get install tesseract-ocr
    • macOS: brew install tesseract
  2. Install Python package:

    pip install pytesseract opencv-python numpy pillow
    
  3. (Windows only) Add Tesseract to PATH or set path in code:

    from modules.ocr_backends import TesseractBackend
    backend = TesseractBackend(tesseract_cmd=r"C:\Program Files\Tesseract-OCR\tesseract.exe")
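
To decide whether the explicit path in step 3 is needed at all, the binary can be looked up first. This is a sketch under stated assumptions: `find_tesseract` is a hypothetical helper, and the fallback location is simply the example path used above, which may not match your installation.

```python
import os
import shutil
from typing import Iterable, Optional

# Assumed install location, copied from the example path in this guide.
DEFAULT_WINDOWS_PATHS = (r"C:\Program Files\Tesseract-OCR\tesseract.exe",)


def find_tesseract(extra_paths: Iterable[str] = DEFAULT_WINDOWS_PATHS) -> Optional[str]:
    """Return a path to the tesseract executable, or None if none is found."""
    # Prefer whatever is on PATH, since that is what pytesseract uses by default.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise fall back to explicit candidate locations.
    for candidate in extra_paths:
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    print(find_tesseract() or "Tesseract not found; install it or pass a path.")
```

If the function returns a path, that value could be passed as `tesseract_cmd` to `TesseractBackend` as shown in step 3.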
    

Option 4: Full Installation (All Backends)

# Install all OCR backends
pip install -r requirements-ocr.txt

# Or selectively:
pip install opencv-python numpy pillow easyocr pytesseract paddleocr

Testing Your Setup

Run the test script to verify everything works:

python test_ocr_system.py

Expected output:

============================================================
HARDWARE DETECTION TEST
============================================================
GPU Backend: CPU
...
📋 Recommended OCR backend: opencv_east

============================================================
OCR BACKEND TESTS
============================================================
OPENCV_EAST:
   Status: ✅ Available
   GPU: 💻 CPU
...
🎉 All tests passed! OCR system is working correctly.

Usage Examples

Basic Text Detection

from modules.game_vision_ai import GameVisionAI

# Initialize with auto-selected backend
vision = GameVisionAI()

# Process image
result = vision.process_screenshot("game_screenshot.png")

# Print detected text
for region in result.text_regions:
    print(f"Text: {region.text} (confidence: {region.confidence:.2f})")
    print(f"Location: {region.bbox}")
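
Raw OCR output often contains empty or low-confidence regions, so a filtering step is usually worthwhile. The sketch below assumes only the `text`, `confidence`, and `bbox` attributes used above; the `Region` dataclass is a hypothetical stand-in for the suite's own region type, the `(x, y, width, height)` bbox layout is an assumption, and the 0.6 threshold is an arbitrary starting point.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Region:
    """Stand-in for a detected text region (hypothetical shape)."""
    text: str
    confidence: float
    bbox: Tuple[int, int, int, int]  # assumed layout: (x, y, width, height)


def confident_text(regions: List[Region], min_confidence: float = 0.6) -> List[str]:
    """Return non-empty text at or above the threshold, most confident first."""
    kept = [r for r in regions if r.confidence >= min_confidence and r.text.strip()]
    kept.sort(key=lambda r: r.confidence, reverse=True)
    return [r.text.strip() for r in kept]


sample = [
    Region("Level 7", 0.75, (10, 20, 80, 18)),
    Region("Health 120", 0.91, (10, 45, 120, 18)),
    Region("", 0.88, (10, 70, 60, 18)),      # detection without recognized text
    Region("H3a1th", 0.32, (10, 95, 90, 18)),  # low-confidence misread
]
print(confident_text(sample))  # ['Health 120', 'Level 7']
```

With the real library, `result.text_regions` would take the place of `sample`, provided its items expose the same attributes.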

Force Specific Backend

from modules.game_vision_ai import GameVisionAI

# Use specific backend
vision = GameVisionAI(ocr_backend='tesseract')

# Or switch at runtime
vision.switch_ocr_backend('easyocr')

Check Available Backends

from modules.game_vision_ai import GameVisionAI

vision = GameVisionAI()

# List all backends
for backend in vision.get_ocr_backends():
    print(f"{backend['name']}: {'Available' if backend['available'] else 'Not available'}")

Hardware Diagnostics

from modules.hardware_detection import print_hardware_summary
from modules.game_vision_ai import GameVisionAI

# Print hardware info
print_hardware_summary()

# Run full diagnostic
diag = GameVisionAI.diagnose()
print(diag)

Troubleshooting

"No OCR backend available"

  • Make sure opencv-python is installed: pip install opencv-python
  • The EAST model will auto-download on first use (~95MB)

"PyTorch DLL error"

  • You're likely using Windows Store Python
  • Use OpenCV EAST or Tesseract instead
  • Or install Python from python.org

"Tesseract not found"

  • Install Tesseract OCR binary (see Option 3 above)
  • Add to PATH or specify path in code

Low detection accuracy

  • OpenCV EAST only detects text regions; it does not recognize the text itself
  • For text recognition, use EasyOCR, PaddleOCR, or Tesseract
  • Ensure good screenshot quality and contrast

Backend Priority

The system automatically selects backends in this priority order:

  1. PaddleOCR - If PyTorch works and Paddle is installed
  2. EasyOCR - If PyTorch works and EasyOCR is installed
  3. Tesseract - If Tesseract binary is available
  4. OpenCV EAST - Always works (ultimate fallback)

You can customize priority:

from modules.game_vision_ai import UnifiedOCRProcessor

processor = UnifiedOCRProcessor(
    backend_priority=['tesseract', 'opencv_east']  # Custom order
)
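
The fallback behaviour described above can be pictured as "walk the priority list and take the first backend that is usable". The following sketch illustrates that idea only and is not the suite's actual selection logic: the backend-to-module mapping, the `'paddleocr'` backend name, and both function names are assumptions.

```python
from importlib.util import find_spec
from typing import Callable, Dict, Optional, Sequence

# Backend name -> Python module that must be importable (assumed mapping).
REQUIRED_MODULE: Dict[str, str] = {
    "paddleocr": "paddleocr",
    "easyocr": "easyocr",
    "tesseract": "pytesseract",
    "opencv_east": "cv2",
}


def module_installed(module: str) -> bool:
    """True if the top-level module can be located without importing it."""
    return find_spec(module) is not None


def first_available(
    priority: Sequence[str],
    is_available: Callable[[str], bool] = module_installed,
) -> Optional[str]:
    """Return the first backend in priority order whose module is available."""
    for name in priority:
        module = REQUIRED_MODULE.get(name)  # unknown names are skipped
        if module is not None and is_available(module):
            return name
    return None


if __name__ == "__main__":
    order = ["paddleocr", "easyocr", "tesseract", "opencv_east"]
    print(first_available(order) or "no backend modules installed")
```

An installed module is a necessary condition, not a sufficient one: Tesseract still needs its binary, and PyTorch may be installed yet fail to load its DLLs, so a real check would also attempt to initialize the backend.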

Performance Tips

  • OpenCV EAST: Fastest, use for real-time detection
  • GPU acceleration: Significant speedup for EasyOCR/PaddleOCR
  • Preprocessing: Better contrast = better OCR accuracy
  • Region of interest: Crop to relevant areas for faster processing
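
For the region-of-interest tip, one convenient approach is to describe the area as fractions of the screen so the same settings work at any resolution. This is a minimal sketch with a hypothetical helper name, `roi_box`; it only computes pixel coordinates and leaves the actual cropping to whichever image library you use.

```python
from typing import Tuple


def roi_box(
    width: int, height: int,
    left: float, top: float, right: float, bottom: float,
) -> Tuple[int, int, int, int]:
    """Convert fractional bounds (0.0 to 1.0) into pixels as (x0, y0, x1, y1)."""
    if not (0.0 <= left < right <= 1.0 and 0.0 <= top < bottom <= 1.0):
        raise ValueError("bounds must satisfy 0 <= left < right <= 1 and 0 <= top < bottom <= 1")
    return (
        round(width * left),
        round(height * top),
        round(width * right),
        round(height * bottom),
    )


# Bottom-left quarter-height strip of a 1920x1080 screenshot.
x0, y0, x1, y1 = roi_box(1920, 1080, left=0.0, top=0.75, right=0.5, bottom=1.0)
print((x0, y0, x1, y1))  # (0, 810, 960, 1080)

# With a NumPy image array:  cropped = image[y0:y1, x0:x1]
# With a Pillow image:       cropped = image.crop((x0, y0, x1, y1))
```

Cropping before OCR reduces the number of pixels the backend has to process, which is where the speedup comes from.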