diff --git a/OCR_SETUP.md b/OCR_SETUP.md new file mode 100644 index 0000000..aef1133 --- /dev/null +++ b/OCR_SETUP.md @@ -0,0 +1,227 @@ +# Lemontropia Suite - OCR Setup Guide + +This guide helps you set up OCR (Optical Character Recognition) for game text detection in Lemontropia Suite. + +## Quick Start (No Installation Required) + +The system works **out of the box** with OpenCV EAST text detection - no additional dependencies needed! + +```python +from modules.game_vision_ai import GameVisionAI + +# Initialize (auto-detects best available backend) +vision = GameVisionAI() + +# Process screenshot +result = vision.process_screenshot("screenshot.png") +print(f"Detected {len(result.text_regions)} text regions") +``` + +## OCR Backend Options + +The system supports multiple OCR backends, automatically selecting the best available one: + +| Backend | Speed | Accuracy | Dependencies | Windows Store Python | +|---------|-------|----------|--------------|---------------------| +| **OpenCV EAST** | ⚔ Fastest | ⭐ Detection only | None (included) | āœ… Works | +| **EasyOCR** | šŸš€ Fast | ⭐⭐⭐ Good | PyTorch | āŒ May fail | +| **Tesseract** | 🐢 Slow | ⭐⭐ Medium | Tesseract binary | āœ… Works | +| **PaddleOCR** | šŸš€ Fast | ⭐⭐⭐⭐⭐ Best | PaddlePaddle | āŒ May fail | + +## Windows Store Python Compatibility + +āš ļø **Important**: If you're using Python from the Microsoft Store, PyTorch-based OCR (EasyOCR, PaddleOCR) may fail with DLL errors like: +``` +OSError: [WinError 126] The specified module could not be found (c10.dll) +``` + +**Solutions:** +1. Use **OpenCV EAST** (works out of the box) +2. Use **Tesseract OCR** (install Tesseract binary) +3. Switch to Python from [python.org](https://python.org) +4. Use Anaconda/Miniconda instead + +## Installation Options + +### Option 1: OpenCV EAST Only (Recommended for Windows Store Python) + +No additional installation needed! OpenCV is already included. 
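Because the EAST backend is detection-only, the regions it returns carry empty `text` but real bounding boxes and confidences. The sketch below shows how a caller might still use those boxes, for example to crop them for a recognizer added later. The `TextRegion` shape mirrors the dataclass used throughout this guide; the `confident_boxes` helper is hypothetical, not part of the suite:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class TextRegion:
    # Mirrors the fields this guide's API returns
    text: str
    confidence: float
    bbox: Tuple[int, int, int, int]  # x, y, w, h

def confident_boxes(regions: List[TextRegion],
                    min_conf: float = 0.5) -> List[Tuple[int, int, int, int]]:
    """Keep only the bounding boxes of confident detections."""
    return [r.bbox for r in regions if r.confidence >= min_conf]

regions = [
    TextRegion("", 0.91, (10, 10, 120, 24)),  # EAST: box found, text not recognized
    TextRegion("", 0.32, (300, 40, 60, 20)),  # low confidence, dropped
]
print(confident_boxes(regions))  # -> [(10, 10, 120, 24)]
```

Only the base packages are needed for this path: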
+ +```bash +pip install opencv-python numpy pillow +``` + +### Option 2: With EasyOCR (Better accuracy, requires PyTorch) + +```bash +# Install PyTorch first (see pytorch.org for your CUDA version) +pip install torch torchvision + +# Then install EasyOCR +pip install easyocr + +# Install remaining dependencies +pip install opencv-python numpy pillow +``` + +### Option 3: With Tesseract (Most stable) + +1. **Install Tesseract OCR:** + - Windows: `choco install tesseract` or download from [UB Mannheim](https://github.com/UB-Mannheim/tesseract/wiki) + - Linux: `sudo apt-get install tesseract-ocr` + - macOS: `brew install tesseract` + +2. **Install Python package:** + ```bash + pip install pytesseract opencv-python numpy pillow + ``` + +3. **(Windows only) Add Tesseract to PATH** or set path in code: + ```python + from modules.ocr_backends import TesseractBackend + backend = TesseractBackend(tesseract_cmd=r"C:\Program Files\Tesseract-OCR\tesseract.exe") + ``` + +### Option 4: Full Installation (All Backends) + +```bash +# Install all OCR backends +pip install -r requirements-ocr.txt + +# Or selectively: +pip install opencv-python numpy pillow easyocr pytesseract paddleocr +``` + +## Testing Your Setup + +Run the test script to verify everything works: + +```bash +python test_ocr_system.py +``` + +Expected output: +``` +============================================================ +HARDWARE DETECTION TEST +============================================================ +GPU Backend: CPU +... +šŸ“‹ Recommended OCR backend: opencv_east + +============================================================ +OCR BACKEND TESTS +============================================================ +OPENCV_EAST: + Status: āœ… Available + GPU: šŸ’» CPU +... +šŸŽ‰ All tests passed! OCR system is working correctly. 
+``` + +## Usage Examples + +### Basic Text Detection + +```python +from modules.game_vision_ai import GameVisionAI + +# Initialize with auto-selected backend +vision = GameVisionAI() + +# Process image +result = vision.process_screenshot("game_screenshot.png") + +# Print detected text +for region in result.text_regions: + print(f"Text: {region.text} (confidence: {region.confidence:.2f})") + print(f"Location: {region.bbox}") +``` + +### Force Specific Backend + +```python +from modules.game_vision_ai import GameVisionAI + +# Use specific backend +vision = GameVisionAI(ocr_backend='tesseract') + +# Or switch at runtime +vision.switch_ocr_backend('easyocr') +``` + +### Check Available Backends + +```python +from modules.game_vision_ai import GameVisionAI + +vision = GameVisionAI() + +# List all backends +for backend in vision.get_ocr_backends(): + print(f"{backend['name']}: {'Available' if backend['available'] else 'Not available'}") +``` + +### Hardware Diagnostics + +```python +from modules.hardware_detection import print_hardware_summary +from modules.game_vision_ai import GameVisionAI + +# Print hardware info +print_hardware_summary() + +# Run full diagnostic +diag = GameVisionAI.diagnose() +print(diag) +``` + +## Troubleshooting + +### "No OCR backend available" + +- Make sure `opencv-python` is installed: `pip install opencv-python` +- The EAST model will auto-download on first use (~95MB) + +### "PyTorch DLL error" + +- You're likely using Windows Store Python +- Use OpenCV EAST or Tesseract instead +- Or install Python from [python.org](https://python.org) + +### "Tesseract not found" + +- Install Tesseract OCR binary (see Option 3 above) +- Add to PATH or specify path in code + +### Low detection accuracy + +- OpenCV EAST only detects text regions, doesn't recognize text +- For text recognition, use EasyOCR or PaddleOCR +- Ensure good screenshot quality and contrast + +## Backend Priority + +The system automatically selects backends in this priority order: + 
+1. **PaddleOCR** - If PyTorch works and Paddle is installed +2. **EasyOCR** - If PyTorch works and EasyOCR is installed +3. **Tesseract** - If Tesseract binary is available +4. **OpenCV EAST** - Always works (ultimate fallback) + +You can customize priority: + +```python +from modules.game_vision_ai import UnifiedOCRProcessor + +processor = UnifiedOCRProcessor( + backend_priority=['tesseract', 'opencv_east'] # Custom order +) +``` + +## Performance Tips + +- **OpenCV EAST**: Fastest, use for real-time detection +- **GPU acceleration**: Significant speedup for EasyOCR/PaddleOCR +- **Preprocessing**: Better contrast = better OCR accuracy +- **Region of interest**: Crop to relevant areas for faster processing diff --git a/UI_CLEANUP_SUMMARY.md b/UI_CLEANUP_SUMMARY.md new file mode 100644 index 0000000..1f52f19 --- /dev/null +++ b/UI_CLEANUP_SUMMARY.md @@ -0,0 +1,105 @@ +# Lemontropia Suite UI Cleanup - Summary + +## Changes Made + +### 1. New File: `ui/settings_dialog.py` +Created a comprehensive SettingsDialog that consolidates all settings into a single dialog with tabs: + +**Tabs:** +- **šŸ“‹ General**: Player settings (avatar name, log path), default activity, application settings +- **šŸ“ø Screenshot Hotkeys**: Hotkey configuration for screenshots (moved from separate Tools menu) +- **šŸ‘ļø Computer Vision**: AI vision settings (OCR, icon detection, directories) +- **šŸŽ® GPU & Performance**: GPU detection, backend selection, performance tuning + +**Also includes:** +- `NewSessionTemplateDialog` - Moved from main_window.py +- `TemplateStatsDialog` - Moved from main_window.py +- Data classes: `PlayerSettings`, `ScreenshotHotkeySettings`, `VisionSettings` + +### 2. 
Updated: `ui/main_window.py` + +**Menu Structure Cleanup:** +``` +File + - New Template (Ctrl+N) + - Exit (Alt+F4) + +Session + - Start (F5) + - Stop (Shift+F5) + - Pause (F6) + +Tools + - Loadout Manager (Ctrl+L) + - Computer Vision → + - Settings + - Calibrate + - Test + - Select Gear → + - Weapon (Ctrl+W) + - Armor (Ctrl+Shift+A) + - Finder (Ctrl+Shift+F) + - Medical Tool (Ctrl+M) + +View + - Show HUD (F9) + - Hide HUD (F10) + - Session History (Ctrl+H) + - Screenshot Gallery (Ctrl+G) + - Settings (Ctrl+,) + +Help + - Setup Wizard (Ctrl+Shift+W) + - About +``` + +**Code Organization:** +- Removed dialog classes (moved to settings_dialog.py) +- Cleaned up imports +- Removed orphaned screenshot hotkey menu item (now in Settings) +- Added tooltips to all menu actions +- Fixed menu separators for cleaner grouping + +### 3. UI Audit Results + +**Features with Menu Access:** +| Feature | Menu | Shortcut | Status | +|---------|------|----------|--------| +| Session History | View | Ctrl+H | āœ… | +| Gallery | View | Ctrl+G | āœ… | +| Loadout Manager | Tools | Ctrl+L | āœ… | +| Computer Vision | Tools (submenu) | - | āœ… | +| Setup Wizard | Help | Ctrl+Shift+W | āœ… | +| Settings | View | Ctrl+, | āœ… | +| Screenshot Hotkeys | Settings (tab) | - | āœ… (moved) | +| Select Gear | Tools (submenu) | Various | āœ… | + +**Modules Analysis:** +- `crafting_tracker.py` - Backend module, no UI needed +- `loot_analyzer.py` - Backend module, no UI needed +- `game_vision.py`, `game_vision_ai.py` - Used by Vision dialogs +- `screenshot_hotkey.py` - Integrated into Settings +- Other modules - Backend/utilities + +## Benefits + +1. **Consolidated Settings**: All settings in one place with organized tabs +2. **Cleaner Menu Structure**: Logical grouping of features +3. **Better Code Organization**: Dialogs in separate file, main_window focused on main UI +4. **No Orphaned Features**: All major features accessible via menus +5. 
**Backward Compatibility**: Existing functionality preserved + +## Files Modified +- `ui/settings_dialog.py` - NEW (consolidated settings) +- `ui/main_window.py` - UPDATED (clean menu structure) + +## Testing Checklist +- [ ] Settings dialog opens with Ctrl+, +- [ ] All tabs accessible in Settings +- [ ] Player name saves correctly +- [ ] Screenshot hotkeys configurable in Settings +- [ ] Vision settings accessible +- [ ] All menu shortcuts work +- [ ] Loadout Manager opens with Ctrl+L +- [ ] Session History opens with Ctrl+H +- [ ] Gallery opens with Ctrl+G diff --git a/modules/__init__.py b/modules/__init__.py new file mode 100644 index 0000000..64d536a --- /dev/null +++ b/modules/__init__.py @@ -0,0 +1,41 @@ +""" +Lemontropia Suite - Modules +Game automation and analysis modules. +""" + +# OCR and Vision +from .game_vision_ai import GameVisionAI, UnifiedOCRProcessor +from .hardware_detection import ( + HardwareDetector, + HardwareInfo, + GPUBackend, + get_hardware_info, + print_hardware_summary, + recommend_ocr_backend, +) + +# OCR Backends +from .ocr_backends import ( + BaseOCRBackend, + OCRTextRegion, + OCRBackendInfo, + OCRBackendFactory, +) + +__all__ = [ + # Vision + 'GameVisionAI', + 'UnifiedOCRProcessor', + # Hardware + 'HardwareDetector', + 'HardwareInfo', + 'GPUBackend', + 'get_hardware_info', + 'print_hardware_summary', + 'recommend_ocr_backend', + # OCR + 'BaseOCRBackend', + 'OCRTextRegion', + 'OCRBackendInfo', + 'OCRBackendFactory', +] diff --git a/modules/game_vision_ai.py b/modules/game_vision_ai.py index 7e7a282..f80c3dc 100644 --- a/modules/game_vision_ai.py +++ b/modules/game_vision_ai.py @@ -1,7 +1,14 @@ """ Lemontropia Suite - Game Vision AI Module -Advanced computer vision with local GPU-accelerated AI models. -Supports OCR (PaddleOCR) and icon detection for game UI analysis. +Advanced computer vision with multiple OCR backends and GPU acceleration. + +OCR Backends (in priority order): +1. 
PaddleOCR - Best accuracy (requires working PyTorch) +2. EasyOCR - Good accuracy, lighter than PaddleOCR +3. Tesseract OCR - Traditional, stable +4. OpenCV EAST - Fastest, no dependencies (ultimate fallback) + +Handles PyTorch DLL errors on Windows Store Python gracefully. """ import cv2 @@ -17,34 +24,17 @@ import hashlib logger = logging.getLogger(__name__) -# Optional PyTorch import with fallback -try: - import torch - TORCH_AVAILABLE = True -except Exception as e: - logger.warning(f"PyTorch not available: {e}") - TORCH_AVAILABLE = False - torch = None +# Import hardware detection +from .hardware_detection import ( + HardwareDetector, HardwareInfo, GPUBackend, + recommend_ocr_backend, get_hardware_info +) -# Import OpenCV text detector as fallback -from .opencv_text_detector import OpenCVTextDetector, TextDetection as OpenCVTextDetection - -# Optional PaddleOCR import with fallback -try: - from paddleocr import PaddleOCR - PADDLE_AVAILABLE = True -except Exception as e: - logger.warning(f"PaddleOCR not available: {e}") - PADDLE_AVAILABLE = False - PaddleOCR = None - - -class GPUBackend(Enum): - """Supported GPU backends.""" - CUDA = "cuda" # NVIDIA CUDA - MPS = "mps" # Apple Metal Performance Shaders - DIRECTML = "directml" # Windows DirectML - CPU = "cpu" # Fallback CPU +# Import OCR backends +from .ocr_backends import ( + BaseOCRBackend, OCRTextRegion, OCRBackendInfo, + OCRBackendFactory ) @dataclass @@ -54,14 +44,27 @@ class TextRegion: confidence: float bbox: Tuple[int, int, int, int] # x, y, w, h language: str = "en" + backend: str = "unknown" # Which OCR backend detected this def to_dict(self) -> Dict[str, Any]: return { 'text': self.text, 'confidence': self.confidence, 'bbox': self.bbox, - 'language': self.language + 'language': self.language, + 'backend': self.backend } + + @classmethod + def from_ocr_region(cls, region: OCRTextRegion, backend: str = "unknown"): + """Create from OCR backend region.""" + return cls( + text=region.text,
confidence=region.confidence, + bbox=region.bbox, + language=region.language, + backend=backend + ) @dataclass @@ -105,6 +108,7 @@ class VisionResult: icon_regions: List[IconRegion] = field(default_factory=list) processing_time_ms: float = 0.0 gpu_backend: str = "cpu" + ocr_backend: str = "unknown" timestamp: float = field(default_factory=time.time) def to_dict(self) -> Dict[str, Any]: @@ -113,6 +117,7 @@ class VisionResult: 'icon_count': len(self.icon_regions), 'processing_time_ms': self.processing_time_ms, 'gpu_backend': self.gpu_backend, + 'ocr_backend': self.ocr_backend, 'timestamp': self.timestamp } @@ -123,153 +128,143 @@ class GPUDetector: @staticmethod def detect_backend() -> GPUBackend: """Detect best available GPU backend.""" - # Check CUDA first (most common) - if torch.cuda.is_available(): - logger.info(f"CUDA available: {torch.cuda.get_device_name(0)}") - return GPUBackend.CUDA - - # Check Apple MPS - if hasattr(torch.backends, 'mps') and torch.backends.mps.is_available(): - logger.info("Apple MPS (Metal) available") - return GPUBackend.MPS - - # Check DirectML on Windows - try: - import torch_directml - if torch_directml.is_available(): - logger.info("DirectML available") - return GPUBackend.DIRECTML - except ImportError: - pass - - logger.info("No GPU backend available, using CPU") - return GPUBackend.CPU - - @staticmethod - def get_device_string(backend: GPUBackend) -> str: - """Get PyTorch device string for backend.""" - if backend == GPUBackend.CUDA: - return "cuda:0" - elif backend == GPUBackend.MPS: - return "mps" - elif backend == GPUBackend.DIRECTML: - return "privateuseone:0" # DirectML device - return "cpu" + info = HardwareDetector.detect_all() + return info.gpu_backend @staticmethod def get_gpu_info() -> Dict[str, Any]: """Get detailed GPU information.""" - info = { - 'backend': GPUDetector.detect_backend().value, - 'cuda_available': torch.cuda.is_available(), - 'mps_available': hasattr(torch.backends, 'mps') and 
torch.backends.mps.is_available(), - 'devices': [] - } - - if torch.cuda.is_available(): - for i in range(torch.cuda.device_count()): - info['devices'].append({ - 'id': i, - 'name': torch.cuda.get_device_name(i), - 'memory_total': torch.cuda.get_device_properties(i).total_memory - }) - - return info + info = HardwareDetector.detect_all() + return info.to_dict() -class OCRProcessor: - """OCR text extraction using PaddleOCR or OpenCV fallback with GPU support.""" +class UnifiedOCRProcessor: + """ + Unified OCR processor with multiple backend support. - SUPPORTED_LANGUAGES = ['en', 'sv', 'latin'] # English, Swedish, Latin script + Automatically selects the best available backend based on: + 1. Hardware capabilities + 2. PyTorch DLL compatibility + 3. User preferences - def __init__(self, use_gpu: bool = True, lang: str = 'en'): + Gracefully falls through backends if one fails. + """ + + SUPPORTED_LANGUAGES = ['en', 'sv', 'latin', 'de', 'fr', 'es'] + + # Default priority (can be overridden) + DEFAULT_PRIORITY = [ + 'paddleocr', # Best accuracy if available + 'easyocr', # Good balance + 'tesseract', # Stable fallback + 'opencv_east', # Fastest, always works + ] + + def __init__(self, use_gpu: bool = True, lang: str = 'en', + backend_priority: Optional[List[str]] = None, + auto_select: bool = True): + """ + Initialize Unified OCR Processor. + + Args: + use_gpu: Enable GPU acceleration if available + lang: Language for OCR ('en', 'sv', 'latin', etc.) 
+ backend_priority: Custom backend priority order + auto_select: Automatically select best backend + """ self.use_gpu = use_gpu self.lang = lang if lang in self.SUPPORTED_LANGUAGES else 'en' - self.ocr = None - self.backend = GPUBackend.CPU - self.opencv_detector = None - self._primary_backend = None # 'paddle' or 'opencv' - self._init_ocr() + self.backend_priority = backend_priority or self.DEFAULT_PRIORITY + + self._backend: Optional[BaseOCRBackend] = None + self._backend_name: str = "unknown" + self._hardware_info: HardwareInfo = HardwareDetector.detect_all() + + # Initialize + if auto_select: + self._auto_select_backend() + + logger.info(f"UnifiedOCR initialized with backend: {self._backend_name}") - def _init_ocr(self): - """Initialize OCR with PaddleOCR or OpenCV fallback.""" - # Try PaddleOCR first (better accuracy) - if PADDLE_AVAILABLE: - try: - self._init_paddle() - if self.ocr is not None: - self._primary_backend = 'paddle' - return - except Exception as e: - logger.warning(f"PaddleOCR init failed: {e}") - - # Fallback to OpenCV text detection - logger.info("Using OpenCV text detection as fallback") - self.opencv_detector = OpenCVTextDetector(use_gpu=self.use_gpu) - if self.opencv_detector.is_available(): - self._primary_backend = 'opencv' - self.backend = GPUBackend.CUDA if self.opencv_detector.check_gpu_available() else GPUBackend.CPU - logger.info(f"OpenCV text detector ready (GPU: {self.backend == GPUBackend.CUDA})") + def _auto_select_backend(self): + """Automatically select the best available backend.""" + # Check for PyTorch DLL errors first + if self._hardware_info.pytorch_dll_error: + logger.warning( + "PyTorch DLL error detected - avoiding PyTorch-based backends" + ) + # Remove PyTorch-dependent backends from priority + safe_backends = [ + b for b in self.backend_priority + if b not in ['paddleocr', 'easyocr'] + ] else: - logger.error("No OCR backend available") - - def _init_paddle(self): - """Initialize PaddleOCR with appropriate backend.""" 
- # Detect GPU - if self.use_gpu: - self.backend = GPUDetector.detect_backend() - use_gpu_flag = self.backend != GPUBackend.CPU - else: - use_gpu_flag = False + safe_backends = self.backend_priority - # Map language codes - lang_map = { - 'en': 'en', - 'sv': 'latin', # Swedish uses latin script model - 'latin': 'latin' - } - paddle_lang = lang_map.get(self.lang, 'en') + # Try the hardware-recommended backend first, then the rest in priority order + recommended = HardwareDetector.recommend_ocr_backend() + if recommended in safe_backends: + safe_backends = [recommended] + [b for b in safe_backends if b != recommended] - logger.info(f"Initializing PaddleOCR (lang={paddle_lang}, gpu={use_gpu_flag})") + for name in safe_backends: + backend = OCRBackendFactory.create_backend( + name, + use_gpu=self.use_gpu, + lang=self.lang + ) + + if backend is not None and backend.is_available(): + self._backend = backend + self._backend_name = name + logger.info(f"Selected OCR backend: {name}") + return - self.ocr = PaddleOCR( - lang=paddle_lang, - use_gpu=use_gpu_flag, - show_log=False, - use_angle_cls=True, - det_db_thresh=0.3, - det_db_box_thresh=0.5, - rec_thresh=0.5, + # Ultimate fallback - OpenCV EAST always works + logger.warning("All preferred backends failed, trying OpenCV EAST...") + backend = OCRBackendFactory.create_backend( + 'opencv_east', + use_gpu=self.use_gpu, + lang=self.lang ) - logger.info(f"PaddleOCR initialized successfully (backend: {self.backend.value})") - - def preprocess_for_ocr(self, image: np.ndarray) -> np.ndarray: - """Preprocess image for better OCR results.""" - # Convert to grayscale if needed - if len(image.shape) == 3: - gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) + if backend is not None and backend.is_available(): + self._backend = backend + self._backend_name = 'opencv_east' + logger.info("Using OpenCV EAST as ultimate fallback") else: - gray = image + logger.error("CRITICAL: No OCR backend available!") + + def set_backend(self, name: str) -> bool: + """ + Manually set OCR backend.
- # Denoise - denoised = cv2.fastNlMeansDenoising(gray, None, 10, 7, 21) - - # Adaptive threshold for better text contrast - binary = cv2.adaptiveThreshold( - denoised, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, - cv2.THRESH_BINARY, 11, 2 + Args: + name: Backend name ('paddleocr', 'easyocr', 'tesseract', 'opencv_east') + + Returns: + True if successful + """ + backend = OCRBackendFactory.create_backend( + name, + use_gpu=self.use_gpu, + lang=self.lang ) - return binary + if backend is not None and backend.is_available(): + self._backend = backend + self._backend_name = name + logger.info(f"Switched to OCR backend: {name}") + return True + else: + logger.error(f"Failed to switch to OCR backend: {name}") + return False def extract_text(self, image: Union[str, np.ndarray, Path]) -> List[TextRegion]: """ - Extract text from image using PaddleOCR or OpenCV fallback. + Extract text from image using selected backend. Args: image: Image path or numpy array - + Returns: List of detected text regions """ @@ -282,68 +277,29 @@ class OCRProcessor: else: img = image.copy() - # Use appropriate backend - if self._primary_backend == 'paddle' and self.ocr is not None: - return self._extract_text_paddle(img) - elif self._primary_backend == 'opencv' and self.opencv_detector is not None: - return self._extract_text_opencv(img) - else: - logger.warning("No OCR backend available") + # Check backend + if self._backend is None: + logger.error("No OCR backend available") return [] - - def _extract_text_opencv(self, img: np.ndarray) -> List[TextRegion]: - """Extract text using OpenCV EAST detector.""" - detections = self.opencv_detector.detect_text(img) - - # Convert to TextRegion format (no text recognition, just detection) - regions = [] - for det in detections: - regions.append(TextRegion( - text="", # OpenCV detector doesn't recognize text, just finds regions - confidence=det.confidence, - bbox=det.bbox, - language=self.lang - )) - - return regions - - def _extract_text_paddle(self, img: 
np.ndarray) -> List[TextRegion]: - """Extract text using PaddleOCR.""" - # Preprocess - processed = self.preprocess_for_ocr(img) try: - # Run OCR - result = self.ocr.ocr(processed, cls=True) + # Extract text using backend + ocr_regions = self._backend.extract_text(img) - detected = [] - if result and result[0]: - for line in result[0]: - if line is None: - continue - bbox, (text, confidence) = line - - # Calculate bounding box - x_coords = [p[0] for p in bbox] - y_coords = [p[1] for p in bbox] - x, y = int(min(x_coords)), int(min(y_coords)) - w = int(max(x_coords) - x) - h = int(max(y_coords) - y) - - detected.append(TextRegion( - text=text.strip(), - confidence=float(confidence), - bbox=(x, y, w, h), - language=self.lang - )) + # Convert to TextRegion with backend info + regions = [ + TextRegion.from_ocr_region(r, self._backend_name) + for r in ocr_regions + ] - return detected + logger.debug(f"Extracted {len(regions)} text regions using {self._backend_name}") + return regions except Exception as e: - logger.error(f"OCR processing failed: {e}") + logger.error(f"OCR extraction failed: {e}") return [] - def extract_text_from_region(self, image: np.ndarray, + def extract_text_from_region(self, image: np.ndarray, region: Tuple[int, int, int, int]) -> List[TextRegion]: """Extract text from specific region of image.""" x, y, w, h = region @@ -360,6 +316,34 @@ class OCRProcessor: r.bbox = (x + rx, y + ry, rw, rh) return regions + + def get_available_backends(self) -> List[OCRBackendInfo]: + """Get information about all available backends.""" + return OCRBackendFactory.check_all_backends(self.use_gpu, self.lang) + + def get_current_backend(self) -> str: + """Get name of current backend.""" + return self._backend_name + + def get_backend_info(self) -> Dict[str, Any]: + """Get information about current backend.""" + if self._backend: + return self._backend.get_info().to_dict() + return {"error": "No backend initialized"} + + def is_recognition_supported(self) -> bool: + """ 
+ Check if current backend supports text recognition. + + Note: OpenCV EAST only detects text regions, doesn't recognize text. + """ + return self._backend_name not in ['opencv_east'] + + +# Legacy class for backward compatibility +class OCRProcessor(UnifiedOCRProcessor): + """Legacy OCR processor - now wraps UnifiedOCRProcessor.""" + pass class IconDetector: @@ -395,13 +379,8 @@ class IconDetector: logger.error(f"Failed to load template {template_file}: {e}") def detect_loot_window(self, image: np.ndarray) -> Optional[Tuple[int, int, int, int]]: - """ - Detect loot window in screenshot. - - Returns bounding box of loot window or None if not found. - """ + """Detect loot window in screenshot.""" # Look for common loot window indicators - # Method 1: Template matching for "Loot" text or window frame if 'loot_window' in self.templates: result = cv2.matchTemplate( image, self.templates['loot_window'], cv2.TM_CCOEFF_NORMED @@ -412,13 +391,9 @@ class IconDetector: return (*max_loc, w, h) # Method 2: Detect based on typical loot window characteristics - # Loot windows usually have a grid of items with consistent spacing gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) - - # Look for high-contrast regions that could be icons _, thresh = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY) - # Find contours contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE) # Filter for icon-sized squares @@ -427,7 +402,6 @@ class IconDetector: x, y, w, h = cv2.boundingRect(cnt) aspect = w / h if h > 0 else 0 - # Check if dimensions match typical icon sizes for size_name, (sw, sh) in self.ICON_SIZES.items(): if abs(w - sw) < 5 and abs(h - sh) < 5 and 0.8 < aspect < 1.2: potential_icons.append((x, y, w, h)) @@ -435,7 +409,6 @@ class IconDetector: # If we found multiple icons in a grid pattern, assume loot window if len(potential_icons) >= 2: - # Calculate bounding box of all icons xs = [p[0] for p in potential_icons] ys = [p[1] for p in potential_icons] ws = [p[2] 
for p in potential_icons] @@ -444,7 +417,6 @@ class IconDetector: min_x, max_x = min(xs), max(xs) + max(ws) min_y, max_y = min(ys), max(ys) + max(hs) - # Add padding padding = 20 return ( max(0, min_x - padding), @@ -455,20 +427,10 @@ class IconDetector: return None - def extract_icons_from_region(self, image: np.ndarray, + def extract_icons_from_region(self, image: np.ndarray, region: Tuple[int, int, int, int], icon_size: str = 'medium') -> List[IconRegion]: - """ - Extract icons from a specific region (e.g., loot window). - - Args: - image: Full screenshot - region: Bounding box (x, y, w, h) - icon_size: Size preset ('small', 'medium', 'large') - - Returns: - List of detected icon regions - """ + """Extract icons from a specific region.""" x, y, w, h = region roi = image[y:y+h, x:x+w] @@ -478,7 +440,6 @@ class IconDetector: target_size = self.ICON_SIZES.get(icon_size, (48, 48)) gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY) - # Multiple threshold attempts for different icon styles icons = [] thresholds = [(200, 255), (180, 255), (150, 255)] @@ -490,35 +451,30 @@ class IconDetector: cx, cy, cw, ch = cv2.boundingRect(cnt) aspect = cw / ch if ch > 0 else 0 - # Match icon size with tolerance - if (abs(cw - target_size[0]) < 8 and - abs(ch - target_size[1]) < 8 and + if (abs(cw - target_size[0]) < 8 and + abs(ch - target_size[1]) < 8 and 0.7 < aspect < 1.3): - # Extract icon image icon_img = roi[cy:cy+ch, cx:cx+cw] - - # Resize to standard size icon_img = cv2.resize(icon_img, target_size, interpolation=cv2.INTER_AREA) icons.append(IconRegion( image=icon_img, bbox=(x + cx, y + cy, cw, ch), - confidence=0.8 # Placeholder confidence + confidence=0.8 )) - # Remove duplicates (icons that overlap significantly) + # Remove duplicates unique_icons = self._remove_duplicate_icons(icons) return unique_icons - def _remove_duplicate_icons(self, icons: List[IconRegion], + def _remove_duplicate_icons(self, icons: List[IconRegion], iou_threshold: float = 0.5) -> List[IconRegion]: 
"""Remove duplicate icons based on IoU.""" if not icons: return [] - # Sort by confidence sorted_icons = sorted(icons, key=lambda x: x.confidence, reverse=True) kept = [] @@ -533,9 +489,9 @@ class IconDetector: return kept - def _calculate_iou(self, box1: Tuple[int, int, int, int], + def _calculate_iou(self, box1: Tuple[int, int, int, int], box2: Tuple[int, int, int, int]) -> float: - """Calculate Intersection over Union of two bounding boxes.""" + """Calculate Intersection over Union.""" x1, y1, w1, h1 = box1 x2, y2, w2, h2 = box2 @@ -551,33 +507,24 @@ class IconDetector: union_area = box1_area + box2_area - inter_area return inter_area / union_area if union_area > 0 else 0 - - def detect_icons_yolo(self, image: np.ndarray, - model_path: Optional[str] = None) -> List[IconRegion]: - """ - Detect icons using YOLO model (if available). - - This is a placeholder for future YOLO integration. - """ - # TODO: Implement YOLO detection when model is trained - logger.debug("YOLO detection not yet implemented") - return [] class GameVisionAI: """ Main AI vision interface for game screenshot analysis. - Combines OCR and icon detection with GPU acceleration. + Combines OCR and icon detection with multiple backend support. """ def __init__(self, use_gpu: bool = True, ocr_lang: str = 'en', + ocr_backend: Optional[str] = None, data_dir: Optional[Path] = None): """ Initialize Game Vision AI. 
Args: use_gpu: Enable GPU acceleration if available - ocr_lang: Language for OCR ('en', 'sv', 'latin') + ocr_lang: Language for OCR + ocr_backend: Specific OCR backend to use (None for auto) data_dir: Directory for storing extracted data """ self.use_gpu = use_gpu @@ -585,42 +532,34 @@ class GameVisionAI: self.extracted_icons_dir = self.data_dir / "extracted_icons" self.extracted_icons_dir.mkdir(parents=True, exist_ok=True) - # Detect GPU - self.backend = GPUDetector.detect_backend() if use_gpu else GPUBackend.CPU + # Detect hardware + self.hardware_info = HardwareDetector.detect_all() + self.backend = self.hardware_info.gpu_backend - # Initialize processors - self.ocr = OCRProcessor(use_gpu=use_gpu, lang=ocr_lang) + # Initialize OCR processor + self.ocr = UnifiedOCRProcessor( + use_gpu=use_gpu, + lang=ocr_lang, + auto_select=(ocr_backend is None) + ) + + # Set specific backend if requested + if ocr_backend: + self.ocr.set_backend(ocr_backend) + + # Initialize icon detector self.icon_detector = IconDetector() - # Icon matching cache - self.icon_cache: Dict[str, ItemMatch] = {} - - logger.info(f"GameVisionAI initialized (GPU: {self.backend.value})") + logger.info(f"GameVisionAI initialized (GPU: {self.backend.value}, " + f"OCR: {self.ocr.get_current_backend()})") def extract_text_from_image(self, image_path: Union[str, Path]) -> List[TextRegion]: - """ - Extract all text from an image. - - Args: - image_path: Path to screenshot image - - Returns: - List of detected text regions - """ + """Extract all text from an image.""" return self.ocr.extract_text(image_path) def extract_icons_from_image(self, image_path: Union[str, Path], auto_detect_window: bool = True) -> List[IconRegion]: - """ - Extract item icons from image. 
- - Args: - image_path: Path to screenshot image - auto_detect_window: Automatically detect loot window - - Returns: - List of detected icon regions - """ + """Extract item icons from image.""" image = cv2.imread(str(image_path)) if image is None: logger.error(f"Failed to load image: {image_path}") @@ -635,7 +574,6 @@ class GameVisionAI: ) else: logger.debug("No loot window detected, scanning full image") - # Scan full image h, w = image.shape[:2] return self.icon_detector.extract_icons_from_region( image, (0, 0, w, h) @@ -646,26 +584,6 @@ class GameVisionAI: image, (0, 0, w, h) ) - def match_icon_to_database(self, icon_image: np.ndarray, - database_path: Optional[Path] = None) -> Optional[ItemMatch]: - """ - Match extracted icon to item database. - - Args: - icon_image: Icon image (numpy array) - database_path: Path to icon database directory - - Returns: - ItemMatch if found, None otherwise - """ - from .icon_matcher import IconMatcher - - # Lazy load matcher - if not hasattr(self, '_icon_matcher'): - self._icon_matcher = IconMatcher(database_path) - - return self._icon_matcher.match_icon(icon_image) - def process_screenshot(self, image_path: Union[str, Path], extract_text: bool = True, extract_icons: bool = True) -> VisionResult: @@ -676,13 +594,16 @@ class GameVisionAI: image_path: Path to screenshot extract_text: Enable text extraction extract_icons: Enable icon extraction - + Returns: VisionResult with all detections """ start_time = time.time() - result = VisionResult(gpu_backend=self.backend.value) + result = VisionResult( + gpu_backend=self.backend.value, + ocr_backend=self.ocr.get_current_backend() + ) # Load image once image = cv2.imread(str(image_path)) @@ -717,28 +638,31 @@ class GameVisionAI: def get_gpu_info(self) -> Dict[str, Any]: """Get GPU information.""" - return GPUDetector.get_gpu_info() + return self.hardware_info.to_dict() def is_gpu_available(self) -> bool: """Check if GPU acceleration is available.""" return self.backend != GPUBackend.CPU 
+ def get_ocr_backends(self) -> List[Dict[str, Any]]: + """Get information about all available OCR backends.""" + backends = self.ocr.get_available_backends() + return [b.to_dict() for b in backends] + + def switch_ocr_backend(self, name: str) -> bool: + """Switch to a different OCR backend.""" + return self.ocr.set_backend(name) + def calibrate_for_game(self, sample_screenshots: List[Path]) -> Dict[str, Any]: - """ - Calibrate vision system using sample screenshots. - - Args: - sample_screenshots: List of sample game screenshots - - Returns: - Calibration results - """ + """Calibrate vision system using sample screenshots.""" calibration = { 'screenshots_processed': 0, 'text_regions_detected': 0, 'icons_detected': 0, 'average_processing_time_ms': 0, - 'detected_regions': {} + 'detected_regions': {}, + 'ocr_backend': self.ocr.get_current_backend(), + 'gpu_backend': self.backend.value, } total_time = 0 @@ -763,17 +687,36 @@ class GameVisionAI: ) return calibration + + @staticmethod + def diagnose() -> Dict[str, Any]: + """Run full diagnostic on vision system.""" + return { + 'hardware': HardwareDetector.detect_all().to_dict(), + 'ocr_backends': [ + b.to_dict() for b in + OCRBackendFactory.check_all_backends() + ], + 'recommendations': { + 'ocr_backend': HardwareDetector.recommend_ocr_backend(), + 'gpu': GPUDetector.detect_backend().value, + } + } # Export main classes __all__ = [ 'GameVisionAI', + 'UnifiedOCRProcessor', + 'OCRProcessor', # Legacy 'TextRegion', - 'IconRegion', + 'IconRegion', 'ItemMatch', 'VisionResult', 'GPUBackend', 'GPUDetector', - 'OCRProcessor', - 'IconDetector' + 'IconDetector', + 'HardwareDetector', + 'OCRBackendFactory', + 'BaseOCRBackend', ] diff --git a/modules/hardware_detection.py b/modules/hardware_detection.py new file mode 100644 index 0000000..55f5293 --- /dev/null +++ b/modules/hardware_detection.py @@ -0,0 +1,367 @@ +""" +Lemontropia Suite - Hardware Detection Module +Detect GPU and ML framework availability with error handling. 
+""" + +import logging +from typing import Dict, Any, Optional, List +from dataclasses import dataclass, field +from enum import Enum + +logger = logging.getLogger(__name__) + + +class GPUBackend(Enum): + """Supported GPU backends.""" + CUDA = "cuda" # NVIDIA CUDA + MPS = "mps" # Apple Metal Performance Shaders + DIRECTML = "directml" # Windows DirectML + CPU = "cpu" # Fallback CPU + + +@dataclass +class HardwareInfo: + """Complete hardware information.""" + # GPU Info + gpu_backend: GPUBackend = GPUBackend.CPU + cuda_available: bool = False + cuda_device_count: int = 0 + cuda_devices: List[Dict] = field(default_factory=list) + mps_available: bool = False + directml_available: bool = False + + # OpenCV GPU + opencv_cuda_available: bool = False + opencv_cuda_devices: int = 0 + + # ML Frameworks + pytorch_available: bool = False + pytorch_version: Optional[str] = None + pytorch_error: Optional[str] = None + pytorch_dll_error: bool = False + + paddle_available: bool = False + paddle_version: Optional[str] = None + + # System + platform: str = "unknown" + python_executable: str = "unknown" + is_windows_store_python: bool = False + + def to_dict(self) -> Dict[str, Any]: + return { + 'gpu': { + 'backend': self.gpu_backend.value, + 'cuda_available': self.cuda_available, + 'cuda_devices': self.cuda_devices, + 'mps_available': self.mps_available, + 'directml_available': self.directml_available, + 'opencv_cuda': self.opencv_cuda_available, + }, + 'ml_frameworks': { + 'pytorch': { + 'available': self.pytorch_available, + 'version': self.pytorch_version, + 'error': self.pytorch_error, + 'dll_error': self.pytorch_dll_error, + }, + 'paddle': { + 'available': self.paddle_available, + 'version': self.paddle_version, + } + }, + 'system': { + 'platform': self.platform, + 'python': self.python_executable, + 'windows_store': self.is_windows_store_python, + } + } + + +class HardwareDetector: + """Detect hardware capabilities with error handling.""" + + @staticmethod + def detect_all() 
-> HardwareInfo:
+        """Detect all hardware capabilities."""
+        info = HardwareInfo()
+
+        # Detect system info
+        info = HardwareDetector._detect_system(info)
+
+        # Detect OpenCV GPU
+        info = HardwareDetector._detect_opencv_cuda(info)
+
+        # Detect PyTorch (with special error handling)
+        info = HardwareDetector._detect_pytorch_safe(info)
+
+        # Detect PaddlePaddle
+        info = HardwareDetector._detect_paddle(info)
+
+        # Determine best GPU backend
+        info = HardwareDetector._determine_gpu_backend(info)
+
+        return info
+
+    @staticmethod
+    def _detect_system(info: HardwareInfo) -> HardwareInfo:
+        """Detect system information."""
+        import sys
+        import platform
+
+        info.platform = platform.system()
+        info.python_executable = sys.executable
+
+        # Detect Windows Store Python
+        exe_lower = sys.executable.lower()
+        info.is_windows_store_python = (
+            'windowsapps' in exe_lower or
+            'microsoft' in exe_lower
+        )
+
+        if info.is_windows_store_python:
+            logger.warning(
+                "Windows Store Python detected - may have DLL compatibility issues"
+            )
+
+        return info
+
+    @staticmethod
+    def _detect_opencv_cuda(info: HardwareInfo) -> HardwareInfo:
+        """Detect OpenCV CUDA support."""
+        try:
+            import cv2
+
+            cuda_count = cv2.cuda.getCudaEnabledDeviceCount()
+            info.opencv_cuda_devices = cuda_count
+            info.opencv_cuda_available = cuda_count > 0
+
+            if info.opencv_cuda_available:
+                # cv2.cuda.getDevice() returns a device id (int), not an
+                # object with a .name() method, so report the count instead
+                logger.info(f"OpenCV CUDA available ({cuda_count} device(s))")
+
+        except Exception as e:
+            logger.debug(f"OpenCV CUDA detection failed: {e}")
+            info.opencv_cuda_available = False
+
+        return info
+
+    @staticmethod
+    def _detect_pytorch_safe(info: HardwareInfo) -> HardwareInfo:
+        """
+        Detect PyTorch with safe error handling for DLL issues.
+
+        This is critical for Windows Store Python compatibility.
+ """ + try: + import torch + + info.pytorch_available = True + info.pytorch_version = torch.__version__ + + # Check CUDA + info.cuda_available = torch.cuda.is_available() + if info.cuda_available: + info.cuda_device_count = torch.cuda.device_count() + for i in range(info.cuda_device_count): + info.cuda_devices.append({ + 'id': i, + 'name': torch.cuda.get_device_name(i), + 'memory': torch.cuda.get_device_properties(i).total_memory + }) + logger.info(f"PyTorch CUDA: {info.cuda_devices}") + + # Check MPS (Apple Silicon) + if hasattr(torch.backends, 'mps'): + info.mps_available = torch.backends.mps.is_available() + if info.mps_available: + logger.info("PyTorch MPS (Metal) available") + + logger.info(f"PyTorch {info.pytorch_version} available") + + except ImportError: + info.pytorch_available = False + info.pytorch_error = "PyTorch not installed" + logger.debug("PyTorch not installed") + + except OSError as e: + # DLL error - common with Windows Store Python + error_str = str(e).lower() + info.pytorch_available = False + info.pytorch_dll_error = True + info.pytorch_error = str(e) + + if any(x in error_str for x in ['dll', 'c10', 'specified module']): + logger.error( + f"PyTorch DLL error (Windows Store Python?): {e}" + ) + logger.info( + "This is a known issue. Use alternative OCR backends." 
+ ) + else: + logger.error(f"PyTorch OS error: {e}") + + except Exception as e: + info.pytorch_available = False + info.pytorch_error = str(e) + logger.error(f"PyTorch detection failed: {e}") + + return info + + @staticmethod + def _detect_paddle(info: HardwareInfo) -> HardwareInfo: + """Detect PaddlePaddle availability.""" + try: + import paddle + info.paddle_available = True + info.paddle_version = paddle.__version__ + logger.info(f"PaddlePaddle {info.paddle_version} available") + + except ImportError: + info.paddle_available = False + logger.debug("PaddlePaddle not installed") + + except Exception as e: + info.paddle_available = False + logger.debug(f"PaddlePaddle detection failed: {e}") + + return info + + @staticmethod + def _determine_gpu_backend(info: HardwareInfo) -> HardwareInfo: + """Determine the best available GPU backend.""" + # Priority: CUDA > MPS > DirectML > CPU + + if info.cuda_available: + info.gpu_backend = GPUBackend.CUDA + elif info.mps_available: + info.gpu_backend = GPUBackend.MPS + elif info.directml_available: + info.gpu_backend = GPUBackend.DIRECTML + else: + info.gpu_backend = GPUBackend.CPU + + return info + + @staticmethod + def get_gpu_summary() -> str: + """Get a human-readable GPU summary.""" + info = HardwareDetector.detect_all() + + lines = ["=" * 50] + lines.append("HARDWARE DETECTION SUMMARY") + lines.append("=" * 50) + + # GPU Section + lines.append(f"\nGPU Backend: {info.gpu_backend.value.upper()}") + + if info.cuda_available: + lines.append(f"CUDA Devices: {info.cuda_device_count}") + for dev in info.cuda_devices: + gb = dev['memory'] / (1024**3) + lines.append(f" [{dev['id']}] {dev['name']} ({gb:.1f} GB)") + + if info.mps_available: + lines.append("Apple MPS (Metal): Available") + + if info.opencv_cuda_available: + lines.append(f"OpenCV CUDA: {info.opencv_cuda_devices} device(s)") + + # ML Frameworks + lines.append("\nML Frameworks:") + + if info.pytorch_available: + lines.append(f" PyTorch: {info.pytorch_version}") + 
lines.append(f" CUDA: {'Yes' if info.cuda_available else 'No'}") + else: + lines.append(f" PyTorch: Not available") + if info.pytorch_dll_error: + lines.append(f" āš ļø DLL Error (Windows Store Python?)") + + if info.paddle_available: + lines.append(f" PaddlePaddle: {info.paddle_version}") + else: + lines.append(f" PaddlePaddle: Not installed") + + # System + lines.append(f"\nSystem: {info.platform}") + if info.is_windows_store_python: + lines.append("āš ļø Windows Store Python (may have DLL issues)") + + lines.append("=" * 50) + + return "\n".join(lines) + + @staticmethod + def can_use_paddleocr() -> bool: + """Check if PaddleOCR can be used (no DLL errors).""" + info = HardwareDetector.detect_all() + return info.pytorch_available and not info.pytorch_dll_error + + @staticmethod + def recommend_ocr_backend() -> str: + """ + Recommend the best OCR backend based on hardware. + + Returns: + Name of recommended backend + """ + info = HardwareDetector.detect_all() + + # If PyTorch has DLL error, avoid PaddleOCR and EasyOCR (which uses PyTorch) + if info.pytorch_dll_error: + logger.info("PyTorch DLL error detected - avoiding PyTorch-based OCR") + + # Check OpenCV CUDA first + if info.opencv_cuda_available: + return 'opencv_east' + + # Check Tesseract + try: + import pytesseract + return 'tesseract' + except ImportError: + pass + + # Fall back to OpenCV EAST (CPU) + return 'opencv_east' + + # No DLL issues - can use any backend + # Priority: PaddleOCR > EasyOCR > Tesseract > OpenCV EAST + + if info.pytorch_available and info.paddle_available: + return 'paddleocr' + + if info.pytorch_available: + try: + import easyocr + return 'easyocr' + except ImportError: + pass + + try: + import pytesseract + return 'tesseract' + except ImportError: + pass + + return 'opencv_east' + + +# Convenience functions +def get_hardware_info() -> HardwareInfo: + """Get complete hardware information.""" + return HardwareDetector.detect_all() + + +def print_hardware_summary(): + """Print 
hardware summary to console.""" + print(HardwareDetector.get_gpu_summary()) + + +def recommend_ocr_backend() -> str: + """Get recommended OCR backend.""" + return HardwareDetector.recommend_ocr_backend() diff --git a/modules/ocr_backends/__init__.py b/modules/ocr_backends/__init__.py new file mode 100644 index 0000000..8f2d0e3 --- /dev/null +++ b/modules/ocr_backends/__init__.py @@ -0,0 +1,254 @@ +""" +Lemontropia Suite - OCR Backends Base Interface +Unified interface for multiple OCR backends with auto-fallback. +""" + +from abc import ABC, abstractmethod +from dataclasses import dataclass +from typing import List, Tuple, Optional, Dict, Any, Union +from pathlib import Path +import numpy as np +import logging + +logger = logging.getLogger(__name__) + + +@dataclass +class OCRTextRegion: + """Detected text region with metadata.""" + text: str + confidence: float + bbox: Tuple[int, int, int, int] # x, y, w, h + language: str = "en" + + def to_dict(self) -> Dict[str, Any]: + return { + 'text': self.text, + 'confidence': self.confidence, + 'bbox': self.bbox, + 'language': self.language + } + + +@dataclass +class OCRBackendInfo: + """Information about an OCR backend.""" + name: str + available: bool + gpu_accelerated: bool = False + error_message: Optional[str] = None + version: Optional[str] = None + + def to_dict(self) -> Dict[str, Any]: + return { + 'name': self.name, + 'available': self.available, + 'gpu_accelerated': self.gpu_accelerated, + 'error_message': self.error_message, + 'version': self.version + } + + +class BaseOCRBackend(ABC): + """Abstract base class for OCR backends.""" + + NAME = "base" + SUPPORTS_GPU = False + + def __init__(self, use_gpu: bool = True, lang: str = 'en', **kwargs): + self.use_gpu = use_gpu + self.lang = lang + self._available = False + self._error_msg = None + self._version = None + + @abstractmethod + def _initialize(self) -> bool: + """Initialize the backend. 
Return True if successful.""" + pass + + @abstractmethod + def extract_text(self, image: np.ndarray) -> List[OCRTextRegion]: + """Extract text from image.""" + pass + + def is_available(self) -> bool: + """Check if backend is available.""" + return self._available + + def get_info(self) -> OCRBackendInfo: + """Get backend information.""" + return OCRBackendInfo( + name=self.NAME, + available=self._available, + gpu_accelerated=self.SUPPORTS_GPU and self.use_gpu, + error_message=self._error_msg, + version=self._version + ) + + def preprocess_image(self, image: np.ndarray, + grayscale: bool = True, + denoise: bool = True, + contrast: bool = True) -> np.ndarray: + """Preprocess image for better OCR results.""" + processed = image.copy() + + # Convert to grayscale if needed + if grayscale and len(processed.shape) == 3: + processed = self._to_grayscale(processed) + + # Denoise + if denoise: + processed = self._denoise(processed) + + # Enhance contrast + if contrast: + processed = self._enhance_contrast(processed) + + return processed + + def _to_grayscale(self, image: np.ndarray) -> np.ndarray: + """Convert image to grayscale.""" + if len(image.shape) == 3: + import cv2 + return cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) + return image + + def _denoise(self, image: np.ndarray) -> np.ndarray: + """Denoise image.""" + import cv2 + if len(image.shape) == 2: + return cv2.fastNlMeansDenoising(image, None, 10, 7, 21) + return image + + def _enhance_contrast(self, image: np.ndarray) -> np.ndarray: + """Enhance image contrast.""" + import cv2 + if len(image.shape) == 2: + # CLAHE (Contrast Limited Adaptive Histogram Equalization) + clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8)) + return clahe.apply(image) + return image + + +class OCRBackendFactory: + """Factory for creating OCR backends with auto-fallback.""" + + # Priority order: fastest/most reliable first + BACKEND_PRIORITY = [ + 'opencv_east', # Fastest, no dependencies, detection only + 'easyocr', # Good 
accuracy, lighter than PaddleOCR + 'tesseract', # Traditional, stable + 'paddleocr', # Best accuracy but heavy dependencies + ] + + _backends: Dict[str, Any] = {} + _backend_classes: Dict[str, type] = {} + + @classmethod + def register_backend(cls, name: str, backend_class: type): + """Register a backend class.""" + cls._backend_classes[name] = backend_class + logger.debug(f"Registered OCR backend: {name}") + + @classmethod + def create_backend(cls, name: str, use_gpu: bool = True, + lang: str = 'en', **kwargs) -> Optional[BaseOCRBackend]: + """Create a specific backend by name.""" + if name not in cls._backend_classes: + logger.error(f"Unknown OCR backend: {name}") + return None + + try: + backend = cls._backend_classes[name](use_gpu=use_gpu, lang=lang, **kwargs) + if backend._initialize(): + logger.info(f"Created OCR backend: {name}") + return backend + else: + logger.warning(f"Failed to initialize OCR backend: {name}") + return None + except Exception as e: + logger.error(f"Error creating OCR backend {name}: {e}") + return None + + @classmethod + def get_best_backend(cls, use_gpu: bool = True, lang: str = 'en', + priority: Optional[List[str]] = None, + **kwargs) -> Optional[BaseOCRBackend]: + """Get the best available backend based on priority order.""" + priority = priority or cls.BACKEND_PRIORITY + + logger.info(f"Searching for best OCR backend (priority: {priority})") + + for name in priority: + if name not in cls._backend_classes: + continue + + backend = cls.create_backend(name, use_gpu=use_gpu, lang=lang, **kwargs) + if backend is not None and backend.is_available(): + info = backend.get_info() + logger.info(f"Selected OCR backend: {name} (GPU: {info.gpu_accelerated})") + return backend + + logger.error("No OCR backend available!") + return None + + @classmethod + def check_all_backends(cls, use_gpu: bool = True, lang: str = 'en') -> List[OCRBackendInfo]: + """Check availability of all backends.""" + results = [] + + for name in cls.BACKEND_PRIORITY: + if 
name not in cls._backend_classes: + continue + + try: + backend = cls._backend_classes[name](use_gpu=use_gpu, lang=lang) + backend._initialize() + results.append(backend.get_info()) + except Exception as e: + results.append(OCRBackendInfo( + name=name, + available=False, + error_message=str(e) + )) + + return results + + @classmethod + def list_available_backends(cls, use_gpu: bool = True, lang: str = 'en') -> List[str]: + """List names of available backends.""" + info_list = cls.check_all_backends(use_gpu, lang) + return [info.name for info in info_list if info.available] + + +# Import and register backends +def _register_backends(): + """Register all available backends.""" + try: + from .opencv_east_backend import OpenCVEASTBackend + OCRBackendFactory.register_backend('opencv_east', OpenCVEASTBackend) + except ImportError as e: + logger.debug(f"OpenCV EAST backend not available: {e}") + + try: + from .easyocr_backend import EasyOCRBackend + OCRBackendFactory.register_backend('easyocr', EasyOCRBackend) + except ImportError as e: + logger.debug(f"EasyOCR backend not available: {e}") + + try: + from .tesseract_backend import TesseractBackend + OCRBackendFactory.register_backend('tesseract', TesseractBackend) + except ImportError as e: + logger.debug(f"Tesseract backend not available: {e}") + + try: + from .paddleocr_backend import PaddleOCRBackend + OCRBackendFactory.register_backend('paddleocr', PaddleOCRBackend) + except ImportError as e: + logger.debug(f"PaddleOCR backend not available: {e}") + + +# Auto-register on import +_register_backends() diff --git a/modules/ocr_backends/easyocr_backend.py b/modules/ocr_backends/easyocr_backend.py new file mode 100644 index 0000000..5d3e211 --- /dev/null +++ b/modules/ocr_backends/easyocr_backend.py @@ -0,0 +1,184 @@ +""" +Lemontropia Suite - EasyOCR Backend +Text recognition using EasyOCR - lighter than PaddleOCR. +""" + +import numpy as np +import logging +from typing import List, Optional + +from . 
import BaseOCRBackend, OCRTextRegion + +logger = logging.getLogger(__name__) + + +class EasyOCRBackend(BaseOCRBackend): + """ + OCR backend using EasyOCR. + + Pros: + - Lighter than PaddleOCR + - Good accuracy + - Supports many languages + - Can run on CPU reasonably well + + Cons: + - First run downloads models (~100MB) + - Slower than OpenCV EAST + + Installation: pip install easyocr + """ + + NAME = "easyocr" + SUPPORTS_GPU = True + + def __init__(self, use_gpu: bool = True, lang: str = 'en', **kwargs): + super().__init__(use_gpu=use_gpu, lang=lang, **kwargs) + + self.reader = None + self._gpu_available = False + + # Language mapping + self.lang_map = { + 'en': 'en', + 'sv': 'sv', # Swedish + 'de': 'de', + 'fr': 'fr', + 'es': 'es', + 'latin': 'latin', + } + + def _initialize(self) -> bool: + """Initialize EasyOCR reader.""" + try: + import easyocr + + # Map language code + easyocr_lang = self.lang_map.get(self.lang, 'en') + + # Check GPU availability + self._gpu_available = self._check_gpu() + use_gpu_flag = self.use_gpu and self._gpu_available + + logger.info(f"Initializing EasyOCR (lang={easyocr_lang}, gpu={use_gpu_flag})") + + # Create reader + # EasyOCR downloads models automatically on first run + self.reader = easyocr.Reader( + [easyocr_lang], + gpu=use_gpu_flag, + verbose=False + ) + + self._available = True + self._version = easyocr.__version__ if hasattr(easyocr, '__version__') else 'unknown' + + logger.info(f"EasyOCR initialized successfully (GPU: {use_gpu_flag})") + return True + + except ImportError: + self._error_msg = "EasyOCR not installed. Run: pip install easyocr" + logger.warning(self._error_msg) + return False + + except Exception as e: + # Handle specific PyTorch/CUDA errors + error_str = str(e).lower() + + if 'cuda' in error_str or 'c10' in error_str or 'gpu' in error_str: + self._error_msg = f"EasyOCR GPU initialization failed: {e}" + logger.warning(f"{self._error_msg}. 
Try with use_gpu=False") + + # Try CPU fallback + if self.use_gpu: + logger.info("Attempting EasyOCR CPU fallback...") + self.use_gpu = False + return self._initialize() + + else: + self._error_msg = f"EasyOCR initialization failed: {e}" + logger.error(self._error_msg) + + return False + + def _check_gpu(self) -> bool: + """Check if GPU is available for EasyOCR.""" + try: + import torch + + if torch.cuda.is_available(): + logger.info(f"CUDA available: {torch.cuda.get_device_name(0)}") + return True + + # Check MPS (Apple Silicon) + if hasattr(torch.backends, 'mps') and torch.backends.mps.is_available(): + logger.info("Apple MPS available") + return True + + return False + + except ImportError: + return False + except Exception as e: + logger.debug(f"GPU check failed: {e}") + return False + + def extract_text(self, image: np.ndarray) -> List[OCRTextRegion]: + """ + Extract text from image using EasyOCR. + + Args: + image: Input image (BGR format from OpenCV) + + Returns: + List of detected text regions with recognized text + """ + if not self._available or self.reader is None: + logger.error("EasyOCR backend not initialized") + return [] + + try: + # EasyOCR expects RGB format + if len(image.shape) == 3: + import cv2 + image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) + else: + image_rgb = image + + # Run OCR + results = self.reader.readtext(image_rgb) + + regions = [] + for detection in results: + # EasyOCR returns: (bbox, text, confidence) + bbox, text, conf = detection + + # Calculate bounding box from polygon + # bbox is list of 4 points: [[x1,y1], [x2,y2], [x3,y3], [x4,y4]] + x_coords = [p[0] for p in bbox] + y_coords = [p[1] for p in bbox] + + x = int(min(x_coords)) + y = int(min(y_coords)) + w = int(max(x_coords) - x) + h = int(max(y_coords) - y) + + regions.append(OCRTextRegion( + text=text.strip(), + confidence=float(conf), + bbox=(x, y, w, h), + language=self.lang + )) + + logger.debug(f"EasyOCR detected {len(regions)} text regions") + return regions + + 
except Exception as e: + logger.error(f"EasyOCR extraction failed: {e}") + return [] + + def get_info(self): + """Get backend information.""" + info = super().get_info() + info.gpu_accelerated = self._gpu_available and self.use_gpu + return info diff --git a/modules/ocr_backends/opencv_east_backend.py b/modules/ocr_backends/opencv_east_backend.py new file mode 100644 index 0000000..3714995 --- /dev/null +++ b/modules/ocr_backends/opencv_east_backend.py @@ -0,0 +1,315 @@ +""" +Lemontropia Suite - OpenCV EAST OCR Backend +Fast text detection using OpenCV DNN with EAST model. +No heavy dependencies, works with Windows Store Python. +""" + +import cv2 +import numpy as np +import logging +from pathlib import Path +from typing import List, Tuple, Optional +import urllib.request + +from . import BaseOCRBackend, OCRTextRegion + +logger = logging.getLogger(__name__) + + +class OpenCVEASTBackend(BaseOCRBackend): + """ + Text detector using OpenCV DNN with EAST model. + + This is the primary fallback backend because: + - Pure OpenCV, no PyTorch/TensorFlow dependencies + - Fast (CPU: ~23 FPS, GPU: ~97 FPS) + - Works with Windows Store Python + - Detects text regions (does not recognize text) + + Based on: https://pyimagesearch.com/2022/03/14/improving-text-detection-speed-with-opencv-and-gpus/ + """ + + NAME = "opencv_east" + SUPPORTS_GPU = True + + # EAST model download URL (frozen inference graph) + EAST_MODEL_URL = "https://github.com/oyyd/frozen_east_text_detection.pb/raw/master/frozen_east_text_detection.pb" + + def __init__(self, use_gpu: bool = True, lang: str = 'en', **kwargs): + super().__init__(use_gpu=use_gpu, lang=lang, **kwargs) + + self.net = None + self.model_path = kwargs.get('model_path') + + # Input size (must be multiple of 32) + self.input_width = kwargs.get('input_width', 320) + self.input_height = kwargs.get('input_height', 320) + + # Detection thresholds + self.confidence_threshold = kwargs.get('confidence_threshold', 0.5) + self.nms_threshold = 
kwargs.get('nms_threshold', 0.4) + + # GPU status + self._gpu_enabled = False + + def _initialize(self) -> bool: + """Initialize EAST text detector.""" + try: + # Determine model path + if not self.model_path: + model_dir = Path.home() / ".lemontropia" / "models" + model_dir.mkdir(parents=True, exist_ok=True) + self.model_path = str(model_dir / "frozen_east_text_detection.pb") + + model_file = Path(self.model_path) + + # Download model if needed + if not model_file.exists(): + if not self._download_model(): + return False + + # Load the model + logger.info(f"Loading EAST model from {self.model_path}") + self.net = cv2.dnn.readNet(self.model_path) + + # Enable GPU if requested + if self.use_gpu: + self._gpu_enabled = self._enable_gpu() + + self._available = True + self._version = cv2.__version__ + + logger.info(f"OpenCV EAST backend initialized (GPU: {self._gpu_enabled})") + return True + + except Exception as e: + self._error_msg = f"Failed to initialize EAST: {e}" + logger.error(self._error_msg) + return False + + def _download_model(self) -> bool: + """Download EAST model if not present.""" + try: + logger.info(f"Downloading EAST model from {self.EAST_MODEL_URL}") + logger.info(f"This is a one-time download (~95 MB)...") + + # Create progress callback + def progress_hook(count, block_size, total_size): + percent = int(count * block_size * 100 / total_size) + if percent % 10 == 0: # Log every 10% + logger.info(f"Download progress: {percent}%") + + urllib.request.urlretrieve( + self.EAST_MODEL_URL, + self.model_path, + reporthook=progress_hook + ) + + logger.info("EAST model downloaded successfully") + return True + + except Exception as e: + self._error_msg = f"Failed to download EAST model: {e}" + logger.error(self._error_msg) + return False + + def _enable_gpu(self) -> bool: + """Enable CUDA GPU acceleration.""" + try: + # Check CUDA availability + cuda_count = cv2.cuda.getCudaEnabledDeviceCount() + + if cuda_count > 0: + 
self.net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
+                self.net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
+
+                # cv2.cuda.getDevice() returns a device id (int), not an
+                # object with a .name() method, so report the count instead
+                logger.info(f"CUDA enabled ({cuda_count} device(s))")
+
+                return True
+            else:
+                logger.warning("CUDA not available in OpenCV, using CPU")
+                return False
+
+        except Exception as e:
+            logger.warning(f"Failed to enable CUDA: {e}, using CPU")
+            return False
+
+    def extract_text(self, image: np.ndarray) -> List[OCRTextRegion]:
+        """
+        Detect text regions in image.
+
+        Note: EAST only detects text regions, it does not recognize text.
+        The 'text' field will be empty, but bbox and confidence are accurate.
+
+        Args:
+            image: Input image (BGR format from OpenCV)
+
+        Returns:
+            List of detected text regions
+        """
+        if not self._available or self.net is None:
+            logger.error("EAST backend not initialized")
+            return []
+
+        try:
+            # Get image dimensions
+            (H, W) = image.shape[:2]
+
+            # Resize to input size
+            resized = cv2.resize(image, (self.input_width, self.input_height))
+
+            # Create blob from image
+            blob = cv2.dnn.blobFromImage(
+                resized,
+                scalefactor=1.0,
+                size=(self.input_width, self.input_height),
+                mean=(123.68, 116.78, 103.94),  # ImageNet means
+                swapRB=True,
+                crop=False
+            )
+
+            # Forward pass
+            self.net.setInput(blob)
+            layer_names = [
+                "feature_fusion/Conv_7/Sigmoid",  # Scores
+                "feature_fusion/concat_3"         # Geometry
+            ]
+            scores, geometry = self.net.forward(layer_names)
+
+            # Decode predictions
+            rectangles, confidences = self._decode_predictions(scores, geometry)
+
+            # Apply non-maximum suppression
+            boxes = self._apply_nms(rectangles, confidences)
+
+            # Scale boxes back to original image size
+            ratio_w = W / float(self.input_width)
+            ratio_h = H / float(self.input_height)
+
+            regions = []
+            for (startX, startY, endX, endY, conf) in boxes:
+                # Scale coordinates
+                startX = int(startX * ratio_w)
+                startY = int(startY * 
ratio_h) + endX = int(endX * ratio_w) + endY = int(endY * ratio_h) + + # Ensure valid coordinates + startX = max(0, startX) + startY = max(0, startY) + endX = min(W, endX) + endY = min(H, endY) + + w = endX - startX + h = endY - startY + + if w > 0 and h > 0: + regions.append(OCRTextRegion( + text="", # EAST doesn't recognize text + confidence=float(conf), + bbox=(startX, startY, w, h), + language=self.lang + )) + + logger.debug(f"EAST detected {len(regions)} text regions") + return regions + + except Exception as e: + logger.error(f"EAST detection failed: {e}") + return [] + + def _decode_predictions(self, scores: np.ndarray, + geometry: np.ndarray) -> Tuple[List, List]: + """Decode EAST model output to bounding boxes.""" + (num_rows, num_cols) = scores.shape[2:4] + rectangles = [] + confidences = [] + + for y in range(0, num_rows): + scores_data = scores[0, 0, y] + x0 = geometry[0, 0, y] + x1 = geometry[0, 1, y] + x2 = geometry[0, 2, y] + x3 = geometry[0, 3, y] + angles = geometry[0, 4, y] + + for x in range(0, num_cols): + if scores_data[x] < self.confidence_threshold: + continue + + # Compute offset + offset_x = x * 4.0 + offset_y = y * 4.0 + + # Extract rotation angle and compute cos/sin + angle = angles[x] + cos = np.cos(angle) + sin = np.sin(angle) + + # Compute box dimensions + h = x0[x] + x2[x] + w = x1[x] + x3[x] + + # Compute box coordinates + end_x = int(offset_x + (cos * x1[x]) + (sin * x2[x])) + end_y = int(offset_y - (sin * x1[x]) + (cos * x2[x])) + start_x = int(end_x - w) + start_y = int(end_y - h) + + rectangles.append((start_x, start_y, end_x, end_y)) + confidences.append(scores_data[x]) + + return rectangles, confidences + + def _apply_nms(self, rectangles: List, confidences: List) -> List[Tuple]: + """Apply non-maximum suppression.""" + if not rectangles: + return [] + + # Convert to float32 for NMS + boxes = np.array(rectangles, dtype=np.float32) + confidences = np.array(confidences, dtype=np.float32) + + # OpenCV NMSBoxes expects (x, y, w, h) 
format + nms_boxes = [] + for (x1, y1, x2, y2) in boxes: + nms_boxes.append([x1, y1, x2 - x1, y2 - y1]) + + # Apply NMS + indices = cv2.dnn.NMSBoxes( + nms_boxes, + confidences, + self.confidence_threshold, + self.nms_threshold + ) + + results = [] + if len(indices) > 0: + # Handle different OpenCV versions + if isinstance(indices, tuple): + indices = indices[0] + + for i in indices.flatten() if hasattr(indices, 'flatten') else indices: + x1, y1, x2, y2 = rectangles[i] + results.append((x1, y1, x2, y2, confidences[i])) + + return results + + def get_info(self): + """Get backend information.""" + info = super().get_info() + info.gpu_accelerated = self._gpu_enabled + return info + + @staticmethod + def is_opencv_cuda_available() -> bool: + """Check if OpenCV was built with CUDA support.""" + try: + return cv2.cuda.getCudaEnabledDeviceCount() > 0 + except: + return False diff --git a/modules/ocr_backends/paddleocr_backend.py b/modules/ocr_backends/paddleocr_backend.py new file mode 100644 index 0000000..1ed3abf --- /dev/null +++ b/modules/ocr_backends/paddleocr_backend.py @@ -0,0 +1,294 @@ +""" +Lemontropia Suite - PaddleOCR Backend +High-accuracy OCR using PaddleOCR - best quality but heavy dependencies. +""" + +import numpy as np +import logging +from typing import List, Optional + +from . import BaseOCRBackend, OCRTextRegion + +logger = logging.getLogger(__name__) + + +class PaddleOCRBackend(BaseOCRBackend): + """ + OCR backend using PaddleOCR. + + Pros: + - Best accuracy among open-source OCR + - Good multilingual support + - Fast with GPU + + Cons: + - Heavy dependencies (PyTorch/PaddlePaddle) + - Can fail with DLL errors on Windows Store Python + - Large model download + + Installation: pip install paddleocr + + Note: This backend has special handling for PyTorch/Paddle DLL errors + that commonly occur with Windows Store Python installations. 
+ """ + + NAME = "paddleocr" + SUPPORTS_GPU = True + + def __init__(self, use_gpu: bool = True, lang: str = 'en', **kwargs): + super().__init__(use_gpu=use_gpu, lang=lang, **kwargs) + + self.ocr = None + self._gpu_available = False + self._dll_error = False # Track if we hit a DLL error + + # Language mapping for PaddleOCR + self.lang_map = { + 'en': 'en', + 'sv': 'latin', # Swedish uses latin script + 'de': 'latin', + 'fr': 'latin', + 'es': 'latin', + 'latin': 'latin', + } + + # Detection thresholds + self.det_db_thresh = kwargs.get('det_db_thresh', 0.3) + self.det_db_box_thresh = kwargs.get('det_db_box_thresh', 0.5) + self.rec_thresh = kwargs.get('rec_thresh', 0.5) + + def _initialize(self) -> bool: + """Initialize PaddleOCR with PyTorch DLL error handling.""" + try: + # First, check if PyTorch is importable without DLL errors + if not self._check_pytorch(): + return False + + # Import PaddleOCR + from paddleocr import PaddleOCR as PPOCR + + # Map language + paddle_lang = self.lang_map.get(self.lang, 'en') + + # Check GPU availability + self._gpu_available = self._check_gpu() + use_gpu_flag = self.use_gpu and self._gpu_available + + logger.info(f"Initializing PaddleOCR (lang={paddle_lang}, gpu={use_gpu_flag})") + + # Initialize PaddleOCR + self.ocr = PPOCR( + lang=paddle_lang, + use_gpu=use_gpu_flag, + show_log=False, + use_angle_cls=True, + det_db_thresh=self.det_db_thresh, + det_db_box_thresh=self.det_db_box_thresh, + rec_thresh=self.rec_thresh, + ) + + self._available = True + self._version = "2.x" # PaddleOCR doesn't expose version easily + + logger.info(f"PaddleOCR initialized successfully (GPU: {use_gpu_flag})") + return True + + except ImportError as e: + self._error_msg = f"PaddleOCR not installed. 
Run: pip install paddleocr" + logger.warning(self._error_msg) + return False + + except Exception as e: + error_str = str(e).lower() + + # Check for common DLL-related errors + if any(x in error_str for x in ['dll', 'c10', 'torch', 'paddle', 'lib']): + self._dll_error = True + self._error_msg = f"PaddleOCR DLL error (Windows Store Python?): {e}" + logger.warning(self._error_msg) + logger.info("This is a known issue with Windows Store Python. Using fallback OCR.") + else: + self._error_msg = f"PaddleOCR initialization failed: {e}" + logger.error(self._error_msg) + + return False + + def _check_pytorch(self) -> bool: + """ + Check if PyTorch can be imported without DLL errors. + + This is the critical check for Windows Store Python compatibility. + """ + try: + # Try importing torch - this is where DLL errors typically occur + import torch + + # Try a simple operation to verify it works + _ = torch.__version__ + + logger.debug("PyTorch import successful") + return True + + except ImportError: + self._error_msg = "PyTorch not installed" + logger.warning(self._error_msg) + return False + + except OSError as e: + # This is the Windows Store Python DLL error + error_str = str(e).lower() + if 'dll' in error_str or 'c10' in error_str or 'specified module' in error_str: + self._dll_error = True + self._error_msg = ( + f"PyTorch DLL load failed: {e}\n" + "This is a known issue with Windows Store Python.\n" + "Solutions:\n" + "1. Use Python from python.org instead of Windows Store\n" + "2. Install PyTorch with conda instead of pip\n" + "3. 
Use alternative OCR backend (EasyOCR, Tesseract, or OpenCV EAST)" + ) + logger.error(self._error_msg) + else: + self._error_msg = f"PyTorch load failed: {e}" + logger.error(self._error_msg) + return False + + except Exception as e: + self._error_msg = f"Unexpected PyTorch error: {e}" + logger.error(self._error_msg) + return False + + def _check_gpu(self) -> bool: + """Check if GPU is available for PaddleOCR.""" + try: + import torch + + if torch.cuda.is_available(): + device_name = torch.cuda.get_device_name(0) + logger.info(f"CUDA available: {device_name}") + return True + + # Check for MPS (Apple Silicon) + if hasattr(torch.backends, 'mps') and torch.backends.mps.is_available(): + logger.info("Apple MPS available") + return True + + return False + + except Exception as e: + logger.debug(f"GPU check failed: {e}") + return False + + def extract_text(self, image: np.ndarray) -> List[OCRTextRegion]: + """ + Extract text from image using PaddleOCR. + + Args: + image: Input image (BGR format from OpenCV) + + Returns: + List of detected text regions with recognized text + """ + if not self._available or self.ocr is None: + logger.error("PaddleOCR backend not initialized") + return [] + + try: + # Preprocess image + processed = self.preprocess_image(image) + + # Run OCR + result = self.ocr.ocr(processed, cls=True) + + regions = [] + if result and result[0]: + for line in result[0]: + if line is None: + continue + + # Parse result: [bbox, (text, confidence)] + bbox, (text, conf) = line + + # Calculate bounding box from polygon + x_coords = [p[0] for p in bbox] + y_coords = [p[1] for p in bbox] + + x = int(min(x_coords)) + y = int(min(y_coords)) + w = int(max(x_coords) - x) + h = int(max(y_coords) - y) + + regions.append(OCRTextRegion( + text=text.strip(), + confidence=float(conf), + bbox=(x, y, w, h), + language=self.lang + )) + + logger.debug(f"PaddleOCR detected {len(regions)} text regions") + return regions + + except Exception as e: + logger.error(f"PaddleOCR 
extraction failed: {e}") + return [] + + def get_info(self): + """Get backend information.""" + info = super().get_info() + info.gpu_accelerated = self._gpu_available and self.use_gpu + if self._dll_error: + info.error_message = "PyTorch DLL error - incompatible with Windows Store Python" + return info + + def has_dll_error(self) -> bool: + """Check if this backend failed due to DLL error.""" + return self._dll_error + + @staticmethod + def diagnose_windows_store_python() -> dict: + """ + Diagnose if running Windows Store Python and potential issues. + + Returns: + Dictionary with diagnostic information + """ + import sys + import platform + + diag = { + 'platform': platform.system(), + 'python_version': sys.version, + 'executable': sys.executable, + 'is_windows_store': False, + 'pytorch_importable': False, + 'recommendations': [] + } + + # Check if Windows Store Python + exe_path = sys.executable.lower() + if 'windowsapps' in exe_path or 'microsoft' in exe_path: + diag['is_windows_store'] = True + diag['recommendations'].append( + "You are using Windows Store Python which has known DLL compatibility issues." + ) + + # Check PyTorch + try: + import torch + diag['pytorch_importable'] = True + diag['pytorch_version'] = torch.__version__ + diag['pytorch_cuda'] = torch.cuda.is_available() + except Exception as e: + diag['pytorch_error'] = str(e) + diag['recommendations'].append( + "PyTorch cannot be loaded. Use alternative OCR backends." 
+ ) + + if not diag['pytorch_importable'] and diag['is_windows_store']: + diag['recommendations'].extend([ + "Install Python from https://python.org instead of Windows Store", + "Or use conda/miniconda for better compatibility", + "Recommended OCR backends: opencv_east, easyocr, tesseract" + ]) + + return diag diff --git a/modules/ocr_backends/tesseract_backend.py b/modules/ocr_backends/tesseract_backend.py new file mode 100644 index 0000000..4a2bf13 --- /dev/null +++ b/modules/ocr_backends/tesseract_backend.py @@ -0,0 +1,289 @@ +""" +Lemontropia Suite - Tesseract OCR Backend +Traditional OCR using Tesseract - stable, no ML dependencies. +""" + +import numpy as np +import logging +from typing import List, Optional, Tuple +from pathlib import Path +import shutil + +from . import BaseOCRBackend, OCRTextRegion + +logger = logging.getLogger(__name__) + + +class TesseractBackend(BaseOCRBackend): + """ + OCR backend using Tesseract OCR. + + Pros: + - Very stable and mature + - No PyTorch/TensorFlow dependencies + - Fast on CPU + - Works with Windows Store Python + + Cons: + - Lower accuracy on game UI text than neural OCR + - Requires Tesseract binary installation + + Installation: + - Windows: choco install tesseract or download from UB Mannheim + - Linux: sudo apt-get install tesseract-ocr + - macOS: brew install tesseract + - Python: pip install pytesseract + """ + + NAME = "tesseract" + SUPPORTS_GPU = False # Tesseract is CPU-only + + def __init__(self, use_gpu: bool = True, lang: str = 'en', **kwargs): + super().__init__(use_gpu=use_gpu, lang=lang, **kwargs) + + self.tesseract_cmd = kwargs.get('tesseract_cmd', None) + self._version = None + + # Language mapping for Tesseract + self.lang_map = { + 'en': 'eng', + 'sv': 'swe', # Swedish + 'de': 'deu', + 'fr': 'fra', + 'es': 'spa', + 'latin': 'eng+deu+fra+spa', # Multi-language + } + + # Tesseract configuration + self.config = kwargs.get('config', '--psm 6') # Assume single uniform block of text + + def 
_initialize(self) -> bool: + """Initialize Tesseract OCR.""" + try: + import pytesseract + + # Set custom path if provided + if self.tesseract_cmd: + pytesseract.pytesseract.tesseract_cmd = self.tesseract_cmd + + # Try to get version to verify installation + try: + version = pytesseract.get_tesseract_version() + self._version = str(version) + logger.info(f"Tesseract version: {version}") + except Exception as e: + # Try to find tesseract in PATH + tesseract_path = shutil.which('tesseract') + if tesseract_path: + pytesseract.pytesseract.tesseract_cmd = tesseract_path + version = pytesseract.get_tesseract_version() + self._version = str(version) + logger.info(f"Tesseract found at: {tesseract_path}, version: {version}") + else: + raise e + + self._available = True + logger.info("Tesseract OCR initialized successfully") + return True + + except ImportError: + self._error_msg = "pytesseract not installed. Run: pip install pytesseract" + logger.warning(self._error_msg) + return False + + except Exception as e: + self._error_msg = f"Tesseract not found: {e}. Please install Tesseract OCR." + logger.warning(self._error_msg) + logger.info("Download from: https://github.com/UB-Mannheim/tesseract/wiki") + return False + + def extract_text(self, image: np.ndarray) -> List[OCRTextRegion]: + """ + Extract text from image using Tesseract. + + Uses a two-step approach: + 1. Detect text regions using OpenCV contours + 2. 
Run Tesseract on each region
+
+        Args:
+            image: Input image (BGR format from OpenCV)
+
+        Returns:
+            List of detected text regions with recognized text
+        """
+        if not self._available:
+            logger.error("Tesseract backend not initialized")
+            return []
+
+        try:
+            import pytesseract
+            import cv2
+
+            # Preprocess image
+            gray = self._to_grayscale(image)
+            processed = self._preprocess_for_tesseract(gray)
+
+            # Get data including bounding boxes
+            tesseract_lang = self.lang_map.get(self.lang, 'eng')
+
+            data = pytesseract.image_to_data(
+                processed,
+                lang=tesseract_lang,
+                config=self.config,
+                output_type=pytesseract.Output.DICT
+            )
+
+            regions = []
+            n_boxes = len(data['text'])
+
+            for i in range(n_boxes):
+                text = data['text'][i].strip()
+                conf = int(float(data['conf'][i]))  # conf may come back as a float or string depending on Tesseract version
+
+                # Filter low confidence and empty text
+                if conf > 30 and text:
+                    x = data['left'][i]
+                    y = data['top'][i]
+                    w = data['width'][i]
+                    h = data['height'][i]
+
+                    regions.append(OCRTextRegion(
+                        text=text,
+                        confidence=conf / 100.0,  # Normalize to 0-1
+                        bbox=(x, y, w, h),
+                        language=self.lang
+                    ))
+
+            # Merge overlapping regions that are likely the same text
+            regions = self._merge_nearby_regions(regions)
+
+            logger.debug(f"Tesseract detected {len(regions)} text regions")
+            return regions
+
+        except Exception as e:
+            logger.error(f"Tesseract extraction failed: {e}")
+            return []
+
+    def _preprocess_for_tesseract(self, gray: np.ndarray) -> np.ndarray:
+        """Preprocess image specifically for Tesseract."""
+        import cv2
+
+        # Resize small images (Tesseract works better with larger text)
+        h, w = gray.shape[:2]
+        min_height = 100
+        if h < min_height:
+            scale = min_height / h
+            gray = cv2.resize(gray, None, fx=scale, fy=scale, interpolation=cv2.INTER_CUBIC)
+
+        # Apply adaptive thresholding
+        processed = cv2.adaptiveThreshold(
+            gray, 255,
+            cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
+            cv2.THRESH_BINARY,
+            11, 2
+        )
+
+        # Denoise
+        processed = cv2.fastNlMeansDenoising(processed, None, 10, 7, 21)
+
+        return processed
+
+    def 
_merge_nearby_regions(self, regions: List[OCRTextRegion], + max_distance: int = 10) -> List[OCRTextRegion]: + """Merge text regions that are close to each other.""" + if not regions: + return [] + + # Sort by y position + sorted_regions = sorted(regions, key=lambda r: (r.bbox[1], r.bbox[0])) + + merged = [] + current = sorted_regions[0] + + for next_region in sorted_regions[1:]: + # Check if regions are close enough to merge + cx, cy, cw, ch = current.bbox + nx, ny, nw, nh = next_region.bbox + + # Calculate distance + distance = abs(ny - cy) + x_overlap = not (cx + cw < nx or nx + nw < cx) + + if distance < max_distance and x_overlap: + # Merge regions + min_x = min(cx, nx) + min_y = min(cy, ny) + max_x = max(cx + cw, nx + nw) + max_y = max(cy + ch, ny + nh) + + # Combine text + combined_text = current.text + " " + next_region.text + avg_conf = (current.confidence + next_region.confidence) / 2 + + current = OCRTextRegion( + text=combined_text.strip(), + confidence=avg_conf, + bbox=(min_x, min_y, max_x - min_x, max_y - min_y), + language=self.lang + ) + else: + merged.append(current) + current = next_region + + merged.append(current) + return merged + + def extract_text_simple(self, image: np.ndarray) -> str: + """ + Simple text extraction without region detection. 
+ + Returns: + All text found in image as single string + """ + if not self._available: + return "" + + try: + import pytesseract + import cv2 + + # Convert to RGB if needed + if len(image.shape) == 3: + image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) + + tesseract_lang = self.lang_map.get(self.lang, 'eng') + + text = pytesseract.image_to_string( + image, + lang=tesseract_lang, + config=self.config + ) + + return text.strip() + + except Exception as e: + logger.error(f"Tesseract simple extraction failed: {e}") + return "" + + @staticmethod + def find_tesseract() -> Optional[str]: + """Find Tesseract installation path.""" + path = shutil.which('tesseract') + if path: + return path + + # Common Windows paths + common_paths = [ + r"C:\Program Files\Tesseract-OCR\tesseract.exe", + r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe", + r"C:\Users\%USERNAME%\AppData\Local\Tesseract-OCR\tesseract.exe", + r"C:\Tesseract-OCR\tesseract.exe", + ] + + import os + for p in common_paths: + expanded = os.path.expandvars(p) + if Path(expanded).exists(): + return expanded + + return None diff --git a/requirements-ocr.txt b/requirements-ocr.txt new file mode 100644 index 0000000..16f25f7 --- /dev/null +++ b/requirements-ocr.txt @@ -0,0 +1,35 @@ +# Lemontropia Suite - OCR Dependencies +# Install based on your needs and system capabilities + +# ========== REQUIRED ========== +# These are always required for OCR functionality +opencv-python>=4.8.0 # Computer vision and OpenCV EAST text detection +numpy>=1.24.0 # Numerical operations +pillow>=10.0.0 # Image processing + +# ========== RECOMMENDED (Choose One) ========== + +## Option 1: EasyOCR (Recommended for most users) +## Good accuracy, lighter than PaddleOCR, supports GPU +## Note: Requires PyTorch - may not work with Windows Store Python +# easyocr>=1.7.0 + +## Option 2: Tesseract OCR (Most Stable) +## Traditional OCR, no ML dependencies, very stable +## Requires system Tesseract installation +# pytesseract>=0.3.10 + +## Option 3: 
PaddleOCR (Best Accuracy) +## Highest accuracy but heavy dependencies +## Note: Requires PaddlePaddle - may not work with Windows Store Python +# paddleocr>=2.7.0 +# paddlepaddle>=2.5.0 # or paddlepaddle-gpu for CUDA + +# ========== OPTIONAL GPU SUPPORT ========== +# Only if you have a compatible NVIDIA GPU +# torch>=2.0.0 # PyTorch with CUDA support +# torchvision>=0.15.0 + +# ========== DEVELOPMENT ========== +pytest>=7.4.0 # Testing +pytest-cov>=4.1.0 # Coverage diff --git a/test_ocr_system.py b/test_ocr_system.py new file mode 100644 index 0000000..2cd26a2 --- /dev/null +++ b/test_ocr_system.py @@ -0,0 +1,328 @@ +""" +Lemontropia Suite - OCR Backend Test Script +Tests all OCR backends and reports on availability. + +Run this to verify the OCR system works without PyTorch DLL errors. +""" + +import sys +import logging +from pathlib import Path + +# Setup logging +logging.basicConfig( + level=logging.INFO, + format='%(asctime)s - %(name)s - %(levelname)s - %(message)s' +) +logger = logging.getLogger(__name__) + + +def test_hardware_detection(): + """Test hardware detection.""" + print("\n" + "=" * 60) + print("HARDWARE DETECTION TEST") + print("=" * 60) + + try: + from modules.hardware_detection import ( + HardwareDetector, + print_hardware_summary, + recommend_ocr_backend + ) + + # Print summary + print_hardware_summary() + + # Get detailed info + info = HardwareDetector.detect_all() + + # Check for Windows Store Python + if info.is_windows_store_python: + print("\nāš ļø WARNING: Windows Store Python detected!") + print(" This may cause DLL compatibility issues with PyTorch.") + + # Check for PyTorch DLL errors + if info.pytorch_dll_error: + print("\nāŒ PyTorch DLL Error detected!") + print(f" Error: {info.pytorch_error}") + print("\n This is expected with Windows Store Python.") + print(" The system will automatically use alternative OCR backends.") + elif not info.pytorch_available: + print("\nāš ļø PyTorch not installed.") + else: + print(f"\nāœ… PyTorch 
{info.pytorch_version} available") + + # Recommendation + recommended = recommend_ocr_backend() + print(f"\nšŸ“‹ Recommended OCR backend: {recommended}") + + return True + + except Exception as e: + print(f"āŒ Hardware detection failed: {e}") + import traceback + traceback.print_exc() + return False + + +def test_ocr_backends(): + """Test all OCR backends.""" + print("\n" + "=" * 60) + print("OCR BACKEND TESTS") + print("=" * 60) + + try: + from modules.ocr_backends import OCRBackendFactory + + # Check all backends + backends = OCRBackendFactory.check_all_backends(use_gpu=True) + + available_count = 0 + + for info in backends: + status = "āœ… Available" if info.available else "āŒ Not Available" + gpu_status = "šŸš€ GPU" if info.gpu_accelerated else "šŸ’» CPU" + + print(f"\n{info.name.upper()}:") + print(f" Status: {status}") + print(f" GPU: {gpu_status}") + + if info.version: + print(f" Version: {info.version}") + + if info.error_message: + print(f" Error: {info.error_message}") + + if info.available: + available_count += 1 + + print(f"\nšŸ“Š Summary: {available_count}/{len(backends)} backends available") + + return available_count > 0 + + except Exception as e: + print(f"āŒ OCR backend test failed: {e}") + import traceback + traceback.print_exc() + return False + + +def test_opencv_east(): + """Test OpenCV EAST backend specifically (should always work).""" + print("\n" + "=" * 60) + print("OPENCV EAST BACKEND TEST") + print("=" * 60) + + try: + import numpy as np + from modules.ocr_backends import OCRBackendFactory + + # Create test image with text-like regions + print("\nCreating test image...") + test_image = np.ones((400, 600, 3), dtype=np.uint8) * 255 + + # Draw some rectangles that look like text regions + import cv2 + cv2.rectangle(test_image, (50, 50), (200, 80), (0, 0, 0), -1) + cv2.rectangle(test_image, (50, 100), (250, 130), (0, 0, 0), -1) + cv2.rectangle(test_image, (300, 50), (500, 90), (0, 0, 0), -1) + + # Create backend + print("Creating OpenCV 
EAST backend...") + backend = OCRBackendFactory.create_backend('opencv_east', use_gpu=False) + + if backend is None: + print("āŒ Failed to create OpenCV EAST backend") + return False + + print(f"āœ… Backend created: {backend.get_info().name}") + print(f" Available: {backend.is_available()}") + print(f" GPU: {backend.get_info().gpu_accelerated}") + + # Test detection + print("\nRunning text detection...") + regions = backend.extract_text(test_image) + + print(f"āœ… Detection complete: {len(regions)} regions found") + + for i, region in enumerate(regions[:5]): # Show first 5 + x, y, w, h = region.bbox + print(f" Region {i+1}: bbox=({x},{y},{w},{h}), conf={region.confidence:.2f}") + + return True + + except Exception as e: + print(f"āŒ OpenCV EAST test failed: {e}") + import traceback + traceback.print_exc() + return False + + +def test_unified_ocr(): + """Test unified OCR processor.""" + print("\n" + "=" * 60) + print("UNIFIED OCR PROCESSOR TEST") + print("=" * 60) + + try: + import numpy as np + import cv2 + from modules.game_vision_ai import UnifiedOCRProcessor + + # Create processor (auto-selects best backend) + print("\nInitializing Unified OCR Processor...") + processor = UnifiedOCRProcessor(use_gpu=True, auto_select=True) + + backend_name = processor.get_current_backend() + print(f"āœ… Processor initialized with backend: {backend_name}") + + # Get backend info + info = processor.get_backend_info() + print(f" Info: {info}") + + # Create test image + print("\nCreating test image...") + test_image = np.ones((400, 600, 3), dtype=np.uint8) * 255 + cv2.rectangle(test_image, (50, 50), (200, 80), (0, 0, 0), -1) + cv2.rectangle(test_image, (50, 100), (250, 130), (0, 0, 0), -1) + + # Test extraction + print("\nRunning text extraction...") + regions = processor.extract_text(test_image) + + print(f"āœ… Extraction complete: {len(regions)} regions found") + + # List all available backends + print("\nšŸ“‹ All OCR backends:") + for backend_info in 
processor.get_available_backends(): + status = "āœ…" if backend_info.available else "āŒ" + print(f" {status} {backend_info.name}") + + return True + + except Exception as e: + print(f"āŒ Unified OCR test failed: {e}") + import traceback + traceback.print_exc() + return False + + +def test_game_vision_ai(): + """Test GameVisionAI class.""" + print("\n" + "=" * 60) + print("GAME VISION AI TEST") + print("=" * 60) + + try: + from modules.game_vision_ai import GameVisionAI + + print("\nInitializing GameVisionAI...") + vision = GameVisionAI(use_gpu=True) + + print(f"āœ… GameVisionAI initialized") + print(f" OCR Backend: {vision.ocr.get_current_backend()}") + print(f" GPU Backend: {vision.backend.value}") + + # Get diagnostic info + print("\nRunning diagnostics...") + diag = GameVisionAI.diagnose() + + print(f" Hardware: {diag['hardware']['gpu']['backend']}") + print(f" Recommended OCR: {diag['recommendations']['ocr_backend']}") + + # Test available backends + backends = vision.get_ocr_backends() + available = [b['name'] for b in backends if b['available']] + print(f" Available backends: {', '.join(available) if available else 'None'}") + + return True + + except Exception as e: + print(f"āŒ GameVisionAI test failed: {e}") + import traceback + traceback.print_exc() + return False + + +def test_pytorch_dll_handling(): + """Test that PyTorch DLL errors are handled gracefully.""" + print("\n" + "=" * 60) + print("PYTORCH DLL ERROR HANDLING TEST") + print("=" * 60) + + try: + from modules.hardware_detection import HardwareDetector + + info = HardwareDetector.detect_all() + + if info.pytorch_dll_error: + print("\nāš ļø PyTorch DLL error detected (as expected with Windows Store Python)") + print("āœ… System correctly detected the DLL error") + print("āœ… System will use fallback OCR backends") + + # Verify fallback recommendation + recommended = HardwareDetector.recommend_ocr_backend() + if recommended in ['opencv_east', 'tesseract']: + print(f"āœ… Recommended safe 
backend: {recommended}") + return True + else: + print(f"āš ļø Unexpected recommendation: {recommended}") + return False + else: + print("\nāœ… No PyTorch DLL error detected") + print(" PyTorch is working correctly!") + + if info.pytorch_available: + print(f" Version: {info.pytorch_version}") + + return True + + except Exception as e: + print(f"āŒ DLL handling test failed: {e}") + import traceback + traceback.print_exc() + return False + + +def main(): + """Run all tests.""" + print("\n" + "=" * 60) + print("LEMONTROPIA SUITE - OCR SYSTEM TEST") + print("=" * 60) + print("\nPython:", sys.version) + print("Platform:", sys.platform) + + results = {} + + # Run tests + results['hardware'] = test_hardware_detection() + results['backends'] = test_ocr_backends() + results['opencv_east'] = test_opencv_east() + results['unified_ocr'] = test_unified_ocr() + results['game_vision'] = test_game_vision_ai() + results['dll_handling'] = test_pytorch_dll_handling() + + # Summary + print("\n" + "=" * 60) + print("TEST SUMMARY") + print("=" * 60) + + for name, passed in results.items(): + status = "āœ… PASS" if passed else "āŒ FAIL" + print(f" {status}: {name}") + + total = len(results) + passed = sum(results.values()) + + print(f"\n Total: {passed}/{total} tests passed") + + if passed == total: + print("\nšŸŽ‰ All tests passed! OCR system is working correctly.") + return 0 + else: + print("\nāš ļø Some tests failed. 
Check the output above for details.") + return 1 + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/ui/main_window.py b/ui/main_window.py index 00d6b47..e53cded 100644 --- a/ui/main_window.py +++ b/ui/main_window.py @@ -119,6 +119,7 @@ from ui.hud_overlay_clean import HUDOverlay from ui.session_history import SessionHistoryDialog from ui.gallery_dialog import GalleryDialog, ScreenshotCapture +from ui.settings_dialog import SettingsDialog # ============================================================================ # Screenshot Hotkey Integration @@ -250,86 +251,6 @@ class TemplateStatsDialog(QDialog): layout.addWidget(button_box) -class SettingsDialog(QDialog): - """Dialog for application settings.""" - - def __init__(self, parent=None, current_player_name: str = ""): - super().__init__(parent) - self.setWindowTitle("Settings") - self.setMinimumWidth(450) - self.player_name = current_player_name - self.setup_ui() - - def setup_ui(self): - layout = QVBoxLayout(self) - - # Player Settings Group - player_group = QGroupBox("Player Settings") - player_layout = QFormLayout(player_group) - - self.player_name_edit = QLineEdit() - self.player_name_edit.setText(self.player_name) - self.player_name_edit.setPlaceholderText("Your avatar name in Entropia Universe") - player_layout.addRow("Avatar Name:", self.player_name_edit) - - help_label = QLabel("Set your avatar name to track your globals correctly.") - help_label.setStyleSheet("color: #888; font-size: 11px;") - player_layout.addRow(help_label) - - layout.addWidget(player_group) - - # Log Settings Group - log_group = QGroupBox("Log File Settings") - log_layout = QFormLayout(log_group) - - self.log_path_edit = QLineEdit() - self.log_path_edit.setPlaceholderText("Path to chat.log") - log_layout.addRow("Log Path:", self.log_path_edit) - - self.auto_detect_check = QCheckBox("Auto-detect log path on startup") - self.auto_detect_check.setChecked(True) - log_layout.addRow(self.auto_detect_check) - - 
layout.addWidget(log_group) - - # Default Activity Group - activity_group = QGroupBox("Default Activity") - activity_layout = QFormLayout(activity_group) - - self.default_activity_combo = QComboBox() - for activity in ActivityType: - self.default_activity_combo.addItem(activity.display_name, activity) - activity_layout.addRow("Default:", self.default_activity_combo) - - layout.addWidget(activity_group) - - layout.addStretch() - - button_box = QDialogButtonBox( - QDialogButtonBox.StandardButton.Ok | QDialogButtonBox.StandardButton.Cancel - ) - button_box.accepted.connect(self.accept) - button_box.rejected.connect(self.reject) - layout.addWidget(button_box) - - def get_player_name(self) -> str: - """Get the configured player name.""" - return self.player_name_edit.text().strip() - - def get_log_path(self) -> str: - """Get the configured log path.""" - return self.log_path_edit.text().strip() - - def get_auto_detect(self) -> bool: - """Get auto-detect setting.""" - return self.auto_detect_check.isChecked() - - def get_default_activity(self) -> str: - """Get default activity type.""" - activity = self.default_activity_combo.currentData() - return activity.value if activity else "hunting" - - # ============================================================================ # Main Window # ============================================================================ @@ -831,14 +752,6 @@ class MainWindow(QMainWindow): vision_test_action.triggered.connect(self.on_vision_test) vision_menu.addAction(vision_test_action) - tools_menu.addSeparator() - - # Screenshot hotkey settings - screenshot_hotkeys_action = QAction("šŸ“ø Screenshot &Hotkeys", self) - screenshot_hotkeys_action.setShortcut("Ctrl+Shift+S") - screenshot_hotkeys_action.triggered.connect(self._show_screenshot_hotkey_settings) - tools_menu.addAction(screenshot_hotkeys_action) - # View menu view_menu = menubar.addMenu("&View") @@ -1887,14 +1800,12 @@ class MainWindow(QMainWindow): self.log_info("HUD", "HUD overlay 
hidden") def on_settings(self): - """Open settings dialog.""" - dialog = SettingsDialog(self, self.player_name) + """Open comprehensive settings dialog.""" + dialog = SettingsDialog(self, self.db) if dialog.exec() == QDialog.DialogCode.Accepted: - self.player_name = dialog.get_player_name() - self.log_path = dialog.get_log_path() - self.auto_detect_log = dialog.get_auto_detect() - self._save_settings() - self.log_info("Settings", f"Avatar name: {self.player_name}") + # Reload settings from QSettings + self._load_settings() + self.log_info("Settings", "Settings updated successfully") def on_run_setup_wizard(self): """Run the setup wizard again.""" diff --git a/ui/settings_dialog.py b/ui/settings_dialog.py new file mode 100644 index 0000000..eaafd7e --- /dev/null +++ b/ui/settings_dialog.py @@ -0,0 +1,684 @@ +""" +Lemontropia Suite - Comprehensive Settings Dialog +Unified settings for Player, Screenshot Hotkeys, Computer Vision, and General preferences. +""" + +import logging +from pathlib import Path +from typing import Optional, Dict, Any + +from PyQt6.QtWidgets import ( + QDialog, QVBoxLayout, QHBoxLayout, QFormLayout, + QLabel, QLineEdit, QPushButton, QComboBox, + QCheckBox, QGroupBox, QTabWidget, QDialogButtonBox, + QMessageBox, QFileDialog, QWidget, QGridLayout, + QSpinBox, QDoubleSpinBox, QFrame +) +from PyQt6.QtCore import Qt, QSettings +from PyQt6.QtGui import QKeySequence + +logger = logging.getLogger(__name__) + + +class SettingsDialog(QDialog): + """ + Comprehensive settings dialog with tabbed interface. 
+ + Tabs: + - General: Player name, log path, activity defaults + - Screenshot Hotkeys: Configure F12 and other hotkeys + - Computer Vision: OCR backend selection, GPU settings + - Advanced: Performance, logging, database options + """ + + def __init__(self, parent=None, db=None): + super().__init__(parent) + self.setWindowTitle("Lemontropia Suite - Settings") + self.setMinimumSize(600, 500) + self.resize(700, 550) + + self.db = db + self._settings = QSettings("Lemontropia", "Suite") + + # Load current values + self._load_current_values() + + self._setup_ui() + self._apply_dark_theme() + + def _load_current_values(self): + """Load current settings values.""" + # General + self._player_name = self._settings.value("player/name", "", type=str) + self._log_path = self._settings.value("log/path", "", type=str) + self._auto_detect_log = self._settings.value("log/auto_detect", True, type=bool) + self._default_activity = self._settings.value("activity/default", "hunting", type=str) + + # Screenshot hotkeys + self._hotkey_full = self._settings.value("hotkey/screenshot_full", "F12", type=str) + self._hotkey_region = self._settings.value("hotkey/screenshot_region", "Shift+F12", type=str) + self._hotkey_loot = self._settings.value("hotkey/screenshot_loot", "Ctrl+F12", type=str) + self._hotkey_hud = self._settings.value("hotkey/screenshot_hud", "Alt+F12", type=str) + + # Computer Vision + self._cv_backend = self._settings.value("cv/backend", "auto", type=str) + self._cv_use_gpu = self._settings.value("cv/use_gpu", True, type=bool) + self._cv_confidence = self._settings.value("cv/confidence", 0.5, type=float) + + def _setup_ui(self): + """Setup the dialog UI with tabs.""" + layout = QVBoxLayout(self) + layout.setContentsMargins(15, 15, 15, 15) + layout.setSpacing(10) + + # Title + title = QLabel("āš™ļø Settings") + title.setStyleSheet("font-size: 18px; font-weight: bold; color: #4caf50;") + layout.addWidget(title) + + # Tab widget + self.tabs = QTabWidget() + 
layout.addWidget(self.tabs) + + # Create tabs + self.tabs.addTab(self._create_general_tab(), "šŸ“‹ General") + self.tabs.addTab(self._create_hotkeys_tab(), "šŸ“ø Screenshot Hotkeys") + self.tabs.addTab(self._create_vision_tab(), "šŸ‘ļø Computer Vision") + self.tabs.addTab(self._create_advanced_tab(), "šŸ”§ Advanced") + + # Button box + button_box = QDialogButtonBox( + QDialogButtonBox.StandardButton.Save | + QDialogButtonBox.StandardButton.Cancel | + QDialogButtonBox.StandardButton.Reset + ) + button_box.accepted.connect(self._on_save) + button_box.rejected.connect(self.reject) + button_box.button(QDialogButtonBox.StandardButton.Reset).clicked.connect(self._on_reset) + layout.addWidget(button_box) + + def _create_general_tab(self) -> QWidget: + """Create General settings tab.""" + tab = QWidget() + layout = QVBoxLayout(tab) + layout.setSpacing(15) + + # Player Settings + player_group = QGroupBox("šŸŽ® Player Settings") + player_form = QFormLayout(player_group) + + self.player_name_edit = QLineEdit(self._player_name) + self.player_name_edit.setPlaceholderText("Your avatar name in Entropia Universe") + player_form.addRow("Avatar Name:", self.player_name_edit) + + player_help = QLabel("This name is used to identify your globals and HoFs in the log.") + player_help.setStyleSheet("color: #888; font-size: 11px;") + player_help.setWordWrap(True) + player_form.addRow(player_help) + + layout.addWidget(player_group) + + # Log File Settings + log_group = QGroupBox("šŸ“„ Log File Settings") + log_layout = QVBoxLayout(log_group) + + log_form = QFormLayout() + + log_path_layout = QHBoxLayout() + self.log_path_edit = QLineEdit(self._log_path) + self.log_path_edit.setPlaceholderText(r"C:\Users\...\Documents\Entropia Universe\chat.log") + log_path_layout.addWidget(self.log_path_edit) + + browse_btn = QPushButton("Browse...") + browse_btn.clicked.connect(self._browse_log_path) + log_path_layout.addWidget(browse_btn) + + log_form.addRow("Chat Log Path:", log_path_layout) + + 
self.auto_detect_check = QCheckBox("Auto-detect log path on startup") + self.auto_detect_check.setChecked(self._auto_detect_log) + log_form.addRow(self.auto_detect_check) + + log_layout.addLayout(log_form) + + # Quick paths + quick_paths_layout = QHBoxLayout() + quick_paths_layout.addWidget(QLabel("Quick select:")) + + default_path_btn = QPushButton("Default Location") + default_path_btn.clicked.connect(self._set_default_log_path) + quick_paths_layout.addWidget(default_path_btn) + + quick_paths_layout.addStretch() + log_layout.addLayout(quick_paths_layout) + + layout.addWidget(log_group) + + # Default Activity + activity_group = QGroupBox("šŸŽÆ Default Activity") + activity_form = QFormLayout(activity_group) + + self.default_activity_combo = QComboBox() + activities = [ + ("hunting", "šŸŽÆ Hunting"), + ("mining", "ā›ļø Mining"), + ("crafting", "āš’ļø Crafting") + ] + for value, display in activities: + self.default_activity_combo.addItem(display, value) + if value == self._default_activity: + self.default_activity_combo.setCurrentIndex(self.default_activity_combo.count() - 1) + + activity_form.addRow("Default Activity:", self.default_activity_combo) + + layout.addWidget(activity_group) + + layout.addStretch() + return tab + + def _create_hotkeys_tab(self) -> QWidget: + """Create Screenshot Hotkeys tab.""" + tab = QWidget() + layout = QVBoxLayout(tab) + layout.setSpacing(15) + + # Info header + info = QLabel("šŸ“ø Configure screenshot hotkeys. 
Hotkeys work when the app is focused.") + info.setStyleSheet("color: #888; padding: 5px;") + info.setWordWrap(True) + layout.addWidget(info) + + # Status + status_group = QGroupBox("Status") + status_layout = QVBoxLayout(status_group) + + try: + import keyboard + self.hotkey_status = QLabel("āœ… Global hotkeys available (keyboard library installed)") + self.hotkey_status.setStyleSheet("color: #4caf50;") + except ImportError: + self.hotkey_status = QLabel("ā„¹ļø Qt shortcuts only (install 'keyboard' library for global hotkeys)\npip install keyboard") + self.hotkey_status.setStyleSheet("color: #ff9800;") + self.hotkey_status.setWordWrap(True) + + status_layout.addWidget(self.hotkey_status) + layout.addWidget(status_group) + + # Hotkey configuration + hotkey_group = QGroupBox("Hotkey Configuration") + hotkey_form = QFormLayout(hotkey_group) + + # Full screen + full_layout = QHBoxLayout() + self.hotkey_full_edit = QLineEdit(self._hotkey_full) + full_layout.addWidget(self.hotkey_full_edit) + full_test = QPushButton("Test") + full_test.clicked.connect(lambda: self._test_hotkey("full")) + full_layout.addWidget(full_test) + hotkey_form.addRow("Full Screen:", full_layout) + + # Region + region_layout = QHBoxLayout() + self.hotkey_region_edit = QLineEdit(self._hotkey_region) + region_layout.addWidget(self.hotkey_region_edit) + region_test = QPushButton("Test") + region_test.clicked.connect(lambda: self._test_hotkey("region")) + region_layout.addWidget(region_test) + hotkey_form.addRow("Center Region (800x600):", region_layout) + + # Loot + loot_layout = QHBoxLayout() + self.hotkey_loot_edit = QLineEdit(self._hotkey_loot) + loot_layout.addWidget(self.hotkey_loot_edit) + loot_test = QPushButton("Test") + loot_test.clicked.connect(lambda: self._test_hotkey("loot")) + loot_layout.addWidget(loot_test) + hotkey_form.addRow("Loot Window:", loot_layout) + + # HUD + hud_layout = QHBoxLayout() + self.hotkey_hud_edit = QLineEdit(self._hotkey_hud) + 
hud_layout.addWidget(self.hotkey_hud_edit)
+        hud_test = QPushButton("Test")
+        hud_test.clicked.connect(lambda: self._test_hotkey("hud"))
+        hud_layout.addWidget(hud_test)
+        hotkey_form.addRow("HUD Area:", hud_layout)
+
+        layout.addWidget(hotkey_group)
+
+        # Help text
+        help_group = QGroupBox("Help")
+        help_layout = QVBoxLayout(help_group)
+
+        help_text = QLabel(
+            "Format examples:\n"
+            "  F12, Ctrl+F12, Shift+F12, Alt+F12\n"
+            "  Ctrl+Shift+S (avoid system-reserved shortcuts such as Alt+Tab)\n\n"
+            "Note: Global hotkeys require the 'keyboard' library and may need admin privileges.\n"
+            "Qt shortcuts (app focused only) work without additional libraries."
+        )
+        help_text.setStyleSheet("color: #888; font-family: monospace;")
+        help_layout.addWidget(help_text)
+
+        layout.addWidget(help_group)
+        layout.addStretch()
+
+        return tab
+
+    def _create_vision_tab(self) -> QWidget:
+        """Create Computer Vision tab."""
+        tab = QWidget()
+        layout = QVBoxLayout(tab)
+        layout.setSpacing(15)
+
+        # Info header
+        info = QLabel("šŸ‘ļø Computer Vision settings for automatic loot detection and OCR.")
+        info.setStyleSheet("color: #888; padding: 5px;")
+        info.setWordWrap(True)
+        layout.addWidget(info)
+
+        # OCR Backend Selection
+        backend_group = QGroupBox("OCR Backend")
+        backend_layout = QFormLayout(backend_group)
+
+        self.cv_backend_combo = QComboBox()
+        backends = [
+            ("auto", "šŸ¤– Auto-detect (recommended)"),
+            ("opencv", "⚔ OpenCV EAST (fastest, no extra dependencies)"),
+            ("easyocr", "šŸ“– EasyOCR (good accuracy, lighter than Paddle)"),
+            ("tesseract", "šŸ” Tesseract (traditional, stable)"),
+            ("paddle", "🧠 PaddleOCR (best accuracy, requires PaddlePaddle)")
+        ]
+
+        for value, display in backends:
+            self.cv_backend_combo.addItem(display, value)
+            if value == self._cv_backend:
+                self.cv_backend_combo.setCurrentIndex(self.cv_backend_combo.count() - 1)
+
+        self.cv_backend_combo.currentIndexChanged.connect(self._on_backend_changed)
+        backend_layout.addRow("OCR Backend:", self.cv_backend_combo)
+
+        # 
Backend status + self.backend_status = QLabel() + self._update_backend_status() + backend_layout.addRow(self.backend_status) + + layout.addWidget(backend_group) + + # GPU Settings + gpu_group = QGroupBox("GPU Acceleration") + gpu_layout = QFormLayout(gpu_group) + + self.cv_use_gpu_check = QCheckBox("Use GPU acceleration if available") + self.cv_use_gpu_check.setChecked(self._cv_use_gpu) + self.cv_use_gpu_check.setToolTip("Faster processing but requires compatible GPU") + gpu_layout.addRow(self.cv_use_gpu_check) + + # GPU Info + self.gpu_info = QLabel() + self._update_gpu_info() + gpu_layout.addRow(self.gpu_info) + + layout.addWidget(gpu_group) + + # Detection Settings + detection_group = QGroupBox("Detection Settings") + detection_layout = QFormLayout(detection_group) + + self.cv_confidence_spin = QDoubleSpinBox() + self.cv_confidence_spin.setRange(0.1, 1.0) + self.cv_confidence_spin.setSingleStep(0.05) + self.cv_confidence_spin.setValue(self._cv_confidence) + self.cv_confidence_spin.setDecimals(2) + detection_layout.addRow("Confidence Threshold:", self.cv_confidence_spin) + + confidence_help = QLabel("Lower = more sensitive (may detect non-text)\nHigher = stricter (may miss some text)") + confidence_help.setStyleSheet("color: #888; font-size: 11px;") + detection_layout.addRow(confidence_help) + + layout.addWidget(detection_group) + + # Test buttons + test_group = QGroupBox("Test Computer Vision") + test_layout = QHBoxLayout(test_group) + + test_ocr_btn = QPushButton("šŸ“ Test OCR") + test_ocr_btn.clicked.connect(self._test_ocr) + test_layout.addWidget(test_ocr_btn) + + test_icon_btn = QPushButton("šŸŽÆ Test Icon Detection") + test_icon_btn.clicked.connect(self._test_icon_detection) + test_layout.addWidget(test_icon_btn) + + calibrate_btn = QPushButton("šŸ“ Calibrate") + calibrate_btn.clicked.connect(self._calibrate_vision) + test_layout.addWidget(calibrate_btn) + + layout.addWidget(test_group) + layout.addStretch() + + return tab + + def 
_create_advanced_tab(self) -> QWidget:
+        """Create Advanced settings tab."""
+        tab = QWidget()
+        layout = QVBoxLayout(tab)
+        layout.setSpacing(15)
+
+        # Performance
+        perf_group = QGroupBox("Performance")
+        perf_layout = QFormLayout(perf_group)
+
+        self.fps_limit_spin = QSpinBox()
+        self.fps_limit_spin.setRange(1, 144)
+        # Restore the previously saved value instead of always resetting to 60
+        self.fps_limit_spin.setValue(self._settings.value("performance/fps_limit", 60, type=int))
+        self.fps_limit_spin.setSuffix(" FPS")
+        perf_layout.addRow("Target FPS:", self.fps_limit_spin)
+
+        layout.addWidget(perf_group)
+
+        # Database
+        db_group = QGroupBox("Database")
+        db_layout = QVBoxLayout(db_group)
+
+        db_info = QLabel(f"Database location:\n{self.db.db_path if self.db else 'Not connected'}")
+        db_info.setStyleSheet("color: #888; font-family: monospace; font-size: 11px;")
+        db_info.setWordWrap(True)
+        db_layout.addWidget(db_info)
+
+        db_buttons = QHBoxLayout()
+
+        backup_btn = QPushButton("šŸ’¾ Backup Database")
+        backup_btn.clicked.connect(self._backup_database)
+        db_buttons.addWidget(backup_btn)
+
+        export_btn = QPushButton("šŸ“¤ Export Data")
+        export_btn.clicked.connect(self._export_data)
+        db_buttons.addWidget(export_btn)
+
+        db_buttons.addStretch()
+        db_layout.addLayout(db_buttons)
+
+        layout.addWidget(db_group)
+
+        # Logging
+        log_group = QGroupBox("Logging")
+        log_layout = QFormLayout(log_group)
+
+        self.log_level_combo = QComboBox()
+        log_levels = ["DEBUG", "INFO", "WARNING", "ERROR"]
+        for level in log_levels:
+            self.log_level_combo.addItem(level)
+        # Restore the previously saved log level instead of always defaulting to INFO
+        self.log_level_combo.setCurrentText(self._settings.value("logging/level", "INFO", type=str))
+        log_layout.addRow("Log Level:", self.log_level_combo)
+
+        layout.addWidget(log_group)
+
+        layout.addStretch()
+        return tab
+
+    def _on_backend_changed(self):
+        """Handle OCR backend selection change."""
+        self._update_backend_status()
+
+    def _update_backend_status(self):
+        """Update backend status label."""
+        backend = self.cv_backend_combo.currentData()
+
+        status_text = ""
+        if backend == "auto":
+            status_text = "Will try: OpenCV → EasyOCR → Tesseract → PaddleOCR"
+        elif backend == "opencv":
+            
status_text = "āœ… Always available - uses OpenCV DNN (EAST model)"
+        elif backend == "easyocr":
+            try:
+                import easyocr
+                status_text = "āœ… EasyOCR installed and ready"
+            except ImportError:
+                status_text = "āŒ EasyOCR not installed: pip install easyocr"
+        elif backend == "tesseract":
+            try:
+                import pytesseract
+                status_text = "āœ… Tesseract Python module installed"
+            except ImportError:
+                status_text = "āŒ pytesseract not installed: pip install pytesseract"
+        elif backend == "paddle":
+            try:
+                from paddleocr import PaddleOCR
+                status_text = "āœ… PaddleOCR installed"
+            except ImportError:
+                status_text = "āŒ PaddleOCR not installed: pip install paddlepaddle paddleocr"
+
+        self.backend_status.setText(status_text)
+        self.backend_status.setStyleSheet(
+            "color: #4caf50;" if status_text.startswith("āœ…") else
+            "color: #f44336;" if status_text.startswith("āŒ") else "color: #888;"
+        )
+
+    def _update_gpu_info(self):
+        """Update GPU info label."""
+        info_parts = []
+
+        # Check OpenCV CUDA (cv2.cuda may be missing entirely, so catch broadly
+        # but never with a bare except, which would also swallow KeyboardInterrupt)
+        try:
+            import cv2
+            if cv2.cuda.getCudaEnabledDeviceCount() > 0:
+                info_parts.append("āœ… OpenCV CUDA")
+            else:
+                info_parts.append("āŒ OpenCV CUDA")
+        except Exception:
+            info_parts.append("āŒ OpenCV CUDA")
+
+        # Check PyTorch CUDA
+        try:
+            import torch
+            if torch.cuda.is_available():
+                info_parts.append(f"āœ… PyTorch CUDA ({torch.cuda.get_device_name(0)})")
+            else:
+                info_parts.append("āŒ PyTorch CUDA")
+        except Exception:
+            info_parts.append("āŒ PyTorch CUDA")
+
+        self.gpu_info.setText(" | ".join(info_parts))
+
+    def _browse_log_path(self):
+        """Browse for log file."""
+        path, _ = QFileDialog.getOpenFileName(
+            self,
+            "Select Entropia Universe chat.log",
+            "",
+            "Log Files (*.log);;All Files (*)"
+        )
+        if path:
+            self.log_path_edit.setText(path)
+
+    def _set_default_log_path(self):
+        """Set default log path."""
+        default_path = Path.home() / "Documents" / "Entropia Universe" / "chat.log"
+        self.log_path_edit.setText(str(default_path))
+
+    def _test_hotkey(self, hotkey_type: str):
+        """Test a 
screenshot hotkey."""
+        try:
+            from datetime import datetime  # required for the timestamped filename below
+            from modules.auto_screenshot import AutoScreenshot
+            screenshots_dir = Path(__file__).parent.parent / "data" / "screenshots"
+            ss = AutoScreenshot(screenshots_dir)
+
+            filename = f"test_{hotkey_type}_{datetime.now():%Y%m%d_%H%M%S}.png"
+
+            if hotkey_type == "full":
+                filepath = ss.capture_full_screen(filename)
+            elif hotkey_type == "region":
+                import mss
+                with mss.mss() as sct:
+                    monitor = sct.monitors[1]
+                    x = (monitor['width'] - 800) // 2
+                    y = (monitor['height'] - 600) // 2
+                    filepath = ss.capture_region(x, y, 800, 600, filename)
+            elif hotkey_type == "loot":
+                import mss
+                with mss.mss() as sct:
+                    monitor = sct.monitors[1]
+                    x = monitor['width'] - 350
+                    y = monitor['height'] // 2 - 200
+                    filepath = ss.capture_region(x, y, 300, 400, filename)
+            elif hotkey_type == "hud":
+                import mss
+                with mss.mss() as sct:
+                    monitor = sct.monitors[1]
+                    w, h = 600, 150
+                    x = (monitor['width'] - w) // 2
+                    y = monitor['height'] - h - 50
+                    filepath = ss.capture_region(x, y, w, h, filename)
+            else:
+                filepath = None
+
+            if filepath:
+                QMessageBox.information(self, "Screenshot Taken", f"Saved to:\n{filepath}")
+            else:
+                QMessageBox.warning(self, "Error", "Failed to capture screenshot")
+
+        except Exception as e:
+            QMessageBox.critical(self, "Error", f"Screenshot failed:\n{e}")
+
+    def _test_ocr(self):
+        """Test OCR functionality."""
+        QMessageBox.information(self, "OCR Test", "OCR test will be implemented in the Vision Test dialog.")
+        # TODO: Open vision test dialog
+
+    def _test_icon_detection(self):
+        """Test icon detection."""
+        QMessageBox.information(self, "Icon Detection", "Icon detection test will be implemented in the Vision Test dialog.")
+        # TODO: Open vision test dialog
+
+    def _calibrate_vision(self):
+        """Open vision calibration."""
+        QMessageBox.information(self, "Calibration", "Vision calibration will be implemented in the Calibration dialog.")
+        # TODO: Open calibration dialog
+
+    def _backup_database(self):
+        """Backup the 
database.""" + if not self.db: + QMessageBox.warning(self, "Error", "Database not connected") + return + + try: + import shutil + from datetime import datetime + + backup_path = self.db.db_path.parent / f"lemontropia_backup_{datetime.now():%Y%m%d_%H%M%S}.db" + shutil.copy2(self.db.db_path, backup_path) + + QMessageBox.information(self, "Backup Complete", f"Database backed up to:\n{backup_path}") + except Exception as e: + QMessageBox.critical(self, "Backup Failed", str(e)) + + def _export_data(self): + """Export data to CSV/JSON.""" + QMessageBox.information(self, "Export", "Export functionality coming soon!") + + def _on_save(self): + """Save all settings.""" + try: + # General + self._settings.setValue("player/name", self.player_name_edit.text().strip()) + self._settings.setValue("log/path", self.log_path_edit.text().strip()) + self._settings.setValue("log/auto_detect", self.auto_detect_check.isChecked()) + self._settings.setValue("activity/default", self.default_activity_combo.currentData()) + + # Hotkeys + self._settings.setValue("hotkey/screenshot_full", self.hotkey_full_edit.text().strip()) + self._settings.setValue("hotkey/screenshot_region", self.hotkey_region_edit.text().strip()) + self._settings.setValue("hotkey/screenshot_loot", self.hotkey_loot_edit.text().strip()) + self._settings.setValue("hotkey/screenshot_hud", self.hotkey_hud_edit.text().strip()) + + # Computer Vision + self._settings.setValue("cv/backend", self.cv_backend_combo.currentData()) + self._settings.setValue("cv/use_gpu", self.cv_use_gpu_check.isChecked()) + self._settings.setValue("cv/confidence", self.cv_confidence_spin.value()) + + # Advanced + self._settings.setValue("performance/fps_limit", self.fps_limit_spin.value()) + self._settings.setValue("logging/level", self.log_level_combo.currentText()) + + self._settings.sync() + + QMessageBox.information(self, "Settings Saved", "All settings have been saved successfully!") + self.accept() + + except Exception as e: + 
QMessageBox.critical(self, "Error", f"Failed to save settings:\n{e}") + + def _on_reset(self): + """Reset settings to defaults.""" + reply = QMessageBox.question( + self, + "Reset Settings", + "Are you sure you want to reset all settings to defaults?", + QMessageBox.StandardButton.Yes | QMessageBox.StandardButton.No + ) + + if reply == QMessageBox.StandardButton.Yes: + # Clear all settings + self._settings.clear() + self._settings.sync() + + QMessageBox.information(self, "Settings Reset", "Settings have been reset. Please restart the application.") + self.reject() + + def _apply_dark_theme(self): + """Apply dark theme to the dialog.""" + self.setStyleSheet(""" + QDialog { + background-color: #1e1e1e; + color: #e0e0e0; + } + QTabWidget::pane { + background-color: #252525; + border: 1px solid #444; + border-radius: 4px; + } + QTabBar::tab { + background-color: #2d2d2d; + padding: 8px 16px; + border: 1px solid #444; + border-bottom: none; + border-top-left-radius: 4px; + border-top-right-radius: 4px; + } + QTabBar::tab:selected { + background-color: #0d47a1; + } + QGroupBox { + font-weight: bold; + border: 1px solid #444; + border-radius: 6px; + margin-top: 10px; + padding-top: 10px; + } + QGroupBox::title { + subcontrol-origin: margin; + left: 10px; + padding: 0 5px; + } + QLineEdit, QComboBox, QSpinBox, QDoubleSpinBox { + background-color: #252525; + border: 1px solid #444; + border-radius: 4px; + padding: 6px; + color: #e0e0e0; + } + QPushButton { + background-color: #0d47a1; + border: 1px solid #1565c0; + border-radius: 4px; + padding: 6px 12px; + color: white; + } + QPushButton:hover { + background-color: #1565c0; + } + QLabel { + color: #e0e0e0; + } + """) \ No newline at end of file
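Review note: the default for every QSettings key is currently written out twice, once in `_load_current_values` and again implicitly in `_on_save`/widget setup, which makes it easy for the two copies to drift. A minimal, framework-free sketch of centralizing them in one table (the names `SETTINGS_DEFAULTS` and `setting_default` are hypothetical, not part of this patch):

```python
# Hypothetical single source of truth for the keys this dialog reads and
# writes; values mirror the defaults used in _load_current_values.
SETTINGS_DEFAULTS = {
    "player/name": "",
    "log/path": "",
    "log/auto_detect": True,
    "activity/default": "hunting",
    "hotkey/screenshot_full": "F12",
    "hotkey/screenshot_region": "Shift+F12",
    "hotkey/screenshot_loot": "Ctrl+F12",
    "hotkey/screenshot_hud": "Alt+F12",
    "cv/backend": "auto",
    "cv/use_gpu": True,
    "cv/confidence": 0.5,
    "performance/fps_limit": 60,
    "logging/level": "INFO",
}


def setting_default(key: str):
    """Return the default for a known settings key; KeyError flags typos."""
    return SETTINGS_DEFAULTS[key]
```

`_load_current_values` could then read each key as `self._settings.value(key, setting_default(key), type=type(setting_default(key)))`, so adding a key in one place covers load, save, and reset.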