# Vision System Plan - Game UI Reading

## Goal

Read equipped weapon/tool names, mob names, and other UI text from the Entropia Universe game window with high accuracy.

## Two Approaches
### Approach 1: Template Matching + OCR (Recommended)

**Best balance of accuracy and performance.**

**How it works:**

1. Take a screenshot of the game window
2. Use template matching to find specific UI regions (weapon slot, target window, etc.)
3. Crop to those regions
4. Run OCR on the cropped regions only

This is both faster and far more accurate than full-screen OCR, because the recognizer only ever sees small, known text areas.
**Pros:**

- Fast (OCR runs on small regions only)
- Accurate (focused on known text areas)
- Survives game updates (just recapture the templates)
- Low CPU usage

**Cons:**

- Requires creating templates for each UI layout
- UI position changes require template updates

---
### Approach 2: Pure Computer Vision (Advanced)

**Use object detection to find and read text regions automatically.**

**How it works:**

1. Train a YOLO/SSD model to detect UI elements (weapon icons, text boxes, health bars)
2. Run inference on a game screenshot
3. Crop the detected regions
4. Run OCR or a classifier on each region
**Pros:**

- Adapts to UI changes automatically
- Can detect new elements without templates
- Very robust

**Cons:**

- Requires training data (thousands of labeled screenshots)
- Higher CPU/GPU usage
- Complex to implement
- Overkill for this use case

---
## Recommended: Template Matching + OCR Pipeline

### Architecture

```
Game Window
     ↓
Screenshot (mss or PIL)
     ↓
Template Matching (OpenCV)
     ↓
Crop Regions of Interest
     ↓
OCR (PaddleOCR or EasyOCR)
     ↓
Parse Results
     ↓
Update Loadout/HUD
```
### Implementation Plan

#### Phase 1: Screen Capture
```python
import mss
import numpy as np

def capture_game_window():
    """Capture the Entropia Universe window.

    Sketch: grabs the primary monitor. Locating the actual game window
    by title (win32gui on Windows) is left as a platform-specific step.
    """
    with mss.mss() as sct:
        monitor = sct.monitors[1]            # primary monitor
        frame = np.array(sct.grab(monitor))  # BGRA
        return np.ascontiguousarray(frame[:, :, :3])  # drop alpha -> BGR for OpenCV
```
#### Phase 2: Template Matching
```python
import glob
import os

import cv2

class UIFinder:
    def __init__(self, template_dir):
        self.templates = self._load_templates(template_dir)

    def _load_templates(self, template_dir):
        """Load every PNG in template_dir, keyed by filename stem."""
        templates = {}
        for path in glob.glob(os.path.join(template_dir, '*.png')):
            name = os.path.splitext(os.path.basename(path))[0]
            templates[name] = cv2.imread(path)
        return templates

    def find_weapon_slot(self, screenshot):
        """Find the weapon slot in a screenshot."""
        return self._find('weapon_slot', screenshot)

    def find_target_window(self, screenshot):
        """Find the mob target window."""
        return self._find('target_window', screenshot)

    def _find(self, name, screenshot, threshold=0.8):
        template = self.templates[name]
        result = cv2.matchTemplate(screenshot, template, cv2.TM_CCOEFF_NORMED)
        min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)

        if max_val > threshold:
            x, y = max_loc
            h, w = template.shape[:2]
            return (x, y, w, h)  # region
        return None
```
#### Phase 3: Region OCR
```python
import cv2
from paddleocr import PaddleOCR

class RegionOCR:
    def __init__(self):
        # Use English only for speed
        self.ocr = PaddleOCR(
            lang='en',
            use_gpu=False,       # CPU only
            show_log=False,
            det_model_dir=None,  # use default detection model
            rec_model_dir=None,  # use default recognition model
        )

    def read_text(self, screenshot, region):
        """OCR the text in a specific region (weapon name, mob name, ...)."""
        x, y, w, h = region
        crop = screenshot[y:y+h, x:x+w]

        # Preprocess for better OCR
        gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
        _, thresh = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY)

        result = self.ocr.ocr(thresh, cls=False)

        if result and result[0]:
            text = result[0][0][1][0]        # recognized string
            confidence = result[0][0][1][1]  # recognition confidence
            return text, confidence
        return None, 0.0
```
#### Phase 4: Integration
```python
class GameVision:
    """Main vision system."""

    def __init__(self):
        self.finder = UIFinder('templates/')
        self.ocr = RegionOCR()

    def get_equipped_weapon(self):
        """Read the currently equipped weapon name."""
        screenshot = capture_game_window()
        region = self.finder.find_weapon_slot(screenshot)

        if region:
            name, conf = self.ocr.read_text(screenshot, region)
            if conf > 0.8:
                return name
        return None

    def get_target_mob(self):
        """Read the current target mob's name."""
        screenshot = capture_game_window()
        region = self.finder.find_target_window(screenshot)

        if region:
            name, conf = self.ocr.read_text(screenshot, region)
            if conf > 0.8:
                return name
        return None
```

---
## Template Creation Process

### Step 1: Capture Reference Screenshots
```python
import cv2

def capture_templates():
    """Interactive tool to capture UI templates."""
    print("1. Open Entropia Universe")
    print("2. Equip your weapon")
    print("3. Press ENTER when ready to capture the weapon slot template")
    input()

    screenshot = capture_game_window()

    # User drags to select a region (OpenCV's built-in ROI selector)
    x, y, w, h = cv2.selectROI('Select weapon slot', screenshot)
    cv2.destroyAllWindows()

    # Save the template
    template = screenshot[y:y+h, x:x+w]
    cv2.imwrite('templates/weapon_slot.png', template)
```
### Step 2: Create Template Library
```
templates/
├── weapon_slot.png         # Weapon/tool equipped area
├── weapon_name_region.png  # Just the text part
├── target_window.png       # Target mob window
├── target_name_region.png  # Mob name text
├── health_bar.png          # Player health
├── tool_slot.png           # Mining tool/finder
└── README.md               # Template info
```

---
## OCR Engine Comparison

| Engine | Speed | Accuracy | Setup | Best For |
|--------|-------|----------|-------|----------|
| **PaddleOCR** | Medium | High | Easy | General text, multi-language |
| **EasyOCR** | Medium | High | Easy | Quick prototyping, simple text |
| **Tesseract** | Slow | Medium | Medium | Legacy support |
| **PaddleOCR + GPU** | Fast | High | Complex | Real-time, if a GPU is available |

**Recommendation: PaddleOCR** (already used in this project)

---
## Performance Optimizations

### 1. Region of Interest Only
```python
# BAD: OCR the entire screen
result = ocr.ocr(full_screenshot)

# GOOD: OCR only the weapon region
result = ocr.ocr(weapon_region)
```
### 2. Frame Skipping
```python
import time

class VisionPoller:
    def __init__(self):
        self.last_check = 0.0
        self.check_interval = 2.0  # seconds

    def poll(self):
        if time.time() - self.last_check < self.check_interval:
            return  # skip this frame

        # Do OCR here
        self.last_check = time.time()
```
### 3. Async Processing
```python
import asyncio

async def vision_loop():
    while True:
        # Run the blocking capture/OCR calls in worker threads
        # (asyncio.to_thread, Python 3.9+) so the event loop stays responsive.
        # read_weapon and update_loadout are placeholders from this plan.
        screenshot = await asyncio.to_thread(capture_game_window)
        weapon = await asyncio.to_thread(read_weapon, screenshot)
        if weapon:
            update_loadout(weapon)
        await asyncio.sleep(2)
```
### 4. Confidence Thresholding
```python
import logging

log = logging.getLogger(__name__)

def accept_reading(ocr, screenshot):
    """Apply confidence thresholds to a single OCR reading."""
    name, confidence = ocr.read_weapon(screenshot)

    if confidence < 0.7:
        # Too uncertain, skip this reading
        return None

    if confidence < 0.9:
        # Flag for manual verification
        log.warning(f"Low confidence reading: {name} ({confidence:.2f})")

    return name
```
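Beyond per-reading thresholds, a simple debounce also helps: only accept a value once OCR has returned it several times in a row. A sketch (`StableReading` is a hypothetical helper, not existing project code):

```python
from collections import deque

class StableReading:
    """Accept an OCR value only after `required` identical consecutive reads."""

    def __init__(self, required=3):
        self.required = required
        self._history = deque(maxlen=required)

    def update(self, value):
        """Feed one OCR reading; return the value once it has stabilized."""
        self._history.append(value)
        if len(self._history) == self.required and len(set(self._history)) == 1:
            return value
        return None
```

With `required=3` and a 2-second poll interval, a real weapon swap shows up after about 6 seconds, while a one-off misread never reaches the loadout.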
---
## Implementation Roadmap

### Week 1: Foundation

- [ ] Create screen capture module (Windows window handle)
- [ ] Install PaddleOCR (if not already installed)
- [ ] Test basic OCR on game screenshots
- [ ] Create template capture tool
### Week 2: Templates

- [ ] Capture weapon slot template
- [ ] Capture target window template
- [ ] Test template matching accuracy
- [ ] Handle different resolutions/UI scales
### Week 3: Integration

- [ ] Create GameVision class
- [ ] Integrate with Loadout Manager
- [ ] Auto-update equipped weapon detection
- [ ] Mob name logging for hunts
### Week 4: Polish

- [ ] Performance optimization
- [ ] Confidence thresholds
- [ ] Error handling
- [ ] Documentation

---
## Expected Accuracy

| UI Element | Expected Accuracy | Notes |
|------------|-------------------|-------|
| Weapon Name | 85-95% | Clear text, fixed position |
| Tool Name | 85-95% | Similar to weapon |
| Mob Name | 70-85% | Can be complex names, smaller text |
| Health Values | 90-98% | Numbers are easier |
| Damage Numbers | 80-90% | Floating text, harder to catch |
**Why not 100%?**

- Font rendering variations
- Transparency/effects
- Screen scaling
- Anti-aliasing
---
## Alternative: UI Memory Reading (Advanced)

**WARNING: May violate the TOS - research first!**

Some games expose UI data in memory. This would be:

- Instant (no screenshot/OCR)
- 100% accurate
- Much lower CPU usage

**Research needed:**

- Check the Entropia Universe EULA
- Look for public memory maps
- Use tools like Cheat Engine (offline only!)

**Not recommended** unless explicitly allowed.

---
## Summary

**Best approach for Lemontropia:**

1. **Template Matching + OCR** - good accuracy, reasonable performance
2. Capture templates for the weapon slot and target window
3. OCR only those regions
4. Update every 2-5 seconds (not every frame)
5. Use confidence thresholds to filter bad reads

**Next Steps:**

1. Create the template capture tool
2. Create the vision module structure
3. Integrate with the existing loadout system

Want me to implement any part of this?