diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md index 2bf9e4d..4a87909 100644 --- a/.claude/CLAUDE.md +++ b/.claude/CLAUDE.md @@ -51,9 +51,9 @@ The scaffold is functional with image display, async OCR pipeline, drag-select o - Drag-select overlay with word highlighting - Ctrl+C clipboard copy - Stale OCR result cancellation via monotonic job IDs +- Zoom & pan (Ctrl+scroll, pinch, +/- keys, middle-drag pan) via custom `ZoomableCanvas` widget ### What's not implemented yet: -- Zoom and pan - Info panel (filename, dimensions, file size) - Context menu (right-click copy) - OCR caching (cache module exists but is not wired up) @@ -81,5 +81,6 @@ cargo test --all - Never block the GTK main thread — all OCR and I/O runs on background threads - Use async-channel to send results back to the UI thread - quickview-core must have zero GTK dependencies (keeps it testable without a display server) -- Coordinate transforms go through `compute_contain_transform()` — image coords vs widget coords +- Coordinate transforms go through `ViewTransform::from_center()` and related methods in `geometry.rs` — image coords vs widget coords. Fields are private; use `.scale()`, `.offset_x()`, `.offset_y()` getters. `contain()` returns `ContainResult`. 
- OCR results use image-space coordinates; convert to widget-space only for rendering +- OCR hit-testing uses `OcrWordIndex` spatial index (`ocr/index.rs`) for efficient drag-select queries diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 072d583..c45730a 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -32,6 +32,7 @@ jobs: gtk4 libadwaita \ tesseract tesseract-data-eng \ gtk4-layer-shell + pkg-config --atleast-version=4.10 gtk4 rustup default stable - name: Cache cargo registry and build artifacts @@ -74,6 +75,7 @@ jobs: gtk4 libadwaita \ tesseract tesseract-data-eng \ gtk4-layer-shell + pkg-config --atleast-version=4.10 gtk4 rustup default stable - name: Cache cargo registry, build artifacts, and tools diff --git a/AGENTS.md b/AGENTS.md index 18717e5..a6333d2 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -12,7 +12,7 @@ Primary target is Arch Linux + Wayland (wlroots compositors like Sway/Hyprland/n ## Repo Layout - `crates/quickview/`: CLI entrypoint (`quickview` binary). -- `crates/quickview-core/`: non-GTK core (OCR parsing, geometry, selection logic, cache helpers). +- `crates/quickview-core/`: non-GTK core (OCR parsing, geometry/ViewTransform, spatial index, selection logic, cache helpers). - `crates/quickview-ui/`: GTK4/libadwaita UI (full viewer + quick preview windows, overlay widget). - `docs/`: phased plan, architecture, decisions, development notes. - `adrs/`: deeper architecture decisions. 
@@ -83,7 +83,7 @@ GitHub Actions runs in an `archlinux:latest` container and installs system packa - Quick Preview window: `crates/quickview-ui/src/windows/quick_preview.rs` - Full viewer window: `crates/quickview-ui/src/windows/full_viewer.rs` - Viewer controller (loads images, kicks OCR, ignores late results): `crates/quickview-ui/src/windows/shared.rs` -- Overlay + drag selection rendering: `crates/quickview-ui/src/widgets/image_overlay.rs` +- Image rendering, zoom/pan, drag selection, OCR overlay: `crates/quickview-ui/src/widgets/image_overlay.rs` (contains `ImageOverlayWidget` wrapper + `ZoomableCanvas` custom widget subclass) - Tesseract invocation + TSV parsing: `crates/quickview-core/src/ocr/` ## Project Invariants (Don't Break These) diff --git a/CHANGELOG.md b/CHANGELOG.md index 27ef7d8..c1bffa8 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,5 +2,15 @@ ## Unreleased -- Initial scaffold. +### Added +- **Zoom & pan** — Ctrl+scroll zoom (anchored at cursor), pinch-to-zoom, + `+`/`-` keyboard zoom, `0`/`Home` reset to fit-to-window. Middle-click drag + or Ctrl+left-drag to pan. Works in both Full Viewer and Quick Preview. + Selection and OCR highlights stay aligned at all zoom levels. +- **Spatial index for OCR hit-testing** — `OcrWordIndex` uniform-grid index + replaces linear scan during drag-select for faster word lookup. +- **ViewTransform hardening** — validated constructor rejects non-finite and + non-positive scale values; fields are now private with getters. +- **CI GTK4 version check** — `pkg-config --atleast-version=4.10 gtk4` in CI. +- Initial scaffold. diff --git a/README.md b/README.md index 1a82d55..b691168 100644 --- a/README.md +++ b/README.md @@ -35,11 +35,13 @@ quickview --quick-preview photo.png quickview photo.png ``` -**OCR Text Selection** — Tesseract runs asynchronously after the image loads. Drag to select recognized words, `Ctrl+C` to copy. 
+**Zoom & Pan** — `Ctrl+scroll` to zoom at cursor, pinch-to-zoom on touchpad, `+`/`-` keys, `0` to reset. Middle-click drag or `Ctrl+left-drag` to pan. + +**OCR Text Selection** — Tesseract runs asynchronously after the image loads. Drag to select recognized words, `Ctrl+C` to copy. Selection stays aligned at any zoom level. ## Requirements -Arch Linux (primary target): +Arch Linux (primary target, requires GTK4 >= 4.10): ```bash sudo pacman -S --needed \ diff --git a/crates/quickview-core/src/geometry.rs b/crates/quickview-core/src/geometry.rs index 3d34af2..c2fda37 100644 --- a/crates/quickview-core/src/geometry.rs +++ b/crates/quickview-core/src/geometry.rs @@ -1,5 +1,7 @@ use serde::{Deserialize, Serialize}; +use std::fmt; + #[derive(Debug, Default, Clone, Copy, PartialEq, Serialize, Deserialize)] pub struct Point { pub x: f64, @@ -41,3 +43,295 @@ impl Rect { self.x < bx2 && ax2 > other.x && self.y < by2 && ay2 > other.y } } + +/// Result of `ViewTransform::contain()`. +/// +/// This represents the baseline "fit to widget" (contain) scale and the widget-space +/// center point used by `ViewTransform::from_center()`. +#[derive(Debug, Clone, Copy, PartialEq)] +pub struct ContainResult { + /// Uniform scale that fits the entire image inside the widget. + pub contain_scale: f64, + + /// Center of the widget in widget coordinates (pixels). 
+ pub widget_center: Point, +} + +#[derive(Debug, Clone, Copy, PartialEq, Eq)] +pub enum ViewTransformError { + NonFinite, + NonPositiveScale, +} + +impl fmt::Display for ViewTransformError { + fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { + match self { + ViewTransformError::NonFinite => write!(f, "non-finite view transform value"), + ViewTransformError::NonPositiveScale => write!(f, "scale must be > 0"), + } + } +} + +impl std::error::Error for ViewTransformError {} + +#[derive(Debug, Clone, Copy, PartialEq)] +pub struct ViewTransform { + scale: f64, + offset_x: f64, + offset_y: f64, +} + +impl ViewTransform { + pub fn new(scale: f64, offset_x: f64, offset_y: f64) -> Result<Self, ViewTransformError> { + if !scale.is_finite() || !offset_x.is_finite() || !offset_y.is_finite() { + return Err(ViewTransformError::NonFinite); + } + if scale <= 0.0 { + return Err(ViewTransformError::NonPositiveScale); + } + Ok(Self { + scale, + offset_x, + offset_y, + }) + } + + pub fn scale(&self) -> f64 { + self.scale + } + + pub fn offset_x(&self) -> f64 { + self.offset_x + } + + pub fn offset_y(&self) -> f64 { + self.offset_y + } + + /// Compute the baseline "contain" (fit-to-widget) scale. + /// + /// The returned `widget_center` is in widget coordinates (pixels) and is the point + /// that `from_center()` treats as the widget's visual center anchor. + pub fn contain(widget_w: f64, widget_h: f64, image_w: f64, image_h: f64) -> ContainResult { + let widget_center = Point { + x: widget_w.max(0.0) * 0.5, + y: widget_h.max(0.0) * 0.5, + }; + + if widget_w <= 0.0 || widget_h <= 0.0 || image_w <= 0.0 || image_h <= 0.0 { + return ContainResult { + contain_scale: 1.0, + widget_center, + }; + } + + let contain_scale = (widget_w / image_w) + .min(widget_h / image_h) + .max(f64::MIN_POSITIVE); + ContainResult { + contain_scale, + widget_center, + } + } + + /// Construct a `ViewTransform` from canonical view state. + /// + /// `center_img.x` and `center_img.y` must be finite.
This function delegates validation + /// to `ViewTransform::new` and will panic if invariants are violated. + pub fn from_center( + widget_w: f64, + widget_h: f64, + image_w: f64, + image_h: f64, + zoom_factor: f64, + center_img: Point, + ) -> Self { + // `from_center()` delegates invariants to `ViewTransform::new`. + // `center_img.x` / `center_img.y` must be finite or `ViewTransform::new` will error. + debug_assert!( + center_img.x.is_finite() && center_img.y.is_finite(), + "from_center: center_img must be finite (x={}, y={})", + center_img.x, + center_img.y + ); + + let contain = Self::contain(widget_w, widget_h, image_w, image_h); + let scale = + (contain.contain_scale * zoom_factor.max(f64::MIN_POSITIVE)).max(f64::MIN_POSITIVE); + + let offset_x = contain.widget_center.x - center_img.x * scale; + let offset_y = contain.widget_center.y - center_img.y * scale; + Self::new(scale, offset_x, offset_y).expect("ViewTransform invariants violated") + } + + pub fn image_to_widget(&self, point: Point) -> Point { + Point { + x: self.offset_x + point.x * self.scale, + y: self.offset_y + point.y * self.scale, + } + } + + pub fn widget_to_image(&self, point: Point) -> Point { + Point { + x: (point.x - self.offset_x) / self.scale, + y: (point.y - self.offset_y) / self.scale, + } + } + + pub fn image_rect_to_widget(&self, rect: Rect) -> Rect { + Rect { + x: self.offset_x + rect.x * self.scale, + y: self.offset_y + rect.y * self.scale, + w: rect.w * self.scale, + h: rect.h * self.scale, + } + } + + pub fn widget_rect_to_image(&self, rect: Rect) -> Rect { + Rect { + x: (rect.x - self.offset_x) / self.scale, + y: (rect.y - self.offset_y) / self.scale, + w: rect.w / self.scale, + h: rect.h / self.scale, + } + } + + pub fn clamp_center( + widget_w: f64, + widget_h: f64, + image_w: f64, + image_h: f64, + scale: f64, + center_img: Point, + ) -> Point { + if widget_w <= 0.0 || widget_h <= 0.0 || image_w <= 0.0 || image_h <= 0.0 || scale <= 0.0 { + return center_img; + } + + let 
half_view_w = widget_w / (2.0 * scale); + let half_view_h = widget_h / (2.0 * scale); + + let center_x = if image_w * scale <= widget_w { + image_w * 0.5 + } else { + center_img.x.clamp(half_view_w, image_w - half_view_w) + }; + + let center_y = if image_h * scale <= widget_h { + image_h * 0.5 + } else { + center_img.y.clamp(half_view_h, image_h - half_view_h) + }; + + Point { + x: center_x, + y: center_y, + } + } +} + +#[cfg(test)] +mod tests { + use super::{Point, ViewTransform}; + + fn approx_eq(a: f64, b: f64, eps: f64) { + assert!((a - b).abs() <= eps, "{a} != {b} (eps={eps})"); + } + + #[test] + fn image_and_widget_mapping_are_inverse() { + let t = ViewTransform::from_center( + 1200.0, + 800.0, + 2400.0, + 1600.0, + 2.25, + Point { x: 900.0, y: 600.0 }, + ); + + let img = Point { + x: 1234.5, + y: 345.25, + }; + let widget = t.image_to_widget(img); + let roundtrip = t.widget_to_image(widget); + + approx_eq(roundtrip.x, img.x, 1e-9); + approx_eq(roundtrip.y, img.y, 1e-9); + } + + #[test] + fn anchor_preserving_zoom_keeps_widget_anchor_fixed() { + let widget_w = 1000.0; + let widget_h = 700.0; + let image_w = 3000.0; + let image_h = 2000.0; + + let center_start = Point { + x: 1300.0, + y: 900.0, + }; + let zoom_start = 1.3; + let zoom_new = 2.1; + let anchor_widget = Point { x: 120.0, y: 520.0 }; + + let t_start = ViewTransform::from_center( + widget_w, + widget_h, + image_w, + image_h, + zoom_start, + center_start, + ); + let anchor_img = t_start.widget_to_image(anchor_widget); + + let t_new_unclamped = ViewTransform::from_center( + widget_w, + widget_h, + image_w, + image_h, + zoom_new, + center_start, + ); + let contain = ViewTransform::contain(widget_w, widget_h, image_w, image_h); + let widget_center = contain.widget_center; + let center_new = Point { + x: anchor_img.x - (anchor_widget.x - widget_center.x) / t_new_unclamped.scale(), + y: anchor_img.y - (anchor_widget.y - widget_center.y) / t_new_unclamped.scale(), + }; + let t_new = + 
ViewTransform::from_center(widget_w, widget_h, image_w, image_h, zoom_new, center_new); + + let mapped_anchor = t_new.image_to_widget(anchor_img); + approx_eq(mapped_anchor.x, anchor_widget.x, 1e-9); + approx_eq(mapped_anchor.y, anchor_widget.y, 1e-9); + } + + #[test] + fn clamp_center_forces_image_center_when_scaled_image_fits() { + let center = + ViewTransform::clamp_center(1000.0, 800.0, 300.0, 200.0, 2.0, Point { x: 0.0, y: 0.0 }); + approx_eq(center.x, 150.0, 1e-9); + approx_eq(center.y, 100.0, 1e-9); + } + + #[test] + fn clamp_center_limits_pan_when_scaled_image_exceeds_viewport() { + let center = ViewTransform::clamp_center( + 1000.0, + 700.0, + 3000.0, + 2000.0, + 0.6, + Point { + x: -5000.0, + y: 5000.0, + }, + ); + + // half_view_w = 1000 / (2 * 0.6) = 833.333... + // half_view_h = 700 / (2 * 0.6) = 583.333... + approx_eq(center.x, 833.3333333333334, 1e-9); + approx_eq(center.y, 1416.6666666666667, 1e-9); + } +} diff --git a/crates/quickview-core/src/ocr/index.rs b/crates/quickview-core/src/ocr/index.rs new file mode 100644 index 0000000..90f92ad --- /dev/null +++ b/crates/quickview-core/src/ocr/index.rs @@ -0,0 +1,305 @@ +use crate::geometry::Rect; + +use super::models::OcrWord; + +const DEFAULT_CELL_SIZE: f64 = 256.0; + +/// A simple uniform-grid spatial index for OCR word bounding boxes. +/// +/// The index is built in image coordinates and can be queried with a rectangle to +/// efficiently find intersecting words. +/// +/// # Contract +/// This index stores buckets of *indices* into a specific OCR word list. Callers must +/// rebuild the index whenever the underlying `words` slice changes, including: +/// - replacing the OCR result +/// - reordering words +/// - mutating any word bounding boxes +/// +/// Calling `query_intersecting()` with a different `words` slice than the one used to +/// build the index may produce incorrect results. 
+#[derive(Debug, Clone)] +pub struct OcrWordIndex { + cell_size: f64, + grid_w: usize, + grid_h: usize, + buckets: Vec<Vec<usize>>, + seen: Vec<u32>, + seen_gen: u32, +} + +impl OcrWordIndex { + pub fn build(words: &[OcrWord], image_w: f64, image_h: f64) -> Self { + Self::build_with_cell_size(words, image_w, image_h, DEFAULT_CELL_SIZE) + } + + pub fn build_with_cell_size( + words: &[OcrWord], + image_w: f64, + image_h: f64, + cell_size: f64, + ) -> Self { + let cell_size = cell_size.max(1.0); + + let grid_w = ((image_w.max(1.0) / cell_size).ceil() as usize).max(1); + let grid_h = ((image_h.max(1.0) / cell_size).ceil() as usize).max(1); + let mut buckets = vec![Vec::<usize>::new(); grid_w.saturating_mul(grid_h).max(1)]; + + for (idx, w) in words.iter().enumerate() { + Self::insert_bbox(&mut buckets, grid_w, grid_h, cell_size, idx, w.bbox); + } + + Self { + cell_size, + grid_w, + grid_h, + buckets, + seen: vec![0; words.len()], + seen_gen: 1, + } + } + + /// Return indices of words whose bounding boxes intersect `rect`. + /// + /// `words` must be the same word list used when building this index (same ordering and + /// bounding boxes). If you swap or mutate the word list, rebuild via `OcrWordIndex::build(...)` + /// before calling this method again. + pub fn query_intersecting(&mut self, words: &[OcrWord], rect: &Rect) -> Vec<usize> { + if words.is_empty() { + return Vec::new(); + } + + if self.seen.len() != words.len() { + // Best-effort hygiene. The index must be rebuilt when `words` changes; this is only + // to avoid panics from the internal dedupe vector length drifting.
+ self.seen = vec![0; words.len()]; + self.seen_gen = 1; + } + + let Some((x0, y0, x1, y1)) = + Self::cell_range(self.cell_size, self.grid_w, self.grid_h, rect) + else { + return Vec::new(); + }; + + let gen = self.next_seen_gen(); + let mut out = Vec::new(); + + for gy in y0..=y1 { + for gx in x0..=x1 { + let bucket_idx = gy * self.grid_w + gx; + if let Some(bucket) = self.buckets.get(bucket_idx) { + for &word_idx in bucket { + // If the caller violates the contract and supplies a different `words` + // slice than the one used at build time, buckets can contain indices that + // are out of range. Skip rather than panic. + if word_idx >= words.len() || word_idx >= self.seen.len() { + continue; + } + + if self.seen[word_idx] == gen { + continue; + } + self.seen[word_idx] = gen; + + if words.get(word_idx).is_some_and(|w| w.bbox.intersects(rect)) { + out.push(word_idx); + } + } + } + } + } + + out + } + + fn next_seen_gen(&mut self) -> u32 { + if self.seen_gen == u32::MAX { + self.seen.fill(0); + self.seen_gen = 1; + } else { + self.seen_gen += 1; + } + self.seen_gen + } + + fn insert_bbox( + buckets: &mut [Vec<usize>], + grid_w: usize, + grid_h: usize, + cell_size: f64, + word_idx: usize, + bbox: Rect, + ) { + if grid_w == 0 || grid_h == 0 || cell_size <= 0.0 { + return; + } + + if bbox.w <= 0.0 || bbox.h <= 0.0 { + return; + } + + let x0 = (bbox.x / cell_size).floor() as isize; + let y0 = (bbox.y / cell_size).floor() as isize; + let x1 = ((bbox.x + bbox.w) / cell_size).floor() as isize; + let y1 = ((bbox.y + bbox.h) / cell_size).floor() as isize; + + let x0 = x0.clamp(0, (grid_w - 1) as isize) as usize; + let y0 = y0.clamp(0, (grid_h - 1) as isize) as usize; + let x1 = x1.clamp(0, (grid_w - 1) as isize) as usize; + let y1 = y1.clamp(0, (grid_h - 1) as isize) as usize; + + for gy in y0..=y1 { + for gx in x0..=x1 { + let bucket_idx = gy * grid_w + gx; + if let Some(bucket) = buckets.get_mut(bucket_idx) { + bucket.push(word_idx); + } + } + } + } + + fn cell_range( +
cell_size: f64, + grid_w: usize, + grid_h: usize, + rect: &Rect, + ) -> Option<(usize, usize, usize, usize)> { + if cell_size <= 0.0 || grid_w == 0 || grid_h == 0 { + return None; + } + + // Keep semantics aligned with `Rect::intersects()`: degenerate rectangles (w==0 or h==0) + // can still "hit" boxes like a line/point selection. + if rect.w < 0.0 || rect.h < 0.0 { + return None; + } + + let x0 = (rect.x / cell_size).floor() as isize; + let y0 = (rect.y / cell_size).floor() as isize; + let x1 = ((rect.x + rect.w) / cell_size).floor() as isize; + let y1 = ((rect.y + rect.h) / cell_size).floor() as isize; + + let x0 = x0.clamp(0, (grid_w - 1) as isize) as usize; + let y0 = y0.clamp(0, (grid_h - 1) as isize) as usize; + let x1 = x1.clamp(0, (grid_w - 1) as isize) as usize; + let y1 = y1.clamp(0, (grid_h - 1) as isize) as usize; + + Some((x0, y0, x1, y1)) + } +} + +#[cfg(test)] +mod tests { + use super::OcrWordIndex; + use crate::geometry::Rect; + + use super::super::models::OcrWord; + + fn w(text: &str, bbox: Rect, order: usize) -> OcrWord { + OcrWord { + text: text.to_string(), + confidence: 99.0, + bbox, + order, + } + } + + #[test] + fn query_returns_intersecting_words_only() { + let words = vec![ + w( + "a", + Rect { + x: 10.0, + y: 10.0, + w: 10.0, + h: 10.0, + }, + 0, + ), + w( + "b", + Rect { + x: 300.0, + y: 10.0, + w: 10.0, + h: 10.0, + }, + 1, + ), + w( + "c", + Rect { + x: 10.0, + y: 300.0, + w: 10.0, + h: 10.0, + }, + 2, + ), + ]; + + let mut idx = OcrWordIndex::build_with_cell_size(&words, 1000.0, 1000.0, 64.0); + let r = Rect { + x: 290.0, + y: 0.0, + w: 50.0, + h: 50.0, + }; + let mut out = idx.query_intersecting(&words, &r); + out.sort_unstable(); + + assert_eq!(out, vec![1]); + } + + #[test] + fn query_deduplicates_words_that_span_multiple_cells() { + let words = vec![w( + "x", + Rect { + x: 60.0, + y: 60.0, + w: 10.0, + h: 10.0, + }, + 0, + )]; + + // With cell_size=64, this bbox overlaps both cell (0,0) and (1,1). 
+ let mut idx = OcrWordIndex::build_with_cell_size(&words, 256.0, 256.0, 64.0); + let r = Rect { + x: 0.0, + y: 0.0, + w: 200.0, + h: 200.0, + }; + let out = idx.query_intersecting(&words, &r); + + assert_eq!(out, vec![0]); + } + + #[test] + fn degenerate_rects_still_hit_via_intersects_semantics() { + let words = vec![w( + "a", + Rect { + x: 10.0, + y: 10.0, + w: 10.0, + h: 10.0, + }, + 0, + )]; + let mut idx = OcrWordIndex::build_with_cell_size(&words, 100.0, 100.0, 32.0); + + // Point hit inside the word bbox. + let p = Rect { + x: 15.0, + y: 15.0, + w: 0.0, + h: 0.0, + }; + assert_eq!(idx.query_intersecting(&words, &p), vec![0]); + } +} diff --git a/crates/quickview-core/src/ocr/mod.rs b/crates/quickview-core/src/ocr/mod.rs index 870e262..2552ffd 100644 --- a/crates/quickview-core/src/ocr/mod.rs +++ b/crates/quickview-core/src/ocr/mod.rs @@ -1,5 +1,6 @@ //! OCR-related types and helpers. +pub mod index; pub mod models; pub mod select; pub mod tesseract; diff --git a/crates/quickview-ui/Cargo.toml b/crates/quickview-ui/Cargo.toml index db2dcfe..c350b37 100644 --- a/crates/quickview-ui/Cargo.toml +++ b/crates/quickview-ui/Cargo.toml @@ -16,7 +16,7 @@ tracing.workspace = true quickview-core = { path = "../quickview-core" } -gtk4 = { version = "0.10", package = "gtk4" } +gtk4 = { version = "0.10", package = "gtk4", features = ["v4_10"] } adw = { version = "0.8", package = "libadwaita", features = ["v1_4"] } gtk4-layer-shell = "0.7" @@ -25,4 +25,3 @@ async-channel = "2" # Optional: sandboxed image decoding (requires system glycin libs) glycin = { version = "3", optional = true } - diff --git a/crates/quickview-ui/src/widgets/image_overlay.rs b/crates/quickview-ui/src/widgets/image_overlay.rs index 92dfdee..068c397 100644 --- a/crates/quickview-ui/src/widgets/image_overlay.rs +++ b/crates/quickview-ui/src/widgets/image_overlay.rs @@ -1,41 +1,26 @@ -use std::{cell::RefCell, rc::Rc}; +use std::cell::RefCell; +use glib::subclass::types::ObjectSubclassIsExt; use 
gtk::prelude::*; +use gtk::subclass::prelude::*; use gtk4 as gtk; use quickview_core::{ - geometry::{Point, Rect}, - ocr::{models::OcrResult, select}, + geometry::{Point, Rect, ViewTransform}, + ocr::{index::OcrWordIndex, models::OcrResult, select}, }; -#[derive(Default)] -struct State { - image_width: f64, - image_height: f64, - ocr: Option<OcrResult>, +const MIN_ZOOM_FACTOR: f64 = 1.0; +const BASE_MAX_ZOOM_FACTOR: f64 = 20.0; +const ZOOM_STEP: f64 = 1.25; +const INTEGER_SCALE_EPS: f64 = 0.02; +const PAN_DIM_EPS: f64 = 0.5; - // Current selection in widget coordinates. - selecting: bool, - select_start: Point, - select_current: Point, - - // Cached selected word indices (into ocr.words) - selected: Vec<usize>, -} - -/// Overlay widget that displays an image and (optionally) an OCR-backed selection layer. -/// -/// This is an MVP scaffold: -/// - selection is rectangle drag -/// - selected words are those whose bounding boxes intersect the rectangle #[derive(Clone)] pub struct ImageOverlayWidget { root: gtk::Overlay, - picture: gtk::Picture, - drawing: gtk::DrawingArea, + canvas: ZoomableCanvas, spinner: gtk::Spinner, - - state: Rc<RefCell<State>>, } impl Default for ImageOverlayWidget { @@ -48,16 +33,10 @@ impl ImageOverlayWidget { pub fn new() -> Self { let root = gtk::Overlay::new(); - let picture = gtk::Picture::new(); - picture.set_can_shrink(true); - - root.set_child(Some(&picture)); - - let drawing = gtk::DrawingArea::new(); - drawing.set_hexpand(true); - drawing.set_vexpand(true); - drawing.set_focusable(true); - root.add_overlay(&drawing); + let canvas = ZoomableCanvas::new(); + canvas.set_hexpand(true); + canvas.set_vexpand(true); + root.set_child(Some(&canvas)); let spinner = gtk::Spinner::new(); spinner.set_spinning(false); @@ -66,194 +45,928 @@ impl ImageOverlayWidget { spinner.set_valign(gtk::Align::Center); root.add_overlay(&spinner); - let state = Rc::new(RefCell::new(State::default())); + Self { + root, + canvas, + spinner, + } + } - // Draw highlights + selection rectangle.
- { - let state = state.clone(); - drawing.set_draw_func(move |_, cr, width, height| { - let s = state.borrow(); + pub fn widget(&self) -> gtk::Widget { + self.root.clone().upcast() + } - if s.image_width <= 0.0 || s.image_height <= 0.0 { - return; - } + pub fn set_texture(&self, texture: gtk::gdk::Texture) { + self.canvas.set_texture(texture); + } - let (scale, ox, oy) = compute_contain_transform( - width as f64, - height as f64, - s.image_width, - s.image_height, - ); + pub fn set_ocr_result(&self, result: Option<OcrResult>) { + self.canvas.set_ocr_result(result); + } - // Draw selection rectangle (widget coords) - if s.selecting { - let sel = Rect::from_points(s.select_start, s.select_current); - cr.set_source_rgba(0.2, 0.6, 1.0, 0.25); - cr.rectangle(sel.x, sel.y, sel.w, sel.h); - let _ = cr.fill(); - cr.set_source_rgba(0.2, 0.6, 1.0, 0.8); - cr.set_line_width(1.0); - cr.rectangle(sel.x, sel.y, sel.w, sel.h); - let _ = cr.stroke(); - } + pub fn set_ocr_busy(&self, busy: bool) { + self.spinner.set_spinning(busy); + self.spinner.set_visible(busy); + } - // Draw selected word boxes - if let Some(ocr) = &s.ocr { - cr.set_source_rgba(1.0, 1.0, 0.0, 0.25); - for &idx in &s.selected { - if let Some(w) = ocr.words.get(idx) { - let rx = ox + w.bbox.x * scale; - let ry = oy + w.bbox.y * scale; - let rw = w.bbox.w * scale; - let rh = w.bbox.h * scale; - cr.rectangle(rx, ry, rw, rh); - let _ = cr.fill(); - } + pub fn clear_selection(&self) { + self.canvas.clear_selection(); + } + + pub fn selected_text(&self) -> String { + self.canvas.selected_text() + } + + pub fn zoom_by(&self, factor: f64) { + self.canvas.zoom_by(factor); + } + + pub fn zoom_in(&self) { + self.canvas.zoom_by(ZOOM_STEP); + } + + pub fn zoom_out(&self) { + self.canvas.zoom_by(1.0 / ZOOM_STEP); + } + + pub fn reset_view(&self) { + self.canvas.reset_view(); + } +} + +mod imp { + use super::*; + + #[derive(Default)] + pub struct ZoomableCanvas { + pub(super) state: RefCell<CanvasState>, + } + + #[glib::object_subclass] + impl
ObjectSubclass for ZoomableCanvas { + const NAME: &'static str = "QuickViewZoomableCanvas"; + type Type = super::ZoomableCanvas; + type ParentType = gtk::Widget; + } + + impl ObjectImpl for ZoomableCanvas { + fn constructed(&self) { + self.parent_constructed(); + + let obj = self.obj(); + obj.set_focusable(true); + obj.setup_controllers(); + } + } + + impl WidgetImpl for ZoomableCanvas { + fn measure(&self, orientation: gtk::Orientation, for_size: i32) -> (i32, i32, i32, i32) { + let state = self.state.borrow(); + let natural = natural_size_for_measure( + orientation, + for_size, + state.image_width, + state.image_height, + ); + (1, natural, -1, -1) + } + + fn snapshot(&self, snapshot: &gtk::Snapshot) { + self.parent_snapshot(snapshot); + + let widget = self.obj(); + let widget_w = widget.width() as f64; + let widget_h = widget.height() as f64; + + let mut state = self.state.borrow_mut(); + let texture = state.texture.clone(); + let Some(transform) = transform_for_widget(&mut state, widget_w, widget_h) else { + return; + }; + + let Some(texture) = texture.as_ref() else { + return; + }; + + let bounds = gtk::graphene::Rect::new( + transform.offset_x() as f32, + transform.offset_y() as f32, + (state.image_width * transform.scale()) as f32, + (state.image_height * transform.scale()) as f32, + ); + snapshot.append_scaled_texture( + texture, + scaling_filter_for_scale(transform.scale()), + &bounds, + ); + + let overlay_bounds = + gtk::graphene::Rect::new(0.0, 0.0, widget_w as f32, widget_h as f32); + let cr = snapshot.append_cairo(&overlay_bounds); + + if state.selecting { + let sel = Rect::from_points(state.select_start_widget, state.select_current_widget); + cr.set_source_rgba(0.2, 0.6, 1.0, 0.25); + cr.rectangle(sel.x, sel.y, sel.w, sel.h); + let _ = cr.fill(); + + cr.set_source_rgba(0.2, 0.6, 1.0, 0.8); + cr.set_line_width(1.0); + cr.rectangle(sel.x, sel.y, sel.w, sel.h); + let _ = cr.stroke(); + } + + if let Some(ocr) = &state.ocr { + cr.set_source_rgba(1.0, 1.0,
0.0, 0.25); + for &idx in &state.selected_indices { + if let Some(word) = ocr.words.get(idx) { + let rect = transform.image_rect_to_widget(word.bbox); + cr.rectangle(rect.x, rect.y, rect.w, rect.h); + let _ = cr.fill(); } } + } + } + } + + #[derive(Clone)] + pub(super) struct CanvasState { + pub(super) texture: Option<gtk::gdk::Texture>, + pub(super) image_width: f64, + pub(super) image_height: f64, + pub(super) ocr: Option<OcrResult>, + pub(super) ocr_index: Option<OcrWordIndex>, + pub(super) selected_indices: Vec<usize>, + + pub(super) zoom_factor: f64, + pub(super) center_img: Point, + + pub(super) selecting: bool, + pub(super) select_start_widget: Point, + pub(super) select_current_widget: Point, + + pub(super) panning: bool, + pub(super) pan_start_widget: Point, + pub(super) pan_start_center_img: Point, + pub(super) last_cursor_widget: Option<Point>, + + pub(super) pinch_active: bool, + pub(super) pinch_start_zoom_factor: f64, + pub(super) pinch_start_center_img: Point, + pub(super) pinch_anchor_widget: Point, + } + + impl Default for CanvasState { + fn default() -> Self { + Self { + texture: None, + image_width: 0.0, + image_height: 0.0, + ocr: None, + ocr_index: None, + selected_indices: Vec::new(), + zoom_factor: 1.0, + center_img: Point::default(), + selecting: false, + select_start_widget: Point::default(), + select_current_widget: Point::default(), + panning: false, + pan_start_widget: Point::default(), + pan_start_center_img: Point::default(), + last_cursor_widget: None, + pinch_active: false, + pinch_start_zoom_factor: 1.0, + pinch_start_center_img: Point::default(), + pinch_anchor_widget: Point::default(), + } + } + } +} + +glib::wrapper!
{ + pub struct ZoomableCanvas(ObjectSubclass<imp::ZoomableCanvas>) + @extends gtk::Widget, + @implements gtk::Accessible, gtk::Buildable, gtk::ConstraintTarget; +} + +impl Default for ZoomableCanvas { + fn default() -> Self { + Self::new() + } +} + +impl ZoomableCanvas { + pub fn new() -> Self { + glib::Object::new() + } + + pub fn set_texture(&self, texture: gtk::gdk::Texture) { + let mut state = self.imp().state.borrow_mut(); + state.texture = Some(texture.clone()); + state.image_width = texture.width() as f64; + state.image_height = texture.height() as f64; + state.ocr = None; + state.ocr_index = None; + state.zoom_factor = MIN_ZOOM_FACTOR; + state.center_img = Point { + x: state.image_width * 0.5, + y: state.image_height * 0.5, + }; + state.selecting = false; + state.panning = false; + state.pinch_active = false; + state.selected_indices.clear(); + drop(state); + self.queue_draw(); + self.update_cursor(); + } + + pub fn set_ocr_result(&self, result: Option<OcrResult>) { + let mut state = self.imp().state.borrow_mut(); + state.ocr = result; + state.ocr_index = state + .ocr + .as_ref() + .map(|ocr| OcrWordIndex::build(&ocr.words, state.image_width, state.image_height)); + state.selected_indices.clear(); + state.selecting = false; + drop(state); + self.queue_draw(); + } + + pub fn clear_selection(&self) { + let mut state = self.imp().state.borrow_mut(); + state.selecting = false; + state.selected_indices.clear(); + drop(state); + self.queue_draw(); + } + + pub fn selected_text(&self) -> String { + let state = self.imp().state.borrow(); + let Some(ocr) = &state.ocr else { + return String::new(); + }; + + let words = state + .selected_indices + .iter() + .filter_map(|&idx| ocr.words.get(idx)) + .collect::<Vec<_>>(); + select::selected_text(words) + } + + pub fn zoom_by(&self, factor: f64) { + if factor <= 0.0 { + return; + } + self.zoom_at(self.widget_center(), factor); + } + + pub fn reset_view(&self) { + let mut state = self.imp().state.borrow_mut(); + if state.image_width <= 0.0 || state.image_height
<= 0.0 { + return; + } + state.zoom_factor = MIN_ZOOM_FACTOR; + state.center_img = Point { + x: state.image_width * 0.5, + y: state.image_height * 0.5, + }; + state.selecting = false; + state.panning = false; + state.pinch_active = false; + drop(state); + self.queue_draw(); + self.update_cursor(); + } + + fn setup_controllers(&self) { + let motion = gtk::EventControllerMotion::new(); + { + let canvas = self.clone(); + motion.connect_motion(move |_, x, y| { + let mut state = canvas.imp().state.borrow_mut(); + state.last_cursor_widget = Some(Point { x, y }); + drop(state); + canvas.update_cursor(); + }); + } + { + let canvas = self.clone(); + motion.connect_leave(move |_| { + let mut state = canvas.imp().state.borrow_mut(); + state.last_cursor_widget = None; + drop(state); + canvas.update_cursor(); }); } + self.add_controller(motion); - // Drag-selection gesture + let scroll = gtk::EventControllerScroll::new(gtk::EventControllerScrollFlags::VERTICAL); + { + let canvas = self.clone(); + scroll.connect_scroll(move |controller, _dx, dy| { + let mods = controller.current_event_state(); + if !mods.contains(gtk::gdk::ModifierType::CONTROL_MASK) { + return glib::Propagation::Proceed; + } + + let factor = if dy < 0.0 { + ZOOM_STEP + } else if dy > 0.0 { + 1.0 / ZOOM_STEP + } else { + return glib::Propagation::Stop; + }; + + let anchor = canvas.last_cursor_widget(); + canvas.zoom_at(anchor, factor); + glib::Propagation::Stop + }); + } + self.add_controller(scroll); + + let pinch = gtk::GestureZoom::new(); + { + let canvas = self.clone(); + pinch.connect_begin(move |gesture, _| { + canvas.on_pinch_begin(gesture); + }); + } + { + let canvas = self.clone(); + pinch.connect_scale_changed(move |gesture, scale_factor| { + canvas.on_pinch_scale_changed(gesture, scale_factor); + }); + } { - let drag = gtk::GestureDrag::new(); - - let state_begin = state.clone(); - let drawing_begin = drawing.clone(); - drag.connect_drag_begin(move |_, x, y| { - let mut s = state_begin.borrow_mut(); - 
s.selecting = true; - s.select_start = Point { x, y }; - s.select_current = Point { x, y }; - s.selected.clear(); - drawing_begin.queue_draw(); + let canvas = self.clone(); + pinch.connect_end(move |_, _| { + canvas.on_pinch_end(); }); + } + { + let canvas = self.clone(); + pinch.connect_cancel(move |_, _| { + canvas.on_pinch_end(); + }); + } + self.add_controller(pinch); - let state_update = state.clone(); - let drawing_update = drawing.clone(); + let drag = gtk::GestureDrag::new(); + drag.set_button(0); + { + let canvas = self.clone(); + drag.connect_drag_begin(move |gesture, x, y| { + canvas.on_drag_begin(gesture, x, y); + }); + } + { + let canvas = self.clone(); drag.connect_drag_update(move |_, dx, dy| { - let mut s = state_update.borrow_mut(); - let cur = Point { - x: s.select_start.x + dx, - y: s.select_start.y + dy, + canvas.on_drag_update(dx, dy); + }); + } + { + let canvas = self.clone(); + drag.connect_drag_end(move |_, _, _| { + canvas.on_drag_end(); + }); + } + self.add_controller(drag); + } + + fn on_drag_begin(&self, gesture: &gtk::GestureDrag, x: f64, y: f64) { + let mut state = self.imp().state.borrow_mut(); + let widget_w = self.width() as f64; + let widget_h = self.height() as f64; + + let cursor = Point { x, y }; + state.last_cursor_widget = Some(cursor); + + let button = gesture.current_button(); + let mods = gesture.current_event_state(); + let ctrl_pressed = mods.contains(gtk::gdk::ModifierType::CONTROL_MASK); + let pan_requested = button == gtk::gdk::BUTTON_MIDDLE + || (button == gtk::gdk::BUTTON_PRIMARY && ctrl_pressed); + let can_pan = can_pan_at_view( + widget_w, + widget_h, + state.image_width, + state.image_height, + state.zoom_factor, + ); + + if pan_requested && can_pan { + state.panning = true; + state.selecting = false; + state.pan_start_widget = cursor; + state.pan_start_center_img = state.center_img; + } else if button == gtk::gdk::BUTTON_PRIMARY { + state.selecting = true; + state.panning = false; + state.select_start_widget = 
cursor; + state.select_current_widget = cursor; + state.selected_indices.clear(); + } else { + state.selecting = false; + state.panning = false; + } + + drop(state); + self.update_cursor(); + self.queue_draw(); + } + + fn on_drag_update(&self, dx: f64, dy: f64) { + let mut needs_redraw = false; + let mut state = self.imp().state.borrow_mut(); + let widget_w = self.width() as f64; + let widget_h = self.height() as f64; + + if state.panning { + let current = Point { + x: state.pan_start_widget.x + dx, + y: state.pan_start_widget.y + dy, + }; + state.last_cursor_widget = Some(current); + + if let Some(transform) = transform_for_widget(&mut state, widget_w, widget_h) { + state.center_img = Point { + x: state.pan_start_center_img.x - (dx / transform.scale()), + y: state.pan_start_center_img.y - (dy / transform.scale()), }; - s.select_current = cur; + state.center_img = ViewTransform::clamp_center( + widget_w, + widget_h, + state.image_width, + state.image_height, + transform.scale(), + state.center_img, + ); + needs_redraw = true; + } + } - // Update selection set. 
- if let Some(ocr) = &s.ocr { - let width = drawing_update.width() as f64; - let height = drawing_update.height() as f64; + if state.selecting { + let current = Point { + x: state.select_start_widget.x + dx, + y: state.select_start_widget.y + dy, + }; + state.select_current_widget = current; + state.last_cursor_widget = Some(current); + + if let Some(transform) = transform_for_widget(&mut state, widget_w, widget_h) { + let sel_widget = + Rect::from_points(state.select_start_widget, state.select_current_widget); + let sel_image = transform.widget_rect_to_image(sel_widget); + + let selected = { + let s: &mut imp::CanvasState = &mut state; + match (&s.ocr, &mut s.ocr_index) { + (Some(ocr), Some(index)) => { + index.query_intersecting(&ocr.words, &sel_image) + } + (Some(ocr), None) => ocr + .words + .iter() + .enumerate() + .filter_map(|(idx, word)| { + word.bbox.intersects(&sel_image).then_some(idx) + }) + .collect::<Vec<_>>(), + _ => Vec::new(), + } + }; + state.selected_indices = selected; + needs_redraw = true; + } + } - let (scale, ox, oy) = - compute_contain_transform(width, height, s.image_width, s.image_height); + drop(state); + if needs_redraw { + self.queue_draw(); + } + } - let sel_widget = Rect::from_points(s.select_start, s.select_current); - let sel_image = widget_rect_to_image_rect(sel_widget, scale, ox, oy); + fn on_drag_end(&self) { + let mut state = self.imp().state.borrow_mut(); + state.selecting = false; + state.panning = false; + drop(state); + self.update_cursor(); + self.queue_draw(); + } - let selected = select::select_words(&ocr.words, sel_image) - .into_iter() - .filter_map(|w| { - // Convert reference to index. - // This is O(n) but fine for scaffold. 
- ocr.words.iter().position(|x| std::ptr::eq(x, w)) - }) - .collect::<Vec<_>>(); + fn on_pinch_begin(&self, gesture: &gtk::GestureZoom) { + let mut state = self.imp().state.borrow_mut(); + if state.image_width <= 0.0 || state.image_height <= 0.0 { + return; + } - s.selected = selected; - } + let default_anchor = self.widget_center(); + let anchor = gesture + .bounding_box_center() + .map(|(x, y)| Point { x, y }) + .unwrap_or(default_anchor); - drawing_update.queue_draw(); - }); + state.pinch_active = true; + state.pinch_start_zoom_factor = state.zoom_factor; + state.pinch_start_center_img = state.center_img; + state.pinch_anchor_widget = anchor; + } - let state_end = state.clone(); - let drawing_end = drawing.clone(); - drag.connect_drag_end(move |_, _, _| { - let mut s = state_end.borrow_mut(); - s.selecting = false; - drawing_end.queue_draw(); - }); + fn on_pinch_scale_changed(&self, gesture: &gtk::GestureZoom, scale_factor: f64) { + let mut state = self.imp().state.borrow_mut(); + if !state.pinch_active || state.image_width <= 0.0 || state.image_height <= 0.0 { + return; + } - drawing.add_controller(drag); + let widget_w = self.width() as f64; + let widget_h = self.height() as f64; + if widget_w <= 0.0 || widget_h <= 0.0 { + return; } - Self { - root, - picture, - drawing, - spinner, - state, + let begin_transform = { + let begin_scale = ViewTransform::from_center( + widget_w, + widget_h, + state.image_width, + state.image_height, + state.pinch_start_zoom_factor, + state.pinch_start_center_img, + ) + .scale(); + let begin_center = ViewTransform::clamp_center( + widget_w, + widget_h, + state.image_width, + state.image_height, + begin_scale, + state.pinch_start_center_img, + ); + ViewTransform::from_center( + widget_w, + widget_h, + state.image_width, + state.image_height, + state.pinch_start_zoom_factor, + begin_center, + ) + }; + + let fallback_anchor = if state.pinch_anchor_widget == Point::default() { + self.widget_center() + } else { + state.pinch_anchor_widget + }; + 
let anchor_widget = gesture + .bounding_box_center() + .map(|(x, y)| Point { x, y }) + .unwrap_or(fallback_anchor); + state.pinch_anchor_widget = anchor_widget; + let anchor_img = begin_transform.widget_to_image(anchor_widget); + let gesture_scale = if scale_factor > 0.0 { + scale_factor + } else { + gesture.scale_delta().max(f64::MIN_POSITIVE) + }; + + state.zoom_factor = clamp_zoom_factor( + state.pinch_start_zoom_factor * gesture_scale, + widget_w, + widget_h, + state.image_width, + state.image_height, + ); + + let new_scale = ViewTransform::from_center( + widget_w, + widget_h, + state.image_width, + state.image_height, + state.zoom_factor, + state.center_img, + ) + .scale(); + let widget_center = + ViewTransform::contain(widget_w, widget_h, state.image_width, state.image_height) + .widget_center; + state.center_img = recenter_for_anchor(widget_center, new_scale, anchor_widget, anchor_img); + state.center_img = ViewTransform::clamp_center( + widget_w, + widget_h, + state.image_width, + state.image_height, + new_scale, + state.center_img, + ); + + drop(state); + self.queue_draw(); + self.update_cursor(); + } + + fn on_pinch_end(&self) { + let mut state = self.imp().state.borrow_mut(); + state.pinch_active = false; + drop(state); + self.update_cursor(); + } + + fn zoom_at(&self, anchor_widget: Point, factor: f64) { + if factor <= 0.0 { + return; + } + + let mut state = self.imp().state.borrow_mut(); + if state.image_width <= 0.0 || state.image_height <= 0.0 { + return; + } + + let widget_w = self.width() as f64; + let widget_h = self.height() as f64; + let Some(current_transform) = transform_for_widget(&mut state, widget_w, widget_h) else { + return; + }; + + let anchor_img = current_transform.widget_to_image(anchor_widget); + let new_zoom = clamp_zoom_factor( + state.zoom_factor * factor, + widget_w, + widget_h, + state.image_width, + state.image_height, + ); + if (new_zoom - state.zoom_factor).abs() <= f64::EPSILON { + return; } + state.zoom_factor = new_zoom; + 
+ let new_scale = ViewTransform::from_center( + widget_w, + widget_h, + state.image_width, + state.image_height, + state.zoom_factor, + state.center_img, + ) + .scale(); + let widget_center = + ViewTransform::contain(widget_w, widget_h, state.image_width, state.image_height) + .widget_center; + state.center_img = recenter_for_anchor(widget_center, new_scale, anchor_widget, anchor_img); + state.center_img = ViewTransform::clamp_center( + widget_w, + widget_h, + state.image_width, + state.image_height, + new_scale, + state.center_img, + ); + + drop(state); + self.queue_draw(); + self.update_cursor(); } - pub fn widget(&self) -> gtk::Widget { - self.root.clone().upcast() + fn widget_center(&self) -> Point { + Point { + x: (self.width() as f64) * 0.5, + y: (self.height() as f64) * 0.5, + } } - pub fn set_texture(&self, texture: gtk::gdk::Texture) { - let mut s = self.state.borrow_mut(); - s.image_width = texture.width() as f64; - s.image_height = texture.height() as f64; - drop(s); - self.picture.set_paintable(Some(&texture)); - self.drawing.queue_draw(); + fn last_cursor_widget(&self) -> Point { + self.imp() + .state + .borrow() + .last_cursor_widget + .unwrap_or_else(|| self.widget_center()) } - pub fn set_ocr_result(&self, result: Option) { - let mut s = self.state.borrow_mut(); - s.ocr = result; - s.selected.clear(); - s.selecting = false; - self.drawing.queue_draw(); + fn update_cursor(&self) { + let state = self.imp().state.borrow(); + let can_pan = can_pan_at_view( + self.width() as f64, + self.height() as f64, + state.image_width, + state.image_height, + state.zoom_factor, + ); + if state.panning && can_pan { + self.set_cursor_from_name(Some("grabbing")); + } else if can_pan { + self.set_cursor_from_name(Some("grab")); + } else { + self.set_cursor_from_name(None); + } } +} - pub fn set_ocr_busy(&self, busy: bool) { - self.spinner.set_spinning(busy); - self.spinner.set_visible(busy); +fn transform_for_widget( + state: &mut imp::CanvasState, + widget_w: f64, + 
widget_h: f64, +) -> Option<ViewTransform> { + if widget_w <= 0.0 || widget_h <= 0.0 || state.image_width <= 0.0 || state.image_height <= 0.0 { + return None; } - pub fn clear_selection(&self) { - let mut s = self.state.borrow_mut(); - s.selected.clear(); - s.selecting = false; - self.drawing.queue_draw(); + let max_zoom = + max_zoom_factor_for_dims(widget_w, widget_h, state.image_width, state.image_height); + if state.zoom_factor > max_zoom { + state.zoom_factor = max_zoom; + } else if state.zoom_factor < MIN_ZOOM_FACTOR { + state.zoom_factor = MIN_ZOOM_FACTOR; } - pub fn selected_text(&self) -> String { - let s = self.state.borrow(); - let Some(ocr) = &s.ocr else { - return String::new(); - }; + let mut transform = ViewTransform::from_center( + widget_w, + widget_h, + state.image_width, + state.image_height, + state.zoom_factor, + state.center_img, + ); + let clamped = ViewTransform::clamp_center( + widget_w, + widget_h, + state.image_width, + state.image_height, + transform.scale(), + state.center_img, + ); + if point_changed(clamped, state.center_img) { + state.center_img = clamped; + transform = ViewTransform::from_center( + widget_w, + widget_h, + state.image_width, + state.image_height, + state.zoom_factor, + state.center_img, + ); + } - let words = s - .selected - .iter() - .filter_map(|&idx| ocr.words.get(idx)) - .collect::<Vec<_>>(); + Some(transform) +} - select::selected_text(words) +fn point_changed(a: Point, b: Point) -> bool { + (a.x - b.x).abs() > f64::EPSILON || (a.y - b.y).abs() > f64::EPSILON +} + +fn natural_size_for_measure( + orientation: gtk::Orientation, + for_size: i32, + image_w: f64, + image_h: f64, +) -> i32 { + if image_w <= 0.0 || image_h <= 0.0 { + return 1; + } + + if for_size > 0 { + match orientation { + gtk::Orientation::Horizontal => size_to_i32((for_size as f64) * (image_w / image_h)), + gtk::Orientation::Vertical => size_to_i32((for_size as f64) * (image_h / image_w)), + _ => 1, + } + } else { + match orientation { + gtk::Orientation::Horizontal => 
size_to_i32(image_w), + gtk::Orientation::Vertical => size_to_i32(image_h), + _ => 1, + } + } +} + +fn size_to_i32(value: f64) -> i32 { + value.round().clamp(1.0, i32::MAX as f64) as i32 +} + +fn clamp_zoom_factor(zoom: f64, widget_w: f64, widget_h: f64, image_w: f64, image_h: f64) -> f64 { + let max_zoom = max_zoom_factor_for_dims(widget_w, widget_h, image_w, image_h); + zoom.clamp(MIN_ZOOM_FACTOR, max_zoom) +} + +fn contain_scale_for_dims(widget_w: f64, widget_h: f64, image_w: f64, image_h: f64) -> Option<f64> { + if widget_w <= 0.0 || widget_h <= 0.0 || image_w <= 0.0 || image_h <= 0.0 { + return None; } + Some(ViewTransform::contain(widget_w, widget_h, image_w, image_h).contain_scale) } -fn compute_contain_transform( +fn effective_scale_for_dims( widget_w: f64, widget_h: f64, image_w: f64, image_h: f64, -) -> (f64, f64, f64) { - // contain - let scale = (widget_w / image_w).min(widget_h / image_h).max(0.0001); - let draw_w = image_w * scale; - let draw_h = image_h * scale; - let ox = (widget_w - draw_w) / 2.0; - let oy = (widget_h - draw_h) / 2.0; - (scale, ox, oy) + zoom_factor: f64, +) -> Option<f64> { + contain_scale_for_dims(widget_w, widget_h, image_w, image_h).map(|s| s * zoom_factor) +} + +fn max_zoom_factor_for_dims(widget_w: f64, widget_h: f64, image_w: f64, image_h: f64) -> f64 { + let contain_scale = + contain_scale_for_dims(widget_w, widget_h, image_w, image_h).unwrap_or(MIN_ZOOM_FACTOR); + let max_zoom = BASE_MAX_ZOOM_FACTOR.max(1.0 / contain_scale); + debug_assert!(contain_scale * max_zoom >= 1.0 - 1e-12); + max_zoom } -fn widget_rect_to_image_rect(sel: Rect, scale: f64, ox: f64, oy: f64) -> Rect { - Rect { - x: (sel.x - ox) / scale, - y: (sel.y - oy) / scale, - w: sel.w / scale, - h: sel.h / scale, +fn can_pan_at_view( + widget_w: f64, + widget_h: f64, + image_w: f64, + image_h: f64, + zoom_factor: f64, +) -> bool { + let Some(scale) = effective_scale_for_dims(widget_w, widget_h, image_w, image_h, zoom_factor) + else { + return false; + }; + image_w * 
scale > widget_w + PAN_DIM_EPS || image_h * scale > widget_h + PAN_DIM_EPS +} + +fn recenter_for_anchor( + widget_center: Point, + scale: f64, + anchor_widget: Point, + anchor_img: Point, +) -> Point { + Point { + x: anchor_img.x - (anchor_widget.x - widget_center.x) / scale, + y: anchor_img.y - (anchor_widget.y - widget_center.y) / scale, + } +} + +fn scaling_filter_for_scale(scale: f64) -> gtk::gsk::ScalingFilter { + let is_near_integer = scale > 1.0 && (scale - scale.round()).abs() <= INTEGER_SCALE_EPS; + if is_near_integer { + gtk::gsk::ScalingFilter::Nearest + } else { + gtk::gsk::ScalingFilter::Trilinear + } +} + +#[cfg(test)] +mod tests { + use super::{max_zoom_factor_for_dims, natural_size_for_measure, size_to_i32}; + use gtk4::Orientation; + use quickview_core::geometry::ViewTransform; + + #[test] + fn size_to_i32_rounds_and_clamps() { + assert_eq!(size_to_i32(-100.0), 1); + assert_eq!(size_to_i32(0.0), 1); + assert_eq!(size_to_i32(0.49), 1); + assert_eq!(size_to_i32(0.50), 1); + assert_eq!(size_to_i32(1.4), 1); + assert_eq!(size_to_i32(1.5), 2); + assert_eq!(size_to_i32((i32::MAX as f64) + 12345.0), i32::MAX); + } + + #[test] + fn measure_preserves_aspect_ratio_when_constrained() { + // 2:1 image + let image_w = 400.0; + let image_h = 200.0; + + let h = natural_size_for_measure(Orientation::Horizontal, 100, image_w, image_h); + assert_eq!(h, 200); + + let v = natural_size_for_measure(Orientation::Vertical, 100, image_w, image_h); + assert_eq!(v, 50); + } + + #[test] + fn measure_uses_image_dimensions_when_unconstrained() { + let image_w = 123.2; + let image_h = 456.6; + + let h = natural_size_for_measure(Orientation::Horizontal, -1, image_w, image_h); + assert_eq!(h, 123); + + let v = natural_size_for_measure(Orientation::Vertical, -1, image_w, image_h); + assert_eq!(v, 457); + } + + #[test] + fn dynamic_max_zoom_allows_absolute_scale_one_for_tiny_contain() { + let widget_w = 320.0; + let widget_h = 240.0; + let image_w = 12000.0; + let image_h = 8000.0; 
+ + let contain_scale = + ViewTransform::contain(widget_w, widget_h, image_w, image_h).contain_scale; + let max_zoom = max_zoom_factor_for_dims(widget_w, widget_h, image_w, image_h); + let max_absolute_scale = contain_scale * max_zoom; + + assert!(max_zoom > 20.0); + assert!(max_absolute_scale >= 1.0); } } diff --git a/crates/quickview-ui/src/windows/full_viewer.rs b/crates/quickview-ui/src/windows/full_viewer.rs index d2790dc..7e8fece 100644 --- a/crates/quickview-ui/src/windows/full_viewer.rs +++ b/crates/quickview-ui/src/windows/full_viewer.rs @@ -27,6 +27,7 @@ pub fn present(app: &adw::Application, opts: &LaunchOptions) { // Key handling: arrows navigate, Ctrl+C copies. { let viewer = viewer.clone(); + let overlay = viewer.overlay(); let window_clone = window.clone(); let controller = gtk::EventControllerKey::new(); controller.connect_key_pressed(move |_, key, _, state| { @@ -38,6 +39,22 @@ pub fn present(app: &adw::Application, opts: &LaunchOptions) { return glib::Propagation::Stop; } + if key == gtk::gdk::Key::plus + || key == gtk::gdk::Key::equal + || key == gtk::gdk::Key::KP_Add + { + overlay.zoom_in(); + return glib::Propagation::Stop; + } + if key == gtk::gdk::Key::minus || key == gtk::gdk::Key::KP_Subtract { + overlay.zoom_out(); + return glib::Propagation::Stop; + } + if key == gtk::gdk::Key::_0 || key == gtk::gdk::Key::Home { + overlay.reset_view(); + return glib::Propagation::Stop; + } + if key == gtk::gdk::Key::Left { viewer.prev_image(); return glib::Propagation::Stop; diff --git a/crates/quickview-ui/src/windows/quick_preview.rs b/crates/quickview-ui/src/windows/quick_preview.rs index 77e63b7..31723ba 100644 --- a/crates/quickview-ui/src/windows/quick_preview.rs +++ b/crates/quickview-ui/src/windows/quick_preview.rs @@ -34,6 +34,7 @@ pub fn present(app: &adw::Application, opts: &LaunchOptions) { // Key handling: Esc/Space closes, Ctrl+C copies. 
{ let viewer = viewer.clone(); + let overlay = viewer.overlay(); let window_clone = window.clone(); let controller = gtk::EventControllerKey::new(); controller.connect_key_pressed(move |_, key, _, state| { @@ -44,6 +45,22 @@ pub fn present(app: &adw::Application, opts: &LaunchOptions) { return glib::Propagation::Stop; } + if key == gtk::gdk::Key::plus + || key == gtk::gdk::Key::equal + || key == gtk::gdk::Key::KP_Add + { + overlay.zoom_in(); + return glib::Propagation::Stop; + } + if key == gtk::gdk::Key::minus || key == gtk::gdk::Key::KP_Subtract { + overlay.zoom_out(); + return glib::Propagation::Stop; + } + if key == gtk::gdk::Key::_0 || key == gtk::gdk::Key::Home { + overlay.reset_view(); + return glib::Propagation::Stop; + } + if key == gtk::gdk::Key::Escape || key == gtk::gdk::Key::space { window_clone.close(); return glib::Propagation::Stop; diff --git a/crates/quickview-ui/src/windows/shared.rs b/crates/quickview-ui/src/windows/shared.rs index 502b275..ca014e9 100644 --- a/crates/quickview-ui/src/windows/shared.rs +++ b/crates/quickview-ui/src/windows/shared.rs @@ -52,7 +52,6 @@ impl ViewerController { self.overlay.widget() } - #[allow(dead_code)] pub fn overlay(&self) -> ImageOverlayWidget { self.overlay.clone() } diff --git a/diagrams/architecture.mmd b/diagrams/architecture.mmd index 56a99ea..b348c05 100644 --- a/diagrams/architecture.mmd +++ b/diagrams/architecture.mmd @@ -1,12 +1,11 @@ -```mermaid flowchart LR subgraph UI[GTK4 / libadwaita UI Process] - A[App entry\nCLI + .desktop] --> B{Mode?} - B -->|--quick-preview| Q[Quick Preview Window\n(borderless overlay)] + A[App entry
<br/>CLI + .desktop] --> B{Mode?} + B -->|--quick-preview| Q[Quick Preview Window<br/>borderless overlay] B -->|default| F[Full Viewer Window] - Q --> R[Renderer\n(texture + transforms)] + Q --> R[Renderer<br/>texture + transforms] F --> R - R --> O[OCR Overlay Layer\n(hit-testing + selection)] + R --> O[OCR Overlay Layer<br/>hit-testing + selection] end subgraph IMG[Image Pipeline] @@ -15,17 +14,17 @@ flowchart LR end subgraph OCR[OCR Pipeline] - T1[Prepare bitmap\n(optional preprocess)] --> T2[OCR engine] - T2 --> T3[Layout output\n(TSV/HOCR)] - T3 --> T4[Parsed boxes\nwords/lines + confidence] + T1[Prepare bitmap<br/>optional preprocess] --> T2[OCR engine] + T2 --> T3[Layout output<br/>TSV/HOCR] + T3 --> T4[Parsed boxes<br/>
words/lines + confidence] end I1 --> I2 --> I3 --> R I3 --> T1 --> T2 --> T3 --> T4 --> O subgraph Cache[Cache] - C1[(In-memory cache)]:::cache - C2[(Optional persistent cache)]:::cache + C1[In-memory cache]:::cache + C2[Optional persistent cache]:::cache end T4 --> C1 @@ -33,4 +32,3 @@ flowchart LR T4 --> C2 classDef cache fill:#f2f2f2,stroke:#bbb,color:#111; -``` diff --git a/diagrams/state_machine.mmd b/diagrams/state_machine.mmd index 4b0d14c..1c50a62 100644 --- a/diagrams/state_machine.mmd +++ b/diagrams/state_machine.mmd @@ -1,4 +1,3 @@ -```mermaid stateDiagram-v2 [*] --> Idle Idle --> LoadingImage: Open(path/stdin) @@ -11,4 +10,3 @@ stateDiagram-v2 OcrRunning --> Closed: Space/Esc (cancel/ignore) OcrReady --> Closed: Space/Esc Closed --> [*] -``` diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md index 770c051..d419b80 100644 --- a/docs/ARCHITECTURE.md +++ b/docs/ARCHITECTURE.md @@ -15,12 +15,12 @@ This architecture aims to satisfy the spec goals: ```mermaid flowchart LR subgraph UI[GTK4 / libadwaita UI Process] - A[App entry\nCLI + .desktop] --> B{Mode?} - B -->|--quick-preview| Q[Quick Preview Window\n(borderless overlay)] + A[App entry
<br/>CLI + .desktop] --> B{Mode?} + B -->|--quick-preview| Q[Quick Preview Window<br/>borderless overlay] B -->|default| F[Full Viewer Window] - Q --> R[Renderer\n(texture + transforms)] + Q --> R[Renderer<br/>texture + transforms] F --> R - R --> O[OCR Overlay Layer\n(hit-testing + selection)] + R --> O[OCR Overlay Layer<br/>hit-testing + selection] end subgraph IMG[Image Pipeline] @@ -29,17 +29,17 @@ flowchart LR end subgraph OCR[OCR Pipeline] - T1[Prepare bitmap\n(optional preprocess)] --> T2[OCR engine] - T2 --> T3[Layout output\n(TSV/HOCR)] - T3 --> T4[Parsed boxes\nwords/lines + confidence] + T1[Prepare bitmap<br/>optional preprocess] --> T2[OCR engine] + T2 --> T3[Layout output<br/>TSV/HOCR] + T3 --> T4[Parsed boxes<br/>
words/lines + confidence] end I1 --> I2 --> I3 --> R I3 --> T1 --> T2 --> T3 --> T4 --> O subgraph Cache[Cache] - C1[(In-memory cache)]:::cache - C2[(Optional persistent cache)]:::cache + C1[In-memory cache]:::cache + C2[Optional persistent cache]:::cache end T4 --> C1 @@ -163,21 +163,36 @@ Store OCR results as: - optional: paragraph/block grouping for better selection behavior (future) ### 7.2 Hit testing -Selection requires mapping pointer coordinates → OCR boxes. Best practice: -- build a spatial index (e.g., grid index or R-tree) over word bounding boxes in image coordinates -- at drag-select, query overlapping boxes, then order them by reading order (line then x) +Selection requires mapping pointer coordinates → OCR boxes. + +Implemented in `crates/quickview-core/src/ocr/index.rs` as `OcrWordIndex` — a uniform-grid spatial index (256px cells) over word bounding boxes in image coordinates. Built once when OCR results arrive; queried on every drag-select update via `query_intersecting()`. Falls back to linear scan if no index is available. ### 7.3 Transform math -Maintain a view transform `T`: -- scale (zoom) -- translation (pan) -- fit-to-window baseline transform +Implemented in `crates/quickview-core/src/geometry.rs` as `ViewTransform`. 
+ +**Canonical state** (stored per-widget, resize-stable): +- `zoom_factor: f64` — 1.0 = contain-fit +- `center_img: Point` — image-space point at widget center + +**Deriving the transform each frame** (`ViewTransform::from_center`): +- `contain()` returns a `ContainResult { contain_scale, widget_center }` +- `scale = contain_scale * zoom_factor` +- `offset = widget_center - center_img * scale` +- Constructor validates non-finite and non-positive scale values (`ViewTransformError`) +- Fields are private; accessed via `.scale()`, `.offset_x()`, `.offset_y()` getters Convert bounding boxes for render: -- `bbox_widget = T(bbox_image)` +- `bbox_widget = T(bbox_image)` via `image_rect_to_widget()` + +Hit-testing and selection do the inverse: +- `p_image = T⁻¹(p_widget)` via `widget_to_image()` +- `sel_image = T⁻¹(sel_widget)` via `widget_rect_to_image()` — converts a drag-selection rectangle to image coordinates for OCR word intersection testing + +**Clamping**: `clamp_center()` keeps the image covering the viewport when zoomed in, or forces `center_img` to image center when the scaled image fits within the widget. Clamped values are written back to state to keep it canonical. + +**Zoom anchoring**: anchor-preserving math ensures the image point under the cursor (or pinch center) stays fixed after zoom. See `recenter_for_anchor()` in `image_overlay.rs`. -Hit-testing does the inverse: -- `p_image = T^-1(p_widget)` +**Rendering**: `ZoomableCanvas` (custom `gtk::Widget` subclass in `image_overlay.rs`) uses the GSK/Snapshot pipeline — `snapshot.append_scaled_texture()` for GPU-accelerated image rendering, `snapshot.append_cairo()` only for lightweight overlay primitives (selection rect, OCR highlights). ### 7.4 Copy semantics When copying selection: diff --git a/docs/DECISIONS.md b/docs/DECISIONS.md index 4068951..70ad51f 100644 --- a/docs/DECISIONS.md +++ b/docs/DECISIONS.md @@ -135,7 +135,24 @@ Recommendation: implement TSV first; add hOCR later as debug/export. 
--- -### 8) Threading + cancellation +### 8) Image rendering: custom Widget subclass + GSK/Snapshot pipeline + +**Why** +- `gtk::Picture` does its own internal contain-fit with no way to inject a zoom/pan transform. +- The GSK/Snapshot pipeline (`append_scaled_texture`) keeps the `GdkTexture` on the GPU — no `Texture::download()` to a Cairo surface, no CPU-bound rendering on the main thread. +- Cairo is only used for lightweight overlay primitives (selection rect, OCR highlights) via `snapshot.append_cairo()`. + +**Tradeoffs** +- Requires a custom `gtk::Widget` subclass (`ZoomableCanvas`) with `glib::subclass` boilerplate. +- Requires GTK >= 4.10 for `append_scaled_texture` (the `v4_10` feature gate). + +**Alternatives considered** +- **`gtk::Picture` + Cairo overlay**: simpler but `Texture::download()` blocks the main thread and wastes memory holding both GPU and CPU copies. +- **`snapshot.save()/translate()/scale()/append_texture()`**: works without `v4_10` but no scaling filter control. + +--- + +### 9) Threading + cancellation **Recommended** - OCR runs in a worker thread (or separate process) and returns results over a channel. diff --git a/docs/DEPENDENCIES.md b/docs/DEPENDENCIES.md index 509c608..1600f31 100644 --- a/docs/DEPENDENCIES.md +++ b/docs/DEPENDENCIES.md @@ -5,7 +5,7 @@ This file is a quick reference for system and Rust dependencies. 
## System dependencies (Arch) Required: -- gtk4 +- gtk4 (>= 4.10 — required for `append_scaled_texture` used by the zoom/pan renderer) - libadwaita - tesseract - tesseract language pack(s) (at least English: `tesseract-data-eng`) @@ -17,7 +17,7 @@ Optional: ## Rust crates (workspace) -- `gtk4` (GTK4 bindings) +- `gtk4` (GTK4 bindings, `v4_10` feature enabled) - `libadwaita` (Adwaita widgets) - `gtk4-layer-shell` (Layer Shell integration) - `clap` (CLI) diff --git a/docs/DEVELOPMENT.md b/docs/DEVELOPMENT.md index fe2c3ed..e227770 100644 --- a/docs/DEVELOPMENT.md +++ b/docs/DEVELOPMENT.md @@ -38,7 +38,8 @@ sudo pacman -S --needed tesseract tesseract-data-eng - Keep all OCR and I/O off the GTK main thread. - Use `async-channel` + `glib::MainContext::spawn_local()` to send results back to the UI. - Prefer small widgets with clear responsibilities: - - `ImageOverlayWidget` draws the image and highlights + - `ImageOverlayWidget` wraps the overlay + spinner; delegates to `ZoomableCanvas` + - `ZoomableCanvas` (custom `gtk::Widget` subclass) handles image rendering via GSK/Snapshot, zoom/pan state, selection gestures, and OCR highlight overlay - `ViewerController` manages OCR dispatch and yields `OcrResult` ## Useful tasks diff --git a/docs/PHASED_PLAN.md b/docs/PHASED_PLAN.md index 286e5f6..e8f2d7a 100644 --- a/docs/PHASED_PLAN.md +++ b/docs/PHASED_PLAN.md @@ -8,7 +8,7 @@ If priorities change, you can reshuffle phases, but try to keep the “render fi --- -## Phase 0 — Repo + tooling foundation +## Phase 0 — Repo + tooling foundation ✅ **Deliverables** - Repository structure (`crates/`, `docs/`, `adrs/`) @@ -17,105 +17,105 @@ If priorities change, you can reshuffle phases, but try to keep the “render fi - Packaging skeletons (Arch PKGBUILD stub + Flatpak manifest stub) **Definition of done** -- `build` succeeds on Arch in a clean environment -- `run` launches an empty window without warnings -- `docs/` renders in your preferred markdown viewer +- `build` succeeds on Arch in 
a clean environment ✅
+- `run` launches an empty window without warnings ✅
+- `docs/` renders in your preferred markdown viewer ✅

 ---

-## Phase 1 — Full Viewer: open + display image
+## Phase 1 — Full Viewer: open + display image ✅

 **Core tasks**
 - Implement CLI parsing:
-  - `quickview `
-  - `quickview --help`
+  - `quickview ` ✅
+  - `quickview --help` ✅
 - Load and display image:
-  - decode to a texture
-  - show in a viewer widget
-- Fit-to-window baseline
+  - decode to a texture ✅
+  - show in a viewer widget ✅
+- Fit-to-window baseline ✅
 - Keyboard shortcuts:
-  - `Esc` closes window
-  - `+/-` zoom (or Ctrl+scroll)
-- Basic UI shell with libadwaita (headerbar, etc.)
+  - `Esc` closes window ✅
+  - `+/-` zoom (or Ctrl+scroll) ✅
+- Basic UI shell with libadwaita (headerbar, etc.) ✅

 **Definition of done**
-- Opening a PNG/JPG renders correctly
-- No UI freezes during decode (decode is async or sufficiently fast)
-- *(Zoom and pan deferred — see Phase 5 or later)*
+- Opening a PNG/JPG renders correctly ✅
+- No UI freezes during decode (decode is async or sufficiently fast) ✅
+- Zoom and pan: Ctrl+scroll, pinch-to-zoom, +/- keys, middle-drag pan ✅

 ---

-## Phase 2 — Directory navigation + info panel
+## Phase 2 — Directory navigation + info panel (partially done)

 **Core tasks**
-- Identify “image set” as all supported images in the same directory
-- Maintain a sorted list and current index
+- Identify “image set” as all supported images in the same directory ✅
+- Maintain a sorted list and current index ✅
 - Add navigation:
-  - Left/Right arrows to prev/next
+  - Left/Right arrows to prev/next ✅
 - Add info panel:
   - filename
   - dimensions
   - file size

 **Definition of done**
-- Prev/next navigation is correct and stable
+- Prev/next navigation is correct and stable ✅
 - Info updates immediately when switching images

 ---

-## Phase 3 — Quick Preview mode (borderless overlay)
+## Phase 3 — Quick Preview mode (borderless overlay) ✅

 **Core tasks**
 - Add a `--quick-preview` mode:
-  - borderless
-  - centered
-  - dismiss on Space/Esc
-- Implement “always-on-top” behavior
+  - borderless ✅
+  - centered ✅
+  - dismiss on Space/Esc ✅
+- Implement “always-on-top” behavior ✅
 - If available, integrate Layer Shell (wlroots-friendly overlay):
-  - runtime detect if Layer Shell is supported
-  - use overlay layer with appropriate keyboard focus policy
+  - runtime detect if Layer Shell is supported ✅
+  - use overlay layer with appropriate keyboard focus policy ✅

 **Definition of done**
-- `quickview --quick-preview ` shows a borderless preview and closes instantly on Space/Esc
-- Works on at least one wlroots compositor
+- `quickview --quick-preview ` shows a borderless preview and closes instantly on Space/Esc ✅
+- Works on at least one wlroots compositor ✅

 ---

-## Phase 4 — OCR pipeline integration (async)
+## Phase 4 — OCR pipeline integration (async) ✅

 **Core tasks**
-- Add OCR backend abstraction (interface/trait)
+- Add OCR backend abstraction (interface/trait) ✅
 - Implement default Tesseract backend:
-  - run OCR asynchronously
-  - produce word-level boxes + text
-- Add a non-blocking “OCR in progress” indicator
-- Ensure cancellation / ignoring late results when user navigates away
+  - run OCR asynchronously ✅
+  - produce word-level boxes + text ✅
+- Add a non-blocking “OCR in progress” indicator ✅
+- Ensure cancellation / ignoring late results when user navigates away ✅

 **Definition of done**
-- OCR starts after image display
-- OCR completion adds internal OCR result state (even before selection UI exists)
-- App stays responsive during OCR
+- OCR starts after image display ✅
+- OCR completion adds internal OCR result state (even before selection UI exists) ✅
+- App stays responsive during OCR ✅

 ---

-## Phase 5 — OCR overlay + text selection UX
+## Phase 5 — OCR overlay + text selection UX (partially done)

 **Core tasks**
-- Render OCR overlay (invisible by default or lightly highlighted on hover)
+- Render OCR overlay (invisible by default or lightly highlighted on hover) ✅
 - Implement drag-selection:
-  - compute selection rectangle in image coordinates
-  - highlight matched words
+  - compute selection rectangle in image coordinates ✅
+  - highlight matched words ✅
 - Implement copy:
-  - Ctrl+C copies selected text
+  - Ctrl+C copies selected text ✅
   - context menu action “Copy”

 **Definition of done**
-- User can reliably select and copy text from an image
-- Selection stays aligned under zoom/pan
+- User can reliably select and copy text from an image ✅
+- Selection stays aligned under zoom/pan ✅

 ---

-## Phase 6 — Integration polish
+## Phase 6 — Integration polish (not started)

 **Core tasks**
 - `.desktop` integration:
@@ -130,7 +130,7 @@ If priorities change, you can reshuffle phases, but try to keep the “render fi

 ---

-## Phase 7 — Hardening + performance
+## Phase 7 — Hardening + performance (not started)

 **Core tasks**
 - Add cache (in-memory first)
@@ -148,7 +148,7 @@ If priorities change, you can reshuffle phases, but try to keep the “render fi

 ---

-## Phase 8 — Nice-to-haves / future roadmap
+## Phase 8 — Nice-to-haves / future roadmap (not started)

 - Persistent OCR cache (SQLite)
 - Better layout/reading-order reconstruction
diff --git a/packaging/arch/PKGBUILD b/packaging/arch/PKGBUILD
index c7ce8a1..70b38ff 100644
--- a/packaging/arch/PKGBUILD
+++ b/packaging/arch/PKGBUILD
@@ -9,7 +9,7 @@ url="https://github.com/Green2Grey2/QuickView"
 license=("MIT")

 depends=(
-  "gtk4"
+  "gtk4>=4.10"
   "libadwaita"
   "tesseract"
   "gtk4-layer-shell"
diff --git a/scripts/bootstrap_arch.sh b/scripts/bootstrap_arch.sh
index 13092f7..018d3bb 100755
--- a/scripts/bootstrap_arch.sh
+++ b/scripts/bootstrap_arch.sh
@@ -7,6 +7,12 @@ sudo pacman -S --needed \
   tesseract tesseract-data-eng \
   gtk4-layer-shell

+# Enforce minimum GTK version required by the UI code (Snapshot/GSK APIs).
+if ! pkg-config --atleast-version=4.10 gtk4; then
+  echo "Error: GTK4 >= 4.10 is required. Found: $(pkg-config --modversion gtk4)" >&2
+  exit 1
+fi
+
 # Optional:
 # sudo pacman -S --needed wl-clipboard
 # sudo pacman -S --needed glycin glycin-gtk4
diff --git a/templates/PKGBUILD.stub b/templates/PKGBUILD.stub
index b8b191b..2a03ebb 100644
--- a/templates/PKGBUILD.stub
+++ b/templates/PKGBUILD.stub
@@ -7,7 +7,7 @@ pkgdesc="Wayland image viewer with OCR text selection"
 arch=("x86_64")
 url="https://github.com/Green2Grey2/QuickView"
 license=("MIT")
-depends=("gtk4" "libadwaita" "tesseract")
+depends=("gtk4>=4.10" "libadwaita" "tesseract")
 # Optional recommended deps:
 # depends+=("gtk4-layer-shell")
 # depends+=("glycin") # if packaged; otherwise vendor/build