
Vision

Vision is the primary sense through which users interact with digital interfaces. Understanding how the human visual system works—its capabilities and limitations—is fundamental to designing effective, accessible, and usable interfaces.

Human vision is not uniform across the visual field. We have two distinct visual systems that work together:

Foveal vision (central vision):

  • Sharp, high-resolution vision in a narrow cone (~2° of visual angle)
  • Where you’re looking directly—the point of focus
  • Allows reading, recognizing faces, seeing fine details
  • Covers an area only about the size of your thumbnail held at arm’s length

Peripheral vision (ambient vision):

  • Lower resolution, covers the rest of your visual field (up to ~180° horizontally)
  • Excellent at detecting motion and changes
  • Poor at identifying details, colors, or text
  • Guides where to look next

Design implication: Users can only focus on a small area at once. Critical information must be large enough and positioned where users are looking. Peripheral vision can guide attention through motion or high contrast, but can’t process detail.
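The ~2° figure translates into surprisingly little screen real estate. A small worked sketch (the 600 mm viewing distance is an assumed typical desktop value, not from the source): the physical size subtended by a visual angle at distance d is 2 · d · tan(θ/2).

```typescript
// Physical size subtended by a visual angle at a given viewing distance:
//   size = 2 * d * tan(angle / 2)
function subtendedSizeMm(angleDeg: number, viewingDistanceMm: number): number {
  const halfAngleRad = (angleDeg / 2) * (Math.PI / 180);
  return 2 * viewingDistanceMm * Math.tan(halfAngleRad);
}

// At an assumed ~600 mm desktop viewing distance, the ~2-degree fovea
// covers only about 21 mm of screen, roughly a thumbnail's width.
const fovealWidthMm = subtendedSizeMm(2, 600);
```

At a phone held ~300 mm away the same 2° shrinks to roughly 10 mm, which is why touch targets and body text need generous sizing on small screens.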

Eyes don’t smoothly scan a page—they jump rapidly between fixation points.

Saccades: Rapid eye movements between fixation points (20-200ms)

  • Vision is suppressed during saccades (you don’t see the blur)
  • Occur 2-4 times per second during active viewing

Fixations: Brief pauses where eyes stay relatively still (200-600ms)

  • This is when we actually process visual information
  • Users fixate longer on unexpected, complex, or important content

Design implication: Users don’t see everything—they sample strategically. Visual hierarchy guides where eyes fixate. Important elements should be positioned at likely fixation points (top-left for LTR languages, headlines, first words of paragraphs).

Some visual features are processed instantly, before conscious attention, in parallel across the visual field. These features “pop out” and draw attention automatically.

Preattentive features:

  • Color: Hue, saturation, brightness differences
  • Form: Size, shape, orientation, spatial grouping
  • Motion: Moving elements stand out from static ones
  • Position: Vertical/horizontal alignment, 2D position

Design implication: Use preattentive features to:

  • Create visual hierarchy (size, color, position)
  • Guide attention (motion, bright colors)
  • Group related items (proximity, similar colors/shapes)
  • Signal important states (red for errors, green for success)

Limitation: Only one or two preattentive features work at once. Combining too many creates visual clutter and nothing stands out.
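The one-or-two-feature limit can even be checked mechanically. A minimal sketch (the element model and channel names are assumptions for illustration): count how many preattentive channels actually vary across a set of elements, and treat more than two as a clutter warning.

```typescript
// Assumed minimal element model: one field per preattentive channel.
interface UiElement {
  color: string;
  size: number;
  shape: string;
  animated: boolean;
}

// Return the preattentive channels that vary across the elements.
// If more than ~2 channels vary at once, nothing pops out.
function varyingChannels(elements: UiElement[]): string[] {
  const channels: (keyof UiElement)[] = ["color", "size", "shape", "animated"];
  return channels.filter(
    (ch) => new Set(elements.map((e) => e[ch])).size > 1
  );
}
```

For a row of buttons that differ only in color, `varyingChannels` returns `["color"]`: one channel, a clear pop-out. If color, size, and shape all vary at once, the three-element result is a hint to simplify.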

Our visual system actively organizes visual information into meaningful patterns. Gestalt principles describe how we group and structure visual elements.

Proximity: Elements close together are perceived as related.

Design application:

  • Group related form fields together
  • Separate unrelated sections with whitespace
  • Use consistent spacing within and between groups

Similarity: Elements that look alike are perceived as related or belonging to the same category.

Design application:

  • Style all primary buttons the same way
  • Use consistent icon styles
  • Apply the same text formatting to similar content types

Continuity: We perceive continuous forms even when they are interrupted.

Design application:

  • Align elements to invisible grid lines
  • Create visual flow with directional cues (arrows, lines)
  • Use progressive disclosure without breaking visual connection

Closure: We mentally complete incomplete shapes.

Design application:

  • Use outlined buttons/cards without fully enclosing them
  • Simplified icons that suggest rather than fully render forms
  • Reduce visual noise by implying rather than drawing every line

Figure-ground: We distinguish objects (figure) from their background (ground).

Design application:

  • Use contrast to separate interactive elements from background
  • Employ cards and elevation to create layered depth
  • Ensure sufficient contrast for text on backgrounds

Common fate: Elements moving in the same direction are perceived as related.

Design application:

  • Animate related items together
  • Use coordinated transitions for grouped actions
  • Maintain consistent animation timing for related elements

Human color vision comes from three types of cone cells in the retina, each sensitive to different wavelengths:

  • L-cones (long wavelength): Red-orange light (~560nm)
  • M-cones (medium wavelength): Green light (~530nm)
  • S-cones (short wavelength): Blue-violet light (~420nm)

All colors we see are combinations of these three channels. About 8% of males and 0.5% of females have a color vision deficiency (CVD) affecting one or more cone types.

Design implications:

  • Don’t rely on color alone to convey information (see Color Accessibility)
  • Ensure sufficient contrast between text and background (Legibility & Contrast)
  • Test with colorblind simulators to verify your designs work for all users
  • Use patterns, icons, and labels alongside color coding
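The last point can be encoded directly in a component’s data model. A sketch (the type and field names are illustrative, not from any particular library): make the redundant cues required fields so a status can never ship as color alone.

```typescript
// Each status carries a color *plus* required non-color cues,
// so users with CVD receive the same information.
interface StatusIndicator {
  color: string; // hex color, never the only carrier of meaning
  icon: string;  // redundant glyph
  label: string; // redundant text
}

const statuses: Record<"error" | "success", StatusIndicator> = {
  error:   { color: "#d32f2f", icon: "✕", label: "Error" },
  success: { color: "#2e7d32", icon: "✓", label: "Success" },
};
```

Because `icon` and `label` are non-optional, the type checker rejects any status defined by color alone.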

Contrast sensitivity—the ability to distinguish an object from its background—varies with spatial frequency:

  • Low spatial frequency (large patterns, broad shapes): Easy to see even at low contrast
  • High spatial frequency (fine details, small text): Requires high contrast

Design implication: Small text needs higher contrast ratios than large text (WCAG requires 4.5:1 for small text, 3:1 for large text).

Two kinds of contrast can separate elements:

  • Luminance contrast (brightness difference): Highly effective; works for all users, including those with CVD
  • Chromatic contrast (color difference): Less effective; often fails for users with color vision deficiencies

Design implication: Use luminance (lightness) contrast as your primary differentiator, not hue changes alone.
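The WCAG thresholds above can be checked programmatically. A sketch in TypeScript using the relative-luminance and contrast-ratio formulas from WCAG 2.x (the hex-parsing helper and function names are assumptions of this sketch):

```typescript
// sRGB channel (0-255) to linear light, per the WCAG 2.x formula.
function channelToLinear(c8: number): number {
  const s = c8 / 255;
  return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
}

// Relative luminance of a "#rrggbb" color.
function relativeLuminance(hex: string): number {
  const n = parseInt(hex.slice(1), 16);
  const [r, g, b] = [(n >> 16) & 0xff, (n >> 8) & 0xff, n & 0xff];
  return 0.2126 * channelToLinear(r) + 0.7152 * channelToLinear(g) + 0.0722 * channelToLinear(b);
}

// Contrast ratio (from 1:1 up to 21:1) between two colors.
function contrastRatio(a: string, b: string): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  const [hi, lo] = la >= lb ? [la, lb] : [lb, la];
  return (hi + 0.05) / (lo + 0.05);
}

// WCAG AA: 4.5:1 for small text, 3:1 for large text.
function passesAA(fg: string, bg: string, largeText = false): boolean {
  return contrastRatio(fg, bg) >= (largeText ? 3 : 4.5);
}
```

Black on white yields the maximum ratio of 21:1, while mid-gray #777777 on white comes out just under 4.5:1, which is why it fails AA for body text despite looking readable.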

Peripheral vision is highly sensitive to motion, which helped our ancestors detect threats. In digital interfaces:

Benefits of motion:

  • Draws attention to important changes
  • Provides feedback (loading spinners, transitions)
  • Shows relationships between elements
  • Creates sense of direct manipulation

Risks of motion:

  • Distracting and overwhelming if overused
  • Can trigger symptoms in users with vestibular disorders; flashing content can trigger seizures
  • Increases cognitive load
  • Makes interfaces feel slower if transitions are too long

Design guidelines:

  • Keep animations brief (200-400ms for most transitions)
  • Respect prefers-reduced-motion system setting
  • Don’t use flashing content (>3 flashes per second)
  • Make motion purposeful, not decorative
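The first two guidelines can be wired into a small helper. In a browser the preference comes from `window.matchMedia('(prefers-reduced-motion: reduce)').matches`; the sketch below takes that boolean as a parameter so it stays testable (the function name and clamping to the 200-400ms range are illustrative choices, not from the source):

```typescript
// Pick a transition duration that honors the user's motion preference.
// prefersReducedMotion would normally come from:
//   window.matchMedia("(prefers-reduced-motion: reduce)").matches
function transitionDurationMs(prefersReducedMotion: boolean, baseMs = 300): number {
  if (prefersReducedMotion) return 0; // skip the animation entirely
  // Keep transitions brief: clamp into the 200-400ms range.
  return Math.min(400, Math.max(200, baseMs));
}
```

A component would then pass the result to its animation API, so a user who has asked the OS for reduced motion gets an instant state change instead of a transition.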

A 2024 CVPR paper, “TransNeXt: Robust Foveal Visual Perception for Vision Transformers”, introduced Aggregated Attention, a biomimetic token mixer that simulates biological foveal vision and continuous eye movement. The design avoids depth degradation and achieves more natural visual perception.

A 2024 study comparing visual field characteristics in VR and the real world found that reaction times to peripheral visual stimuli are slower in VR than in real-world environments. The discrepancy is partly explained by hardware and software latency, and it has important implications for VR interface design.

Recent research also emphasizes studying how color and shape are perceived by peripheral vision on screens, to give designers guidelines that lead to more functional user interfaces.

Research in the last decade has shown that processing in peripheral and foveal vision is not independent, but is more directly connected than previously thought. There is a consensus that foveal and peripheral vision accomplish two opposing goals: foveal vision allows for maximal acuity and contrast sensitivity in a small region, whereas peripheral vision allows for a large field of view with lower resolution.

Practical takeaways:

  • Place critical information in the center of the viewport where users naturally look first
  • Use the F-pattern for text-heavy pages (users scan top-left, across, then down the left edge)
  • Create clear visual hierarchy using size, weight, color, and position
  • Align elements to invisible grid lines (continuity principle)
  • Limit preattentive features to 1-2 per screen (too many = nothing stands out)
  • Group related items with proximity and similarity
  • Use whitespace generously to separate unrelated content
  • Remove unnecessary visual elements that don’t serve a purpose
  • Use size and contrast for most important elements
  • Reserve motion for critical state changes only
  • Position primary actions where users expect them (bottom-right for forms, top-right for global actions)
  • Create visual flow with alignment and directional cues
  • Ensure sufficient contrast (4.5:1 minimum for text)
  • Don’t rely on color alone to convey meaning
  • Support keyboard navigation with visible focus indicators
  • Respect user preferences for reduced motion and increased contrast
  • Test with actual users who have visual differences

Foundational Work:

  • Ware, C. (2020) — Information Visualization: Perception for Design (4th ed.). Morgan Kaufmann
  • Treisman, A. (1985) — Preattentive processing in vision. Computer Vision, Graphics, and Image Processing
  • Palmer, S. E. (1999) — Vision Science: Photons to Phenomenology. MIT Press
  • Koffka, K. (1935) — Principles of Gestalt Psychology. Harcourt, Brace & World
