As a designer working in augmented reality (AR), one of the most difficult challenges I've grappled with is spatial in nature—figuring out how best to arrange my work in a three-dimensional scene. This challenge comes from knowing that whoever views my finished product may view it from a perspective I’d not intended them to.
When beginning to arrange elements of an AR experience, questions of ergonomics threaten to derail the process from the outset. Will the audience sit when they interact with my work? Will they stand? Kneel? Where will they center the experience? On a table? On the floor? On a wall?
This short list of questions should push even the most conservative designers into second-guessing themselves. After all, the answer to each will change how the audience views the designed experience. In effect, the requirements shift before we've even defined them.
So, how do we account for these variables when we cannot control them in the moment during which the audience views our work?
Because we cannot control the ergonomics of how our work is viewed, we require a flexible tool or technique for designing AR experiences that adapt to the audience, instead of the other way around.
Just as responsive web design adapts content to screen size, we need a framework for understanding how and when to adapt content to the ergonomics of our audience. We need a breakpoint mechanic based not on the size of the screen, but on the individual audience member holding it.
Without a clear standard or patterns for responsive AR content, designers may have to turn to black magic, like this.
The good news is that we have the technology to begin designing and building responsive AR experiences, today. To understand the mechanics of responsive AR, we’ll need to brush up on some concepts of human perception and the sense of sight.
Primer: binocular vs. monocular vision
As we all know, able-sighted humans operate with binocular vision—two eyes, two signals, one sense of sight. With mobile AR, we reduce our view of the world by looking through the camera lens, which presents a monocular view of the space around us. This, in turn, diminishes our ability to reason about the spatial relationship of the objects in our view.
Binocular vision enables those with it to determine proximity very well. As we look out upon the world, the signals going from both left and right eyes to our brain are different. This relative difference between the left and right signals is sufficient that our brains learn to translate it into a measure of the relative proximity of other entities to ourselves.
The disparity in left-eye/right-eye signals is inversely related to the relative perceived distance between the object and its viewer. Perceiving depth from this disparity is called stereopsis, and it is how we triangulate objects in space.
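One way to make the inverse relationship concrete is with a little geometry. The sketch below computes the angle at which the two eyes' lines of sight converge on an object straight ahead; that angle shrinks as the object moves away. The function name and the 63 mm eye separation are assumptions for illustration (a commonly cited adult average), not values from this post.

```python
import math

# Assumed eye separation in meters (roughly an adult average); illustrative only.
ASSUMED_EYE_SEPARATION_M = 0.063

def convergence_angle_deg(distance_m, eye_separation_m=ASSUMED_EYE_SEPARATION_M):
    """Angle between the two eyes' lines of sight to a point straight ahead."""
    return math.degrees(2 * math.atan(eye_separation_m / (2 * distance_m)))

# The angle falls off as the object moves farther from the viewer.
for distance_m in (0.5, 1.0, 2.0, 4.0):
    angle = convergence_angle_deg(distance_m)
    print(f"{distance_m} m -> {angle:.2f} degrees")
```

At arm's length the angle is several degrees; across a room it is a fraction of that, which is why nearby objects give our eyes so much more to work with.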
That's great and all if we have binocular vision but, again, with mobile AR we do not. By limiting ourselves to a monocular view, we lose binocular parallax, and with it much of our depth perception.
This isn’t much of a problem with objects of known size—we have no worry of forgetting the moon is large and sits far away. Regardless of what we know, however, our sensory systems still fall victim to optical illusions, which is how we end up with memes.
To further illustrate what happens when we lose parallax, consider two rubber spheres—both red and varying anywhere from the size of a marble to that of a baseball. Viewed through a monocular lens, it's quite possible to position them such that they appear the same size.
Without parallax, we can just put the larger sphere farther away from the observer and our job is done: observer fooled. Limited by monocular vision, it may be difficult, if not impossible, for our observer to make a definitive comparison between the two.
But with binocular vision, our observer has the upper hand and stereopsis provides the necessary information to quickly determine that the spheres do not sit at the same depth. Our observer would then conclude that one sphere is larger than the other.
So, tying this back to AR, what useful, general principle can we draw here? Simply that, limited by monocular vision, perceived size is a function of physical size and distance. This is at the core of responsive AR for mobile.
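To put rough numbers on that principle, here is a minimal sketch of the two-sphere example. It treats each sphere as a flat disc seen face-on, which is a close approximation at these distances. The function name, the diameters, and the viewing distance are all assumptions chosen for illustration.

```python
import math

def angular_size_deg(physical_size_m, distance_m):
    """Visual angle subtended by an object of the given width, seen face-on."""
    return math.degrees(2 * math.atan(physical_size_m / (2 * distance_m)))

# Approximate diameters in meters, assumed for illustration.
MARBLE_M = 0.016
BASEBALL_M = 0.074

marble_distance_m = 0.5
# Push the baseball back in proportion to its size so both subtend the same angle.
baseball_distance_m = marble_distance_m * BASEBALL_M / MARBLE_M

print(angular_size_deg(MARBLE_M, marble_distance_m))
print(angular_size_deg(BASEBALL_M, baseball_distance_m))
```

Both calls print the same angle, which is all a single camera lens has to go on: the size-to-distance ratio, not the size itself.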
Ergonomics and content placement
All AR experiences have, at their core, some notion of planes and anchors. Planes are flat surfaces on which content sits, and anchors are spatial markers relative to which content distance is measured. That we can only ever measure content’s position in space relative to something else (the anchor) is the basis for our second big clue in designing responsive AR experiences.
Let's say you are standing up, looking at an anchor that sits on the ground plane. Now imagine that there is a target floating in space, at some Y value relative to the anchor, such that your gaze makes a straight line through it to the anchor.
Without moving the anchor, you sit down. Would your gaze still pass through the target? By sitting down, you decrease the vertical distance from your eyes to the anchor. Because the target is positioned relative to the anchor, not your eyes, this effectively lifts the anchor, and the target with it, in your field of view. As a consequence, your gaze no longer passes through the target, but under it.
What if, instead of sitting down, we remain standing and lift the anchor up to a tabletop? Because the target is positioned relative to the anchor, we get the same result! The target’s Y value does not change, only the anchor’s position in space. Relative to your gaze, the perceptual effects of lifting the anchor and sitting down are the same.
This means that a tabletop experience viewed from a standing position will be (roughly) perceptually equal to a ground-plane experience viewed from a sitting position. With this we get our second core component of responsive AR: the perception of content is a product of ergonomics and content placement.
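The sit-versus-lift equivalence can be checked with a small side-view sketch. Everything here is a hypothetical setup: positions are (horizontal, vertical) pairs in meters, the eye heights are made up, and the table height is assumed to equal the drop in eye height when sitting, which is why the match comes out exact rather than rough.

```python
import math

def pitch_below_horizon_deg(eye, point):
    """Downward angle of the line of sight from eye to point; both are (x, y) in meters."""
    horizontal = abs(point[0] - eye[0])
    drop = eye[1] - point[1]
    return math.degrees(math.atan2(drop, horizontal))

def view(eye, anchor, target_offset):
    """Return (pitch to anchor, pitch to target); the target is placed relative to the anchor."""
    target = (anchor[0] + target_offset[0], anchor[1] + target_offset[1])
    return pitch_below_horizon_deg(eye, anchor), pitch_below_horizon_deg(eye, target)

TARGET_OFFSET = (-1.0, 0.8)  # halfway along the standing line of sight to the anchor
HEIGHT_CHANGE_M = 0.5        # assumed: both the sitting drop and the table height

standing_ground = view(eye=(0.0, 1.6), anchor=(2.0, 0.0), target_offset=TARGET_OFFSET)
sitting_ground = view(eye=(0.0, 1.6 - HEIGHT_CHANGE_M), anchor=(2.0, 0.0), target_offset=TARGET_OFFSET)
standing_table = view(eye=(0.0, 1.6), anchor=(2.0, HEIGHT_CHANGE_M), target_offset=TARGET_OFFSET)

print("standing, ground:  ", standing_ground)
print("sitting, ground:   ", sitting_ground)
print("standing, tabletop:", standing_table)
```

In the standing, ground case the two angles match, so the gaze passes through the target. In the other two cases the target needs a shallower angle than the anchor, so a gaze aimed at the anchor passes under it, and the two cases produce the same pair of angles.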
Distance and content scale in AR
Responsive design creates content that responds to different spatial contexts. For screen design, this means device and screen size. In AR, we need to consider the distance between the viewer and the AR content.
To begin to understand how responsive AR design could work, we need to draw some general conclusions about the ergonomics of viewing AR content:
- Standing increases the distance between the camera and an AR plane.
- Sitting decreases the distance between the camera and an AR plane.
- The camera is an extension of one’s body, as it moves with the body and tends to maintain some static offset from the body over time.
We can also draw general conclusions about the placement of AR content with respect to the viewpoint:
- Putting AR content on a ground plane increases its distance from the viewpoint.
- Putting AR content on a tabletop (or elevated) plane decreases its distance from the viewpoint.
(There are exceptions to these conclusions, but, for general purposes, these will take us quite a way.)
So, we have two dichotomies: we can stand or sit, and we can place content on the ground or on a tabletop. This gives us four major combinations of ergonomics and content placement: standing with content on the ground, standing with content on a tabletop, sitting with content on the ground, and sitting with content on a tabletop.
Taking what we’ve discussed up to this point, we can establish a graph that informs perceived content scale. It helps to assign numeric values to these dichotomies, so we can establish a “scale factor” of sorts.
Human anatomy creates more variance in the sit-stand dichotomy, so we should give it more weight in consideration. Taking the sum for each of the four major ergonomics-placement combinations, we can derive three concrete values of camera-content distance. Keep in mind the “scale factors” here are only relative to one another, not to any real-world metric:
- Up-close (-3), where the distance between the camera and content is the smallest.
- Middle-ground (+/- 1), where the distance between the camera and content is loosely neutral.
- Far-back (+3), where the distance between the camera and content is the greatest.
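The sums above can be sketched in a few lines. The individual weights are not stated in this post, so the ones below (±2 for sit/stand, ±1 for tabletop/ground) are an inference: they are the simplest values that give sit/stand more weight and still add up to -3, ±1, and +3. The names are hypothetical.

```python
# Inferred weights (an assumption): stance counts double because it varies more.
STANCE_WEIGHT = {"sit": -2, "stand": +2}
PLACEMENT_WEIGHT = {"tabletop": -1, "ground": +1}

def scale_factor(stance, placement):
    """Relative camera-content distance; larger means the content is farther away."""
    return STANCE_WEIGHT[stance] + PLACEMENT_WEIGHT[placement]

for stance in STANCE_WEIGHT:
    for placement in PLACEMENT_WEIGHT:
        print(f"{stance:>5} + {placement:<8} = {scale_factor(stance, placement):+d}")
```

Sitting at a tabletop lands at -3, standing over the ground at +3, and the two mixed combinations land at -1 and +1, which is why they collapse into a single middle-ground value.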
Flipping this on its head, and thinking about what effect each distance will have on our content, we can also say that:
- When viewed up-close (-3), the content will appear the largest.
- When viewed from the middle-ground (+/- 1), the same content's apparent size will fall in between, depending on context.
- When viewed from far-back (+3), the same content will appear the smallest.
We can use and treat these three camera-content distances just like breakpoints in mobile design, changing the content as our distance from it changes. To keep our content consistent at these three placements, we can make far-back content larger, up-close content smaller, and middle-ground content somewhere in between.
And so, we have the final core component of responsive AR: by generalizing ergonomics and AR placement mechanics, we can establish breakpoints that govern how this responsive content is rendered.
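As a rough illustration of the breakpoint idea, the sketch below maps a scale factor to one of the three named distances and then to a content multiplier. The function names and the multipliers (0.75, 1.0, 1.5) are placeholders invented for this example, not recommended values; the only thing taken from the text is the direction of the adjustment.

```python
# Hypothetical multipliers: shrink up-close content, enlarge far-back content.
CONTENT_SCALE = {"up-close": 0.75, "middle-ground": 1.0, "far-back": 1.5}

def breakpoint_for(scale_factor):
    """Map a relative scale factor (-3, -1, +1, +3) to a named breakpoint."""
    if scale_factor <= -3:
        return "up-close"
    if scale_factor >= 3:
        return "far-back"
    return "middle-ground"

def content_scale_for(scale_factor):
    """Multiplier to apply to the content so it reads consistently at this distance."""
    return CONTENT_SCALE[breakpoint_for(scale_factor)]

for factor in (-3, -1, 1, 3):
    print(factor, breakpoint_for(factor), content_scale_for(factor))
```

Keeping the mapping in a small table like this makes it easy to tune the multipliers per project without touching the breakpoint logic.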
The implications of responsive AR
This post has been framed around mobile AR, but the principles of responsive AR extend far beyond. Even with the binocular experience we get from Magic Leap or HoloLens, the way content is viewed and interacted with can only benefit from being responsive to its audience.
As we design and build for more and more people, of varying cultures, abilities, and interests, we must account for differences. By ignoring the benefits of responsive design, we risk cutting off segments of the population who might otherwise benefit from our output.
This is a first attempt to spark conversation about how we can use practical means to achieve responsive content in AR. Feedback, conversation, and additions are all welcome, without reservation. Join me on the Torch Friends Slack to continue the conversation. You can find me under the handle @blackmaas.
A responsive present for you
Keep an eye out for part two of this series on responsive AR design, where I'll show you the mechanics behind these ideas and how you can build them in Torch. We have also added a new template to Torch AR that will help you explore these responsive AR design concepts in a hands-on way by updating a responsive project with your own creative assets. Learn more about Torch templates with this guide.
Start exploring responsive content in Torch AR, today!