
Is It Live or Is It AR?


Photo: David Stuart; Retouching: Smalldog Imageworks


By Jay David Bolter and Blair MacIntyre

By blending digital creations with our view of the world, augmented reality is set to transform the way we entertain and educate ourselves

There are two ways to tell the tale of one Sarah K. Dye, who lived through the Union Army's siege of Atlanta in the summer of 1864. One is to set up a plaque that narrates how she lost her infant son to disease and carried his body through Union lines during an artillery exchange, to reach Oakland Cemetery and bury him there.

The other is to show her doing it.

You'd be in the cemetery, just as it is today, but it would be overlaid with the sounds and sights of long ago. A headset as comfortable and fashionable as sunglasses would use tiny lasers to paint high-definition images on your retina—virtual images that would blend seamlessly with those from your surroundings. [Editor's Note: That's Microvision.] If you timed things perfectly by coming at twilight, you'd see flashes from the Union artillery on the horizon and a moment later hear shells flying overhead. Dye's shadowy figure would steal across the cemetery in perfect alignment with the ground, because the headset's differential GPS, combined with inertial and optical systems, would determine your position to within millimeters and the angle of your view to within arc seconds.

That absorbing way of telling a story is called augmented reality, or AR. It promises to transform the way we perceive our world, much as hyperlinks and browsers have already begun to change the way we read. Today we can click on hyperlinks in text to open new vistas of print, audio, and video media. A decade from now—if the technical problems can be solved—we will be able to use marked objects in our physical environment to guide us through rich, vivid, and gripping worlds of historical information and experience.

The technology is not yet able to show Dye in action. Even so, there is quite a lot we can do with the tools at our disposal. As with any new medium, there are ways not only of covering weaknesses but even of turning them into strengths—motion pictures can break free of linear narration with flashbacks; radio can use background noises, such as the sound of the whistling wind, to rivet the listener's attention.

Along with our students, we are now trying to pull off such tricks in our project at the Oakland Cemetery in Atlanta. For the past six years, we have held classes in AR design at the Georgia Institute of Technology, and for the past three we have asked our students to explore the history and drama of the site. We have distilled many ideas generated in our classes to create a prototype called the Voices of Oakland, an audio-only tour in which the visitor walks among the graves and meets three figures in Atlanta's history. By using professional actors to play the ghosts and by integrating some dramatic sound effects (gunshots and explosions during the Civil War vignettes), we made the tour engaging while keeping the visitors' attention focused on the surrounding physical space.

We hope to be able to enhance the tour, not only by adding visual effects but also by extending its range to neighboring sites, indoors and out. After you've relived scenes of departed characters in the cemetery, you might stroll along Auburn Avenue and enter the former site of the Ebenezer Baptist Church. Inside, embedded GPS transceivers would allow the GPS to continue tracking you, even as you viewed a virtual Reverend Martin Luther King Jr. delivering a sermon to a virtual congregation, re-creating what actually happened on that spot in the 1960s. Whole chapters of the history of Atlanta, from the Civil War to the civil rights era, could be presented that way, as interactive tours and virtual dramas. Even the most fidgety student probably would not get bored.

By telling the story in situ, AR can build on the aura of the cemetery—its importance as a place and its role in the Civil War. The technology could be used to stage dramatic experiences in historic sites and homes in cities throughout the world. Tourists could visit the beaches at Normandy and watch the Allies invade France. One might even observe Alexander Graham Bell spilling battery acid and making the world's first telephone call: “Mr. Watson, come here.”

The first, relatively rudimentary forms of AR technology are already being used in a few prosaic but important practical applications. Airline and auto mechanics have tested prototypes that give visual guidance as they assemble complex wiring or make engine repairs, and doctors have used it to perform surgery on patients in other cities.

But those applications are just the beginning. AR will soon combine with various mobile devices to redefine how we approach the vast and growing repository of digital information now buzzing through the Internet. The shift is coming about in part because of the development of technologies that free us from our desks and allow us to interact with digital information without a keyboard. But it is also the result of a change in attitude, broadening the sense of what computers are and what they can do.

We are already seeing how computers integrate artificially manipulated data into a variety of workaday activities, splicing the human sensory system into abstract representations of such specialized and time-critical tasks as air traffic control. We have also seen computers become a medium for art and entertainment. Now we will use them to knit together Web art, entertainment, work, and daily life.

Think of digitally modified reality as a piece of a continuum that begins on one end with the naked perception of the world around us. From there it extends through two stages of "mixed reality" (MR). In the first one, the physical world is like the main course and the virtual world the condiment—as in our AR enhancement of the Oakland Cemetery. In the other stage of MR, the virtual imagery takes the spotlight. Finally, at the far end of the continuum lies nothing but digitally produced images and sounds, the world of virtual reality.

Any AR system needs three components: computer-modeled sights and sounds to meld with physical reality, a display system, and a method for determining the user's viewpoint. Each of the three components presents problems. Here we will consider only the visual elements, as they are by far the most challenging to coordinate with real objects.

The ability to model graphics objects rapidly in three dimensions continues to improve because the consumer market for games—a US $30-billion-a-year industry worldwide—demands it. The challenge that remains is to deliver the graphics to the user's eyes in perfect harmony with images of the real world. It's no mean feat.

The best-known solution uses a laser to draw images on the user's retina. [Editor's Note: That's Microvision.] There is increasing evidence that such a virtual retinal display can be done safely [see "In the Eye of the Beholder," IEEE Spectrum, May 2004]. However, the technology is not yet capable of delivering the realistically merged imagery described here. In the meantime, other kinds of visual systems are being developed and refined.

Most AR systems use head-worn displays that allow the wearer to look around and see the augmentations everywhere. In one approach, the graphics are projected onto a small transparent screen through which the viewer sees the physical world. This technology is called an optical see-through display. In another approach, the system integrates digital graphics with real-world images from a video camera, then presents the composite image to the user's eyes; it's known as a video-mixed display. The latter approach is basically the same one used to augment live television broadcasts—for example, to point out the first-down line on the field during a football game [see "All in the Game," Spectrum, November 2003].
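
The difference between the two display approaches can be made concrete with a little per-pixel arithmetic. The sketch below is illustrative only and is not taken from any system described in the article. It assumes pixel intensities normalized to the range 0 to 1, models an optical see-through display as simply adding the graphic's light to the light from the scene, and models a video-mixed display as ordinary alpha blending over the camera frame. The function names and sample values are hypothetical.

```python
def optical_see_through(scene, graphic):
    """Light from the graphic is added to light from the real scene.

    Assumption: a simple additive combiner, clipped at full brightness.
    """
    return min(1.0, scene + graphic)


def video_mixed(scene, graphic, alpha):
    """The graphic is blended over the camera image of the scene.

    alpha = 0.0 leaves the camera pixel untouched; alpha = 1.0 replaces it.
    """
    return alpha * graphic + (1.0 - alpha) * scene


bright_scene = 0.8   # hypothetical sunlit background pixel
dark_graphic = 0.1   # hypothetical dark virtual object

# Additive model: the result can never be darker than the scene itself.
see_through_pixel = optical_see_through(bright_scene, dark_graphic)

# Alpha blending: a fully opaque graphic replaces the scene pixel.
mixed_pixel = video_mixed(bright_scene, dark_graphic, alpha=1.0)
```

Under these assumptions, a dark virtual object is washed out by a bright background on a see-through display, while a video-mixed display can cover the background completely.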



PARIS, ENHANCED: Nokia's prototype mobile AR system couples a camera, a cellphone, GPS, accelerometers, and a compass to follow the user through a city and point out all the sights.

Some of the most compelling work uses mobile phones to combine Internet-based applications with the physical and social spaces of cities. Many such projects exploit the phone's GPS capabilities to let the device act as a navigational beacon. The positional information might let the phone's holder be tracked in cyberspace, or it might be used to let the person see, on the phone's little screen, imagery relevant to the location.

Meanwhile, new phones are coming along with processors and graphics chips as powerful as those in the personal computers that created the first AR prototypes a decade ago. Such phones will be able to blend images from their cameras with sophisticated 3-D graphics and display them on their small screens at rates approaching 30 frames per second. That's good enough to offer a portal into a world overlaid with media. A visitor to the Oakland Cemetery could point the phone's video camera at a grave (affixed with a marker, called a fiducial) and, on the phone's screen, see a ghost standing at the appropriate position next to the grave.

Video and computer games have been the leading digital entertainment technology for many years. Until recently, however, the games were entirely screen-based. Now they, too, are climbing through mobile devices and into the physical environment around us, as in an AR fishing game called Bragfish, which our students have created in the past year. Players peer into the handheld screens of game devices and work the controls, steering their boats and casting their lines to catch virtual fish that appear to float just above the tabletop. They see a shared pond, and each other's boats, but they see only the fish that are near enough to their own boats for their characters to detect.
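
The visibility rule in that description, a shared pond in which each player sees only the fish near his or her own boat, can be sketched as a simple distance filter. This is a guess at the idea rather than Bragfish's actual code: the coordinates, the detection radius, and all of the names are hypothetical.

```python
import math


def visible_fish(boat, fish_positions, detect_radius):
    """Return the fish within detect_radius of a boat.

    boat and each fish are (x, y) positions on the shared pond.
    """
    return [
        fish for fish in fish_positions
        if math.dist(boat, fish) <= detect_radius
    ]


# Hypothetical shared state: every player sees the same pond and boats.
fish = [(1.0, 1.0), (4.0, 4.0), (9.0, 2.0)]
boats = {"player_a": (0.0, 0.0), "player_b": (8.0, 3.0)}

# Each player's screen shows only the fish near that player's own boat.
views = {name: visible_fish(pos, fish, detect_radius=2.0)
         for name, pos in boats.items()}
```

With these sample values, each player would see a different single fish, and the fish in the middle of the pond would be hidden from both.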

We can imagine all sorts of casual games for children and even for adults in which virtual figures and objects interact with surfaces and spaces of our physical environment. Such games will leave no lasting marks on the places they are played. But people will be able to use AR technology to record and recall moments of social and personal engagement. Just as they now go to Google Maps to mark the positions of their homes, their offices, their vacations, and other important places in their lives, people will one day be able to annotate their AR experience at the Oakland Cemetery and then post the files on something akin to Flickr and other social-networking sites. One can imagine how people will produce AR home movies based on visits to historic sites.

Ever more sophisticated games, historic tours, and AR social experiences will come as the technology advances. We represent the possibilities in the form of a pyramid, with the simplest mobile systems at its base and fully immersive AR on top. Each successive level of technology enables more ambitious designs, but with a smaller potential population of users. In the future, however, advanced mobile phones will become increasingly widespread, the pyramid will flatten out, and more users will have access to richer augmented experiences.

Fully immersive AR, the goal with which we began, may one day be an expected feature of visits to historic sites, museums, and theme parks, just as human-guided tours are today. AR glasses and tracking devices will one day be rugged enough and inexpensive enough to be lent to visitors, as CD players are today. But it seems unlikely that the majority of visitors will buy AR glasses for general use as they buy cellphones today; fully immersive AR will long remain a niche technology. [Editor's Note: I'll disagree with this last contention; there's no reason why cool AR glasses could not be a mass market phenomenon, akin to Bluetooth earbuds, etc -- most especially given all these cool applications described in this article.]

On the other hand, increasingly ubiquitous mobile technology will usher in an era of mixed reality in which people look at an augmented version of the world through a handheld screen. [Editor's Note: Again, why use a handheld screen if you have cool AR glasses?] You may well pull information off the Web while walking through the Oakland Cemetery or along Auburn Avenue, sharing your thoughts as well as the ambient sounds and views with friends anywhere in the world.

At the beginning of the 20th century, when Kodak first sold personal cameras in the tens of thousands, the idea was to build a sort of mixed reality that blended the personal with the historic (“Here I am at the Eiffel Tower”) or to record personal history (“Here's the bride cutting the cake”). AR will put us in a kind of alternative history in which we can live through a historic moment—the Battle of Gettysburg, say, or the “I have a dream” speech—in a sense making it part of our personal histories.

Mobile mixed reality will call forth new media forms that skillfully combine the present and the past, historical fact and its interpretation, entertainment and learning. AR and mobile technology have the potential to make the world into a stage on which we can be the actors, participating in history as drama or simply playing a game in the space before us.

Comments

  1. Fascinating article. But I have to disagree with the following: "... it seems unlikely that the majority of visitors will buy AR glasses for general use as they buy cellphones today; fully immersive AR will long remain a niche technology."

    People will buy it for the hands-free functionality and the 3D images (this latter aspect will blow away even the most jaded technophile). It will go from a curiosity to a must-have item to a how-did-I-ever-live-without-it? product in a very short time.

    Once the technology matures (e.g., widespread high-speed wireless streaming) and the price comes down, everyone will get one. (My prediction: 2011 industry-wide introduction; one billion models sold by 2015). Is 8 years a long time? Is a billion users a niche market? I think not.

    It would be great to see Microvision really get the Eyewear universe rolling.

  2. Oh, yeah, one more comment/question:
    Ghostly see-through images would be okay for a cemetery (!), but how would you achieve total occlusion of just the images (not of the whole screen which is more like traditional VR)? This would bring new meaning to the term interactive.

  3. Hi Kevin,

    You and I are in agreement on this for sure (I've added some comments inline to the article).

    To your second question, these 'ghosts' or other synthetic entities can be made as translucent or opaque as necessary simply by amping their brightness level.

  4. "The best-known solution uses a laser to draw images on the user's retina. [Editor's Note: That's Microvision.] There is increasing evidence that such a virtual retinal display can be done safely [see "In the Eye of the Beholder," IEEE Spectrum, May 2004]. However, the technology is not yet capable of delivering the realistically merged imagery described here. In the meantime, other kinds of visual systems are being developed and refined."
    ___________________________________________________________



    "Microvision will be delivering beta units of our SD3000 HMD to the US Army at the end of this year for their evaluation. In May of 2007, we were awarded our second contract from the US Air Force to develop and deliver a see-through, full color eyewear display. We expect that development and delivery of this display will open up a lot of opportunities for this product capability in multiple vertical markets."

    Hi Ben,

    Looks like someone should contact Jay David Bolter and Blair MacIntyre to inform them of the development work going on "right now" at Microvision. Seems to me that you are far past the development stage with full color retinal displays via the HMD contract.

  5. Ben, "amping up the brightness level" would be fine for text, graphics, and light-colored objects. But what if the object you want to display is a DARK object? I.e. you'd see the lightsaber, but not Darth Maul holding it.

  6. For this reason alone, one needs non-see-through eyewear to enjoy a real 3D experience. There is a market for both. Neglecting one is a mistake.

  7. Ben, what is the expected power consumption when the ASIC for picoP is completed? The following is from flyonwallstreet.blogspot.com:
    Didn't they already meet the oem requirement of 1.5 watts????
    # posted by KD : 10:54 PM

    KD:

    Not even close.

    This time last year, they were at 9 watts.

    If they were at 1.5 watts, it would be in handsets, right now.

    Basically, MOT is waiting for the wattage to get to 2.5-3.

    If MVIS can accomplish this, by January, they would be way ahead of schedule, essentially assuring product launch in 2008.
    # posted by Broker A : 10:58 PM

  8. I just noticed that Broker A, aka Fly on Wall Street, just increased his position in Microvision. I guess he believes that getting the power issue resolved is not a problem.

  9. Why did AT say there were no gating issues? This is a big one. Someone is not telling the truth. AT or FLY?

  10. Ben, I remember you mentioning a while back that you were experimenting with using old-fashioned LCD technology (old-fashioned as in Gameboy era) to switch between occluded and not. How did that work out?

  11. "Why did AT say there were no gating issues? This is a big one. Someone is not telling the truth. AT or FLY?"

    I would assume that when they say they can bring the power down to 1.5 watts, the design has been run on computer models many times before they build a production model. It is not hard for engineers to prove what can be done before it is in fact produced. I would think this is the source of the discrepancy.

  12. This is just wishful thinking. It is about time AT came out with the truth. 3? 4? 5? With everything included?

  13. AT doesn't sound like the kind of person who indulges in wishful thinking a lot.
    He always sounded brutally honest to me.

  14. Where are the hard numbers? From 9 W to 1.5 W is not going to be easy. Especially if 9 W does not include the ASIC.

  15. "Where are the hard numbers? From 9 W to 1.5 W is not going to be easy. Especially if 9 W does not include the ASIC."

    I believe one of the reasons for developing the ASIC for picoP is to cut down on power consumption, as well as size.

  16. Let's give our shareholders some confidence in MVIS. Give us the power consumption numbers now. Can't be that big a secret.

  17. Just how do you measure the power consumption when the digital signals are processed by a PC?
