MakerLab Blog » reality
http://blog.makerlab.com
Go on, be curious

Augmentia Redux
http://blog.makerlab.com/2009/11/augmentia-redux/
Thu, 19 Nov 2009

Text of AR Presentation for Dorkbot San Francisco

Anselm Hook
http://github.com/anselm
http://twitter.com/anselm

Quick notes for tonite:

Tomorrow night @sidgabriel is going to do http://www.meetup.com/augmentedreality with the extended hyper-posse. Please join. If you really haven’t had enough after that, @ARDevCamp is on December 5th at the @hackerdojo. @ardevcamp is free but you MUST register here. If that isn’t enough for y’all then I really can’t help you – we tried really hard to crush your meme.

Also – more personally – if you hate this then you’ll probably also enjoy hating this earlier post on Augmentia and this post on the Aleph No-Op and this one might push you over the edge if those fail Bimimetic Signaling .

Here we go….:

Are you going in or are you trying to get out?

Augmented Reality Redux

Who else wants to or is playing with AR apps? I’m assuming that most people are at least familiar with Augmented Reality.

Recently I started exploring this myself and I wanted to share my experiences so far.

As I’ve been working through this  I’ve kind of ended up with a “Collection of Curiosities”. I’ll try to shout them out as I go as well.

My hope is to encourage the rest of you to make AR apps also – and go further than I’ve gone so far.

What is Augmented Reality?

If I had to try to define it – I’d say an Augmented Reality app is an enhanced interaction with everyday life. It takes advantage of super powers like computerized memory and data driven visualization to deliver new information to you just-in-time, ideally connected to and overlaid on top of what we already see, hear and feel.

Beyond this it can ideally watch our behavior and respond – not just be passive.

Observation #1: Of course there’s nothing really new here – our own memories act in a similar way. The fact that we can all read and decipher signs, symbols, signifiers and gestures means we are already operating at a tremendous advantage. As my friend Dav Yaginuma says, “a whole new language of gestural interaction may emerge”. Perhaps we’ll get used to seeing people gesticulating wildly on the streets at thin air – or even avoiding certain gestures ourselves because they will actually make things happen ( as Accelerando hints at with its Smart Air idea ). So what makes it interesting is the degree of change: it is like a phase transition between ice and water or between water and air, as Amber Case puts it.

What are some examples?

1) You could be bored – walking down the street and see that a nearby creek could use some help. So you spend an hour pulling garbage out of it. And somehow understand that that was doing some good.

2) You could be in an emergency situation, outside, and maybe it is starting to rain, and maybe you’ve lost track of your friends. An AR view could hint at which way to go without being too obtrusive.

3) You could walk into a bookstore, pick up a book, and have the jacket surrounded by comments from your friends about the book.

Observation #2: The placemarks and imagemarks in our reality are about to undergo the same politicization and ownership that already affects DNS and content. Creative Commons, the Electronic Frontier Foundation and other organizations try to protect our social commons. When an image becomes a kind of hyperlink there’s really a question of what it will resolve to. Will your heads up display of McDonalds show tasty treats at low prices, or will it show alternative nearby places where you can get a local, organic, healthy meal quickly? Clearly there’s about to be a huge ownership battle for the emerging imageDNS. Paige and I saw this when we built the ImageWiki last year and it must be an issue that people like SnapFish are already seeing.

My own work so far:

My own app is a work in progress and I’ve posted the source code on the net and you can use it if you want. Here’s what my first pass looked like about a week ago:

First of all I am being motivated by looking for ways to help people see nature more clearly. I’m concerned about our planet and where we’re going with it.  Also I’m a 3d video games developer so the whole idea of augmenting my real reality and not just a video game reality seemed very appealing.

My code has two parts: 1) a server side and 2) a client side.

My client side app:

On the client side I let you walk around and hold the phone up and decorate the world with 3d objects.

1) You can walk around and touch the screen and drop an object
2) You can turn around and drop another object
3) If you turn back and look through the lens you can see all the objects you dropped.
4) If you go somewhere else you can see objects other people dropped.

Here is a view of it so far :

View of my AR app work in progress


There were a lot of code hassles that I will go into below but basically loading and managing the art assets was the largest part of the work. Here is my dinosaur test model in my warehouse scene:

Dinny from some dude who posted it to Google Sketchup Warehouse

Observation #3: At first I was letting the participants place objects like birds and raccoons. But I ended up switching to gigantic 3d dinosaurs (just like the recently released Junaio app for iphone). The longitude and latitude precision of the iphone was so poor that I needed something really big that would really “work” in the shared 3d space. I suggest you design around that limitation for your projects too.

Oddly it was kind of coincidental but it shows the hyper velocity of this space – that something I was noodling on was actually launched live by somebody else before I could even finish playing around – hats off to Junaio:

Junaio's app

My server side app:

On the server side – I also started to pull in information from public sources to enhance the client view. I used two data sources – Twitter and Waze.

Geolocating some Twitter Data to help with an AR view.

Twitter has a lot of geo-locatable data and I started sucking that in and geo-locating it.

Curious observation #4: I found that I had to filter by a trust network because there is so much spam and noise there. So it shows how this issue of trust is still unsolved in general and how important social filters will be. Spam is annoying enough already – you sure don’t want it in a heads up display. Here’s a tip for AR developers – NEVER SHOW STARBUCKS!
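The trust filtering mentioned above could look roughly like the sketch below. This is not the code from my server; it assumes a hypothetical `follows` mapping (who follows whom) and a made-up `Post` record, and it simply keeps posts whose author is within a couple of hops of you in that graph.

```python
from dataclasses import dataclass


@dataclass
class Post:
    # Hypothetical record for a geo-located post; field names are illustrative.
    author: str
    text: str
    lat: float
    lon: float


def trusted_authors(follows, me, depth=2):
    """Walk the follow graph breadth-first from `me`, up to `depth` hops."""
    trusted = {me}
    frontier = {me}
    for _ in range(depth):
        # Everyone followed by the current frontier that we have not seen yet.
        frontier = {f for a in frontier for f in follows.get(a, ())} - trusted
        trusted |= frontier
    return trusted


def filter_posts(posts, follows, me, depth=2):
    """Keep only posts written by someone inside the trust network."""
    allowed = trusted_authors(follows, me, depth)
    return [p for p in posts if p.author in allowed]
```

The depth of 2 is an arbitrary choice for the sketch; a real filter would probably weight people rather than just include or exclude them.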

Also I started collecting data from Waze (which is an employer of mine). Waze does a real time crowd sourced car navigation solution for the iphone, with crowd sourced traffic accident reports for example. They don’t have a formal public API but they do publish anonymized information about where vehicles are. So I am now trying to paint contrails onto the map in real time to show a sense of life. I don’t have a screen shot of that one yet – but here’s the kind of data I want to show:

Waze data


Observation #5: Even twitter started to feel kind of not real time. So what was interesting here was to show literally real time histories and trajectories. It seems like this means animation, and drawing polygons and lines – and not just floating markers. I imagine AR as a true enhancement – not just billboards floating in space. It started to feel more video game like and I feel AR comes more from that heritage than from GIS.
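As a rough illustration of the contrail idea, here is a minimal sketch that turns a stream of position samples into one short polyline per vehicle. The tuple layout `(vehicle_id, timestamp, lat, lon)` and the `max_points` cutoff are assumptions made for the example; they are not the Waze data format.

```python
from collections import defaultdict


def build_contrails(samples, max_points=20):
    """Group position samples by vehicle and return recent trails.

    `samples` is an iterable of (vehicle_id, timestamp, lat, lon) tuples
    (a hypothetical layout). The result maps each vehicle id to a list of
    (lat, lon) points in time order, keeping only the newest `max_points`.
    """
    by_vehicle = defaultdict(list)
    for vehicle_id, timestamp, lat, lon in samples:
        by_vehicle[vehicle_id].append((timestamp, lat, lon))

    trails = {}
    for vehicle_id, points in by_vehicle.items():
        points.sort(key=lambda p: p[0])  # oldest first
        trails[vehicle_id] = [(lat, lon) for _, lat, lon in points[-max_points:]]
    return trails
```

Each trail can then be drawn as a line strip, with older points faded out, instead of a single floating marker.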

Overall Work Impressions

When I was at the Banff Centre earlier this summer a pair of grad students had an AR golf game. The game did not know about the real world and so sometimes you’d have to climb embankments or even fences to get at the golf ball if you hit it into the wrong area. This area is VERY rugged – their game was sometimes VERY hard to play:

Banff Centre

What was surprising to me is the degree of perversity and the kind of psycho-geography feel that the overlay creates – the tension is good and bad – you do things you would never do – the real world is like a kind of parkour experience.

In my project I had some similar issues. I had to modify my code so that I could move the center of the world to wherever I was. If I was working in Palo Alto, and then went to work in SF – all of my objects would still be in Palo Alto – so that was really inconvenient – and I would have to override their locations. I just found that weird.
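One way to think about moving the center of the world is to keep every object as latitude and longitude and only convert to local metres relative to a movable origin at draw time. The sketch below is an assumption about how that could be done, not my actual iPhone code: it uses a flat-earth approximation that is only reasonable over short distances, and `relocate` is a hypothetical helper for dragging test objects along with you.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius


def to_local_metres(lat, lon, origin_lat, origin_lon):
    """Return (east, north) offsets in metres from the origin.

    Flat-earth approximation: fine for a few kilometres, not for
    continent-sized distances.
    """
    d_lat = math.radians(lat - origin_lat)
    d_lon = math.radians(lon - origin_lon)
    north = d_lat * EARTH_RADIUS_M
    east = d_lon * EARTH_RADIUS_M * math.cos(math.radians(origin_lat))
    return east, north


def relocate(lat, lon, old_origin, new_origin):
    """Shift a stored position by the same offset the origin moved.

    Origins are (lat, lon) pairs. This is a crude degree-space shift,
    good enough for moving a test scene between two nearby cities.
    """
    return (lat + new_origin[0] - old_origin[0],
            lon + new_origin[1] - old_origin[1])
```

With something like this, a scene built around a Palo Alto origin can be tested in San Francisco by changing the origin rather than editing every object.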

I also found it weird how even facing a different direction in a room affected how long it took to fix an issue. Sometimes I wouldn’t see something, but it was just because it was behind me and I happened to be sitting a different way.
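A small check like the one below can tell you whether an object is in front of you or behind you, which is the question the radar view described later answers visually. It is a sketch under assumptions: the compass heading is taken in degrees clockwise from north, and the 60 degree field of view is a guess, not a measured value for any phone camera.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees
    clockwise from north, in the range [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_lon = math.radians(lon2 - lon1)
    y = math.sin(d_lon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(d_lon))
    return math.degrees(math.atan2(y, x)) % 360.0


def relative_bearing(heading, target_bearing):
    """Signed angle from the device heading to the target, in [-180, 180).
    Negative means the target is to the left, positive to the right."""
    return (target_bearing - heading + 180.0) % 360.0 - 180.0


def in_view(heading, target_bearing, fov_deg=60.0):
    """True if the target falls inside the (assumed) horizontal field of view."""
    return abs(relative_bearing(heading, target_bearing)) <= fov_deg / 2.0
```

Anything that fails `in_view` is a candidate for an edge-of-screen hint or a blip on a radar rather than a 3d object in the main view.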

Curious Observation #6: Making projects about the real world actually changes how you think. Normally if you are sitting at a desk and working, certain kinds of things seem important. But if you are actually standing and interacting and moving through real space, and trying to avoid cars and the like, then the things you think are important are very different. I think it is hard to think about an AR game or project if you are inside. And I think you have to playtest your ideas while actually outside, and try to remember them long enough to actually make a bit of progress, then go back outside and test them.

Implications redux

What’s funny about AR is that almost everybody is doing a piece of it – and now everybody has to talk. There are GIS people who think of AR as their turf. There are video game developers who think of AR as just a variation of a video game. There are environmentalists and privacy people and all kinds of people who are starting to really collide over the topic. All of the ambient and ubicomp people also see it as just a variation of what they’ve already been doing.

It’s also about action. At home in front of a computer you pretty much have most of what you need. In the real world you’re actually hungry, or bored or want to help on something – so it’s bringing that kind of energy – an Actionable Internet.

And spam in AR is really bad. Spam on the web or in email is mildly annoying – spam in AR is completely frustrating.

The crowd-sourcing group mind aspect is pretty interesting as it applies to the real world. It’s going to make it pretty hard to be a police officer – or at least those roles are going to have to change. I imagine that officers might even tell people where they’re placing radar guns so that people can help by being more cautious in those areas.

I also really like the idea of how it is a zero click interface – I think that’s really appealing. I use my phone when I am walking around all the time – but it can be frustrating being distracted. I kind of imagine the perfect AR interface that shows me one view only and only of what I really care about – and I think that’s probably where this is going. It is not like the web where you have many websites that you go to. I think it will be more just one view – and everything in that view ( even though those things come from different companies or people ) will have to interact with each other amicably. I’m really curious how that is going to be accomplished.

Also – I think it’s not just a novelty. As I was working through this I started to see what other people were doing more clearly. And I started to get the strong impression that folks like Apple and Google are not just aiming to provide better maps or better data but are actually aiming at a heads up display where local advertising is blended into the view. I get the sense that there’s a kind of hidden energy here to try and own what we think of as the “view” of our reality – so I expect the hype to actually get even bigger than it is now.

AR involves our actual bodies. In a video game you’re not at risk. Even the closest thing I can imagine – an online dating site – your profile can be anonymous. But if you’re dealing with an AR situation there are real risks; stalking, traps, even just hunger, boredom and exhaustion.

Conclusion of cursory comments

There is something about how it implicates our real bodies. I guess I don’t really know or understand yet but I am finding it fascinating. And I also find it much closer to my deep interests in the world, and in being outside, standing up and interacting with a rich rich information space rather than just a computer machine space. I appreciate computers but I also love the real world and so mixing the two seems like a good idea to me. If we are our world and our world is our skin then in a sense we’re starting to deeply scrawl on our skin. What could possibly go wrong?

I asked Google to find something that would indicate this idea of writing on skin – and this is what Google told me was going to happen. I’m sure it will be alright.

Just seemed appropriate at the time.

More Technical Code Parts and Progress

The entire work so far took me about a week and a half. This was in my spare time and it was quite a lot of late nights. I had just started to learn the iPhone and I am sure as the weeks progress this work will get much better much faster.

The overall arc was something like this:

0) NO INTERFACE BUILDER. The very first piece of code I wrote was to build iPhone UI elements without using their ridiculous interface builder. IB means that people who are building iPhone apps cannot even cut and paste code to each other but must send videos that indicate how things connect. The entire foundation of shared programming is based on being able to “talk” about work – and having to make or watch a video – at the slow pace of video time – is a real barrier. So for me my first goal was to programmatically exercise control of the iPhone. This took me several days and was before I started on the application proper. Here is one case where I would have largely preferred to use Android.

1) SENSORS. I wrote a bare bones OpenGL app without any artwork and read the sensors and was able to move around in an augmented reality space and drop boxes. This took me a couple of days – largely because I didn’t know Objective C or the iPhone. On the bright side I now know these environments much better than I ever wanted to.

2) ART PIPELINE. First I tried using OpenGL by hand but this became a serious limitation in terms of loading and managing art assets. I tried a variety of engines and after much grief settled on the SIO2 engine because it was open source and the least bad.  Once I could actually load up art assets into a world I felt like I had some real progress. This only took me a day but it was frustrating because I am used to tools that are of a much higher caliber coming from the professional games development world.

3) MULTIPLE OBJECT INSTANCING. One would imagine that a core key primary feature of a framework like SIO2 would be to allow multiple instancing of geometry – but it had no such feature – it was surprising. I largely struggled with my own inability to comprehend how they couldn’t see it as a highest priority. Some examples did do this and I cut and pasted code over and removed some of the application specific features. I still remained surprised and that ate up an entire two days to comprehend – I basically was looking for something that wasn’t built in…  still surprises me actually.

4) CAMERA POLISH AND IN HUD RADAR. I spent a few hours really polishing the camera physics with an ease in and ease out to the target for smoother movement, and making the camera position and orientation explicit ( rather than implicit in the matrix transform ). This was very important because then I was able to quickly throw up a radar view that gave me a heads up ( within the heads up ) of my broader field of view – and this helped me quickly correct small bugs in terms of where artifacts were being placed and the like.

5) SERVER SIDE. While working in Banff at the Banff Centre briefly this summer I had written a Locative Media Server. It itself was basically a migration of another application I had written – a twitter analytics engine called ‘Angel’. Locative is a stripped down generic engine designed to let you share places – in a fashion similar to bookmarking. I dusted off this code and tidied it up so that I could get a list of features near a location and so that I could post new features to the server. While crude and needing a lot of work – it was good enough to hold persistent state of the world so that the iPhone clients could share this state with each other in a persistent and durable way. Also, it is very satisfying to have a server side interface where I can see all posts, and can manage, edit, delete and otherwise curate the content. And as well, it was an unrelated goal of mine to have a client side for the server – and so I saw this as killing two birds with one stone. Now the currently running server instance is being used as the master instance for this app and it is logging all of the iphone activity. Once the server was up it became easy to drive the client.

http://locative.makerlab.org

http://angel.makerlab.org

6) IPHONE JSON. Another example of where the iPhone tool-chain sucks is just how many lines of code it takes to talk to a server and parse some JSON. I was astounded. It took me a few hours – mostly because I couldn’t believe how much of a hassle it was and kept looking for simpler solutions. I ended up using TouchJSON which was “good enough”. With this I was finally able to issue a save event from the phone up to the server and then fetch back all the nearby markers. This is a key for a shared Augmented Reality because we want all requests to be mediated by a server first. Every object has to have a unique identifier and all objects have to be shared between all instances – just like a video game. I’ve done this over and over ad nauseam for large commercial video games so I knew exactly what I wanted and once I fought past the annoying language and toolchain issues it pretty much worked first try. It does need improvements but those can come later on – I feel like it is important to exercise the entire flow before polishing.

7) TWITTER. I also fetched geo-located posts from twitter and added them as markers to the world as well. I found that this actually wasn’t that satisfying – I wanted something that showed more of an indication of real motion and time. This was something I had largely already written and enabling it in the pipeline was just a few minutes work.

8) WAZE. With my own server working I could now talk to third party engines and populate the client side. Waze (an employer of mine) graciously provided me with a public API method to access the anonymized list of drivers. Waze in general is a crowd sourced traffic navigation solution that is free for a variety of platforms such as iPhone and Android. I fetched this data and added it to the database and then I was able to deliver that to the client so that I could show contrails of driver behavior. This is still a work in progress but I’m hoping to publish this part as soon as I have permission to do so ( I am not sure I can release their data API URL yet ).

9) FUTURE. Other things I still need to do include a number of touch ups. The heading is not saved so all objects re-appear facing the same way. And I should deal with tilting the camera up and down. And I should let you post a name for your dinosaur via a dialog. Perhaps the biggest remaining thing is to actually SEE through the camera lens and show the real world behind. Also I would like to move towards providing an Augmented Reality OS where I can actually interact with things – not merely look at them.
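Steps 5 and 6 above describe a round trip: the phone saves an object, the server gives it a unique identifier, and any phone can then fetch the objects near it as JSON. The sketch below shows that flow in miniature. It is not the Locative server or its API; the `FeatureStore` class, the `lat`, `lon` and `id` field names and the in-memory dictionary are all assumptions made for illustration.

```python
import json
import math
import uuid

EARTH_RADIUS_M = 6371000.0  # mean Earth radius


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in metres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lam = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


class FeatureStore:
    """Hypothetical stand-in for the server: holds shared world state."""

    def __init__(self):
        self._features = {}  # id -> feature dict

    def post(self, payload):
        """Accept a JSON string from a client, assign a server-side id,
        and return the stored feature as JSON."""
        feature = json.loads(payload)
        feature["id"] = str(uuid.uuid4())
        self._features[feature["id"]] = feature
        return json.dumps(feature)

    def near(self, lat, lon, radius_m):
        """Return, as JSON, every feature within `radius_m` of the point."""
        hits = [f for f in self._features.values()
                if haversine_m(lat, lon, f["lat"], f["lon"]) <= radius_m]
        return json.dumps(hits)
```

Having the server assign the identifier, rather than the phone, keeps ids unique across all clients, which is the same pattern described above for shared video game state. A real server would also persist the features and index them spatially instead of scanning every one.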

Code Tool chains

I must say that the tools for developing 3d applications for the iPhone are quite poor by comparison to the kinds of tools I was used to having access to when I was doing commercial video games for Electronic Arts and other companies.

The kinds of issues that I was unimpressed with were:

1) The iPhone. XCode is a great programming environment and the fact that it is free, and of such high quality clearly does help the community succeed. There are many iPhone developers because the barriers to entry are so low. But at the same time Objective C itself is a stilted and verbose language and the libraries and tools provided for the iPhone are fairly simple. For people coming from the Javascript or Ruby on Rails world they won’t be that impressed by how many lines of code it takes to do what would be considered a fairly simple operation such as opening an HTTP socket, reading some JSON and decoding it.  What is one or two lines of code in Javascript or Ruby is two or three pages of code in Objective C.  For example here are some comments from people in the field on this and related topics:

http://kosmaczewski.net/2008/03/26/playing-with-http-libraries/

http://cocoadev.com/forums/comments.php?DiscussionID=259

2) Blender is the status quo for free 3D authoring. It’s actually quite a poor application. Its interface largely consists of memorizing hot-keys and there is a lot of hidden state in sub-dialogs. Perhaps the biggest weakness of Blender is that it doesn’t have an UNDO button – so it makes making mistakes very dangerous. I once spent an hour finally figuring out how to convert an object into a mesh and painting it and I accidentally deleted it and it was completely gone. Even things that claim to be working such as, oh, converting an object into a mesh, often as not simply do not work and do not provide any feedback as to why. It’s intriguingly difficult to find the name of an object or change it, or to see a list of objects, or import geometry or merge resource files. Features that are claimed such as importing KML seem to fail silently. There are many old and new tutorials that are delightfully conflicting – and often their own Wiki is offline or slow. It really does require going slowly, taking the time to read through it and memorizing the cheat sheets:

http://wiki.blender.org/index.php/Doc:Manual/3D_interaction/Navigating

http://www.cs.auckland.ac.nz/~jli023/opengl/blender3dtutorial.htm

http://download.blender.org/documentation/oldsite/oldsite.blender3d.org/142_Blender%20tutorial%20Append.html

http://download.blender.org/documentation/oldsite/oldsite.blender3d.org/177_Blender%20tutorial%20Game%20Textures.html

http://www.keyxl.com/aaac91e/403/Blender-keyboard-shortcuts.htm

Blender came out of researching how to get objects into the OpenGL on the iPhone – at the end of the day it was the only choice pretty much.

http://www.blumtnwerx.com/blog/2009/03/blender-to-pod-for-oolong/

http://iphonedevelopment.blogspot.com/2009/07/improved-blender-export.html

http://www.everita.com/lightwave-collada-and-opengles-on-the-iphone

3) SIO2. I first rolled my own OpenGL framework but managing the scene and geometry loading was too much hassle so I switched to an open source framework. SIO2 provides many tutorials and examples and at least it seems to compile, build and run and is reasonably fast. But it also has several limitations. The online documentation is infuriatingly empty – while they’re proud of having documentation for all classes and methods the documentation doesn’t actually say what things DO – it just describes the name of the function and nothing else. And many of the tutorials conflate several concepts together such as multiple instancing AND physics so that one is unsure of orthogonal independent features. Also the broken English throughout creates a significant learning curve. It is free but it needs a support community to come around it and to improve not the core engine but the support materials. SIO2 works closely with blender and has a custom exporter – some of the tutorials on youtube show how to use it ( the text documentation magically assumes these kinds of things however ). Although I hate watching video tutorials it is a pre-requisite to actually learning to work with the SIO2 pipeline. Overall now after a few days of bashing my head against it I can finally use it with reasonable confidence.

http://sio2interactive.com/SIO2_iPhone_3D_Game_Engine_Technology.html

http://sio2interactive.wikidot.com/sio2-tutorials-explained-resource-loading

4) Google Sketchup appears to be the best place to find models. I couldn’t actually even get Google Sketchup to install and run on my new MacBook Pro so I ended up using a friend’s computer to dig through the Google Sketchup Model Warehouse and import a model and then convert it to 3DS. I ended up having to use Google Sketchup PRO because I couldn’t find any other way to get models from KMZ into 3DS or some other format supported by Blender. The Blender Collada importer fails silently and the Blender KML importer fails silently. The only path was via 3DS.

http://sketchup.google.com/3dwarehouse/details?mid=f95321b35c2817bdaef005b7f8d10dde&prevstart=12

http://www.katsbits.com/htm/tutorials/sketchup_converting_import_kmz_blender.htm

5) Camera. One of my big goals which I have not hit yet is to see through the camera and show the real world. On the iPhone this is undocumented – but I do have some links which I will be researching next:

http://hkserv.ugent.be/boudewijn/blogs/?p=198

http://www.developerzone.com/links/search.html?query=domain%3Amorethantechnical.com

http://code.google.com/p/morethantechnical/

http://mobile-augmented-reality.blogspot.com/2009/11/iphone-camera-access-module-from.html

Technical Conclusions

My overall technical conclusion is that Android is probably going to be easier to develop applications for than the iPhone. I haven’t done any work there yet but the grass seems very green over there. The language choice of Java seems like it would offer a more pleasing, simple and straightforward grammar, and the access to lower level device data such as the raw camera frames seems like it would also help. Also since there are fewer restrictions in the Android online stores it seems easier to get an app out the door. As well it feels like there is more competition at the device level and that devices with better features are going to emerge more rapidly for the Android platform than for the iPhone. I think if I had tried to do this for the Android platform first it would have been doable in 2 to 3 days instead of the 7 or 8 days that I have ended up putting into it so far.

My overall general conclusion is that AR is going to be even more hyped than we see now but that it will be hindered by slow movement in hardware for at least another few years.

I do think that we’re likely to see real changes in how we interact with the world. I think today the best video that shows where I feel it is going is this one ( that isn’t even aimed at this community per se ) :

AICP Southwest Sponsor Reel by Corgan Media Lab from Corgan Media Lab on Vimeo.

Augmentia
http://blog.makerlab.com/2009/11/augmentia/
Tue, 03 Nov 2009

(Augmentia - with permission from DoctorWat)

“You can find anything at the Samaritaine” is this department store’s slogan. Yes, anything and even a panoramic view of all of Paris. All of Paris? Not quite. On the top floor of the main building a bluish ceramic panorama allows one, as they say, “to capture the city at a glance”. On a huge circular, slightly tilted table, engraved arrows point to Parisian landmarks drawn in perspective. Soon the attentive visitor is surprised: “But where’s the Pompidou Centre?”, “Where are the tree-covered hills that should be in the north-east?”, “What’s that skyscraper that’s not on the map?”. The ceramic panorama, put there in the 1930s by the Cognac-Jays, the founders of the department store, no longer corresponds to the stone and flesh landscape spread out before us. The legend no longer matches the pictures. Virtual Paris was detached from real Paris long ago. It’s time we updated our panoramas.”

The World is the Platform

Augmented Reality is going to make it possible for us to see through walls. It will remove some of the blindness that has crept up around our industrial landscape. But what is the “use” of this tool we’ve fashioned? And how will it even be implemented; how will many different app developers ever agree on what we see from a single window?

In a couple of weeks a bunch of us are going to get together to talk about this at ARDevCamp. But as a preamble to that I thought I’d share some of my own questions, thoughts and observations.

The hype has started to become real as William Hurley observes. Personally I blame Bruce Sterling but perhaps the iPhone 3GS and Android phones share some of the blame. This last week’s prime example should have been brought to us by companies like TomTom or Garmin given recent acquisitions. Instead (in what is clearly a longer term strategy) Google stopped licensing TeleAtlas in the USA and started providing their own higher quality interface and UI (and taking a bit of a stab at Apple at the same time not to mention the Open Street Maps community). The interface itself is shifting from a traditional top down cartographic orthodoxy to become more game-like; with street-view projections, heads-up-displays and zero-click interfaces. The hidden pressure underneath these moves may not be to just provide better maps but to provide a better higher fructose reality. A candy coated view that shows you just what you want just-in-time decorated with lots of local advertisements and other revenue catch basins. Cars and traffic reports are just the gateway.

In my mind this isn’t just hype but something relevant and important. Augmented Reality isn’t just an academic or even safe exercise. It connects in a very primal and critical way to who we are as humans. It’s not just an avatar in Second Life or a profile on OKCupid – it is us. It puts our own embodiment at risk. And whosoever can mitigate that risk while providing reward will probably do well. I believe that organizations such as Apple and Google see this and are pursuing not merely real-time, or hyper-local or crowd-sourced apps but ownership of the “view”. They want to own the foundation of the single consistent seamless way of presenting an understanding of the world. And as such it is about to become extremely competitive. Everybody wants a part of the lens of reality, the zero-click base layer beneath the beneath. As Gene Becker puts it “The World is the Platform”. And an ecosystem is starting to emerge.

Personally I’m trying to approach an understanding with praxis; balancing between time reading and time making. On the making side I’ve been writing an Augmented Reality app for the iPhone. For me that’s already a unique exercise. It’s the first time I’ve written code and then had to actually go outside into the real world to test it. On the thinking side, and coming from an environmental interest, and from a critical arts and technology perspective I’ve also been fascinated by how we understand and use Augmented Reality.

Collision of Forces

Like many new technologies Augmented Reality magnifies tensions between things that were normally separate.

In a sense it is the same dream that the social cartography community has had. This is the community that coalesced around Open Street Maps, Plazes, Where 2.0 and the idea of geo-tagging as a whole. This was a vision of a crowd-sourced bottom up community driven and community owned understanding of the world. It is a vision that failed in some ways. Yes we have nice free maps but we never did get to the point of being able to see our friends, or the contrails of where our friends had been, or really where the best nearby place to have a nap was. But now the idea is returning more forcefully and with more determination than ever.

It is also about an actionable Internet. There is a community that is rebelling against the morbidity of indoor culture and a largely passive media consumption centric lived experience. One that wants to decorate the world with verbs and actions – that wants to put knobs and levers on everything – or at least make those knobs and levers more visible. Diann Eisnor talks about Transactional Cartography – an idea of maps that are not passive – that don’t just show you where you can solve a problem – but that hear your request for help and call you back with solutions. Just imagine the kinds of trust and brokering negotiation infrastructure that this inevitable end game implies.

It is also about an ideal of noise filtering as a pure problem. There’s been a long and unsolved problem of building working trust networks on the web as a whole. Even aside from spam there are acres of rotting bits out there that will completely drown out any new view unless they are filtered for. Many social graph projects have failed to help filter the deluge of information that we are inundated with every day. When you can’t see the forest or the trees then this becomes a much higher priority to resolve.

It dredges up an amusingly disparate rag-tag collection of development communities who have been safely able to ignore each other. Suddenly game developers are arguing with GIS experts and having to unify their very different ways of describing mirror worlds. Self-styled Augmented Reality Consortiums are emerging with the proposition to define the next generation notational grammars by which we will share our views of reality.

It brings the ubiquitous computing and ambient sensor network people to the table. These are folks who had safely been hiding out in academia for the last decade doing exotic, beautiful and yet largely ignored projects.

It creates a huge pressure and demand for interaction designers to actually make sense of all this and make interfaces that are usable.

It draws a pointed stare towards the act of siloing and building moats around data. When your FourSquare cannot see your Twitter and when your Layar view can’t show the gigantic T-Rex stomping towards you … well people just aren’t going to put up with that anymore. What is needed is a kind of next generation Firefox or foundation technology that underpins and unifies these radically disparate realities.

It is going to take the idea of crowd-sourcing to a wildly energetic new level above where it is now. When your body is on the line the idea of real-time tactical awareness suddenly becomes much more important to everybody. When the SFPD can volunteer that they’re going to put a radar gun at a location, or when a driver can post about a car accident to the cars behind him or her – you start to involve a real time understanding that affects your quality of life in a visceral way. It’s almost the beginning of a group organism. Something that goes beyond merely flocking type of behaviors and becomes more like a shared nervous system. It’s an evolution of us as a species – and probably just in time as well given the kinds of environmental crises we are facing.

It takes the Apple ideals of interface to a new level. Instead of one click there are zero clicks; the interface becomes effortless. As Amber Case puts it interfaces move from being heavy and solid with big heavy buttons and knobs and rotary dials to becoming liquid and effortless like the dynamic UI of the iPhone to becoming like air itself. They become part of the background, ambient and everywhere, we breathe them and can see through them, the virtual pressure of these interfaces becomes like an information wind steering us around invisibly like toy boats on a lake.

It will connect us to the environment because everything actually is connected to the environment – we just manage to ignore this. Our natural environment underpins everything around us but we largely ignore it. There’s a feeling in the movement that things are constantly getting worse. That we’re losing more of Eden every day. We hear in the media about plastic oceans, carbon dioxide and the like. Derrick Jensen says “what would it take to live in a world where every year there were more salmon, and every year there were more birds overhead, and less concrete and more trees?” Paul Hawken talks about an idea of thousands of local organizations developing a local understanding of their region and each working in parallel over local issues. When people can see environmental issues around them, and connect those issues more simply to related economic issues then it will vitalize action.

It will do interesting things to national boundaries. When you can look through walls and see other kids who are exactly the same as you – clearly that will have some kind of impact. Either to humanize us or to make us carry an even greater burden of cognitive dissonance.

It even brings out that eternal question of what it means to be human. We’re so willingly embracing technology today it almost feels like a planet wide mania. Consider how One Laptop Per Child is challenged over whether it is the best and cheapest technology device for kids, but rarely over whether technology at all is the right thing. We give some kids augmentia while other kids pry precious metals out of old desktops while coughing out toxic smoke from nearby jury-rigged smelter operations.

As Sheldon Renan posits in his ‘theory of netness’ a sufficiently dense network exhibits an emergent behavior. A virtuous field is created that affects not only the participants in the network but everything around it, even things not directly connected to it. By way of allegory in the United States we used to back our currency with gold. At some point we left that backing because the illiquidity was a hindrance to velocity. Local area information is about to get a similar speed up and disconnection from its argumentative grounding. You won’t have to visit city records to see the hidden history of the homes around you or the supply chain behind a package of smarties. AR is in some ways like seeing the speculative sum of the Noosphere. Privileged information may become cheaper. Inflationary economies may take hold. But by making hidden things visible, and visible things cheap, it will make other things possible that we don’t entirely realize yet.

Historical Perspective

In 1997 I co-founded Virtual Games Inc. We were a specialized 12 person venture funded games co focusing on real-time immersive many-participant shared experiences. You could put on a VR helmet and run around in our game worlds and interact with other players ( usually by shooting them unfortunately ).

Back then the relatively moderate performance of 3d rendering hardware made it difficult to keep up with the rapid head movements of the players. The lag between moving your head and seeing the 3d display repainted could make you nauseous. Today the average video game machine such as the WII, XBox or Playstation II can paint around 100,000 lit shaded polygons at 60 frames a second but back then home computers were much like the mobile devices today; capable of only very limited 3d performance.

The biggest challenge we faced wasn’t hardware however. Rather it was simply knowing where to start; how to define the topic as a whole. We had very few examples. Issues such as User Interface controls that could be used while moving, having a Heads Up Display, having a radar view, or decorating the VR world with visibly striking markers – these were all fairly novel ideas. We didn’t have a design grammar for representing the objects, their relationships and how they behaved.

Today many of the same issues are occurring again with Augmented Reality. The synchronization and registration between the movement of the real world and the digital overlay can feel like being on a ship at sea. Presenting complex many polygon animated geometries that interact with the users is still a challenge – especially on mobile devices where the camera is fairly dumb and the computational power limited. Making a publishable data representation of an avatar or interactive digital agent is in and of itself a significant challenge. There are fundamentally new ways of interacting that still haven’t been very well defined. The Augmented Reality Operating System has yet to be invented.

Now as a result there are fervent discussions about how to describe, publish, share and run an Augmented Reality world.  People are trying to design an ARML ( Augmented Reality Markup Language ) much like occurred years ago over VRML ( Virtual Reality Markup Language ). But the whole space still lacks the cognitive short-hand and the usability expertise that characterizes web development today.

AROS

“For instance, do you see this chunk of land, washed on one side by the ocean? Look, it’s filled with fire. A war has started there. If you look closer you’ll see the details.” Margarita leaned towards the globe and saw the little square of land spread out, get painted in many colours, and turn as it were into a relief map. And then she saw the little ribbon of a river, and some village near it. A little house the size of a pea grew and became the size of a matchbox. Suddenly and noiselessly the roof of this house collapsed, so that nothing was left of the little two-storey box except a small heap with black smoke pouring from it. Bringing her eye still closer, Margarita made out a small female figure lying on the ground, and next to her, in a pool of blood, a little child with outstretched arms. “That’s it,” Woland said, smiling, “he had no time to sin. Abaddon’s work is impeccable.”

Building the technology for a next generation OS is going to be challenging.

There will need to be some kind of way of publishing AR objects onto the Internet. This description will have to describe what an AR object would like to be presented as. That means its geometry, as described by a series of polygons or mathematical surfaces, along with its texture, appearance, lighting and animation. Often appearance is tied to underlying functionality and a description of the behavior of the object needs to be shipped as well. Some of this behavior is gratuitous; eye-candy for the viewer, and some is utilitarian, actual work that the object may do for you. The clear legacy for this kind of description comes from the world of video games.
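To make that a little more concrete, here is a minimal sketch in Python of what such a published description might carry. Everything in it is an assumption made for illustration: the ARObject name, the field names and the choice of JSON are all hypothetical, and no real AR format is implied.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ARObject:
    """Hypothetical publishable description of one AR object."""
    owner: str                  # who published the object
    latitude: float             # where it would like to appear
    longitude: float
    vertices: list              # geometry: list of [x, y, z] points
    faces: list                 # polygons, each a list of vertex indices
    texture_url: str = ""       # appearance (placeholder, may be empty)
    animation: str = "none"     # eye-candy, e.g. "spin"
    behaviors: dict = field(default_factory=dict)  # event name -> action name


def publish(obj: ARObject) -> str:
    """Serialize the description to JSON text that could be posted somewhere."""
    return json.dumps(asdict(obj), sort_keys=True)


def load(text: str) -> ARObject:
    """Rebuild the description on the viewing side."""
    return ARObject(**json.loads(text))
```

A viewer would call load() on whatever publish() produced and then decide for itself how much of the requested appearance and behavior to honor.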

Unlike the traditional web there will probably be one view – not many separate web-pages. Everybody’s stuff will all pour together into one big soup. Therefore there will need to be a way to throttle 3d objects that are presented to you; limiting the size, duration and visual effects associated with those objects so that one person’s objects do not drown out another person’s. Objects from different people will have to interact gracefully with the real world and with each other.
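One way to picture that throttle is as a filter that sits between the soup and your eyes. The sketch below is only a guess at the shape of it: the Candidate type, the limits and the per-owner quota are invented numbers and names, not anything an existing AR browser does.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical request by some owner to show an object in your view."""
    owner: str
    size_m: float       # requested bounding size in meters
    duration_s: float   # requested time on screen in seconds


def throttle(candidates, max_size_m=2.0, max_duration_s=30.0, per_owner=3):
    """Clamp each object's size and duration, and cap objects per owner."""
    shown = []
    count = {}
    for c in candidates:
        if count.get(c.owner, 0) >= per_owner:
            continue  # this owner has already used their share of the view
        count[c.owner] = count.get(c.owner, 0) + 1
        shown.append(Candidate(c.owner,
                               min(c.size_m, max_size_m),
                               min(c.duration_s, max_duration_s)))
    return shown
```

The design choice here is that the viewer, not the publisher, sets the limits, so an object can ask to be fifty meters tall but it only gets what the view allows.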

There will be an ownership battle over who owns ordinary images. Augmented Reality views may be connected to real world images around us. An image of an album cover could show the band’s website, or it could show Amazon.com – depending on who ends up winning this battle. An image of you could show your home-page or a site making fun of you. Eventually a kind of Image Registry will emerge where images are connected to some kind of meta-data. An AR View would talk to this database.
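As a toy illustration of that registry, assume an image is identified by a hash of its exact bytes. That is a big simplification, since a real system would need to recognize the same album cover from a different photo, and the ImageRegistry class and its methods are hypothetical names. Note that the sketch returns every claim on an image rather than picking a winner, because who wins is exactly the open question.

```python
import hashlib


class ImageRegistry:
    """Hypothetical registry connecting images to claimed meta-data."""

    def __init__(self):
        self._claims = {}  # fingerprint -> list of (claimant, url) pairs

    @staticmethod
    def fingerprint(image_bytes: bytes) -> str:
        # Exact-byte hash; stands in for real image recognition.
        return hashlib.sha256(image_bytes).hexdigest()

    def register(self, image_bytes: bytes, claimant: str, url: str) -> None:
        """Record that a claimant wants this image to point at a URL."""
        key = self.fingerprint(image_bytes)
        self._claims.setdefault(key, []).append((claimant, url))

    def lookup(self, image_bytes: bytes) -> list:
        """Return all claims on this image, in the order they were made."""
        return list(self._claims.get(self.fingerprint(image_bytes), []))
```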

There will be user interface interaction issues. What will be the conventions for hand-swipes, grabs, drags, pulls and other operations to manipulate objects in our field of view? We’re going to evolve a set of gestures that don’t conflict with gestures we use around other humans but that are unambiguous.

There will be a messaging system. It’s pretty clear that most signage, sirens, alerts and social conventions will be virtualized. You’ll probably be able to elevate your car to being an ambulance in certain conditions and have everybody clear the road ahead of you for example.  This kind of transaction will require an agreement on protocols at least – aside from privileges, permissions, and payment systems.

There will probably be huge incentives to have trust well defined. Since your actual body is usually involved in an augmented reality – you’re likely to be more sensitive about full disclosure. Trust is usually accomplished by a whitelist of friends who are allowed to see you or contact you – and perhaps one or two degrees of separation may be allowed as well.
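The whitelist-plus-degrees idea can be sketched as a breadth-first walk over a friend graph. This is a rough illustration under stated assumptions: the friend graph is a plain dictionary of names, and the function names and the default of two degrees are made up for the example.

```python
from collections import deque


def degrees_of_separation(friends, me, other, limit):
    """Return the hop count from me to other, or None if beyond limit."""
    if me == other:
        return 0
    seen = {me}
    queue = deque([(me, 0)])
    while queue:
        person, depth = queue.popleft()
        if depth == limit:
            continue  # do not look past the allowed number of hops
        for friend in friends.get(person, ()):
            if friend == other:
                return depth + 1
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, depth + 1))
    return None


def may_see_me(friends, me, viewer, max_degrees=2):
    """True if viewer is within max_degrees hops of me in the friend graph."""
    return degrees_of_separation(friends, me, viewer, max_degrees) is not None
```

With max_degrees set to 1 this is a plain whitelist of direct friends; raising it to 2 lets friends of friends through as well.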

New Senses

Of course we can imagine that we’ll move past these challenges. And then it becomes like any human prosthetic; integrated with our faculties, shifting who we are, and becoming invisible. Modern video games have a well framed design grammar that is taken for granted – the experience of being in a VR world is completely natural. Mobility, teleporting, just-in-time information – all completely normal. We can navigate a VR world with about the same ease that we can trace our finger along a map or browse the chapters of a book. And like maps or books if it is convenient and helpful then it becomes necessary.

Today I am sitting in the park between the Metreon and the San Francisco Museum of Modern Art. I’m currently surrounded by thousands of “agents”, ranging from birds to pedestrians to street-signs to the grass itself. Clearly we are fit for this world we live in. Plants in general are color coded in such a way that their coloration has critical meaning for us. There is a well understood inter-species dialogue between ourselves and other kinds of agents at many levels. The pace of the world runs at about the pace of our ability to keep up with it. Our world is highly interactional – a total tactile and sensory immersion if we permit it. Our whole body is ventured and at risk. The world affects and defines us by the compromises we make; we put substantial cognition into avoiding harm. It is not about arbitrary irreverent static images floating around in our field of view like a detached retina. We are a persistent but porous boundary between an inner state and an outer state. Our embodiment is affected by the powers and needs we have.

Augmented Reality is (I imagine) more of a new kind of power. It isn’t quite like our own memory or quite like the counsel of friends. It stands in its own right. It is not simply “memory” – it isn’t just a mnemonic that helps bring understanding closer to the surface of consciousness. A view instrumented with extra hints and facts is of course not entirely novel. Clearly we are surrounded by our own memories, signage, advertising, radio, friends voices and an already rich complicated teeming natural landscape loaded with signifiers and cues. But it is another bridge between personal lived experience and the experience of others. It seems to lower costs of knowing, and it seems to provide stronger subjective filters. A key aspect is that it seems to be faster. It’s as if we are evolving in a Lamarckian fashion to deal with a new kind of world.

It is hard to imagine what having a new sense is like. Recently I was invited by Mike Liebhold at the IFTF to hear Quinn Norton talk about having had magnetic implants in her fingers. She is the writer for Wired Magazine who interviewed Todd Huffman a few years back on the same topic and had the procedure done to herself. By brushing her fingers over a wall she could literally feel the magnetic field lines where the electrical wires ran underneath the surface. Her mind integrated this as a new sense; not merely a tugging on her fingers but a kind of novel sensory field awareness.  Quinn also spoke about wearing a compass cuff; a small ankle bracelet that would buzz on the north facing side. Over time it gave her an awareness of which direction was true north. It wasn’t just a buzzing feeling in her leg, but a feeling for her orientation with respect to the world. This kind of sensory awareness may be like what a homing pigeon feels intuitively. Choices we make may be quietly guided by an understanding we have.

Boxes

Who have persuaded man that this admirable moving of heavens vaults, that the eternal light of these lampes so fiercely rowling over his head, that the horror-moving and continuall motion of this infinite vaste ocean were established, and continue so many ages for his commoditie and service? Is it possible to imagine anything so ridiculous as this miserable and wretched creature, which is not so much as master of himselfe, exposed and subject to offences of all things, and yet dareth call himself Master and Emperor.

Dirt Architecture has leaned in the direction of making our world simpler, safer and dumber. It seems to largely have been about the imposition of barriers, walls and structures to reduce the complexity of the world. This is prevalent today. Perhaps the primary legacy of the Industrial age is the fence.

Many of us still live sheltered box lives. In the morning you enter the small box that is your car and it safely navigates you to your office. During this journey you are protected from the buffeting winds, from people, from noise and from most other distractions. Once at the office you sit down in your cubicle, the walls safely blinkering away distractions as you myopically gaze into the box of your computer screen. Even the screen itself consists of very clearly delineated boxes. There are buttons that say “go” and buttons that say “cancel”. There is no rain, no sun, no noise. After the day’s work ends you get back in your car and you drive home. When you arrive at home you close the door behind you and relax – ignoring the outside world held at arm’s length outside of your domain.

There is a sense of pleasure in this artificial simplicity. A sense of closure, understanding and a lack of fear about things being hidden. There is also an undue sense of speed at our ability to race through these spaces very quickly.

This pattern is similar to that of working by yourself versus working with others. You gain privacy, concentration, control and velocity by doing it yourself, but you lose an ability to crowd-source problems and to avoid repeating work and energy that others have already put in. By expending more energy on being social you save energy on wasted effort.

This extends to the way we shop at Whole Foods, Costco, Walmart, Ikea and other such big box stores. Certainly part of the reason we don’t use local resources as much as we could is that we simply can’t see them. We don’t know that we can just pick an apple instead of buying one. We don’t know that a certain garage sale has what we need or that there’s an opportunity to volunteer just around the corner.

If we interact with spaces primarily as a series of disjoint divisions then we tend to think our actions on the world can be contained without side-effects. In any busy city you can see the store owners and proprietors manicure the space directly in front of their building. Planting plants, brushing the pavement, creating a sense of mood and ambiance around their particular restaurant. And that obligation stops immediately at the margins of their property line. Of course this just pushes negative patterns to the edge where pressure builds up more strongly.

Our aesthetic leads us to try to whitewash reality and yet it pokes through. An urban landscape becomes clotted with thrown away garbage, sidewalks blackened with bubble gum. Paint peels, weeds crack the pavement. We see sometimes vagrants, beggars and the dispossessed raging against the world, noisy, bothersome; frightening even. We see their helpless entanglement and inability to be indifferent as a kind of betrayal of Utopia.

Simplicity, linear surfaces, boxes, walls. These patterns fail because they hide but do not eliminate side-effects. In fact they magnify them. It is the lack of synthesis between spaces, the lack of free movement between them that makes pressures build up. If you can’t understand that you could share a ride with a new friend to work, or that kids are constantly vandalizing your street because they used to exhaust themselves instead in a wilder more abandoned overgrown forest, then you tend to work against opportunities, you end up spending more energy to get less.

This is so unlike a dirty natural entangled world where you have little say in how the world is phrased. Where one brushes through spider webs and thorns stick to you and you have to walk all the miles to a hopeful uncertain destination. You get wet and dirty and hungry and tired and rained on and slapped silly by nature if you make a dumb mistake. You have to balance many forces in opposition and if you tug on one thing you find it connected to everything else in the universe. In nature one is constantly leveraging the landscape itself, working very closely with what it affords and simply steering those resources slightly in your benefit rather than asserting them so strongly. And it is there that we always seemed happiest.

Augmented Reality seems to at least offer the possibility that we can punch some holes in the boxes. It seems to offer a bridge between structure and chaos rather than just structure. It is fundamentally different to see that something in a geographical proximity to you is actionable than to see it in a list view in Craigslist or read about it in a newspaper. It becomes a physical act – you can walk towards it, you can judge if you should participate.

Use

AR is a precise assault on dirt architecture. It is a response to design – not by changing the world but by changing us. It is as if we’ve become fatigued with the attempts to refashion perspective with dirt and are instead just drawing lines in the air. How will we use it? And by use I mean use in the same way that we wear a garment or use an art object – the value we derive from it individually and culturally.

The First Union Methodist Church of Palo Alto on Webster street is designed to evoke a certain emotion. It has a Gothic style with many small windows arranged to a peak. To me these tiny windows seem to imply souls, perhaps ascending to heaven. That the windows are small also seems to imply a certain kind of suffering in life and a certain role of humility. The architect who designed this invoked a visual language that subconsciously refers to historical references and understanding. Carlton Arthur Steiner, the designer, may indeed not have been a fully rational actor; much in the way that we casually gesture with our hands and expect others to understand those gestures even though we don’t fully know them ourselves as rational acts.

This church is a fairly objective object in our shared reality. We may bring our own prejudices, history and understanding to our perception but it exists as a series of reinforcing statements by an amalgamation of the people around it. To avoid a Wittgenstein-like knot: I use my perception of said church a different way than another person but I am not using something else entirely; there is some portion of it shared between different views.

Counterpoint this with the augmented reality case where the church may not even be there, or may be some other completely arbitrary and alien cartoon artifact – something so subjective to each user that agreement is radically impossible.

We’ve always draped our landscapes with our opinions. We downscore certain things, upscore other things and in this way exhibit a kind of prejudice. We’re afraid of and offended by people who are down and out, we embrace a certain definition of nature, and a certain definition of beauty. We think certain kinds of architecture, space and geometry are beautiful. There is a set of cultural aesthetics that bias us to value certain kinds of artifacts, shelters and structures over others. We read between the lines in many cases, seeing the rules that guided outcomes, seeing policy and choice as reflected in the geometry of our world and nod approvingly or disapprovingly.

Most of us are not architects and don’t have permission to rewrite our landscapes anyway. We’ve had to comfort ourselves with criticism in text, image, placard or graffiti to communicate our point of view. Often it was at a degree of remove – not so closely conflated and overlaid with the view as augmented reality affords. Even graffiti is somewhat transitory and superficial; it is not a deep rewriting of structure ( at least not yet ).

In an augmented world these factors all move around. Your critical statement may be directly attached to the subject in question; not at a remove. Your statement is explicit, it can be published to other people, it isn’t just in your head. But at the same time your statement is increasingly subjective. It loses some of the value of an embodied artifact.

In an Augmented Reality we can erase buildings that offend us and we can paint golden halos around people that we like. We can prejudice our contemporaries and fuel a kind of hyper tribalism if we wish. But at the same time our power is diminished unless we can get a large portion of the mainstream to agree with our view.

Consensus

AR views will make our prejudices more visible and more formal. But they will also make them more subjective. Different people will subscribe to different views and build up quite a bit of bias before they’re forced to reconcile that with other people.

It may very well be that the role of consensus builder, or at least the role of holding a consensus space where issues of consensual reality can be debated, may become most important. I imagine that the role of a bartender for example, a neutral stakeholder who bridges other people together by offering a shared public space, might become quite important.

Let’s imagine that three people walk into a bar:

The first person, let’s call her Mary, a liberal environmentalist, has an augmented reality view that shows the carbon footprint of the people and objects around her. She can also see where the rivers used to run through the urban landscape, she can see if food is locally sourced and if purchasing power goes back into her community. She can see where super-fund sites are and where poverty levels are higher.

The second person, let’s call him Derek, an artist, has an augmented reality view that redecorates the landscape around him with a kind of graffiti. All surfaces are covered with cartoon like creatures voicing criticism, comments, banal humor and art. He automatically has a critical perspective that lets him better understand others’ assumptions. He can see the contrails of his friends’ passage, the tenuous connections between people, and the location of upcoming art events in the area.

The third person, let’s call her Sarah, has a neo-american point of view and, say, is deputized as a police officer. She can see the historical pattern of crime in the area, and she can see the traffic congestion, parking zones and gps speed-traps. She can raise her space to emergency vehicle status if she needs it, she can see the contrails of important people in the neighborhood and she can turn streets off and on.

The bartender serves them all a round of beers on the house and they sit down to talk about and share their differences.

Each of them is going to see their beer, and each other, in a radically different light based on their powers. For Mary the beer may appear especially palatable due to being locally sourced. For Derek the beer may have an attached satire which plays out about the human condition. For Sarah the beer may be seen with respect to late night noise ordinance violations surrounding the pub. This is on top of any personal memory that they have.

They get to talking about the beer, how regulated it should be, how it should taste and the like. A small typical bar conversation, but prejudiced by fairly strongly colored and enhanced points of view. Each participant thinks they are picking facts but they’re in fact picking opinions. Over time each one has subscribed to a set of prejudices that fundamentally altered what they now see. It alters how quickly they reach for the drink, it alters if they even enjoy it.

Over the issue of regulation Sarah might say that the sale of alcohol should be restricted. Derek might say that the alcohol should be served frozen so that it takes longer to consume. Mary might argue against regulation at all.

Each person’s views are accumulated views. They are accumulated out of networks of people with like minds. Some networks are based on friendship, similar sentiment and trust. Other networks are constructed out of hierarchical chains of command. Each of these individuals reflects not just themselves but is a facet of a larger community and a larger set of views.

What comes to the table is not Mary, Derek and Sarah but Mary’s tribe, Derek’s tribe and Sarah’s tribe.

And the resultant consensus conflict becomes a classic case of the same kind of pathology that occurs when anthropologists try to understand a new culture. Each person is burdened by a deeply framed cultural lens that makes it difficult to really see things as they are. There is a tendency for all of us to divide the world into categories or into prototypical objects, and to then classify what we see as an example of some kind of object. We build mental machinery to deal with objects – we know how to deal with dogs or cats or a car – and we can mistakenly treat something as dog-like or cat-like when it in fact is dangerously not quite so. We cannot always give all things equal weight all the time, and in prioritizing, categorizing and scoring we necessarily create prejudice.

The redeeming difference here is that each of these participants can choose to trade views. Sarah can put on Derek’s view, and Derek can put on Mary’s view and Mary can put on Sarah’s view. They can now see the world as scored from the other person’s point of view.

We find that Sarah has a personal financial benefit to seeing the world in her perspective. Her point of view is necessarily beneficial to continuing to earn a living. Derek perhaps also has a similar dependency. His point of view is necessarily driven by a need to continue maintaining a street credibility with his artist peers. Mary’s point of view is driven by and self-reinforced by a caution and concern for her well being. Each of these points of view is an embodiment of needs.

There’s both a risk and a promise that Augmented Reality will magnify prejudice but may also help us more clearly see each other’s prejudices. More to the point we’ll hopefully be able to trace back down to basic needs that lead to specific prejudicial postures. We can unwind the stack and get down to embodiments – perhaps we can tease apart our deep differences or at least respect them.

Links

http://swindlemagazine.com/issue08/banksy/

http://www.walletpop.com/blog/2009/10/09/want-better-service-just-complain-on-twitter/

http://en.wikipedia.org/wiki/Puddle_Thinking

http://radar.oreilly.com/2009/10/augmented-reality-apps.html

http://crisismapping.ning.com/profiles/blogs/crisis-mapping-brings-xray

http://www.informationisbeautiful.net/leftvright_world.html

http://www.nearfield.org/2009/09/nearness

http://www.cityofsound.com/blog/2009/10/sensing-the-immaterial-city.html

http://www.urbeingrecorded.com/news/2009/09/22/rss-augmented-reality-blog-feeds/

https://xd.adobe.com/#/videos/video/436

http://www.readwriteweb.com/archives/ex-googler_brizzly_creator_on_real-time_web_filtra.php

http://www.bruno-latour.fr/virtual/index.html#

http://mashable.com/2009/10/18/wolfram-alpha-iphone-app/

http://www.techcrunch.com/2009/10/20/wowd-takes-a-stab-at-realtime-search-with-a-peer-to-peer-approach/

http://goblinxna.codeplex.com/

look at a variety of iphone 3d engines such as the ones used during GOSH

http://www.abiresearch.com/research/1004454-Augmented+Reality

http://www.timpeter.com/blog/2009/10/06/how-important-is-local-search-heres-a-hint-extremely/

http://virtual.vtt.fi/virtual/proj2/multimedia/alvar.html

http://pointandfind.nokia.com/

http://www.augmentedenvironments.org/blair/2009/09/23/has-ar-taken-off/#more-104

http://www.readwriteweb.com/archives/robotvision_a_bing-powered_iphone_augmented_realit.php

http://www.businessweek.com/technology/content/nov2009/tc2009112_353477_page_2.htm


]]>
http://blog.makerlab.com/2009/11/augmentia/feed/ 4