02 March 2009

Gesture Recognition – Breaking the Barrier between us and our tools/toys

Historically, humans have relied on textual/graphical input devices (keyboards, mice) and haptic/tactile interfaces (joysticks, data gloves, VR helmets) to interact and communicate with computers. The latter group of devices senses body movements and “feeds” information to software based on position. Conversely, users can receive information in return through the same haptic channels in the form of a felt sensation such as a vibration (amalgamated from http://www.webopedia.com/TERM/H/haptic.html and http://www.webopedia.com/TERM/I/interface.html). These input/output tools have typically chained users to their machines, requiring them to be within arm’s length of the devices in order to interact with them. Gesture recognition technology is striving to end this bondage, not only by improving interaction between user and hardware but also by opening doors for individuals who are physically challenged and have difficulty with current user interfaces.

Gesture-recognition systems identify human gestures (typically head, arm, hand, and leg movements in addition to torso position) using a camera (and in some cases motion sensors, ultrasound emissions, or infrared light) that reads the movements of the human body and conveys the information to a computer, which uses the gestures as input to control devices or applications. Though the majority of systems can interpret only broad gestures (a problem discussed further on), some are capable of decoding facial expressions, speech (i.e., lip reading), and eye movements (amalgamated from http://www.webopedia.com/TERM/G/gesture_recognition.html and http://ieeexplore.ieee.org/ielx5/2/29695/01350721.pdf?arnumber=1350721).
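To make the camera-to-input pipeline a little more concrete, here is a minimal sketch of one very simple approach: frame differencing. This is a stand-in I chose for illustration, not the method used by any product mentioned in this post. It assumes grayscale frames supplied as plain lists of pixel rows, and every name in it (`motion_mask`, `motion_centroid`, `classify_swipe`, `make_frame`) is hypothetical.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # grayscale image: rows of 0-255 pixel values


def motion_mask(prev: Frame, curr: Frame, threshold: int = 30) -> List[List[bool]]:
    """Mark pixels whose brightness changed by more than `threshold`."""
    return [
        [abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def motion_centroid(mask: List[List[bool]]) -> Optional[Tuple[float, float]]:
    """Return the (x, y) centre of all changed pixels, or None if nothing moved."""
    points = [(x, y) for y, row in enumerate(mask) for x, on in enumerate(row) if on]
    if not points:
        return None
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def classify_swipe(frames: List[Frame], min_travel: float = 2.0) -> str:
    """Label a frame sequence by the horizontal travel of the moving region."""
    centroids = []
    for prev, curr in zip(frames, frames[1:]):
        centre = motion_centroid(motion_mask(prev, curr))
        if centre is not None:
            centroids.append(centre)
    if len(centroids) < 2:
        return "none"  # not enough movement to call it a gesture
    dx = centroids[-1][0] - centroids[0][0]
    if dx >= min_travel:
        return "swipe_right"
    if dx <= -min_travel:
        return "swipe_left"
    return "none"


def make_frame(width: int, height: int, blob_x: int) -> Frame:
    """Synthetic frame (assumption for the demo): one bright column at blob_x."""
    return [[255 if x == blob_x else 0 for x in range(width)] for _ in range(height)]


if __name__ == "__main__":
    hand_moving_right = [make_frame(10, 4, x) for x in (1, 3, 5, 7)]
    print(classify_swipe(hand_moving_right))
```

Even this toy version hints at why lighting and busy backgrounds matter: any pixel that changes brightness is treated as movement, whether or not it belongs to the user.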

The technology’s proponents tout a variety of potential uses, including but not limited to:

• Surgical simulations
• Improving security and surveillance
• Military applications
• Interpretation of sign language to spoken word
• Replacement of traditional input devices (keyboards, mice, joysticks, etc.)
   o Virtual keyboard for PDAs: http://www.ideinc.com/canesta.html
   o Head- and eye-movement-based cursor and mouse interface technology: users can work with applications by moving the cursor with head movements and clicking the mouse with eye blinks; instead of double-clicking, users can double-blink while looking at an icon or file name. http://www.cybernet.com/products/navigaze.html
   o Automobile controls: drivers would change, for example, the temperature or sound-system volume by manoeuvring a hand in various ways over a designated area. This could increase safety by eliminating drivers’ current need to take their eyes off the road to search for controls. http://www.cc.gatech.edu/ccg/projects/gestpan/
Gesture recognition technology has faced, and still faces, major hurdles. These range from intrusive haptic devices (not all systems have switched to cameras and sensors) to resource intensity, which can interfere with other system functions and cause slow image processing and lag, something especially undesirable when playing a dynamic video game. Furthermore, many products can’t read motions accurately when faced with changing or excessive lighting, or when the user is situated near or in front of “busy” backgrounds. An additional key issue is the technology’s difficulty in determining which intended gesture a series of motions actually represents; this is compounded by the lack of a common gesture language specifying how users should make gestures so that they are easily recognized (http://ieeexplore.ieee.org/ielx5/2/29695/01350721.pdf?arnumber=1350721).


Research continues to improve the field; for example, improvements to algorithms have resulted in faster, more robust, and more accurate systems. Additionally, hardware and processing costs have decreased considerably, whilst processor speeds, sensor accuracy and camera recognition have increased. Where haptic devices are still required, technology has made them smaller, wearable and less intrusive; for example, gesture recognition sensors are being placed in rings. One of the more innovative improvements in the field comes from GestureTek, which used heuristics (a branch of artificial intelligence) to achieve more “robust, accurate, and quicker tracking of gestures” by turning multiple cameras onto a subject, thus allowing enhanced analysis of gestures in three dimensions (http://ieeexplore.ieee.org/ielx5/2/29695/01350721.pdf?arnumber=1350721).

On February 6th GestureTek, the pioneer, patent holder and world leader in camera-enabled gesture-recognition technology for presentation and entertainment systems, invited our class to its Toronto office, where we were given the opportunity to learn about and interact with cutting-edge gesture technology.
The company's multi-patented video gesture control (VGC) technology lets users control multimedia content, access information, manipulate special effects, and even immerse themselves in an interactive 3D virtual world simply by moving their hands or body, delivering Wii-like gesture control without the need to wear, hold or touch anything (http://www.gesturetek.com/aboutus/corporatebackground.php).
The company features seven products (http://www.gesturetek.com/pdfs/Gesturetek_fact_sheet.pdf) used across a variety of industries, including:
• museums, science centers, amusement parks, aquariums, zoos, visitor centres
• television production, game development
• trade shows, real estate presentation centers, corporate showrooms, boardrooms, virtual videoconferencing
• advertising - retail locations, digital signage, interactive multi-media displays, kiosks
• bars & nightclubs and other public spaces such as airports and stadiums
• surface computing solutions, mobile
• healthcare sector http://www.gesturetek.com/marketuses/industryuses.php

Though GestureTek got its start in the healthcare sector, as a result of co-founder Vincent John Vincent’s background in physical rehabilitation and his desire to develop an engaging solution that could hasten patient recovery, today GestureTek focuses heavily on advertising and gaming.

I am personally interested in the work they have been doing with museums and other cultural centres (http://www.gesturetek.com/marketuses/museums_sciencecenters.php), as my background is in education and technology. I view education as being broader than just what is taught in the classroom or formal system; museums and other tourism and cultural facilities have an enormous opportunity at their fingertips to engage their audiences, if they will step outside of their “traditional boxes.” Additionally, I would like to see GestureTek expand its gaming division to focus on serious and educational game development. The learning potential offered by the experiential nature of the technology is enormous: the simulations that could be created could revolutionize not just corporate and post-secondary training but also, if made affordable enough, K-12 education, truly enabling teachers to become facilitators rather than gatekeepers of information and knowledge.
A relatively extensive internet search (over an hour of Googling) confirmed what GestureTek told the class about having little competition in the field. Reactrix, a company that licensed GestureTek’s technology and then began competing against it by selling interactive advertising displays, went bankrupt in December 2008 (http://www.technologyreview.com/communications/14508/?a=f). According to VentureBeat (http://venturebeat.com/tag/invde-shaw-co/), several other companies pursuing the same goals have remained afloat. The major example is Catchyoo, a Japanese company that projects massive interactive ads on floors, walls and other surfaces. Israel’s 3DV Systems is poised to pick up where its fallen rival left off. There are also many companies, such as Danoo, that are putting up flat-panel displays at places such as Starbucks; the videos and other ads on the screens can entertain people while they’re waiting in line. Extreme Reality (XTR, http://www.xtr3d.com/) developed proprietary real-time, high-resolution software that analyzes 3D human motion using one simple webcam, without any additional accessories. Oblong Industries, Inc., an LA-based company, is using gesture recognition software in conjunction with gloves, cameras, and 3D sensors to replace the mouse: the sensors track a user's hand and finger positions and orientations, which are then parsed and interpreted by Oblong's proprietary gesture recognition engine (http://elianealhadeff.blogspot.com/2008/04/gesturetek-and-oblong-serious-gaming.html).

30 September 2008

Mobile Technology


Ubiquitous…pervasive…omnipresent: these words are all used to describe mobile technology. The term is generally associated with mobile phones, and how could it not be, with statistics like 2 billion mobile phones compared to 900 million PCs (http://ca.youtube.com/watch?v=7iAyC1vyJ44)? Yet mobile devices, and their capabilities and potential, fall into a variety of categories. We began our Multimedia Pioneering course by defining the two terms that constitute the name of the course; in that light, I turn to http://www..dictionary.com/ to define mobile technology:

  • Mobile: Definition
    • capable of moving or being moved readily
    • capable of changing quickly from one state or condition to another
  • Technology: Definition
    • the branch of knowledge that deals with the creation and use of technical means and their interrelation with life, society, and the environment

On October 24th, James Eberhardt of Echo Mobile generously volunteered his time to speak to our class. James is a Technical Director with over 13 years of experience leading convergent media projects. He spoke to us about QR (Quick Response) Code, a technology that is firmly ensconced in Japan, has acquired a foothold in Europe, and is in the early stages of dipping its toes in North American waters. QR Code is a bar code; it differs from the standard UPC bar codes found on products today in that it is two-dimensional. A URL, business card, message (up to roughly 7,000 digits or about 4,300 alphanumeric characters) or very small image (just under 3 KB of binary data) is encoded along both the X and Y axes.

A mobile phone equipped with a camera and a QR Code reader (provided by the mobile phone provider or downloaded by the user) snaps a picture of (or sometimes scans in real time) the encoded square. The reader then decodes the QR Code for the user. The following URL provides a demonstration: http://ca.youtube.com/watch?v=uf_DNHPBV-s

QR Codes have been used to encode Wikipedia articles (http://www.semapedia.org/), in contests (http://www.cbc.ca/theborder/blog/2008/01/annoucing_winner_2_for_1000_at.html), city walking tours (http://mobilestance.com/2008/03/28/san-francisco-the-plymouth-rock-of-qr-codes/), billboard advertising (http://ca.youtube.com/watch?v=EBja1blJ3GU), ticket purchases, both concert and airline (http://ca.youtube.com/watch?v=BDop0sqOR2E), loyalty programs (http://theponderingprimate.blogspot.com/2006/08/coke-uses-qr-codes-for-mobile.html), gaming (http://qrcode.es/?p=209&language=en), and metrics tracking, as in the example James gave of an Australian wine company that used QR Codes to track geographically where visitors to its website were coming from.

As the examples above show, the potential for QR Code is vast. Yet for a technology that is meant to be a quick way to capture information on the go, it is not without its problems, some of which James touched on. One is the need for multiple readers that decode different things, e.g. one reader for QR Code images and another for text, as opposed to a single reader that can read everything. James pointed out that, as with most applications, the majority of end users will use whatever their mobile phone provider has made available to them. However, as most mobile phones are not yet equipped with QR Code readers, the onus is currently on the end user to download a reader, and it is inconvenient to download and configure several. Another issue is how decoded QR Codes are stored: neither the phones nor the readers have a convenient way of storing the decoded material. Though one can save it as a bookmark on del.icio.us, that still requires the end user to take the second step of going to del.icio.us to retrieve the link, image, etc. Marketers could potentially lose a percentage of users because of the extra step, or because users scan so many QR Codes that they can’t easily find the one they are looking for a second time. Additional issues with QR Code can be found at http://www.canadianmarketingblog.com/archives/2007/08/qr_codes_aka_3_d_codes_the_hol_1.html.
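To picture what a single reader with built-in storage might do once a code has been decoded, here is a small sketch. It is my own illustration, not something James showed us: it assumes the decoded payload is already available as text, and the names `classify_payload` and `ScanHistory` are hypothetical. It only distinguishes web links, contact cards (the vCard and MECARD text formats) and plain text.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


def classify_payload(text: str) -> str:
    """Guess what kind of content a decoded QR payload holds."""
    lowered = text.strip().lower()
    if lowered.startswith(("http://", "https://")):
        return "url"
    if lowered.startswith(("begin:vcard", "mecard:")):
        return "contact"
    return "text"  # fall back to treating everything else as a plain message


@dataclass
class ScanHistory:
    """Keeps every scanned payload on the device so it can be found again."""
    items: List[Tuple[str, str]] = field(default_factory=list)

    def add(self, payload: str) -> str:
        kind = classify_payload(payload)
        self.items.append((kind, payload))
        return kind

    def find(self, keyword: str) -> List[str]:
        """Return stored payloads containing `keyword`, ignoring case."""
        needle = keyword.lower()
        return [payload for _, payload in self.items if needle in payload.lower()]


if __name__ == "__main__":
    history = ScanHistory()
    history.add("http://www.semapedia.org/")
    history.add("MECARD:N:Doe,Jane;;")
    print(history.find("semapedia"))
```

Keeping the history on the phone and making it searchable would address both complaints above: no trip to a second site, and no hunting through a pile of scanned codes.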

As I mentioned above, mobile technology is more than just mobile phones. Below is a list of links for other mobile applications; I do not elaborate in detail because they were not the focus of James’s presentation, but I do feel they are worth noting.

01 December 2007

CONVERGENCE IN MEDIA, CULTURE, AND EDUCATION

a) Choose a major concept from the reading and define the concept in your own words
Language is enacted daily to generate or destroy/deny experiences which create meaning. Two powerful teaching tools of language are narrative and inscription (the use of paper, signs, prints and diagrams). They are used to judge a learner’s comprehension of a topic based on whether or not the learner presents and represents the topic from a school-based perspective: whether they ‘talk the talk’, i.e. model principles learned from a text (with graphs and ‘appropriate’ language) when explaining concepts, even if the models are counterintuitive to their personal experience. Students’ explanations of acquired knowledge that draw on personal experience are diminished and dismissed, both consciously and unconsciously, by those evaluating them.

b) Choose a quote from the reading that elucidates the concept further
“These practices — so material and mundane, so practical, so modest, so pervasive, so close to the hands and the eyes that they escape our attention are, from an historical perspective, radical modifications of the fundamental way in which people strive with one another for power… Shifts in technologies for simplifying, abstracting, deflating the world [into a finite number of relevant aspects (charts, diagrams, formulas, statistics, tables, models which are their representations) [come to be taken as more relevant than the experience itself] so that it can be brought together in new places, and assembled in increasingly complex and abstract ways…ultimately allowing the few to dominate the many” pg. 270. “We are ourselves failing to interrogate the inscriptions we trade in… In as much as we are scarcely able to apprehend the bare existence of what is obscured or obviated by our inscriptions, we seldom name or notice our own blindness” pg.272.

c) Suggest an example to illustrate the concept
The example given in the article on pgs. 273-280 (Michaels & Sohmer.pdf) explains the concept well. Four fourth graders are evaluated on their reasoning about seasonal change and its causes. All of the students’ explanations were wrong; yet one of the four ‘appeared’ more scientific in his explanation, using appropriate speech and constructing a physical model with his hands as the earth and sun, so that those evaluating him believed he had acquired some knowledge. The other three students presented as though they had not understood the concept at all: their language was not fluid, and their examples were experiential as opposed to text-based. It was only in a later re-evaluation of the students’ interviews that the researchers realized that the other three students probably understood the topic better than the student who regurgitated the text.

d) Situate the concept in your own work or life
I would say I could be equated to the student who appeared to be the more scientific, especially in my pre-university schooling. I had a knack for learning a concept as it was taught to me and then parroting it back on tests or in the classroom, without ever questioning what I was really learning (if anything) and how it fit with my personal experiences. I think this has stunted my learning in some ways, because it is more difficult for me to articulate concepts in my own words. I’m very good at stringing other people’s words together to illustrate my thoughts and opinions, but I appear inarticulate when I try to explain them in my own words. Further, reproducing concepts without questioning them has left gaps in my own understanding of how to teach. The article points this out on page 275: “the teacher was neither able to teach the theory to the children in her own words nor to register (much less identify and correct) the presence of specific inadequacies and errors in the text.” I may know a concept I learned by rote, but explaining it to others requires me to go back to the beginning and relearn it.