Thursday, March 24, 2011

Paper Reading #16: Mixture model based label association techniques for web accessibility

Reference Information
Title: Mixture Model based Label Association Techniques for Web Accessibility
Authors: Muhammad Asiful Islam, Yevgen Borodin, I. V. Ramakrishnan
Venue: UIST '10: Proceedings of the 23rd annual ACM symposium on User interface software and technology



Summary
As the internet continues to grow, access to it becomes increasingly important for blind users. The dominant accessibility method for people with visual impairments is audio-based browsing. Browsers designed for this attempt to associate text labels with the relevant elements on a page and then read those labels to the user. Depending on how the labels are associated in the HTML code, or whether there are labels at all, this can range from very easy to extremely difficult.


This article proposes a solution to this problem: a Finite Mixture Model algorithm that associates text with nearby elements. Elements that have no labels or candidate labels are assigned labels from a database, with the most likely match calculated by a similar algorithm. Once the key elements on a page have labels, those labels can be read aloud to describe the page to the user.
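To make the idea concrete, here is a rough sketch of the general approach in Python. This is not the authors' actual mixture model, just a hypothetical proximity-based version: each form field takes the nearest candidate text as its label, and fields with no nearby text fall back to a generic label database.

```python
# Hypothetical sketch of label association (NOT the paper's actual
# Finite Mixture Model): pick the nearest candidate text for each
# form field, falling back to a generic database of labels.

FALLBACK_LABELS = {"text": "Input field", "checkbox": "Checkbox"}

def associate_labels(fields, candidates, max_distance=100):
    """fields: list of (name, type, x, y); candidates: list of (text, x, y)."""
    labeled = {}
    for name, ftype, fx, fy in fields:
        best_text, best_dist = None, max_distance
        for text, cx, cy in candidates:
            # Euclidean distance between the field and the candidate text.
            dist = ((fx - cx) ** 2 + (fy - cy) ** 2) ** 0.5
            if dist < best_dist:
                best_text, best_dist = text, dist
        # No candidate close enough: fall back to the label database.
        labeled[name] = best_text or FALLBACK_LABELS.get(ftype, "Unlabeled")
    return labeled
```

The real model weighs more evidence than distance alone, but this captures the two-stage idea: use nearby text when it exists, and a learned database otherwise.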


Discussion
The article was interesting, and the approach is without a doubt effective at labeling elements, but I feel the overall solution needs to be reworked. Reading small parts of pages designed to be viewed visually is a poor substitute for actually seeing the page, and I suspect that getting anything done with an audio-based browser, regardless of how well labels are associated, would take a very long time.


Instead, perhaps one of the substitute-sight technologies would be a better place to spend research money in this area. I know there is ongoing work to map visual elements to a person's tongue. While a bit awkward to use, perfecting a technique like that might be the best approach for blind users.

Tuesday, March 22, 2011

Paper Reading #15: TurKit: human computation algorithms on mechanical turk

Reference Information
Title: TurKit: human computation algorithms on mechanical turk
Authors: Greg Little, Lydia B. Chilton, Max Goldman, Robert C. Miller
Venue: UIST '10: Proceedings of the 23rd annual ACM symposium on User interface software and technology



Summary
This paper is about integrating human computation into software projects. As described in the article, Mechanical Turk is an on-demand source of human computation: workers, many in developing countries, are paid to answer queries relating to tasks like text recognition. TurKit is implemented as a JavaScript toolkit that other programs can call, much like a library. The system has already been used for a number of applications, including various psychological studies.
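The "library" framing is the interesting part: a human task becomes just another function call in a program. Here is a hypothetical Python illustration of that pattern; it is not TurKit's actual JavaScript API, and the "workers" are stand-in canned answers rather than real Mechanical Turk requests.

```python
# Hypothetical illustration of the human-computation-as-a-function-call
# pattern (NOT TurKit's real API). Workers are simulated with canned
# answers so the example is self-contained.

def ask_human(question, worker_pool):
    """Pretend to post a task and block until a worker answers."""
    return worker_pool.pop(0)

def transcribe(images, worker_pool):
    # Each image becomes one human task; the answers flow back into the
    # program like the results of any ordinary function.
    return [ask_human(f"What text is in {img}?", worker_pool) for img in images]
```

In a real deployment, `ask_human` would post a task to Mechanical Turk and wait for a worker's response, but the calling code would look the same.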

Discussion
I liked this idea, even if it is a bit unorthodox. Beyond research and text recognition, however, I don't see a huge range of applications. There aren't many tasks that computers are unable to perform but that are easy for the average person.

Paper Reading #14: A framework for robust and flexible handling of inputs with uncertainty

Reference Information
Title: A framework for robust and flexible handling of inputs with uncertainty
Authors: Julia Schwarz, Scott E. Hudson, Jennifer Mankoff, Andrew D. Wilson
Venue: UIST '10: Proceedings of the 23rd annual ACM symposium on User interface software and technology


Summary
This paper is about addressing the inherent error in newer input devices such as touch screens and pointers. These devices report input in a very error-prone manner, yet programs typically collapse that input into boolean values almost immediately. The result is a loss of information that is far from ideal.


To address this, the authors propose scoring events on a continuous scale, similar to how analog-to-digital converters work. An event is still treated as true or false depending on a threshold applied to that value, but the information about what exactly happened is preserved, allowing additional logic to interpret the event.
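A minimal sketch of that idea, assuming a hypothetical confidence score between 0 and 1 (this is my own simplification, not the framework's actual API): the boolean decision is made at a threshold, but the raw score is kept so downstream code can still inspect it.

```python
# Hypothetical sketch: keep the continuous input score alongside the
# boolean decision instead of discarding it (my simplification, not the
# paper's actual framework).

def classify_touch(score, threshold=0.5):
    """score: 0.0-1.0 confidence that the user meant to press the button."""
    return {"pressed": score >= threshold, "score": score}

event = classify_touch(0.55)
# A downstream handler can treat near-threshold events specially,
# e.g. asking for confirmation instead of firing immediately.
ambiguous = event["pressed"] and event["score"] < 0.7  # True here
```

Had the score been collapsed to a bare boolean, the "ambiguous" check would be impossible; that retained information is the whole point of the framework.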

Discussion
I found the concept behind this idea interesting; however, the mouse is just as uncertain as pointers and touch screens. The style of input hasn't changed: in all three cases you are essentially pointing at a location on screen and triggering an event. I wonder how useful this research really is, considering that the mouse works just fine and touch screens appear to be doing quite well too. If there is an issue with touch interfaces, it is that a finger is too large a pointer compared to a mouse cursor or stylus, not that the style of interaction has changed.