Title: Social signal processing: detecting small group interaction in leisure activity
Authors: Eyal Dim, Tsvi Kuflik
Venue: IUI '10: Proceedings of the 15th international conference on Intelligent user interfaces
Comments
http://shennessy11.blogspot.com/2011/04/paper-reading-19.html
http://angel-at-chi.blogspot.com/2011/04/paper-reading-19-tell-me-more-not-just.html
Summary
This paper was about the use of social signal processing to identify key elements of social behavior in a given environment. Social signal processing aims to classify groups of individuals into different social profiles, and to identify the best time to introduce a social stimulus when interaction is lacking. This paper attempted to implement these ideas at a museum, as the authors postulate that social interaction at museums constitutes a significant portion of the learning experience for visitors. Their system allowed them to direct social interaction, when necessary, to improve the experience.
The authors observed 58 small groups of visitors at the Yitzhak Livneh: Astonishment exhibition at the Tel Aviv Museum of Art. The data collected included the proximity of group members and the duration of voice interaction within 1-minute intervals. Proximity was rated as separated, joined, or left. To the left is a picture of their voice detection application.
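To make the idea concrete, here is a minimal sketch of how proximity and voice data could be turned into per-interval interaction labels. The thresholds, feature names, and the "talking/quiet" split are my own assumptions for illustration, not taken from the paper; the paper's actual classification is far more nuanced.

```python
# Hypothetical sketch: labeling a group's social state for each 1-minute
# interval from member proximity and voice activity, loosely following the
# paper's "separated / joined / left" proximity categories.
# All thresholds below are invented assumptions.

def classify_interval(distance_m: float, voice_seconds: float) -> str:
    """Return a coarse interaction label for one 1-minute interval."""
    if distance_m > 10.0:
        return "left"        # a member has left the group's vicinity
    if distance_m > 2.0:
        return "separated"   # nearby, but not engaged as a group
    # Members are physically together; voice activity hints at interaction.
    return "joined (talking)" if voice_seconds > 5.0 else "joined (quiet)"

# Three observed intervals: (max member distance in meters, voice seconds)
intervals = [(1.0, 30.0), (4.5, 0.0), (12.0, 0.0)]
labels = [classify_interval(d, v) for d, v in intervals]
# -> ["joined (talking)", "separated", "left"]
```

A system like the one described could then watch for long runs of "separated" or "joined (quiet)" labels as the cue to introduce a social stimulus.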
Discussion
While I thought this research was interesting, I think the factors that determine social interaction in a particular group should be considered fixed for the group's stay at the museum. The dynamics behind social interaction are complex and deep-rooted. To truly make use of this data, I feel the authors would have needed to collect a great deal more information to identify those deep-rooted variables, such as race, age, income, nationality, and other demographics. Intervention based solely on the proximity and voice activity of group members, as observed for 10 minutes, is insufficient either to justify intervening in the group's social interaction or to justify filing a particular group under a specific profile.
Tuesday, April 5, 2011
Final Project Proposal
For the final project I am planning to implement the web app I presented for the second project. I will most likely revise the design while implementing it, so not all features will necessarily be in the final version. There are several elements from other people's designs that I liked as well, so I may incorporate those. The goal will be to create a site with as few pages as possible that still does everything necessary to manage the class, while being as intuitive as possible. This is a fairly lofty goal, however, so rather than compromise quality, I may choose to leave some pages or features conceptual-only in the prototype. Currently I am working by myself, but I may do the project with Angel Narvaez (we are doing the ethnography together).
Paper Reading #18: Personalized user interfaces for product configuration
Title: Personalized user interfaces for product configuration
Authors: Alexander Felfernig, Monika Mandl, Juha Tiihonen, Monika Schubert, Gerhard Leitner
Venue: IUI '10: Proceedings of the 15th international conference on Intelligent user interfaces
Summary
The authors of this paper present a system for improved product configuration. The system is aimed mostly at customers and is designed to increase user satisfaction with the product they decide on. The premise behind this research is that there are frequently too many alternatives to a product for users to explore all of them and find the one they want themselves. Instead, the system discussed allows the user to configure their choices before it recommends a product to them. While the abstract and introduction are much more general, the products used throughout the paper as examples are mobile phones and the various subscriptions users can purchase for them.
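The core "configure, then recommend" flow can be sketched as filtering a catalog against user-stated constraints. This is only a toy stand-in for the paper's knowledge-based approach; the phone data and feature names are invented, and the real system reasons over a much richer constraint model.

```python
# Toy sketch of constraint-based product configuration: the user states
# requirements, and the system narrows a catalog to matching products
# before recommending one. The catalog below is entirely made up.

phones = [
    {"name": "Phone A", "price": 199, "camera": True,  "keyboard": False},
    {"name": "Phone B", "price": 99,  "camera": False, "keyboard": True},
    {"name": "Phone C", "price": 149, "camera": True,  "keyboard": True},
]

def configure(catalog, max_price, **required):
    """Return products within budget that satisfy every required feature."""
    return [p for p in catalog
            if p["price"] <= max_price
            and all(p.get(feat) == val for feat, val in required.items())]

matches = configure(phones, max_price=150, camera=True)
# -> only "Phone C" survives both the budget and camera constraints
```

The interesting part of the paper is what happens when `matches` comes back empty or too large, which a filter this naive cannot handle.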
Discussion
While there wasn't a huge amount of actual math, there was a considerable amount of technical jargon and database language in the paper. This made it a bit tedious to read, and difficult to summarize the exact process by which the authors employ their algorithms. Regardless, this sort of work is not new, though the authors claim that knowledge-based configuration is not available in commercial systems. Nearly every major mobile phone website, such as those for Verizon or AT&T, already has a section that helps users peruse products in a streamlined fashion.
Paper Reading #17: A natural language interface of thorough coverage by concordance with knowledge bases
Title: A natural language interface of thorough coverage by concordance with knowledge bases
Authors: Yong-Jin Han, Tae-Gil Noh, Seong-Bae Park, Se Young Park, Sang-Jo Lee
Venue: IUI '10: Proceedings of the 15th international conference on Intelligent user interfaces
Summary
This paper is about improving the use of natural language interfaces by converting queries into a formal language acceptable to the system. While the authors are talking about a broader range of systems, this work is very similar to search engines such as Google. Essentially, when a user enters a query in natural language, as they would in Google, they are presented with a series of recognized keywords from the query. The user then reviews the system's interpretation of their query to make sure it makes sense. Since the query is now written purely in a formal language the system understands, results are much more predictable and accurate.
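A drastically simplified version of the idea looks like matching question tokens against a known vocabulary and echoing the formal interpretation back to the user. The vocabulary and the `property:`/`entity:` query notation here are invented for this sketch; the paper's mapping to its knowledge base is much more sophisticated.

```python
# Illustrative only: turning a natural-language question into a formal
# query by recognizing known keywords from a small knowledge-base
# vocabulary. The interpretation would then be shown to the user for
# confirmation before execution. The vocabulary and notation are invented.

VOCAB = {
    "capital": "property:capital",
    "france": "entity:France",
    "population": "property:population",
}

def to_formal_query(question: str):
    """Map recognized tokens to formal terms; unknown words are dropped."""
    tokens = question.lower().replace("?", "").split()
    return [VOCAB[t] for t in tokens if t in VOCAB]

query = to_formal_query("What is the capital of France?")
# -> ["property:capital", "entity:France"]
```

The user-confirmation step matters precisely because of what this sketch throws away: any word outside the vocabulary silently disappears from the query.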
Discussion
The concepts discussed in this paper are very similar to autocomplete on Google searches. However, as seen in the image above, queries made through this system are configured to be very precise, and are probably more intended for knowledge base systems such as Watson. Ultimately it comes down to what sort of database you are searching. In this paper the authors are clearly searching a highly structured database, whereas Google has to search a database that looks more like a heap.
While this was a good idea for these systems, I don't feel it was a particularly original or creative solution. Most computer scientists, if presented with this problem, would probably arrive at a similar solution on their own. Additionally, the methods employed by Google and Watson are clearly both excellent already, so to some extent this is reinventing the wheel, and not really new research.
Thursday, March 24, 2011
Paper Reading #16: Mixture model based label association techniques for web accessibility
Reference Information
Title: Mixture Model based Label Association Techniques for Web Accessibility
Authors: Muhammad Asiful Islam, Yevgen Borodin, I. V. Ramakrishnan
Venue: UIST '10: Proceedings of the 23rd annual ACM symposium on User interface software and technology
Summary
With the continued growth of the internet, it becomes increasingly necessary for blind users to have access to it. The dominant method of web access for people with impaired vision is audio-based browsing. Browsers designed for this attempt to associate text labels with relevant elements on the page, and then read the labels to the user. This can range from very easy to extremely difficult, depending on how the labels are associated in the HTML code, or whether there are labels at all.
This article proposes a solution to this issue that uses the authors' Finite Mixture Model algorithm to associate text with nearby elements. Elements that have no labels or candidate labels are assigned labels from a database, with the most likely match calculated by a similar algorithm. Once the key elements on the page have labels, they can be read to the user to describe what is on the page.
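One of the cues such a model can combine is simple geometric distance between a form element and candidate text. The sketch below uses only that one cue and picks the nearest candidate; the paper's mixture model instead weighs several cues probabilistically, and the coordinates here are invented.

```python
# A much-simplified stand-in for the label-association idea: score each
# candidate text as a label for a form element by geometric distance and
# pick the closest. The real Finite Mixture Model combines multiple cues;
# the element positions below are invented for illustration.
import math

def nearest_label(element_xy, candidates):
    """candidates: list of (text, (x, y)) pairs; returns the closest text."""
    def dist(c):
        (x1, y1), (x2, y2) = element_xy, c[1]
        return math.hypot(x1 - x2, y1 - y2)
    return min(candidates, key=dist)[0]

# An input field at (100, 50) with two candidate labels to its left:
label = nearest_label((100, 50), [("Email:", (60, 52)), ("Password:", (60, 90))])
# -> "Email:", the geometrically closer candidate
```

Distance alone fails on dense or unusually laid-out forms, which is presumably why the paper mixes it with other evidence instead of using it directly.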
Discussion
The article was interesting, and the approach is without a doubt effective at labeling elements, but I feel that the entire solution needs to be reworked. Reading small parts of pages designed to be visible to the user is a poor substitute for actually seeing the page, and I have a feeling that getting anything done with an audio-based browser, regardless of how well labels are associated, would take a very long time.
Instead, perhaps one of the substitute-sight technologies would be a better place to spend research money in this area. I know there is ongoing work to map visual elements to a person's tongue. While a bit awkward to use, perfecting a technique along those lines might be the best approach for blind users.
Tuesday, March 22, 2011
Paper Reading #15: TurKit: human computation algorithms on mechanical turk
Reference Information
Title: TurKit: human computation algorithms on mechanical turk
Authors: Greg Little, Lydia B. Chilton, Max Goldman, Robert C. Miller
Venue: UIST '10: Proceedings of the 23rd annual ACM symposium on User interface software and technology
Summary
This paper is about the integration of human computation into software projects. As described in the article, Mechanical Turk is an on-demand source of human computation: workers, many in developing countries, are paid to answer queries relating to tasks such as text recognition and CAPTCHAs. The solution is implemented as a JavaScript API that can be called by other programs, similar to how a library works. The system has already been used for a number of applications, including various psychological research studies.
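As I understand it, a key trick in TurKit is memoizing expensive human tasks so a script can be re-run from the top without re-posting work that was already answered. The sketch below illustrates that pattern in Python rather than TurKit's JavaScript, and `ask_worker` is a fake stand-in for posting a real Mechanical Turk task.

```python
# Hedged sketch of the memoization idea behind re-runnable human-computation
# scripts: results of human tasks are recorded, so re-running the script
# replays recorded answers instead of posting the task again.
# ask_worker is a fake placeholder, not the Mechanical Turk API.

memo = {}

def ask_worker(prompt: str) -> str:
    """Pretend to post a task; a real system would block on a worker's answer."""
    return f"answer to: {prompt}"

def once(key: str, prompt: str) -> str:
    """Run the human task only if no result is recorded under this key."""
    if key not in memo:
        memo[key] = ask_worker(prompt)
    return memo[key]

first = once("q1", "Transcribe this image")
again = once("q1", "Transcribe this image")  # replayed from memo, not re-posted
# first == again, and only one "task" was ever issued
```

This is what lets a human-computation script be written as ordinary sequential code even though each human step may take minutes or hours.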
Discussion
I liked this idea, even if it was a bit unorthodox. Beyond research and text recognition, however, I don't see a huge number of applications for this. There aren't a whole lot of tasks that computers are unable to perform but that are easy for the average person to perform.
Paper Reading #14: A framework for robust and flexible handling of inputs with uncertainty
Reference Information
Title: A framework for robust and flexible handling of inputs with uncertainty
Authors: Julia Schwarz, Scott E. Hudson, Jennifer Mankoff, Andrew D. Wilson
Venue: UIST '10: Proceedings of the 23rd annual ACM symposium on User interface software and technology
Summary
This paper is about addressing the natural error in newer user input devices such as touch screens and pointers. While these devices convey input in a very error-prone manner, programs typically convert their input to boolean values almost immediately. This results in a loss of information that is not ideal.
To address this, the authors propose scoring events on a continuous (decimal) scale, similar to how analog-to-digital converters work. The event is treated as true or false depending on a threshold applied to that value, but the information about what exactly happened is maintained, allowing additional logic to interpret the event.
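The idea of deferring the boolean decision can be sketched as follows. The linear confidence model and the 0.5 threshold are my own assumptions for illustration; the paper's framework propagates uncertainty through the interface in a more principled way.

```python
# Sketch of delayed-boolean input handling: keep a continuous confidence
# score for an input event instead of collapsing it to true/false right
# away, so later logic can still inspect how certain the event was.
# The scoring formula and threshold below are invented assumptions.

def touch_score(distance_from_center: float, radius: float) -> float:
    """Confidence that a touch hit a circular button, clamped to [0, 1]."""
    return max(0.0, 1.0 - distance_from_center / radius)

score = touch_score(distance_from_center=8.0, radius=20.0)
pressed = score >= 0.5   # the boolean decision, taken as late as possible
# A touch 8px off-center of a 20px button scores 0.6: counted as a press,
# but downstream code can still see it was an uncertain one.
```

With the raw score preserved, an interface could, for example, ask for confirmation on presses that barely cleared the threshold rather than silently committing to them.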
Discussion
I found the concept behind this idea interesting; however, the mouse is just as uncertain as pointers and touch screens. The style of input hasn't changed: in all three cases you are essentially pointing to a location on screen and triggering an event. I wonder how useful this research really is, considering that the mouse works just fine and touch screens appear to be doing quite well too. If there is an issue with touch interfaces, it is that a finger is too big a pointing instrument compared to a mouse or a pointer, not that the style of interaction has changed.