Yahoo/MIT EECS HCI-IR Seminar Series

Fall 2008-Spring 2009

The Yahoo/MIT EECS HCI-IR Seminar Series is a monthly series of speakers on topics at the intersection of human-computer interaction and information retrieval.  Topics of interest include novel interaction techniques, interactive information retrieval, exploratory search, information visualization, and field studies and user studies of information retrieval needs.  The seminar series is hosted by MIT EECS and sponsored by Yahoo.

Seminars are generally held on the first Tuesday morning of each month, in the Kiva/Patil Seminar room (32-G449).  To receive announcements about the seminar, add yourself to the HCI Seminar mailing list, or contact Rob Miller with questions.

September 16, 2008

Geography in Web Search
Rosie Jones, Yahoo!

November 4, 2008

Set Retrieval 2.0
Daniel Tunkelang, Endeca

February 3, 2009

Augmented Social Cognition: Using Web2.0 technology to enhance the ability of groups to remember, think, and reason
Ed Chi, PARC

March 3, 2009

Universal Web Search Relevance
Belle Tseng, Yahoo!

April 3, 2009

The Web Changes Everything: How Dynamic Content Affects the Way People Find Online
Jaime Teevan, Microsoft Research

May 5, 2009

Putting our digital information in its place: Lessons learned from fieldwork and prototyping in the Keeping Found Things Found project
William Jones, University of Washington

Speaker: Rosie Jones
Speaker Affiliation: Yahoo
Host: Rob Miller
Host Affiliation: MIT CSAIL

Date: 9-16-2008
Time: 11:00 AM - 12:00 PM
Refreshments: 10:45 AM
Location: 32-G449 (Kiva)

Web search results are typically based on the user's search query,
without taking other contextual information into account. However,
people often conduct web searches with a specific location in mind,
for example searching for an "Armenian restaurant in Cambridge". If
they can't find what they are looking for, they may be flexible about
the location, considering, say, an Armenian restaurant in Boston, or
flexible about the task, considering an alternative cuisine. In general,
considering geographic features in addition to text features can aid
with finding relevant information. However, understanding the trade-offs
between topical and geographic relevance may be more subtle, with the
trade-off depending on the topic itself. The collective behavior of web
searchers can be used to detect typical geographic profiles for a topic.
Based on these findings, we propose a more flexible approach to web
search, in which we prefer a ranking with results close to the user's
location when this will best satisfy the user's information need.
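
As a rough illustration of the trade-off described above, the sketch below
combines a topical relevance score with a distance penalty whose strength
depends on the topic. It is not the speaker's method: the scoring formula,
the LOCALITY weights, and all names are hypothetical and chosen only to show
the idea.

```python
# Hypothetical per-topic locality weights: how strongly searchers for a
# topic tend to prefer nearby results. In practice such a profile might be
# estimated from aggregate search behavior; the values here are made up.
LOCALITY = {"restaurant": 0.05, "encyclopedia": 0.0}


def score(topical, distance_km, topic):
    """Topical relevance minus a topic-dependent penalty per kilometer."""
    return topical - LOCALITY.get(topic, 0.0) * distance_km


def rank(results, topic):
    """Order results by combined score, best first."""
    return sorted(
        results,
        key=lambda r: score(r["topical"], r["distance_km"], topic),
        reverse=True,
    )


# Two illustrative results: a strong topical match far away and a slightly
# weaker match close to the user.
results = [
    {"name": "far_exact_match", "topical": 0.9, "distance_km": 10.0},
    {"name": "near_close_match", "topical": 0.8, "distance_km": 1.0},
]

restaurant_order = [r["name"] for r in rank(results, "restaurant")]
encyclopedia_order = [r["name"] for r in rank(results, "encyclopedia")]
```

With these made-up weights, the nearby result ranks first for the
location-sensitive topic, while the stronger topical match ranks first for
the topic with no locality preference.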

Speaker Bio:

Rosie Jones is a Senior Research Scientist at Yahoo!. Her research
interests include web search, geographic information retrieval, and
natural language processing. She received her PhD from the School of
Computer Science at Carnegie Mellon University under the supervision of
Tom Mitchell, where her doctoral thesis was titled "Learning to Extract
Entities from Labeled and Unlabeled Text". She is co-organizing the WSDM
2009 Workshop on Web Search Click Data (WSCD09). She served on the
Senior PC for SIGIR in 2007 and 2008, and is a Senior Member of the ACM.

Speaker: Daniel Tunkelang
Speaker Affiliation: Endeca
Host: Rob Miller
Host Affiliation: MIT CSAIL

Date: 11-4-2008
Time: 11:00 AM - 12:00 PM
Refreshments: 10:45 AM
Location: Patil Conference Room (32-G449)

The earliest information retrieval systems were set retrieval systems,
also known as Boolean retrieval systems because they expected users to
enter queries as Boolean expressions. While set retrieval still survives
in professional search applications, it has been largely supplanted by
best-match or ranked retrieval familiar to anyone who has used web
search engines.

Best-match retrieval offers several advantages, the most salient being
that it does not require users to be professionally trained. But one of
its significant disadvantages is a loss of transparency. Users make a
leap of faith that the ranking algorithm works, and then resign
themselves to trying again when they are not satisfied with their search
results.

What we need is a retrieval approach that combines the best of both
worlds, providing transparency but not requiring professional training.
We find such an approach in the emerging field of human-computer
information retrieval (HCIR), which conceives information seeking as a
dialogue between the user and the system.

This presentation will outline the principles of information seeking as
a dialogue and walk through concrete examples that illustrate the
principles of HCIR. The foundation is an interactive set retrieval
approach that responds to queries with an overview of the user's current
context and an organized set of options for incremental exploration.
Contextual summaries of document sets optimize the system's communication
with the user, while query refinement options optimize the user's
communication with the system.

By enabling bidirectional communication between the user and the system,
we can address the inherent limitations of best-match approaches.
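
To make the interactive set retrieval idea concrete, here is a minimal
sketch. It is an illustration under assumptions, not Endeca's
implementation: the toy DOCS collection, the facet names, and the functions
retrieve and summarize are all hypothetical. A query returns an unranked
set, the system summarizes that set by facet counts, and each count is an
option the user can pick to narrow the set.

```python
from collections import Counter

# Hypothetical toy collection: each document has index terms plus metadata.
DOCS = [
    {"id": 1, "terms": {"search", "boolean"}, "year": 1975, "type": "paper"},
    {"id": 2, "terms": {"search", "ranking"}, "year": 2005, "type": "paper"},
    {"id": 3, "terms": {"search", "ranking", "web"}, "year": 2008, "type": "blog"},
    {"id": 4, "terms": {"visualization"}, "year": 2008, "type": "paper"},
]


def retrieve(docs, required_terms, filters):
    """Set retrieval: a document is either in the result set or not."""
    return [
        d for d in docs
        if required_terms <= d["terms"]  # all required terms are present
        and all(d[facet] == value for facet, value in filters.items())
    ]


def summarize(results, facets=("year", "type")):
    """Overview of the current set: document counts per facet value."""
    return {facet: Counter(d[facet] for d in results) for facet in facets}


# Step 1: the user queries; the system answers with a set and an overview.
results = retrieve(DOCS, {"search"}, {})
overview = summarize(results)

# Step 2: the user picks a refinement option shown in the overview.
refined = retrieve(DOCS, {"search"}, {"type": "paper"})

result_ids = [d["id"] for d in results]
refined_ids = [d["id"] for d in refined]
```

The overview is the system's side of the dialogue and the chosen filter is
the user's side; because membership in the set is explicit, the user can see
why each document was included.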

Speaker Bio:

Daniel Tunkelang is co-founder and Chief Scientist of Endeca, a provider
of enterprise information access solutions. He leads Endeca's efforts to
develop features and capabilities that emphasize user interaction and is
a leading industry advocate of dialog-oriented approaches to information
retrieval. He publishes The Noisy Channel, a blog about HCIR and
related issues.

Speaker: Ed Chi
Speaker Affiliation: PARC
Host: Rob Miller
Host Affiliation: MIT CSAIL

Date: 2-3-2009
Time: 11:00 AM - 12:00 PM
Refreshments: 10:45 AM
Location: Patil Conference Room (32-G449)

We are experiencing the new Social Web, where people share, communicate, commiserate, and conflict with each other. As evidenced by sites such as Wikipedia, Web 2.0 environments are turning people into social information foragers and sharers. Users interact to resolve conflicts and jointly make sense of topic areas from "Obama vs. Clinton" to "Islam."

PARC's Augmented Social Cognition researchers -- who come from cognitive psychology, computer science, HCI, sociology, and other disciplines -- focus on understanding how to "enhance a group of people's ability to remember, think, and reason". Through Web 2.0 systems like social tagging, blogs, Wikis, and more, we can finally study, in detail, these types of enhancements on a very large scale.

In this talk, we summarize recent PARC work and early findings on: (1) how conflict and coordination have played out in Wikipedia, and how social transparency might affect reader trust; (2) how decreasing interaction costs might change participation in social tagging systems; and (3) how computation can help organize user-generated content and metadata.

Speaker Bio:

Ed H. Chi is area manager and senior research scientist at Palo Alto Research Center's Augmented Social Cognition Group. He leads the group in understanding how Web2.0 and Social Computing systems help groups of people to remember, think, and reason. Ed completed his three degrees (B.S., M.S., and Ph.D.) in 6.5 years at the University of Minnesota, and has been doing research on user interface software systems since 1993. He has been featured and quoted in the press, including The Economist, Time Magazine, the LA Times, and the Associated Press.

He holds 19 patents and has written over 50 research articles. His best-known past project is the study of Information Scent: understanding how users navigate and understand the Web and information environments. He has also worked on computational molecular biology, ubicomp, and recommendation/search engines. He has won awards for both teaching and research. In his spare time, Ed is an avid Taekwondo martial artist, photographer, and snowboarder.

Speaker: Belle Tseng, Yahoo!
Date: Tuesday, March 3 2009
Time: 11:00AM to 12:00PM
Refreshments: 10:45AM
Location: Star Seminar Room 32-D463
Host: Rob Miller, MIT CSAIL
Contact: Michael Bernstein, (617) 253-0452

With the rapid spread of the Web throughout the world, the number of search users has increased dramatically across many geographic locations. Search engines now face the problem of providing search results to many countries. The Machine Learned Ranking (MLR) approach has shown success in web search. With the increasing demand to develop effective ranking functions for many countries (domains), we face a major bottleneck: insufficient training data to build a learned ranker for each domain.

In my talk, I will present two approaches to resolve this problem.

The first is a tree-based adaptation that takes a ranking function from one domain and tunes it with a small amount of training data from the target domain. The second approach is a Dynamic Bayesian Network click model that combines small amounts of training data with click data to build an unbiased estimate of search relevance. Finally, I will report our experiments evaluating the two approaches on a large dataset from the Yahoo! Search query logs, along with our findings.

Dr. Belle Tseng is a Senior Manager in the Web Search Ranking Department of Yahoo. She leads an R&D team of researchers with strong backgrounds in information retrieval and machine learning to improve the search relevance of Yahoo search engines across the world. Belle is an alumna of MIT, where she received her B.S. and M.S. in Mathematics and Electrical Engineering, and she received her Ph.D. from Columbia University. Before joining Yahoo, Belle spent four years as a Senior Research Staff Member at NEC Laboratories America, where she managed research projects on relational data mining and social network analysis. Prior to joining NEC, she spent seven years as a Research Staff Member at IBM T. J. Watson Research Center working on multimedia retrieval, personalization, and summarization.

Dr. Tseng has published over 100 technical papers in the areas of web search, multimedia understanding, stereoscopic systems, and social information analysis. She is a recipient of the NSF Fellowship, the IBM Invention Achievement Awards, and the NEC Technology Commercialization Award.

Speaker: Jaime Teevan, Microsoft Research
Date: Friday, April 3 2009
Time: 2:00PM to 3:00PM
Refreshments: 1:45PM
Location: Patil/Kiva Seminar Room 32-G449
Host: Rob Miller, MIT CSAIL
Contact: Michael Bernstein, (617) 253-0452

When you visit a colleague's Web page, do the new papers she's posted jump out at you? When you return to your favorite Web news site, is it easy to find the front page article you saw yesterday? The Web is a dynamic, ever-changing collection of information, and the changes can affect, drive, and interfere with people's information seeking activities. This talk will explore how and why people revisit Web content that has changed, and illustrate how understanding the association between change and revisitation might improve browser, crawler, and search engine design.

Speaker Biography:
Jaime Teevan is a researcher in the Context, Learning, and User Experience for Search (CLUES) group at Microsoft Research, Redmond, Washington. Dr. Teevan's research interests lie at the intersection of human-computer interaction, information retrieval, and machine learning. For her doctoral thesis, she developed the Re:Search Engine, a system that helps people return to information they have previously seen in a dynamic Web environment. She has also explored personalized search, the learning of probabilistic retrieval models from textual data, and techniques to combine search and navigation. She received a Ph.D. and S.M. from MIT and a B.S. in Computer Science from Yale University.

Speaker: William Jones, University of Washington
Date: Tuesday, May 5 2009
Time: 11:00AM to 12:00PM
Refreshments: 10:45AM
Location: Patil/Kiva Seminar Room 32-G449
Host: Rob Miller, MIT CSAIL
Contact: Michael Bernstein, x3-0452

Does place matter for digital information? If so, how? Research points to the importance of "place-like" senses of direction, context, connection and control when managing digital information. Support for place in the Personal Project Planner prototype begins with the idea that relevant information can be located with reference to a simple planning document. This document works as a light-weight, editable overlay to existing applications and the stores of information managed by these applications. A basic premise of the Planner is that effective management of personal information can leverage and emerge from informal planning and other everyday activities.

Speaker Biography:
William Jones is a Research Associate Professor in the Information School at the University of Washington, where he manages the Keeping Found Things Found group. He has published in the areas of personal information management (PIM), human-computer interaction, information retrieval, and cognitive psychology. Prof. Jones wrote the book "Keeping Found Things Found: The Study and Practice of Personal Information Management" and also edited the book "Personal Information Management" (with co-editor Jaime Teevan). He holds several patents relating to search and PIM from his work as a program manager at Microsoft in Office and in MSN Search. Prof. Jones received his doctorate from Carnegie Mellon University for research into human memory.