User-Directed Sketch Interpretation

Matt Notowidigdo & Robert Miller

Structured diagrams (e.g. flow charts, module dependency diagrams) are commonly created using an editor such as Visio or XFig. These applications are powerful and expressive, but they are cumbersome to use, and they make it difficult to communicate drawing styles and shape sizes. Recognizing the shortcomings of these user interfaces, recent research has focused on sketch understanding systems that recognize hand-drawn structured diagrams. These systems use stroke information collected on a tablet computer to recognize parts of the diagram as the user sketches them. While these systems are much more natural to use than menu- and mouse-driven editors, they deviate in subtle ways from an actual pen-and-paper interface. For example, adding an arrowhead to a line segment that was drawn much earlier will likely confuse the system, since the recognition uses temporal information. The user must delete the line segment and redraw it with the arrowhead.

Figure 1: Communicating drawing styles in Microsoft Visio.

UDSI (User-Directed Sketch Interpretation) is a new system for creating structured diagrams that is based on understanding hand-drawn sketches of structured diagrams. Unlike the existing systems that require devices that can capture stroke information while the user sketches the diagram, UDSI uses scanned images of sketches. The user presents a scanned pen sketch of the diagram to the UDSI system and guides the application's interpretation of the sketch. The final interpretation is a structured diagram that can be incorporated into technical documents or refined in an existing structured graphics editor. The user can therefore iteratively create the diagram using a pure pen-and-paper interface until she is satisfied with the style, layout, and position of the components of the diagram.

The power of this system is that the user can use the pen-and-paper interface where it is natural and convenient (e.g. sizing shapes, placing arrows, and routing line segments) and then can use a mouse and keyboard where it is more efficient (e.g. typing in text, selecting colors, formatting fonts).

The UDSI system combines standard algorithms from the machine vision literature for filtering, segmentation, and shape detection [2] with novel algorithms for extracting text regions and recognizing arrowheads. These algorithms produce (possibly conflicting) interpretations of the scanned sketch, which an elimination algorithm combines to produce not only the best candidate interpretation but also multiple alternative interpretations. These alternatives are presented to the user through a novel user interface that lets the user choose among the interpretations the system has generated.
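The idea of eliminating conflicting interpretations while retaining alternatives can be illustrated with a minimal Python sketch. This is not the UDSI elimination algorithm, whose details are not given here. It is a simple greedy scheme under stated assumptions: each recognizer emits candidates with a hypothetical confidence score, two candidates conflict when they claim the same line segment, and a rejected candidate is remembered as an alternative to the candidate that displaced it. The names Candidate, conflicts, and eliminate are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One recognizer's interpretation of part of the sketch (hypothetical)."""
    label: str            # e.g. "rectangle", "arrow", "text"
    score: float          # assumed recognizer confidence, higher is better
    segments: frozenset   # ids of the line segments this candidate explains


def conflicts(a: Candidate, b: Candidate) -> bool:
    """Assumption: two candidates conflict if they share any segment."""
    return bool(a.segments & b.segments)


def eliminate(candidates):
    """Greedy elimination: keep the best-scoring non-conflicting candidates.

    Returns (kept, alternatives), where alternatives maps each kept
    candidate to the rejected candidates that conflicted with it.
    """
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    kept = []
    alternatives = {}
    for cand in ranked:
        rivals = [k for k in kept if conflicts(cand, k)]
        if rivals:
            # Rejected, but offered to the user as an alternative.
            for k in rivals:
                alternatives[k].append(cand)
        else:
            kept.append(cand)
            alternatives[cand] = []
    return kept, alternatives


# Illustrative input: two readings of one shape, two readings of one stroke.
candidates = [
    Candidate("rectangle", 0.9, frozenset({1, 2, 3, 4})),
    Candidate("diamond", 0.6, frozenset({1, 2, 3, 4})),
    Candidate("arrow", 0.8, frozenset({5, 6})),
    Candidate("line", 0.7, frozenset({5})),
]
kept, alternatives = eliminate(candidates)
for k in kept:
    print(k.label, "alternatives:", [a.label for a in alternatives[k]])
```

In this sketch the kept candidates play the role of the best interpretation, and each kept candidate's list of alternatives is what a user interface could offer when the user asks for a different reading of that part of the diagram.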


[1] Matt Notowidigdo and Robert C. Miller. "Off-line Sketch Interpretation." Proceedings of AAAI Fall Symposium on Making Pen-Based Interaction Intelligent and Natural, October 2004.

[2] Matthew Notowidigdo. User-Directed Sketch Interpretation. MEng thesis, Massachusetts Institute of Technology, June 2004.