Scalable Visual Analysis of Patent and Scientific Document Collections

Applicants:

Professor Dr. Thomas Ertl, Stuttgart
Universität Stuttgart
Institut für Visualisierung und Interaktive Systeme (VIS)
Stuttgart

Professor Dr. Hinrich Schütze, Ph.D., Stuttgart
Universität Stuttgart
Institut für Maschinelle Sprachverarbeitung
Lehrstuhl Theoretische Computerlinguistik
Stuttgart

Project:

Scalable Visual Analysis of Patent and Scientific Document Collections

Summary:

In addition to patent documents, other information sources, such as scientific literature, play an important role in many intellectual property (IP) analysis tasks. Of particular interest to knowledge workers is the detection of shifts in scientific research, of emerging trends, and of promising technological innovations, which we will refer to as white spot analysis. These tasks, however, are almost impossible to perform in a fully automated way, since interpreting the results requires the background knowledge and experience of human users. The main goal of this proposal is, therefore, to develop a new visual analytics approach for white spot detection that combines sophisticated text analysis techniques with interactive visual methods to facilitate sensemaking in the field of scientific literature. Subgoals include identifying topic shifts and changes in citation networks over time, with provenance tracking to help users understand the uncertainties of their analysis and to support collaborative scenarios on high-resolution displays as well as web-based interfaces. These components will use proactive visual guidance toward potentially interesting findings (white spots), relieving users of the burden of manually exploring large, complex, and high-dimensional document spaces. By combining the expertise of both groups in text analysis and interactive visualization with the experience gained during the previous funding period, this project aims to develop support systems for strategic tasks in the intellectual property domain that improve on the current state of the art in topic and citation analysis as well as white spot identification.