25 Aug 2007
Computation Institute Disciplinary Deep Dive on Language and Computation
Language and Computation 3-D Agenda
Topic 1: Text-mining and natural-language processing: computational problems
Computer-aided analysis of natural texts (books, scientific articles, voice recordings) and text generation.
When we study the literature surrounding a given scientific topic, we
typically embark on a series of tasks, each of variable difficulty. We (A)
decide what is relevant to the topic, (B) determine where to look for
information, (C) identify and capture pertinent statements scattered through
volumes of unrelated passages, and (D) synthesize disparate pieces of
knowledge into a coherent whole, possibly through resolution of conflicts
among myriad noisy statements and between textual and raw experimental data.
To achieve a deeper understanding of the literature and the data therein, we
may (E) discover novel connections between seemingly unrelated phenomena and
generate testable hypotheses. A still higher level of comprehension might
include (F) the construction of advanced logical or computational inferences,
in much the same way as a mathematician might conceive of a theorem while
thinking about corollaries formulated by colleagues.
While computers can perform many of these individual tasks at some level,
humans remain the main synthesizers of information: we combine the various
parts and slices of understanding to generate the big-picture view.
Extracting multi-level meaning from text passages is still a task that
humans do best, but recent advances in computational technology suggest
that computers may participate too.
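As a rough illustration of how a computer might take part in tasks (B) and (C), the sketch below ranks a toy corpus by overlap with a query and then pulls simple statements out of the matching sentences. Everything in it is an assumption made for illustration: the three sentences are invented, and the names DOCUMENTS, retrieve, extract, and RELATION are hypothetical. Real systems use far richer models than word counts and a single hand-written pattern.

```python
import re
from collections import Counter

# Toy corpus: invented sentences, used only for illustration.
DOCUMENTS = [
    "Gene A activates gene B in yeast cells.",
    "The weather in Chicago was mild this August.",
    "Protein X inhibits gene B under stress.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def retrieve(query, documents):
    """Task (B): rank documents by how often the query words occur in them."""
    query_terms = set(tokenize(query))
    scored = []
    for doc in documents:
        counts = Counter(tokenize(doc))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scored.append((score, doc))
    # Highest score first; the sort is stable, so ties keep corpus order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]


# Task (C): one hand-written pattern of the form
# "<two-word subject> activates|inhibits <two-word object>".
RELATION = re.compile(r"(\w+ \w+) (activates|inhibits) (\w+ \w+)", re.IGNORECASE)


def extract(documents):
    """Collect (subject, relation, object) triples from the given sentences."""
    triples = []
    for doc in documents:
        for subj, rel, obj in RELATION.findall(doc):
            triples.append((subj, rel.lower(), obj))
    return triples


relevant = retrieve("gene B", DOCUMENTS)
statements = extract(relevant)
```

Here retrieval discards the unrelated weather sentence, and extraction turns the two remaining sentences into structured triples. Synthesis and hypothesis generation, tasks (D) through (F), are deliberately left out; they are the parts this kind of pattern matching does not address.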
Computer scientists tend to assign tasks (A-B) to the discipline of
information retrieval (IR) and task (C) to information extraction (IE),
although publications with different definitions certainly exist. Text
mining is typically associated with information retrieval, extraction, and
synthesis (tasks B-E), with stress on discovering novel knowledge (E).
Note that steps (A-F) are also
the bread-and-butter of the artificial intelligence (AI) research community,
while (A-C) are also the territory of natural language processing (NLP) and
Topic 2: Problems in computational linguistics
Topic 3: Origin and evolution of language
Topic 4: Computational psychology of language
Topic 5: Language and computational neuroscience
Normal neurobiology of language and the effects of disease on this system.
Topic 6: Language and complexity
Other Topics, Details TBD
We are considering a further set of meetings, with the precise topics to be defined.