The University of Maryland (UMD) is part of a multi-institutional team tasked with building a powerful set of language technologies that can unlock information that has previously been unsearchable, and thus unfindable.
The four-year project, funded by a $14.4M grant from the Intelligence Advanced Research Projects Activity (IARPA), is expected to produce a language processing system that lets a user type a query in English and receive results in English, even if the content is available only in a lesser-known language such as Croatian.
The project involves faculty, postdocs and students from Maryland, Columbia University, Yale University, the University of Cambridge, and the University of Edinburgh. Columbia is the lead institution, with Kathleen McKeown, the founding director of Columbia’s Data Science Institute, serving as principal investigator.
The interdisciplinary research, already underway, draws on experts in natural language processing, speech processing, and information retrieval.
“Today’s internet brings us closer together than ever before, but the diversity and richness of human language remains a challenge,” says Douglas Oard, a professor at the UMD College of Information Studies, who is heading up the UMD research team. “Computers can be trained to transform human language in many useful ways, but today that training process is still too expensive to affordably be applied to all the world’s languages, and too dependent on the artisanal skills of a small number of experts.”