Speaker "Tatiana Erekhinskaya" Details

Topic

Extracting Profiles for People, Organizations and Beyond: Summarization on Steroids.

Abstract

“How on earth am I going to read it all?” Is that what you think when you need some key information from a mountain of documents? At Lymba, we are developing a knowledge extraction tool that automatically creates concise profiles for various entities. Such template-based summarization can satisfy different knowledge needs in multiple domains. For example, professional profiles for people include education and employment history, key projects, awards, publications, key colleagues, research interests, etc. For organizations, the profiles include information of interest such as products, management, legal history, corporate culture and financial information. In the medical domain, the same approach can extract patient profiles with key health-related information, as well as publication profiles describing clinical trial settings, socio-demographics and health problems, interventions and outcomes. Furthermore, this approach has been applied to grant proposals and scientific publications, reducing the information burden on researchers. Attendees will learn about use cases in multiple domains, the tools used (including open source ones) and best practices for a successful project.

Profile

Tatiana Erekhinskaya is a Research Scientist and Product Manager at Lymba Corporation. She received a PhD in Computer Science from the University of Texas at Dallas with a dissertation on probabilistic models for text understanding. Tatiana has been working in Natural Language Processing for more than 10 years. In her career, she has acted as a technical leader on a broad range of projects, including misspelling-robust syntactic parsing for Russian, the first syntax-based opinion mining for Russian, and more recently semantics-driven projects for English in the medical domain, national security and enterprise applications. One of her latest projects is knowledge extraction from Chinese texts. Her primary research areas are deep semantic processing and big data, with a special emphasis on the medical domain.