The Elevator Pitch
Cognition's Semantic Natural Language Processing (NLP) technologies add word and phrase meaning and understanding to computer applications, providing a technology and/or end-user with actionable content based upon semantic knowledge. This understanding delivers much higher precision and much higher recall at the same time when retrieving salient data from the universe of possible results. Cognition's Semantic NLP™ makes technologies and applications more human-like in their understanding of language, thereby resulting in more robust applications, greater user satisfaction and new capabilities available for exploitation. On the Web in particular, powering applications with Cognition's semantic understanding technology drives these applications ever closer to Web 3.0 (the semantic Web).
Cognition - Giving technologies new meaning.™
Introduction
Cognition Technologies, Inc. ("Cognition") is a next-generation Semantic Natural Language Processing (NLP) company based in Culver City, CA.
What is Semantic NLP?
- Semantics is the sub-field of linguistics that is devoted to the study of meaning, as expressed by words, phrases, sentences, and even larger units of speech or text.
- Natural Language Processing (NLP) is a sub-field of artificial intelligence and computational linguistics. It studies the problems of automated generation and understanding of natural human languages by computers.
- Cognition's Semantic NLP™ is technology that "understands" word and phrase meanings within context in modern computer applications. Cognition's mission is to make its clients' technologies and applications more human-like in the understanding of language and more profitable.
Cognition's Semantic NLP has been in development for over 23 years by Dr. Kathleen Dahlgren, Cognition's co-founder and CTO, and a team of linguists and computer scientists. Cognition's technology employs a mix of linguistics and mathematical algorithms which has, in effect, taught the computer the meanings of virtually all the words and frequent phrases within the common English language. Semantic Natural Language Processing is superior to common pattern matching that is found in most search engines and text-interaction tools because it focuses on the understanding of word and phrase meanings within context. No other commercially available natural language processing technology comes close to Cognition in its breadth and depth of understanding the English language.
Statistics
Cognition's Semantic NLP technology contains one of the world's largest computational dictionaries, also known as a Semantic Map.
This Semantic Map encodes a wealth of morphological, syntactic and semantic information about the words of the English language and their relationships to each other. These resources were created and reviewed by lexicographers and linguists over a span of twenty-four years.
- Word Stems -- [510,000]: Each word (including phrases and acronyms) in the Cognition lexicon is stored in a base (uninflected) form.
- Word Senses -- [540,000]: Key meanings of semantically ambiguous terms (such as “strike” meaning “hit” versus “strike” meaning “labor dispute”) are stored as individual sub-entries within the word’s lexical entry. Importantly, each sub-entry may have distinct morphological, syntactic and semantic features, as well as distinct ontology/synonymy relationships in the semantic map. Different meanings are distinguished, but the same meaning may have various parts of speech (as in “love”, which can be both a noun and a verb in the same meaning).
- Taxonomy -- [8,000 nodes with 540,000 leaves]: All word meanings (senses) are placed in one or more positions in a semantic ontology. This allows Cognition’s Semantic NLP to reason from the general to the specific (e.g. knowing that one meaning of "tank" is a type of "container") and plays a significant role in the technology’s syntactic and semantic features.
- Meaning Thesaurus -- [75,000 groupings]: Word meanings with a fair degree of semantic equivalence are associated with each other (e.g. associating one sense of "car" with "automobile"). These relationships include synonymy but may go across syntactic categories (e.g. associating conceptually related nouns, adjectives and verbs, etc.). The parts of speech are marked, so that if it is desired to restrict a paraphrase relationship to a given part of speech, that can be done.
- Sense Contexts -- [8,300,000 contexts for disambiguation of 17,000 ambiguous word stems]: Terms that may co-occur with ambiguous stems and aid in their disambiguation are stored for each sense of such stems.
- Morphology Features -- [199 patterns]: Syntactic categories with regular and irregular inflectional and derivational morphology are encoded for each sense in the lexicon or identified by the morphology processor. This allows Cognition’s Semantic NLP to recognize tens of millions of word forms and associate them with their appropriate stems, as in "babies"-"baby", "re-run"-"run", etc.
- Syntax Features -- [3,246 patterns]: Syntax features spell out the syntactic sub-categorization frames for words. In total, 3,215,335 morphological and syntax features are encoded in the lexicon.
- Selectional Restrictions -- [45,812 encodings]: Ontological restrictions on arguments (e.g. that "vehicles" are the typical objects of one sense of the verb "drive") are stored in the sense entries for words that take arguments.
- Acronyms -- [19,122]: Acronyms are stored with their spell-outs. Each acronym sense may have many spell-outs. Different spell-outs for ambiguous acronyms are encoded in separate senses.
- Phrases -- [192,000]: Multi-word expressions are stored with their own lexical features and semantic relationships. To the extent that they are compositional, the particular senses of the individual words in each phrase may be indicated. (Note: Cognition’s Semantic NLP "reader" module recognizes and regularizes additional phrases, such as names, dates, phone numbers, etc., that may not be stored in the lexicon, yielding an indefinite number of phrases that can be recognized.)
- Synographs -- [17,000]: Common and/or dialect-dependent alternate spellings are stored for word stems.
- "Naive" Semantic Features -- [Approximately 50 feature types, 540,684 encodings]: A variety of commonsense knowledge, such as "cats have tails", "hands have five fingers", "the function of a chair is sitting", "the consequence of buying X is owning X", etc., may be stored with individual word senses.
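To make a few of these resources concrete, the following is a minimal sketch of how word senses, sense contexts and taxonomy positions could fit together. It is an illustration only and not Cognition's implementation: the names Sense, LEXICON, TAXONOMY, disambiguate and is_a are hypothetical, the data is a toy sample, and choosing a sense by counting overlapping context terms is an assumed scoring rule that the description above does not specify.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sense:
    """One meaning of a word stem (hypothetical structure for illustration)."""
    gloss: str                       # short description of this meaning
    parents: tuple = ()              # taxonomy nodes this sense is placed under
    contexts: frozenset = frozenset()  # co-occurring terms that signal this sense

# Toy taxonomy: child node -> parent node (assumed sample data).
TAXONOMY = {
    "container": "artifact",
    "vehicle": "artifact",
    "artifact": "entity",
    "action": "event",
    "labor_dispute": "event",
    "event": "entity",
}

# Toy lexicon: word stem -> list of senses (assumed sample data).
LEXICON = {
    "strike": [
        Sense("hit", ("action",), frozenset({"ball", "blow", "fist"})),
        Sense("labor dispute", ("labor_dispute",),
              frozenset({"union", "workers", "wages"})),
    ],
    "tank": [
        Sense("storage container", ("container",),
              frozenset({"water", "fuel", "gallons"})),
        Sense("armored vehicle", ("vehicle",),
              frozenset({"army", "armor", "battle"})),
    ],
}

def disambiguate(stem, sentence_words):
    """Return the sense whose stored contexts overlap most with the sentence.

    Ties go to the sense listed first in the lexicon.
    """
    words = {w.lower() for w in sentence_words}
    return max(LEXICON[stem], key=lambda s: len(s.contexts & words))

def is_a(sense, node):
    """Walk up the taxonomy from each parent of the sense, looking for node."""
    for current in sense.parents:
        while current is not None:
            if current == node:
                return True
            current = TAXONOMY.get(current)
    return False

strike = disambiguate("strike", "the union called a strike over wages".split())
print(strike.gloss)                # labor dispute

tank = disambiguate("tank", "fill the tank with fuel".split())
print(tank.gloss)                  # storage container
print(is_a(tank, "container"))     # True
print(is_a(tank, "event"))         # False
```

Keeping each sense as its own record mirrors the description of senses as sub-entries with their own features and ontology relationships: once a sense has been selected, general-to-specific reasoning (such as "this tank is a container") only needs that sense's taxonomy positions.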
Cognition's place in the world related to the "Semantic Web" (Web 3.0) and Google
Cognition employs semantic technology to delve into the meaning of words and phrases. Unlike others who are trying to make the Semantic Web a reality through hand-tagging, Cognition applies its Semantic NLP to other technologies, such as Web Search, to give these products and services differentiation and a competitive edge.
"We look at what we're doing as a significant component to the Semantic Web," said Scott Jarus, Cognition's CEO. "Our focus on semantically enhancing other technologies means we're not competing with Google, Yahoo! or other consumer Search engines. Indexing the entire World Wide Web ourselves is not currently on our business roadmap. However, we might become a semantic component of someone else's application which may index deep content on the Web similar to the examples you can see on our Website."
Management
Bill Collins
Chairman
Bill is chairman of Cognition and a member and past president of the Tech Coast Angels, the dominant source of angel funding in southern California. Bill started at Intel Corporation, quickly becoming a lead sales person on the IBM account and helping to establish the Intel-IBM relationship. He was a key executive for International Rectifier (NYSE: IRF), ramping the company from $60M to over $700M, helping to take IRF’s technology from the early adopter phase to sustained market leadership. He has a successful venture portfolio in semiconductor, electronics, enterprise software and Internet segments. He is a certified corporate director, and a guest lecturer at Caltech and USC. He holds a BSEE from Clarkson University.
Kathleen Dahlgren, PhD
CTO / Founder
Dr. Kathleen Dahlgren is the Founder and Chief Technology Officer of Cognition Technologies. She began her career as a professor of computational linguistics at Pitzer College of the Claremont Colleges and then worked for IBM at their Los Angeles Scientific Center, focusing on building a "natural language understanding system." Dr. Dahlgren has a Ph.D. in Linguistics and a post-doctorate in Computer Science from the University of California, Los Angeles. She has published a number of scholarly articles on the subjects of linguistics and computer science, and is the author of Naive Semantics for Natural Language Understanding. She is the co-author of Cognition's seminal patent (1998), and she received the Small Business Innovation Award from the U.S. Army in 1995. Currently, she is also an adjunct professor of Linguistics at the University of California, Los Angeles.
Daniel Albro, PhD
Chief Scientist
Dr. Daniel Albro is the Chief Scientist of Cognition Technologies. He received his Ph.D. in Computational Linguistics from the University of California, Los Angeles, and his Bachelor of Science in Computer Science and Computer Engineering from the Massachusetts Institute of Technology. His research outside of Cognition Technologies has involved finite state phonology, efficient chart parsing of n-multiple context-free grammars (MCFGs), the intersection of MCFGs with weighted finite state machines, machine learning of phonological grammars, and implementation of phonological frameworks. The machine learning work involved data compression via the Minimum Description Length framework. Dr. Albro continued his development work on compression techniques at Cognition Technologies, where it has been used to create a massively scalable indexing architecture for the Company's Search engine. Currently, Dr. Albro heads Cognition Technologies' Natural Language Processing group.