In recent decades, various communities have invested heavily in developing large lexical and knowledge resources. The lexical resources and computational linguistics community, on the one hand, has developed important and widely used lexical resources such as WordNet, FrameNet, VerbNet, and PropBank, and has created metamodels for representing computational lexicons (e.g. LMF). The knowledge engineering and knowledge representation communities, on the other hand, have a strong tradition of developing foundational ontologies (e.g. DOLCE) and general ontologies such as SUMO or CYC. The Semantic Web community has created the infrastructure that makes it possible to develop and maintain a large number of domain-specific ontologies (by means of standards such as OWL and RDFS, ontology editors such as Protégé or the NeOn Toolkit, etc.). Dedicated ontology search engines (such as OntoSelect, with 1,543 indexed ontologies, Swoogle, with over 10,000 ontologies, or Watson, with more than 8,300 indexed ontologies) have been built to give users access to the large number of ontologies available on the Web, e.g. in order to find an appropriate ontology to reuse in their own application and context.
While world knowledge clearly plays a crucial role in natural language processing, and vice versa, the interplay and relation between the two types of resources (lexical resources on the one hand and ontologies on the other) remain largely underexplored. There have certainly been a number of projects in which both traditions have met:
WordNet, for example, has been restructured (“sweetened”) according to the principles of formal ontology [Gangemi et al. 2003].
FrameNet has been linked to the Suggested Upper Merged Ontology (SUMO) [Scheffczyk et al. 2006a].
FrameNet has been represented using the OWL data model, which makes it possible to check annotations for inconsistencies [Scheffczyk et al. 2006b].
Initial models for associating linguistic information with ontologies have been developed in the Semantic Web community, namely the Lexical Information Resource (LIR) [Peters et al. 2006] and LexInfo [Buitelaar et al. 2009], both of which build on the Lexical Markup Framework (LMF).
Domain-specific framenets with links to ontologies have been created (e.g. BioFrameNet).
While these are good examples of projects in which both communities have met, there is still a long way to go to understand the relation and interplay between lexical and ontological resources and to develop best practices for how they can be applied, linked, or merged at the level of applications. In contrast to earlier work in the field at the intersection of KR and lexical semantics (e.g. the generative lexicon), we are now in the unique position of being able to explore these issues from a practical point of view by exploiting the large, concrete resources developed in the different communities.
Particular questions that could help us better understand the role of ontologies and their relation to lexical resources include the following:
What is the role and impact of domain-specific ontologies in lexical semantics, computational semantics, and natural language processing in general?
What are the benefits and established good practices for restructuring lexical resources according to ontological principles?
How can lexical resources guide the development of ontologies? How can the large number of ontologies available worldwide contribute to the development of lexical resources?
What is the relation between language and ontologies (if there is one at all)? Are ontologies language-independent human-engineered artifacts?
What are the main differences between KR formalisms and lexically inspired formalisms (ambiguity vs. non-ambiguity, degree of formalization, etc.)?
Can we enrich ontologies with linguistic information in a principled way? How could applications benefit from this?
Can ontologies provide the knowledge needed for NLP applications?
[Gangemi et al. 2003] - A. Gangemi, N. Guarino, C. Masolo, and A. Oltramari, “Sweetening WORDNET with DOLCE”, AI Magazine, Vol. 24, No. 3, pp. 13-24, 2003.
[Scheffczyk et al. 2006a] - J. Scheffczyk, A. Pease, and M. Ellsworth, “Linking FrameNet to the Suggested Upper Merged Ontology”, Proceedings of the Conference on Formal Ontology in Information Systems, 2006.
[Scheffczyk et al. 2006b] - J. Scheffczyk, C. F. Baker, and S. Narayanan, “Ontology-Based Reasoning about Lexical Resources”, Proceedings of the Workshop on Interfacing Ontologies and Lexical Resources for Semantic Web Technologies (OntoLex 2006), Genoa, Italy, pp. 1-8, 2006.