THE PROCESS OF CONSTRUCTING ONTOLOGICAL MEANING BASED ON CRIMINAL LAW VERBS

This study intends to account for the process involved in the construction of the conceptual meaning of verbs (#EVENTS) directly related to legal aspects of terrorism and organized crime based on the evidence provided by the Globalcrimeterm Corpus and the consistent application of specific criteria for term extraction. The selected 49 concepts have eventually been integrated in the Core Ontology of FunGramKB (Functional Grammar Knowledge Base), a knowledge base which is founded on the principles of deep semantics and is also aimed at the computational development of the Lexical Constructional Model (www.fungramkb.com). To achieve this purpose, key phases of the COHERENT methodology (Periñán Pascual & Mairal Usón 2011) are followed, particularly those which involve the modelling, subsumption and hierarchisation of the aforementioned verbal concepts. The final outcome of this

It is a well-known fact that the main problem in the construction of natural language understanding systems is usually found in the lack of a robust semantic knowledge base and a powerful inference component (Vossen 2003). Moreover, a key aspect in knowledge engineering is the design and construction of an ontology model under a series of well-founded guidelines, particularly when it is to be reused in different natural language processing (henceforth NLP) applications, e.g. document retrieval, information extraction, text categorization, etc.
Consequently, ontology structuring must be supported by some theory about the elements in the domain, their inherent properties and the way in which these elements are related to each other. To reach that purpose, the comprehensive theory of constructional meaning known as the Lexical Constructional Model FunGramKB (see Functional Grammar Knowledge Base at www.fungramkb.com): an online lexical conceptual knowledge base that integrates semantic and syntactic information for the creation of NLP applications (Periñán Pascual & Arcas Túnez 2004, 2005, 2007). The main advantage of this knowledge base is its capacity to combine linguistic knowledge and human cognitive abilities within the same integrated system. The concept-oriented interlingua which is used (COREL) serves to describe the properties of the different modules that make up FunGramKB at the cognitive level (Periñán Pascual & Arcas Túnez 2010). As a consequence, this knowledge base moves away from the traditional solutions based on surface semantics to offer a fully-fledged alternative in which linguistic information is grounded on conceptual structures representing human knowledge.3 However, the focus in this article is on the FunGramKB Ontology, which can be considered as the pivotal module for the whole architecture of the knowledge base. The Ontology, along with the Cognicon and the Onomasticon (see section 3), is presented as a hierarchical catalogue of the concepts that a person has in mind when talking about everyday situations and is also the repository where semantic knowledge is stored in the form of meaning postulates (Periñán Pascual & Arcas Túnez 2007: 198). The Ontology consists of a general-purpose module (Core Ontology) and several domain-specific terminological modules (Satellite Ontologies or Subontologies). With reference to the latter, in the last few years a research project has been carried out in order to create a terminological subontology based on the international cooperation against terrorism and organized crime (Globalcrimeterm) under the postulates of FunGramKB.4 This domain-specific ontology combines a narrow and, at the same time, fuzzy terminological scope with diverse interdisciplinary sub-fields. However, as I will explain in the following sections, this Subontology shares the same integrated structure as the Core Ontology, and both contain a well-structured body of concepts related to each other in an "IS-A" conceptual hierarchy. Furthermore, both types of ontologies distinguish between metaconcepts, basic concepts and terminal concepts; both have COREL as a common metalanguage for meaning representation, and both split the metaconcepts into three subontologies, each of which arranges lexical units of a different part of speech, i.e. #ENTITIES for nouns, #EVENTS for verbs, and #QUALITIES for adjectives and some adverbs.
1 This article is based on research carried out within the framework of the projects FFI2014-53788-C3-1-P and FFI2010-15983, which are funded by the Spanish Ministry of Economy and Competitiveness.
2 I wish to express my sincere gratitude to my colleagues Alba Luzondo Oyón and Pedro Ureña Gómez-Moreno, who collaborated so generously in the conceptual modelling of verbal concepts included in the Globalcrimeterm corpus and collected in the appendix.
clac 65/2016, 109-148 felices: verbs 112
Within this context, the purpose of this paper is to account for the process involved in the construction of the conceptual meaning of verbs (#EVENTS) directly related to the aforementioned domain-specific Ontology and the application of the COHERENT
3 A conceptual approach to meaning construction is proposed, being based on the methodological principles which have been essential for both formal and functional linguistic models, e.g. Jackendoff (1990), Pustejovsky (1995), Levin & Rappaport (2005), Van Valin (2005) or Reinhart (2006).
4 The project referred to above was funded by the Spanish Ministry of Economy and Competitiveness, code no. FFI2010-15983, and the results have been included in the FunGramKB editor.
Gómez-Moreno (2012, 2014). Consequently, this article is organized as follows: sections 2 and 3 deal with an introduction to legal ontologies, followed by an overview of FunGramKB and the building of ontological meaning under the principles of deep semantics; section 4 explains the methodology used for (a) compiling the corpus, (b) designing the term extractor, and (c) analysing the verbal units finally selected; sections 5 and 6 describe and discuss the results of the conceptualisation and hierarchisation phases in the application of the COHERENT methodology; section 7 offers some concluding remarks; finally, an appendix of the selected 49 concepts (under #EVENT) is provided, with a full description of their meaning postulates.

Ontology building and legal ontologies
The term ontology originates in philosophy and bears no relation to the concept of ontology in NLP (e.g. Musen 1992, Gruber 1993), even if both share the human endeavour to comprehend the structure of knowledge and reality. From a more linguistic perspective, Sowa (2000: 492) defines ontology as "a catalogue of the type of things that are assumed to exist in a domain of interest D, from the perspective of a person who uses a language L for the purpose of talking about D". However, our concern here is the interpretation of the concept ontology in the framework of knowledge engineering (Gruber 1993), which consists of a hierarchy of concepts, attributes and their associations in order to allow the establishment of a semantic network of relations.
In this vein, a domain-specific ontology of concepts within a certain field, along with their relations and properties, is a new medium for the storage and propagation of specialised knowledge (Hsieh et al. 2010 Oncoterm (Faber 2002), Ecolexicon (Faber 2014), Genoma-KB (Cabré et al. 2004), PoCeHRMOM (Kerremans, Temmerman et al. 2007), Prolex (Maurel 2008), or the pioneering Cogniterm prototype (developed by Skuče between 1991 and 1997) and its management system of data knowledge bases called CODE (Conceptually Oriented Design Environment). However, the description of ontologies as conceptual schemas within the legal domain arose in 1995 (Valente & Breuker 1994, Valente 1995), as a result of the growing necessity to formalize the information exchange and linkage among all the components which make up a legal system. In 1997 this field emerged as a new area of research at the First International Workshop on Legal Ontologies (LEGONT'97), held within the biennial International Conference on Artificial Intelligence and Law (ICAIL-97). The objective has always been to provide adequate instruments for accessing and managing the growing amount of legal information which is rapidly produced in electronic format every day (Breuker, Casanovas, Klein & Francesconi 2008).
Concerning the origins and development of ontologies used in the legal field, I should cite Liebwald (2007: 140), who concluded that "the formalization of implicit [legal] knowledge proved to be especially difficult". "The cross-linking of different domains and the connection between legal concepts and world concepts is still problematic. Contrary to e.g. a biological taxonomy, a legal ontology is not language and country independent".5 In consequence, the most feasible options are application-oriented or specific domain ontologies. He also adds that "… ontology developers should always consider the specific needs of the intended application area(s) and user group(s)". Due to the complexity of this new and heterogeneous field, Valente (2005: 72) proposed a classification of the set of types and roles of ontologies in order to account for the legal domain ontologies developed since the 1990s. He collected a catalogue of 24 ontologies6 and concluded that "different authors mean different things by the term 'ontology'" and that "ontologies are used in very different ways".7 In addition, Periñán Pascual & Arcas Túnez (2007) assert that the large majority of these "misnamed" ontologies are, in fact, lexical taxonomies which do not give formal representation of meaning to each of their terms, but which are rather infra-defined as regards their subsumptive relation with other terms (and sometimes with other semantic relations such as synonymy, meronymy, etc.). Some of the so-called ontologies (Casanovas, Sartor, Biasiotti & Fernández-Barrera 2011: 5-7): (1) organize and structure information, as in the case of projects such as Jur-Wordnet (Gangemi, Sagri & Tiscornia 2005) or the Italian ontology of crimes (Asaro et al. 2003; Lenci 2008); (2) have a reasoning and a problem solving engine, such as the ontology CLIME for maritime law (Boer, Hoekstra & Winkels 2001) or Argument Developer, which works with different types of legal data bases (Zeleznikow & Stranieri, 2001); (3) have semantic indexing and search, such as the ontologies of French codes (Lame 2002), ontologies which represent cases of financial fraud (Leary, Vandenberghe & Zeleznikow 2004) or which develop an intelligent FAQ (Frequently Asked Questions) system for judges (Benjamins et al. 2004; Casanovas, Casellas & Vallbé 2009) or young legal professionals ([OPJK: Ontology of Professional Judicial Knowledge], Casanovas, Casellas & Vallbé 2009); (4) understand a domain, such as those which are more generally applied in law, e.g. the functional ontologies of law (based on Ontolingua) by Valente and Breuker (1994, 1999), and those of language of legal discourse by McCarty (1989) or those more general ontologies used for knowledge representation (Frame Ontology) by Van Kralingen (1995). They all use general language for expressing legal knowledge.
5 From this point of view, law is a dynamic, normative field and its conceptualization would necessarily include those aspects, together with the representation of world knowledge or common-sense knowledge (see, for example, Lame (2002) and Breuker and Hoekstra (2004)).
6 Breuker et al. (2008) increased the previous list to 33 ontologies. In Casanovas, Sartor, Biasiotti & Fernández-Barrera (2011) the list reached more than 60 references.
7 This author also considers that the term 'ontology' should not be used when referring to domain-independent knowledge representations (representation languages). Also, although the origins of ontologies were related to knowledge sharing and reuse, most ontologies are built "with some application in mind."
The number of specialists who are working at present on legal ontologies is very high, although, as far as I know, none of the applications which have been designed so far is formally based on deep semantics or, in other words, on a functional linguistic model similar to the Lexical Constructional Model or the architecture offered by FunGramKB (see section 3). Moreover, none of the so-called legal ontologies contains any development which covers the area of terrorism and organized crime from a procedural or criminal law perspective. Consequently, the methodology used in the construction of the Globalcrimeterm Subontology will be explained in sections 4 to 6,8 focusing both on a brief description of the corpus collection/term extraction process and on the conceptual modelling, subsumption and hierarchisation of verbs related to procedural law and criminal events.

The architecture of FunGramKB
Over 20 years ago Velardi et al. (1991: 156) distinguished two well-defined strategies when describing meaning in NLP: the cognitive content in a lexical unit can be described by means of semantic features or primitives (conceptual meaning), or through associations with other lexical units in the lexicon (relational meaning). Strictly speaking, the latter does not give a real definition of the lexical unit, but it describes its usage in the language via 'meaning relations' with other lexical units. Bender (2009) and Periñán Pascual (2012) maintain that it is certainly easier to state associations among lexical units in the way of meaning relations than to describe the cognitive content of lexical units formally, but the inference power of conceptual meaning is much stronger. Surface semantics can be adequate in some NLP systems, but the construction of a robust knowledge base guarantees its use in most NLP tasks, thus reinforcing the concept of resource reuse. This crucial distinction laid the foundations for FunGramKB, which can be defined as a multipurpose lexico-conceptual knowledge base for natural language processing systems and natural language understanding. This knowledge base is made up of three major knowledge levels, consisting in turn of several independent but interrelated modules. As shown in Periñán Pascual & Arcas Túnez (2010b) and figure 1 below, these are:

(a) The lexical level (linguistic knowledge), comprising the Lexicon, which stores morphosyntactic, pragmatic and collocational information about lexical units in a specific language, and the Morphicon, which handles cases of inflectional morphology.

(b) The grammatical level (linguistic knowledge), formed by the Grammaticon, which stores and captures the properties that are specific to the most relevant constructional families in the languages selected.

(c) The conceptual level (non-linguistic knowledge), which consists of three modules:
1. The Ontology, a hierarchical catalogue of the concepts that a person has in mind, so here is where semantic knowledge is stored in the form of meaning postulates. The Ontology consists of a general-purpose module (i.e. Core Ontology) and several domain-specific terminological modules (or satellite ontologies).
2. The Cognicon, a repository of procedural knowledge which is stored by means of scripts, that is, conceptual schemata in which a sequence of stereotypical actions is organized on the basis of temporal continuity, and more particularly on Allen's temporal model (Allen 1983; Allen and Ferguson 1994).
(c) Terminals (e.g. $ASSASSINATION_00, $FELONY_00, $GANGSTER_00, $CONSPIRE_00, $DISHONEST_N_00, etc.) are headed by the symbol $. The borderline between basic concepts and terminals is based on their definitory potential to take part in meaning postulates. Hierarchical structuring of the terminal level is practically non-existent.
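The labelling convention can be illustrated with a minimal sketch. Only the `$` prefix for terminals is stated explicitly here; the `#` prefix for metaconcepts and the `+` prefix for basic concepts are assumptions inferred from labels quoted elsewhere in this article (e.g. #EVENTS, +GIVE_00), and the function name `classify` is purely illustrative, not part of FunGramKB.

```python
import re

# Assumed prefix convention, inferred from the labels quoted in this article:
# '#' metaconcept, '+' basic concept, '$' terminal concept.
LEVELS = {"#": "metaconcept", "+": "basic concept", "$": "terminal concept"}

# A label is a prefix, an upper-case name (possibly with underscores) and,
# optionally, a two-digit sense index such as _00 or _01.
LABEL = re.compile(r"^([#+$])([A-Z]+(?:_[A-Z]+)*)(?:_(\d{2}))?$")


def classify(label):
    """Split a concept label into (level, name, sense index or None)."""
    match = LABEL.match(label)
    if match is None:
        raise ValueError(f"not a recognised concept label: {label!r}")
    prefix, name, index = match.groups()
    return LEVELS[prefix], name, index


for label in ("$CONSPIRE_00", "$DISHONEST_N_00", "+GIVE_00", "#EVENT"):
    print(label, "->", classify(label))
```

The optional sense index reflects the fact that the metaconcept labels quoted in the article carry no numerical suffix, whereas basic and terminal labels do.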
Basic and terminal concepts in FunGramKB are provided with semantic properties which are captured by thematic frames (TF) and meaning postulates (MP). Every event (or quality) in the ontology is assigned one single TF, i.e. a conceptual construct which states the number and type of participants involved in the prototypical cognitive
9 The examples of basic and terminal concepts indicated here have been obtained from FunGramKB Core Ontology and the Globalcrimeterm Subontology. The original source for most of the basic concepts in the Core Ontology was a scrutinised reclassification of the defining vocabulary in the Longman Dictionary of Contemporary English (Procter 1978).
conceptual constructs that represent the generic features of concepts. As stated above, the basic concepts are the main building blocks of these types of constructs in the Core Ontology.
Since metaconcepts and basic concepts are already defined in FunGramKB, it is worth noting the importance of building adequate terminal concepts for a fine-grained knowledge base which is based on deep semantics. As a consequence, knowledge engineers have to cope with the modelling of ontological meaning, which means not only deciding on the creation of terminal concepts, but also formalizing these concepts in COREL interface language or determining which lexical units should be linked to them. In the following sections, I will briefly explain the methodology used for the design of the Globalcrimeterm Corpus (henceforth GCTC)11 and FunGramKB Terminology Extractor (henceforth FGKBTE) as a previous step towards a detailed description of the method employed for the conceptual modelling of the selected verbs (#EVENTS) related to procedural and criminal law.

Corpus design and terminological extraction
The initial stages in the process of corpus compilation included a number of decisions and selections that helped us to collect and organize the GCTC coherently and efficiently (Bowker & Pearson 2002, Koester 2010). To begin with, the legal subdomain of organized crime and terrorism was selected for its current international relevance and for the scarce NLP references on the topic, particularly with the purpose of populating ontologies. Therefore, the winning terms extracted from the GCTC helped us to populate both the specific-domain subontology and the Core Ontology in the system of FunGramKB. (Biber 1993). In this respect, the GCTC consists of approximately 5,600,000 tokens from a wide variety of text types, including international treaties, fact sheets, rules, resolutions, conventions and acts, among others.12 The corpus is also reasonably balanced with respect to the number of texts on the domains under study, i.e. 49% are focused on terrorism, while 35% deal with organized crime and 16% account for texts on both types of subject areas. Other sources, such as academic reference works and journal articles, were also considered due to the usual high concentration of specialised terms in their texts.
Once the relevant documents were selected and downloaded, a second step in the compilation of the corpus refers to text editing. A series of manual and semiautomatic editing tasks were required in order to filter out typographical mistakes resulting from the reformatting of original formats (usually pdf) to plain text. This preparatory preprocessing of the texts was necessary because of the characteristics of the term extractor tool (part of the FunGramKB suite),13 which only works with raw texts. Thus, whenever
12 All the sources included in the GCTC were in English.
13 FunGramKB Suite is the name used to refer to the knowledge engineering tool and FunGramKB is the resulting knowledge base. FunGramKB Suite was developed in C# using ASP.
The data gathered in the database had three main objectives. First, they served as a guide to monitor criteria such as corpus balance and representativeness. Second, some of the data registered in the database could be used during the uploading of texts onto the extractor and had to be conveniently stored. Finally, the database also provided the documentary basis for the calculation of simple descriptive statistics about the corpus.
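How such a database can support the monitoring of corpus balance may be sketched as follows. The record fields mirror those listed in footnote 14 ("ID", "Language", "Brief description", "Title", "Topic", "Type of document", "Source"); the class name, the function name and the sample records are hypothetical and do not reproduce the actual GCTC database.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class TextRecord:
    """One corpus text; the fields follow footnote 14, the layout is assumed."""
    id: int
    language: str
    brief_description: str
    title: str
    topic: str  # "Organized crime", "Terrorism" or "Both"
    type_of_document: str
    source: str


def topic_shares(records):
    """Return the percentage of texts per topic, rounded to one decimal."""
    counts = Counter(record.topic for record in records)
    total = sum(counts.values())
    return {topic: round(100 * n / total, 1) for topic, n in counts.items()}


# Invented sample records, for illustration only.
sample = [
    TextRecord(1, "English", "placeholder", "Text A", "Terrorism", "convention", "placeholder"),
    TextRecord(2, "English", "placeholder", "Text B", "Terrorism", "resolution", "placeholder"),
    TextRecord(3, "English", "placeholder", "Text C", "Organized crime", "joint action", "placeholder"),
    TextRecord(4, "English", "placeholder", "Text D", "Both", "green paper", "placeholder"),
]
print(topic_shares(sample))
```

Shares are computed over the number of texts rather than tokens, which matches the way the balance of the GCTC is reported above.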
Once the GCTC was completed and closed, the following stage comprised the extraction of specialised terms, whose process is described in the following section.

Term extraction process
Terminological extraction in the FGKBTE is based on corpus data, since this information can contribute to finding the relevant terminology used by professionals and practitioners within a domain. Once the textual repository is set up, term extraction is the following step. As can be seen in figure 3, this process comprises two stages, an automatic phase and a manual phase:

From the top leftmost button: the "Pre-processing" tab contains an area for testing new features for the extractor. The "Processing (indexing)" tab is used for uploading texts of a corpus to the extractor. "Processing (statistics)" is a key function allowing the terminologist to automatically obtain the list of candidate terms from the corpus. "View" allows the terminologist to filter false terms by means of a series of removal options. The "Search" tab is a secondary tool for searching strings of text in a corpus. Finally, "Corpus" shows basic descriptive statistics concerning the number of indexed texts making up a given corpus as well as the number of tokens included. This tab also shows a terminological box containing a list of false candidates that were discarded during the filtering process tackled in the "View" function.

14 The first field, "ID", assigns a unique numeric code to each text. The field "Language" contains information about the language in which the text is written. "Brief description" offers very succinct information about the contents of the text. "Title" provides a title that summarises the specific topic of the document. The "Topic" field, on the other hand, records the subdomain the text belongs to; in the case of GCTC, a distinction is drawn between "Organized crime", "Terrorism" or "Both". Finally, the field "Type of document" contains information about the text type (e.g. joint action, agreement, green paper, proceedings, etc.), while "Source" adds a reference on the source from which the original document was extracted.
One of the most outstanding features of FGKBTE lies in its potential for filtering false candidates.The "View" mode contains for each term candidate an option for "simple removal", so that if the terminologist chooses this option, a bigram such as "avoid transact" would be sent to the list of false candidates in "Corpus".More interestingly, the extractor can also make complex removal of lexical bigrams and trigrams.For example, the nested removal of "avoid transact" will result in the removal of "avoid transact" as a bigram as well as in the removal of each component individually ("avoid" and "transact").
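The difference between simple and nested removal can be made concrete with a small sketch. It is an assumption-laden illustration rather than the extractor's actual code: candidates are modelled as a plain set of strings, the function names are invented, and it is assumed that the components discarded by nested removal also end up in the list of false candidates.

```python
def simple_removal(candidates, false_candidates, term):
    """Discard one candidate and record it as a false candidate."""
    return candidates - {term}, false_candidates | {term}


def nested_removal(candidates, false_candidates, term):
    """Discard an n-gram together with each of its component words."""
    units = {term, *term.split()}
    return candidates - units, false_candidates | units


candidates = {"avoid transact", "avoid", "transact", "money laundering"}

# Simple removal only affects the bigram itself.
print(simple_removal(candidates, set(), "avoid transact"))

# Nested removal also discards "avoid" and "transact" individually.
print(nested_removal(candidates, set(), "avoid transact"))
```

Both functions return new sets instead of modifying their arguments, so a removal can be inspected before it is committed.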
Previous results emphasize the utility of this approach for term extraction. After uploading the components of the GCTC, which contains roughly 5,500,000 tokens, to the extractor, and applying the preparatory filters and then the statistical processor, the initial count was reduced to a set of approximately 5,700 candidate terms, a comparatively much smaller quantity of acceptable terms. It is important to emphasise that such a reduced set of candidates was reached in a short period of time, if compared to other approaches such as manual inspection of concordances or collocations.
Manual filtering:15 For identifying terms it is not enough to apply the previous statistical processor and decide about the termhood of units on the basis of their statistical significance, since there are other theoretical problems to be faced. To facilitate term identification, terminologists should consider three additional criteria during the manual filtering process (Felices Lago & Ureña Gómez-Moreno 2014: 264-266): (1) Ontological criterion: To decide whether a candidate is a specialised unit, the speaker's mere introspection is sometimes a valid criterion. Within the framework of FunGramKB, introspection is carried out at the level of the Core Ontology, so that the question of whether a candidate is actually a term can be answered by means of another more specific question: does the Core Ontology contain a concept that could include this candidate as a possible lexical realisation?
Consequently, considering all the criteria involved above, the terminologist must conclude the analysis by determining the specific nature of the lexical candidate and the place it should occupy inside the Ontology.

Conceptual modelling
This section refers to the main procedural aspects concerning the transduction of terminological units into conceptual constructs, and the use of the latter in the population of the FunGramKB Ontology.16 It is essential to emphasize that I will only be dealing with #EVENTS, and thus both #QUALITIES and #ENTITIES will be disregarded. Results expected at this stage include a set of terminal concepts, as well as a group of lexical units representing the linguistic expression of each of these cognitive categories. The terminal concepts will occupy slots within the Ontology (be it Core or Satellite), while the linguistic representations of terminal concepts (lexical units) will fill the corresponding slots in the lexicon of the language selected.
The final output of "events" linked to the domain of the international cooperation against terrorism and organized crime amounts to 49 concepts and their corresponding lexical units in the English lexicon. This was the result of the exhaustive scrutiny of the verbal concept candidates and their definition in natural language as a previous step to their conceptual modelling and hierarchisation.17 The transduction of these definitions into conceptual constructs in COREL interface language creates the semantic properties which are captured by thematic frames (TF) and meaning postulates (MP) and represented in basic or terminal concepts in FunGramKB ontologies.18 A complete list of the fully defined 49 concepts can be seen in Appendix 1. In order to illustrate how to define terms, let us consider the following example of the concept $BRIBE_00 included in the criminal law domain:

(1) Term in the English lexicon: Bribe

The lexical conceptual information of terminological units is introduced in FGKBTE by means of the "Edit" tool included in the "View" tab (see figure 4 above). "Edit" appears

In order to illustrate the editing function further, this figure captures a screen with the dialog boxes shown above filled with information related to the term bribe. To understand each subsection included here, it is helpful to describe them starting with "Senses" and showing the other subsections in a clockwise fashion. "Senses" is aimed at storing the several senses of homonymous and polysemous terms. Each sense shall carry a distinctive numerical index (e.g. +SEIZE_00, +SEIZE_01 and so on). It is an automatic dialog box, i.e., the information displayed here is generated automatically after the information in the other dialog boxes has been introduced. The "Delete" and "Rename" options allow the terminologist to make corrections before eventually validating the term at work together with its lexical conceptual information. "Concept" is the label or the COREL name that serves as a host cognitive category of the terminological unit. "Description", as the name suggests, is a space set aside for entering a description in natural language that captures the meaning of the concept. It is worth recalling that FGKBTE uses English as a lingua franca for this purpose. "Metaconcept" is completed automatically with the ontological data selected among "entities", "events" or "qualities". This option is a first contribution to the hierarchical organization of concepts in the domain. Once the fields "Concept", "Description" and "Metaconcept" are completed, if you click on "Save" the online information is automatically stored. "Duplication" will serve the purpose of creating mirror concepts. If the concept that is about to be introduced in the Subontology is already included in the Core Ontology, a note will appear prompting the engineer to create a mirror concept or warning them not to repeat information. The last element in the "Edit" tool is the Lexicon, which gathers the different lexical realisations, in this case, terminological units, instantiating a concept.

16 This process corresponds to the CONCEPTUALIZATION phase of the COHERENT methodology referred to above.
17 The role of the knowledge engineer at this stage is to gather the semantic content of a term from a selected number of dictionaries and to produce a general description in natural language which encompasses all the different lexicographical definitions. The most common sources have been the updated editions of the following reference works: Longman Dictionary of Contemporary English, Oxford Dictionary of Law or Black's Law Dictionary, among others.
FGKBTE is currently designed to interpret and process information in seven languages: English, Spanish, Italian, French, German, Bulgarian and Catalan. Moreover, it also allows the assignment of one or more terms to each concept. As mentioned above, there must be at least one lexeme for each concept in any of these (or other) languages. Once all the data mentioned above have been introduced, the engineer must click on "Done" and all the information will be validated definitively, although this process can be reversed in case further changes or corrections are needed. The importance of "Done" is that only the terms so validated (in this final validation process) will be included in the Ontology (see figure 6), while the rest will be discarded.
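The editing and validation steps just described can be summarised in a minimal sketch. It only models the constraints mentioned in the text (a concept label, a description, one of three metaconcepts and at least one lexeme in some language before final validation); the class, its methods and the placeholder description are hypothetical and are not taken from FGKBTE.

```python
from dataclasses import dataclass, field

METACONCEPTS = {"entities", "events", "qualities"}


@dataclass
class ConceptEntry:
    """Hypothetical record for one concept edited by the terminologist."""
    concept: str
    description: str
    metaconcept: str
    lexicon: dict = field(default_factory=dict)  # language -> list of terms
    done: bool = False

    def problems(self):
        """List the reasons why the entry cannot yet be validated."""
        issues = []
        if not self.concept.strip():
            issues.append("missing concept label")
        if not self.description.strip():
            issues.append("missing description")
        if self.metaconcept not in METACONCEPTS:
            issues.append("unknown metaconcept")
        if not any(self.lexicon.values()):
            issues.append("no lexeme in any language")
        return issues

    def mark_done(self):
        """Final validation: only complete entries are accepted."""
        issues = self.problems()
        if issues:
            raise ValueError("; ".join(issues))
        self.done = True

    def reopen(self):
        """Validation can be reversed when corrections are needed."""
        self.done = False


entry = ConceptEntry("$BRIBE_00", "placeholder description", "events",
                     {"English": ["bribe"]})
entry.mark_done()
print(entry.done)
```

Keeping `problems` separate from `mark_done` lets an interface display all missing fields at once instead of stopping at the first one.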
Figure 6: the concept $BRIBE_00 and its integration in the Ontology of FunGramKB editor.
Besides the guidelines just mentioned as to how to define terms in FGKBTE, it is necessary to enter three additional caveats: Firstly, terminologists and knowledge engineers must be careful not to include the definiendum within the definiens; in other words, definitions should not take the shape of paraphrases in which the word being defined is a component part itself, such as "if someone commits X (…)", where "X" is the definiendum. Secondly, terminologists should also avoid including examples in the definiens showing how this term is used in natural language. Thirdly, definitions should be expressed using simple syntactic structures such as "S+P+O" and, whenever possible, rely on the reiteration of keywords. The example of $BRIBE_00 in figure 6 illustrates a definition with a simple syntactic outline and the recursive use of simple but key concepts.
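The first caveat (a definition must not contain the word it defines) lends itself to a simple automatic check. The following sketch is a crude heuristic offered only as an illustration: the function name is invented, the match is a case-insensitive prefix test that would miss forms such as "bribing", and both sample definitions are made up for the example rather than taken from the Ontology.

```python
import re


def is_circular(term, definition):
    """True if the defined term (or a word starting with it) occurs
    in its own natural-language definition."""
    pattern = r"\b" + re.escape(term) + r"\w*"
    return re.search(pattern, definition, flags=re.IGNORECASE) is not None


# Invented definitions, for illustration only.
print(is_circular("bribe", "If someone commits a bribe, they break the law."))
print(is_circular("bribe", "Someone gives money to an official to obtain a favour."))
```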

Hierarchisation process
The next stage is the hierarchisation phase, which deals with the establishment of hierarchical meaning relations among concepts in the domain. Designing a networked hierarchy will endow FunGramKB with the capacity to derive relevant and meaningful inferences, as well as to understand and produce knowledge for a specific user-defined goal. The present section deals with the details of conceptual-hierarchy construction.
Hierarchisation consists of determining for each terminological concept defined in FGKBTE its corresponding hyperordinate, subordinate(s) and sister concept(s).
Hyperordinates are the most general type of units in the hierarchy and work as host concepts for the classification of one or more subordinate concepts. Each subordinate concept can in turn have one or more sister concepts, which are characterised by sharing common semantic features inherited from the hyperordinate. This arrangement of concepts is called the "IS-A" subsumption. An illustrative example of how inheritance and subsumption operate within the hierarchy of concepts is the terminal concept $BRIBE_00:

(2) #EVENT > #MATERIAL > #MOTION > #TRANSFER > +TRANSFER_00 > +GIVE_00 > $BRIBE_00

In (2) the concept $BRIBE_00 depends on the metaconcept #TRANSFER, and, as a consequence, it inherits the prototypical scheme from the metaconcept and the subsequent superordinate concepts +TRANSFER_00 and +GIVE_00. (3) and (4), which represent the relevant conceptual information of $BRIBE_00,19 inheritance is crucial for knowledge organization in FunGramKB. It is moreover of paramount importance in case the knowledge base is intended for reasoning tasks of the utmost precision, as in legal practice, since semantic features must be inherited without causing incongruence or deriving erroneous conclusions.
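The conceptual path in (2) can be read as an ordered chain from the most general unit down to the terminal concept. A minimal sketch, assuming only the ">" notation used in (2) and with illustrative function names, shows how the chain of concepts from which $BRIBE_00 inherits can be recovered:

```python
PATH = "#EVENT> #MATERIAL> #MOTION> #TRANSFER>+TRANSFER_00> +GIVE_00> $BRIBE_00"


def parse_path(path):
    """Split a conceptual path into its labels, most general first."""
    return [label.strip() for label in path.split(">")]


def ancestors(path, concept):
    """Return every unit that subsumes `concept` along the path."""
    chain = parse_path(path)
    return chain[:chain.index(concept)]


def hyperordinate(path, concept):
    """Return the immediate host concept of `concept`."""
    return ancestors(path, concept)[-1]


print(ancestors(PATH, "$BRIBE_00"))
print(hyperordinate(PATH, "$BRIBE_00"))
```

Under the IS-A reading, every label returned by `ancestors` contributes to the conceptual information inherited by the terminal concept; how the inherited features are merged is not modelled here.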
Hierarchies of specialised concepts show the same classification tenets and share the same upper conceptual level as the Core Ontology. Therefore, in order to build the hierarchy consistently, the first step is to select the basic hyperordinate concepts under which the remaining concepts will be classified. In the case of the #EVENT subontology for the domain of international cooperation against terrorism and organized crime, the diverse conceptual paths for the selected 49 criminal actions or procedural steps are classified as follows:
- #COMMUNICATION>+SAY_00: $ACQUIT_00, $CONFESS_00, $DECLARE_00, $INTERROGATE_00, $SENTENCE_01, $TESTIFY_00.
- #COMMUNICATION>+SAY_00>+REQUEST_01: $APPEAL_01.
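As a rough illustration of how hyperordinates and sister concepts can be read off such conceptual paths, the sketch below builds a parent-to-children table. It covers only the two paths under #COMMUNICATION listed above; the function names are hypothetical and do not correspond to any FGKBTE function.

```python
from collections import defaultdict

# The two conceptual paths under #COMMUNICATION and their terminal concepts.
PATHS = {
    "#COMMUNICATION>+SAY_00": [
        "$ACQUIT_00", "$CONFESS_00", "$DECLARE_00",
        "$INTERROGATE_00", "$SENTENCE_01", "$TESTIFY_00",
    ],
    "#COMMUNICATION>+SAY_00>+REQUEST_01": ["$APPEAL_01"],
}

def build_children(paths):
    """Return a mapping from each concept to its direct subordinates."""
    children = defaultdict(list)
    for path, terminals in paths.items():
        nodes = path.split(">")
        # Link each node of the path to the next one, avoiding duplicates.
        for parent, child in zip(nodes, nodes[1:]):
            if child not in children[parent]:
                children[parent].append(child)
        # Terminal concepts hang from the last node of the path.
        for terminal in terminals:
            children[nodes[-1]].append(terminal)
    return children

def hyperordinate_of(concept, children):
    """Return the direct superordinate of `concept`, or None if it has none."""
    for parent, kids in children.items():
        if concept in kids:
            return parent
    return None

def sisters_of(concept, children):
    """Return the concepts that share the hyperordinate of `concept`."""
    parent = hyperordinate_of(concept, children)
    if parent is None:
        return []
    return [kid for kid in children[parent] if kid != concept]

children = build_children(PATHS)
print(hyperordinate_of("$APPEAL_01", children))
print(sisters_of("$CONFESS_00", children))
```

In this sketch +REQUEST_01 counts as a sister of the terminals under +SAY_00 because it hangs from the same hyperordinate, while $APPEAL_01 has no sisters in the sample.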
The possible disconnection between the diverse hierarchies of conceptual paths shown above and the way the "domain-specific" concepts are classified is only apparent and can be explained with a brief account of NLP in relation to the different approaches to ontology building.
In surface semantics, legal ontology engineers have been producing taxonomies and establishing connections among units (or concepts) on the basis of expert extra-linguistic information, for example, legal theories or deontic logic, but the reasoning capacity has generally been limited to very specific tasks. However, the way the concepts relate to each other in this proposal is based on deep semantics, which combines an extensive commonsense knowledge base (FunGramKB) and a reasoning engine. Consequently, the Ontology of FunGramKB (and the other two modules: […] macroknowing, which may allow the integration of the MPs of the "events", "entities" or "qualities" in the Ontology with the cognitive macrostructures in the Cognicon or the episodic knowledge stored in the Onomasticon.
Another apparent contradiction in the building of this domain-specific ontology is the full inclusion of the "specific" verbal concepts in the Core Ontology. This methodological decision is based on a series of unexpected results during the compilation phase.
As previously explained in section 4, the basic concepts of the Core Ontology come from a scrutinized reclassification of the defining vocabulary in the Longman Dictionary of Contemporary English (Procter 1978). However, in the case of domain-specific ontologies, the engineer must decide to what extent basic commonsense concepts from the Core Ontology are useful for a given subontology and, on the other hand, to what extent it will be necessary to create new basic concepts for the proper classification of domain-based terminals. Precisely, the combination of lexicographical evidence and the hierarchical paths shown above furnishes the evidence that all the relevant "events" in this supposedly specific domain need not be included in the Globalcrimeterm Subontology. Two reasons are given:
(1) The way in which concepts relate to each other within a domain and the way in which this relationship should be represented in a hierarchical taxonomy is not always clear. Precisely for this reason, the role of the ontology engineer is to find out common ontological properties and to discern differences among the selected units. This premise leads me to conclude that most of the apparently specialised conceptual units relating to criminal law in the area of international cooperation against terrorism and organized crime have eventually been included in the Core Ontology as terminal concepts, because the semantic content of their corresponding lexical units can be found in widely used learner's dictionaries and, consequently, this conceptual information is generally known and used by the layperson (see footnote 23).
(2) In the same vein, expert knowledge in the area of legal and social sciences is more accessible to the non-specialised knowledge (common sense) of a layperson than in the case of natural or "hard" sciences. Moreover, the results of the GCTC term extraction have shown that most of the selected terms (specific verbs) included in this field are also well known by the general public.
22 For an account of these two reasoning processes, see Periñán Pascual & Arcas Túnez (2005, 2007).

Conclusions
A key factor for the development of this research has been the possibility to use FunGramKB, which was designed to address many of the most noticeable problems currently faced by NLP and practitioners in the area of artificial intelligence. The main advantage of this knowledge base is its capacity to combine linguistic knowledge and human cognitive abilities within the same integrated system. The concept-oriented interlingua (COREL) serves to describe the properties of the different modules that make up FunGramKB at the cognitive level. As a consequence, this knowledge base moves away from the traditional solutions based on surface semantics to offer a fully-fledged alternative in which linguistic information is grounded on conceptual structures representing human knowledge.
At a more specific level, the concept $BRIBE_00 has been used as a canonical instantiation of conceptual modelling and a similar process has been followed to represent the meaning of the remaining 48 events collected in Appendix 1. Moreover, the hierarchisation phase has demonstrated how the 49 apparently specialised concepts replicate the same classification tenets and share the same upper conceptual level as the basic concepts of the Core Ontology. In fact, the "specific" verbal concepts are eventually included in the Core Ontology and not in the domain-specific ontology, as initially expected. In this respect the selected events collected here clearly differ from the selected "entities" or terminological nouns, which are generally integrated in the Globalcrimeterm Subontology. Among the reasons that could explain this unexpected result, it is worth noting the evidence provided by lexicographical sources, which show how the semantic content of the units linked to the selected concepts is not only known by legal practitioners but also shared by the average speaker of the language.
23 The final outcome of the Globalcrimeterm project referred to in footnote 2 is that all of the specialised concepts included there have been "entities" (nouns), even if a large number of the relevant "entities" are also part of the Core Ontology.
Figure 2. Sample of corpus database

Figure 3. Flowchart of semi-automatic term extraction in FGKBTE

Figure 4 below shows the main menu of FGKBTE containing the principal functions of the tool:
(2) Lexicological criterion: It relates to the lexicological features of the candidate terms. Terms were traditionally characterised by a univocal, unambiguous monosemic meaning. This misconception has been successfully overcome in recent decades. However, aspects such as meaning banalisation or the acquisition of new terminological senses in general language lexical units through processes of metaphorisation and metonymic mapping require terminologists to check whether a candidate term is polysemous or homonymous and, if so, decide which sense is technical and discard common knowledge meanings.
(3) Lexicographical criterion: The most important criterion, nevertheless, is the consultation of specialised dictionaries, since they reflect the necessary knowledge for the understanding of expert knowledge. It is necessary to note that trained terminologists and lexicographers (with the advice and support of domain-specific experts and practitioners) are the best placed to determine and define terms, since they know how to concisely formulate a definition in a systematic way.
15 The whole process involved in the manual extraction (preceded by the automatic extraction) would correspond to the selection and acquisition phases described in the methodology for the construction of satellite ontologies (Periñán Pascual & Arcas Túnez 2014) and the COHERENT methodology (Periñán Pascual & Mairal Usón 2011).
18 I refer the reader to Periñán Pascual & Mairal Usón (2010) for an exhaustive description of the notation system used in the grammar of COREL, particularly the diverse satellites and operators used for the thematic frames and the predications of the meaning postulates.
next to each candidate term and by clicking on it the terminologist accesses the screen in figure 5.

19 Example (1) above offers a full-fledged representation of the conceptual information of this terminal concept.
is the fact that most specialists who are working at present on legal ontologies are not developing applications formally inspired by deep semantics or, more specifically, by a functional linguistic model similar to the Lexical Constructional Model (LCM) or associated computational developments, such as FunGramKB. Moreover, none of the so-called legal ontologies, as far as I know, contains any development which covers the area of terrorism and organized crime from a procedural or criminal law perspective. Consequently, the methodology used in the development of the Globalcrimeterm project focuses not only on a brief description of the corpus collection/term extraction process, but also on the conceptual modelling, subsumption and hierarchisation of verbs related to procedural law and criminal events. In this respect, this ontological construction based on the COHERENT methodology may contribute to a new perspective of analysis in the field of legal ontology building.
As explained in Felices Lago & Ureña Gómez-Moreno (2014), the first step in the compilation lies in the selection of sources, e.g. academic and professional repositories containing specialised documents on the topic(s) of interest. This step is of vital importance, since it will determine to a great extent whether the corpus is optimal, both qualitatively and quantitatively, for the purpose of term extraction. The selected sources must therefore meet high scientific standards or be highly regarded by the professional community. For example, the GCTC contains a selection of more than 10 sources, such as the European Union (EU), the Council of Europe, the Organization for Security and Co-operation in Europe (OSCE), Eurojust or the International Criminal Court (ICC), which offer reliable information concerning cooperation against criminal and terrorist activities. In addition to the data sources, another important decision is the representativeness of the corpus
Knowing Spreading) (see footnote 22). Microknowing is performed by two types of reasoning mechanisms: inheritance and inference. Inheritance, for instance, strictly involves the transfer of one or several predications from a superordinate concept to a subordinate one in the ontology. On the other hand, inference is based on the structures shared between predications linked to conceptual units which do not take part in the same subsumption relation within the ontology. The application of these two mechanisms on the MPs allows FunGramKB to minimize redundancy and maximize the informative capacity of the knowledge base. Outside the scope of this article is the role played by