Software Process Ontology: A Case Study of Software Organisations' Software Process Subdomains

In a domain such as software process, which is intensively knowledge driven, formally representing intellectual knowledge is an invaluable requirement. Improved use of this knowledge could lead to maximum payoff in software organisations. The purpose of formal representation is to help organisations achieve success by modelling the practices of successful organisations. In this paper, software process knowledge from successful organisations was harvested and formally modelled using ontology. A domain-specific knowledge base ontology was produced for the core software process subdomains, together with the resulting software process ontology.


Introduction
Software process is a knowledge-driven and knowledge-intensive process that involves several sub-processes. It can be defined as the set of related activities used in developing software. Knowledge in Software Engineering (SE) is diverse, and organizations have problems capturing, retrieving, and reusing it. Improved use of this knowledge is the basic motivation and driver for Knowledge Management (KM) in SE.
Harvesting, representing and reusing knowledge within a domain leads to maximum payoff, which is desirable in most organisations [23]. Knowledge Management (KM) is defined as an effort to capture critical knowledge and share it within an organization [3,17]. It capitalizes on the collective organizational memory to improve decision making, enhance productivity, and promote innovation [18,19].
It is also the process of transforming information and intellectual assets into persisting value. KM connects people with the knowledge that they need to take action, when they need it [20]. Knowledge management involves the identification and analysis of available and required knowledge [21] and helps an organization to gain insight and understanding from its own experience. Specific knowledge management activities focus on acquiring, storing and utilizing knowledge for problem solving, dynamic learning, strategic planning and decision making. This prevents intellectual assets from decay, adds to a firm's intelligence and provides increased flexibility [22]. SE comprises several interrelated subdomains such as Requirements, Design, Coding, Testing, Project Management, and Configuration Management. There are several software process models which describe the sequence of activities carried out in developing software. These software process models are a standard way of planning and organizing a software process. The major phases are requirement gathering, design and coding, implementation and maintenance. Few works in the literature aim at developing ontologies covering wide portions of the SE domain; examples include [4][5][6].
Many SE domain ontologies model SE subdomains [7][8][9][10][11]. Ref. [12] described these subdomain ontologies as weakly interrelated or not interrelated at all, and as often applied in isolation. That work therefore attempted to provide an integrated solution for better dealing with KM-related problems in SE by means of a Software Engineering Ontology Network (SEON). SEON was designed with mechanisms for easing the development and integration of SE domain ontologies, covering the main technical software engineering subdomains (i.e., requirements, design, coding and testing). However, it represented only a small portion of a software engineering ontology. Ref. [7] identified that the combination of ontologies of all SE subdomains would result in an ontology of the complete SE domain. It further stated that, in reality, this goal is extremely laborious, not only due to its size, but also due to the numerous problems related to ontology integration and merging, such as overlapping concepts, diverse foundational theories, and different representation and description levels, among others. It concluded that, despite the challenges involved, an ontological representation covering a large extension of the SE domain remains a desired solution. This paper presents a software process ontology covering major SE subdomains (i.e., requirement gathering, design and coding, implementation and maintenance).

Related Literature
Ontologies have been widely recognized as a key enabling technology for KM. They are used for establishing a common conceptualization of the domain of interest to support knowledge representation, integration, storage, search and communication [2]. A domain ontology identifies the key concepts, objects and entities that exist in some knowledge domain or area of interest and the relationships between them [15,16]. Ontologies play a significant role in knowledge sharing and as knowledge models in instructional science, technology-enhanced learning, knowledge management and training [15,14,13]. Ontologies consist of instances, properties and classes, where instances represent specific project data, properties represent binary relations held among software engineering concepts/instances, and classes represent the software engineering concepts interpreted as sets that contain specific project data [25]. Ref. [7] conducted an extensive review of SE ontologies, classifying them into generic and specific ontologies. Generic SE ontologies have the ambitious goal of modeling the complete SE body of knowledge, while specific SE ontologies attempt to conceptualize only part (a subdomain) of this discipline.
Managing knowledge and experience is a key means by which systematic software development and process improvement occur. Within the domain of Software Engineering (SE), quality remains an issue of concern. Knowledge Management (KM) gives organizations the opportunity to appreciate the challenges and complexities inherent in software development [24].
Successful organisations continuously improve their processes. Like organisational standard process definition, systematic process improvement is more effective and efficient when it is guided by process quality models and standards. The purpose of most standards is to help software organisations achieve excellence by following the processes and activities adopted by the most successful organisations [26].

Methodology
Two complementary methods were used for data collection: case study and interview. For the case study, four (4) software development organizations were studied. For reasons of privacy and confidentiality, the organizations studied are not referred to by name. Table 1 shows the details of the organisations.
Table 4 specifies the domain concepts, their data properties and data values, and their instances. For example, the entity domain expert (number 9) has the data property name, which takes a string value; the data property domain, which takes a string value; and the data property years of experience, which takes an integer value. It also has the following instances of the class: business rules and directory of experts. Figure 6 shows the domain concepts and their instances.
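As a minimal sketch of how the domain expert entry described above could be represented in code, the Python snippet below models the concept with its three data properties and their value types. The class name `DomainExpert`, the field names, and the sample values are illustrative assumptions; only the property names, their types, and the two listed instances come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainExpert:
    """Hypothetical model of the 'domain expert' concept and its data properties."""

    name: str                 # data property: name (string)
    domain: str               # data property: domain (string)
    years_of_experience: int  # data property: years of experience (integer)

    def __post_init__(self) -> None:
        # Enforce the data value types stated in the text.
        if not isinstance(self.name, str) or not isinstance(self.domain, str):
            raise TypeError("name and domain must be strings")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(self.years_of_experience, bool) or not isinstance(
            self.years_of_experience, int
        ):
            raise TypeError("years_of_experience must be an integer")
        if self.years_of_experience < 0:
            raise ValueError("years_of_experience cannot be negative")


# Instances of the class as listed in the text.
DOMAIN_EXPERT_INSTANCES = ("business rules", "directory of experts")

# Illustrative record; these values are invented for the example.
expert = DomainExpert(name="A. Example", domain="Banking", years_of_experience=12)
```

Rejecting a non-integer value at construction time mirrors the role of a data property's declared value type: a record that does not fit the declared type never enters the knowledge base.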

Result
Four main subdomains were identified as knowledge entities in a typical software development process, irrespective of the life cycle model adopted: requirements definition, design and coding, implementation, and maintenance. These are core human-centric activities performed by developers that create opportunities for sharing tacit knowledge during the software development process. This research used both inductive and deductive analysis by first identifying keywords related to the software development process and then grouping the keywords into categories related to requirements definition, coding, implementation and maintenance. For each process activity, a union of the useful themes obtained from each case study was used to determine the useful knowledge constructs for that activity. That is, for each Process Activity (PA), the Knowledge Harvested (KH) for the software process is given as the union of the useful themes obtained from the individual case studies.
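One way to write this union, offered as a sketch rather than as the authors' exact notation: let $T_i(PA)$ (an assumed symbol) denote the set of useful themes obtained from case study $i$ for process activity $PA$, with $n = 4$ organisations studied.

```latex
KH(PA) = \bigcup_{i=1}^{n} T_i(PA), \qquad n = 4
```

Under this reading, a theme enters the harvested knowledge for an activity if at least one of the organisations studied reported it.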

Software Process Ontology
There is no globally accepted methodology for ontology construction [28], but development is usually an iterative process. A 5-step iterative process from [29,30] was adopted for the ontology construction:
• Step 1: Each concept (Table 2) connotes a software process activity that describes a task, function, action, strategy, or reasoning process. A concept is a collection of objects. It is the fundamental element of a domain and usually represents a group or class whose members share common properties [30].
• Step 2: Organisation of the concepts into a hierarchy. The purpose of this categorization is to establish a systematic relationship between the knowledge entities and the specific software development process activities. Competency questions were used here in creating the ontology. Competency questions, as defined by some methodologies for ontology engineering, describe the kinds of questions the resulting ontology is supposed to answer. According to [27], one of the ways to determine the scope of the ontology is to sketch a list of questions that a knowledge base based on the ontology should be able to answer. The following competency questions were used:
1. Ethnographic study in requirements gathering involves the study of?
2. What are the processes for gathering requirements from stakeholders and end-users?
3. What are the processes of software development?
4. What are the physiological processes for blocks resolution during coding?
5. What is the appropriate approach to software design?
6. What are the people and processes involved in user tasks?
7. What are the tools used in ethnographic study of stakeholders and end-users?
8. What are the stages involved in implementation?
9. What does a deployed system need for maintenance?
10. Who can business rules be obtained from in a domain?
11. How can low bus factor issues be handled in coding?
12. How can code ownership issues be resolved in coding?
13. How can knowledge be shared in software process?
14. How can knowledge be transferred in software process?
• Step 3: Every object has both data properties and object properties. An object property shows the relationship between classes or instances, and a data property shows the relationship between instances and data values. Together, these properties provide the logical relationships between objects. Tables 3 and 4 present sample object and data properties defined for some objects in the software process ontology.
• Step 4: Add logical expressions. Axioms represent assertions formulated in a logical form that together comprise the core knowledge that the ontology describes in its domain of application. They are used to model sentences that are always true. They provide a powerful way to add logical expressions to an ontology, and they are used to verify the consistency of the ontology. Axioms are usually added by default in Protégé.
• Step 5: Create the ontology. Protégé 5.0 was then used to build the software process knowledge ontology with its subdomains, as shown in Figures 1-5.
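The roles of object properties, competency questions and axioms in the steps above can be illustrated with a small, self-contained Python sketch. It is not the Protégé model built in this study: the property names `resolvedBy` and `handledBy`, the subject labels, and the domain declarations are hypothetical, while the linked values paraphrase the coding findings reported in the Discussion.

```python
from collections import defaultdict

# Each triple is (subject, object property, object).
# Property names and subject labels are hypothetical labels for this sketch.
TRIPLES = [
    ("Coding block", "resolvedBy", "Senior and experienced programmers"),
    ("Coding block", "resolvedBy", "Problem domain"),
    ("Coding block", "resolvedBy", "Definition and design"),
    ("Coding block", "resolvedBy", "Physical objects such as games"),
    ("Coding block", "resolvedBy", "Interactive blogs"),
    ("Low bus factor", "handledBy", "Clean code"),
    ("Low bus factor", "handledBy", "Code documentation"),
]

# Assumed domain axioms: the subjects each property may be asserted on.
PROPERTY_DOMAINS = {
    "resolvedBy": {"Coding block"},
    "handledBy": {"Low bus factor"},
}


def build_index(triples):
    """Index objects by (subject, property) so questions become lookups."""
    index = defaultdict(list)
    for subject, prop, obj in triples:
        index[(subject, prop)].append(obj)
    return index


def answer(index, subject, prop):
    """Return every object linked to `subject` through `prop` (empty if none)."""
    return list(index.get((subject, prop), []))


def find_violations(triples, domains):
    """Flag triples whose property is undeclared or whose subject is outside its domain."""
    return [t for t in triples if t[0] not in domains.get(t[1], set())]


index = build_index(TRIPLES)

# Competency question 11: how can low bus factor issues be handled in coding?
bus_factor_answers = answer(index, "Low bus factor", "handledBy")

# A simple consistency check in the spirit of Step 4.
violations = find_violations(TRIPLES, PROPERTY_DOMAINS)
```

In this sketch a competency question reduces to a lookup over asserted relationships, and the domain check stands in for the far richer consistency checking that an ontology editor and reasoner provide.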

Discussion
Requirements definition (Figure 1) is the first stage in the software development process, and it is a very critical phase. A major finding from the organizations studied is that ethnographic study was used to obtain the software requirements. It involves studying the organization's culture to understand the key elements and observable patterns of behaviour. This usually requires interaction between stakeholders from the software development organization and those from the user organization in order to gain a better understanding of the problem. It was observed from the study that quality time is dedicated to identifying and interacting with the end users during requirements elicitation. This series of interactions provides an opportunity for the company to distinguish itself and to learn how the users' tasks are performed. Users are asked to discuss their routine tasks with the requirements elicitation team, and they are informed about the services to be provided by the proposed system. This helps to prevent user resistance to the new software system and provides an opportunity for users to discuss their daily routines with the developers in an informal way. During the process, apart from informal interviews, participant observation, questionnaires, document review and other suitable requirements elicitation techniques are also used when necessary. Sometimes, the users' expressions and reactions send useful signals regarding what they expect in the system. One of the organizations studied adopted the Agile software method, where the initial requirements are quickly developed into the first build, which is installed for the users, and additional features are reported as feedback for the next build. Users' e-mail addresses and phone numbers are collected. A repository is created to store user complaints and any additional features requested.
Figure 2 shows the approach to design and coding. The design should be broken down into mathematical processes and made modular. Approaches to coding should include pairwise coding, avoiding code ownership, code review, and encouraging knowledge retention. Bus factor issues should be resolved by creating clean code and documenting code. Blocks, which are dead ends during coding, should be resolved by turning to senior and experienced programmers, the problem domain, the definition and design, physical objects such as games, and interactive blogs such as Stack Overflow or Stack Exchange. Implementation (Figure 3) should adopt a phased changeover approach instead of a holistic approach; it should be done incrementally. Maintenance (Figure 4) requires the system to be deployed first, with further requirements then obtained from stakeholders and end users based on their usage. Figure 5 shows the various subdomains put together to form the software process ontology.

Conclusion
Software process is a knowledge-driven process with sub-processes. This knowledge is latent and could be lost if not formally harvested and documented. Improved use of this knowledge could lead to maximum payoff in software organisations. This is the heart of knowledge management, which focuses on knowledge capture and sharing. This paper presents a generic software process domain ontology, covering the main technical software engineering subdomains of requirements definition, design and coding, implementation and maintenance.