   Knowledge Organisation Research Group
The Knowledge Organisation research group develops advanced knowledge organisation models and techniques to support user browsing, knowledge discovery and tasks in Web sites, portals and repositories, as well as to support automated and semantic web applications. Current projects are focused on the following areas:
Developing taxonomies for content organization of organizational Web sites, portals and institutional repositories
Developing taxonomies and metadata for repositories of learning objects and e-learning
Ontology development to support automated reasoning and summarization
Human categorization behavior
 
   Research Group Members
A/P Abdus Sattar Chaudhry
A/P Christopher Khoo
A/P Shaheen Majid
A/P Theng Yin Leng
Ast/P Jin Cheon Na
Ast/P Brendan Luyt
 
  Current Research Projects
Using Classification Schemes and Thesauri to Build an Organizational Taxonomy for Organizing Content and Aiding Navigation
Using Taxonomy of Learning Objects for Enhancing Knowledge Use and Reuse
Content Organization of Organizational Websites, Portals and Repositories Using Taxonomies
Developing a Disease-Treatment Ontology
How Users Organize Electronic Files on Their Workstations in the Office Environment
Human Clustering of Web Pages
  Postgraduate Student Projects
M.A.Sc. and Ph.D. Projects
Completed Theses

Title of Project: Using Classification Schemes and Thesauri to Build an Organizational Taxonomy for Organizing Content and Aiding Navigation

Investigators: Abdus Sattar Chaudhry, Christopher Khoo, Shaheen Majid, and Wang Zhonghong

Description: This study aims to investigate the feasibility, issues and techniques for building an organizational taxonomy using bibliographic tools. The purpose of the taxonomy is to facilitate the organization of knowledge resources and enhance the navigation capability of an intranet portal for promoting knowledge sharing and communication among students and faculty.

The prototype taxonomy comprises six facets and a four-level categorization scheme. The initial phase of research documented the steps of constructing the subject categories from selected classification schemes, domain taxonomies, and thesauri. In this process, we attempted to determine the relevance of classification schemes and thesauri to organizational taxonomies, and identified the areas where they contribute most to the process. Difficulties encountered were also recorded, and strategies for overcoming the shortcomings of the tools were proposed.

The Division intranet was used for providing organizational context for the taxonomy. Students and instructors, the two important groups of stakeholders in the Division, create content as well as make use of information resources from various channels to perform their tasks of study, teaching and research. The taxonomy is expected to play an important role to facilitate their tasks through enhanced resource discovery. We made use of existing classification schemes and thesauri as well as relevant existing domain taxonomies in constructing the organizational taxonomy. We used one classification scheme (Dewey Decimal Classification), two existing domain taxonomies in the areas of information science and information systems and three thesauri (ASIS&T, LISA, and ERIC) for identifying relevant terms and categories. At the same time, the organizational community sources were used to capture the organizational context. The taxonomy was constructed by combining capabilities and features of these tools and sources.

Classification schemes and thesauri were found helpful in creating structure and categories related to the subject facet, while organizational sources had to be consulted to provide an appropriate organizational context. The organizational activities and the stakeholders’ needs helped to determine the objectives, facets, and subject coverage of the taxonomy. The main categories were determined by identifying the stakeholders’ interests and consulting the organizational sources and domain taxonomies. The top categories were determined by combining classes/terms identified from the classification schemes, the hierarchical indexes of the thesauri, domain taxonomies and the stakeholders’ interests. The stakeholders’ perspectives were identified by reviewing knowledge structures inherent in the organizational sources and through informal input from the stakeholders. The structures/term relationships in the classification schemes and thesauri were consulted to develop the hierarchical levels of categories. Category labels were formatted according to a commonly used standard. The draft taxonomy was validated by consulting stakeholders.
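As an illustration only, a faceted taxonomy with a bounded number of hierarchical levels could be represented as a small tree structure. The sketch below is not the project's implementation: the class and function names, the facet name and the category labels are all hypothetical, and it assumes that the four levels are counted below the facet itself.

```python
from dataclasses import dataclass, field
from typing import Iterator

MAX_LEVELS = 4  # assumption: the four levels are counted below the facet


@dataclass
class Category:
    """One node in a facet's category hierarchy (hypothetical structure)."""
    label: str
    children: list["Category"] = field(default_factory=list)

    def add(self, label: str) -> "Category":
        """Attach a new sub-category and return it."""
        child = Category(label)
        self.children.append(child)
        return child

    def levels(self) -> int:
        """Number of levels in the subtree rooted here (a leaf counts as 1)."""
        return 1 + max((c.levels() for c in self.children), default=0)

    def paths(self, prefix: tuple[str, ...] = ()) -> Iterator[tuple[str, ...]]:
        """Yield every navigation path from this node down, parents first."""
        here = prefix + (self.label,)
        yield here
        for child in self.children:
            yield from child.paths(here)


def within_level_limit(top_categories: list[Category]) -> bool:
    """True if no top category exceeds MAX_LEVELS levels."""
    return all(c.levels() <= MAX_LEVELS for c in top_categories)


# Placeholder labels; the real facets and categories are not listed in the text.
top = Category("Topic A")
leaf = top.add("Topic A.1").add("Topic A.1.1").add("Topic A.1.1.1")
facets = {"Subject": [top]}

for path in top.paths():
    print(" > ".join(path))
print(within_level_limit(facets["Subject"]))
```

Keeping each facet as a separate list of trees means a depth check or a navigation menu can be generated per facet without touching the others.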

A prototype institutional repository has been built for taxonomy deployment and implementation using the TLE-Equella software and the university e-learning platform. A metadata scheme with 19 core elements based on GEM 2.0, together with accompanying best practices, was developed. An interface was provided to facilitate contribution of resources by stakeholders to the repository.

Currently, we are preparing for the taxonomy evaluation, which will use a scenario-based approach. Users will create navigation paths for tasks using the Information Studies Taxonomy. These tasks, solicited from 18 users from the two groups of stakeholders, cover activities related to study, teaching, and research. Researchers will observe user behavior during the process to collect qualitative data.


Title of Project: Using Taxonomy of Learning Objects for Enhancing Knowledge Use and Reuse

Investigators: Abdus Sattar Chaudhry, Christopher Khoo, Yin Leng Theng, and Abdul Halim

Description: This project explores how a taxonomy, complemented by metadata designed to organize learning objects, can facilitate knowledge use and reuse. Initial work focused on developing a definition of learning objects that could be used as a criterion in further analysis. This definition used the ‘learning point’ as the main criterion for analysis of learning objects at two levels: atomic and aggregated. The focus of the project is on learning objects that are to be used in the teaching and learning of knowledge management, an important emerging discipline that holds much promise for future research.

In the second phase of the project, a sample taxonomy was built using core literature in the field of knowledge management as represented in selected textbooks. Concurrently, a metadata template was developed using an enhanced Dublin Core Education metadata schema extended by additional elements from the IEEE LO Metadata standard. The metadata scheme and categories from the taxonomy were used for developing a prototype repository of learning objects. The repository was built using a specialized learning management system. The prototype taxonomy, which is expected to serve as a test-bed for the experiment, comprises 500 learning objects based on more than 100 PowerPoint slides on various aspects of knowledge management.

Currently, the project is in its last phase: evaluation of the taxonomy system and the potential use of the repository of learning objects. We have selected a group of working knowledge management professionals to participate in the study. The participants will be asked to use the system to create content for teaching. Data will be collected on how the participants make use of the taxonomy categories in navigating the repository for information finding, and on how learning objects analyzed at a detailed level of granularity help in creating content. This exercise will be carried out in a usability lab equipped with tools that can capture the process of navigation and information finding. Researchers will also observe the participants in their use of learning objects and take note of their use of taxonomy categories and atomic learning objects. Post-exercise interviews will be conducted using an interview guide to collect further data aimed at verification and validation of usage patterns. We hope to determine whether the atomization of learning objects leads to better usability than the use of aggregated learning objects. We also expect the results of this experiment to help determine whether learning objects organized using a pedagogically sensitive taxonomy augmented with comprehensive metadata facilitate knowledge use and reuse.


Title of Project: Content Organization of Organizational Websites, Portals and Repositories Using Taxonomies

Investigators: A/P Abdus Sattar Chaudhry, A/P Chris Khoo, A/P Shaheen Majid, & A/P Theng Yin Leng

Description: Corporate and public sector organizations are increasingly using infrastructure services such as Websites, intranets, portals and institutional repositories to leverage knowledge resources in the organization by their employees and customers. The objectives of the project are:

  • to analyse and identify the characteristics of various types of organizational taxonomies and metadata used to organize and structure Websites, enterprise portals and institutional repositories.
  • to develop taxonomies for various applications, repositories and organizations.

We assume that the organization of a Website can be represented by a taxonomy of concepts and terms, and that designing the information architecture of a Website involves a first step of constructing a taxonomy as an abstract representation of the structure of the Website and the organization of its contents. The taxonomy is then expressed as a navigation or search structure, manifested in one of many possible menu designs or interaction designs, and instantiated in a visual design (graphics design).

The project incorporates several small studies. One study involves surveying corporate Websites to identify common structures, facets, categories and terms used in organizing these Websites. This survey is limited to Websites of multinational companies that sell products. The survey is implemented in the following three phases:

  1. A small sample of Websites is analyzed to identify common facets, the common categories in each facet, and the common structure of each facet.
  2. The common facets, categories and structures are then used as a taxonomy checklist to analyze the Websites surveyed in our study.
  3. The metadata elements used to describe knowledge resources on the Websites are analyzed.
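The first two phases can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the procedure used in the survey: the 50% threshold for calling a category "common", the function names, and all site names and facet/category pairs are hypothetical.

```python
from collections import Counter

# Each site is described by the (facet, category) pairs observed on it.
# All data below is invented for illustration.
SiteItems = set[tuple[str, str]]


def build_checklist(sample_sites: dict[str, SiteItems], min_share: float = 0.5) -> SiteItems:
    """Phase 1: keep the pairs found on at least `min_share` of the sample sites."""
    counts = Counter(item for items in sample_sites.values() for item in items)
    threshold = min_share * len(sample_sites)
    return {item for item, n in counts.items() if n >= threshold}


def coverage(site_items: SiteItems, checklist: SiteItems) -> float:
    """Phase 2: share of checklist entries that a surveyed site exhibits."""
    if not checklist:
        return 0.0
    return len(site_items & checklist) / len(checklist)


sample = {
    "site_a": {("Products", "Catalogue"), ("Company", "About us")},
    "site_b": {("Products", "Catalogue"), ("Support", "Contact")},
    "site_c": {("Products", "Catalogue"), ("Company", "About us"), ("Support", "Downloads")},
}
checklist = build_checklist(sample)
surveyed_site = {("Products", "Catalogue"), ("Careers", "Vacancies")}

print(sorted(checklist))
print(coverage(surveyed_site, checklist))
```

Pairs that occur on a surveyed site but not in the checklist (here the hypothetical "Careers" entry) are the natural candidates for improving the checklist after the survey.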

At the end of the survey, the taxonomy checklist will be improved in the light of the survey results, and can then be used as a reference by Information Architects when designing corporate Websites. Follow-up user studies can be carried out to investigate the effectiveness of the different facets in the taxonomy checklist, how they can be improved, how they should be used in the Website navigation system, and issues that designers should take into consideration.

Taxonomy development projects have included the following:

  • Taxonomy for a nursing portal. This project focused on building and implementing a nursing taxonomy to organize knowledge resources and to facilitate resource discovery through browsing. The implementation of the taxonomy consisted of three phases: preparation, development and deployment. In the preparatory phase, the emphasis was on defining the objectives and collecting all resources, such as term sources and tools, needed for the next two phases. In the development phase, the concentration was on defining and developing the structure of the faceted taxonomy. In the deployment phase, the priorities were indexing and categorizing the knowledge resources into the appropriate categories of the taxonomy structure and deploying an effective browsing interface. The taxonomy was then assessed to study its effectiveness in improving knowledge discovery through browsing. The test was carried out using nursing scenarios provided by the director of nursing at Singapore General Hospital (SGH).
  • A taxonomy system for the business consulting environment. The main objective was to build a prototype taxonomy system which can be adapted for use by business consulting companies. The final deliverable was a sample taxonomy, consisting of 12 main categories and approximately 500 terms. It was built based on the existing system and information needs of a regional business consulting company. The main categories represent 12 business industries, which are the key industries of the company’s business consulting work. The taxonomies and indexes used by various online database providers and web sites were used to provide the building blocks for the prototype taxonomy, while informal interviews with industry experts and feedback from users of the company were instrumental in helping to determine the structure of the taxonomy.
  • Taxonomy for cultural and heritage resources. This study was carried out to develop a taxonomy for a museum and archives system in Singapore. The taxonomy contains some 500 terms, organized into five broad categories. The number of sub-categories in each of these broad categories was determined by the intuitive factor, the type of resources held by the museum and archives institutions and user feedback. The lack of descriptive metadata and text-mining efforts on the part of the institutions limited the accuracy of the resources to be represented in the taxonomy. However, it was found that collecting terms and concepts from various external and internal sources was sufficient to kick-start the taxonomy development process. The taxonomy provides a high-level overview of the resources held by the institutions and facilitates discovery of these resources through navigation and browsing. These structures also contribute to serendipitous discovery of resources and support precise keyword searching through the provision of contexts (categories and sub-categories). The taxonomy can also serve to identify gaps in the institutional collections.
  • Taxonomy development and deployment at a government statutory body. The taxonomy being used for organization of resources in the knowledge portal system of a government body was reviewed and evaluated with a view to refining the taxonomy to better support resource discovery and knowledge management. The review included information gathering about user requirements, content audit, focus groups with stakeholders, and review of existing terms, categories, and structures.

Title of Project: Developing a Disease-Treatment Ontology

Investigators: A/P Chris Khoo, Ast/P Na Jin Cheon, A/P Chan Syin (School of Computer Engineering), Ms Wang Wei

Description: A disease-treatment ontology is being developed to model and represent treatment information found in the abstracts of medical articles. The ontology divides disease-treatment information into five classes: disease, treatment, condition, effect, and evidence. The disease-treatment ontology is being constructed as an enhancement to existing medical taxonomies and ontologies. We adopt the Unified Medical Language System (UMLS) semantic network (U.S. National Library of Medicine, 2006 & 2007), the Medical Subject Headings (MeSH) (U.S. National Library of Medicine, 2005) and the U.S. National Cancer Institute (NCI, n.d.) thesaurus as the base medical ontology which we enrich with additional classes and semantic relations to link potential medical treatments with diseases.

This study is part of a bigger project to develop an automatic extraction system to extract treatment information from medical abstracts retrieved from the Medline database, to support information retrieval, question-answering, summarization and knowledge discovery. The purpose of the ontology is to serve as a knowledge base to store the extracted information and support these functions. The ontology is also expected to be useful in supporting synthesis of information extracted from different publications, and inferencing of potentially new relations between chemical substances and effects on diseases, such as envisaged by Swanson and others (Swanson & Smalheiser, 1997; Bekhuis, 2006). Information stored in an ontology can also support evidence-based medicine (Sackett et al., 1996; Guyatt, Cook, & Haynes, 2004) by alerting doctors to the range and quality of clinical data available to make informed treatment decisions. A disease-treatment ontology is potentially important for use in medical digital libraries/portals and medical information systems.
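To make the knowledge-base role concrete, the sketch below stores extracted statements as typed concepts linked by labelled relations and answers one simple query. It is an illustration only: the five class names come from the description above, but the relation labels ("treats", "supports"), the class and method names, and the placeholder concepts are assumptions and do not reflect the actual ontology design.

```python
from dataclasses import dataclass
from enum import Enum


class OntologyClass(Enum):
    """The five classes named in the project description."""
    DISEASE = "disease"
    TREATMENT = "treatment"
    CONDITION = "condition"
    EFFECT = "effect"
    EVIDENCE = "evidence"


@dataclass(frozen=True)
class Concept:
    name: str
    ontology_class: OntologyClass


@dataclass(frozen=True)
class Relation:
    source: Concept
    label: str  # hypothetical relation label, e.g. "treats"
    target: Concept


class KnowledgeBase:
    """Hypothetical in-memory store for extracted relations."""

    def __init__(self) -> None:
        self.relations: list[Relation] = []

    def add(self, source: Concept, label: str, target: Concept) -> None:
        self.relations.append(Relation(source, label, target))

    def treatments_for(self, disease_name: str) -> list[str]:
        """Names of treatments linked to the given disease by a 'treats' relation."""
        return sorted({
            r.source.name
            for r in self.relations
            if r.label == "treats"
            and r.target.ontology_class is OntologyClass.DISEASE
            and r.target.name == disease_name
        })


# Placeholder concepts; no real medical facts are implied.
disease = Concept("disease A", OntologyClass.DISEASE)
treatment_x = Concept("treatment X", OntologyClass.TREATMENT)
treatment_y = Concept("treatment Y", OntologyClass.TREATMENT)
evidence = Concept("abstract 1", OntologyClass.EVIDENCE)

kb = KnowledgeBase()
kb.add(treatment_x, "treats", disease)
kb.add(treatment_y, "treats", disease)
kb.add(evidence, "supports", treatment_x)

print(kb.treatments_for("disease A"))
```

Because statements from different abstracts land in the same store, a query such as `treatments_for` synthesizes information across publications, which is the kind of function the ontology is meant to support.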


Title of Project: How Users Organize Electronic Files on Their Workstations in the Office Environment

Investigators: A/P Chris Khoo & Ast/P Brendan Luyt

Description: This is a study of how users organize electronic files on the hard disk of their office computers: the structural and labelling characteristics of the files, file organization strategy and behavior, the reasons and factors related to that behavior, and the issues and problems encountered by users. The research questions that the project seeks to address fall into four areas:

  • File structure: How are files organized into folders? What are the common types of folders and folder labels (folders are assumed to represent categories of files)? What are the common types of files and filenames? What are the common hierarchical structures? What are the temporal, spatial and organizational characteristics of the files and folders (for example, do some folders become obsolete and forgotten, are there duplicate folders, and so on)?
  • User behaviour: How do users develop, maintain and manage their file structures? How do they locate and retrieve information and documents from their file structure? What problems do they encounter?
  • User cognition and perception: What principles do users follow when organizing their files? What is their reason or rationale for organizing files and folders in a particular way? What perceptions do they have of their file structure and their behaviour?
  • Relationships between file structure, user behaviour and cognition: How do user behaviour, cognition and perception affect file organization characteristics, and vice versa? File organization on the hard disk is related to personal information seeking behaviour—how people locate information in their own files and personal repositories. It is hypothesized that people’s preferred way of locating personal information will have an impact on how they organize their files, and vice versa.
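Some of the file-structure questions above lend themselves to automated measurement. The sketch below is a hypothetical illustration, not the study's instrument: it walks a folder tree and reports the maximum folder depth, folder names that occur more than once, and a count of file types by extension. The function name and the example folder and file names are invented.

```python
import os
import tempfile
from collections import Counter


def summarize_tree(root: str) -> dict:
    """Collect simple structural measures for the folder tree under `root`."""
    root = os.path.abspath(root)
    folder_names = Counter()
    file_types = Counter()
    max_depth = 0
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # The root itself has depth 0; each nested folder adds one level.
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        max_depth = max(max_depth, depth)
        folder_names.update(name.lower() for name in dirnames)
        file_types.update(os.path.splitext(name)[1].lower() for name in filenames)
    duplicates = sorted(name for name, n in folder_names.items() if n > 1)
    return {
        "max_depth": max_depth,
        "duplicate_folder_names": duplicates,
        "file_types": dict(file_types),
    }


# Build a small throwaway tree so the example does not touch real user files.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "projects", "reports"))
    os.makedirs(os.path.join(tmp, "admin", "reports"))
    open(os.path.join(tmp, "projects", "reports", "draft.doc"), "w").close()
    summary = summarize_tree(tmp)

print(summary)
```

Measures like these describe structure only; the questions about user behaviour, cognition and perception still require observation and interviews.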

The results of this project will shed light on an important type of human categorization behaviour, engaged in by most knowledge workers and white collar workers today. The results will also have implications for the design of file structures in operating systems, personal information systems, and personal work spaces and personalization features in enterprise portals and organizational digital repositories. Future research will build on this base to relate folder organization and naming conventions to particular groups of people and the occupational roles they engage in.

 

Title of Project: Human Clustering of Web Pages

Investigators: A/P Chris Khoo & Ast/P Brendan Luyt

Description: This study seeks to find out how humans cluster Web pages naturally. Web search engines are developed to help users locate relevant Web pages, but they often retrieve too many pages. One promising approach to help users make sense of the large retrieval results and locate useful documents is to group the retrieved pages into clusters to give users an overview of the types of Web pages retrieved and allow users to select “promising” clusters for closer examination.

But what kind of clusters or categories are likely to be useful to the user and help the user locate relevant Web pages? Perhaps the useful categories are categories that the users themselves would use in grouping or clustering the Web pages. This study seeks to find out how human beings cluster Web pages. In particular, the study seeks to answer the following questions:

  1. What kind of categories are formed?
  2. How do people decide on the categories to use?
  3. How do they assign Web pages to the categories?
  4. What criteria are used in deciding on the categories and in the assignment of Web pages to categories?
  5. Are there “universal” or common categories that are created by many users?
  6. For the same set of Web pages and query, do different subjects form different categories? Are there differences between the categories constructed by subjects who contributed the query and subjects who did not contribute the query?
  7. What kind of features determine or explain the similarity of Web pages within each category?
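Question 6 implies comparing the groupings produced by different subjects for the same set of pages. One common way to quantify such agreement is the Rand index, the share of page pairs that two subjects treat the same way (both put the pair together, or both keep it apart). The study description does not say which measure is used, so the sketch below is an assumption for illustration, with invented page identifiers and category labels.

```python
from itertools import combinations


def rand_index(a: dict[str, str], b: dict[str, str]) -> float:
    """Share of page pairs on which two clusterings agree (same vs. different cluster)."""
    pages = sorted(set(a) & set(b))
    pairs = list(combinations(pages, 2))
    if not pairs:
        raise ValueError("need at least two pages clustered by both subjects")
    agreements = sum((a[p] == a[q]) == (b[p] == b[q]) for p, q in pairs)
    return agreements / len(pairs)


# Each dict maps a page to the category label one subject assigned to it.
# Labels need not match across subjects; only the grouping matters.
subject_1 = {"p1": "news", "p2": "news", "p3": "shops", "p4": "shops"}
subject_2 = {"p1": "A", "p2": "A", "p3": "A", "p4": "B"}

print(rand_index(subject_1, subject_2))
```

Because the measure looks only at which pages are grouped together, it can compare subjects who chose entirely different category names.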

Our expectation is that many of the categories formed will not be subject related categories but pertain to the form of the documents, the purpose of the author or the type of treatment given to the subject -- and other aspects that cut across subject categories.

It is hoped that in the future, automatic methods can be developed to cluster or categorize Web pages in a way that mimics human clustering. Given the widespread evidence that humans adopt a “path of least cognitive resistance” approach to Web searching, reducing the cognitive load on users must be a prime concern of Web information retrieval. Our hypothesis is that if search results are organized in a way that is natural to human beings or reflects how they might organize the results themselves, this will reduce the cognitive burden on users. The clusters will also help users gain an overall view and understanding of the different subsets of Web pages retrieved by their search.


Graduate Student Projects

M.A.Sc. and Ph.D. Projects

Using classification schemes and thesauri to build an organizational taxonomy for organizing content and aiding navigation
Student: Wang Zhonghong (PhD student)
Supervisors: A/P Abdus Sattar Chaudhry & A/P Christopher Khoo

Using taxonomy of learning objects for enhancing knowledge use and reuse: A study in the domain of knowledge management
Student: Abdul Halim Abdul Karim (PhD student)
Supervisors: A/P Abdus Sattar Chaudhry & A/P Christopher Khoo

 
Completed Theses

The role of taxonomies in improving information architecture for better resource discovery
Student: Jaya Kumaran (MASc)
Supervisor: A/P Abdus Sattar Chaudhry

Building a taxonomy system for the business consulting environment
Student: Julie Goh (MSc)
Supervisor: A/P Abdus Sattar Chaudhry

Taxonomy for cultural and heritage resources
Student: Tan Pei Juin (MSc)
Supervisor: A/P Abdus Sattar Chaudhry

Building a taxonomy system for banking and finance industry
Student: Tan Gee Kian (MSc 2005)
Supervisor: A/P Abdus Sattar Chaudhry