KI Wissen

Two new publications concerning knowledge extraction

Knowledge extraction is one of the core topics of the KI Wissen project, which investigates and develops different methods for extracting knowledge. Among other things, this work should contribute to the explainability and analyzability of AI systems. Two new publications now shed light on two of those methods.


First, the paper "SViT: Hybrid Vision Transformer Models with Scattering Transform", published at IEEE MLSP 2022, explores how to specify a graphical entity (token) with unique semantic representations in images. The Transformer, an increasingly popular network architecture, not only achieves strong performance in natural language processing but also shows feature extraction ability comparable to that of convolutional neural networks in computer vision tasks. By combining the Scattering Transform with the Vision Transformer (ViT), this work demonstrates the effect of different tokenization approaches.

In contrast to SViT, which deals with local semantic representation, the second publication "Concept Embedding Analysis: A Review" surveys methods that aim to find global, evaluable associations of human-interpretable semantic concepts with internal representations of a deep neural network.

The paper establishes a general definition of concept (embedding) analysis (CA) and a taxonomy for CA methods, uniting several ideas from the literature. This makes it easy to position and compare CA approaches. Guided by these notions, the paper reviews the current state of the art in CA methods and interesting applications, and it discusses, compares, and categorizes more than thirty relevant methods. Finally, for practitioners, it surveys fifteen datasets that have been used for supervised concept analysis and points out future challenges and research directions. In the long run, being able to read stored knowledge from deep neural networks (DNNs) can be useful for debugging, verification, or knowledge gain. Accordingly, CA has several important applications, such as the verification of symbolic requirements on DNNs for safety evidence.
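To make the idea of associating a concept with internal representations more concrete, the following is a minimal sketch of one common style of supervised concept analysis: a linear probe fitted on activations. It is not taken from the review or from the KI Wissen project. The activations are synthetic stand-ins for a DNN layer, and all names (`make_synthetic_activations`, `fit_concept_probe`, `probe_accuracy`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def make_synthetic_activations(n=400, d=16, seed=0):
    """Create stand-in "layer activations" (assumption: no real DNN is used).

    Samples with the concept present are shifted along a hidden unit
    direction, which the probe should recover.
    """
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n).astype(float)  # 1.0 = concept present
    direction = rng.normal(size=d)
    direction /= np.linalg.norm(direction)
    noise = rng.normal(size=(n, d))
    activations = noise + 4.0 * (labels - 0.5)[:, None] * direction
    return activations, labels, direction


def fit_concept_probe(activations, labels, lr=0.1, steps=500):
    """Fit a logistic-regression probe with plain gradient descent.

    Returns (w, b); w can be read as a candidate concept vector.
    """
    n, d = activations.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        logits = activations @ w + b
        probs = 1.0 / (1.0 + np.exp(-logits))
        error = probs - labels
        w -= lr * (activations.T @ error) / n
        b -= lr * error.mean()
    return w, b


def probe_accuracy(activations, labels, w, b):
    """Fraction of samples whose concept label the probe predicts correctly."""
    predictions = (activations @ w + b) > 0
    return float((predictions == labels.astype(bool)).mean())


if __name__ == "__main__":
    acts, labels, direction = make_synthetic_activations()
    w, b = fit_concept_probe(acts, labels)
    print("probe accuracy:", probe_accuracy(acts, labels, w, b))
    print("cosine to hidden direction:", float(w @ direction / np.linalg.norm(w)))
```

In this sketch, the probe's accuracy serves as a simple, evaluable measure of how well the concept is linearly encoded in the activations, and the weight vector gives a global direction associated with the concept. Methods covered by the review may differ substantially from this simplified setup.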

Image: Oleg Klementiev © CC BY 2.0; Nick Webb © CC BY 2.0; Yandle © CC BY 2.0