Working on Symbols and Concept Linking

View WebSci 2020 Presentation in a new tab

The WebSci 2020 virtual conference has a special theme of Digital (In)Equality, Digital Inclusion and Digital Humanism on its first day. This will give us the chance to show the initial findings from our linking of freely available Augmentative and Alternative Communication (AAC) symbol sets to support understanding of web content.

There are no standards for the way graphical AAC symbol sets are designed or collated, other than the Blissymbolics ideographic set that was “standardized as ISO-IR 169 a double-byte character set in 1993 including 2384 fixed characters whereas the BCI Unicode proposal suggests 886 characters that then can be combined” (Edutech Wiki).

Even emojis have a Unicode ID, but the pictographic symbols most frequently used by those with complex communication needs do not have an international encoding standard. This means that if you search for different symbols amongst a collection of freely available and openly licensed symbol sets, you find that several symbols have no relationship with the word you entered or the concept required.

symbols for up
Sample symbols shown by Global Symbols when the word ‘up’ was entered in the search.

This lack of concept accuracy means that much work has to be done to enable useful automatic text-to-symbol support for web content. Initially there needs to be a process to support text simplification, or perhaps text summarisation in some cases. Then each keyword needs to be represented by a particular symbol (from a symbol set recognised by the reader) that can be accurately related to the concept by its ISO or Unicode ID. Examples can be found in the WCAG Personalization task force Requirements for Personalization Semantics using the Blissymbolics IDs.
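The keyword-to-symbol step described above can be sketched roughly as follows. This is only an illustration of the idea of using a shared concept ID as the link between a word and the symbols in different sets. The concept IDs, symbol set names and file paths are all invented placeholders, not real Blissymbolics IDs or Global Symbols data.

```python
import re

# Hypothetical concept IDs, invented for illustration only.
# They are NOT real Blissymbolics, ISO or Unicode identifiers.
CONCEPT_IDS = {"up": "C-0001", "drink": "C-0002", "house": "C-0003"}

# Hypothetical symbol sets: each maps a concept ID to one symbol file.
SYMBOL_SETS = {
    "set_a": {"C-0001": "set_a/up.svg", "C-0002": "set_a/drink.svg"},
    "set_b": {"C-0001": "set_b/arrow_up.png", "C-0003": "set_b/home.png"},
}


def symbols_for_text(text, symbol_set):
    """Return a (word, symbol path or None) pair for each word in the text.

    A word gets a symbol only if it has a concept ID and the reader's
    chosen symbol set contains a symbol for that concept.
    """
    lookup = SYMBOL_SETS[symbol_set]
    result = []
    for word in re.findall(r"[a-z']+", text.lower()):
        concept = CONCEPT_IDS.get(word)
        symbol = lookup.get(concept) if concept else None
        result.append((word, symbol))
    return result


if __name__ == "__main__":
    for word, symbol in symbols_for_text("Look up", "set_a"):
        print(word, symbol)
```

Because the lookup goes through the concept ID rather than the label, the same simplified text can be shown with whichever symbol set the reader recognises, and words with no agreed concept are left as plain text.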

The presentation at the beginning of this blog will illustrate the work that has been achieved to date, but it is hoped that more can be written up in the coming months. The aim is to have improved image recognition to assist with the semantic relatedness. This automatic linking will then be used to map to Blissymbolics IDs. It is hoped that this will also enable multilingual mapping, where symbol sets already have label or gloss translations.

However, there still needs to be a process that ensures that, whenever symbol sets are updated, the mapping continues to be accurate, as some symbol sets do not come with APIs! That will be another challenge.

COVID-19, AI and our Conferences

Much has changed for everyone since our last blog. Swami Sivasubramanian, VP of Amazon Machine Learning, AWS, has written an article about the way AI and machine learning have been helping to fight COVID-19, and we can see how varied the use of this technology has been. However, we remain in a world that is having to come to terms with many different ways of working, and travelling to conferences has been off the agenda for the last few months.

We have continued to work on topics covered in our papers for ICCHP, which will be delivered remotely, as will the one we submitted for WebSci 2020. ISAAC 2020 has been moved to 2021. Who knows if we will get to Mexico, but hopefully we will at least have some results from the linking of concepts for several free and open augmentative and alternative communication symbol sets.

As the months pass much of our work will be seen on Global Symbols with examples of how we will be using the linked symbol sets.

We are also trying to support the WCAG personalization task force in their “Requirements for Personalization Semantics” to automatically link concepts to increase understanding of web content for those who use AAC or have literacy difficulties and/or cognitive impairments.

mapping symbol sets
The future for freely available mapped sample AAC symbol sets to illustrate multilingual linking of concepts from simplified web content.

AI and Inclusion projects related to Web Accessibility and AAC support.

Over the last few months we have been concentrating on projects related to automated web accessibility checks and the automatic linking and categorisation of openly licensed and freely available Augmentative and Alternative Communication symbol sets for those with complex communication needs.

As has been mentioned, we presented these projects at a workshop at the Alan Turing Institute in November, and work has been ongoing. It is hoped that the results will be shared by the end of March 2020.

Automating Web Accessibility Checks

Recent regulations and UK laws recognise the W3C Web Content Accessibility Guidelines (WCAG) as a method of ensuring compliance, but testing can be laborious and those checkers that automate the process need to be able to find more of the places where errors occur. This has led to the development of an accessibility checker that carries out well-known automated checks, but also includes image recognition to make it possible to see if the alternative text for images is appropriate. A second AI-related check involves WCAG 2.1 Success Criterion 2.4.4 Link Purpose (In Context). This is where “the purpose of each link can be determined from the link text alone or from the link text together with its programmatically determined link context, except where the purpose of the link would be ambiguous to users in general”.[1]

A Natural Language Processing (NLP) model is used to check whether the text in the aria-label attribute of the hyperlink matches the content at the target URL. Based on the matching result, it is possible to determine whether the target web page or website fits the link purpose criterion. Despite previous research in this area, the task is proving challenging, and two different experiments are being worked on. One experiment has been designed to use existing NLP models (e.g. GloVe), while the other is investigating training with data that includes human input. The results will be published in an academic paper and at a conference.
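As a rough illustration of the matching idea, the sketch below scores how well a link label agrees with the text of its target page. It deliberately swaps in a simple bag-of-words cosine similarity as a stand-in for the GloVe-style models mentioned above, so it is not the checker's actual method. The stop-word list, function names and example strings are all assumptions made for this example.

```python
import math
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "for", "our", "is", "on"}


def bag_of_words(text):
    """Count lower-cased words in the text, ignoring stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def link_purpose_score(label, target_text):
    """Score how closely a link label matches the text of the target page."""
    return cosine(bag_of_words(label), bag_of_words(target_text))


if __name__ == "__main__":
    page = "Accessibility statement for this website"
    print(link_purpose_score("Read our accessibility statement", page))
    print(link_purpose_score("Click here", page))
```

A descriptive label shares words with the target page and scores well above zero, while a label such as "Click here" shares none and scores zero. A word-embedding model would go further by also rewarding labels that use related words rather than identical ones, and any pass or fail threshold would need to be chosen experimentally.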

AAC symbol classification to aid searches.

Global Symbols with a Cboard user

The team have also investigated issues for those supporting Augmentative and Alternative Communication (AAC) users who may have severe communication difficulties and make use of symbols and pictures on speech generating devices. A multilingual symbol repository for families, carers and professionals has been created to link different freely available symbol sets. The symbol sets can be used to create communication charts for the AAC user, but this takes time and finding culturally appropriate symbols is not always easy. A system has been developed that automatically links and categorises symbols across symbol sets according to their parts of speech, topic and language, using a combination of linked data, natural language processing and image recognition. The latter is not always successful in isolation, as symbols lack context and concepts are not necessarily concrete (an image for ‘anxious’, for example), so further work is required to enhance the system. The Global Symbols AAC symbol repository will be making use of these features in its BoardBuilder for making symbol charts by the end of March 2020.

This project is exploring some existing Convolutional Neural Network (CNN, or ConvNet) models to help classify, categorise and integrate AAC symbols. Experiments have already been undertaken to produce a baseline by simply using the image matrix similarity. Due to the nature of AAC symbols, some similar-looking symbols represent different concepts, while some different-looking symbols represent the same concept across different symbol sets. The training data set maps symbol images to labels, and NLP models have been used to map the labels to the same concept across different symbol sets. This will help those supporting AAC users offer much wider symbol choices suitable for different cultures and languages. The Global Symbols API for searching openly licensed and freely available AAC symbols is already being used in the Cboard application for AAC users.
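One way to picture an image matrix similarity baseline is the minimal sketch below. It assumes that each symbol has already been converted to a greyscale pixel matrix of the same size, and it uses cosine similarity between the flattened matrices. The post does not say which similarity measure the project used, so the measure, the function names and the tiny 3x3 "images" are all assumptions for illustration.

```python
import math


def flatten(image):
    """Turn a 2D pixel matrix (list of rows) into a flat list of floats."""
    return [float(pixel) for row in image for pixel in row]


def matrix_similarity(img_a, img_b):
    """Cosine similarity between two same-sized pixel matrices."""
    a, b = flatten(img_a), flatten(img_b)
    if len(a) != len(b):
        raise ValueError("images must have the same dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(query, candidates):
    """Return the name of the candidate image closest to the query image."""
    return max(candidates, key=lambda name: matrix_similarity(query, candidates[name]))


if __name__ == "__main__":
    # Tiny made-up 3x3 "symbols": 1 is ink, 0 is background.
    plus = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
    candidates = {
        "plus_like": [[0, 1, 0], [1, 1, 1], [0, 0, 0]],
        "bar": [[0, 0, 0], [0, 0, 0], [1, 1, 1]],
    }
    print(most_similar(plus, candidates))
```

A pixel-level baseline like this can only say that two drawings look alike. It cannot tell that two visually different symbols stand for the same concept, which is why the labels and NLP models are needed alongside it.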