Thursday, July 24, 2014

The Choices of a Simplification System

There are many choices to make when building a lexical simplification (LS) system. In my experience there are three big decisions to make: the target audience, the source documents and the mode of presentation. Let's look at each of these in detail.

Target Audience

Firstly, you need a well-defined group to aim your simplification at.  This group should have a clear style of language; documents written specifically for the group may be useful here.  Its members should all require a similar form of simplification, otherwise you will be writing several simplification systems.  The group shouldn't be too narrowly defined (e.g. Deaf children in the age range 8-11 with a below average reading age), as this will make it difficult to find test subjects.  It also shouldn't be too broadly defined, otherwise its members may have different simplification needs.

Once you have a group to simplify documents for, you're ready to consider the next step.

Source Material

You must decide what type of text to simplify.  It's easy to assume that text is text and you can just build a general model, but in fact different genres have their own peculiarities and jargon.  Consider the difference between a set of news articles, Wikipedia entries and social media posts.  Each differs significantly in composition from the others.  Of course, the text genre should be one which the target audience wants you to simplify!  That's why this step comes after selecting the target group.  It's also important at this point to check that nobody else has tried to simplify the type of documents that you're working with.

Mode of Presentation

There are, roughly speaking, three ways of presenting the simplifications to the end user.  Which one you choose depends upon factors such as the text itself, the requirements of the user and the reason behind the simplification. Each has advantages and disadvantages as outlined below:

Fully Automated

Substitutions are applied directly to the text where possible.  The user never sees the simplifications being made and so does not need to know that they are reading a simplified document.

  • User is presented with a simple document
  • Requires minimal work from author / user
  • Can be performed on the fly - e.g. whilst browsing the web / reading e-books / etc.
  • Errors in the simplification process cannot be recovered from
  • Simplification may alter the meaning away from what the original author intended
  • Some words may not be simplified - leaving difficult terms to be seen by the user
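
As a rough illustration of this mode, here is a minimal Python sketch that applies substitutions directly to the text. The table `SIMPLER`, the function name `simplify_text` and the example words are all hypothetical, made up for this post; a real system would pick substitutions using context rather than a fixed dictionary.

```python
import re

# Hypothetical substitution table: difficult word -> simpler alternative.
SIMPLER = {
    "purchase": "buy",
    "commence": "start",
    "sufficient": "enough",
}

def simplify_text(text, table=SIMPLER):
    """Replace every word found in `table`; leave all other words untouched."""
    def swap(match):
        word = match.group(0)
        replacement = table.get(word.lower())
        if replacement is None:
            return word  # no known simplification: the difficult word stays
        # Preserve a leading capital so sentence starts still look right.
        return replacement.capitalize() if word[0].isupper() else replacement
    return re.sub(r"[A-Za-z]+", swap, text)

print(simplify_text("Commence when you have sufficient funds to purchase an abode."))
# prints: Start when you have enough funds to buy an abode.
```

Note that "abode" is not in the table and so passes through unchanged, which is exactly the last drawback in the list above.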

Author Aid

The simplifications are presented to the author of the document, who chooses when and how to apply them.  In this case, the simplification system acts much like a spell checker.

  • Author can make simplifications which retain their original intention
  • No chance of grammar mistakes - as the author has the final say over which words to use
  • Work must be simplified before being published
  • No guarantee that the author will apply the suggested simplifications
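
The spell checker analogy could be sketched in Python along the following lines. This is only an illustration: the table `SUGGESTIONS` and the functions `suggest_simplifications` and `apply_choice` are hypothetical names, not part of any existing tool. The key point is that the text is never changed until the author accepts a suggestion.

```python
import re

# Hypothetical suggestion table: difficult word -> candidate simpler words.
SUGGESTIONS = {
    "utilise": ["use"],
    "endeavour": ["try", "attempt"],
}

def suggest_simplifications(text, table=SUGGESTIONS):
    """Return (start, end, word, candidates) for each flagged word.

    The text itself is never modified; the author decides what to apply.
    """
    found = []
    for match in re.finditer(r"[A-Za-z]+", text):
        candidates = table.get(match.group(0).lower())
        if candidates:
            found.append((match.start(), match.end(), match.group(0), candidates))
    return found

def apply_choice(text, start, end, replacement):
    """Apply a single suggestion that the author has accepted."""
    return text[:start] + replacement + text[end:]

draft = "We endeavour to utilise plain words."
for start, end, word, candidates in suggest_simplifications(draft):
    print(f"{word!r} at {start}-{end}: consider {candidates}")
```

If the author accepts several suggestions, applying them from the end of the text backwards keeps the earlier character offsets valid.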

On Demand

The user has the option to look up simplifications if they find a word too difficult.  These simplifications are typically presented somewhere nearby on the screen.  For example, if a word is clicked on, the simplification may appear in a pop up box or in a dedicated area outside of the body of text.

  • User gets to see the original text
  • Helps language learning / recovery as the user can decide when they require simplification
  • User may struggle if a text or portion of a text has a high density of difficult words
  • The user may be distracted by the simplifications, which divert their attention away from the text
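
A minimal sketch of the on demand mode, again in Python and again under assumptions: `GLOSSARY` and `lookup` are hypothetical names, and the user interface (the click, the pop up box) is left out. The original text is shown untouched, and a simplification is only fetched when the reader asks for one.

```python
# Hypothetical glossary: difficult word -> simpler word shown on request.
GLOSSARY = {
    "ameliorate": "improve",
    "ubiquitous": "common",
}

def lookup(word, glossary=GLOSSARY):
    """Return a simpler word for `word`, or None if none is known.

    Intended to be called only when the reader asks (e.g. clicks a word).
    """
    # Strip surrounding punctuation so "ubiquitous," still matches.
    key = word.strip(".,;:!?\"'()").lower()
    return glossary.get(key)

print(lookup("Ubiquitous,"))  # common
print(lookup("phone"))        # None
```

Returning `None` for unknown words lets the interface decide what to show, for instance nothing at all, rather than guessing at a simplification.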