Accuracy is always worth pursuing. Regardless of your domain, the more accurate your algorithm, the better. In many domains, some inaccurate results can be tolerated. For example, if you search for 'jaguar in the jungle', you are likely to receive lots of results about big cats in their natural habitat, but you may also receive some results about fancy cars in the jungle. This is acceptable, and may even be helpful, as the original query contained some ambiguity: maybe you really wanted to know about those fancy cars.
Inaccurate results can also occur during text simplification. Inaccurate identifications or replacements may lead to an incorrect result appearing in the final text. Some of the critical points of failure are as follows:
- A complex word could be mislabeled as simple, meaning it is never considered for simplification.
- No replacements may be available for an identified complex word.
- A replacement which does not make sense in the context of the original word may be selected.
- A complex replacement may be incorrectly selected over a simpler alternative due to the difficulty of estimating lexical complexity.
For example, the sentence:

"A young couple with children will need nearly 12 years to get enough money for a deposit."

was changed by a rudimentary lexical simplification system to:

"A young couple with children will need nearly 12 years to get enough money for a sediment."

Not only has a synonym been chosen that is more complicated than the original word, but the synonym also makes no sense in the given context. Because of this error, the text becomes harder to understand, and it would have been better to make no simplification at all.
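
To make these failure points concrete, here is a minimal sketch of the kind of rudimentary system that could produce this error. It is not the system that generated the example: the names `FREQ`, `SYNONYMS`, `COMPLEX_BELOW` and `naive_simplify` are hypothetical, and the frequency counts are invented for illustration. It treats rare words as complex and swaps in the most frequent synonym without looking at context.

```python
# Toy lexical simplifier. All counts and synonym lists are invented for
# illustration; they are not taken from any real corpus or thesaurus.

COMPLEX_BELOW = 100  # words with a count below this are treated as complex

# Invented counts, as if drawn from a small, geology-heavy corpus.
FREQ = {"deposit": 40, "sediment": 75, "payment": 50}

# Invented synonym list that mixes two senses of "deposit".
SYNONYMS = {"deposit": ["sediment", "payment"]}


def is_complex(word):
    # Unknown words default to the threshold, so they count as simple.
    # This is failure point one: a complex word can slip through unnoticed.
    return FREQ.get(word, COMPLEX_BELOW) < COMPLEX_BELOW


def naive_simplify(sentence):
    out = []
    for token in sentence.split():
        core = token.rstrip(".,;:!?")   # separate trailing punctuation
        tail = token[len(core):]
        word = core.lower()
        candidates = SYNONYMS.get(word, [])
        if is_complex(word) and candidates:
            # Pick the most frequent synonym, ignoring context entirely.
            core = max(candidates, key=lambda w: FREQ.get(w, 0))
        out.append(core + tail)
    return " ".join(out)


original = ("A young couple with children will need nearly 12 years "
            "to get enough money for a deposit.")
print(naive_simplify(original))
```

With these made-up counts, the sketch replaces 'deposit' with 'sediment': the frequency estimate is skewed by the corpus, and nothing checks whether the chosen sense fits the sentence.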
To end this post, I will present some practical ways to mitigate these errors.
- Only simplify if you're sure. Thresholds for deciding whether to simplify should be set high to avoid errors.
- Use resources which are well suited to your task, preferably built from as large a corpus as possible.
- Investigate these errors in the resulting text. If they are occurring, is there a specific reason?
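
The first point, only simplifying when you're sure, can be sketched as a guard around the replacement step. This is one possible shape, not a definitive implementation: the `Candidate` class, the `simplicity_gain` and `context_fit` scores, and the default threshold values are all assumptions made for illustration. How you compute those scores depends on your own resources.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    word: str
    simplicity_gain: float  # hypothetical score: how much simpler than the original
    context_fit: float      # hypothetical score in [0, 1]: fit with the sentence


def choose_replacement(candidates: List[Candidate],
                       min_gain: float = 0.3,
                       min_fit: float = 0.8) -> Optional[str]:
    # Keep only candidates that clear BOTH thresholds.
    safe = [c for c in candidates
            if c.simplicity_gain >= min_gain and c.context_fit >= min_fit]
    if not safe:
        # Not sure enough: signal the caller to keep the original word.
        return None
    # Among the safe candidates, prefer the one that fits the context best.
    return max(safe, key=lambda c: c.context_fit).word


# Illustrative scores only: "sediment" is harder and fits badly, so it is rejected.
print(choose_replacement([Candidate("sediment", -0.2, 0.1)]))
```

Returning `None` rather than the least bad candidate is the point of the design: when no candidate clears the thresholds, the original word stays in the text.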