Ground-breaking research project in applying language technology to forum content announces first results and opens first phase of alpha release of technology to special interest group

Berlin, Germany – December 21, 2012 – Twelve months into its 36-month research program, the ACCEPT project is able to announce its first results in several areas. The project aims to investigate and develop new language technologies and user experiences for forums and users of community content. The specific focus of the project is to improve machine translation results for this challenging type of content. To achieve this goal, the main focus has been on improving the quality of the content to make it more translatable and improving the machine translation (MT) engines, particularly statistical machine translation (SMT) engines, to better handle sometimes unconventional input.

The first results are in four main development areas:

  1. An integrated pre-editing plug-in for forum users. Powered by Acrolinx language checking functionality, this new plug-in is designed to gently encourage people who are writing content to address the most critical quality issues, whilst not asking too much of their time. Getting this balance right is one of the major challenges of the project.
  2. Pre-editing strategies for SMT: Acrolinx, Symantec and the University of Geneva have worked intensively on investigating the typical issues that arise, evaluating which issues can be corrected fully automatically without user interaction, and learning corrections or rewording from training corpora.
  3. Innovative developments around the SMT engine MOSES. This work, carried out at the University of Edinburgh, has focused on evaluating which combination of data leads to the best results for user-generated content and using factored models to deal with unknown words.
  4. The Evaluation API provides a flexible framework to collect user feedback from online content repositories such as user forums. The questions and answers used to elicit user feedback are defined in the ACCEPT Portal itself.

Fred Hollowood from Symantec said, “User-generated content in and around the social web is a vital part of the ongoing customer relationship. Technologies which rapidly globalize this communication are must have, so we are delighted to contribute to this important research area.”

“Machine translation is here to stay,” commented Lori Thicke, CEO of Lexcelera. “This project is important because it’s helping us build the technology to make MT even more useful to more global communities.”

Francis Tsang, Director of Globalization at Adobe Systems, said, “As a major MT user and with a very strong strategy to support and leverage user-generated content in our product forums, Adobe is very excited to monitor this research and evaluate it in our own environment. Like Symantec we have a very active and global user community and are convinced that technology can help us support them even better.”

Launching the ACCEPT Special Interest Group

Now that the first visible results are being delivered, the ACCEPT consortium is planning to get early feedback from potential users outside the project. To this end, the project is establishing a Special Interest Group (SIG) and will give access to the first members of the SIG in the first quarter of 2013. Additional members will be brought in through 2013. The project has had an excellent response from members of the two user communities targeted by the ACCEPT technology: high-tech companies, mirroring the use case represented in the project by Symantec, and large government and non-profit organizations, reflecting the use case represented by Lexcelera and Translators without Borders.

Collaboration of leading partners in the field of language technology

ACCEPT brings together Acrolinx, the leading provider of content optimization software based on natural language processing, and the universities of Edinburgh and Geneva, both renowned for their research centers in the field of machine translation. Also participating are Symantec and Lexcelera, two highly experienced users of machine translation systems. Symantec and Translators without Borders, which was founded by Lexcelera, will add their extensive experience with online communities and internet forums.

Project information:

Coordinator: Prof. Pierrette Bouillon,