Neural Machine Translation powered by the Crowd

The Neural Machine Translation (NMT) train is rapidly picking up speed and NMT has long since arrived in Germany. This is due to the widespread use of the NMT flagship DeepL as well as the growing interest of small and large companies in integrating specialized customer-specific MT into the translation workflow. In my blog, I deal with everyday questions about the service provider landscape, training possibilities for domain-specific engines and the implementation of post-editing processes.

As our visit to MT Summit 2019 in Dublin showed, translators and their role in the MT workflow are finally receiving more attention in evaluation studies. After all, the potential of NMT can only be fully exploited if machine translation and post-editing are closely interlinked, both professionally and technically. The product landscape reflects these developments and is increasingly adapting to integrate post-editors better. In addition to post-editing integration, the sharing of NMT engines on provider-specific platforms also helps accelerate the implementation of NMT processes.

In both cases, the crowd concept is the driving idea that will also inspire other aspects of machine translation in the future.

And what are NMT developers cooking up?

A perennial topic among NMT-related questions is the quality that can be expected. The key influencing factors are the quantity and quality of the training material as well as the training and evaluation options of the NMT system used. The underlying technology is at much the same state of the art across all providers: the developer community responsible for progress in NMT methods operates in the spirit of open source. The entire system landscape therefore benefits immediately from efficient new developments by the developer crowd. NMT service providers continuously test these developments and adopt them if the results are successful.

For some time now, the transformer architecture has proven successful for NMT models. Thanks to its parallel processing and its consideration of extended lexical context within sentences, the architecture still forms the basis of many NMT systems that combine efficient training with high quality. Sufficient, good training material is of course a prerequisite here as well. However, there is still plenty of room for optimization: even though the recognition of dependencies within a sentence already works well, interpreting terminology during translation still requires a context window that extends beyond the individual sentence. Document-based NMT, which also uses the context of preceding sentences, is therefore one of the NMT development priorities for the coming years. In addition, new research papers on extensions of the transformer architecture and on other approaches are published regularly and promise continuous improvement of NMT quality.
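To make the idea of a context window beyond the single sentence more concrete, here is a minimal sketch of one simple way to feed preceding sentences to a translation model: each source sentence is prefixed with its predecessors, joined by a separator token. This is an illustration only, not the method of any particular provider. The function name `with_context`, the `<sep>` token and the `window` parameter are assumptions made up for this example.

```python
from typing import List

# Hypothetical separator token; a real system would define its own.
SEP = "<sep>"


def with_context(sentences: List[str], window: int = 2, sep: str = SEP) -> List[str]:
    """Prefix each sentence with up to `window` preceding sentences."""
    if window < 0:
        raise ValueError("window must be non-negative")
    inputs = []
    for i, sentence in enumerate(sentences):
        # Preceding sentences that still fit into the context window.
        context = sentences[max(0, i - window):i]
        inputs.append(f" {sep} ".join(context + [sentence]))
    return inputs


doc = [
    "The pump has a valve.",
    "It must be closed before maintenance.",
    "Then open the housing.",
]

for line in with_context(doc, window=1):
    print(line)
```

With `window=1`, the second input contains the first sentence as context, so a model could in principle resolve what "It" refers to. Whether and how much this helps depends on the model and the training data.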

New Product Features for NMT

The core functionalities of MT systems – training and translation – are increasingly being expanded by MT service providers with components that facilitate, for example, the reuse of engines and their integration into the translation process.

For example, KantanMT has moved the post-editing process to an independent cloud application called Kantan SkyNet, which enables internal project groups as well as freelance post-editors to perform post-editing from anywhere at any time. After free registration at https://app.skynet.kantanmt.com, the post-editor is paid per edited sentence. A crowd ranking procedure is used to identify good translations, which are rewarded with a higher rank and higher remuneration for the post-editor. Payment is user-controlled via PayPal.

With the Model Studio, SYSTRAN has introduced a way for users to offer their domain-specific trained engines on the SYSTRAN Marketplace. The marketplace community can use these engines for a fee or take them as a basis for building new specialized engines. The idea of reuse does more than improve the availability of specialized engines for different domains. According to SYSTRAN, it also keeps the ecological footprint low, because using existing engines avoids unnecessary new training.

We are confident that post-editing and the reuse of existing MT engines on crowd-driven platforms will become more common in the future.

Would you also like to benefit from Neural Machine Translation?

Since we deal with neural machine translation, quality expectations, the multitude of MT providers and smart process integration including post-editing every day, we are offering a free webinar in April entitled "NMT: Systems, Processes and Trends". The exact date will be announced later. Would you like to stay informed about current topics and our free webinars? Click here to subscribe to our newsletter (German).

And of course we are happy to support you with your specific MT projects anytime. Contact Christian Eisold!