The machine translation train in our pharmaceutical customer project has really gained momentum! After a scoping phase and the selection of the appropriate system for the customer’s use cases, everything now revolves around the integration and training of MT engines. Current and future MT engines are of interest to various customer stakeholders, so MT testing continues to play a major role even after system selection. Here we give you an insight into the MT processes that blc supports at the customer.
The scoping phase
The scoping phase is used to capture all core requirements for the future MT system. In our pharmaceutical project, we compiled a comprehensive questionnaire in advance, which formed the basis of an initial ‘Request for Information’ (RFI). After a series of coordination rounds, qualitative testing of the three favored MT systems, together with a cost statement, served as the final decision criterion. To do this, we first pre-processed the data for several language directions. Once the engines had been trained, they were evaluated by humans through sentence-by-sentence quality assessment and post-editing of the test sets we had created. Even at this stage, we gained valuable insights into the customer-specific optimization potential of the engines!
Acceptance of MT engines: Testing, tooling, testing …
For the sake of an objective comparison of the MT systems, we did not share the customer-specific training data with the providers during the scoping phase. Only after selecting the winning system did we provide this data to the MT provider. To ensure that future MT training by the provider runs smoothly, a process for the provision and acceptance of representative test sets has been agreed. This is the basis for the final acceptance of MT engines. The engines trained by the provider itself were then subjected to an even more detailed evaluation, for which we trained the customer’s language experts. Finally, we analyzed all error annotations and post-edits made by the language experts and produced a comprehensive report. Such reports are an important source of feedback, both for project management and for engine optimization by the MT provider.
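To illustrate how the error annotations of the language experts might be condensed for such a report, here is a minimal Python sketch. The record layout (`engine`, `segment_id`, `category`), the category labels and the sample numbers are assumptions made for illustration only. They are not the annotation scheme or the data used in the project.

```python
from collections import Counter, defaultdict


def summarize_annotations(annotations):
    """Count error annotations per engine and error category."""
    summary = defaultdict(Counter)
    for ann in annotations:
        summary[ann["engine"]][ann["category"]] += 1
    return summary


def errors_per_100_segments(summary, segments_evaluated):
    """Normalize the error counts by the number of evaluated segments."""
    return {
        engine: 100 * sum(counts.values()) / segments_evaluated[engine]
        for engine, counts in summary.items()
    }


# Hypothetical annotations; engine names and categories are illustrative.
annotations = [
    {"engine": "en-de", "segment_id": 1, "category": "terminology"},
    {"engine": "en-de", "segment_id": 1, "category": "fluency"},
    {"engine": "en-de", "segment_id": 4, "category": "terminology"},
    {"engine": "en-fr", "segment_id": 2, "category": "accuracy"},
]

summary = summarize_annotations(annotations)
rates = errors_per_100_segments(summary, {"en-de": 50, "en-fr": 50})

for engine, counts in summary.items():
    print(engine, dict(counts), f"{rates[engine]:.1f} errors per 100 segments")
```

Normalizing by the number of evaluated segments keeps engines comparable even when their test sets differ in size.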
Old and new stations on the MT route: Quality tracking and stakeholder testing
Testing of an MT engine does not end with the acceptance of the initial training. We support the establishment of MT quality tracking processes in the productive translation process. Feedback from translation service providers, recorded post-editing effort and the ever-increasing translation volume are triggers for re-testing or re-training MT engines.
In addition, we are currently supporting the coordination process with potential users and stakeholders. To check the suitability of the engines for specific use cases, we prepare test sets based on customer data and analyze the evaluation results. If existing engines are unsuitable for a stakeholder (keyword: fit-for-purpose), we aggregate data for re-training or initiate the training of an engine variant.
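Preparing a test set from customer data usually means holding out segments that the engine has not seen during training. The following Python sketch shows one possible way to do this, assuming the data is available as (source, target) pairs. The function names, the normalization rule and the sample sentences are hypothetical and do not describe the project's actual pre-processing.

```python
import random


def normalize(text):
    """Lowercase and collapse whitespace so near-identical sources match."""
    return " ".join(text.lower().split())


def split_test_set(pairs, test_size, seed=0):
    """Return (train, test) with no source sentence shared between them."""
    unique = {}
    for source, target in pairs:
        # Keep only the first translation seen for each normalized source.
        unique.setdefault(normalize(source), (source, target))
    candidates = list(unique.values())
    if test_size > len(candidates):
        raise ValueError("not enough unique segments for the requested test set")
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(candidates)
    return candidates[test_size:], candidates[:test_size]


# Hypothetical sample data for illustration.
pairs = [
    ("Take one tablet daily.", "Täglich eine Tablette einnehmen."),
    ("take one tablet daily.", "Nehmen Sie täglich eine Tablette ein."),
    ("Store below 25 °C.", "Nicht über 25 °C lagern."),
    ("Keep out of reach of children.", "Für Kinder unzugänglich aufbewahren."),
    ("Shake well before use.", "Vor Gebrauch gut schütteln."),
]

train, test = split_test_set(pairs, test_size=2)
print(len(train), "training pairs,", len(test), "test pairs")
```

De-duplicating by source before splitting is a deliberate choice in this sketch: if the same sentence ended up in both sets, the evaluation would overstate engine quality.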
Coordination with customer and MT provider – the oil in the MT transmission
Weekly meetings with our customer and the MT provider keep all decision makers informed about the status of MT integration during and after the go-live. One topic of the meetings is the ongoing optimization of the already trained MT engines. We also discuss how to manage access, roles and rights for the various user groups of the MT system and how to plan further engine training sessions. MT integration methods are coordinated as well: configuration of the translation management system to use different MT profiles, API connection and provision of a translation portal for all employees. For the technical side of integration, IT is of course a permanent member of the coordination rounds!
Prepare language experts and LSPs for the MT-based workflow
After the go-live is before the go-live
After the trained and approved MT engines have been put into production, further MT engines will be trained. Here, of course, we focus on the needs of the customer and the stakeholders and on the existing data. Each MT engine is thoroughly tested using the established evaluation processes. Further API integrations into the customer’s authoring systems are also planned and will be tested.
To make sure our customer is prepared for the challenges of the complex MT process after the project has been completed, we hold expert training courses in which we teach technical and organizational processes (e.g. automation in the evaluation process using scripts, terminology integration). In this way, specialized MT knowledge is gradually built up and our customer can further develop the introduced MT processes on their own. Of course, we are also happy to support them if required. 🙂
Since the first engines were introduced at the customer, a lot has happened. Over the course of 3 years, a total of 32 engines were trained with customer data, evaluated and put into production for the post-editing process and self-service. In addition, there are 23 more generic (non-customer-trained) engines from the same MT provider. Such a large number of engines brings many challenges, for which blc established processes and methods in advance. Of course, the top priority is to track the translation quality of the engines.
Here, blc supported the implementation of processes and methods to capture post-editing effort in individual projects and across projects at the language level, in order to quickly identify problems with individual engines. The problematic segments can be filtered and evaluated quickly. This way, optimization potential quickly finds its way into the weekly coordination meetings with the MT provider. Another source for assessing translation quality is the user feedback that comes from self-service. This feedback is collected centrally and checked manually by the customer for optimization potential.
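As a rough illustration of how post-editing effort can be captured per language and how problematic segments can be filtered, here is a minimal Python sketch. It measures effort as the word-level edit distance between the raw MT output and the post-edited text, normalized by the longer of the two. This metric, the threshold of 0.5 and the field names (`project`, `lang_pair`, `mt`, `pe`) are assumptions for illustration, not the method established in the project.

```python
from collections import defaultdict
from statistics import mean


def word_edit_distance(a, b):
    """Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def pe_effort(mt_output, post_edited):
    """Share of words changed during post-editing, between 0.0 and 1.0."""
    mt, pe = mt_output.split(), post_edited.split()
    longest = max(len(mt), len(pe))
    return word_edit_distance(mt, pe) / longest if longest else 0.0


def effort_by_language(segments):
    """Average post-editing effort per language pair across all projects."""
    scores = defaultdict(list)
    for seg in segments:
        scores[seg["lang_pair"]].append(pe_effort(seg["mt"], seg["pe"]))
    return {lang_pair: mean(values) for lang_pair, values in scores.items()}


def flag_segments(segments, threshold=0.5):
    """Return the segments whose effort reaches the (assumed) threshold."""
    return [s for s in segments if pe_effort(s["mt"], s["pe"]) >= threshold]


# Hypothetical segments for illustration.
segments = [
    {"project": "P1", "lang_pair": "de-en",
     "mt": "store below 25 degrees", "pe": "store below 25 degrees"},
    {"project": "P2", "lang_pair": "de-en",
     "mt": "take one tablet daily", "pe": "take two tablets daily"},
]

print(effort_by_language(segments))
print([s["project"] for s in flag_segments(segments)])
```

Grouping by `project` instead of `lang_pair` would give the per-project view. In practice, an established metric such as TER could replace the simple distance used here.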
Specific glossaries and MT profiles are created for the different translation requirements of individual stakeholders and can be used by defined user groups. In addition, important content from the translation memories is maintained as an additional translation source in self-service.
To make the MT processes even more efficient, the customer is currently introducing an MT interface that will fulfill several functions. On the one hand, it regulates the MT connection of various internal stakeholders to the MT system. On the other hand, it will support quality evaluation in the future by automating the export of project reports from the translation management system, which will further accelerate the analysis of translation quality.