Portuguese-English Translation for online patent search

07 December 10

A core aspect of the European Commission's (EC) commitment to language diversification is the provision of multilingual access to intellectual property information, namely patents. This will afford inventors in Europe better access to technical information on patents in their native language and foster innovation and growth. Central to such a provision is the availability of high-quality search and translation technologies capable of dealing with the volume and language diversity of large collections of patent data. Machine Translation (MT) software must also be adapted to handle the specific language found in patent documents.

The European Patent Office (EPO) MT project team asked the PLuTO team to assist with a pilot project in MT for the English-Portuguese language pair. This collaboration has provided a number of distinct technical benefits for the project consortium, including the receipt of a large quantity of patent data, some of it of very high quality. Building MT engines for the EPO and developing a full-scale, production-level web service through which the translation service was delivered also satisfied a number of specific project requirements ahead of time.

The preparation of hosting arrangements at Dublin City University for the MT web service aided the development of a number of APIs and interfaces that now serve as core elements of the PLuTO integration framework. Furthermore, once the translation service was implemented, we came through a thorough evaluation by the EPO, both from a linguistic perspective, in terms of the quality of the translations produced, and from an engineering standpoint, in terms of the speed and robustness of the web service.

The IRF brings its expertise in information retrieval and provides access to its data on patent search use cases and to a large-scale, multilingual patent repository.

The project consortium is composed of:

Project Coordinator and project contact: Paraic Sheridan, Dublin City University

European Commission Funding Scheme: PP
Strategic Objective: Theme 5
ICT-PSP Objective Identifier: 5.1 Machine Translation for the Multilingual World

The service is available via espacenet.com.
