Statistical Machine Translation

Short course

Lecturers

Philipp Koehn (University of Edinburgh)
and
Ashish Venugopal (Carnegie Mellon University)

Date: November 12-16, 2007

Place: Lectures and discussions take place at J. Liivi 2 (the building of the Faculty of Mathematics and Computer Science), University of Tartu, Estonia.

Schedule
Mo 10.15-12.00 lecture, Room 405
14.15-16.00 lecture, Room 405

Tu 10.15-12.00 lecture, Room 206
14.15-16.00 exercise, Room 205

We 10.15-12.00 lecture, Room 122
14.15-16.00 exercise, Room 205

Th 10.15-12.00 lecture, Room 403
14.15-16.00 exercise, Room 205

Fr 10.15-12.00 lecture, Room 405
14.15-16.00 exercise, Room 205

Minimum number of registrations: 5

Duration: 20 hours (5 days, 4 hours per day - 5 lectures, 5 exercises)

Goals

- Understand the problem of machine translation. Why is it hard and not solved yet?

- Understand recent statistical approaches to MT (word-based, phrase-based, syntactic)

- Be able to use the common tools of the SMT community

Summary of contents

The course introduces the problem of machine translation, with a focus on statistical methods. The major topics are machine translation evaluation, word-based models, phrase-based models and syntactic approaches.

The course presents the theory and methods behind current approaches, and the exercise sessions offer hands-on experience with common tools, including the open-source Moses machine translation system.

Assessment: pass/fail grade

Literature

- Koehn (2007): Statistical Machine Translation (note: the book may not be available at the time of the course; in that case, selected chapters will be provided).

ECTS credits: 3

Prerequisites

The course has no special prerequisites beyond those required for admission to GSLT.

Contact

Mare Koit, mare.koit at ut.ee
Maarika Traat, maarika.traat at ut.ee

Information for participants from abroad