GF Tutorial
Muhammad Humayoun
mhuma [at] univ-savoie.fr
PhD student, Department of Mathematics (LAMA)
Université de Savoie
Based on the course "Natural Language Technology" and on talks and tutorials given by Aarne Ranta (aarne [at] cs.chalmers.se)

Natural Language Technology and state of the art Technologies
- 6000-8000 living languages in the world
- Translation systems: either limited or of low quality
- Dialogue systems: both limited and of low quality
- Teaching: not used as much as it could be
- Web search: advanced for some languages but unknown for others
- Error messages: bad quality through "canned text": you have 1 message(s)
- Software localization: lists of canned text sentences

Natural vs programming languages
- Generally, Grammar = Syntax + Semantics
- For a programming language, the grammar is part of its specification and implementation
- A natural language is not defined by a grammar: the grammar of a natural language is a research problem
- A part of a language technology application is often to solve a part of this research problem!

The Objective of these Tutorials
- Building some applications in three subdisciplines of NLT using GF
  - Morphology: theory of words and their forms
  - Syntax: theory of text and sentence structure
  - Semantics: theory of meaning
- Understanding what is needed for high-quality translation, dialogues, etc., and their specific solutions in GF

What will we cover
- Tutorial 1: Morphology & Lexicon
- Tutorial 2: Syntax and Translation systems
- Tutorial 3: Syntax, Translation and Formal Proofs
- Tutorial 4: Syntax and Semantics

What is GF?
- Grammar formalism based on type theory
- Special-purpose functional programming language with a powerful type system
- Fundamental structure: grammar = abstract syntax + concrete syntaxes
- Abstract syntax = semantic conditions (the correct syntactic structures/trees of a language)
- Concrete syntax = mapping of abstract syntax into strings, along with the grammatical features of a language (and back, by reversibility)
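The split between abstract and concrete syntax can be illustrated outside GF with a minimal Python sketch. It is not GF code and not how GF is implemented: the tree shape, the names `all_trees`, `linearize`, `parse` and `translate`, and the two tiny lexicons are assumptions made for this illustration. Only masculine Italian nouns are used so that no agreement is needed, and parsing is done by brute-force search over the finite set of trees.

```python
from itertools import product

# Abstract syntax: a tree is ("Is", (determiner, kind), quality).
DETS = ["This", "That"]
KINDS = ["Wine", "Cheese"]
QUALITIES = ["Good", "Fresh"]

def all_trees():
    """Enumerate every abstract tree of this tiny, finite grammar."""
    for det, kind, quality in product(DETS, KINDS, QUALITIES):
        yield ("Is", (det, kind), quality)

# Concrete syntaxes: each maps the abstract names to strings of one language.
ENGLISH = {"Is": "is", "This": "this", "That": "that", "Wine": "wine",
           "Cheese": "cheese", "Good": "good", "Fresh": "fresh"}
ITALIAN = {"Is": "è", "This": "questo", "That": "quel", "Wine": "vino",
           "Cheese": "formaggio", "Good": "buono", "Fresh": "fresco"}

def linearize(tree, lexicon):
    """Map an abstract tree to a string of the given language."""
    is_, (det, kind), quality = tree
    return " ".join(lexicon[name] for name in (det, kind, is_, quality))

def parse(sentence, lexicon):
    """Go back from a string to trees by searching all trees (brute force)."""
    return [tree for tree in all_trees() if linearize(tree, lexicon) == sentence]

def translate(sentence, source, target):
    """Parse with one concrete syntax, then linearize with another."""
    return [linearize(tree, target) for tree in parse(sentence, source)]
```

With these definitions, `translate("this wine is good", ENGLISH, ITALIAN)` returns `["questo vino è buono"]`: the abstract tree is the shared interlingua, and each language only contributes a concrete syntax.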

Morphology
- Part of speech or word class (nouns, verbs, adjectives, adverbs, etc.)
- GF follows the word-and-paradigm model of morphology, in which word forms are created by combining different morphs
- Inflection tables display all forms of a word.
Example: English regular nouns
              Singular   Plural
Nominative    rat        rats
Genitive      rat's      rats'
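As a rough illustration, the inflection table for English regular nouns can be written as a function from a stem to a table. This is a Python sketch rather than GF code; the function name `reg_noun` and the tag strings are assumptions made for this example.

```python
def reg_noun(stem):
    """Inflection table of a regular English noun, keyed by (number, case)."""
    return {
        ("Sg", "Nom"): stem,
        ("Sg", "Gen"): stem + "'s",
        ("Pl", "Nom"): stem + "s",
        ("Pl", "Gen"): stem + "s'",
    }

rat = reg_noun("rat")
for (number, case), form in rat.items():
    print(number, case, form)
```

The same function gives the table of any other noun that follows this paradigm, which is the point of a paradigm: one definition, many words.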

Stems, endings, morphs, morphemes
A word form can often be analysed into parts:
  undressed --- un- (prefix), dress (stem), -ed (suffix)
  carelessness --- care (stem), -less (suffix), -ness (suffix)
All these significant parts are called morphs.
A morpheme is an abstraction over different morphs that have the same function. For instance, s and es are variants of the plural morpheme: boy + s, kiss + es
Morphological analysis = analysis into morphemes (in the abstract sense of parameter description)
  boys --> boy +Nom +Pl
  babies' --> baby +Gen +Pl
Thus analysis = lemma + tags
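The idea of one plural morpheme realised by different morphs can be sketched as a function that picks the variant from the shape of the stem. This is a simplified Python illustration under assumed spelling rules (sibilant endings take es, consonant + y becomes ies); the function name `plural` is hypothetical and irregular nouns are not covered.

```python
def plural(stem):
    """Choose the morph that realises the plural morpheme for this stem."""
    # Sibilant endings take the variant "es": kiss + es.
    if stem.endswith(("s", "x", "z", "ch", "sh")):
        return stem + "es"
    # Consonant + y: y changes to i before "es" (baby -> babies).
    if stem.endswith("y") and len(stem) > 1 and stem[-2] not in "aeiou":
        return stem[:-1] + "ies"
    # Default variant "s": boy + s.
    return stem + "s"

for stem in ("boy", "kiss", "baby"):
    print(stem, "+Pl ->", plural(stem))
```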

Parameters
The different form descriptions are grouped into types. Examples of such types and their values:
- number: singular, plural (Arabic also: dual)
- gender: masculine, feminine (French, Arabic, Urdu/Hindi) / masculine, feminine, neuter (German, Latin, English)
- case: nominative, genitive (Swedish) / nominative, accusative, dative, genitive (German)
Heavily dependent on language!
A word class is morphologically defined by telling what parameter types its forms depend on.
Parametric vs. inherent features: to define a word class, we should also tell what inherent features attach to it.
Cf. class in Java:
- method: inflection for different combinations of parameters
- attributes: inherent features
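The distinction between parametric and inherent features can be sketched in Python along the lines of the Java analogy. The class and function names (`Noun`, `Adjective`, `modify`) and the French sample words are assumptions for this illustration: a French noun varies in number but carries its gender as a fixed attribute, while an adjective varies in both gender and number.

```python
from dataclasses import dataclass
from enum import Enum

class Number(Enum):
    SG = "singular"
    PL = "plural"

class Gender(Enum):
    MASC = "masculine"
    FEM = "feminine"

@dataclass
class Noun:
    forms: dict      # parametric: Number -> string
    gender: Gender   # inherent: fixed for each noun

@dataclass
class Adjective:
    forms: dict      # parametric: (Gender, Number) -> string

def modify(noun, adjective, number):
    """Noun followed by an adjective agreeing with it in gender and number."""
    return noun.forms[number] + " " + adjective.forms[(noun.gender, number)]

maison = Noun({Number.SG: "maison", Number.PL: "maisons"}, Gender.FEM)
livre = Noun({Number.SG: "livre", Number.PL: "livres"}, Gender.MASC)
vert = Adjective({
    (Gender.MASC, Number.SG): "vert", (Gender.MASC, Number.PL): "verts",
    (Gender.FEM, Number.SG): "verte", (Gender.FEM, Number.PL): "vertes",
})

print(modify(maison, vert, Number.PL))  # the adjective takes the noun's gender
```

The caller chooses the number, but never the gender: that value is read off the noun, which is what makes it inherent.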

Defining morphology of a language
- Type system: define parameter types and word classes
- Inflection engine: define all paradigms for all word classes
- Lexicon: list all words with their word classes and paradigms
The definitions can be made with something like 100 + 1000 + 10000 lines of code for a "medium hard" language like French.
English needs fewer types and paradigms, but more lexicon.
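The three layers can be laid out in miniature as follows. This is a toy Python sketch with a handful of English words; the names `FORMS`, `PARADIGMS`, `LEXICON` and `inflect_all`, and the paradigm names, are hypothetical and chosen only for this example.

```python
# 1. Type system: the form descriptions that each word class depends on.
FORMS = {"N": ("Sg", "Pl"), "V": ("Inf", "Pres3Sg", "Past")}

# 2. Inflection engine: one function per paradigm.
def noun_s(lemma):
    return {"Sg": lemma, "Pl": lemma + "s"}

def noun_es(lemma):
    return {"Sg": lemma, "Pl": lemma + "es"}

def verb_ed(lemma):
    return {"Inf": lemma, "Pres3Sg": lemma + "s", "Past": lemma + "ed"}

# Each paradigm name is tied to its word class and its function.
PARADIGMS = {
    "noun_s": ("N", noun_s),
    "noun_es": ("N", noun_es),
    "verb_ed": ("V", verb_ed),
}

# 3. Lexicon: every word is listed with its paradigm (and thereby its class).
LEXICON = [("rat", "noun_s"), ("kiss", "noun_es"), ("walk", "verb_ed")]

def inflect_all(lexicon):
    """Build the full table of every lexicon entry."""
    result = {}
    for lemma, paradigm_name in lexicon:
        word_class, paradigm = PARADIGMS[paradigm_name]
        table = paradigm(lemma)
        # Check the table against the type system of its word class.
        assert set(table) == set(FORMS[word_class])
        result[lemma] = (word_class, table)
    return result

print(inflect_all(LEXICON)["kiss"])
```

The layers grow at different rates, as the line counts above suggest: the type system stays small, the paradigms are a fixed inventory, and the lexicon is where most of the entries end up.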

Uses of a morphology
- Synthesis: given a dictionary word, generate the inflection table.
- Analysis: given a word form, return lemma, word class, and form description (which can be ambiguous)
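One simple way to obtain analysis from synthesis is to invert the generated tables. The Python sketch below assumes two hand-written noun tables, and the names `TABLES`, `build_index` and `analyse` are hypothetical; it shows how a single word form can receive more than one analysis.

```python
from collections import defaultdict

# Synthesis results: inflection tables keyed by (number, case).
TABLES = {
    "rat": {("Sg", "Nom"): "rat", ("Sg", "Gen"): "rat's",
            ("Pl", "Nom"): "rats", ("Pl", "Gen"): "rats'"},
    "sheep": {("Sg", "Nom"): "sheep", ("Sg", "Gen"): "sheep's",
              ("Pl", "Nom"): "sheep", ("Pl", "Gen"): "sheep's"},
}

def build_index(tables, word_class="N"):
    """Invert the tables: word form -> list of (lemma, word class, tags)."""
    index = defaultdict(list)
    for lemma, table in tables.items():
        for tags, form in table.items():
            index[form].append((lemma, word_class, tags))
    return index

def analyse(form, index):
    """Return every analysis of a word form (empty list if unknown)."""
    return index.get(form, [])

INDEX = build_index(TABLES)
print(analyse("rats", INDEX))   # one analysis
print(analyse("sheep", INDEX))  # two analyses: the form is ambiguous
```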

Implementing morphology
- General-purpose programming languages: Haskell, Caml, Java, C, ... One needs to define the types and data structures of the type system, the inflection engine, and the lexicon. And also an analysis program! Example: Functional Morphology (a Haskell library for morphology development).
- Special-purpose morphology languages. The most well-known: XFST, based on regular expressions.
- GF: in addition, it extends seamlessly from morphology to syntax and semantics.