Examples
Common features from tokens
In this example, we will crate a paradigm for the present and past tense forms of the English copula to be (tokens) and compute the common features for all different word forms (types).
Define a feature system with the meanings for the paradigm cells.
>>> import features
>>> context = '''
... |+1|-1|+2|-2|+3|-3|+sg|+pl|+pst|-pst|
... 1sg.pres| X| | | X| | X| X| | | X|
... 1pl.pres| X| | | X| | X| | X| | X|
... 2sg.pres| | X| X| | | X| X| | | X|
... 2pl.pres| | X| X| | | X| | X| | X|
... 3sg.pres| | X| | X| X| | X| | | X|
... 3pl.pres| | X| | X| X| | | X| | X|
... 1sg.past| X| | | X| | X| X| | X| |
... 1pl.past| X| | | X| | X| | X| X| |
... 2sg.past| | X| X| | | X| X| | X| |
... 2pl.past| | X| X| | | X| | X| X| |
... 3sg.past| | X| | X| X| | X| | X| |
... 3pl.past| | X| | X| X| | | X| X| |'''
>>> fs = features.make_features(context)
>>> cellmeanings = fs.atoms
Enter the word forms for each cell.
>>> cellforms = [
... 'am', 'are',
... 'are', 'are',
... 'is', 'are',
...
... 'was', 'were',
... 'were', 'were',
... 'was', 'were']
Create the paradigm as ordered mapping
(collections.OrderedDict
) from meaning to form.
>>> from collections import OrderedDict
>>> paradigm = OrderedDict(zip(cellmeanings, cellforms))
Pretty-print the meaning -> word form mapping.
>>> for meaning, form in paradigm.items():
... print(meaning.string_extent, form, sep=' | ')
1sg.pres | am
1pl.pres | are
2sg.pres | are
2pl.pres | are
3sg.pres | is
3pl.pres | are
1sg.past | was
1pl.past | were
2sg.past | were
2pl.past | were
3sg.past | was
3pl.past | were
Create a correspondence from each word form to the list of cell meanings where it occurs.
>>> occurrences = OrderedDict()
>>> for meaning in paradigm:
... form = paradigm[meaning]
... occurrences.setdefault(form, []).append(meaning)
Pretty-print the form -> occurrences mapping.
>>> for form in occurrences:
... meanings = occurrences[form]
... labels = ', '.join(m.string_extent for m in meanings)
... print(f'{form:>4} | {labels}')
am | 1sg.pres
are | 1pl.pres, 2sg.pres, 2pl.pres, 3pl.pres
is | 3sg.pres
was | 1sg.past, 3sg.past
were | 1pl.past, 2sg.past, 2pl.past, 3pl.past
Show the common features for all word forms, computed with the
join()
-method (generalization, least upper bound).
>>> for form in occurrences:
... meanings = occurrences[form]
... common = fs.join(meanings)
... print(f'{form:>4} | {common}')
am | [+1 +sg -pst]
are | [-pst]
is | [+3 +sg -pst]
was | [-2 +sg +pst]
were | [+pst]
Their necessary conditions (common features).