User Guide
Installation
features is a pure-Python package implementing feature set algebra
as commonly used in linguistics. It runs under Python 3.8+ and is
available from PyPI. To install it using pip, run the following command:
$ pip install features
For a system-wide install, this typically requires administrator access. For an
isolated installation, you can run the same command inside a venv or a virtualenv.
The pip command will automatically download and install the (pure-Python)
fileconfig and concepts packages (plus dependencies) from PyPI. The latter
provides the lower-level Formal Concept Analysis (FCA) algorithms on which
features is based. Features is essentially a convenience wrapper around the
FCA functionality of concepts.
Feature systems
Features includes some predefined feature systems that you can try out
immediately and that are used as examples in this documentation. See below on
how to define, persist, and load your own feature systems/definitions.
To load a feature system, pass its name to features.FeatureSystem:
>>> import features
>>> fs = features.FeatureSystem('plural')
>>> fs
<FeatureSystem('plural') of 6 atoms 22 featuresets>
The built-in feature systems are defined in the config.ini file in the
package directory (usually, this will be Lib/site-packages/features/ in your
Python directory). You can either directly define new systems within a Python
script or create your own INI-file(s) with definitions so that you can load
and reuse feature systems in different scripts.
The definition of a feature system is stored in its context object. It is
basically a cross-table giving the features (properties) for each thing to be
described (object):
>>> print(fs.context)
<Context object mapping 6 objects to 10 properties [3011c283] at 0x...>
  |+1|-1|+2|-2|+3|-3|+sg|+pl|-sg|-pl|
1s|X |  |  |X |  |X |X  |   |   |X  |
1p|X |  |  |X |  |X |   |X  |X  |   |
2s|  |X |X |  |  |X |X  |   |   |X  |
2p|  |X |X |  |  |X |   |X  |X  |   |
3s|  |X |  |X |X |  |X  |   |   |X  |
3p|  |X |  |X |X |  |   |X  |X  |   |
>>> fs.context.objects
('1s', '1p', '2s', '2p', '3s', '3p')
>>> fs.context.properties
('+1', '-1', '+2', '-2', '+3', '-3', '+sg', '+pl', '-sg', '-pl')
>>> fs.context.bools
[(True, False, False, True, False, True, True, False, False, True),
(True, False, False, True, False, True, False, True, True, False),
(False, True, True, False, False, True, True, False, False, True),
(False, True, True, False, False, True, False, True, True, False),
(False, True, False, True, True, False, True, False, False, True),
(False, True, False, True, True, False, False, True, True, False)]
In other words, the context provides a mapping from objects to features and vice versa (check the documentation of the concepts package for further information on its full functionality):
>>> fs.context.intension(['1s', '1p']) # common features?
('+1', '-2', '-3')
>>> fs.context.extension(['-3', '+sg']) # common objects?
('1s', '2s')
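The two lookups can be sketched in plain Python without the package. This is a minimal illustration rather than the concepts implementation: the ROWS encoding and both helper functions are made up for this example, and only the data is taken from the cross-table above.

```python
# Context of the 'plural' system, copied from the cross-table above.
OBJECTS = ('1s', '1p', '2s', '2p', '3s', '3p')
PROPERTIES = ('+1', '-1', '+2', '-2', '+3', '-3', '+sg', '+pl', '-sg', '-pl')
# One string per object, one character per property ('X' = has the feature).
ROWS = ('X..X.XX..X', 'X..X.X.XX.', '.XX..XX..X',
        '.XX..X.XX.', '.X.XX.X..X', '.X.XX..XX.')


def intension(objects):
    """Return the features shared by all given objects."""
    rows = [ROWS[OBJECTS.index(o)] for o in objects]
    return tuple(p for i, p in enumerate(PROPERTIES)
                 if all(row[i] == 'X' for row in rows))


def extension(properties):
    """Return the objects that carry all given features."""
    columns = [PROPERTIES.index(p) for p in properties]
    return tuple(o for o, row in zip(OBJECTS, ROWS)
                 if all(row[c] == 'X' for c in columns))


print(intension(['1s', '1p']))    # ('+1', '-2', '-3')
print(extension(['-3', '+sg']))   # ('1s', '2s')
```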
Feature sets
Every feature system contains a contradictory feature set that has all features and refers to no object:
>>> fs.infimum
FeatureSet('+1 -1 +2 -2 +3 -3 +sg +pl -sg -pl')
>>> fs.infimum.concept.extent
()
It also contains a maximally general, tautological feature set that has no features and refers to all objects:
>>> fs.supremum
FeatureSet('')
>>> fs.supremum.concept.extent
('1s', '1p', '2s', '2p', '3s', '3p')
Use the feature system to iterate over all defined feature sets in shortlex extent order:
>>> for f in fs:
... print(f, f.concept.extent)
[+1 -1 +2 -2 +3 -3 +sg +pl -sg -pl] ()
[+1 +sg] ('1s',)
[+1 +pl] ('1p',)
[+2 +sg] ('2s',)
[+2 +pl] ('2p',)
[+3 +sg] ('3s',)
[+3 +pl] ('3p',)
[+1] ('1s', '1p')
[-3 +sg] ('1s', '2s')
[-2 +sg] ('1s', '3s')
[-3 +pl] ('1p', '2p')
[-2 +pl] ('1p', '3p')
[+2] ('2s', '2p')
[-1 +sg] ('2s', '3s')
[-1 +pl] ('2p', '3p')
[+3] ('3s', '3p')
[+sg] ('1s', '2s', '3s')
[+pl] ('1p', '2p', '3p')
[-3] ('1s', '1p', '2s', '2p')
[-2] ('1s', '1p', '3s', '3p')
[-1] ('2s', '2p', '3s', '3p')
[] ('1s', '1p', '2s', '2p', '3s', '3p')
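The number of feature sets and their order can be reproduced with a brute-force sketch. It assumes that the feature sets correspond one-to-one to the distinct extents obtainable from some combination of features, and that shortlex extent order means shorter extents first with ties broken by object order in the table. The ROWS encoding and the helper are illustrative and not part of the package.

```python
from itertools import combinations

OBJECTS = ('1s', '1p', '2s', '2p', '3s', '3p')
PROPERTIES = ('+1', '-1', '+2', '-2', '+3', '-3', '+sg', '+pl', '-sg', '-pl')
# One string per object, one character per property ('X' = has the feature).
ROWS = ('X..X.XX..X', 'X..X.X.XX.', '.XX..XX..X',
        '.XX..X.XX.', '.X.XX.X..X', '.X.XX..XX.')


def extension(properties):
    """Return the extent as a tuple of object indices."""
    columns = [PROPERTIES.index(p) for p in properties]
    return tuple(i for i, row in enumerate(ROWS)
                 if all(row[c] == 'X' for c in columns))


# Every extent is the extension of some combination of features.
extents = {extension(combo)
           for size in range(len(PROPERTIES) + 1)
           for combo in combinations(PROPERTIES, size)}

# Shortlex: shorter extents first, ties broken by object order.
ordered = sorted(extents, key=lambda extent: (len(extent), extent))

print(len(ordered))
for extent in ordered:
    print(tuple(OBJECTS[i] for i in extent))
```

Trying all 1024 feature combinations is fine for a table of this size; it is meant to make the definition visible, not to be efficient.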
By default, the string representation shows the smallest possible notation for each feature set (shortlex minimum). The full representation and an extent-based representation are also available:
>>> fs('1sg').string
'+1 +sg'
>>> fs('1sg').string_maximal
'+1 -2 -3 +sg -pl'
>>> fs('1sg').string_extent
'1s'
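As a rough illustration of the three notations, the following sketch derives them from the context table alone. It assumes that the minimal notation is the first feature combination, in shortlex order of the table columns, that picks out the same objects, and that the maximal notation lists all features shared by those objects. The helper names are hypothetical. The sketch only covers feature sets with a non-empty extent; the contradictory feature set is displayed with all its features, as shown above.

```python
from itertools import combinations

OBJECTS = ('1s', '1p', '2s', '2p', '3s', '3p')
PROPERTIES = ('+1', '-1', '+2', '-2', '+3', '-3', '+sg', '+pl', '-sg', '-pl')
ROWS = ('X..X.XX..X', 'X..X.X.XX.', '.XX..XX..X',
        '.XX..X.XX.', '.X.XX.X..X', '.X.XX..XX.')


def extension(properties):
    """Return the objects that carry all given features."""
    columns = [PROPERTIES.index(p) for p in properties]
    return tuple(o for o, row in zip(OBJECTS, ROWS)
                 if all(row[c] == 'X' for c in columns))


def intension(objects):
    """Return the features shared by all given objects."""
    rows = [ROWS[OBJECTS.index(o)] for o in objects]
    return tuple(p for i, p in enumerate(PROPERTIES)
                 if all(row[i] == 'X' for row in rows))


def maximal(properties):
    """All features that hold for the objects picked out by properties."""
    return intension(extension(properties))


def minimal(properties):
    """First combination (shortest, then table order) with the same extent."""
    target = extension(properties)
    for size in range(len(PROPERTIES) + 1):
        for combo in combinations(PROPERTIES, size):
            if extension(combo) == target:
                return combo


print(' '.join(minimal(['+1', '-2', '-3', '+sg', '-pl'])))  # +1 +sg
print(' '.join(maximal(['+1', '+sg'])))                     # +1 -2 -3 +sg -pl
print(' '.join(extension(['+1', '+sg'])))                   # 1s
```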
To use the maximal representation for __str__(), put str_maximal = true into
the configuration file section (see below).
Retrieval
You can call the feature system with an iterable of features to retrieve one of its feature sets:
>>> fs(['+1', '+sg'])
FeatureSet('+1 +sg')
Usually, it is more convenient to let the system extract the features from a string:
>>> fs('+1 +sg')
FeatureSet('+1 +sg')
Leading plusses can be omitted. Spaces are optional. Case, order, and duplication of features are ignored.
>>> fs('2 pl')
FeatureSet('+2 +pl')
>>> fs('SG3sg')
FeatureSet('+3 +sg')
Note that commas are not allowed inside the string.
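The parsing rules can be approximated with a regular expression. The pattern and the parse() helper below are assumptions made for illustration and are not the parser used by the package; they only mirror the rules stated above (optional plus, optional spaces, case and duplicates ignored, anything else rejected).

```python
import re

PROPERTIES = ('+1', '-1', '+2', '-2', '+3', '-3', '+sg', '+pl', '-sg', '-pl')

# Feature names without sign, longest first so longer names win in the regex.
NAMES = sorted({p[1:] for p in PROPERTIES}, key=len, reverse=True)
PATTERN = re.compile(
    '([+-]?)({})'.format('|'.join(map(re.escape, NAMES))), re.IGNORECASE)


def parse(string):
    """Extract normalized features from a string (hypothetical helper)."""
    leftover = PATTERN.sub('', string).strip()
    if leftover:  # e.g. commas or unknown feature names
        raise ValueError('unrecognized input: {!r}'.format(leftover))
    found = {(sign or '+') + name.lower()
             for sign, name in PATTERN.findall(string)}
    # Report the features in table order, without duplicates.
    return tuple(p for p in PROPERTIES if p in found)


print(parse('2 pl'))    # ('+2', '+pl')
print(parse('SG3sg'))   # ('+3', '+sg')
```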
Uniqueness
Feature sets are singletons. The constructor is also idempotent:
>>> fs('1sg') is fs('1sg')
True
>>> fs(fs('1sg')) is fs('1sg')
True
All different possible ways to notate a feature set map to the same instance:
>>> fs('+1 -2 -3 -sg +pl') is fs('1pl')
True
>>> fs('+sg') is fs('-pl')
True
Notations are equivalent when they refer to the same set of objects (have the same extent).
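One way to obtain this behavior is a cache keyed by extent, so that every notation with the same extent resolves to the same object. The class and factory function below are hypothetical and only sketch the idea; the package may implement uniqueness differently.

```python
OBJECTS = ('1s', '1p', '2s', '2p', '3s', '3p')
PROPERTIES = ('+1', '-1', '+2', '-2', '+3', '-3', '+sg', '+pl', '-sg', '-pl')
ROWS = ('X..X.XX..X', 'X..X.X.XX.', '.XX..XX..X',
        '.XX..X.XX.', '.X.XX.X..X', '.X.XX..XX.')


def extension(properties):
    """Return the objects that carry all given features."""
    columns = [PROPERTIES.index(p) for p in properties]
    return tuple(o for o, row in zip(OBJECTS, ROWS)
                 if all(row[c] == 'X' for c in columns))


class FeatureSetSketch:
    """Stand-in for a feature set, identified by its extent."""

    def __init__(self, extent):
        self.extent = extent


_cache = {}  # extent -> the one instance for that extent


def feature_set(features):
    """Return the unique instance for the given features (or instance)."""
    if isinstance(features, FeatureSetSketch):
        return features  # idempotent: an instance maps to itself
    extent = extension(features)
    if extent not in _cache:
        _cache[extent] = FeatureSetSketch(extent)
    return _cache[extent]


print(feature_set(['+sg']) is feature_set(['-pl']))  # True
```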
Comparisons
Compatibility tests:
>>> fs('+1').incompatible_with(fs('+3'))
True
>>> fs('sg').complement_of(fs('pl'))
True
>>> fs('-1').subcontrary_with(fs('-2'))
True
>>> fs('+1').orthogonal_to(fs('+sg'))
True
Set inclusion (subsumption):
>>> fs('') < fs('-3') <= fs('-3') < fs('+1') < fs('1sg')
True
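These tests can be read in terms of extents. The sketch below uses the textbook logical definitions (incompatible: no common object; complement: no common object and together all objects; subcontrary: common objects and together all objects), which reproduce the examples above but are an assumption about the package, whose exact definitions may differ. Subsumption is the reverse of extent inclusion: the more features, the fewer objects.

```python
OBJECTS = ('1s', '1p', '2s', '2p', '3s', '3p')
PROPERTIES = ('+1', '-1', '+2', '-2', '+3', '-3', '+sg', '+pl', '-sg', '-pl')
ROWS = ('X..X.XX..X', 'X..X.X.XX.', '.XX..XX..X',
        '.XX..X.XX.', '.X.XX.X..X', '.X.XX..XX.')
ALL = frozenset(OBJECTS)


def ext(*properties):
    """Return the set of objects that carry all given features."""
    columns = [PROPERTIES.index(p) for p in properties]
    return frozenset(o for o, row in zip(OBJECTS, ROWS)
                     if all(row[c] == 'X' for c in columns))


def incompatible(a, b):
    return not (a & b)


def complement(a, b):
    return not (a & b) and (a | b) == ALL


def subcontrary(a, b):
    return bool(a & b) and (a | b) == ALL


print(incompatible(ext('+1'), ext('+3')))   # True
print(complement(ext('+sg'), ext('+pl')))   # True
print(subcontrary(ext('-1'), ext('-2')))    # True
# A more general feature set has a strictly larger extent.
print(ext() > ext('-3') > ext('+1') > ext('+1', '+sg'))  # True
```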
Operations
Intersection (join, generalization, closest feature set that subsumes the given ones):
>>> fs('1sg') % fs('2sg') # common features, or?
FeatureSet('-3 +sg')
Intersect an iterable of feature sets:
>>> fs.join([fs('+1'), fs('+2'), fs('1sg')])
FeatureSet('-3')
Union (meet, unification, closest feature set that implies the given ones):
>>> fs('-1') ^ fs('-2') # combined features, and?
FeatureSet('+3')
Unify an iterable of feature sets:
>>> fs.meet([fs('+1'), fs('+sg'), fs('-3')])
FeatureSet('+1 +sg')
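Both operations can be sketched on the context table. Here join keeps the features shared by the maximal notations of all operands, and meet pools the features and then adds everything implied by them. The function names are chosen for this illustration, and the results are given in maximal notation, so ('-3', '+sg', '-pl') corresponds to FeatureSet('-3 +sg') above.

```python
OBJECTS = ('1s', '1p', '2s', '2p', '3s', '3p')
PROPERTIES = ('+1', '-1', '+2', '-2', '+3', '-3', '+sg', '+pl', '-sg', '-pl')
ROWS = ('X..X.XX..X', 'X..X.X.XX.', '.XX..XX..X',
        '.XX..X.XX.', '.X.XX.X..X', '.X.XX..XX.')


def extension(properties):
    """Return the objects that carry all given features."""
    columns = [PROPERTIES.index(p) for p in properties]
    return tuple(o for o, row in zip(OBJECTS, ROWS)
                 if all(row[c] == 'X' for c in columns))


def intension(objects):
    """Return the features shared by all given objects."""
    rows = [ROWS[OBJECTS.index(o)] for o in objects]
    return tuple(p for i, p in enumerate(PROPERTIES)
                 if all(row[i] == 'X' for row in rows))


def join(*feature_sets):
    """Generalization: features common to all operands (maximal notation)."""
    intents = [set(intension(extension(fs))) for fs in feature_sets]
    common = set.intersection(*intents)
    return tuple(p for p in PROPERTIES if p in common)


def meet(*feature_sets):
    """Unification: pooled features plus everything they imply."""
    pooled = [p for fs in feature_sets for p in fs]
    return intension(extension(pooled))


print(join(['+1', '+sg'], ['+2', '+sg']))   # ('-3', '+sg', '-pl')
print(meet(['-1'], ['-2']))                 # ('-1', '-2', '+3')
```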
Relations
Immediately implied/subsumed neighbors:
>>> fs('+1').upper_neighbors
[FeatureSet('-3'), FeatureSet('-2')]
>>> fs('+1').lower_neighbors
[FeatureSet('+1 +sg'), FeatureSet('+1 +pl')]
Complete set of implied/subsumed neighbors:
>>> list(fs('+1').upset())
[FeatureSet('+1'), FeatureSet('-3'), FeatureSet('-2'), FeatureSet('')]
>>> list(fs('+1').downset())
[FeatureSet('+1'),
FeatureSet('+1 +sg'), FeatureSet('+1 +pl'),
FeatureSet('+1 -1 +2 -2 +3 -3 +sg +pl -sg -pl')]
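In terms of extents, the upset of a feature set consists of all feature sets whose extent includes its extent, the downset of those whose extent is included in it, and the neighbors are the closest proper ones in each direction. The brute-force sketch below rests on that reading; the helpers are illustrative and do not reflect how the package computes its lattice.

```python
from itertools import combinations

OBJECTS = ('1s', '1p', '2s', '2p', '3s', '3p')
PROPERTIES = ('+1', '-1', '+2', '-2', '+3', '-3', '+sg', '+pl', '-sg', '-pl')
ROWS = ('X..X.XX..X', 'X..X.X.XX.', '.XX..XX..X',
        '.XX..X.XX.', '.X.XX.X..X', '.X.XX..XX.')


def extension(properties):
    """Return the set of objects that carry all given features."""
    columns = [PROPERTIES.index(p) for p in properties]
    return frozenset(o for o, row in zip(OBJECTS, ROWS)
                     if all(row[c] == 'X' for c in columns))


# All extents, one per feature set of the system.
EXTENTS = {extension(combo)
           for size in range(len(PROPERTIES) + 1)
           for combo in combinations(PROPERTIES, size)}


def indexes(extent):
    return sorted(OBJECTS.index(o) for o in extent)


def upset(extent):
    """Extents including this one, smallest first."""
    return sorted((e for e in EXTENTS if e >= extent),
                  key=lambda e: (len(e), indexes(e)))


def downset(extent):
    """Extents included in this one, largest first."""
    return sorted((e for e in EXTENTS if e <= extent),
                  key=lambda e: (-len(e), indexes(e)))


def upper_neighbors(extent):
    """Smallest extents properly including this one."""
    above = [e for e in upset(extent) if e != extent]
    return [e for e in above if not any(other < e for other in above)]


def lower_neighbors(extent):
    """Largest extents properly included in this one."""
    below = [e for e in downset(extent) if e != extent]
    return [e for e in below if not any(other > e for other in below)]


first = extension(['+1'])
print([sorted(e, key=OBJECTS.index) for e in upper_neighbors(first)])
print([sorted(e, key=OBJECTS.index) for e in lower_neighbors(first)])
```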
Definition
If you do not need to save your definition, you can directly create a system from an ASCII-art style table:
>>> fs = features.make_features('''
...      |+male|-male|+adult|-adult|
... man  |  X  |     |   X  |      |
... woman|     |  X  |   X  |      |
... boy  |  X  |     |      |   X  |
... girl |     |  X  |      |   X  |
... ''', str_maximal=False)
>>> fs
<FeatureSystem object of 4 atoms 10 featuresets at 0x...>
>>> for f in fs:
... print(f, f.concept.extent)
[+male -male +adult -adult] ()
[+male +adult] ('man',)
[-male +adult] ('woman',)
[+male -adult] ('boy',)
[-male -adult] ('girl',)
[+adult] ('man', 'woman')
[+male] ('man', 'boy')
[-male] ('woman', 'girl')
[-adult] ('boy', 'girl')
[] ('man', 'woman', 'boy', 'girl')
Note that the strings representing the objects and the features need to be disjoint, and no feature may be a substring of another feature.
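To show what such a table encodes, here is a small sketch that turns it into objects, properties, and boolean rows. The parse_table() function is hypothetical and written for this example only; it is not the parser behind make_features().

```python
TABLE = '''
     |+male|-male|+adult|-adult|
man  |  X  |     |   X  |      |
woman|     |  X  |   X  |      |
boy  |  X  |     |      |   X  |
girl |     |  X  |      |   X  |
'''


def parse_table(source):
    """Split an ASCII-art cross-table into objects, properties, bools."""
    lines = [line for line in source.splitlines() if line.strip()]
    cells = [[cell.strip() for cell in line.split('|')] for line in lines]
    header, rows = cells[0], cells[1:]
    # The first cell is the object column, the last one follows the final '|'.
    properties = tuple(header[1:-1])
    objects = tuple(row[0] for row in rows)
    bools = [tuple(cell == 'X' for cell in row[1:-1]) for row in rows]
    return objects, properties, bools


objects, properties, bools = parse_table(TABLE)
print(objects)
print(properties)
for obj, row in zip(objects, bools):
    print(obj, row)
```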
To load feature systems by name, create an INI-file with your configurations, for example:
# phonemes.ini - define distinctive features
[vowels]
description = Distinctive vowel place features
str_maximal = true
context =
     |+high|-high|+low|-low|+back|-back|+round|-round|
    i|  X  |     |    | X  |     |  X  |      |  X   |
    y|  X  |     |    | X  |     |  X  |  X   |      |
    ɯ|  X  |     |    | X  |  X  |     |      |  X   |
    u|  X  |     |    | X  |  X  |     |  X   |      |
    e|     |  X  |    | X  |     |  X  |      |  X   |
    ø|     |  X  |    | X  |     |  X  |  X   |      |
    ɤ|     |  X  |    | X  |  X  |     |      |  X   |
    o|     |  X  |    | X  |  X  |     |  X   |      |
    æ|     |  X  | X  |    |     |  X  |      |  X   |
    œ|     |  X  | X  |    |     |  X  |  X   |      |
    ɑ|     |  X  | X  |    |  X  |     |      |  X   |
    ɒ|     |  X  | X  |    |  X  |     |  X   |      |
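To see how such a section breaks down into its options, the following sketch reads a four-row excerpt of the table with configparser from the standard library. This is a stand-in chosen for illustration: the package itself relies on fileconfig, whose handling may differ. Note that the table rows must be indented so that they count as continuation lines of the context option.

```python
import configparser

# Excerpt of the section above (first rows only), as an INI string.
INI = '''\
[vowels]
description = Distinctive vowel place features
str_maximal = true
context =
     |+high|-high|+low|-low|+back|-back|+round|-round|
    i|  X  |     |    | X  |     |  X  |      |  X   |
    y|  X  |     |    | X  |     |  X  |  X   |      |
    e|     |  X  |    | X  |     |  X  |      |  X   |
    ø|     |  X  |    | X  |     |  X  |  X   |      |
'''

parser = configparser.ConfigParser()
parser.read_string(INI)
section = parser['vowels']

description = section['description']
str_maximal = section.getboolean('str_maximal')
# The multi-line value starts with an empty first line; configparser also
# strips the indentation of every continuation line.
table_lines = section['context'].strip().splitlines()

print(description)
print(str_maximal)
for line in table_lines:
    print(line)
```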
Add your config file, overriding existing sections with the same name:
>>> features.add_config('examples/phonemes.ini')
If the filename is relative, it is resolved relative to the file where the
add_config() function was called. Check the documentation of the
fileconfig package for details.
Load your feature system:
>>> fs = features.FeatureSystem('vowels')
>>> fs
<FeatureSystem('vowels') of 12 atoms 55 featuresets>
Retrieve feature sets, extents and intents:
>>> print(fs('+high'))
[+high -low]
>>> print('high round = {}, {}'.format(*fs('high round').concept.extent))
high round = y, u
>>> print('i, e, o = {}'.format(*fs.lattice[('i', 'e', 'o')].intent))
i, e, o = -low
Logical relations between feature pairs (excluding orthogonal pairs):
>>> print(fs.context.relations())
+high  complement   -high
+low   complement   -low
+back  complement   -back
+round complement   -round
+high  incompatible +low
+high  implication  -low
+low   implication  -high
-high  subcontrary  -low
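The same eight relations fall out of a pairwise comparison of the feature columns. The sketch below encodes the vowel table row by row (objects are identified by row number only) and classifies each pair by its extents. The classification rules are the usual logical definitions and an assumption about what the package computes, including the 'equivalent' label, which does not occur in this table.

```python
from itertools import combinations

PROPERTIES = ('+high', '-high', '+low', '-low',
              '+back', '-back', '+round', '-round')
# One string per table row, one character per property ('X' = has it).
ROWS = ('X..X.X.X', 'X..X.XX.', 'X..XX..X', 'X..XX.X.',
        '.X.X.X.X', '.X.X.XX.', '.X.XX..X', '.X.XX.X.',
        '.XX..X.X', '.XX..XX.', '.XX.X..X', '.XX.X.X.')
ALL = frozenset(range(len(ROWS)))
EXTENT = {p: frozenset(i for i, row in enumerate(ROWS) if row[c] == 'X')
          for c, p in enumerate(PROPERTIES)}


def relation(left, right):
    """Classify a feature pair as (first, relation, second) by extents."""
    a, b = EXTENT[left], EXTENT[right]
    if not a & b:
        kind = 'complement' if a | b == ALL else 'incompatible'
        return (left, kind, right)
    if a == b:
        return (left, 'equivalent', right)
    if a < b:
        return (left, 'implication', right)
    if b < a:
        return (right, 'implication', left)
    if a | b == ALL:
        return (left, 'subcontrary', right)
    return (left, 'orthogonal', right)


relations = [relation(left, right)
             for left, right in combinations(PROPERTIES, 2)]
for first, kind, second in relations:
    if kind != 'orthogonal':
        print(first, kind, second)
```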