ePSD2 generally uses the same annotation conventions as Oracc. These
are described, from the perspective of people working on projects, on the
Oracc lemmatization page and on the
Sumerian annotation page.
To summarize, we use the following terminology and elements in annotating Sumerian:
- CF: Citation Form
- The headword used in the dictionary.
In general, ePSD2 headwords use the long forms of words, and
explicitly include the final -k in genitive compounds.
- GW: Guide Word
- A label for the word, intended primarily
as a way of disambiguating homophones. A Guide Word is not
necessarily a "basic" meaning of the word; in practice it often
is, but this is not a requirement.
- POS: Part Of Speech
- The reference part-of-speech for the
word. Some words are used both as nouns and as verbs, and it
is not always obvious which to use as the reference
part-of-speech; in such cases we simply make a conventional choice. See EPOS.
- SENSE
- Senses are indicative of the range of meanings of
words. An ongoing objective for future work on ePSD2 is to improve
annotation of the corpora with regard to senses in order to provide
a more nuanced understanding of the ways words are used in
context.
- EPOS: Effective Part of Speech
- This is the
part-of-speech that goes with an individual sense.
- BASE
- Rather than use the term 'root' we use the term
'base' to indicate the portion of a word-form that writes the word
itself rather than any attached morphological markers. Two special
notations are used for Sumerian bases for situations where a single
grapheme combines morphology and base. When the first part of the
grapheme is morphological and the second part belongs to the base,
we separate them with the degree symbol, °, as in b°e₂ for b+e.
When the first part of the grapheme belongs to the base
and the second is morphological, we separate them using the centred
dot, ·, as in e₂-udu-k·a, a writing of eʾuduk[sheephouse].
- CONT: Continuation
- Continuation graphemes are annotated
explicitly because they often give information about the ending of a
word. They have the form +-ga=g.a, meaning that the
base is followed by GA, writing the end of the base, g, and some
other item, a. There is some inconsistency in ePSD2 about when a
CONT is used and when a centred dot is used in the base: this will
be rectified in a forthcoming release.
- MORPH: Morphology
- The morphology string follows a simple
set of conventions for which preliminary documentation is available
on the morphology pages.
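
The BASE and CONT notations described above can be illustrated with a small Python sketch. It is not part of the Oracc or ePSD2 tooling: the function names `split_base` and `parse_cont` are hypothetical, and the CONT pattern is generalized from the single example +-ga=g.a (a `+-` prefix, the grapheme, `=`, the end of the base, `.`, and the following item), which is an assumption about the general format rather than a documented grammar.

```python
import re

DEGREE = "\u00b0"       # ° : morphology comes first, base second
CENTRED_DOT = "\u00b7"  # · : base comes first, morphology second

# Assumed CONT shape, generalized from the one documented example "+-ga=g.a".
CONT_RE = re.compile(r"^\+-(?P<grapheme>[^=]+)=(?P<base_end>[^.]*)\.(?P<other>.+)$")


def split_base(base):
    """Split a BASE string at the degree symbol or centred dot, if present.

    Returns (kind, left, right), where kind is "morph+base" for the degree
    symbol, "base+morph" for the centred dot, and "plain" when neither
    separator occurs (right is then the empty string).
    """
    for sep, kind in ((DEGREE, "morph+base"), (CENTRED_DOT, "base+morph")):
        if sep in base:
            left, right = base.split(sep, 1)
            return kind, left, right
    return "plain", base, ""


def parse_cont(cont):
    """Parse a CONT string into (grapheme, end of base, other item)."""
    match = CONT_RE.match(cont)
    if match is None:
        raise ValueError(f"not a CONT string of the assumed form: {cont!r}")
    return match.group("grapheme"), match.group("base_end"), match.group("other")


if __name__ == "__main__":
    print(split_base("b\u00b0e\u2082"))          # b°e₂
    print(split_base("e\u2082-udu-k\u00b7a"))    # e₂-udu-k·a
    print(parse_cont("+-ga=g.a"))
```

With the examples from the text, `split_base` reports b as morphology and e₂ as base for b°e₂, and e₂-udu-k as base and a as morphology for e₂-udu-k·a; `parse_cont` reads +-ga=g.a as the grapheme ga writing the base-final g followed by a.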