Searching ePSD2

Introduction: Special Characters | Separators

Glossary Search: Glossary Transliteration Search | Sign Search | Base and Citation Form Search | More on Citation Forms | English Search | Combining Glossary Searches | Compound Entries | Specialized Glossaries

Corpus Search: Corpus Transliteration Search | Sign Sequence Search | Metadata | Lemma Search | Combinations

Sign List Search

A Note About the Corpora

Introduction

ePSD2 searches may proceed through the ePSD2 Glossary [/epsd2/sux], through the ePSD2 Corpus [/epsd2/pager], or through the ePSD2 Sign List [/epsd2/signlist]. The following table summarizes the main use of each; more details and options are discussed below.

Page | Search For | Example Search
Glossary [/epsd2/sux] | Words in transliteration or translation | en-nu-un
Corpus [/epsd2/pager] | Collocations, Sign Sequences | s:ne~ga2
Sign List [/epsd2/signlist] | Words that use a particular sign | nu

Go to the Glossary [/epsd2/sux] to find a word and its translation, its attestations, and some references to important literature. Glossary search will result in a list of lemmas (Sumerian words, Emesal words, and/or Proper Nouns); clicking on one result will open the relevant article [/epsd2/about/articles/index.html] (the lemma) in ePSD2. Corpus [/epsd2/pager] search is appropriate for more advanced, complex, or desperate searches. It allows for a search of the entire ePSD2 corpus that feeds the ePSD2 glossary. Corpus search will result in a list of lines or (in the case of metadata search) in a list of documents. Clicking on a result will open a page with the edition and metadata of the text in which the line appears. In the Sign List [/epsd2/signlist] one may discover all words that include a particular sign, with the results sorted by the reading of that sign.

The search options described below are not fundamentally different from general ORACC searches, as discussed in the ORACC documentation. The user is encouraged to consult the pages on Glossary search [/doc/help/visitingoracc/glossaries/index.html] and Corpus search [/doc/search/searchingcorpora/index.html], which explain in detail the various elements of glossary pages and corpus pages and how to use them. In some respects, however, the structure of ePSD2 and its expected usage are different enough from regular ORACC projects to warrant a dedicated help page.

Special Characters

When searching, use the following special characters in ASCII or in Unicode. Do not use lengthening marks over vowels (for instance in proper nouns).
Name | ASCII | Unicode
ALEF | ' (single quotation mark) | ʾ
Nasal G | j | ŋ
SHIN | sz, SZ | š, Š
Sign Index Numbers | 0-9 | ₀-₉

Searches in ASCII and Unicode are equivalent: searching for szej6 (ASCII) will give the same result as šeŋ₆ (Unicode); you may even use a mixed mode, such as šej6. Do not use accents, that is: search for e2 (not é) or e₃ (not è).
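The equivalence of the two input modes can be pictured as a normalization step applied to the search term before matching. The Python sketch below is illustrative only and is not the ORACC implementation: the function name to_unicode is hypothetical, and it assumes that a digit counts as a sign index number only when it directly follows a letter.

```python
import re

# ASCII digits mapped to Unicode subscript digits (sign index numbers).
SUBSCRIPTS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")


def to_unicode(term: str) -> str:
    """Convert an ASCII or mixed-mode search term to its Unicode form.

    Hypothetical helper; covers only the characters listed in the table above.
    """
    term = term.replace("sz", "š").replace("SZ", "Š")  # SHIN
    term = term.replace("j", "ŋ")                      # Nasal G
    term = term.replace("'", "ʾ")                      # ALEF
    # Assumption: digits directly after a letter are sign index numbers;
    # free-standing numbers (as in "MVN 1 68") are left alone.
    return re.sub(
        r"(?<=[^\W\d_])[0-9]+",
        lambda m: m.group().translate(SUBSCRIPTS),
        term,
    )


print(to_unicode("szej6"))  # ASCII input
print(to_unicode("šej6"))   # mixed-mode input
```

Both calls yield the same normalized string, which is why ASCII, Unicode, and mixed input can be treated as the same search.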

Separators

In search, signs may be separated by hyphen, space, tilde, or underscore, each indicating different search parameters:

Separator | Example | Result
- | ka-ga | ka followed by ga in the same word
SPACE | ka ga | ka and ga anywhere in the same line or entry
~ | ka~ga | ka followed by ga (ka-ga or ka ga)
_ | ka_ga | ka followed by ga in the next word
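The four separators can be modelled on a toy line in which words are separated by spaces and signs by hyphens. The following Python sketch is a rough model for two-term searches, not the ORACC search engine; the function names are hypothetical, and it reads "in the next word" as "ka ends one word and ga begins the following one", which is an assumption.

```python
def parse_line(line):
    """Split a transliterated line into words, and each word into signs."""
    return [word.split("-") for word in line.split()]


def matches(a, sep, b, line):
    """Return True if signs a and b occur in line as required by sep."""
    words = parse_line(line)
    signs = [sign for word in words for sign in word]
    if sep == " ":
        # Both signs anywhere in the line, in any order.
        return a in signs and b in signs
    # a immediately followed by b inside one word.
    same_word = any((a, b) in zip(w, w[1:]) for w in words)
    # a as last sign of a word, b as first sign of the next word (assumption).
    next_word = any(
        w1[-1] == a and w2[0] == b for w1, w2 in zip(words, words[1:])
    )
    if sep == "-":
        return same_word
    if sep == "_":
        return next_word
    if sep == "~":
        return same_word or next_word
    raise ValueError(f"unknown separator: {sep!r}")


print(matches("ka", "-", "ga", "inim ka-ga"))  # same word
print(matches("ka", "~", "ga", "ka ga-ab"))    # across a word boundary
```

In this model the tilde is simply the union of the hyphen and underscore cases, which matches the description "ka-ga or ka ga" in the table.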

In the search box, determinatives are written on the line, as in amar-d-suen. However, one may also omit the determinative and simply write amar-suen to find instances of amar-suen as well as amar-{d}suen, bringing up results with or without the determinative.

Glossary Search

For a glossary search, go to the URL http://oracc.org/epsd2/sux [/epsd2/sux] and use the search box to find Sumerian words by transliteration, citation form, English translation, or a combination of those. In some cases it is advantageous to filter results by adding one of several prefixes, including s: (Sign), b: (Base), or c: (Citation Form). More on those prefixes below.

Examples of valid searches are:

Glossary Transliteration Search

In transliteration search you may give as many (or as few) signs as you wish. Searching e2-dub-ba-a will get you to the article edubbaʾa[scribal school] [/epsd2/o0026700], but e2-dub will find that entry as well. Searching for e2 dub (without the hyphen) is interpreted as "any word that includes e₂ and dub (in any order)." This will return edubbaʾa[scribal school] [/epsd2/o0026700], but also imešagduba[calculation tablet] [/epsd2/o0030960] and several other words. Try nij2 gu2 for an example that yields results with the search terms in different orders. Transliteration search may include morphological prefixes and suffixes, provided that that particular form is actually attested in the current corpus. A search for im-e-re-ša will return ere[go] [/epsd2/o0027146], but im-e-re-eš currently returns nothing. The search engine uses aliases [/epsd2/searching/aliases] to deal with different transliteration systems. Searching for sugal7 or sukkal will get you the same results, and so will du11 vs. dug4. Searching for du11 (or dug4) will not return words that use other values of the KA sign (such as zu₂, ka, inim, or kir₄).
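Aliasing can be thought of as mapping each reading to one canonical spelling before comparison. The Python sketch below is a simplified illustration: the ALIASES table contains only the two pairs mentioned above, the choice of which spelling counts as canonical is arbitrary, and the real alias list [/epsd2/searching/aliases] is much larger.

```python
# Hypothetical excerpt of an alias table: alternative reading -> canonical reading.
# Only the two pairs named in the text; the direction is an arbitrary choice.
ALIASES = {
    "sukkal": "sugal₇",
    "dug₄": "du₁₁",
}


def canonical(reading: str) -> str:
    """Return the canonical spelling of a reading, or the reading itself."""
    return ALIASES.get(reading, reading)


def same_search(a: str, b: str) -> bool:
    """True if two readings are treated as the same search term."""
    return canonical(a) == canonical(b)


print(same_search("sukkal", "sugal₇"))  # aliases of one another
print(same_search("du₁₁", "ka"))        # same sign, but not aliases
```

Note that aliasing works on readings, not on signs: du₁₁ and ka are both values of KA, yet they remain different search terms unless the s: prefix is used.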

Sign Search: The Prefix s:

The prefix s: turns your search into a sign search, matching any sign values of the search terms. For instance, s:gar~ni~de2 will find nindaʾidea [/epsd2/o0036259]. The s: prefix (or any other prefix, see below) is only valid for the search term that follows it immediately. Thus the search s:gar de2 means: "the sign GAR in any reading, preceded or followed by de₂". Note that ORACC maximally differentiates between signs, so that IM does not match ni₂, and TUG₂ and NAM₂ are treated as two different signs, because these are differentiated during part of the third millennium.

Base and Citation Form search: The Prefixes b: and c:

Glossary search will try to find matches in Citation Form (nindaʾidea), Guide Word (bread), Sense (pastry), Form (ninda-i₃-de₂-a-aš), and Base (ninda-i₃-de₂-a). In some cases it may be advantageous to restrict your search to Citation Form or Base only by using the prefixes c: and b:, respectively. Do not insert a space between the prefix and the search term. Searching for a currently yields 5578 results, which is not very useful. Searching for c:a (Citation Form = "a") yields 60 results, including a[arm] [/epsd2/o0023086], a[water] [/epsd2/o0023102] and numerous expressions that include one of these words. In order to restrict your search to Bases only (and exclude all affixes), use the prefix b:. The search b:nu will find all words that have nu in the Base (such as allanum [/epsd2/o0023941], oak, or NU [/epsd2/o0036438], to spin), but not verbal forms with the negation prefix nu-. Note that this search may also be done effectively through the ePSD2 Sign List [/epsd2/signlist].
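The rule that a prefix applies only to the term it is attached to can be illustrated with a small query parser. This Python sketch is an assumption about how such a query might be broken down, not a description of the ORACC parser; the function name parse_terms and the label "any" for unprefixed terms are hypothetical.

```python
# Prefixes described on this page and the field each one restricts the search to.
FIELDS = {"s": "sign", "b": "base", "c": "citation form"}


def parse_terms(query: str):
    """Split a glossary query into (field, term) pairs.

    A prefix restricts only the term it is attached to; unprefixed terms
    are searched in all fields (labelled "any" here).
    """
    parsed = []
    for term in query.split():
        prefix, colon, rest = term.partition(":")
        if colon and prefix in FIELDS and rest:
            parsed.append((FIELDS[prefix], rest))
        else:
            parsed.append(("any", term))
    return parsed


print(parse_terms("s:gar de2"))
print(parse_terms("c:a"))
```

Because the query is split on spaces, writing "c: a" would separate the prefix from its term, which is one way to see why no space may be inserted after the colon.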

More on Citation Forms

The Citation Form in ePSD2 is independent of the way the word is written; it omits hyphens and sign index numbers, but it does include a final /k/ if the word is a genitive construction. Examples are šatam [/epsd2/o0038930] (usually written ša₃-tam), or ensik [/epsd2/o0027087] (usually written ensi₂). Two adjacent vowels are separated by an Alef, as in edubbaʾa [/epsd2/o0026700]. If the proper reading of one or more signs is unknown, the Citation Form is equivalent to the transliteration, but with dots instead of hyphens separating the signs (for instance ad.KID [/epsd2/o0023618], a weaver). One may search for such items by writing ad-KID, ad.KID, ad-kid or ad.kid.

English Search

Searching by English translation will find matches in Guide Word and Sense. Searching for weave, for instance, will currently yield tuku[beat] [/epsd2/o0040567], sag[beat] [/epsd2/o0037168], tu[beat] [/epsd2/o0040432], and kad[tie] [/epsd2/o0031459], because all these verbs are recognized as having a meaning of 'to weave'.

Combining Glossary Searches

The ePSD2 glossary search will return Sumerian words as well as words in Emesal and proper nouns found in Sumerian contexts. You will often find more than you were looking for and it may help to combine search methods. A search for bala will return dozens of hits. A search for bala spindle (combining transliteration and translation) leads directly to the article balak[spindle] [/epsd2/o0024881].

Compound Entries

Compound entries (such as compound verbs) may be found by simply entering the two (or more) parts of the compound; for instance saj rig (or, in transliteration, saj rig7 or saŋ rig₇), which will find the articles saŋ rig[bestow] [/epsd2/o0037276] and sarig[gift] [/epsd2/o0037683]. One may also go to one of the component words in the glossary and find links to the compounds. The entry saŋ[head] [/epsd2/o0037222] refers to saŋ ba[bestow] [http://oracc.org/epsd2/o0037240]; saŋ bala[shake] [http://oracc.org/epsd2/o0037242]; saŋ ŋeš rah[kill] [http://oracc.org/epsd2/o0037262]; and several other compounds, including, of course, saŋ rig[bestow] [/epsd2/o0037276].

Specialized Glossaries

ePSD2 provides a number of specialized glossaries, which all feed into the main epsd2/sux [/epsd2/sux] glossary discussed above. The Emesal Glossary [/epsd2/emesal/sux-x-emesal] collects all word instances that are marked as Emesal, either because they have Emesal roots (such as umun[lord] [/epsd2/emesal/o0049890]) or Emesal morphology (as in de₃-ma-ab-gub-bu-ne). The Proper Nouns [/epsd2/names/qpn] glossary collects references to Personal Names, Divine Names, Geographical Names, Month Names, Constellation Names, Field Names, etc. In addition, all ePSD2 subprojects and contributing ORACC projects have their own glossaries. If, for instance, you wish to restrict your search to Ur III administrative vocabulary, go to the Ur III glossary [/epsd2/admin/ur3/sux]. Alternatively, you may choose to go to the ePSD2 Admin Umbrella Glossary [/epsd2/admin/sux] to consult Sumerian administrative vocabulary of all periods. For an overview of the available corpora and their glossaries, see the ePSD2 Corpora [/epsd2/about/corpora/] page.

Corpus Search

For corpus search go to the URL http://oracc.org/epsd2/corpus [/epsd2/corpus], and use the search box to find attestations of words, sign sequences, metadata or combinations of those. This search gives you access to all the corpora (including literary, lexical, administrative, royal, liturgical, etc.) that are included in the ePSD2 dataset. Alternatively, you may focus on one of the subcorpora (for an overview of the available corpora, see the ePSD2 Corpora [/epsd2/about/corpora/] page). The primary search unit is the line; for metadata the primary search unit is a text.

Examples of valid searches include:

Corpus Transliteration Search

Enter any sequence of transliterated signs. The search will return a list of lines where that sequence is found. As in the Glossary Search, transliteration search uses aliasing [/epsd2/searching/aliases], so that sugal7 will find sugal₇, as well as sukkal, and du11 will also find dug₄.

Sign Sequence Search

With the prefix s: the search will not only match the literal transliteration, but also any value of that same sign. The search s:inim-ga will find du₁₁-ga, dug₄-ga, and ka-ga. A very powerful strategy is to combine the s: prefix with various separators, discussed above. The search s:inim~ga will find du₁₁-ga, dug₄-ga, and ka-ga, but also inim ga-(ab-dug₄), etc.
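The effect of the s: prefix can be sketched as expanding each search term to every reading of the same sign. In the Python example below, SIGN_VALUES is a hypothetical excerpt that lists only the KA readings named on this page; the function names are illustrative, and the matching covers only the hyphen separator within a single word.

```python
# Hypothetical excerpt of a sign list: sign name -> readings named on this page.
SIGN_VALUES = {"KA": ["ka", "inim", "du₁₁", "dug₄", "zu₂", "kir₄"]}
VALUE_TO_SIGN = {
    value: sign for sign, values in SIGN_VALUES.items() for value in values
}


def expand(value):
    """Return all readings that share a sign with value (or just value)."""
    sign = VALUE_TO_SIGN.get(value)
    return set(SIGN_VALUES[sign]) if sign else {value}


def sign_search(query, word):
    """True if the hyphenated query matches a run of signs in word by sign."""
    wanted = [expand(value) for value in query.split("-")]
    signs = word.split("-")
    return any(
        all(signs[i + k] in wanted[k] for k in range(len(wanted)))
        for i in range(len(signs) - len(wanted) + 1)
    )


for word in ["du₁₁-ga", "dug₄-ga", "ka-ga", "mu-ga"]:
    print(word, sign_search("inim-ga", word))
```

Readings that are missing from the excerpt (such as ga) fall back to a literal comparison in this sketch.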

Metadata

Metadata (provenance, museum number, publication number, etc.) may be searched for directly in the search box. Try, for instance, to search for MVN 1 68 (no need for commas or leading zeros). This currently results in four hits, including MVN 01, 068 [/epsd2/P113101]. The search engine will search for MVN and 1 and 68 anywhere in the catalogue. It will, therefore, consider MVN 01, 098 [/epsd2/P113131] a match, because that text has the Museum number MW 068. To prevent this behaviour, use underscores instead of spaces (MVN_1_68). In some cases metadata and transliteration data may use the same terms (Nisaba may be a book series or a goddess) and it may be useful to restrict your search to catalog data only. To do so type !cat followed by a space and your search term(s). More detailed information about catalog searches may be found in the page Searching the Oracc corpora [/doc/search/searchingcorpora/index.html].
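The difference between spaces and underscores in a metadata search can be modelled on a toy catalogue. The Python sketch below is an illustration under stated assumptions: the field names designation and museum_no are hypothetical, the two records are reduced to the values mentioned above, and the tokenization (ignoring punctuation, case, and leading zeros) is a guess at the behaviour described, not the actual ORACC code.

```python
import re

# Toy catalogue with hypothetical field names; values taken from the text above.
CATALOGUE = {
    "P113101": {"designation": "MVN 01, 068"},
    "P113131": {"designation": "MVN 01, 098", "museum_no": "MW 068"},
}


def tokens(text):
    """Lower-case, drop punctuation, and strip leading zeros from each token."""
    return [t.lstrip("0") or "0" for t in re.findall(r"[A-Za-z0-9]+", text.lower())]


def search(query):
    """Return the IDs of catalogue records matching the query."""
    hits = []
    for text_id, record in CATALOGUE.items():
        fields = [tokens(value) for value in record.values()]
        if "_" in query:
            # Underscores: the terms must be adjacent, in order, in one field.
            wanted = tokens(query.replace("_", " "))
            n = len(wanted)
            found = any(
                field[i:i + n] == wanted
                for field in fields
                for i in range(len(field) - n + 1)
            )
        else:
            # Spaces: every term must occur somewhere in the record.
            everything = {t for field in fields for t in field}
            found = all(t in everything for t in tokens(query))
        if found:
            hits.append(text_id)
    return hits


print(search("MVN 1 68"))
print(search("MVN_1_68"))
```

With spaces, the second record matches because 68 is found in its museum number; with underscores, only the record whose designation contains the full sequence remains.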

Lemma Search

One may use the corpus search box to search for lemmas (Sumerian words) irrespective of their spelling, by entering the Citation Form, Guide Word, or Sense. It is recommended, however, to use the Glossaries for such searches. In order to search for lemmas only, use the !lem selector, followed by a space and your search terms.

Combinations

One may combine multiple searches with the operators "and" and "or". The "and" operator is present by default and may be omitted. The searches s:izi~gu7 s:ka and s:izi~gu7 and s:ka are equivalent and will have the same results. This search pattern is a powerful tool for finding collocations; try, for instance, bizaza ugu de (or bizaza and ugu and de) to find the lost frog. Note that this is a Citation Form search; in transliteration search one gets to the same frog with bi₂-za-za u₂-gu de₂, or one may enter a mix of those as in bi₂-za-za ugu de, combining transliteration and Citation Form search.

In combining multiple searches it may be useful to indicate the search domain:

Code | Domain
!lem | Lemmatized transliteration (default search domain for epsd2/corpus)
!cat | Metadata (catalogue data)

One may thus construct a search like !lem dubszen !cat Umma, to find references where the word dubšen [/epsd2//o0026190] appears in a text that has Umma in the catalog data. Note that this may be done through the glossary as well. The glossary page for dubšen [/epsd2//o0026190] has a link (n instances) that will open a new page listing all the instances of the lemma. The instances are ordered by period, place, and genre. Also note that the search term !cat Umma will match the word 'Umma' (or 'umma'; not case sensitive) anywhere in the catalog, whether in the listing for provenance, titles of books, comments, etc.
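A combined query of this kind can be read as a sequence of terms, each assigned to the most recent domain selector. The Python sketch below is one possible reading, not the ORACC parser: the function name split_domains is hypothetical, it assumes that a selector stays in force until the next selector, and it handles only the implicit or explicit "and" (an "or" would be treated as an ordinary term here).

```python
def split_domains(query: str, default: str = "!lem"):
    """Group the terms of a corpus query by search domain.

    Terms before any selector go to the default domain (!lem for epsd2/corpus).
    """
    domains = {}
    current = default
    for token in query.split():
        if token in ("!lem", "!cat"):
            current = token          # switch domain for the following terms
        elif token.lower() != "and":
            domains.setdefault(current, []).append(token)
    return domains


print(split_domains("!lem dubszen !cat Umma"))
print(split_domains("s:izi~gu7 and s:ka") == split_domains("s:izi~gu7 s:ka"))
```

The second call shows why an explicit "and" may be omitted: both spellings reduce to the same list of terms.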

Other examples of complex searches are:

The user is encouraged to try various combinations and see what works.

Sign List Search

The ePSD2 Sign List [/epsd2/signlist] contains all of the signs and readings that occur in the glossaries, and provides links to every article in which the signs and values are used. Simply enter a known sign value in the search box. The return is organized by sign value and by the position of the sign in the word (independent, initial, medial, or final). This can be a very efficient search if you know what you are looking for.

A Note About the Corpora

ePSD2 is built on top of multiple corpora [/epsd2/about/corpora/] assembled by the ETCSL [http://etcsl.orinst.ox.ac.uk/] team, the CDLI [http://cdli.ucla.edu] team and numerous ORACC projects within ePSD2 or outside of ePSD2. As a result, many people in various different teams have contributed to the data set, using various sets of transliteration conventions and offering different solutions in interpreting expressions or assigning word boundaries (ninda i₃-de₂-a or ninda-i₃-de₂-a or nig₂-i₃-de₂-a). It is inevitable, therefore, that there are many types of inconsistencies within the ePSD2 data set and its lemmatization - by design or by accident. Rather than imposing strict measures of consistency on the data, the search engine is designed to enable users to find data, independent of such conventions.

 
 
CC BY-SA The Pennsylvania Sumerian Dictionary Project, 2017-
http://oracc.org/searching/