1. The aim of the project is to create the world’s largest collection of 20th-century Polish words and word combinations, each attested by citations. Its key distinguishing principle is that every excerpt will be documented photographically, that is, reproduced in the form in which it appeared in print.
  2. Lexical observation will cover the period 1901–2000, that is, it will include texts published during that time.
  3. Every excerpt will be precisely located: the title of the document from which it is taken will be recorded.
  4. Every excerpt will be precisely dated: the date on which it appeared in print will be recorded.
  5. Compared with earlier studies of 20th-century vocabulary, the project’s new contribution will be the world’s largest collection of previously unknown units, that is, units that have not so far been searched for or recorded.

The result of the work will be a database documenting the usage of lexical units. This database will significantly expand present knowledge of the lexical resources of the Polish language. For the first half of the 20th century, and particularly the two interwar decades, existing records are extremely sparse, being limited to the results of systematic excerption by two researchers: J. Wawrzyńczyk and P. Wierzchoń.

The new database will, first, be extremely large in relative terms (the largest database of this type in the world); second, permit a variety of applications (morphological studies, studies of borrowings, phrasematics, stylometric analysis, etc.); third, allow the original context to be viewed through photographic documentation; and fourth, provide tools for searching according to various criteria.
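
To make the intended record structure more concrete, the following is a minimal sketch of how a single attestation and a search over several criteria might be modelled. It is an illustration only, not the project's actual design: the class name `Attestation`, its field names, the `search` function, and the sample values are all hypothetical, and the sample records are invented placeholders rather than real excerpts.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Attestation:
    """One documented excerpt (hypothetical schema, for illustration)."""
    unit: str          # the word or word combination
    source_title: str  # title of the document the excerpt is taken from
    published: date    # date on which the excerpt appeared in print
    scan_path: str     # location of the photographic excerpt (assumed field)

    def __post_init__(self) -> None:
        # The project covers texts published in 1901-2000 only.
        if not 1901 <= self.published.year <= 2000:
            raise ValueError(f"date outside 1901-2000: {self.published}")


def search(
    records: Iterable[Attestation],
    *,
    unit_contains: Optional[str] = None,
    source_title: Optional[str] = None,
    year_from: Optional[int] = None,
    year_to: Optional[int] = None,
) -> List[Attestation]:
    """Return the records that satisfy every criterion that was supplied."""
    hits = []
    for rec in records:
        if unit_contains is not None and unit_contains.lower() not in rec.unit.lower():
            continue
        if source_title is not None and rec.source_title != source_title:
            continue
        if year_from is not None and rec.published.year < year_from:
            continue
        if year_to is not None and rec.published.year > year_to:
            continue
        hits.append(rec)
    return hits


# Invented placeholder records; they are not real attestations.
SAMPLE = [
    Attestation("placeholder unit A", "Hypothetical Gazette", date(1925, 3, 1), "scans/0001.png"),
    Attestation("placeholder unit B", "Hypothetical Gazette", date(1967, 9, 15), "scans/0002.png"),
    Attestation("another unit", "Imaginary Weekly", date(1931, 6, 20), "scans/0003.png"),
]

# Example query: units attested in the two interwar decades.
interwar = search(SAMPLE, year_from=1918, year_to=1939)
```

Keeping the source title, the date, and a pointer to the photograph on every record mirrors points 1, 3, and 4 above; combining optional filters is one simple way to support searching according to various criteria.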