
General Trends Across the Corpus

Neologisms By Time Period

The chart below compares the number of neologisms per year, arranged chronologically. The year assigned to each play is an estimate. The chronology of Shakespeare's plays is taken from E. K. Chambers, who spent his life on an extensive examination of Shakespeare's works.

Neologisms by Time Period (frequency per year)

Year  Frequency
1591  34
1592  6
1593  20
1594  58
1595  20
1596  19
1597  35
1598  33
1599  33
1600  86
1601  38
1604  27
1605  44
1606  18
1607  24
1608  6
1609  29
1610  19
1611  12
1612  29
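As a rough aid to reading the yearly figures above, the following Python sketch tallies them and picks out the busiest year. The counts are transcribed from the chart; the variable names (`counts_by_year`, `peak_year`) are illustrative and not part of the original project.

```python
# Yearly neologism counts, transcribed from the chart above.
# Years follow the estimated chronology; 1602 and 1603 have no entry.
counts_by_year = {
    1591: 34, 1592: 6, 1593: 20, 1594: 58, 1595: 20,
    1596: 19, 1597: 35, 1598: 33, 1599: 33, 1600: 86,
    1601: 38, 1604: 27, 1605: 44, 1606: 18, 1607: 24,
    1608: 6, 1609: 29, 1610: 19, 1611: 12, 1612: 29,
}

# Sum of all yearly counts in the chart.
total = sum(counts_by_year.values())

# Year with the highest count.
peak_year = max(counts_by_year, key=counts_by_year.get)

print(f"Total across all years: {total}")
print(f"Peak year: {peak_year} ({counts_by_year[peak_year]} neologisms)")
```

Because the years are estimates, the peak is better read as a rough period than as a precise date.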

Neologisms By Part of Speech

The chart below shows the distribution of parts of speech across all of Shakespeare's neologisms. The part-of-speech data was obtained from the Oxford English Dictionary. High-frequency words often appear across multiple plays, which suggests they entered Shakespeare’s regular personal vocabulary and were not just one-off innovations.

Frequency Is Not Distributed Evenly

Across the entire corpus, Shakespeare’s neologisms fall into two sharply different patterns:
• a small handful of sound-based words repeat dozens of times,
• while hundreds of semantic neologisms appear only once.

This is the biggest structural imbalance in the dataset.

Neologisms by Part of Speech (frequency)

Part of speech        Frequency
Adjective             177
Noun                  105
Verb                  62
Adverb                29
Interjection          11
Adjective and noun    1
Adjective and adverb  1
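To make the part-of-speech distribution above easier to compare, this short Python sketch converts the counts into percentage shares. The counts are transcribed from the chart; the names `pos_counts` and `shares` are illustrative only.

```python
# Part-of-speech counts, transcribed from the chart above.
pos_counts = {
    "Adjective": 177,
    "Noun": 105,
    "Verb": 62,
    "Adverb": 29,
    "Interjection": 11,
    "Adjective and noun": 1,
    "Adjective and adverb": 1,
}

# Total number of neologisms covered by the chart.
total = sum(pos_counts.values())

# Percentage share of each part of speech.
shares = {pos: 100 * count / total for pos, count in pos_counts.items()}

for pos, share in shares.items():
    print(f"{pos:<22}{pos_counts[pos]:>4}  {share:5.1f}%")
```

On these figures, adjectives make up a little under half of the total, which is the dominance the discussion refers to.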

The main pattern across all Shakespearean neologisms is a split between “sound-based” stage words (high frequency) and “semantic” invented words (low frequency). The sound-words repeat constantly because they function as musical cues, crowd noises, or comedic fillers. The semantic words rarely appear more than once, but they’re the ones that actually expand English into new conceptual territory. This explains why the most common invented words are nonsense syllables, while the part-of-speech breakdown is dominated by complex adjectives.

[Chart: Frequency Is Not Distributed Evenly. Total tokens by neologism type. Sound: few words, high use. Semantic: many words, low use.]

Shakespeare repeatedly reuses the prefix un- (unbated, unwept, unsex, unearthly) and the same evaluative endings, such as -less, -ful, -ish, and -able. He is not inventing brand-new roots; he is attaching familiar prefixes and suffixes to existing words. That’s why most of the neologisms end up being adjectives: they are modified or intensified versions of existing ideas, not new concepts.
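The affix pattern described above can be illustrated with a minimal Python sketch that splits a word into prefix, stem, and suffix. This is naive string matching under stated assumptions, not the method used to build the dataset: the function `split_affixes` and the minimum stem length are hypothetical, and the affix lists contain only the prefix and endings named in the text.

```python
# Affixes named in the text; the lists are deliberately small.
PREFIXES = ("un",)
SUFFIXES = ("less", "ful", "ish", "able")

# Assumption: require at least this many letters to remain after
# removing an affix, to reduce obviously spurious splits.
MIN_STEM = 3


def split_affixes(word):
    """Return (prefix, stem, suffix); a missing affix is None."""
    word = word.lower()
    prefix = next(
        (p for p in PREFIXES
         if word.startswith(p) and len(word) - len(p) >= MIN_STEM),
        None,
    )
    rest = word[len(prefix):] if prefix else word
    suffix = next(
        (s for s in SUFFIXES
         if rest.endswith(s) and len(rest) - len(s) >= MIN_STEM),
        None,
    )
    stem = rest[:-len(suffix)] if suffix else rest
    return prefix, stem, suffix


# The four un- examples cited in the text.
for w in ("unbated", "unwept", "unsex", "unearthly"):
    print(w, split_affixes(w))
```

Plain prefix matching will mis-split words that merely begin with the letters "un" (for example "under"), so a real analysis would need a dictionary check on the stem.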

[Figure: un- + bated = unbated]