Sense coarsening
WordNet splits many words into very fine senses that are hard to tell apart — even for expert lexicographers. Coarsening groups those narrow senses into broader, more application-oriented concepts, and SenseBench can score every run at the fine level or under one of two coarse inventories. This page explains what coarsening is, how it changes WSD results, and how each inventory — Glite and CSI — works.
What is sense coarsening?
A coarsening is a map from WordNet 3.0 sense keys to broader concepts. Several WordNet senses of a word can map to one concept, so a prediction is coarse-correct when its WordNet sense lands in the same concept as a gold sense — even if it is not the exact fine sense. Coarsening matters for two reasons:
- Many WordNet distinctions are redundant for applications. Telling a dictionary-construction or language-learning system that volume means "the amount of 3-D space" rather than "a relative amount" rarely matters; telling those apart from "the loudness of a sound" does.
- Fine senses are often not reliably separable. On the hard, human-reviewed lexEN items, expert annotators frequently disagree on the exact fine sense yet agree on the coarse concept — so fine-grained scoring penalises models for distinctions the experts themselves do not make consistently.
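The coarse-correct rule defined above can be sketched in a few lines. This is a minimal illustration, not SenseBench's actual code: the sense ids (`volume#3d_space` and so on) and concept ids (`c:amount`, `c:loudness`) are made-up placeholders rather than real WordNet sense keys or Glite concept ids, and the function names are hypothetical.

```python
# Toy concept map: placeholder sense ids -> placeholder concept ids (not real data).
CONCEPT_MAP = {
    "volume#3d_space": "c:amount",
    "volume#relative_amount": "c:amount",
    "volume#loudness": "c:loudness",
}


def is_fine_correct(predicted: str, gold: set[str]) -> bool:
    """Fine-grained hit: the predicted sense is exactly one of the gold senses."""
    return predicted in gold


def is_coarse_correct(predicted: str, gold: set[str], concept_map: dict[str, str]) -> bool:
    """Coarse hit: the predicted sense maps to the same concept as some gold sense."""
    gold_concepts = {concept_map[g] for g in gold}
    return concept_map[predicted] in gold_concepts


gold = {"volume#3d_space"}
print(is_fine_correct("volume#relative_amount", gold))                 # False
print(is_coarse_correct("volume#relative_amount", gold, CONCEPT_MAP))  # True
print(is_coarse_correct("volume#loudness", gold, CONCEPT_MAP))         # False
```

Confusing the two "amount" senses is forgiven at the coarse level, while confusing either of them with "loudness" is still an error.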
How coarsening affects WSD results
- Coarse accuracy is always at least the fine accuracy. Every exact (fine) hit is also a coarse hit, and coarsening can only turn some fine misses into hits, never the reverse. So switching the granularity selector from fine to a coarse inventory can raise a run's score or leave it unchanged, but never lower it.
- The lift is substantial. On the full lexEN-v1 set, frontier models score around 95% fine-grained and around 98% under Glite coarse. On the hard reviewed subset the gap is much larger — roughly 66% fine versus 78–88% coarse depending on the inventory — which shows that most remaining "errors" are fine sense splits, not genuine disambiguation failures.
- Rankings are largely preserved. Coarsening lifts strong and weak systems together, so the order of models on the leaderboard changes little when you switch granularity.
- It is computed live, from raw predictions. Coarse scores are not baked into a run; SenseBench re-maps each model's predicted sense key through the chosen concept map at build time. Use the granularity selector on the leaderboard to switch between fine, Glite, and CSI; each run page reports all schemes together. See label schemes for how granularity combines with the gold-label selector.
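To make the "computed from raw predictions" point concrete, the sketch below scores one set of stored predictions twice: once as-is and once after re-mapping through a concept map. It is an illustration under assumptions, not the SenseBench implementation: the `accuracy` function, the placeholder sense ids, and the toy map are all hypothetical, and the map is assumed to cover every key that appears.

```python
from typing import Optional


def accuracy(predictions: list[str], golds: list[set[str]],
             concept_map: Optional[dict[str, str]] = None) -> float:
    """Score predictions at the fine level (no map) or under a coarse concept map."""
    def project(key: str) -> str:
        # With no map, a sense is its own class, which gives fine-grained scoring.
        return key if concept_map is None else concept_map[key]

    hits = sum(
        project(pred) in {project(g) for g in gold}
        for pred, gold in zip(predictions, golds)
    )
    return hits / len(predictions)


# Placeholder ids for the "party" example; not real WordNet sense keys.
toy_map = {
    "party#political": "c:political",
    "party#social_group": "c:social",
    "party#social_occasion": "c:social",
    "party#working": "c:working",
    "party#legal": "c:legal",
}
predictions = ["party#social_group", "party#legal", "party#political"]
golds = [{"party#social_occasion"}, {"party#legal"}, {"party#working"}]

fine = accuracy(predictions, golds)
coarse = accuracy(predictions, golds, toy_map)
print(f"{fine:.2f} {coarse:.2f}")  # 0.33 0.67
```

Because the stored predictions never change, swapping `toy_map` for a different inventory re-scores the same run without re-running the model. The first item is a fine miss that becomes a coarse hit, while the third crosses a group boundary and stays wrong at both levels.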
Examples
The four words below each have five or six WordNet senses. In every case Glite groups them exactly the way the consensus of five professional dictionaries (Merriam-Webster, Collins, Cambridge, Oxford Learner's, Longman) does. Within a Glite group, a fine-grained miss still counts as a coarse hit; crossing a group boundary is wrong at both levels. The note below each table shows whether CSI, a domain-based inventory, groups the senses differently.
party (noun) — 5 senses
| WordNet sense | Glite group = dictionary consensus |
|---|---|
| an organization to gain political power | political party |
| a group of people gathered together for pleasure | a party (social) |
| an occasion for social interaction and entertainment | a party (social) |
| a band of people associated temporarily in an activity | search / working party |
| a person involved in legal proceedings | legal party |
Glite and the dictionaries merge the social group and the social occasion into one everyday "party" and keep political, search-party, and legal apart. CSI groups them the same way here.
volume (noun) — 6 senses
| WordNet sense | Glite group = dictionary consensus |
|---|---|
| the amount of 3-D space occupied by an object | quantity / amount |
| a relative amount | quantity / amount |
| a physical book (pages bound together) | a book |
| a publication that is one of a set | a book (in a set) |
| the property of being great in magnitude (bulk) | bulk |
| the magnitude of sound | loudness |
Glite and the dictionaries merge the two amount senses and the two book senses. CSI instead pairs "3-D space" with "bulk" (both size) and splits the books apart.
care (verb) — 5 senses
| WordNet sense | Glite group = dictionary consensus |
|---|---|
| feel concern or interest | care about |
| be concerned with | care about |
| provide care for | look after / be in charge of |
| be in charge of, act on, or dispose of | look after / be in charge of |
| prefer or wish to do something | care to (wish) |
Glite and the dictionaries pair "feel concern" with "be concerned with", and "provide care for" with "be in charge of". CSI instead lumps the "wish" sense in with the concern senses.
replacement (noun) — 6 senses
| WordNet sense | Glite group = dictionary consensus |
|---|---|
| the act of furnishing an equivalent | the act of replacing |
| an event in which one thing is substituted for another | the act of replacing |
| filling again by supplying what has been used up | the act of replacing |
| someone who takes the place of another (surrogate) | the replacing person / thing |
| a person or thing that can take the place of another | the replacing person / thing |
| a person who follows next in order (successor) | the replacing person / thing |
Glite and the dictionaries make one clean cut — the act of replacing vs. the person or thing that does the replacing. CSI scatters the six senses across five domains.
These are typical: across 100 polysemous words, Glite reproduces the professional-dictionary grouping more closely than any other public inventory. Coarse inventories differ mainly in how aggressively and along which lines they merge — which is why SenseBench offers more than one and reports them side by side.
Glite coarse-grained
Glite is a learner-dictionary sense inventory built by Glite. Its concepts are the entries of the Glite Dictionary — a dictionary designed for English learners, where senses are grouped at the granularity a reader actually needs. Because the groupings come from a real, externally published dictionary rather than an automatic rule, Glite's coarsening is principled: across 100 polysemous words it reproduces the sense grouping of major professional dictionaries more closely than any other public inventory.
- How it works. A vendored map sends each WordNet 3.0 sense key to a Glite concept id (for example `ct:ct7gsDQ2G22wWJNkuNk4fplan`). Senses of a word that the Glite Dictionary treats as one meaning share a concept id; a prediction is coarse-correct when its sense key maps to the same Glite concept as a gold sense.
- Coverage. The public map covers about 95% of the WordNet sense keys lexEN references. A sense key with no Glite concept is scored as its own singleton class (`unmapped:<key>`) and is never silently merged, so incomplete coverage can only lower coarse accuracy, never inflate it.
- Role. Glite coarse is the default coarse scheme and pairs with the lexEN gold, whose review used the same Glite groupings. The map is vendored from the lexEN release.
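The singleton fallback described under Coverage can be sketched as a lookup with a default. This is only an illustration: `to_concept` and `coarse_hit` are hypothetical names, and the sense ids and concept id are placeholders rather than real sense keys or Glite concept ids.

```python
def to_concept(sense_key: str, concept_map: dict[str, str]) -> str:
    """Return the mapped concept, or a per-key singleton class if the key is uncovered."""
    return concept_map.get(sense_key, f"unmapped:{sense_key}")


def coarse_hit(predicted: str, gold: set[str], concept_map: dict[str, str]) -> bool:
    """Coarse match after projecting both sides through the (possibly partial) map."""
    return to_concept(predicted, concept_map) in {to_concept(g, concept_map) for g in gold}


# Partial toy map: only two of the placeholder "care" senses are covered.
partial_map = {
    "care#feel_concern": "c:care_about",
    "care#be_concerned": "c:care_about",
}

print(to_concept("care#wish", partial_map))                   # unmapped:care#wish
print(coarse_hit("care#wish", {"care#provide"}, partial_map))  # False
print(coarse_hit("care#wish", {"care#wish"}, partial_map))     # True
```

Each uncovered key gets its own class, so two uncovered senses can only match when they are the same key. A gap in the map therefore falls back to fine-grained behaviour for that key instead of creating an accidental merge.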
Learn more at the Glite Dictionary and glite.ai.
CSI coarse-grained (Lacerra 2020)
CSI — the Coarse Sense Inventory of Lacerra, Bevilacqua, Pasini & Navigli (AAAI 2020) — is an independent, third-party coarsening that groups every WordNet synset into one of 45 broad semantic domains (for example biology, law & crime, art, architecture & archaeology). It is included so that SenseBench's coarse results can be audited against an inventory that the dataset's authors do not control.
- How it works. The released `wn_synset2csi` mapping assigns each WordNet 3.0 synset to its CSI domain(s). SenseBench vendors a sense-key → concept map derived from it: a synset with several domains is reduced to one composite label (`csi:<sorted domains>`), a conservative single-class partition, so it plugs into the same scoring contract as Glite with the CSI map swapped in.
- Coverage. CSI maps about 79% of the referenced WordNet keys (uncovered keys become `unmapped:` singletons). Because its 45 domains are broader than Glite's dictionary concepts, CSI tends to split a word into more classes overall but groups cross-topic senses differently.
- License. CSI is distributed under CC BY-NC-SA 4.0; the vendored subset keeps the same terms with attribution to the original authors.
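The derivation of a single-label map from multi-domain synsets might look like the sketch below. Several details are assumptions made for illustration: the `+` separator inside the composite label, the function names, the placeholder synset ids and sense ids, and the sense-key to synset lookup, none of which are specified above.

```python
def csi_label(domains, sep: str = "+") -> str:
    """Collapse one or more CSI domains into a single composite class label.

    Sorting makes the label independent of the order the domains were listed in.
    The separator is an assumption; only the 'csi:<sorted domains>' shape is given.
    """
    return "csi:" + sep.join(sorted(domains))


def build_sense_key_map(key_to_synset: dict[str, str],
                        synset_to_domains: dict[str, list[str]]) -> dict[str, str]:
    """Derive a sense-key -> concept map; keys whose synset has no domain are left out."""
    concept_map = {}
    for sense_key, synset in key_to_synset.items():
        domains = synset_to_domains.get(synset)
        if domains:
            concept_map[sense_key] = csi_label(domains)
    return concept_map


# Placeholder ids; not real WordNet synsets or sense keys.
key_to_synset = {"word#a": "syn:001", "word#b": "syn:002", "word#c": "syn:003"}
synset_to_domains = {
    "syn:001": ["law & crime", "biology"],
    "syn:002": ["biology"],
}

csi_map = build_sense_key_map(key_to_synset, synset_to_domains)
print(csi_map["word#a"])    # csi:biology+law & crime
print("word#c" in csi_map)  # False
```

Treating each distinct domain combination as its own class is the conservative choice: a two-domain synset is not merged with synsets that carry only one of those domains. Keys left out of the map would then fall through to the `unmapped:` singleton handling at scoring time.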