Shortcut: WD:PP/L

Wikidata:Property proposal/Lexemes




This page is for the proposal of new properties.

Before proposing a property

  1. Check whether the property already exists.
  2. Check whether the property has already been proposed.
  3. Check whether you can give it a label and definition similar to an existing Wikipedia infobox parameter, or whether it can be matched to an infobox to or from which data can be transferred automatically.
  4. Select the right datatype for the property.
  5. Read Wikidata:Creating a property proposal for guidelines you should follow when proposing a new property.
  6. Start writing the documentation based on the preload form below by editing the two templates at the top of the page to add proposal details.

Creating the property

  1. Once consensus is reached, set status=ready on the template to attract the attention of a property creator.
  2. A property creator or an administrator can create the property no sooner than 1 week after the proposal was created.
  3. See property creation policy.

On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2024/06.

Wikibase lexeme


Duden node ID

Status: Under discussion
Description: numeric identifier of an entry in the Duden
Represents: lexeme (Q111352)
Data type: External identifier
Domain: lexeme
Allowed values: [1-9][0-9]{2,}
Example 1: zwischen (L302428) → 213628
Example 2: Wikipedia (L221977) → 205799
Example 3: Luftballon (L99) → 91371
Source: https://www.duden.de/
Planned use: add to existing German lexemes (and perhaps replace P8376 on those lexemes?)
Number of IDs in source: current very loose upper-bound estimate: 399,000 (on seeing node '399770' for the word '2G-Regel' and on finding '569' to be the lowest node for '24-Sekunden-Regel')
Expected completeness: always incomplete (Q21873886)
Formatter URL: https://www.duden.de/node/$1
See also: Duden lexeme ID (P8376), Duden sense ID (P12641)
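
As a rough illustration of how the allowed-values pattern and the formatter URL in this proposal fit together, here is a minimal Python sketch. The function names `is_valid_node_id` and `format_node_url` are hypothetical and not part of any Wikidata tooling; only the pattern and the URL template are taken from the proposal.

```python
import re

# Pattern and URL template copied from the proposal; everything else is illustrative.
ALLOWED_VALUES = re.compile(r"[1-9][0-9]{2,}")
FORMATTER_URL = "https://www.duden.de/node/$1"


def is_valid_node_id(value: str) -> bool:
    """Return True if the whole string matches the proposed allowed-values pattern."""
    return ALLOWED_VALUES.fullmatch(value) is not None


def format_node_url(value: str) -> str:
    """Substitute the ID for the $1 placeholder, rejecting malformed values."""
    if not is_valid_node_id(value):
        raise ValueError(f"not a valid Duden node ID: {value!r}")
    return FORMATTER_URL.replace("$1", value)


print(format_node_url("213628"))
```

Note that the pattern requires at least three digits and no leading zero, so a two-digit value would be rejected.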

Motivation


This property is intended as a substitute for (not as something to coexist with) P8376, mainly to avoid having to deal with multiple values being present on a lexeme due to URL changes. It would provide a unique and hopefully a little more stable identifier for entries in the most trusted German dictionary.

The value for this property can be found in two places. The first is below the heading: click "Als Quelle verwenden" (the quotation mark) and note the value between 'node' and 'revision' within the URL. The second is in multiple parts of the <head> element, as part of an image URL (presumably used in link previews): https://www.duden.de/og-image/569.png is the image for '24-Sekunden-Regel' and https://cdn.duden.de/lexem_images/399770.png that for '2G-Regel'.
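
The two lookup routes described above could be sketched in Python roughly as follows. This is only an illustration: the helper name `extract_node_id` is hypothetical, and the URL shapes are assumptions inferred from the examples quoted on this page, not a documented Duden interface.

```python
import re
from typing import Optional

# Assumed URL shapes, based only on the examples quoted on this page.
_PATTERNS = [
    re.compile(r"/node/([1-9][0-9]*)/revision/"),      # citation link
    re.compile(r"/og-image/([1-9][0-9]*)\.png$"),      # preview image
    re.compile(r"/lexem_images/([1-9][0-9]*)\.png$"),  # CDN image
]


def extract_node_id(url: str) -> Optional[str]:
    """Return the node ID embedded in a URL, or None if no known shape matches."""
    for pattern in _PATTERNS:
        match = pattern.search(url)
        if match:
            return match.group(1)
    return None


print(extract_node_id("https://www.duden.de/og-image/569.png"))
print(extract_node_id("https://cdn.duden.de/lexem_images/399770.png"))
```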

(Regarding value stability: in the earliest captures of URLs beginning with the formatter URL's prefix, the IDs captured before April 2019 were all greater than 1000000 (except for '10' and '1220') and all pointed to non-entries. Those captured after April 2019, which appear uniformly to fall within the range of IDs given above, still resolve to the same entries five years later: Panamaer (L931866) from 1 May 2019, Remix (L761972) from 30 April 2019, and Neger (L804760) from 26 April 2019.)

(I have begun an index of these values and will slowly add them to Mishramilan (মিশ্রমিলন) once this property is created. I would be open to performing the necessary substitutions of P8376 myself as well.) Mahir256 (talk) 13:53, 22 April 2024 (UTC)

Discussion

  •  Support Hmm, do we want to plan to replace the other ID with this one? ArthurPSmith (talk) 19:19, 22 April 2024 (UTC)
  • If you want to add this in addition to the existing property, fine, but I would be opposed to replacing the existing property with this one:
      • It's not what people actually use for linking: a global search for "duden.de/node/" finds just 122 results, versus more than the limit of 10,000 for "duden.de/rechtschreibung/". Only having the node ID would make our data difficult to compare with what other places have.
      • The node IDs are not very easy to find or use. I had trouble finding one even when I knew it should be on the page somewhere. It can only be reached by clicking certain things, and following the link redirects to a page without the node ID in the URL, which makes it hard to tell whether a particular link is for a particular entry or not. Both of those points make it more likely that people will add the wrong values to the property.
      • A lexeme doesn't have a single current node ID. An entry typically has multiple pages (definition page, synonyms page, inflections page), and each page has its own node ID with no obvious connection to the other pages for that entry, e.g. https://www.duden.de/rechtschreibung/Hund is 68862, https://www.duden.de/deklination/substantive/Hund is 385837 and https://www.duden.de/synonyme/Hund is 251193.
      • The node IDs appear to be less stable than the normal URLs. In my experience, the normal URLs are pretty stable. The changes I've seen all seem to come from them recently simplifying the ones with multiple disambiguation parts (now they only include one), but the previous values redirect to the new values (much like how a merge on Wikidata results in the previous ID redirecting to the new ID). I looked at one page where the URL changed and immediately found that https://web.archive.org/web/20110627114010/http://www.duden.de:80/rechtschreibung/Band_Gewebestreifen_Fessel has "http://www.duden.de/zitieren/10020253/1.5", https://web.archive.org/web/20151013073540/http://www.duden.de/rechtschreibung/Band_Gewebestreifen_Fessel has "http://www.duden.de/node/651609/permalink?destination=node/651609" and https://web.archive.org/web/20220517054827/https://www.duden.de/rechtschreibung/Band_Gewebestreifen_Fessel has "https://www.duden.de/node/18218/revision/496432". Both the "zitieren" link and the previous "node" link are 404s, while the original URL from 2011 still works now.
    If you want a single value for the current property, you should be using best-ranked statements. If a lexeme has multiple best-ranked statements, then either one now redirects and the other hasn't been set to preferred rank, or someone has conflated two entries.
    - Nikki (talk) 02:46, 24 April 2024 (UTC)
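
To illustrate the best-rank rule referred to in the comment above, here is a minimal Python sketch. It assumes a simplified representation in which each statement is a dict with `value` and `rank` keys; this is not the actual Wikibase JSON format, and the function name `best_ranked_values` and the sample values are hypothetical.

```python
from typing import Dict, List


def best_ranked_values(statements: List[Dict[str, str]]) -> List[str]:
    """Return preferred-rank values if any exist, otherwise normal-rank values.

    Deprecated statements are never returned.
    """
    preferred = [s["value"] for s in statements if s["rank"] == "preferred"]
    if preferred:
        return preferred
    return [s["value"] for s in statements if s["rank"] == "normal"]


# Hypothetical statements: an old slug kept at normal rank, the current one preferred.
statements = [
    {"value": "old-slug", "rank": "normal"},
    {"value": "new-slug", "rank": "preferred"},
]
print(best_ranked_values(statements))
```

If this returns more than one value for a lexeme, that corresponds to the situation described above, where either a rank has not been updated or two entries have been conflated.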
@Mahir256, could you please clarify the comments above by @Nikki? Regards, ZI Jony (Talk) 18:30, 28 April 2024 (UTC)
 Oppose Maybe we should instead have a separate Drupal node ID property (analogous to MediaWiki page ID (P9675)). Duden has little incentive to keep node IDs stable; on the other hand, it has a strong incentive to keep the slug stable for SEO reasons. If it comes to a redesign, nobody is going to care about node IDs. This has happened before: "Archiv, das" at some point had the node ID 704401, which is dead today. The current node ID for "Archiv, das" is 8452. Shisma (talk) 09:38, 21 May 2024 (UTC)
Also note that all pages in this Drupal site have a node ID, not just lexemes. For instance:
  • 238998 → Terms and Conditions
  • 213867 → Orthographical rules concerning the apostrophe
That's not an argument against having this property, but it might throw off your estimate of the number of IDs in the source. Shisma (talk) 11:04, 21 May 2024 (UTC)

Te Aka Māori Dictionary ID

Status: Ready
Description: entry for a word in the online version of John C. Moorfield’s Te Aka Māori Dictionary
Represents: Te Aka Māori Dictionary (Q116838232)
Data type: External identifier
Domain: lexeme
Example 1: wai (L8347) → 9019
Example 2: kai (L1327127) → 1894
Example 3: tūī (L1024707) → 8660
Formatter URL: https://maoridictionary.co.nz/word/$1
Applicable "stated in"-value: Te Aka Māori Dictionary (Q116838232)

Motivation


Te Aka Māori Dictionary (Q116838232) is a comprehensive and well-referenced dictionary which can be linked to Māori lexemes. -عُثمان (talk) 18:54, 31 May 2024 (UTC)

Discussion


He Pātaka Kupu ID

Status: Ready
Description: entry for a word in the online version of the Māori dictionary He Pātaka Kupu
Represents: He Pātaka Kupu (Q126210616)
Data type: External identifier
Example 1: kuau (L305637) → 3038
Example 2: wai (L8347) → 11078
Example 3: kai (L1327127) → 1781
Formatter URL: https://hepatakakupu.nz/word/$1
Applicable "stated in"-value: He Pātaka Kupu (Q126210616)

Motivation


He Pātaka Kupu (Q126210616) is a high-quality monolingual dictionary which can be linked to Māori lexemes. -عُثمان (talk) 19:00, 31 May 2024 (UTC)

Discussion


Oqaasersiorfik ID

Description: entry for a Greenlandic lexeme in the online version of Oqaasersiorfik (2008)
Represents: Oqaasersiorfik (Q126282047)
Data type: External identifier
Example 1: oqaluuseraa (L30582) → 8889
Example 2: bussi (L1327408) → 2512
Example 3: musliimi (L1320754) → 7000
Number of IDs in source: 17327
Formatter URL: https://ordbog.gl/2008-kal/#e$1
Applicable "stated in"-value: Oqaasersiorfik (Q126282047)

Motivation


A property for this monolingual dictionary would make a useful addition to Greenlandic lexemes alongside Oqaasileriffik online dictionary ID (P5912). -عُثمان (talk) 19:15, 31 May 2024 (UTC)

Discussion


 Comment Needs a subject item and an applicable "stated in" value. AdamSeattle (talk) 07:48, 1 June 2024 (UTC)

Added. عُثمان (talk) 00:10, 2 June 2024 (UTC)

Greenlandic-English Dictionary ID

Status: Ready
Description: entry for a Greenlandic lexeme in the online version of the University of Chicago & Oqaasileriffik Greenlandic-English Dictionary (2018)
Represents: Greenlandic-English Dictionary (Q126282066)
Data type: External identifier
Example 1: oqaluuseraa (L30582) → 9445
Example 2: bussi (L1327408) → 2653
Example 3: musliimi (L1320754) → 7547
Number of IDs in source: 17702
Formatter URL: https://ordbog.gl/2018-kal-eng/#e$1
Applicable "stated in"-value: Greenlandic-English Dictionary (Q126282066)

Motivation


A property for this bilingual dictionary would make a useful addition to Greenlandic lexemes alongside Oqaasileriffik online dictionary ID (P5912). -عُثمان (talk) 19:20, 31 May 2024 (UTC)

Discussion


MoFo ID

Status: Under discussion
Description: entry for a Greenlandic morpheme in Stian Lybech’s online database of morphemic forms
Represents: MoFo - A dictionary of Morphemic Forms (Q126364348)
Data type: External identifier
Example 1: lu (L1328047) → 10
Example 2: aaveq (L1328076) → 2
Example 3: angivoq (L1327975) → 119
Formatter URL: https://mofo.oqa.dk/Morphemes/Details/$1
Applicable "stated in"-value: MoFo - A dictionary of Morphemic Forms (Q126364348)

Motivation


MoFo - A dictionary of Morphemic Forms (Q126364348) is a small database of 358 morphemes in Greenlandic. These grammatical/functional elements mostly do not have their own entries in Greenlandic dictionaries, making this a useful resource for them. It also includes entries for a handful of items in the general lexicon which have defective or irregular inflectional paradigms. -عُثمان (talk) 11:23, 5 June 2024 (UTC)

Discussion


DAKA Greenlandic-Danish Dictionary ID

Status: Under discussion
Description: entry for a Greenlandic lexeme in the online DAKA dictionary
Represents: DAKA Greenlandic-Danish & Danish-Greenlandic Dictionary (Q126367971)
Data type: External identifier
Example 1: lu (L1328047) → 7044
Example 2: aaveq (L1328076) → 369
Example 3: angivoq (L1327975) → 1387
Formatter URL: https://iserasuaat.gl/daka/daka?f=lo&l=0&p=$1
Applicable "stated in"-value: DAKA Greenlandic-Danish & Danish-Greenlandic Dictionary (Q126367971)

Motivation


DAKA Greenlandic-Danish & Danish-Greenlandic Dictionary (Q126367971) is one of the more popular general-purpose Greenlandic dictionaries, and includes entries for some lexical items not covered by the other proposed properties. -عُثمان (talk) 11:52, 5 June 2024 (UTC)

Discussion


DAKA Danish-Greenlandic Dictionary ID

Status: Under discussion
Description: entry for a Danish lexeme in the online DAKA dictionary
Represents: DAKA Greenlandic-Danish & Danish-Greenlandic Dictionary (Q126367971)
Data type: External identifier
Example 1: ampere (L739651) → 825
Example 2: vores (L35289) → 35047
Example 3: underbukser (L235280) → 33530
Formatter URL: https://iserasuaat.gl/daka/daka?f=lo&l=1&p=$1
Applicable "stated in"-value: DAKA Greenlandic-Danish & Danish-Greenlandic Dictionary (Q126367971)

Motivation


DAKA Greenlandic-Danish & Danish-Greenlandic Dictionary (Q126367971) is one of the more popular general-purpose Greenlandic dictionaries. This is the reverse dictionary corresponding to the proposal above. -عُثمان (talk) 12:05, 5 June 2024 (UTC)

Discussion


Farhang-i forsī ba rusī ID

Status: Under discussion
Description: entry for a Persian lexeme in the farhang.ru online reproduction of Yuri Rubinchik’s Persian–Russian dictionary
Represents: no label (Q126368177)
Data type: External identifier
Example 1: نارنجستان/норинҷистон (L1186957) → 6ade71dac83141ea9f264a989c62b0d6
Example 2: جنبش‌شناسی/ҷунбишшиносӣ (L1003235) → 09a68ee65e0f4aa5a8f7c15b5684ff3a
Example 3: کاتابولیسم/котобулисм (L1254011) → 5e0461b9ea134ebe82c5cf436dcaa0f6
Formatter URL: https://farhang.ru/lexeme/$1.html
Applicable "stated in"-value: no label (Q126368177)

Motivation


Q126368177 is a digital reproduction of Yuri Rubinchik’s Persian–Russian dictionary. It contains entries for some items which do not have entries in the other three Persian dictionary properties we currently have (such as the first example here). -عُثمان (talk) 12:50, 5 June 2024 (UTC)

Discussion


Tommaseo-Bellini Online ID

Status: Under discussion
Description: identifier for an entry in the reproduction of Tommaseo and Bellini's Italian dictionary hosted at tommaseobellini.it
Data type: External identifier
Domain: Italian lexemes
Allowed values: [1-9][0-9]+
Example 1: abbandonare (L642012) → 441
Example 2: abbarbagliare (L1203047) → 455
Example 3: abbastanza (L1165403) → 490
Source: https://www.tommaseobellini.it/#/items
Planned use: add to existing Italian lexemes and to Mishramilan (মিশ্রমিলন)
Number of IDs in source: 120,594
Expected completeness: eventually complete (Q21873974)
Formatter URL: https://api.tommaseobellini.it/api/text?delta=0&html=true&id=$1
See also: Treccani Vocabulary ID (P5844), Garzanti Linguistica ID (P6266), Il Nuovo De Mauro ID (P12420)
Applicable "stated in"-value: Dictionary of the Italian Language by Nicolò Tommaseo and Bernardo Bellini (Q902438)

Motivation


This property is a numeric identifier for Italian lexemes. (Like Sõnaveeb lexeme ID (P11138), this property's formatter URL links to a single entry, but without any CSS as that is expected to be provided by the main tommaseobellini.it page.) Mahir256 (talk) 20:47, 7 June 2024 (UTC)

Discussion


Tesoro della Lingua Italiana delle Origini ID

Status: Under discussion
Description: identifier for an entry in the online reproduction of Pietro G. Beltrami's dictionary at tlio.ovi.cnr.it
Data type: External identifier
Domain: Italian lexemes
Allowed values: [0-9]{6}
Example 1: abbandonare (L642012) → 000042
Example 2: abbarbagliare (L1203047) → 000047
Example 3: abbastanza (L1165403) → 000066
Source: http://tlio.ovi.cnr.it/
Planned use: add to existing Italian lexemes and to Mishramilan (মিশ্রমিলন)
Number of IDs in source: 44,892
Expected completeness: eventually complete (Q21873974)
Formatter URL: http://tlio.ovi.cnr.it/voci/$1.htm
See also: Treccani Vocabulary ID (P5844), Garzanti Linguistica ID (P6266), Il Nuovo De Mauro ID (P12420)
Applicable "stated in"-value: Tesoro della Lingua Italiana delle Origini (Q111993966)
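
Because the allowed-values pattern in this proposal requires exactly six digits, values have to be stored as zero-padded strings rather than as plain numbers. A minimal Python sketch of that padding follows; the helper name `to_tlio_id` is hypothetical, and only the pattern and URL template come from the proposal.

```python
import re

# Pattern and URL template copied from the proposal; the helper is illustrative.
TLIO_ID = re.compile(r"[0-9]{6}")
FORMATTER_URL = "http://tlio.ovi.cnr.it/voci/$1.htm"


def to_tlio_id(number: int) -> str:
    """Zero-pad an entry number to the six digits the pattern requires."""
    value = f"{number:06d}"
    if TLIO_ID.fullmatch(value) is None:
        raise ValueError(f"entry number out of range: {number}")
    return value


print(FORMATTER_URL.replace("$1", to_tlio_id(42)))
```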

Motivation


This property is a numeric identifier for Italian lexemes. Mahir256 (talk) 20:58, 7 June 2024 (UTC)

Discussion


Wikibase form


Wikibase sense


SALDO sense ID

Status: Under discussion
Description: identifier for a sense in SALDO
Represents: SALDO (Q82799152)
Data type: External identifier
Domain: sense
Allowed values: /A-Za-zäöå.-_0-9/
Example 1: L32192-S1 → ordbok..1
Example 2: L47698-S1 → pass..1
Example 3: L39238-S1 → galen..1
Source: https://github.com/dpriskorn/LexSALDO
Number of IDs in source: 130000
Expected completeness: eventually complete (Q21873974)
Formatter URL: https://spraakbanken.gu.se/ws/saldo-ws/lid/html/$1
Single-value constraint: yes
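
The allowed-values field in this proposal is not written as a bracketed character class, so it cannot be used as a regular expression as it stands. The Python sketch below shows one possible reading of it. The bracketed pattern is an assumption about the proposer's intent, not something stated in the proposal, and the function name `is_plausible_saldo_id` is hypothetical.

```python
import re

# Assumed reading of the allowed-values field as a character class.
# The hyphen is placed last so that it is a literal rather than a range.
SALDO_ID = re.compile(r"[A-Za-zäöå._0-9-]+")


def is_plausible_saldo_id(value: str) -> bool:
    """Return True if every character of the value is in the assumed class."""
    return SALDO_ID.fullmatch(value) is not None


# The three example values from the proposal.
for example in ("ordbok..1", "pass..1", "galen..1"):
    print(example, is_plausible_saldo_id(example))
```

Under this reading, uppercase Ä, Ö and Å are not accepted; whether that matters depends on the identifiers actually present in the source.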

Motivation


This is an important identifier for senses in Swedish.

Discussion

Ainali Belteshassar so9q Vätte Popperipopp Tulipasylvestris Esquilo Daniel Mietchen — with focus on topics related to research (Q42240) VisbyStar Haxpett QubeCube Marcus.linneberg Vitplister Spisen Sollentuna Myohmy671 Autom S4b1nuz E.656

Notified participants of WikiProject Sweden. Regards, ZI Jony (Talk) 18:22, 30 May 2024 (UTC)

Other
