Wikidata talk:Disclosure of paid editing


Forms of Disclosure

@ChristianKl, Pyfisch: I'd propose changing the second section as follows:

==Forms of disclosure==
Disclosure must happen at least on your user page and by categorization in [[:Category:Paid Contributor]] or one of its subcategories. This can be done by using a template like [[User:Dr.üsenfieber/PaidContributions]].

This would make it easier to find out if a certain user is a paid contributor. -- Dr.üsenfieber (talk) 12:52, 12 November 2020 (UTC)[reply]

Yes, posting on some dead WikiProject is not a good form of disclosure. Make user page disclosure mandatory and mention the other forms as also potentially useful. Mateusz Konieczny (talk) 06:20, 24 November 2020 (UTC)[reply]

I quite like Dr.üsenfieber's idea of having a standard template that every paid contributor has to have on their user page. This allows for standardizing the language employed and might help avoid bullsh*t corporate phrasing. NMaia (talk) 02:02, 16 December 2020 (UTC)[reply]

Conflict of interest

Should this include a statement regarding conflict of interest? There was one proposed in the "code of conduct" draft circulated some time ago - "One should not create items about oneself or people or organizations close to one, and only add statements that have a reliable reference when editing such items". ArthurPSmith (talk) 13:46, 17 December 2020 (UTC)[reply]

  • I don't think it should include such a statement. Restricting a Wikimedian in Residence from creating items about people in his organization inhibits valuable behavior. Policy that focuses on disclosure is better than policy that focuses on not engaging in specific actions. We already have enough rules to shut down self-interested item creation and have no need for more. ChristianKl 18:06, 17 December 2020 (UTC)[reply]
I agree with Arthur. In the case of items created by the users themselves, there is often a problem with original research. With such a rule, we are talking not only about Wikimedians in Residence, but also about advertising/SEO agencies paid to create items about their clients.--Jklamo (talk) 13:58, 19 December 2020 (UTC)[reply]

Always welcomed

I think that the sentence "While paid editing by organizations that benefit from contributing to Wikidata is welcome," is not appropriate, as in some cases paid editing is not welcome (mass spam, mass advertising). --Jklamo (talk) 13:43, 19 December 2020 (UTC)[reply]

@Jklamo: I tweaked the wording. What do you think? NMaia (talk) 01:00, 25 December 2020 (UTC)[reply]

Assessment of the impact of this policy to date?

@Charles Matthews: mentioned recently some negative experiences with undisclosed paid editing, which were not satisfactorily resolved. Has there been any community survey or other analysis of edits to estimate the scale and impact of the problem, and the effectiveness of this policy? Sj (talk) 17:33, 30 November 2022 (UTC)[reply]

@Sj: So for example User:VanderBot is a bot without a flag and with a non-specific remit. The user page does provide information about the institution and operator behind it. I come across items about members of Vanderbilt staff, without explicit support for notability such as an external ID. Is this paid editing? Is it within the bot policy? Charles Matthews (talk) 09:01, 6 December 2022 (UTC)[reply]
When I opened the VanderBot account, I was thinking that I would be writing an autonomous bot to assist with uploading publications by Vanderbilt researchers. The first task I took on was to create Wikidata items for all Vanderbilt researchers who were likely to have publications. Nearly all of them do have publications, although they may not yet be linked to them because the publication items haven't been created yet. Does that make them not notable? It became apparent that the script I was using would probably never be autonomous -- instead it developed into a general tool to assist users in uploading data, similar to QuickStatements or OpenRefine. For convenience, I've continued to use the VanderBot account when I use the script to make bulk uploads (about many types of things, including many artworks and artists), but otherwise the situation isn't any different from that of the dozens of librarians I know of who use upload tools to make edits about resources that are in some way related to the institutions that they work for. If that's a conflict of interest, we have a problem, because if we prohibited that, much of the high-quality editing that's going on in Wikidata would disappear. Steve Baskauf (Baskaufs) (talk) 13:10, 6 December 2022 (UTC)[reply]
  • It is IMO clearly disclosed on your user page that you (and your bot) are paid editors. “Undisclosed paid editing” shouldn’t really be a problem here.
  • The bot policy itself is problematic. It has been pretty much copied from Wikipedia in 2012 (and slightly adapted since). However, automated editing via tools came much later and has never been properly distinguished from proper "bot editing"; unlike at Wikipedia, it is also the case that automation is more like the norm than an exception here at Wikidata. It is thus not clear what the exact purpose of the bot policy is, to be honest. There are advantages of separating "automated editing" (whatever this means specifically) from the main account, but many users just use tools and sometimes even bot frameworks with their regular account as well. —MisterSynergy (talk) 13:24, 6 December 2022 (UTC)[reply]
  • @Baskaufs: Steve, I wasn't particularly choosing VanderBot as a target, but as an illustration of how things stand. The bot page does link to you as operator, as required by Wikidata:Alternate accounts. You don't touch on the point of whether the bot is yours, or the institution's: I would say that it is yours, and that meta:Role account applies here (this is one of the wider concerns I have about Wikidata). I wouldn't say that the paid editing is "clearly disclosed" on User:VanderBot but it is a fair deduction from what is said that you edit with it in a professional capacity, rather than as a volunteer.
Policies here evolve, and I see that Wikidata:Notability, point #2 about description using "serious and publicly available references", probably applies to the biographical items I have seen from the bot. My immediate concern is with author disambiguation for scientific papers. Where person A and person B have the same or a similar name, and there is an issue whether one is an author of a given paper P, one does have to fall back on such references: both to decide whether a link here should run from the item about P to the item about A, or the item about B, or neither, or whether in fact A and B are the same person and the items about them should be merged. In fact merging is a big deal in this area: ORCIDs can be sparse and/or duplicated, there are items based on Semantic Scholar that require guesswork (I merged five of those into one yesterday), and so on. The most basic discrimination is from the topical area of papers: when an item about a scientist is not yet linked to by items about a few articles, one starts with no idea whether it's a geneticist or a sports scientist.
Bot policy is at Wikidata:Bots#Bot requirements and could be said to err on the side of concision. I would argue that after ten years Wikidata should be rather clearer about bot editing, and the fact that over half of edits are still automated makes that more important. All dispute resolution is still "bare hands", and it would be nice to think that scales, but as far as I'm concerned it doesn't. Charles Matthews (talk) 17:53, 6 December 2022 (UTC)[reply]
@Charles Matthews For the record, all Vanderbilt researchers who had identifiable ORCIDs at the time of the upload by the VanderBot script had their ORCIDs added to their item metadata. There was also a careful human-mediated effort to avoid creating duplicates. With the addition of university and departmental affiliations, plus references for all non-trivial claims, I'm not sure what more could have been done to make the researchers identifiable and able to be disambiguated as authors. That's one reason why I gave up on ever trying to create an autonomous bot script. This is a far cry from the "stupid" bots who create items described as "researcher" with no metadata that's useful for disambiguation and no references. Steve Baskauf (Baskaufs) (talk) 18:31, 6 December 2022 (UTC)[reply]

Yesterday I carried out a case study for Cody Covington (Q79275894), an early-career scientist, one of the 3.5K+ non-ORCID VanderBot creations. It was interesting. On creation in 2020 the item was nothing much - it was 2022 when an item about Covington's PhD was linked to it, meaning his full name was available. He still doesn't have an ORCID, as far as I can see. I was able to add three external IDs which helped with publications. With the author_strings script by Magnus Manske I added author statements for Covington, on many but not all of his papers as represented in Wikidata. In that way at least the item has become a node in good standing.

The script is basically about search in local areas of the bipartite graph of articles and authors; it is not the first tool in this area, but is able to open up that graph as a kind of gigantic patchwork quilt of clusters of co-authors. It makes use of "triangulation" in the graph, so is not so transparent at first glance.

While working on Covington, I came across two merges involving VanderBot items, with Lawrence J. Marnett (Q86531334) and Philip J. Kingsley (Q90270887), both created without ORCIDs. In both cases LargeDatasetBot created items for the ORCIDs, a matter of a few weeks later. Perhaps LargeDatasetBot was reading from ORCIDs as they were minted. These duplications are only coming to light as the big graph is being looked at in detail.

This has been an interesting discussion. Both compliance and care should be required of bot operators (I run one myself, but its output is passed to QuickStatements). On the initial point in this thread, I don't see why {{Bot}} shouldn't be modified with a Y/N field to indicate paid editing or not. Charles Matthews (talk) 11:29, 7 December 2022 (UTC)[reply]
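
As an illustration of the co-author "triangulation" mentioned above, here is a minimal sketch in Python. It is not the author_strings script and makes no claim about how that tool works internally; the function name `candidate_papers`, the data layout, and the example names and paper IDs are all hypothetical. The idea shown is only this: starting from papers already attributed to an author, collect their co-authors, then rank unattributed papers that carry the same name string by how many of those co-authors they share.

```python
def candidate_papers(papers, known, name):
    """Rank unattributed papers listing `name` by shared co-authors.

    papers: dict mapping a paper ID to the set of author name strings on it
    known:  set of paper IDs already attributed to the author
    name:   the author name string being disambiguated
    (All three are hypothetical inputs for this sketch.)
    """
    # Co-authors seen on papers we already trust.
    coauthors = set()
    for pid in known:
        coauthors |= papers[pid] - {name}

    scored = []
    for pid, authors in papers.items():
        if pid in known or name not in authors:
            continue
        shared = len((authors - {name}) & coauthors)
        if shared:  # no shared co-author means no evidence either way
            scored.append((pid, shared))

    # Most shared co-authors first; paper ID breaks ties deterministically.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical example data, not taken from Wikidata.
EXAMPLE_PAPERS = {
    "P1": {"J. Doe", "A. Smith", "B. Jones"},
    "P2": {"J. Doe", "A. Smith"},
    "P3": {"J. Doe", "Z. Other"},
    "P4": {"J. Doe", "A. Smith", "B. Jones"},
}

if __name__ == "__main__":
    print(candidate_papers(EXAMPLE_PAPERS, {"P1"}, "J. Doe"))
    # [('P4', 2), ('P2', 1)]
```

Paper P3 is left out because it shares no co-author with the known paper, which matches the point made above that a name string alone gives no basis for deciding between a geneticist and a sports scientist. A human check of each candidate would still be needed before adding author statements.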