Talk:Technology Committee/Project requests/WikiRate - rating Wikimedia


SMART

I think this is now being used so much as a jargon word that the meaning was lost a while back. Metrics are not SMART; could this please be corrected, as it just promulgates further misunderstanding? -- (talk) 20:11, 12 April 2014 (BST)

'SMART metrics' -> 'SMART targets'. Thanks for pointing that out. --MichaelMaggs (talk) 20:57, 12 April 2014 (BST)

Thoughts on task breakdown

Thanks Michael, this looks like a great start. I'm just having a bit of a brain dump here. There are some challenges on which input and/or experimentation will be required. These include:

  • Defining the classes/outcomes
  • Defining the relevant input feature set (and how those features will be detected) to be used as predictors
  • Segmentation - both with regard to the level at which outcomes are assigned (e.g. edit, article) and with respect to

Some of this matters more for machine learning techniques, while we might also use some simpler measures (e.g. looking at the T1-T2 difference on an article-specific Wiki ToDo list) which depend on features as proxies for outcomes. Re: outcomes, one simple scheme for machine learning, particularly as a first step, might just be to label each edit as (1) positive, (2) negative, or (3) neutral with respect to improving the article. We could then provide a breakdown of such edits. We should also be cautious that our feature selection doesn't exclude some widely missed but important features (e.g. alt text). The rubric below might be a good way to 'present back' and assess improvements (probably with some aggregation method for overall improvement); a rough sketch of one possible aggregation follows the table.

Assessment area | Scoring method | Score
Comprehensiveness | Score based on how fully the article covers significant aspects of the topic. | 1-10
Sourcing | Score based on adequacy of inline citations and quality of sources relative to what is available. | 0-6
Neutrality | Score based on adherence to the Neutral Point of View policy. Scores decline rapidly with any problems with neutrality. | 0-3
Readability | Score based on how readable and well-written the article is. | 0-3
Formatting | Score based on quality of the article's layout and basic adherence to the Wikipedia Manual of Style. | 0-2
Illustrations | Score based on how adequately the article is illustrated, within the constraints of acceptable copyright status. | 0-2
Total | | 1-26

from https://en.wikipedia.org/wiki/Wikipedia:WikiProject_United_States_Public_Policy/Assessment#Quantitative_Article_Quality_Assessment_Metric
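
To make the aggregation idea a little more concrete, here is a minimal sketch (in Python, purely for discussion) of how per-area rubric scores before and after an edit could be rolled up into an overall score and a positive/negative/neutral label. The equal weighting of areas and the "neutral" threshold are my assumptions, not anything agreed above.

 # Illustrative sketch only: the areas and maxima follow the USPP rubric quoted
 # above; the equal weighting and the +/-/neutral threshold are assumptions.
 RUBRIC_MAX = {
     "comprehensiveness": 10,
     "sourcing": 6,
     "neutrality": 3,
     "readability": 3,
     "formatting": 2,
     "illustrations": 2,
 }  # maximum total = 26

 def overall_score(scores):
     """Sum the per-area scores, clipping each to its maximum."""
     return sum(min(scores.get(area, 0), mx) for area, mx in RUBRIC_MAX.items())

 def classify_edit(scores_t1, scores_t2, eps=0.5):
     """Label an edit by the T1 -> T2 change in overall rubric score."""
     delta = overall_score(scores_t2) - overall_score(scores_t1)
     if delta > eps:
         return "positive"
     if delta < -eps:
         return "negative"
     return "neutral"

 # Example: an edit that adds two inline citations and a lead image.
 before = {"comprehensiveness": 5, "sourcing": 2, "neutrality": 3,
           "readability": 2, "formatting": 1, "illustrations": 0}
 after = dict(before, sourcing=4, illustrations=1)
 print(overall_score(before), overall_score(after), classify_edit(before, after))
 # -> 13 16 positive

Whether some areas should carry more weight than others (and how the per-area scores are produced in the first place) is exactly the kind of thing that would need experimentation.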

It's also the case that the salient features may vary with a combination of temporal and editor-interaction factors: early-stage articles benefit greatly from the addition of different features than later-stage ones do (see e.g., amongst others, http://scholar.google.co.uk/scholar?cluster=2386972856451571180&hl=en&as_sdt=0,5).

There's also an interesting point re: namespace, i.e. contributions on talk pages versus article pages; presumably in the first instance we're looking only at article contributions.

It is also worth noting that whatever we do, we should where possible consider the implications for non-English Wikipedias; in particular, the ways in which references are used are (I believe) different across Wikipedias. This may well also be true of other projects.

Finally, we should also note that if we did anything successfully, a number of benefits might be gained, including automation (or semi-automation) of quality assessment within projects, etc., and the potential for new editor engagement experiments, e.g. sending editors to articles which we think might be easily improved (a more sophisticated 'wiki to do' tool). Sjgknight (talk) 10:34, 13 April 2014 (BST)

It just crossed my mind that it's also worth noting the potential benefits for e.g. education of being able to take the contributions of a particular editor and look for particular features (e.g. use of 'cite' templates) across those contributions; something along these lines is sketched below. That would also have benefits outside of Wikipedia (on other MediaWiki installations), where analytics on writing style and content could be conducted. Sjgknight (talk) 10:45, 13 April 2014 (BST)
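
As a very rough illustration of the last two points (looking only at article-namespace contributions, and scanning a particular editor's contributions for a feature such as 'cite' templates), something along the following lines could be done against the public MediaWiki API. This is a sketch only: the username is hypothetical, and counting "{{cite" in the full text of each revision is just a crude proxy; a real analysis would diff each revision against its parent to isolate what the editor actually added.

 # Sketch: count {{cite ...}} templates across one editor's article-namespace
 # contributions, using the standard MediaWiki API (here on English Wikipedia).
 import re
 import requests

 API = "https://en.wikipedia.org/w/api.php"

 def article_contribs(user, limit=20):
     """Most recent contributions by the user in the article namespace (ns 0)."""
     params = {
         "action": "query", "format": "json", "formatversion": 2,
         "list": "usercontribs", "ucuser": user, "ucnamespace": 0,
         "uclimit": limit, "ucprop": "ids|title|sizediff",
     }
     return requests.get(API, params=params).json()["query"]["usercontribs"]

 def cite_count(revid):
     """Count {{cite ...}} templates in the wikitext of one revision."""
     params = {
         "action": "query", "format": "json", "formatversion": 2,
         "prop": "revisions", "revids": revid,
         "rvprop": "content", "rvslots": "main",
     }
     page = requests.get(API, params=params).json()["query"]["pages"][0]
     text = page["revisions"][0]["slots"]["main"]["content"]
     return len(re.findall(r"\{\{\s*cite", text, flags=re.IGNORECASE))

 for c in article_contribs("ExampleUser"):   # hypothetical username
     print(c["title"], c["sizediff"], cite_count(c["revid"]))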
: Regarding assessment areas, take a look at Table 3 of Stvilia, Besiki, et al. "Information quality work organization in Wikipedia." Journal of the American Society for Information Science and Technology 59.6 (2008): 983-1001. It gives a good list of criteria. For example, "Accessibility" has "caused by (1) Language barrier (2) Poor organization (3) Policy restrictions imposed by copyrights, Wikipedia internal policies, and automation scripts" and suggests actions such as "Reorganize, duplicate, remove, translate, split, join, rearrange". See the whole list; I think it might be helpful for your chart, Simon. Jodi.a.schneider (talk) 10:48, 13 April 2014 (BST)

Related Wikimania talks

WikiTrust

WikiTrust is looking for an adopter, have you considered hosting/supporting it? --Nemo bis (talk) 20:11, 16 April 2014 (BST)

User-based quality measures

Thanks for this work, Michael! I think one of the most challenging aspects of quality is tying contributions to specific users, i.e. how to tie various programs or events to specific user contributions. Wikimetrics measures contributions, but is unable to measure whether contributions are of high "quality". At the Wikimedia Conference there was a discussion, which will soon be posted on-wiki, about various methods for measuring quality. One of the themes brought up that might be interesting to pursue is to break the measure of "quality" into different ideas, such as "popularity" or "appreciation"; but more discussion is definitely needed around that and how to measure it. Another main challenge is that notions of quality vary significantly across language projects. The measure of citations, for example, can vary across articles because different cultures have different customs around citation.


Another factor to consider, for example, is the benefit of improving the quality of an already pretty decent article versus adding new content to a stub. But this brings up the point that it could be risky to get too nit-picky at the outset of measuring quality, and that perhaps focusing on very general measures (such as number of headings, page views, etc.) might be more helpful and generalizable to all the projects. Thank you for starting the discussion around this topic! Looking forward to hearing your thoughts. I will also send you the link to the discussion around program outcomes when it's posted. Regards - EGalvez (WMF) (talk) 21:59, 16 April 2014 (BST)
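
As an illustration of what such "very general measures" might look like in practice, here is a minimal sketch that computes a few structural counts from an article's raw wikitext. The particular measures and regular expressions are placeholders for discussion, not a proposed metric set; page views would come from a separate source (the Wikimedia pageview data) rather than from the wikitext itself.

 # Sketch: a few crude structural measures of one article revision's wikitext.
 import re

 def general_measures(wikitext):
     return {
         "headings": len(re.findall(r"^=+[^=].*=+\s*$", wikitext, flags=re.MULTILINE)),
         "references": len(re.findall(r"<ref[ >]", wikitext)),
         "cite_templates": len(re.findall(r"\{\{\s*cite", wikitext, flags=re.IGNORECASE)),
         "images": len(re.findall(r"\[\[\s*(?:File|Image):", wikitext, flags=re.IGNORECASE)),
         "wikilinks": wikitext.count("[["),
         "words": len(re.findall(r"\w+", wikitext)),
     }

 sample = "== History ==\nFoo bar.<ref>{{cite web|url=http://example.org}}</ref>"
 print(general_measures(sample))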