Nature of Tasks in Citizen Science

September 30th, 2014 | Posted by citizenscience in article | crowdscience | modularization | protocol | task

Franzoni and Sauermann (2014), in their article “Crowd science: The organization of scientific research in open collaborative projects”, propose a classification of crowd science projects according to task complexity and task structure. The classification also helps explain why and how projects perform as they do, whether or not they succeed.

They define task complexity as the relationship between different individual sub-tasks. Lower task complexity (usually preferred) is attained by minimizing the interdependencies among sub-tasks. A large and complex problem can therefore be modularized: it is divided into many smaller modules, each addressing a smaller problem, with a strategy or architecture specifying how the modules fit together. The authors take modularization to allow for a greater division of labor. Franzoni and Sauermann then use task structure to denote how well defined the structure of the sub-tasks is.

Task complexity and task structure are useful for examining what amateurs are asked to do. Several “citizen science” projects, such as Galaxy Zoo, ask for contributions that only require skills common to the general human population. When classifying galaxies in Galaxy Zoo, for example, citizen scientists can work independently on their sub-tasks, without needing to consider what other project participants contribute. This modularization, or granularity of tasks, as Benkler and Nissenbaum (2006) called it, allows people with different levels of motivation to work together by contributing small- or large-grained modules, consistent with their level of interest in the project and their motivation. Furthermore, modularization is compatible with loosely coupled work (Olson & Olson, 2000), which has fewer dependencies and is more routine, and in which tasks and procedures are clear. As a result, less communication, and less frequent communication, is needed to complete the task.

According to Franzoni and Sauermann, crowd science projects could benefit from modularization by differentiating task complexity and structure, so that tasks are aimed at citizens with different skills and expertise at different stages of a project. Crowd science projects differ in how clearly their task complexities and structures are formulated, and they can be classified accordingly.

It should be noted that independent and well-structured tasks are demanded not only by the organization of crowd science projects, which often involve a number of independent participants in multiple locations, but also by the emphasis on controlled and prescribed protocols and on the validation and accuracy of data. As Bonney et al. (2009) put it:

Citizen science data are gathered through protocols that specify when, where, and how data should be collected. Protocols must define a formal design or action plan for data collection that will allow observations made by multiple participants in many locations to be combined for analysis.

The need for accurate and validated data requires that convergent tasks (Nickerson, 2014) be assigned to citizen scientists, meaning that scientists look for a single output from contributors. Classification of stars, or annotation according to standard labels provided by experts, are examples of convergent tasks. In most citizen science projects reported in the literature we have examined, citizen scientists are only expected to perform tasks according to prescribed protocols; designing those tasks remains the scientists’ responsibility. It is therefore worth reflecting on Nickerson’s thought-provoking words, which refer to the division between the design and the performance of tasks advocated by Taylor:

Distressingly, current crowd work seems to be at the early stages of recapitulating factory employment practices (p. 40).



Benkler, Y., & Nissenbaum, H. (2006). Commons-based peer production and virtue. Journal of Political Philosophy, 14(4), 394-419.

Bonney, R., Cooper, C. B., Dickinson, J., Kelling, S., Phillips, T., Rosenberg, K. V., & Shirk, J. (2009). Citizen science: A developing tool for expanding science knowledge and scientific literacy. BioScience, 59(11), 977-984.

Franzoni, C., & Sauermann, H. (2014). Crowd science: The organization of scientific research in open collaborative projects. Research Policy, 43, 1–20.

Nickerson, J. V. (2013). Crowd work and collective learning. In A. Littlejohn & A. Margaryan (eds.), Technology-Enhanced Professional Learning (pp. 39-47). Routledge.

Olson, G. M., & Olson, J. S. (2000). Distance matters. Human-Computer Interaction, 15, 139–179.



Where is Citizen Science published?

September 18th, 2014 | Posted by citizenscience in analysis | data | scientometrics

Today, the terms “citizen science” and “crowd science” are quite the buzzwords. Searching the internet for these terms returns millions of hits. But what happens when you search the Web of Science for actual scientific publications? Are citizen and crowd science really producing properly peer-reviewed articles, and if so, where can these articles be found?

We used the following search string in the Thomson Reuters Web of Science Core Collection (requires subscription) to cover as much citizen/crowd science as possible:

TS="crowd science" OR TS="citizen science" OR TS="crowdsourcing" OR TS="crowd sourcing"

The search produced 1462 articles, which we then imported into a clever application called HistCite, which allows you to sort and search the output data. HistCite makes it possible, for example, to sort the publications by journal (click on the image below to access the data interactively):



The journal with the most citizen/crowd science publications is the open-access multidisciplinary PLOS ONE. Skipping the conference proceedings for now, the next journal in line is Frontiers in Ecology and the Environment, followed by the computer science journal IEEE Internet Computing. The rest of the list seems to concern biology, conservation and ecology, plus one medical journal. As a very preliminary observation, the mainstream of citizen/crowd science tends to appear within the disciplines of “life”, “computers” and “medicine”.
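The kind of ranking described above can be approximated in a few lines of Python. This is only a minimal sketch, not how HistCite works internally: the record structure (a list of dictionaries with a `journal` key), the function name `rank_journals` and the sample titles are all hypothetical placeholders, not data from our search.

```python
from collections import Counter

def rank_journals(records, top=10):
    """Return (journal, count) pairs, most frequent journal first.

    Titles are upper-cased so that spelling variants such as
    "Journal A" and "journal a" are counted as the same journal.
    """
    counts = Counter(rec["journal"].strip().upper() for rec in records)
    return counts.most_common(top)

# Hypothetical records for illustration only.
sample = [
    {"journal": "Journal A"},
    {"journal": "journal a"},
    {"journal": "Journal B"},
]
ranking = rank_journals(sample)
```

Normalizing the titles before counting matters in practice, since the same journal is often written with different capitalization in exported records.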

However, there are other things that you can do with scientometric data. Using the application VOSviewer, it is possible to visualize the journals according to a principle called bibliographic coupling of sources. This means that articles that cite similar sources (other articles, books, etc.) are regarded as being “close” to each other. The same can be done at the journal level. What we see below is, thus, journals that have clustered together closely because they cite similar references.
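To make the principle concrete, here is a minimal Python sketch of bibliographic coupling at the journal level, where the coupling strength of two journals is simply the number of cited references they share. The function names and the sample reference lists are hypothetical, and the sketch only counts raw overlaps; it does not reproduce the normalization, layout or clustering that VOSviewer performs.

```python
from itertools import combinations

def coupling_strength(refs_a, refs_b):
    """Number of cited references that two sources have in common."""
    return len(set(refs_a) & set(refs_b))

def coupling_links(cited_by_journal):
    """Map each pair of journals to its coupling strength.

    Pairs that share no references are left out, since they would
    not be linked in a coupling map.
    """
    links = {}
    for a, b in combinations(sorted(cited_by_journal), 2):
        shared = coupling_strength(cited_by_journal[a], cited_by_journal[b])
        if shared:
            links[(a, b)] = shared
    return links

# Hypothetical reference lists for illustration only.
cited = {
    "Journal A": {"ref1", "ref2", "ref3"},
    "Journal B": {"ref2", "ref3", "ref4"},
    "Journal C": {"ref9"},
}
links = coupling_links(cited)
```

In this toy example, Journal A and Journal B share two references and would be drawn close together, while Journal C shares none and would stand apart.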



Here we find one cluster on the right-hand side (red) that focuses on ecology/zoology/conservation, and another cluster on the left-hand side (blue) that is dominated by computer science journals. In the middle, the single journal PLOS ONE forms a center of gravity. This seems to corroborate, at least visually, the top-10 list that we produced above.

If we zoom in on the red cluster (if your computer supports Java, you can do this interactively), we see the following:



Here, at least from a preliminary point of view, there seems to be an interesting line of research. What you can do then is to return to HistCite and look for the most cited authors, the individual articles that are most cited globally, or the publications that are the most cited within the dataset (n=1462).

Of course, this is not a complete picture. Scientific publications always lag behind “science in action”. If there is a strong trend right now, it will not show in the publication data until years later.