Guidelines for Academic Requesters

From WeAreDynamo Wiki
Revision as of 23:15, 12 August 2014 by Excited iguana (Talk | contribs)

Doing Ethical Research with Amazon Mechanical Turk Workers and Communities - v0.2

"Treat your workers with respect and dignity. Workers are not numbers and statistics. Workers are not lab rats. Workers are people and should be treated with respect." - turker 'T', a Turkopticon moderator

Goal: Guidelines Turkers can share with researchers or the IRB

Lots of academic research happens through AMT or about Turkers, but ethics boards (IRBs) who review and approve research protocols often don't know how workers want to be treated. Let's collectively author guidelines that will educate researchers and let Turkers hold them accountable to a higher standard.

Suggestions? Thoughts?

Please feel free to make edits directly! If you'd like to get feedback or just throw an idea out there, drop a post at the discussion thread for this page.

TODOs

  • How do people feel about researchers logging, scraping forum data, lurking on forums to write papers unannounced? Draft a guideline.

This is a strawman draft. If you take strong issue with the content of these guidelines, please propose an alternative within the wiki or add to the discussion in the forum.

What counts as research?

Whether your research requires human subjects or IRB clearance depends on your institution. We hope that human subjects boards find this document clarifying about the details of ethical engagement with Amazon Mechanical Turk communities.

Even if you don't need an IRB, these guidelines outline the most basic Turker expectations of what constitutes ethical research engagement. The complexities of long-term research programs require ethics beyond IRB requirements. That means that if you are a machine learning researcher gathering training data for your newfangled convolutional neural network, or you are doing a controlled experiment on a HIT, or you are doing fieldwork with Turkers, these guidelines are meant for you.

Guidelines

Basics of how to be a good requester

A more extensive version of this section can be found here: Basics of how to be a good requester

There are many basics to being a good requester and getting good results. Several sources for additional opinions on how requesters can effectively use MTurk, including specifics of HIT creation, are linked in the section Other guidelines as resources.

Clearly identify yourself.

This ideally should include: the full name/s of the researcher/s responsible for the HIT's project; the university/organization/s they're affiliated with and its state/country; their department name, lab, project group, etc.; and any direct contact information you're willing to provide. The more of this information you provide, and the more places you provide it, the better: requester display name, HIT description, HIT content text visible in preview, and survey consent/intro page (in order of increasing amount of information that would be appropriate there).

Learn why requesters providing this information up-front benefits both workers and requesters.

Workers generally are more willing to take a chance on a requester they're not familiar with (particularly one who hasn't yet been reviewed by any workers on Turkopticon) if they know it is an academic requester, because it is a sign of legitimacy, and because the university 'chain of command' and IRB oversight are one of the few means of recourse workers have if something goes wrong on MTurk. Amazon takes a very hands-off approach to issues workers may have with unfair requesters.

Turkers who want to know (for the above reasons) can often figure out much of this information for an academic requester who doesn't provide it, but this takes time and effort that could be better spent on other things if the requester would provide it.

For example, when a large batch of HITs was posted by a new requester with no Turkopticon reviews and whose only visible identification was their first-name-only requester display name, some turkers hesitated, trying to decide if it was too risky to do more than a few. When a turker was able to identify the requester's full name and affiliation with a major university, the turkers felt more confident to do a larger quantity of those HITs.

And as another example, due to a lack of obvious indications of its identity/legitimacy, an academic research project trying to improve spam-filtering caused concerns for some turkers that it may have been posted by spammers trying to use MTurk to improve their own spam to bypass filters; until they became aware of the academic nature of the HIT, concerned workers avoided doing the HIT, and posted negative reviews and discussion comments.

Always use a consent/intro page or paragraphs.

It seems many universities currently exempt online surveys from many or all IRB requirements, or at least exempt online studies from certain departments which don't cover sensitive topics. Even if your university doesn't force you to, it's always a good idea to use a consent/intro page at the beginning of a survey, and/or paragraphs in the HIT content text for non-survey tasks.

Learn what to cover in your consent form.

  • clearly identify yourself;
  • clearly state the pay to be expected (and make sure this statement of pay matches what the HIT is currently posted for; some consent pages accidentally state a higher pay than the HIT's actual current MTurk pay), and how soon approval can be expected;
  • clearly state any possible bonuses and/or follow-up studies workers may qualify for, and how soon their issuance can be expected;
  • state the number of minutes you expect it to take a worker to complete the study;
  • state any reasons for which you plan to automatically reject submissions;
  • state a title for the study and however much description of it you reasonably can without compromising it;
  • provide an email address for contacting the IRB, since Turkers live in many places and may not be able to afford non-local phone calls.

Provide reasonable time estimates and limits.

Clearly state up-front a generous estimate of how long the task will take for someone not already familiar with it to do it carefully and well. Err on the side of overestimation, to avoid disappointment/frustration. Workers calculate estimated earnings based on time estimates. Optimistic estimates can mean that workers will rush through towards the end so they can maintain the effective pay rate they need. Other workers will return the survey in protest, losing out on all the compensation. Displaying an accurate progress bar as workers move through the survey helps them know when they're nearing the end.

Set the 'Time Allotted' limit for your HITs to an amount of time much longer than the expected amount of time needed to complete the survey or task. Workers like to have leeway in case your time expectation was underestimated, and to have time available if needed to deal with interruptions that occasionally come up, like ISP/browser malfunctions, restroom breaks, phone calls, visitors or family members needing attention, and such.

Approve work as soon as possible.

Some requesters try to compare MTurk approval times to the time between paychecks at a traditional job, to say that they think workers shouldn't complain about waiting for payment. But with a traditional job, a worker knows they'll get paid for the time they've reported working, even if they don't get the pay until days/weeks after they did that work, and they'll know for sure how many days/weeks that wait should be. Even if the employer fires the worker in the meantime, the employer is still legally obligated to pay what the worker earned. With MTurk approval times, a worker is actually waiting to see if they'll get paid at all for the work they've done, and if so, will it be at the end of the auto-approve time (which the worker may not know) or at some point sooner.

Set your auto approval time as short as reasonably possible for the time you'll need for any checking of the work; 7 days should generally be more than sufficient, and it's better if it's less than that. Many requesters approve work in less than 3 days, some in less than 24 hours.
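As a rough illustration of the two timing settings discussed above, here is a minimal sketch that converts them into seconds. The key names `AssignmentDurationInSeconds` and `AutoApprovalDelayInSeconds` are assumed to correspond to the 'Time Allotted' and auto-approval settings in the MTurk API; check the current API documentation before relying on them. The helper name, the leeway factor of 4, and the 3-day default are illustrative assumptions, not values from these guidelines.

```python
# Hypothetical helper: turns the timing advice into HIT settings in seconds.
def hit_timing_settings(expected_minutes, leeway_factor=4, approval_days=3):
    """Return timing settings for a HIT as a dict of seconds.

    leeway_factor and approval_days are illustrative defaults (assumptions);
    the guidelines only say "much longer" than the expected time and that
    7 days of auto-approval delay should be more than sufficient.
    """
    if expected_minutes <= 0:
        raise ValueError("expected_minutes must be positive")
    if approval_days > 7:
        raise ValueError("the guidelines suggest 7 days or less")
    return {
        # 'Time Allotted': expected time multiplied by a generous leeway factor
        "AssignmentDurationInSeconds": int(expected_minutes * leeway_factor * 60),
        # Auto-approval delay: days converted to seconds
        "AutoApprovalDelayInSeconds": int(approval_days * 24 * 60 * 60),
    }


settings = hit_timing_settings(expected_minutes=15)
print(settings)
```

For a survey expected to take 15 minutes, this gives workers an hour to finish and auto-approves after three days.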

Don't violate workers' trust and the MTurk Terms of Service.

Don't require workers to provide personally identifying information to complete your HITs; common problems include asking for email addresses (requesters can use MTurk to send messages to workers without having the workers' email addresses), exact birthdates (year alone, or month and year, should be sufficient), or real names.

Don't require workers to register on sites that require this kind of personal information to complete your HITs. If a requester has a project that requires workers to register on a special site the requester set up just for the HIT, let workers use their MTurk worker ID# or a username of their choice as the unique login identifier, instead of unnecessarily expecting an email address be provided for this purpose.

Many workers also object to HITs that require the use of Facebook accounts, which are intended to be quite personally identifiable.

There has been at least one incident where a requester carelessly exposed hundreds of turkers' email addresses that the requester had collected.

Don't require workers to download software programs or apps to complete your HITs. This can be a major security risk for workers, particularly if the program comes from an unofficial source set up just for the HIT. It became known in 2014 that an academic researcher had performed a study on MTurk intended to see how low a pay level would still convince workers to download and install a program that pretends to be malware, so many workers who are aware of this study are now even more hesitant to go along with download-requiring HITs, even from seemingly legitimate requesters.

If you don't follow the Terms of Service, particularly in the aforementioned ways that pose potential threats directly to workers, some workers will give your requester account negative Turkopticon reviews with flags for ToS violations, and report your HITs to Amazon.

Other points to consider

  • Communicate with workers promptly and politely: Check the email account associated with your MTurk requester account frequently. Respond to messages from workers as quickly as possible, preferably in less than 24 hours.
  • Be clear about bonuses: If a bonus is offered, state as clearly as possible what the potential amount will be (or the range of possible amounts and the expected mean), how to earn it, and how soon workers should expect it to be paid. Pay in as timely a manner as possible. In your bonus message, clearly state the title of the HIT and its date, and briefly re-explain how the bonus was earned/calculated.
  • Avoid duplicates/retakes in fair ways: Please don't block workers through MTurk just to prevent retakes! Being blocked by requesters can put a worker's MTurk account at risk of being suspended (banned from all future work on MTurk). Blocking should generally only be a last resort against an occasional worker who submits such terrible work that they're clearly not trying. You can simply configure MTurk to only allow each worker to accept the HIT once. If your survey will be posted more than once, you will need another way to restrict retakes.
  • Compensate for qualifier/screener surveys: If you are using qualifier surveys, compensate all those who correctly complete the survey.
  • Avoid completion code malfunctions: Make sure your survey will actually provide the promised completion code to workers who complete it, and that the code is correctly saved in your database.
  • Avoid other causes of unfair rejections: Rejections leave workers with a mark counting against them on their 'permanent record' at MTurk that may take them below a qualification threshold necessary for certain other HITs. Before deciding a rejection is justified, be sure you've considered the factors that could make it unfair.

Ethical engagement with worker forums

Turkopticon is a site specifically for workers to review requesters, and is augmented by browser extensions and userscripts that display requesters' average ratings within the MTurk interface.

General forums for MTurk workers include, in alphabetical order:
CloudMeBaby
mTurk Boards
mTurk Forum
mTurk Grind
mTurk Wiki Forum
Reddit's /r/mturk and /r/HITsWorthTurkingFor
Turker Nation

Some workers have strong feelings about which forums they prefer. Dynamo is taking a neutral position and not recommending for or against any specific forum over another, but encourages each requester to consider the options and choose one or more forums where they can communicate with workers about their HITs, responding to questions, suggestions, and complaints.

Experiments might sow confusion and mistrust among participants in Mechanical Turk forums.

Why this matters (history): For example, one academic experiment simulated requesters with varying ratings in Turkopticon to measure the effects of ratings on worker behavior and outcomes. Turkers found some of the requesters and smelled something fishy but did not know if it was a scam, academic research, vandalism, or something else; through what amounted to at least 50 hours of sleuthing over two days, Turkers across reddit and turkopticon-discuss hypothesized that this was a research project. The researcher wanted to make positivist knowledge claims about ratings, workers, and the economics of Turking, but neither he nor the IRB understood that:

  • simulating fabricated requesters and reviews would break the fragile trust that makes Turkopticon ratings meaningful to workers
  • worker harm includes not only unpaid wages in AMT, but also the time workers spent anxiously trying to track down these mysterious apparitions

Principles:

  • Protect not only workers, but protect the quality of interaction in the worker forums that enable workers to make a living and a life through Mechanical Turk.
  • Do not assume that an experimental practice involving forums is legitimate just because it has not been expressly forbidden. Worker forum members cannot anticipate all the interventions researchers might imagine.
  • Discuss research protocols involving worker forums with workers from those forums themselves (best) or with forum administrators to get guidance on the risks, needs, and vulnerabilities of those forums.

Fair payment

Crowdsourcing workers are a labor force. While we cannot speak for all crowd workers, many depend on income from crowdsourcing as a supplementary or primary income. Crowdsourcing workers are legally considered contractors and therefore are not protected by any minimum wage laws. When requesters pay a fair wage and treat workers like people, both sides receive positive results.

Many workers consider $0.10 a minute to be the minimum that can be considered ethical, though many studies pay more and there are excellent arguments to pay more (see below). Tasks paying less than $0.10 a minute are likely to tap into a highly vulnerable work pool and constitute coercion.

Posting to Turk is not just like collecting a survey

Posting an academic survey on Amazon Mechanical Turk is different than traditional forms of survey collection. Workers presume that they will be paid a fair wage and do not respect requesters who offer an extremely low rate, though what counts as fair may differ among crowdworkers (for example, in India vs. the US). Unfortunately, some requesters take this difference to mean they are allowed to pay extremely low rates, and consider this to be "the norm of the market". This is not acceptable to the majority of the established worker community. These requesters survive because of the constant influx of new workers who have not established themselves in the workplace and the large population of international workers who view these lower payments as acceptable.

What is ethical pay for Turkers in studies?

Consider the estimated duration and difficulty of your task when deciding about payment. On many turker forums a rate of $0.10 per minute is considered the bare minimum that most workers will work for. Although this is only $6 per hour and far below minimum wage standards in Western countries, it is the current guideline presented to requesters across a number of forums. Many more experienced and knowledgeable Turkers, by contrast, refuse to work for less than $0.15-$0.20 per minute or more.
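The per-minute arithmetic above can be sketched as follows. This is only an illustration: the function names are hypothetical, amounts are kept in whole cents to avoid rounding surprises, and the default of 10 cents per minute is the bare minimum mentioned above, not a recommended rate.

```python
import math

# Bare minimum cited on turker forums: $0.10 per minute ($6 per hour).
MIN_CENTS_PER_MINUTE = 10


def minimum_reward_cents(estimated_minutes, cents_per_minute=MIN_CENTS_PER_MINUTE):
    """Smallest reward, in cents, that meets the per-minute rate (rounded up)."""
    return math.ceil(estimated_minutes * cents_per_minute)


def hourly_rate_dollars(reward_cents, estimated_minutes):
    """Effective hourly rate, in dollars, implied by a reward and a time estimate."""
    return reward_cents / estimated_minutes * 60 / 100


# A 12-minute survey at the bare minimum, and at $0.15 per minute.
print(minimum_reward_cents(12))       # reward in cents at $0.10/min
print(minimum_reward_cents(12, 15))   # reward in cents at $0.15/min
print(hourly_rate_dollars(120, 12))   # hourly rate implied by $1.20 for 12 minutes
```

Using a generous time estimate here, as suggested in the section on time estimates, raises the reward accordingly.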

To date, many academic requesters' published papers, and many of the few university IRB websites that have any specific guidelines about MTurk, have stated that they paid or recommend paying rates equivalent to $2-$3 or less per hour ($0.03-$0.05 or less per minute) on MTurk. The usual reason given is that this is slightly better than or similar to reported average rates (which perpetuates the status quo); some also mention reasoning such as that the workers either don't really need or care about the money, or are in low-income countries where this seems like a lot of money. There is a large US university that has routinely posted survey HITs for less than $0.01 per expected minute. There has been a lot of debate in turker forums about what a fair, or even acceptable, rate of pay is or should be. So far, the rate workers most commonly mention as a suggested minimum has been $6/hr ($0.10/min). This is largely not because it is what most workers really prefer and are satisfied with, but because it is an easy-to-remember figure and a relatively realistic target, given that most academic and non-academic HITs to date have paid much less. There isn't one answer appropriate for all situations, but here are some points to consider when trying to decide what a fair and ethical rate of pay on your HITs would be for US-based workers:

  • Honest US-based turkers will generally be paying taxes on their MTurk earnings as self-employment.
  • There is a lot of unpaid overhead time involved in turking, including: looking for the next suitable HITs to do, taking uncompensated qualification tests to hopefully qualify for certain HITs, checking reviews for unfamiliar requesters to decide whether to work on their HITs, writing reviews of requesters, communicating with other workers on forums, dealing with some of Amazon's security measures such as periodic Captchas and forced logouts that can interrupt workflow, dealing with occasional malfunctions of the worker's ISP/browser/computer, communicating with requesters (or in about half the cases, apparently futilely sending messages into a void) about questions/problems/suggestions, keeping track of the work they've done and the payments and bonuses they have or haven't received so far, checking their records of work they've done to see if it's safe to take a survey that threatens rejections if you take it more than once, and more. All break time (even going to the restroom) is also uncompensated.
  • The more specialized knowledge/skills/characteristics, and/or the more stringent the qualifications (such as higher number of HITs approved, higher approval rate, scores on requesters' custom quals, and/or Masters) that your HITs will expect or require, the higher the pay rate for it generally should be if you want to continue to be fair (a fair minimum pay rate logically would only be considered as fair for HITs with minimum requirements, just like more-qualified/experienced workers in the traditional workforce generally expect to receive higher pay than less-qualified/experienced workers).
  • Although self-employment work is not covered by minimum wage laws, those laws are commonly used as a benchmark in evaluating what pay would be fair and ethical.

Learn why a fair minimum wage might be as high as $21.75

  • As of Aug 2014, the current national minimum wage has been $7.25/hr (~$0.12/min) since July 2009, due to the final of three gradual tiers of increases that were passed in May 2007.
  • As of Aug 2014, there is currently a movement trying to raise the national minimum wage to $10.10/hr (~$0.17/min), but a bill that would've done that by late 2016 is stalled in Congress due to the political situation. The President was still able to set $10.10/hr as a minimum wage for employees of companies contracting on federal government projects, which will take effect for contracts that are new or renegotiated after Jan 1, 2015. - Wage and Hour Defense Blog
  • An increasing number of states/territories, and even some cities, have stepped in to raise their own minimum wages higher than the national one. The current highest state/territory minimum wages as of Aug 2014 are $9.00/hr ($0.15/min) in California (increasing to $10.00/hr (~$0.17/min) on Jan 1, 2016), $9.32/hr (~$0.16/min) in Washington state, and $9.50/hr (~$0.16/min) in DC effective July 1, 2014 (increasing to $10.50 (~$0.18/min) on July 1, 2015, to $11.50 (~$0.19/min) on July 1, 2016, and annual inflation-indexed increases thereafter). - National Conference of State Legislatures
  • If increases in the national minimum wage had kept pace with basic inflation of consumer prices since 1968, it should be $10.86 (~$0.18/min) as of 2013. - National Employment Law Project
  • If increases in the national minimum wage had kept pace with nationwide productivity growth in all industries since the 1940s, it should be $16.54 (~$0.28/min) as of 2012. - Center for Economic and Policy Research
  • If only considering 'non-farm' productivity growth (i.e. excluding agricultural workers from the calculation), the minimum wage should be $21.75 (~$0.36/min) as of 2012. - Center for Economic and Policy Research
  • "If minimum-wage workers received only half of the productivity gains over the period, the federal minimum would be $15.34 [~$0.26/min]. Even if the minimum wage only grew at one-fourth the rate of productivity, in 2012 it would be set at $12.25 [~$0.20/min]." - Center for Economic and Policy Research
  • The per-subject costs for other non-MTurk ways researchers can recruit survey participants reportedly tend to be much higher than you would be paying MTurk workers, even at much more fair and ethical pay rates than are currently prevalent. Researchers who say they 'don't have the funding' to pay better rates on MTurk should consider that the alternatives are often to pay even more for a participant pool that may be less diverse and in some cases less attentive than MTurk workers. Even in the case of unfunded student projects, please try to consider that the total difference between fair and unfair pay will usually be less than you might think, particularly compared to the other costs you've committed to in pursuing your education, even just the textbooks.
  • The availability of non-US workers on MTurk has apparently been gradually decreasing since Amazon stopped accepting registrations of new international worker accounts in late 2012. And with the exception of India (the only country besides the US that has ever been able to receive direct monetary payment), there was never a large percentage of workers from any other particular country. So even if you don't specifically require workers to be in the US to accept your HITs, a large and re-growing proportion of them will be in the US, unless you specifically exclude US workers from your HITs.
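To see where a planned HIT falls against the hourly figures quoted above, a requester could run a quick comparison like the one sketched here. The benchmark values are copied from this page (as of Aug 2014); the function names and the example reward are hypothetical.

```python
# Hourly benchmarks quoted on this page, in US dollars per hour (as of Aug 2014).
BENCHMARKS = {
    "forum minimum ($0.10/min)": 6.00,
    "US federal minimum wage": 7.25,
    "California minimum wage": 9.00,
    "Washington state minimum wage": 9.32,
    "proposed federal minimum wage": 10.10,
}


def effective_hourly_rate(reward_dollars, estimated_minutes):
    """Hourly rate implied by a HIT's reward and its estimated duration."""
    return reward_dollars * 60 / estimated_minutes


def benchmarks_met(reward_dollars, estimated_minutes):
    """Names of the benchmarks that the HIT's effective hourly rate reaches."""
    rate = effective_hourly_rate(reward_dollars, estimated_minutes)
    return [name for name, wage in BENCHMARKS.items() if rate >= wage]


# Hypothetical HIT: $2.00 for an estimated 15 minutes of work.
print(effective_hourly_rate(2.00, 15))
print(benchmarks_met(2.00, 15))
```

The same division by 60 gives the per-minute figures shown in parentheses in the list, for example `round(7.25 / 60, 2)` for the federal minimum.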

Learning more about the demographics and other statistics of the turker workforce can help requesters make more informed decisions about how to structure and compensate their HITs. Many turkers are indeed casual participants performing a small number of HITs, but studies indicate the vast majority, perhaps 80%, of the HITs completed on MTurk are performed by turkers who are in the top 10% or so of productivity among active turkers, each completing hundreds or thousands of HITs per week; and many of those put forth that much effort because the money is very important to them to make ends meet, whether they have other significant sources of income or not.

Several dozen academic papers and blog posts, covering much of the above information and other related topics (many demographics, as well as work consistency, work distribution, etc), are listed with links and quotes of the relevant portions, at 'Demographics of Mechanical Turk' by turker 'clickhapper' at mTurk Grind.

Why not pay a small, token amount?

Many Turkers won't work for less than $0.10 / minute. That means if you pay $0.02 / minute, you are getting workers who are too desperate to boycott. This constitutes coercion. If you really cannot pay the minimum, then it is better to pay nothing because at least then you get true volunteers.

What about my research sample?

No matter what guideline is used, it has always been up to the individual worker to decide how much their time is worth, but when large groups of workers are excluded from research because of poor payment, the results of the research cannot be considered a valid sampling of a population.

What if I want Turkers from different countries?

Since MTurk is a worldwide website, what may be an acceptable wage in Asia is not acceptable in many North American and European countries. If a requester would like to use the entire range of worldwide users, they should pay the same wage in India as they do in Indiana. If a requester would like to use only workers from emerging economies, it would be acceptable to break from the Western payment norms and price work according to fair wages within those countries.

What if my task takes longer than I thought, so the wages sink below what is fair?

If an amount of pay you expected to be a fair rate turns out not to be because you accidentally underestimated how long your survey would take for reasonably efficient workers to complete, consider adjusting for this situation by sending bonuses to the workers to make up the difference. You could base the bonus amount on the mean or median time to complete, in case a few workers are unusually inefficient. In July 2014, a requester did this unexpectedly for workers who took one of their surveys, basing their target pay rate on Washington state's $9.32/hr minimum wage.
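One way to work out such a make-up bonus is sketched below. It assumes the bonus is based on the median completion time and that every worker receives the same bonus; the function name, the completion times, and the $1.00 original reward are made up for illustration. Amounts are in cents.

```python
import math
from statistics import median


def top_up_bonus_cents(paid_cents, completion_minutes, target_cents_per_hour):
    """Per-worker bonus, in cents, so that pay at the median time meets the target rate."""
    typical_minutes = median(completion_minutes)
    # Fair pay for the typical completion time, rounded up to a whole cent.
    fair_cents = math.ceil(typical_minutes * target_cents_per_hour / 60)
    # Never return a negative bonus if the original reward was already enough.
    return max(0, fair_cents - paid_cents)


# Hypothetical survey: paid $1.00, and workers took these times (in minutes).
times = [10, 12, 14, 15, 40]
# Target: Washington state's $9.32/hr minimum wage, as in the July 2014 example.
bonus = top_up_bonus_cents(paid_cents=100, completion_minutes=times, target_cents_per_hour=932)
print(bonus)
```

Using the median keeps one unusually slow completion (40 minutes here) from inflating the estimate, while still covering the typical worker.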

User interface and title of your HIT

Let's use a common format for describing academic HITs. It signals good intentions to the Turkers, it helps spread word to other academic requesters, and it builds positive credibility for the larger community.

Please title your HIT in this format:

[HIT title], [Estimated completion time], [hourly rate=reward/time]

For increased visibility, consider adding "Dynamo" or "Dynamo-compliant" to the end of the HIT title.

So that workers can find these HITs, include "dynamo" in the keyword set for the HIT.

Within the HIT text itself, somewhere please include "This HIT follows the Dynamo guideline for academic requesters." Including a link to the IRB approval is appreciated, so that we know who to contact.
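A small helper can keep titles and keywords consistent with the format described above. This is only a sketch: the exact wording of the time and rate fields ("10 min", "$9.00/hr") and the comma before the "Dynamo-compliant" tag are assumptions, since the guideline specifies the order of the fields but not their exact formatting. The function names and the example survey title are hypothetical.

```python
def dynamo_title(hit_title, estimated_minutes, reward_dollars, tag="Dynamo-compliant"):
    """Build a title as: [HIT title], [estimated time], [hourly rate], [tag]."""
    hourly_rate = reward_dollars / estimated_minutes * 60  # hourly rate = reward / time
    return f"{hit_title}, {estimated_minutes} min, ${hourly_rate:.2f}/hr, {tag}"


def with_dynamo_keyword(keywords):
    """Return the keyword list with "dynamo" added if it is not already present."""
    if any(k.lower() == "dynamo" for k in keywords):
        return list(keywords)
    return list(keywords) + ["dynamo"]


title = dynamo_title("Survey on decision making", 10, 1.50)
keywords = with_dynamo_keyword(["survey", "psychology"])
print(title)
print(keywords)
```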

Who should you contact?

If you're not sure whether a research project involving Turkers is ethical, who should you contact? TODO

Here are some resources on ethics of research online:

http://plato.stanford.edu/entries/ethics-internet-research/

http://aoir.org/reports/ethics2.pdf

http://irb.uconn.edu/internet_research.html

Other guidelines as resources

Epigraph

"Turking is work, even if it is for science, and academic researchers shouldn't assume that people are happy to do it for fun. They should pay and respect people's time." - Dr. Lilly Irani (of Turkopticon), Department of Communication, University of California at San Diego

"What we need to do is teach requesters about the human side of Mturk. Mturk encourages anybody that uses Mturk to think of us as little computing units, not as people." - Project2501 (a Turker)

"Dehumanization is the result of an unjust order that engenders violence in the oppressors, which in turn dehumanize the oppressed. Because it is a distortion of being more fully human, sooner or later being less human leads the oppressed to struggle against those who made them so. In order for this struggle to have meaning, the oppressed must not, in seeking to regain their humanity (which is a way to create it) become in turn oppressors of the oppressors, but rather restorers of the humanity of both. This, then, is the great humanistic and historical task of the oppressed: liberate themselves and their oppressors as well." - Freire's Pedagogy of the Oppressed

Signatories