Value of Data Labor

  • Licenses should protect data labor, even for derivative works like LLMs (protect_data_labor, 2)
  • The scale tips toward the data labor taken (“stolen”) from the commons rather than toward model labor (scale_labor_tipped, 2)
  • Any model is better than no model (distinct from the first better_than_nothing code, which holds that open-weight models are better than closed ones) (better_than_nothing_two, 1)
  • There is more to building a model than just its training data (model_labor_matters, 2)
  • Compensation for data labor may be an intractable problem (data_compensation_intractable, 1)
  • Creators don’t get paid by the model builder because they get paid by the platform (get_paid_platform, 1)
  • Users who truly disagree with a platform’s ToS should leave platforms that collect data (in this case, Reddit) (get_off_platform, 1)