JB Rubinovitz Rubinoblog

AI researchers propose ‘bias bounties’ to put ethics principles into practice

I’m featured in a writeup about our cross-institutional paper on trust mechanisms in AI.

Bias Bounty Programs as a Method of Combatting Bias in AI

This policy is a response to the continual deployment of biased Artificial Intelligence systems into production, where they are quickly found to be biased and the only consequence is unfavorable news coverage. Bias Bounty Programs could provide scalable oversight of harmful discrimination by AI.

This policy assumes that the status quo is unfavorable for both parties:

Those affected by bias - they rarely receive enough news coverage of the bias to have a chance at an apology and maybe a fix.

Those deploying biased systems - usually a small, homogeneous group with no clear guidelines or imperative to debias the system (in fact, they could be penalized for taking the time to do so), who may get a retroactive slap on the wrist if the media picks up on the bias.

Proposal

A similar problem exists in information security, and one solution gaining traction is “bug bounty programs”. Bug bounty programs allow security researchers and laypeople to submit their exploits directly to the affected parties in exchange for compensation.

The market rate for security bounties for the average company on HackerOne ranges from $100 to $1,000. Bigger companies can pay more. In 2017, Facebook disclosed paying $880,000 in bug bounties, with a minimum of $500 per bounty. Google pays from $100 to $31,337 per exploit and paid $3,000,000 in security bounties in 2016.

It seems reasonable to suggest that, at the least, big companies with large market caps that already have bounty reporting infrastructure attempt to reward and collaborate with those who find bias in their software, rather than have them take it to the press in frustration and with no compensation for their efforts.

Determining what bias is

It is assumed here that the company will determine what bias is in accordance with “their company’s values” as they want the market to perceive them.

Potential Problems

Spam

Possibly the most cited issue with security bounties is the number of false reports that come in and the number of people it takes to triage them. However, this does not seem like too much to ask of companies that professionally make content prioritization software.

Reluctance to Hire Triage Staff

These companies are already controversial for not hiring staff to even interact with paying customers, so this could be a hard sell. However, press pressure has recently led to the hiring of more moderators at both YouTube and Facebook.

Adoption

Option A. Voluntary Enrollment

Companies decide this is a great idea (or better than eventual government intervention), then budget for and implement bounty programs themselves.

Option B. Regulation

Bounty programs can be mandated by the government, most easily for any software the government itself uses.

UX Practices

Where should the bias bounty program live?

  1. In the application? (e.g. under “help”)
  2. On a separate, company-run webpage?
  3. An independent bias bounty marketplace where companies can work together to share biased models?

I think a combination of one and two is the most likely, with one and two being the mobile and web versions of the submission form, respectively.

Conclusion

This is a first attempt at solving a hard problem. Feel free to send feedback to jb@rubinovitz dot com. I would love to figure out a way to hasten the iterations on debiasing production AI models while compensating those affected by them, who have to expend labor reporting the bias.

Acknowledgements: Thank you to Omar Bohsali for sharing his expertise in information security bounties.

NYT: Some Things About Tech Were Good in 2017. No, Really.

I have not spent the time a proper writeup on Bail Bloc deserves, but working on it as a co-creator last year was one of the best things I have ever done, and I will definitely keep holding my work to this standard. Thanks to the NYT for mentioning us in “Some Things About Tech Were Good in 2017.” That’s at least what I was going for :)

NIPS 2017

Here are notes from two talks I particularly enjoyed

Fast.ai vs Deeplearning.ai: which deep learning courses should you take?

As deep learning has become more popular, two courses stood out to me as having really useful teaching styles and reputable staff behind them: Fast.ai and Deeplearning.ai. As someone who has a theoretical background in deep learning and picked up TensorFlow on an as-needed basis for projects, I was really interested in learning how to build complex deep learning architectures from scratch, and I’ve always found I can sharpen my skills by hearing different experts describe known concepts.

Along with my goal of sharpening my skills, I have been working alongside folks at the Recurse Center who are learning deep learning from scratch, and I am using some of their feedback to evaluate the courses from a beginner’s perspective as well.

So, I delved into both courses and will share here what I found.

Theory

Theory in Fast.ai

Fast.ai handles theory through visual examples and take-home readings. I do not think this would be sufficient for a strong grasp of theory unless you can form a great study group to go over the readings and concepts. Both courses do have forums, though, which could help with this.

Theory in Deeplearning.ai

The theory teaching in Deeplearning.ai is really strong, with optional videos that go into more detail and are useful if you have a calculus background.

Winner: Deeplearning.ai. It’s hard to beat Dr. Ng at explaining theory.

Applications

Applications in Fast.ai

I think it helps a lot that Jeremy Howard, a Fast.ai professor, started a deep learning startup after seeing early on that modifying an out-of-the-box ImageNet model could give better results at classifying some medical imagery than top physicians. The hacker mentality is strong here: after lesson 1, you are able to submit to a Kaggle contest and place in what was, at the release of the course, the top 50% of the leaderboard. I’ve also already reused several code snippets I developed while going through the Fast.ai course in my projects.

Applications in Deeplearning.ai

While I could see myself using some code snippets from Deeplearning.ai (provided I downloaded them), I have not been able to easily translate the Deeplearning.ai code I’ve written into the projects I’m working on. Deeplearning.ai also spends a lot more time teaching theory, especially early on, and has yet to release its computer vision and sequence model components, so this will probably change soon.

Winner: Fast.ai, since I am using code I’ve written there in projects already, whereas Deeplearning.ai code is mainly about learning the theory until their more application-driven courses are released.

Portability

Portability of Fast.ai

Assuming you have access to a machine with a GPU, this code is super portable: you will be developing it all locally or on your own cloud box, and you will be building modular solutions for real deep learning problems, within Jupyter notebooks, that you continue to build on throughout the course.

Portability of Deeplearning.ai

Deeplearning.ai really pales in comparison to Fast.ai here, for several reasons:

  1. They wiped my homework notebook clean without prior notice when I stopped paying the monthly fee.
  2. Since all your work is in their cloud, you need to explicitly download each Jupyter notebook of your homework before it is wiped.
  3. I don’t find the code that applicable/extendable to the projects I’ve been doing in industry/academia.

Winner: Fast.ai, hands down, with reusable, extendable code, provided you have a machine to run it on to begin with.

Accessibility

Accessibility of Fast.ai

I would say Fast.ai is super accessible as far as teaching style and framework choice go (Keras, and now PyTorch, which they will use in the future, are a lot more accessible than TensorFlow). As long as you can afford to rent an AWS GPU box and figure out how to run their environment installation script on it, or you already have a deep learning rig (I do), it is a very accessible introduction.

Accessibility of Deeplearning.ai

One way Deeplearning.ai makes itself super accessible is by hosting cloud Jupyter notebooks for you to do all your work in. This made the class pretty frictionless to start on, and is why I started it first.

By pretty frictionless, I do mean there is still some friction. One of my big turnoffs with this class is that they ghost the auditing option. By ghost I mean that when several of my peers and I first tried to take this class, the site did not show auditing as an option, but when we came back to the site through a search engine, the auditing option appeared. I believe this will inhibit many people who want to take the class, but can’t afford the $50 a month cost, from taking it.

Also, at this point I am not convinced that learning TensorFlow is the best way to learn deep learning. The syntax is highly nuanced and takes time to grasp, time that one could spend learning more about the concepts.

Winner: Fast.ai if you can access a GPU machine, Deeplearning.ai if you cannot.

Conclusion

I ultimately think this is a trick question, even though Fast.ai did win on the personal scale I chose for evaluating the courses, and I’ve spoken to several people who regret not starting with Fast.ai. I think some combination of both courses, Fast.ai for applications and Deeplearning.ai for theory, is the optimal approach.