
AAAI 2018 Student Paper Award Given for Paper Co-authored by Conitzer and Freedman

February 3, 2018

(Pictured, from left: Kilian Weinberger, AAAI 2018 program co-chair; Vince Conitzer; Rachel Freedman; and Sheila McIlraith, AAAI 2018 program co-chair.)

Two patients need a kidney, but only one donor is available: who receives the organ? The question may seem grim, yet it is one that medical professionals face often. In recent years, several countries have begun using algorithms developed by AI experts to match patients and donors. That led Rachel Freedman, who graduated from Duke in 2017, to ask whether patient characteristics, such as age or overall health, could be built into a matching algorithm so that its decisions reflect human values and the process improves.

Freedman’s work, supported by several faculty members at Duke and a professor at the University of Maryland, recently won the Outstanding Student Paper Honorable Mention award at the 2018 Association for the Advancement of Artificial Intelligence (AAAI) conference.

“I'm deeply, deeply honored to win this award and thankful to all of my advisors and co-authors, who put so much hard work into this project, into making it work,” Freedman says. “The coauthors span computer science, psychology, economics, social science and philosophy, so I think that the fact that this work was chosen bodes well for the future of artificial intelligence research, which is itself inherently interdisciplinary.”

CS professor Vincent Conitzer, a coauthor of the paper, notes that the award is usually given to a graduate student, so the fact that Freedman won it for work she did as an undergraduate makes it extra special.

Freedman says her interest in incorporating human values into automated decision-making stemmed from her fascination with a subfield of AI called AI safety. AI safety work seeks to ensure that AI is developed in a safe, controllable, and transparent way and that it makes decisions aligned with our values, even when those decisions are independent and unexpected. Kidney exchanges are a real-life instance of this situation, she says: algorithms are already being used to make decisions of great ethical importance, determining which patients are allocated a kidney, a decision that can affect who lives and who dies. The starting point for the project was an algorithm that University of Maryland computer scientist John Dickerson, a coauthor of the new study, helped to develop for the United Network for Organ Sharing’s nationwide kidney exchange; the algorithm can be run on a whole pool of patients to form optimal exchanges.

The new work builds on that, Conitzer says, explaining that the team quantified human values and incorporated them into a new algorithm. First, the team had to decide which values to include, but, to avoid bias, the researchers didn’t want to use values they had chosen themselves. So they turned to Mechanical Turk, Amazon’s crowdsourcing marketplace, and asked respondents which characteristics would be good to use for prioritizing patients awaiting an organ donation. Race was one obvious feature that respondents did not consider appropriate to use, but the team did home in on three agreed-upon attributes: age, drinking habits, and history with cancer. Freedman and colleagues then asked another set of M-Turkers to choose who should receive a kidney among hypothetical pairs of patients: for example, one who was old but in good health compared with a younger person who drank frequently. The respondents favored the younger, more frequent drinker.
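To make the step from pairwise choices to numbers concrete, here is a minimal sketch of one common way to turn "who should get the kidney" answers into a score per patient profile: a Bradley-Terry model fitted with a simple iterative update. The article does not say which method the team used, so the model choice, the function name `bradley_terry`, the profile labels, and the response counts below are all assumptions made up for illustration.

```python
from collections import defaultdict


def bradley_terry(comparisons, iterations=200):
    """Estimate a positive score per profile from (winner, loser) pairs.

    Bradley-Terry model: P(i is chosen over j) = s_i / (s_i + s_j).
    Scores are fitted with the classic iterative (MM) update and
    normalized to sum to 1.
    """
    wins = defaultdict(int)         # times each profile was chosen
    pair_counts = defaultdict(int)  # comparisons per unordered pair
    profiles = set()
    for winner, loser in comparisons:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        profiles.update((winner, loser))

    scores = {p: 1.0 for p in profiles}
    for _ in range(iterations):
        updated = {}
        for p in profiles:
            denom = 0.0
            for pair, n in pair_counts.items():
                if p in pair and len(pair) == 2:
                    (other,) = pair - {p}
                    denom += n / (scores[p] + scores[other])
            updated[p] = wins[p] / denom if denom > 0 else scores[p]
        total = sum(updated.values())
        scores = {p: s / total for p, s in updated.items()}
    return scores


# Hypothetical survey answers (invented counts, not data from the study).
responses = (
    [("young_drinker", "old_healthy")] * 7
    + [("old_healthy", "young_drinker")] * 3
    + [("old_healthy", "old_with_cancer")] * 8
    + [("old_with_cancer", "old_healthy")] * 2
    + [("young_drinker", "old_with_cancer")] * 9
    + [("old_with_cancer", "young_drinker")] * 1
)

scores = bradley_terry(responses)
for profile, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{profile}: {score:.3f}")
```

With these invented counts, the younger drinker ends up with the highest score and the older patient with a cancer history the lowest, which mirrors the direction of the preference the article reports.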

The team then used the data to build an algorithm that, given a pool of patients, decides which subset receives a kidney, weighting each characteristic according to the M-Turkers’ answers. The algorithm prioritized younger, occasional drinkers who hadn’t had cancer, while older, sicker patients would often remain unmatched. (Panels of experts currently make similar decisions.) Still, after running simulations on how the algorithm would work, “it had a really big effect,” Freedman says. “Patients that were under-demanded, which is to say their blood types, or other factors, made them really difficult to match, are drastically impacted by our adapted algorithm.”
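As a rough illustration of how per-patient weights can change which patients are matched, the sketch below picks a set of two-way swaps between incompatible patient-donor pairs so that the total weight of matched patients is as large as possible. It is a toy brute-force search, not the team's algorithm or the one used by the national exchange; real exchanges may also use longer cycles and donor chains. The function name `best_swaps`, the pair labels, the weights, and the compatibility data are all hypothetical.

```python
from itertools import combinations


def best_swaps(weights, compatible):
    """Choose disjoint two-way swaps maximizing total matched-patient weight.

    weights:    dict mapping a patient-donor pair id to its patient's weight
    compatible: set of (x, y) meaning the donor in pair x can give to the
                patient in pair y
    Returns (total_weight, list_of_swaps).
    """
    # A swap is possible only if each donor is compatible with the other patient.
    swaps = [
        (a, b)
        for a, b in combinations(sorted(weights), 2)
        if (a, b) in compatible and (b, a) in compatible
    ]

    def search(index, used):
        if index == len(swaps):
            return 0.0, []
        # Option 1: leave this swap out.
        best_value, best_plan = search(index + 1, used)
        # Option 2: take it, if neither pair is already in another swap.
        a, b = swaps[index]
        if a not in used and b not in used:
            value, plan = search(index + 1, used | {a, b})
            value += weights[a] + weights[b]
            if value > best_value:
                best_value, best_plan = value, [(a, b)] + plan
        return best_value, best_plan

    return search(0, frozenset())


# Hypothetical pool: pair B can swap with either A or C, but not both.
weights = {"A": 1.0, "B": 0.6, "C": 0.3}
compatible = {("A", "B"), ("B", "A"), ("B", "C"), ("C", "B")}

total, plan = best_swaps(weights, compatible)
print(total, plan)
```

In this toy pool both options match two patients, so a purely count-based rule would be indifferent. The weights break the tie in favor of the swap that includes the higher-weight patient A, leaving C unmatched.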

The algorithm is not quite ready for implementation on a nationwide scale. To improve it, Freedman says, the patient profiles need to be more detailed, and more stakeholders, including doctors and others involved in the exchange, need a say in how the characteristics are weighted. There’s more work to come.