
UI representing the conversation flow while utilising explainability features for a chatbot to be used in clinical settings
AI Interpretability in Chat and Data workflows
Enabling AI interpretability for chat-based technologies in clinical and data science pipelines. (2021)
STATUS
The explainability modules developed during this project are being used to provide added explanations in an app called DocTalk.
DURATION
- 3 months
- May, 2021 - July, 2021
TEAM (x1)
- Sumedh Supe, UX and ML Research Engineer
CONTRIBUTIONS: UX and ML Engineer
Enabled critical technical practice in both clinical settings and ML pipelines by creating explainability tools.
From a human-centered standpoint, I had to identify where explanations would have the most impact, and what sort of explanations would benefit the users.
As the ML Engineer, I worked with the data science pipelines and built the functional software that generates the explanations, powered by the LIME algorithm.
FINAL OUTCOME

Explainability modules created for clinical chatbots (left) and for data science pipelines (right), utilising LLMs
The explanations provide a quick look into how the black-box models perform, leading to a 40% increase in asking repeated questions, and enabling critical technical practice both in clinical practice and in the ML pipeline.
1. Explainability Module for DocTalk (RASA): An explainability module for DocTalk that helps people analyse the text and explains the output. It was built as a Python library for RASA.
2. Explainability Module for Sentiment Analysis Pipeline (Python): A generalized LIME explainability module created to explain data science pipelines. This one was tested on explaining how sentiment analysis was used to create clusters.
PROBLEM

Most AI models are black boxes and provide no clear way to understand how they work
Explainability methods like LIME can help explain certain instances of the model and provide more transparency
Most AI models today generate results in a way that cannot be explained. These models are black boxes, and it is very difficult to know how they operate.
To enable transparency, and with it critical technical practice, understanding how an AI model produced a result becomes critically important.
By understanding which specific parts of a particular instance contributed to the output, we can provide more transparency about how the result was produced without disclosing the entire inner workings of the AI model.
THE USERS

The users: residents undergoing training (the first user, left) and data scientists (the second user, right)
AI and ML models are making their way into many applications. Creating transparency to boost confidence in these black boxes is key to having them adopted at critical junctures like healthcare and data science.
PROCESS AND OUTCOMES
After going through the practitioners' requirements and the RASA chatbot they were using, via interviews and collected data, I decided to use an explanation algorithm, LIME, to help analyse the results of the pipeline and understand the decisions made by the ML model.
LIME was chosen for how simple it makes explanations: it approximates any response with a local linear model and explains it using weights. We did not need to explain the entire model; explaining individual responses was sufficient.
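The core idea of LIME can be sketched in plain Python: sample random perturbations of an input (words kept or dropped), query the black-box model on each perturbation, and fit a linear surrogate whose per-word weights serve as the explanation. This is an illustrative toy, not the actual DocTalk module; the `toy_model`, its word scores, and the sample count are hypothetical stand-ins, and the random seed is fixed for reproducibility:

```python
import random

def lime_like_weights(words, predict, n_samples=200, seed=0):
    """Fit a linear surrogate over random word-presence masks (LIME-style)."""
    rng = random.Random(seed)
    k = len(words)
    Z, y = [], []  # design matrix rows [z_1..z_k, 1] and model outputs
    for _ in range(n_samples):
        mask = [rng.randint(0, 1) for _ in range(k)]
        kept = [w for w, m in zip(words, mask) if m]
        Z.append([float(m) for m in mask] + [1.0])  # bias column appended
        y.append(predict(" ".join(kept)))
    # ordinary least squares via the normal equations (Z^T Z) beta = Z^T y
    d = k + 1
    A = [[sum(Z[n][i] * Z[n][j] for n in range(n_samples)) for j in range(d)]
         for i in range(d)]
    b = [sum(Z[n][i] * y[n] for n in range(n_samples)) for i in range(d)]
    # Gaussian elimination with partial pivoting
    for col in range(d):
        piv = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, d):
            f = A[r][col] / A[col][col]
            for c in range(col, d):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    beta = [0.0] * d
    for r in range(d - 1, -1, -1):
        beta[r] = (b[r] - sum(A[r][c] * beta[c] for c in range(r + 1, d))) / A[r][r]
    return dict(zip(words, beta[:k]))  # drop the bias term

# toy black-box "sentiment model": additive word scores, so the surrogate is exact
SCORES = {"good": 1.0, "great": 2.0, "bad": -1.5}

def toy_model(text):
    return sum(SCORES.get(w, 0.0) for w in text.split())

weights = lime_like_weights(["the", "movie", "was", "really", "good"], toy_model)
```

Because the toy model is exactly additive in word presence, the surrogate recovers a weight of 1.0 for "good" and roughly zero for the neutral words; on a real classifier, the weights are only a local approximation around the given input.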
Instead of designing use cases for every scenario, a probabilistic design approach was taken and the abilities of the explainability module were studied. Knowing that the weights would be the only output, an API endpoint was created in RASA to return them whenever the user requested.
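An endpoint of this kind, one that returns the explanation weights on request, can be sketched in a framework-agnostic way with Python's standard library. This is not the actual RASA integration; the `/explain` route, the `explain()` stub, and its placeholder weights are assumptions for illustration only:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse
from urllib.request import urlopen

def explain(text):
    """Hypothetical stand-in for the LIME explainer: returns per-word weights."""
    return {w: round(len(w) * 0.1, 2) for w in text.split()}  # placeholder values

class ExplainHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        url = urlparse(self.path)
        if url.path != "/explain":
            self.send_error(404)
            return
        text = parse_qs(url.query).get("text", [""])[0]
        body = json.dumps({"text": text, "weights": explain(text)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

# bind an ephemeral port and serve in the background
server = HTTPServer(("127.0.0.1", 0), ExplainHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# demo request against the local endpoint
reply = json.loads(urlopen(f"http://127.0.0.1:{port}/explain?text=chest+pain").read())
```

In the real module, `explain()` would run LIME against the deployed model; serving only the weights keeps the endpoint lightweight and the model itself undisclosed.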
Unnecessary human interactions as a result of confusion were reduced by 80%.
In the case of the Reflexive ML Pipeline, I analyzed how explainable AI can help in understanding the black-box models used for processing textual data, and how explanations can help the user inculcate reflexive practices. I created a LIME explanations module for the pipeline that extracts the words with the highest impact on the classification output, enabling reflection on and understanding of these black-box models.
The Reflexive ML pipeline enhanced the data scientists' ability to engage in critical technical practice with the data by 40%.
ADDITIONAL INFORMATION
DocTalk
DocTalk ("DocTalk: Dialog meets Chatbot") is a joint research project of Charité: Universitätsmedizin Berlin, FernUniversität in Hagen and Freie Universität Berlin, funded by the Federal Ministry of Education and Research (BMBF). The overarching goal of the research project is to analyze and improve digital communication and learning paths in clinical environments in order to meet the increased requirements due to interdisciplinary collaboration and intertwining professional process flows, both technically and didactically. To achieve this goal, a proactive communication platform is being implemented at Charité, including a conversational agent (chatbot) that is supposed to support reflective learning processes of residents in the clinical environment. The research group Human-Centered Computing designs and implements the DocTalk chatbot in collaboration with the project partners and the residents as end users of the proactive communication platform.
Reflexive ML
The project "Reflexive ML" is an interdisciplinary research collaboration in the context of the Cluster of Excellence "Matters of Activity" (ExC:MoA, funded by DFG) between Michael Tebbe (Computer Scientist at the HCC Research Group and PhD candidate at ExC:MoA), Dr. Simon David Hirsbrunner (Postdoc in Science and Technology Studies at the HCC Research Group) and Prof. Dr. Claudia Müller-Birn (Head of the HCC Research Group and Principal Investigator at ExC:MoA). The goal of the project is to study how methods from Natural Language Processing (NLP) can be applied in such a way that they support the hermeneutic practices of interpretivist scholars by increasing their accessibility through reconceptualization. To this end, an ML pipeline has been implemented that can be applied to large written natural-language datasets (e.g. YouTube comments). The NLP pipeline recontextualizes the data by semantically associating it with large datasets in a pre-trained model, thus making visible similarities that have not yet been considered in the analysis (e.g. by grouping together references to conspiracy narratives). Additional features in the Reflexive ML pipeline provide opportunities to inspect and interpret the data in its new context, allowing a different view by providing material for reflection on the phenomena the data represents.
Human-Centered Computing, Freie Universität Berlin
The project was a part of the Human-Centered Computing Group at Freie Universität Berlin. The group's mission is to embrace a critical practice in the design of socially responsible technologies. The group's current focus is on machine learning technologies related to privacy, reflection and interpretability, with a focus on interactive and conversational user interfaces. The group is headed by Prof. Claudia Müller-Birn.
REFLECTIONS
The utilisation of open-source technologies like RASA enabled me to build an entire explainability pipeline on their framework. This now enables any model's outputs to be understood using LIME.
I realized that even though we had set out to generate explanations instantly, the lack of compute made outputs take even longer to produce; with better hardware and algorithms, I think this would soon be a non-issue.
This was my first step into HCI research, and I learnt a lot about having useful discussions to get to the desired result. Moreover, this was all before ChatGPT, so I feel ChatGPT-based explanations could make it even better.
This project would not have been possible without the opportunity from Prof. Claudia Müller-Birn and my mentors, Michael Tebbe and Diane Linke. Thanks to the other members of the HCC lab too for being accommodating throughout my stay.