Q: Please tell us how the use of AI-based tools has improved oncology literature searches?
AI-based tools were initially explored in 3 main areas: literature searches, data extraction, and data synthesis or insight generation. The field of oncology has experienced rapid growth, with approximately 82,000 clinical trials published between 1963 and 2010 and an additional 42,000 in the following decade. In 2019 alone, an average of 190 articles focused on oncology clinical trials were published each month. In a study covering 1999 to 2020, literature and clinical trial searches were run across various platforms, including PubMed, Dimensions, Embase, and Google. The research aimed to explore the application of AI in facilitating literature searches, developing clinical guidelines, and extracting clinical trial data in the field of uro-oncology.
Q: Tell us about your recent paper in BJUI on how AI publication searches performed better than traditional ones?
To demonstrate the potential of AI in improving therapeutic decision-making and personalizing treatment regimens, a search was conducted on the Dimensions platform for keywords related to 'prostate cancer', which identified 76 publications, 48 of which were included in the analysis. By utilizing natural language processing and machine learning algorithms, AI enabled a quicker, more personalized, more efficient, and more targeted search than traditional methods. (See “Application of artificial intelligence to overcome clinical information overload in urological cancer,” in the journal BJUI, by Drs. Cora Sternberg, Andrea Sboner, and colleagues).
Q: How have AI-based tools been useful in helping clinicians working on diseases like prostate cancer to navigate the medical literature?
Developing optimal therapeutic sequencing strategies in prostate cancer (PC) presents a significant challenge, one that could be addressed with the help of AI-driven tools for analyzing the medical literature. INSIDE PC was developed by customizing PubMedBERT (PubMed Bidirectional Encoder Representations from Transformers). Publications were ranked and aggregated for relevance using data visualization and analytics. PC experts then assigned normalized discounted cumulative gain (nDCG) scores to the publications returned by INSIDE PC and PubMed, reflecting their ranking and relevance.
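For context on the evaluation, nDCG is a standard ranking metric that rewards placing the most relevant publications near the top of a result list and normalizes against the ideal ordering of those same items. A minimal illustrative sketch in Python (the relevance ratings below are hypothetical examples, not values from the study):

import math

def dcg(relevances):
    # Discounted cumulative gain: each graded relevance score is
    # discounted by the log of its 1-based rank position.
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    # Normalize by the DCG of the ideal (best-first) ordering,
    # so a perfect ranking scores 1.0.
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Hypothetical expert ratings (0 = irrelevant, 3 = highly relevant) for the
# top five publications in the order a search engine returned them.
print(round(ndcg([3, 2, 3, 0, 1]), 3))  # roughly 0.97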
A model was developed and published that could help physicians understand the best sequencing of PC therapies. Blinded PC experts compared the performance of INSIDE PC and PubMed using test questions related to therapy sequencing in metastatic castration-resistant prostate cancer (mCRPC), and INSIDE PC performed competitively. This study highlights the potential of AI-based tools like INSIDE PC to assist clinicians in navigating therapeutic sequencing decisions in PC. The success of the INSIDE initiative underscores the effectiveness of systems designed for semantic Q&A analysis and suggests that similar frameworks could be applied to other medical questions and tumor types. While the development of INSIDE PC represents a step toward standardizing the validation of AI-based literature extraction tools, further research and refinement are necessary in this area. This work also emphasizes the importance of greater clarity and a common language in reporting therapeutic sequencing strategies in the PC literature.
Q: Can you tell us about the “Artificial INtelligence to Support Informed DEcision-making (INSIDE) for Improved Literature Analysis in Oncology” study?
There is currently no agreement on performance standards for evaluating generative AI systems that produce medical responses. This study in European Urology Focus, of which I’m a co-author, aimed to assess the capabilities of ChatGPT in addressing medical inquiries related to prostate cancer. An online survey was conducted worldwide from April to June 2023, involving over 700 medical oncologists and urologists who specialize in treating patients with prostate cancer. The participants were not informed that the survey was designed to evaluate AI technology. In the first part of the study, responses to 9 questions were independently sourced from medical writers, medical websites, and ChatGPT-4.0, which generated responses based on publicly available information. Physicians were randomly presented with responses without knowing whether they were generated by AI or written by medical writers. Evaluation criteria and overall preferences were recorded. The second part of the study evaluated AI-generated responses to 5 complex questions with nuanced answers found in the medical literature, using a 5-point Likert scale. Statistical significance was determined at P < .05.
Q: What were the advantages of the AI-generated responses?
In the first part of the study, the 602 respondents favored the clarity of AI-generated responses over those curated by medical writers for 7 of the 9 questions (P < .05). Despite this preference for AI-generated responses when blinded to the source, respondents still considered medical websites to be a more credible source (52%–67%) than ChatGPT (14%). In the second part of the study, 98 respondents also rated medical websites as more credible than ChatGPT but nonetheless rated the AI-generated responses highly on all evaluation criteria, even for questions requiring nuanced answers drawn from the medical literature.
# # #