AI
MAI
Medical AI crosses a threshold that is no longer just theoretical. OpenAI now claims that its ChatGPT model outperforms human doctors in certain clinical tasks, with supporting figures. With ChatGPT for Clinicians, the company tackles directly the heart of the healthcare system, between documentation, research, and decision support. Behind this announcement, a promise of productivity… but also questions about reliability and the methodology used. This advancement marks a turning point in the accelerated integration of artificial intelligence in medicine.
OpenAI unveiled a specialized version of its chatbot, called “ChatGPT for Clinicians”, intended for doctors, nurse practitioners, medical assistants, and pharmacists. The tool aims to handle time-consuming tasks such as documentation, medical research, or care consultations.
According to the company, its GPT-5.4 model scored 59.0 on the internal benchmark HealthBench Professional, versus 43.7 for human doctors, and this “even with unlimited time and internet access”. OpenAI claims that its system “has outperformed human doctors in certain clinical tasks”, positioning its AI as an already competitive player in the medical field.
These elements detail the architecture and validation of the system, which is intended to be directly operational in a demanding medical environment.
The announcement comes in a context of rapid transformation of the medical sector under the effect of AI. According to certain on-chain data, OpenAI highlights that 72 % of doctors now use artificial intelligence tools, compared to 48 % a year earlier, while the use of ChatGPT in this field has “more than doubled over the past year.” The company emphasizes the role of its tool as a solution to the sector’s structural challenges, notably administrative overload and staff shortage. It also specifies that conversations will not be used to train its models and that devices compliant with HIPAA requirements are offered to protect sensitive data.
A key point tempers these performances: the HealthBench Professional benchmark was designed by OpenAI itself. This detail, acknowledged in the announcement, introduces an area of uncertainty regarding the objectivity of the results. The company adopts a cautious stance by reminding that its tool is “designed to accompany clinical tasks” and not to replace medical judgment. This distinction remains central as AI establishes itself in critical fields where human responsibility remains essential.
In the longer term, this advancement could redefine the organization of health systems. AI could become an optimization lever while raising issues of trust, regulation, and independent validation of performances. Between promises of efficiency and the need for safeguards, OpenAI’s entry into medicine opens a phase where technology already influences clinical practice standards.