AI can identify intimate partner violence years before people disclose it, but is that safe?

ZDNET's key takeaways

A new AI identifies abuse victims from medical history.
It flagged about 80% of cases before patients disclosed, some more than five years early.
Past initiatives point to data safety concerns that need addressing.

More than one in three women in the United States will experience intimate partner violence (IPV) at some point in their lives, according to the Centers for Disease Control and Prevention. Many will arrive at hospitals or clinics with injuries, chronic pain, anxiety, and depression. Yet even as they receive care for their immediate symptoms, it's often years before these patients can come forward about what they're going through.

Researchers have repeatedly warned that women don't feel safe asking for help from their healthcare providers for reasons including fear of the abuser, financial dependence, immigration status, and stigma. The US Preventive Services Task Force recommends routine IPV screening for all women of childbearing age. Yet the CDC estimates that current tools, which rely on self-reporting, capture only a fraction of affected patients.

"IPV often remains invisible within healthcare systems despite repeated patient interactions over many years," said Dr. Bharti Khurana, founding director of the Trauma Imaging Research and Innovation Center at Harvard Medical School and an emergency radiologist at Brigham and Women's Hospital.

Working with a team of researchers at Brigham and Women's Hospital, MIT, and Harvard Medical School, Khurana published a study in March 2026 proposing a way to recognize IPV faster. It's an AI model that scans for patients at elevated risk of partner violence using data already stored in their health records. But before the tool can be deployed in hospitals for clinical use, its creators will have to address some remaining concerns around safety and privacy.

Closing a vital information gap

The data needed to identify risk already exists, according to researchers. But questions persist as to whether AI can read it reliably and safely enough to be useful. This is the problem that the team hopes to solve, using a system called AIRS.

AIRS stands for Automated IPV Risk Support. The system draws on two distinct types of data, combining them to achieve higher accuracy than earlier work in the field.

The first is structured medical record data, the kind organized in rows and columns. This includes diagnoses, prescribed medications, radiology test locations and timing, emergency department visit frequency, vital signs, and a zip-code-level social deprivation score as a proxy for socioeconomic stress.

The second is unstructured clinical notes, the free-text records that radiologists, social workers, emergency physicians, and other clinicians write during and after patient encounters. The team processed these using Clinical-Longformer, a clinical language model trained on medical text, which converts each note into a numerical form the model can analyze.

Each data stream feeds into its own separate classifier. The two are then combined using a framework called HAIM (Holistic AI in Medicine), which fuses outputs at the prediction stage while keeping the data streams independent. If a hospital's clinical notes are incomplete, or if much of the structured data is missing, the other stream still contributes. Because record keeping varies considerably across clinical institutions, this design keeps the model usable when one source is thin.
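
To make the idea of fusing at the prediction stage concrete, here is a minimal sketch, assuming each classifier emits a probability between 0 and 1 and that a missing stream is represented as `None`. The function name `fuse_late`, the weight `w_structured`, and the use of a simple weighted average are all illustrative assumptions; the article does not describe how HAIM or AIRS actually combine the two outputs.

```python
from typing import Optional


def fuse_late(
    p_structured: Optional[float],
    p_notes: Optional[float],
    w_structured: float = 0.5,
) -> Optional[float]:
    """Combine two per-stream risk probabilities at the prediction stage.

    Hypothetical helper: a weighted average stands in for whatever
    combination rule the real system uses.
    """
    # Neither stream produced a prediction, so there is nothing to fuse.
    if p_structured is None and p_notes is None:
        return None
    # One stream is missing: fall back to the stream that is available.
    if p_structured is None:
        return p_notes
    if p_notes is None:
        return p_structured
    # Both streams present: blend them with a fixed (assumed) weight.
    return w_structured * p_structured + (1.0 - w_structured) * p_notes
```

With both streams present, `fuse_late(0.8, 0.4)` blends the two scores; with the notes stream missing, `fuse_late(0.8, None)` simply returns the structured-data score of 0.8. Keeping the classifiers separate until this last step is what lets one stream carry the prediction when the other is unavailable.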

On the primary test cohort, the fusion model reached an AUC of 0.88. AUC, or area under the receiver operating characteristic curve, is a standard measure of how well a model distinguishes between cases and non-cases. A score of 1.0 is perfect; 0.5 is no better than chance. Across all three validation cohorts, including patients from a second hospital in the network and patients who never sought help from any domestic violence program but received a confirmed diagnosis for IPV, the model held an AUC above 0.8.
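
One way to read an AUC figure: it is the probability that a randomly chosen case receives a higher score than a randomly chosen non-case, with ties counted as half. The sketch below computes that directly by comparing every case with every non-case. The function name `pairwise_auc` and the toy labels and scores are made up for illustration and have nothing to do with the study's data.

```python
from typing import Sequence


def pairwise_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (case, non-case) pairs ranked correctly.

    labels: 1 for a case, 0 for a non-case.
    scores: the model's risk score for each record, in the same order.
    """
    case_scores = [s for y, s in zip(labels, scores) if y == 1]
    other_scores = [s for y, s in zip(labels, scores) if y == 0]
    if not case_scores or not other_scores:
        raise ValueError("need at least one case and one non-case")

    wins = 0.0
    for c in case_scores:
        for o in other_scores:
            if c > o:
                wins += 1.0  # case ranked above non-case
            elif c == o:
                wins += 0.5  # a tie counts as half
    return wins / (len(case_scores) * len(other_scores))


# Toy data (invented): two cases, two non-cases.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = pairwise_auc(toy_labels, toy_scores)  # 3 of 4 pairs correct: 0.75
```

In the toy example, one case (0.4) is scored below one non-case (0.5), so three of the four pairs are ordered correctly and the AUC is 0.75. A model that gives every record the same score gets 0.5, which is the "no better than chance" baseline.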

The fusion model flagged 80.6% of IPV cases before the patient self-reported, with an average lead time of 3.68 years before disclosure. Some patients were identified from records dating back more than five years before they self-disclosed.

IPV tech over the years

This isn't the first time that someone has tried using AI to identify victims of intimate partner violence. AI-based tools for assessing violence against women have been deployed or tested in several countries for more than two decades.

Spain's Interior Ministry developed and launched an algorithmic risk assessment tool called VioGén in 2007. VioGén was used by police and, in some cases, by judges issuing protective orders. However, at least 247 women have been killed by their partners since 2007 after being assessed by VioGén, according to reports from multiple news organizations. A review of 98 of those homicides found that 55 of the women had been scored as negligible or low risk. Reports also showed that many police officers followed the algorithm's assessment without applying any independent judgment.

In the UK, most police forces use a questionnaire-based tool called DASH (Domestic Abuse, Stalking, and Honour-Based Violence) to triage victim risk. A 2022 study by researchers at the University of Manchester and the University of Seville found that DASH was not identifying the most vulnerable victims and that machine learning models trained on police data significantly outperformed it. However, the paper also noted that many ethical concerns would need to be addressed before such a tool could be deployed for practical law enforcement use.

Meanwhile, clinical researchers have been working on the problem from a different angle. A 2009 study in The BMJ showed that longitudinal hospital records could predict future domestic abuse diagnoses. More recently, researchers published an NLP algorithm that screened more than one million ER visits for IPV-related language in clinical notes, achieving 99.5% precision in identifying IPV visits. In Australia, researchers at the University of New South Wales applied deep learning to nearly 500,000 police narratives from the NSW Police Force to predict repeat instances of family violence.

Given these ethical and practical challenges, however, most of these efforts have not expanded beyond pilot projects or localized deployments. In the meantime, self-reporting remains the standard for both legal and medical intervention.

"AIRS represents a more proactive approach by helping healthcare providers identify patients who may otherwise remain unrecognized," explained Khurana, though she stressed that its output is by no means definitive.

Khurana described AIRS as a silent clinical support tool because it surfaces a risk score to the clinician, not the patient. "AIRS is not designed to diagnose IPV," she clarified. "A positive flag should never be interpreted as confirmation of abuse." The model is in active pilot testing across several clinical settings at Mass General Brigham.

Alongside AIRS, the team developed structured "Caring Conversations" guidance and patient-facing "Empower Guides" to help clinicians respond in a trauma-informed way when the tool raises a flag. A positive score is intended to prompt a supportive conversation, not an automatic intervention.

Ethical fault lines

While AIRS addresses a lot of prior concerns, it may not meet the standards for real-world clinical deployment just yet. Alexia Maddox, senior lecturer at La Trobe University and co-chair of an IEEE Industry Connections Activity on AI for addressing family violence, said there are several concerns that need attention first.

Most importantly, Maddox noted that the paper defines IPV as violence or aggression by current or former partners. While this might be accurate in a strictly clinical setting, Maddox explained that the field has largely moved past this framing.

Australia's primary risk assessment framework on family violence, MARAM, now incorporates coercive control as a risk factor along with physical violence. Coercive control refers to the sustained pattern of emotional, financial, and technology-facilitated abuse through which a perpetrator dominates a partner's life. Examples include tracking a partner's phone calls and geolocation data without their consent and accessing their bank accounts without permission.

The distinction has direct consequences for what the model can detect. AIRS reads signals of physical violence, such as fracture patterns, emergency presentations, and chronic pain. But "coercive control, financial abuse, and technology-facilitated abuse leave almost no trace in a radiology report or an emergency department note," Maddox said. "These prevalent forms of IPV as currently understood are largely invisible to this model."

Therefore, patients experiencing coercive control without physical injury are unlikely to appear in data drawn from hospital records and domestic violence program enrollment.

Then there's the consent question, which Maddox said goes deeper than the study's IRB approval. The model was built on six years of medical records. But when deployed, it will generate risk scores for patients who may not be informed that they are being assessed. This is a significant challenge that the study already acknowledges, citing research showing that intrusion and loss of autonomy during disclosure hinder help-seeking.

For certain populations, these risks are higher. Women on temporary visas face abusers who use visa cancellation as a tool of control. A database linking women to IPV risk scores, tied to their full medical and socioeconomic records, could carry serious implications given the current political climate in the US, Maddox said.

Maddox also invoked VioGén as evidence of the distance between a model's performance metrics and its real-world outcomes.

"The history of transformative technology suggests that the people closest to a technical breakthrough are rarely best positioned to see its consequences in the world. That is precisely why governance standards need to exist before deployment, not emerge from the wreckage after," she explained to ZDNET.

At the same time, Maddox was careful to distinguish these concerns from a criticism of the research itself. "The limitations I've described are not really the fault of the authors, who are working squarely within their research lane and domain expertise," she said. "They have done what they set out to do, and done it well."

The more pressing question, she noted, is about the gap between what a study demonstrates and what kind of real-world applications it's eventually used to justify.

Looking to the future

Khurana's team is already working on expanding the model's scope to address some of these gaps. AIRS was trained on data from female patients, but queer, trans, and male patients are also at risk of IPV.

"IPV affects individuals across all gender identities; expanding this work to better support transgender and non-binary patients is an important future direction," Khurana said.

The team has also been studying injury patterns and healthcare use among transgender patients and working to characterize how partner violence presents in male survivors, a population that Khurana described as significantly understudied in IPV research literature.

In a parallel project with NIH funding, the team is also working to identify long-term and slow-growing health risks associated with IPV, including gastrointestinal disorders, neurological conditions, mental health trajectories, and substance use disorders.

Meanwhile, the IEEE initiative that Maddox co-chairs is focused on the governance standards that would need to be in place before tools like AIRS could be responsibly deployed at scale. It brings together researchers, technologists, community organizations, practitioners, and regulators to develop the systems and processes needed to deploy AI safely for these at-risk populations.

Maddox said she believes that AI should be part of the response to family violence, but whether it will be of actual service to victims of abuse depends on how it is designed, deployed, governed, and regulated by the authorities responsible. At the same time, she admitted that the model she would describe as meeting that bar does not fully exist yet, which is why building these standards will be important to future research.

To put it in practical terms, Maddox said that AI systems built to address family violence need to be "sensitive to coercive control in its full range of expressions, across cultural contexts, with community knowledge built in from the start," and that they should be "governed by frameworks that center the safety, autonomy, and dignity of the women it is designed to serve." She also reiterated that community knowledge is not simply an ethical add-on to technology like this; it's what separates a system that works from one that doesn't.