A team of researchers and scientists from universities across the globe has come together to develop an algorithm aimed at tackling human trafficking. The algorithm is called InfoShield, and it has the potential to scan hundreds of thousands of online advertisements for escorts to identify victims of human trafficking.
Currently being studied for its effectiveness, it could prove an important tool when it comes to addressing what has become an increasingly disturbing practice impacting young girls and women the world over.
“Given a million escort advertisements, how can we spot near-duplicates? Such micro-clusters of ads are usually signals of human trafficking. How can we summarize them, visually, to convince law enforcement to act? Can we build a general tool that works for different languages? Spotting micro-clusters of near-duplicate documents is useful in multiple, additional settings, including spam-bot detection in Twitter ads, plagiarism, and more,” the study lays out.
“While INFOSHIELD is general, our main motivation is near duplicate detection and summarization in escort advertisements. Human trafficking (HT) is a dangerous societal problem which is difficult to tackle. It is estimated that there are 24.9 million people trapped in forced labor, 55% of which are women and girls accounting for 99% of victims in the commercial sex industry,” it adds.
The algorithm scans large swathes of online ads and flags clusters of near-identical ads as a potential signal that human trafficking may be involved. The approach rests on the team’s finding that one person usually controls the accounts of four to six victims at a time.
“By looking for small clusters of ads that contain similar phrasing, rather than analyzing standalone ads, we’re finding the groups of ads that are most likely to be organized activity, which is a strong signal of HT (Human Trafficking),” the study adds.
“Our algorithm can put the millions of advertisements together and highlight the common parts. If they have a lot of things in common, it’s not guaranteed, but it’s highly likely that it is something suspicious,” explains co-author of the study, Christos Faloutsos.
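To make the idea of grouping similar ads and highlighting what they share more concrete, here is a minimal Python sketch. It is not the InfoShield method itself, which the study describes in full. It swaps in a much simpler textbook technique: word shingles compared with Jaccard similarity. The function names, the 0.3 threshold and the sample ads are all made up for illustration.

```python
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def micro_clusters(ads, threshold=0.3, k=3):
    """Group ads whose shingle sets overlap by at least `threshold`.

    Compares every pair (quadratic time), so this is a demo, not a scalable tool.
    Returns only clusters with two or more ads, as lists of ad indices.
    """
    sets = [shingles(ad, k) for ad in ads]
    parent = list(range(len(ads)))  # union-find forest, one node per ad

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(ads)), 2):
        if jaccard(sets[i], sets[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(ads)):
        groups.setdefault(find(i), []).append(i)
    return sorted(g for g in groups.values() if len(g) > 1)


def common_words(ads, cluster):
    """Words present in every ad of the cluster, in the order of the first ad."""
    shared = set.intersection(*(set(ads[i].lower().split()) for i in cluster))
    return [w for w in ads[cluster[0]].lower().split() if w in shared]


# Invented sample ads: three share a template, one is unrelated.
ads = [
    "new in town available tonight call 555-0101 ask for amy",
    "new in town available tonight call 555-0102 ask for bea",
    "new in town available tonight call 555-0103 ask for cat",
    "vintage bicycle for sale barely used pick up downtown",
]

for cluster in micro_clusters(ads):
    print(cluster, " ".join(common_words(ads, cluster)))
```

On this toy input the three templated ads fall into one cluster and the unrelated ad is left out. The shared words that remain (everything except the phone number and name) are the kind of common part an investigator would want highlighted.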
The potential for the algorithm is substantial: it can comb through four million documents in roughly eight hours on a standard notebook. This is work that would take a team of investigators weeks or months to complete with a far lower level of accuracy, which is why it could be a vital tool for law enforcement moving forward.
“Our experiments on real data show that INFOSHIELD correctly identifies Twitter bots with an F1 score over 90% and detects human-trafficking ads with 84% precision,” the study highlights.
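For readers unfamiliar with these metrics, the short Python sketch below shows how precision and F1 are computed from counts of correct and incorrect detections. The counts are invented purely to illustrate the formulas and are not taken from the study.

```python
def precision(tp, fp):
    """Share of flagged items that were truly positive."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    """Share of truly positive items that were flagged."""
    return tp / (tp + fn) if tp + fn else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if p + r else 0.0


# Hypothetical counts: 84 correct flags, 16 false alarms, 21 missed cases.
tp, fp, fn = 84, 16, 21
print(f"precision={precision(tp, fp):.2f}")
print(f"recall={recall(tp, fn):.2f}")
print(f"F1={f1_score(tp, fp, fn):.2f}")
```

A precision of 84% means that, of every 100 ads the tool flags, about 84 are genuine cases. F1 also takes into account how many genuine cases were missed.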
It remains to be seen whether InfoShield will make its way into the field, but given its early success, it definitely should.
To read the report on the algorithm, its design and methodologies, head here.