Detecting Hate Speech in the Balkans



Religious radicalization and right-wing violent extremism both pose a challenge to the security of several countries in the Balkans. While there have been numerous studies on the offline drivers of radicalization, the online space is still poorly understood. Measures to regulate extremist content have so far been haphazard and have failed to address the wide range of radicalism online.

Our client sought to understand the role of the internet in facilitating radicalization and violent extremism, with the goal of establishing how online radicalization compares with other known pathways to radicalization in the country.

The ultimate goal was to use this improved understanding to drive policies and governance measures that address the growing problem of religious and right-wing radicalization in the country.


Existing evidence showed that extremist content in our country of study was being disseminated through a wide range of platforms, mainly Facebook and YouTube, as well as different traditional media outlets.

Using our media monitoring engine MEDIA, we set up a series of monitors to find posts containing terms that our clients and partners had identified as extremist language on public social media platforms such as Twitter, Reddit and YouTube comments. We also looked into the comments sections of media outlets specific to the country of study: our Data Engineering team developed scripts to extract those comments within the limitations of each platform.
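To illustrate the idea of a term-based monitor, here is a minimal sketch in Python. It is not the MEDIA engine's actual implementation, which is not described here: the `Post` type, the function names and the placeholder terms are all hypothetical, and the matching strategy (case-insensitive, whole-term, with combining diacritics removed) is an assumption.

```python
import re
import unicodedata
from dataclasses import dataclass


@dataclass
class Post:
    # Hypothetical record for a collected post or comment.
    platform: str
    text: str


def normalize(text):
    """Casefold and drop combining diacritics (for example, 'š' becomes 's')."""
    decomposed = unicodedata.normalize("NFKD", text.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_monitor(terms):
    """Compile one pattern that matches any monitored term as a whole unit."""
    if not terms:
        raise ValueError("at least one term is required")
    # Longest terms first, so multi-word phrases win over shorter overlaps.
    escaped = sorted((re.escape(normalize(t)) for t in terms), key=len, reverse=True)
    return re.compile(r"(?<!\w)(?:" + "|".join(escaped) + r")(?!\w)")


def flag_posts(posts, terms):
    """Return (post, matched_terms) pairs for posts containing a monitored term."""
    pattern = build_monitor(terms)
    flagged = []
    for post in posts:
        hits = sorted(set(pattern.findall(normalize(post.text))))
        if hits:
            flagged.append((post, hits))
    return flagged


# Placeholder terms only; a real list would come from thematic experts.
example_terms = ["placeholder_term", "šample phrase"]
example_posts = [
    Post("reddit", "This has a Placeholder_Term inside."),
    Post("youtube", "Nothing of interest here."),
]
flagged_examples = flag_posts(example_posts, example_terms)
```

A term list on its own flags many harmless posts, for example news reports quoting the language, so in this sketch the matches are only candidates for further review.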

With the data gathered, and with help from thematic experts, our Data Science team built an algorithm to detect whether a given comment, in any of the languages studied, constituted hate speech.
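The specific algorithm is not detailed here. As a rough illustration of what a comment-level detector can look like, the sketch below trains a small multinomial naive Bayes classifier on labeled comments. The choice of naive Bayes, the class and function names, and the placeholder training data are all assumptions made for this example, not the team's actual method.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into casefolded word tokens."""
    return re.findall(r"\w+", text.casefold())


class NaiveBayesClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        # Vocabulary is every token seen in any class during training.
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            # Start from the log prior of the class.
            score = math.log(self.doc_counts[c] / total_docs)
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore tokens never seen in training
                count = self.word_counts[c][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            scores[c] = score
        return max(scores, key=scores.get)


# Placeholder training data; real labels would come from thematic experts.
train_texts = [
    "slur_a slur_b go away",
    "they are all slur_a",
    "great match last night",
    "the weather is nice today",
]
train_labels = ["hate", "hate", "ok", "ok"]
model = NaiveBayesClassifier().fit(train_texts, train_labels)
prediction = model.predict("slur_b again")
```

In practice a detector for several languages would need far more labeled data per language, and its output would need to be evaluated against expert judgment before being relied on.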