How Eyeo Worked With Students To Innovate On AI

Eyeo is putting the “learning” in machine learning.

Last year, it partnered with a university student initiative in Munich to have students find new approaches to using AI in its online ad filtering.

As the parent company of AdBlock and Adblock Plus, two of the most downloaded ad-blocking services on the market, eyeo has invested a lot of time and money into developing its own deep-learning methods for detecting and filtering ads.

But eyeo had yet to explore how generative modeling could be used to curate and maintain ad filtering lists, which is typically a tedious, error-prone, largely manual process that is difficult to scale.

Eyeo’s extensions rely heavily on filter lists, such as EasyList, that are maintained by a community of mostly unpaid volunteers and contain tens of thousands of rules for blocking and hiding certain network requests and the HTML code that’s responsible for ad rendering.
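To make the two kinds of rules concrete, here is a minimal sketch that sorts lines of an Adblock Plus-style filter list into rough categories. The sample rules are made up for illustration and are not taken from EasyList, and the `classify_filter` helper is hypothetical. It handles only the most common rule forms and ignores the many options real filter syntax supports.

```python
def classify_filter(rule: str) -> str:
    """Roughly categorise one line of an Adblock Plus-style filter list."""
    rule = rule.strip()
    if not rule or rule.startswith("!"):
        return "comment"           # blank lines and "!" comments carry no rule
    if rule.startswith("@@"):
        return "exception"         # allowlist rule that overrides a block
    if "#@#" in rule:
        return "hiding-exception"  # turns off an element-hiding rule
    if "##" in rule:
        return "element-hiding"    # CSS selector for page elements to hide
    return "network-block"         # URL pattern for requests to block


# Illustrative rules only (assumed examples, not real list entries).
sample_rules = [
    "! comment line",
    "||ads.example.com^",
    "example.com##.ad-banner",
    "@@||example.com/ads.js",
]

for rule in sample_rules:
    print(f"{classify_filter(rule):>16}  {rule}")
```

A network rule stops the ad request from being made at all, while an element-hiding rule leaves the request alone and hides the matching HTML on the page.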

The ability to automatically update these lists with accuracy would be a major benefit, said Dr. Humera Noor Minhas, director of engineering at eyeo.

Automation station

Last year, TUM.ai, a student initiative within the Technical University of Munich, put out a call for proposals for its Moonshot competition, a hackathon-style project for students interested in pursuing a career in AI.

In response, eyeo submitted a proposal to have TUM.ai students tackle the challenge of developing an AI model that automatically classifies URLs and generates filter rules based on website content.

Eyeo’s proposal stood out due to “its real-world impact,” according to TUM.ai student advisor Thomas Wölkhart.

It also gave students the opportunity to acquire hands-on technical expertise and get direct feedback from eyeo engineers, he said.

See it to believe it

The six-week challenge ran from March to May 2023 and demonstrated how using generative AI could make ad filtering less onerous.

For instance, one student used OpenAI’s APIs to tweak the GPT-3 Ada model, producing an effective model that created automated webpage-based filter rules.

The student’s successful model opened eyeo’s eyes to generative AI’s real-world applications.

What students lack in experience, they often make up for in curiosity and fresh perspective.

But the challenge’s biggest boon for eyeo wasn’t actually the model itself, according to Minhas. It was the data set eyeo created for students to work with during the challenge, which contained more than 1 million ads.

To generate the data set for the challenge, eyeo first looked at open-source lists, like Alexa and Similarweb, to find the most-visited domains in different regions, according to Minhas. Then it gathered information from the HTML of those sites, including the headings, organic content and, crucially for eyeo’s purposes, the ads on the page.

Challenge participants worked with two data sets to develop their AI models: a training set and a test set. The training set included 100,000 pairs of webpages and corresponding filter rules, while the test set held 36,000 webpages.

By dividing data into two sets, students could check how well their models generalized to data they hadn’t seen before, Wölkhart said, and the test set allowed eyeo to judge how closely the student-produced models matched the filters from the training set.

But the process also simulated how companies test model performance before rolling out a model to users – an important step, Wölkhart said, since “one wrong filter rule could break a whole website for thousands of users.”
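The article does not say how eyeo scored submissions, so the following is only a sketch of one plausible approach: comparing each predicted rule on held-out pages against a reference rule and counting exact matches. The `exact_match_rate` function, the page URLs and the rules are all hypothetical.

```python
def exact_match_rate(predicted: dict[str, str], reference: dict[str, str]) -> float:
    """Share of held-out pages whose predicted rule equals the reference rule."""
    if not reference:
        raise ValueError("reference set is empty")
    hits = sum(
        1 for page, rule in reference.items() if predicted.get(page) == rule
    )
    return hits / len(reference)


# Hypothetical held-out pages with the rule a maintainer would have written.
reference = {
    "https://news.example/": "news.example##.ad-slot",
    "https://shop.example/": "||ads.shop.example^",
}

# Hypothetical model output for the same pages.
predicted = {
    "https://news.example/": "news.example##.ad-slot",
    "https://shop.example/": "shop.example##.promo",
}

score = exact_match_rate(predicted, reference)
print(f"exact-match rate on held-out pages: {score:.2f}")
```

Exact matching is a strict yardstick, since two differently written rules can block the same ad. A production check would more likely test what each rule actually blocks on the page.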

Eyes ahead

Following the Moonshot project, eyeo went on to use the data set it generated for the challenge to train another model involving URL parameters.

Eyeo tested using URL parameters to detect if a certain portion of a page has ads or not, Minhas said.

URL parameters are query strings that append additional information to a basic web address to pass to a server, track ad campaigns and customize user experiences on a website.
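As a quick illustration, Python’s standard library can split a web address into its base and its parameters. The URL below is invented, and treating the `utm_` prefix as a campaign-tracking marker is a common convention assumed here, not something eyeo described.

```python
from urllib.parse import parse_qs, urlsplit

# Invented example URL with three query parameters.
url = "https://shop.example/shoes?utm_source=newsletter&utm_campaign=spring&color=red"

parts = urlsplit(url)
params = parse_qs(parts.query)  # maps each parameter name to a list of values

# Assumption: parameters prefixed "utm_" are used for campaign tracking.
tracking = {name: values for name, values in params.items() if name.startswith("utm_")}

print("base address:", f"{parts.scheme}://{parts.netloc}{parts.path}")
print("all parameters:", params)
print("tracking parameters:", tracking)
```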

This new URL-based classifier model achieved precision – a machine learning metric that measures the share of a model’s positive predictions that are correct – comparable to the solutions eyeo already has in production.
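Precision is conventionally computed as true positives divided by all positive predictions. The sketch below uses made-up labels (1 for "ad", 0 for "not an ad") to show how it can differ from plain accuracy. None of the numbers come from eyeo.

```python
def precision(y_true: list[int], y_pred: list[int]) -> float:
    """True positives divided by all positive predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p and t)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p and not t)
    if tp + fp == 0:
        return 0.0  # the model never predicted "ad"
    return tp / (tp + fp)


def accuracy(y_true: list[int], y_pred: list[int]) -> float:
    """Share of all predictions, positive or negative, that are correct."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Made-up labels for eight page regions: 1 = ad, 0 = not an ad.
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 1, 0, 1, 0, 0, 0]

print(f"precision: {precision(y_true, y_pred):.2f}")
print(f"accuracy:  {accuracy(y_true, y_pred):.2f}")
```

For an ad filter, high precision means few false alarms, which matters because a wrongly flagged element is legitimate content that gets hidden from the user.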

The company is working on a proof of concept and has identified use cases for the new model, such as automatically detecting buggy ad filters. “It has generalization ability that allows us to detect ads in unknown domains and provide a better user experience in ad filtering,” Minhas said.

Next up, eyeo has been researching how to detect ads served in AI chatbot experiences and filter them out if a user doesn’t want to see them anymore.

“The data set was a huge step for us to take our research forward,” Minhas said.
