Git Product home page Git Product logo

adblock-rust's Introduction

Ad Block engine in Rust

docs.rs crates.io npmjs.com Build Status

Native Rust module for Adblock Plus syntax (e.g. EasyList, EasyPrivacy) filter parsing and matching.

It uses a tokenisation approach for quickly reducing the potentially matching rule search space against a URL.

The algorithm is inspired by, and closely follows the algorithm of uBlock Origin and Cliqz.

Somewhat graphical explanation of the algorithm:

Ad Block Algorithm

Demo

Demo use in Rust:

use adblock::engine::Engine;
use adblock::lists::{FilterFormat, FilterSet};

fn main() {
    let rules = vec![
        String::from("-advertisement-icon."),
        String::from("-advertisement-management/"),
        String::from("-advertisement."),
        String::from("-advertisement/script."),
    ];

    let mut filter_set = FilterSet::new(true);
    filter_set.add_filters(&rules, FilterFormat::Standard);

    let blocker = Engine::from_filter_set(&filter_set, true);
    let blocker_result = blocker.check_network_urls("http://example.com/-advertisement-icon.", "http://example.com/helloworld", "image");

    println!("Blocker result: {:?}", blocker_result);
}

Node.js module demo

Note the Node.js module has overheads inherent to boundary crossing between JS and native code.

const AdBlockClient = require('adblock-rs');
let el_rules = fs.readFileSync('./data/easylist.to/easylist/easylist.txt', { encoding: 'utf-8' }).split('\n');
let ubo_unbreak_rules = fs.readFileSync('./data/uBlockOrigin/unbreak.txt', { encoding: 'utf-8' }).split('\n');
let rules = el_rules.concat(ubo_unbreak_rules);
let resources = AdBlockClient.uBlockResources('uBlockOrigin/src/web_accessible_resources', 'uBlockOrigin/src/js/redirect-engine.js', 'uBlockOrigin/assets/resources/scriptlets.js');

const filterSet = new AdBlockClient.FilterSet(true);
filterSet.addFilters(rules);
const client = new AdBlockClient.Engine(filterSet, true);
client.updateResources(resources);

const serializedArrayBuffer = client.serialize(); // Serialize the engine to an ArrayBuffer

console.log(`Engine size: ${(serializedArrayBuffer.byteLength / 1024 / 1024).toFixed(2)} MB`);

console.log("Matching:", client.check("http://example.com/-advertisement-icon.", "http://example.com/helloworld", "image"))
// Match with full debuging info
console.log("Matching:", client.check("http://example.com/-advertisement-icon.", "http://example.com/helloworld", "image", true))
// No, but still with debugging info
console.log("Matching:", client.check("https://github.githubassets.com/assets/frameworks-64831a3d.js", "https://github.com/AndriusA", "script", true))
// Example that inlcludes a redirect response
console.log("Matching:", client.check("https://bbci.co.uk/test/analytics.js", "https://bbc.co.uk", "script", true))

Optional features

CSS validation during rule parsing

When parsing cosmetic filter rules, it's possible to include a built-in implementation of CSS validation (through the selectors and cssparser crates) by enabling the css-validation feature. This will cause adblock-rust to reject cosmetic filter rules with invalid CSS syntax.

Content blocking format translation

Enabling the content-blocking feature gives adblock-rust support for conversion of standard ABP-style rules into Apple's content-blocking format, which can be exported for use on iOS and macOS platforms.

External domain resolution

By default, adblock-rust ships with a built-in domain resolution implementation (through the addr crate) that will generally suffice for standalone use-cases. For more advanced use-cases, disabling the embedded-domain-resolver feature will allow adblock-rust to use an external domain resolution implementation instead. This is extremely useful to reduce binary bloat and improve consistency when embedding adblock-rust within a browser.

Parsing resources from uBlock Origin's formats

adblock-rust uses uBlock Origin-compatible resources for scriptlet injection and redirect rules. The resource-assembler feature allows adblock-rust to parse these resources directly from the file formats used by the uBlock Origin repository.

adblock-rust's People

Contributors

andriusa avatar antonok-edm avatar arnidagur avatar bbondy avatar bsclifton avatar dependabot[bot] avatar hawkeye116477 avatar jumde avatar mihaiplesa avatar pes10k avatar ryanbr avatar snyderp avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.