
newhouse / url-tracking-stripper


An open-source Chrome Extension that will remove the tracking parameters from URLs to keep them short and cleaner for sharing, bookmarking, etc. It will also skip any known redirects and take you straight to the target URL instead of passing you through an intermediate URL.

License: MIT License

JavaScript 92.29% HTML 7.58% Makefile 0.13%

url-tracking-stripper's People

Contributors

dependabot[bot], jarettmillard, jnozsc, menzow, mrwacky42, newhouse, nikcorg, nikolay, rayou, wumpus


url-tracking-stripper's Issues

Allow copying URL with stripped redirects

Hi @newhouse, thanks for a great extension!

On FB, I often want to copy a link and paste it somewhere, but don't want the ugly FB tracking redirect there. Could you please, either:

  1. add something like Copy clean link to a link context menu,
  2. or strip known redirects from any copied links as suggested in #63.

Another option would be to process all links on page, maybe in a special mode only, like in https://github.com/nokeya/direct-links-out, but option 2 above seems like a simpler and faster choice imo.

More tokens to strip

Hi. I'm a search engine guy, and I'm very interested in a well-tested list of strippable CGI args to reduce the work my crawler has to do. I tried to build a list algorithmically: I took the top 1000 websites from an old Alexa list, plus a few hosts I care about, sampled their URLs as crawled by CommonCrawl, and then counted which CGI args appeared on many of the hosts.

The biggest was &utm_source, appearing on 474 of the 1,000 hosts. I dropped everything that appeared on fewer than 5 hosts. So, in theory, this is a somewhat representative sample of the most popular ones... although CommonCrawl isn't totally representative of the web, of course.

Here is a list with examples of the ones that aren't currently in your configuration:

# more utm_ -- I think people use utm_ as a prefix for their own purposes and/or Google doesn't document all of them

# https://www.mozilla.org/en-US/firefox/new/?f=30&ref=producthunt&utm_expid=71153379-28.SNKFJ4VqRziIW1TLqjhpAw.1&utm_referrer=https%3A%2F%2Fwww.google.com%2F

utm_expid (15 hosts)
utm_referrer (12 hosts)

# https://www.etsy.com/?utm_source=google&utm_medium=cpc&utm_term=etsy&utm_campaign=search_fr_fr-fr-src-pure-brand-exact-st_exact_etsy&gclid=EAIaIQobChMIk6Duvp6n1QIVjantCh1f-whGEAAYASAAEgLsx_D_BwE&gclsrc=aw.ds

gclsrc 22 hosts

# https://www.google.fr/chrome/browser/features.html?brand=CHBD&gclid=CN6B2tjusdECFVAQ0wodfmcISw&dclid=CM6vjtnusdECFcSjUQodyg4B2Q

dclid 21 hosts {similar to gclid?}

normally cookies

# Adobe ColdFusion
# https://techcrunch.com/?CFID=8494701&CFTOKEN=56974155

&CFID= 25 hosts, 70 total instances
&CFTOKEN= 25 hosts, 70 total instances

# PHP
# http://instagram.com/p/BUPpEcIDFjT/?PHPSESSID=dbj4v5fl2c6sd8f8986aprqpf3

&PHPSESSID= 5 hosts, 89 total instances

and here are the popular ones that you don't have at all:

# Web Trends

# http://www.nature.com/collections/dtfkmdgglg?WT.mc_id=SFB_NA_1017_FattyLiverGraphic
# https://www.microsoft.com/en-us/store/b/accessories?tid=vpOCJmmq&cid=5250&pcrid=3050714533&pkw=makerbot%20replicator%202%20desktop%203d%20printer&pmt=e&WT.srch=1&WT.mc_id=pointitsem_Microsoft+US_bing_5+-+Accessories&WT.term=make
# https://www.chase.com/ccp/index.jsp?pg_name=ccpmapp/shared/assets/page/repayment_examples&WT.ac=st_ctr_student&jp_aid=st_ctr_student&WT.mc_id=st_ctr_student_repayment&jp_mep=st_ctr_student_repayment&WT.pn_sku=repayment_plans&memberid=studentcenter
# https://www.intuit.com/company/press-room/press-releases/2013/QuickenPullsBacktheCoversonLoveandMoney/?WT.qs_osrc=TST-164886110

&WT.mc_id= 24 hosts, 2530 total instances
&WT.srch= 14 hosts, 422 total instances
&WT.ac= 8 hosts, 4094 total instances
&WT.qs_osrc= 5 hosts, 20 total instances
&WT.pn_sku

# Oracle Eloqua

# http://www.cray.com/company/policies-and-practices/privacy-policy?elqTrackId=2e97d2d4f56e41eb9498379bab9753db&elqaid=584&elqat=2
# http://www.blackboard.com/Platforms/Collaborate/Resources/Webinars-and-Demos.aspx?elq=a318adfc3e7e40de83e0883a1d6760ba&elqCampaignId=329

&elqTrackId= 12 hosts, 191 total instances
&elqaid= 12 hosts, 189 total instances
&elqat= 12 hosts, 189 total instances
&elqCampaignId= 7 hosts, 138 total instances
&elq= 7 hosts, 111 total instances

# comScore Digital Analytix:

# http://www.dailymail.co.uk/sport/rugbyunion/article-5082539/France-23-28-New-Zealand-Blacks-French.html?ITO=1490&ns_mchannel=rss&ns_campaign=1490
# http://www.hotstar.com/tv/cineplay/13080?ns_mchannel=Article&ns_source=Scroll&ns_campaign=Cineplay&ns_linkname=CineplayShowPage&ns_fee=0

&ns_campaign= 6 hosts, 97 total instances
&ns_mchannel= 5 hosts, 92 total instances
&ns_source=
&ns_linkname=
&ns_fee=

# suspicious but probably too generic

# https://www.cray.com/?leadsource=website&srcdes=seagate&campaign=7010b0000018kLW
&campaign= 15 hosts, 9072 total instances

# https://wordpress.com/create/?utm_source=bing&utm_campaign=WordPress-Generic-Exact-US-GP&utm_medium=cpc&keyword=wordpress&creative=9925335912&campaignid=128065278&adgroupid=3099786316&matchtype=e&device=c&network=o
&campaignid= 6 hosts, 74 total instances
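
For anyone who wants to try the list above outside the extension, here is a minimal sketch of name-based stripping. The helper name `stripParams` and the constant `EXTRA_TRACKING_PARAMS` are made up for illustration and are not part of the extension; the parameter names are copied from this issue, and the "too generic" ones (campaign, campaignid) and the session IDs are left out on purpose.

```javascript
// Hypothetical list: names copied from the issue above, not from the extension.
const EXTRA_TRACKING_PARAMS = new Set([
  "utm_expid", "utm_referrer", "gclsrc", "dclid",
  "WT.mc_id", "WT.srch", "WT.ac", "WT.qs_osrc", "WT.pn_sku",
  "elqTrackId", "elqaid", "elqat", "elqCampaignId", "elq",
  "ns_campaign", "ns_mchannel", "ns_source", "ns_linkname", "ns_fee",
]);

// Remove every query parameter whose exact (case-sensitive) name is in `names`.
function stripParams(rawUrl, names = EXTRA_TRACKING_PARAMS) {
  const url = new URL(rawUrl);
  // Copy the keys first so deleting does not disturb the iteration.
  for (const key of [...url.searchParams.keys()]) {
    if (names.has(key)) url.searchParams.delete(key);
  }
  return url.toString();
}

console.log(stripParams(
  "http://www.cray.com/company/policies-and-practices/privacy-policy?elqTrackId=2e97d2d4f56e41eb9498379bab9753db&elqaid=584&elqat=2"
));
// prints "http://www.cray.com/company/policies-and-practices/privacy-policy"
```

One caveat: once a parameter is deleted, URLSearchParams re-serializes the remaining query, so the encoding of surviving parameters can change slightly (for example %20 becoming +).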

installed ext SOLELY to remove this:

[At the moment, of course.]
Is it likely to get "?src=[*]" removed if the queries are tracking things?
Example:
http://remedydaily.com/2016/05/23/osteoporosis-risk-factors-symptoms-and-treatment/?src=bottomxpromo&ro=3&et=syn&eid=53157&pid=52170&syn=bcs&t=mxp
only requires
http://remedydaily.com/2016/05/23/osteoporosis-risk-factors-symptoms-and-treatment
to load, but the rest is eating up my bookmark space and my time.
It isn't useful for me to get targeted content, since I open EVERY link in a new tab.

Add support for ad.doubleclick.net track link

Example URLs:

- http://ad.doubleclick.net/clk;274204538;98873843;y?http://www.food.com/recipe/cuban-pork-adobo-salad-501729
- http://ad.doubleclick.net/clk;272664759;101583304;i?http://www.porkbeinspired.com/RecipeDetail/2770/Cuban_Pork_Adobo_Salad.aspx

As you can see, the doubleclick tracking link doesn't put the target URL in a parameter, so the targetParam method can't handle this case.

I'd like to propose using a regex to parse target URLs, while also keeping the current "param" parse method:

Add targetRegex:

  {
    name: 'Doubleclick',
    targetRegex: /\?(.*)$/,
//  targetParam: "url", 
    patterns: [
      `${SCHEMA}ad.doubleclick.net${PATH}?`
    ],
    types: ['main_frame']
  }

OR

Add parseMethod to make it explicit:

  {
    name: 'Doubleclick',
    parseMethod: "regex", // Accepts: "param" | "regex" 
    target: /\?(.*)$/, // Accept types: string | regex
    patterns: [
      `${SCHEMA}ad.doubleclick.net${PATH}?`
    ],
    types: ['main_frame']
  }

OR

Use target: (one property to rule them all)

  {
    name: 'Doubleclick',
    target: /\?(.*)$/, // Accept types: string | regex
    patterns: [
      `${SCHEMA}ad.doubleclick.net${PATH}?`
    ],
    types: ['main_frame']
  }

However, this implementation requires a major revamp, so I would like to hear some feedback.
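
To make the third option concrete, here is a rough sketch of how a single `target` property could be resolved at runtime. It is only an illustration under my own assumptions: `extractTarget`, `doubleclickRule` and `paramRule` are hypothetical names, and the extension's real rule handling may look quite different.

```javascript
// Hypothetical rule objects, loosely following the "one property" proposal.
const doubleclickRule = { name: "Doubleclick", target: /\?(.*)$/ };
const paramRule = { name: "Example param rule", target: "url" };

// Return the redirect target for rawUrl, or null when the rule finds none.
function extractTarget(rawUrl, rule) {
  if (rule.target instanceof RegExp) {
    // Regex targets: the first capture group is taken as the target URL.
    const match = rule.target.exec(rawUrl);
    return match ? match[1] : null;
  }
  // String targets name a query parameter; searchParams.get() decodes it.
  return new URL(rawUrl).searchParams.get(rule.target);
}

console.log(extractTarget(
  "http://ad.doubleclick.net/clk;274204538;98873843;y?http://www.food.com/recipe/cuban-pork-adobo-salad-501729",
  doubleclickRule
));
// prints "http://www.food.com/recipe/cuban-pork-adobo-salad-501729"
```

Dispatching on the type of `target` keeps existing string-based rules working unchanged, which is the main appeal of this option over an explicit parseMethod.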

Rewrite URLs when copied to clipboard

Sites like Facebook and Google won't let you copy external URLs. Instead they put a verbose tracking URL in your clipboard. Normally I go through the hassle of opening the tracking URL and copying the destination from my address bar before sharing it.

To simplify this process I extended your plugin to listen for any copy events and rewrite urls in the clipboard data. If this isn't considered out of scope I could create a PR for it.
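
The idea can be sketched roughly as follows. This is not the code from my fork, just a minimal illustration: the helper names (`unwrapRedirect`, `cleanText`) and the small host-to-parameter table are assumptions for the example, and a real version would reuse the extension's own redirect rules.

```javascript
// Illustrative table only: redirect host -> query parameter holding the target.
const REDIRECT_PARAMS = new Map([
  ["www.google.com", "q"],
  ["l.facebook.com", "u"],
]);

// Return the wrapped target URL if rawUrl is a known redirect, else rawUrl.
function unwrapRedirect(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return rawUrl; // not a parseable URL, leave it alone
  }
  const param = REDIRECT_PARAMS.get(url.hostname);
  const target = param ? url.searchParams.get(param) : null;
  return target || rawUrl;
}

// Rewrite every http(s) URL found in a piece of copied text.
function cleanText(text) {
  return text.replace(/https?:\/\/\S+/g, (match) => unwrapRedirect(match));
}

// Only register the listener where a DOM exists (e.g. in a content script).
if (typeof document !== "undefined") {
  document.addEventListener("copy", (event) => {
    const selected = String(document.getSelection());
    const cleaned = cleanText(selected);
    if (cleaned !== selected) {
      event.clipboardData.setData("text/plain", cleaned);
      event.preventDefault(); // keep our rewritten text instead of the default
    }
  });
}
```

The pure `cleanText` function is kept separate from the event wiring so it can be tested without a browser.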

Can you remove affiliate link cloaking?

Hi

WordPress has an add-on for referral links. The result of its work is hidden affiliate links.

I found a service which can decode such URLs:
https://prozavr.ru/tools/rasshifrovka-korotkih-ssilok.php

So, if we put the link https://www.kobzarev.com/r/smmbox into the service above, we get the following:
https://www.kobzarev.com/r/smmbox/
https://smmbox.com/c/349271
https://smmbox.com/?utm_source=partner&utm_campaign=partner&utm_medium=partner

But the page is only https://smmbox.com/

Can you add this functionality to exclude these stupid referral links?

What is your browser?

  • Chrome

What is your operating system?

  • Windows

Description (please include examples/screenshots where applicable)


More tokens to strip.


_hsmi
_hsenc
ref
fref

fb_ref
fb_action_ids
fb_action_types
fb_source
hc_location
action_object_map
action_type_map
action_ref_map

Regards :octocat:

Should strip Redfin `riftinfo` parameter

Lovely extension, thanks!

After installing I'm still seeing this extremely long and ugly riftinfo parameter in urls on Redfin.com

Not sure what it is, possibly a token for some kind of Oculus Rift VR home tour?

Is this a parameter you could add to the list?

  • WITHOUT EXTENSION

https://www.redfin.com/WA/Coupeville/301-Kinney-St-NE-98239/home/16696480?utm_source=myredfin&utm_medium=email&utm_campaign=recommendations_update&riftinfo=ZXY9ZW1haWwmbD0yOTI3NzcwMSZwPWxpc3RpbmdfdXBkYXRlc19yZWNvbW1lbmRhdGlvbnMmYT1jbGljayZzPXJlY29tbWVuZGF0aW9ucyZ0PWFkZHJlc3MmZW1haWxfaWQ9MjkyNzc3MDFfMTU2NDE5MjEzMV82JnVwZGF0ZV90eXBlPTEmbGlscl9zY29yZT0wLjQ1MjEmbGlzdGluZ19pZD0xMTA4MTUyMDQmcHJvcGVydHlfaWQ9MTY2OTY0ODAmcG9zaXRpb25fbnVtYmVyPTg=

  • WITH CURRENT EXTENSION:

https://www.redfin.com/WA/Coupeville/301-Kinney-St-NE-98239/home/16696480?riftinfo=ZXY9ZW1haWwmbD0yOTI3NzcwMSZwPWxpc3RpbmdfdXBkYXRlc19yZWNvbW1lbmRhdGlvbnMmYT1jbGljayZzPXJlY29tbWVuZGF0aW9ucyZ0PWFkZHJlc3MmZW1haWxfaWQ9MjkyNzc3MDFfMTU2NDE5MjEzMV82JnVwZGF0ZV90eXBlPTEmbGlscl9zY29yZT0wLjQ1MjEmbGlzdGluZ19pZD0xMTA4MTUyMDQmcHJvcGVydHlfaWQ9MTY2OTY0ODAmcG9zaXRpb25fbnVtYmVyPTg=

  • PROPOSED

https://www.redfin.com/WA/Coupeville/301-Kinney-St-NE-98239/home/16696480
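
A side note on what the parameter holds: the value looks like Base64, so it can be decoded for inspection. The sketch below is only an illustration (`decodeBase64Param` is a made-up helper name), and to keep it short it uses just the first 12 characters of the value from the report.

```javascript
// Decode a Base64-encoded query parameter so its contents can be inspected.
function decodeBase64Param(rawUrl, name) {
  const value = new URL(rawUrl).searchParams.get(name);
  // atob() is a global in browsers and in recent Node.js versions.
  return value === null ? null : atob(value);
}

// Path taken from the report; the value is shortened to its first 12 characters.
const shortened =
  "https://www.redfin.com/WA/Coupeville/301-Kinney-St-NE-98239/home/16696480?riftinfo=ZXY9ZW1haWwm";

console.log(decodeBase64Param(shortened, "riftinfo")); // prints "ev=email&"
```

The decoded prefix suggests the value is an encoded list of key=value pairs about the email click rather than anything VR-related, which would make it a reasonable candidate for stripping.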

Move configuration out of source into a data file

Right now your configuration list of CGI args is expressed as source. This certainly gets the job done, but it might be a lot prettier to move the list into a data file. Then it would be easier to share the list among projects, like my crawler in Python (hidden agenda alert :-))

For a format, it could be YAML, or this is a simpler way:

# name
# example
prefix name
prefix name

so

# Adobe ColdFusion
# https://techcrunch.com/?CFID=8494701&CFTOKEN=56974155
CF ID
CF TOKEN

In this example 'CF' is what your code calls the ROOT.

This format also makes diffs more useful, in that each line has the full meaning. So if I see this line in a diff

+utm_ expid

I know at a glance the full name of the CGI arg that will be matched.
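
Reading such a file takes only a few lines. Here is a minimal sketch in JavaScript, assuming exactly the format described above (comment lines start with #, every other non-empty line is a prefix followed by a name, separated by whitespace). The function name `parseParamList` is hypothetical.

```javascript
// Parse the proposed "prefix name" data format into full parameter names.
function parseParamList(text) {
  const names = [];
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    // Skip blank lines and the "# name" / "# example" comment lines.
    if (trimmed === "" || trimmed.startsWith("#")) continue;
    const [prefix, ...rest] = trimmed.split(/\s+/);
    // The full name is the prefix (ROOT) joined directly to the remainder.
    names.push(prefix + rest.join(""));
  }
  return names;
}

const sample = [
  "# Adobe ColdFusion",
  "# https://techcrunch.com/?CFID=8494701&CFTOKEN=56974155",
  "CF ID",
  "CF TOKEN",
  "utm_ expid",
].join("\n");

console.log(parseParamList(sample)); // prints [ 'CFID', 'CFTOKEN', 'utm_expid' ]
```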

iOS Mobile app

I know an issue isn’t necessarily the best place to ask this question, but I’m wondering if you’ve seen any apps for iOS that perform this functionality, say, by seeing what’s on the clipboard and, if it’s a url, giving you the chance to copy the tracking-removed url to the clipboard instead.

I find that the majority of times I remove tracking identifiers by hand, it’s when I’m messaging a shared link to a friend, which usually happens on my phone. I am not an app developer, but I have considered learning Swift just to be able to write an app like this. If someone else has already beaten me to the punch, I’d be glad for it.

In your research trying to address the issue, have you seen anything like that?

Request to add "dclid" parameter

Furthermore, I've noticed that GA seems to be able to track my screen resolution when I click or hover over "Next" on the second (not the first) page of Google search results (&biw= &bih= &ei=)

Add Trackers

Strip googleadservices links?

Whenever I search for something on google I find some useful links in the ad-section.
And if you hover over a link it looks ok, but if you copy it you get a googleadservices link that will redirect to the one it showed in the statusbar.

You think you could fix the link? :)

Keep up the great work :)

Ove B-)

Remove affiliate/tracking links using patterns?

Links with redirects sometimes not indicating they were skipped

Additional redirs

//youtu.be/foo => https://www.youtube.com/watch?v=foo

Also, y2u.be works like youtu.be but is not run by Google (!). That can't be good.

https://redd.it/7tczf9 => https://www.reddit.com/tb/7tczf9

https://app.instapage.com/route/9475232/?url=www.nat.ai/careers => http://www.nat.ai/careers
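
The first two mappings are simple enough to express as a lookup table. A rough sketch, with made-up names (`SHORT_HOSTS`, `expandShortLink`) and only the youtu.be and redd.it targets listed above:

```javascript
// Short-link hosts and the long form listed in this issue for each of them.
const SHORT_HOSTS = new Map([
  ["youtu.be", (id) => `https://www.youtube.com/watch?v=${encodeURIComponent(id)}`],
  ["redd.it", (id) => `https://www.reddit.com/tb/${encodeURIComponent(id)}`],
]);

// Expand a known short link; anything else is returned unchanged.
function expandShortLink(rawUrl) {
  const url = new URL(rawUrl);
  const build = SHORT_HOSTS.get(url.hostname);
  const id = url.pathname.slice(1); // drop the leading "/"
  return build && id ? build(id) : rawUrl;
}

console.log(expandShortLink("https://redd.it/7tczf9"));
// prints "https://www.reddit.com/tb/7tczf9"
```

Note that new URL() needs an absolute URL, so a protocol-relative link like //youtu.be/foo would have to be resolved against the page URL first.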

Breaks support.steampowered.com Steam Support article links

With this extension enabled, links to specific support.steampowered.com Steam Support articles, e.g. https://support.steampowered.com/kb_article.php?ref=8625-wrah-9030 get redirected to the main Steam Support page at https://help.steampowered.com/en/ instead of to the expected article page.

This will not allow you to view the specific articles, probably because the ?ref=8625-wrah-9030 part gets removed. There should either be an exception for this or a way to allow users to configure exceptions for specific URLs.
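
A per-host exception list could look something like the sketch below. Everything here is hypothetical (the names `KEEP_PARAMS` and `stripWithExceptions`, and the shape of the table); it only illustrates the kind of exception being asked for.

```javascript
// Hypothetical exception table: host -> parameters that must never be stripped.
const KEEP_PARAMS = new Map([
  ["support.steampowered.com", new Set(["ref"])],
]);

// Strip the given tracker names, except those exempted for the URL's host.
function stripWithExceptions(rawUrl, trackers) {
  const url = new URL(rawUrl);
  const keep = KEEP_PARAMS.get(url.hostname) || new Set();
  for (const key of [...url.searchParams.keys()]) {
    if (trackers.has(key) && !keep.has(key)) url.searchParams.delete(key);
  }
  return url.toString();
}

const trackers = new Set(["ref"]);
console.log(stripWithExceptions(
  "https://support.steampowered.com/kb_article.php?ref=8625-wrah-9030",
  trackers
));
// prints "https://support.steampowered.com/kb_article.php?ref=8625-wrah-9030"
```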

add edge support

Please add Edge support, as the extension is not fully working with it.
A new extension in the Microsoft Store would be great, or an option to switch to "edge.tabs" instead of "chrome.tabs", and maybe change some other lines of code in the current extension.

go.trafficrouter.io, a popular affiliate redirect used by Skimlinks for Target.com, Lenovo.com, etc. broken by this extension

This is what happens with the extension if you click on, say, a Target.com url https://www.androidpolice.com/2019/11/28/target-unveils-black-friday-discounts-best-deals-start-november-27th/:

The link you clicked on requires cookies to be enabled in your browser. We apologize for any inconvenience.


Without the extension, the redirect works fine, and future redirects work fine too (so test in Incognito after you click through without the extension).

Example of where you end up with this dead end: http://go.trafficrouter.io/?res=nc&original=http%3A%2F%2Fgoto.target.com%2Fc%2F10078%2F81938%2F2092%3FsubId1%3D85009X1537243X7fb27be2aa63f0ce7ba48456a8d440d2%26sharedId%3Dandroidpolice.com%26u%3Dhttps%253A%252F%252Fwww.target.com%252Fp%252Famazon-echo-dot-3rd-generation%252F-%252FA-54148143%26level%3D4&dst=&brid=&dstsig=

What is it about the extension that breaks such links and can it be fixed?

Thanks.

mousing over link shows different URL to where clicking it goes

With the extension enabled, when mousing over a link, it shows the unstripped URL, yet clicking the link will navigate to the stripped version. I'm guessing the mouseover URL is unaffected because of the implementation only intercepting at click-time, not during page load / render:

The extension doesn't do any DOM parsing, but rather uses available APIs (webRequest.onBeforeRequest for navigation intercepts and contextMenu.create.onclick for right-clicks) to know about what the URL is.

I don't know how easy it would be to achieve, but I think it would definitely be less misleading if the mouseover URL was the exact URL which gets visited when the link is clicked. It's slightly scary to realise that a malware extension could hijack any innocent click and take the user somewhere they really don't want to go. So in the interests of keeping this benign extension as honest and transparent as possible, IMHO it would be worth fixing the mouseover, if that is technically feasible.
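
For context, the navigation-intercept approach quoted above can be sketched roughly like this. It is my own simplified illustration, not the extension's code: `removeUtm` and `handleRequest` are made-up names, and it assumes the blocking form of webRequest as available to Manifest V2 extensions. Because the rewrite happens when the request is made, the href in the page (and therefore the mouseover URL) is never touched.

```javascript
// Hypothetical cleaner: drop utm_* parameters, return the input if none exist.
function removeUtm(rawUrl) {
  const url = new URL(rawUrl);
  const tracked = [...url.searchParams.keys()].filter((k) => k.startsWith("utm_"));
  if (tracked.length === 0) return rawUrl;
  for (const key of tracked) url.searchParams.delete(key);
  return url.toString();
}

// Decide what to tell the browser about a request that is about to start.
function handleRequest(details, clean) {
  const cleaned = clean(details.url);
  // Returning { redirectUrl } sends the navigation to the cleaned URL instead.
  return cleaned !== details.url ? { redirectUrl: cleaned } : {};
}

// Only wire this up inside an extension that has the webRequest API.
if (typeof chrome !== "undefined" && chrome.webRequest) {
  chrome.webRequest.onBeforeRequest.addListener(
    (details) => handleRequest(details, removeUtm),
    { urls: ["<all_urls>"], types: ["main_frame"] },
    ["blocking"]
  );
}
```

Making the mouseover URL match would need a content script that rewrites link hrefs in the DOM, which is a different mechanism from the one sketched here.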

data-saferedirecturl defeats URL stripping

I see that Gmail are doing something really evil with hyperlinks within emails:

<a href="https://link.to.other.site" target="_blank" data-saferedirecturl="https://www.google.com/url?q=link.to.other.site&amp;source=gmail&amp;ust=1534942837200000&amp;usg=AFQjCNH3Pzb1Mq6zL847zzh6iqd4g1B3IA">here</a>

This is also described here:

I really don't want google tracking which links I click on. I'm sure Google would claim that they're doing the user a favour by hiding referrer data from the target web server, but only because they're stealing that data for themselves!

So I think it would be great if this project stripped these data-saferedirecturl links.

[Feature request] spm parameter

The spm parameter is used on websites operated by Alibaba and its related companies. Here is some documentation (in Chinese):

I'll summarize some of it in English:

  • spm stands for super position model, it's basically the same as the utm_ parameters used by Google Analytics.
  • For example the URL https://cn.aliyun.com/about?spm=5176.8142029.631162.about.702b614aLj4URY, it contains the spm parameter, and its value is 5176.8142029.631162.about.702b614aLj4URY.
    • spm parameter is formatted like this: spm=spmA.spmB.spmC.spmD.spmE
    • spmA identifies a website;
    • spmB identifies a page on that website;
    • spmC identifies a block on that page;
    • spmD further identifies a specific position in that block;
    • spmE is a random string generated with current time, which could distinguish visits of the same link from each other by time;

You can visit the Aliyun or AliExpress website to see the spm parameter in action. I really like this extension for its simplicity and its effectiveness at blocking utm and other tracking parameters, and I hope you will consider adding spm to that list too.
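
To illustrate the structure described above, here is a small sketch that splits the example value into its five parts. The function name `parseSpm` is made up, and the field names simply follow my summary (spmA to spmE).

```javascript
// Split an spm value into the five dot-separated parts described above.
function parseSpm(rawUrl) {
  const value = new URL(rawUrl).searchParams.get("spm");
  if (value === null) return null; // no spm parameter present
  const [spmA, spmB, spmC, spmD, spmE] = value.split(".");
  return { spmA, spmB, spmC, spmD, spmE };
}

console.log(parseSpm(
  "https://cn.aliyun.com/about?spm=5176.8142029.631162.about.702b614aLj4URY"
));
// prints { spmA: '5176', spmB: '8142029', spmC: '631162', spmD: 'about', spmE: '702b614aLj4URY' }
```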

I forked this repository and found the right file to modify, but unfortunately I have not yet figured out how to build it. Maybe you could add some instructions about building the extension to README, if you have time?

Thank you again for this amazing extension.

YouTube tracking removal not working?

Hi, I have noticed this extension is not working on YouTube right now. I am not sure if it's because of something specific to my setup: I am running Opera with the extension that allows you to run Chrome extensions. Would that be a problem?

An important thing I should note, when I manually right click the url and select "Copy and Clean link" from the context menu it works fine (as it should) and the tracking gets removed.

So to sum up, I don't think automatic tracking removal is working on YouTube.

Example url: https://www.youtube.com/redirect?event=video_description&redir_token=pfsH9gZ730FoUkqi7XHqWNAi1zl8MTU0NTQwMzE4MEAxNTQ1MzE2Nzgw&q=https%3A%2F%2Fcbdfx.com%2Fproducts%2Fcbd-capsules-750mg&v=cB9SfzS56ms

Stripped url: https://cbdfx.com/products/cbd-capsules-750mg

Thanks!
