domainatrex's Introduction

Domainatrex

Domainatrex is a TLD parsing library for Elixir, using the Public Suffix List.

Installation

Add the following to your mix.exs:

defp deps do
  [
    {:domainatrex, "~> 3.0.4"}
  ]
end

Usage

Domainatrex should be able to handle all valid hostnames. It uses the Public Suffix List and is heavily inspired by the fantastic Domainatrix library for Ruby.

iex> Domainatrex.parse("someone.com")
{:ok, %{domain: "someone", subdomain: "", tld: "com"}}

iex> Domainatrex.parse("blog.someone.id.au")
{:ok, %{domain: "someone", subdomain: "blog", tld: "id.au"}}
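
Callers often want the registrable domain (domain plus suffix) rather than the separate parts. The following is a minimal sketch built only on the result shape shown above. The `HostParts` module is hypothetical and not part of Domainatrex, and since the exact shape of a failed parse is not documented here, anything other than an `{:ok, map}` tuple is collapsed to `:error` as an assumption.

```elixir
defmodule HostParts do
  # Hypothetical helper, not part of Domainatrex.
  # Takes the value returned by Domainatrex.parse/1 and joins the
  # :domain and :tld fields back into a single registrable domain.
  def registrable_domain({:ok, %{domain: domain, tld: tld}}) do
    {:ok, domain <> "." <> tld}
  end

  # Assumption: any other return value is treated as a failed parse.
  def registrable_domain(_other), do: :error
end
```

With the library installed, `"blog.someone.id.au" |> Domainatrex.parse() |> HostParts.registrable_domain()` would yield `{:ok, "someone.id.au"}`, given the parse result documented above.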

Configuration

For maximum performance, Domainatrex reads the list of all known public suffixes at compile time. By default, the package attempts to fetch the latest list from the web before falling back to a local (potentially out-of-date) copy. You can configure this behavior in your config.exs as follows:

  • :fetch_latest: A Boolean flag to determine whether Domainatrex should try to fetch the latest list of public suffixes at compile time; default is true
  • :public_suffix_list_url: A charlist URL to the latest public suffix file that Domainatrex will try to fetch at compile time; default is 'https://raw.githubusercontent.com/publicsuffix/list/master/public_suffix_list.dat'
  • :fallback_local_copy: The path to the local suffix file that Domainatrex will use if it wasn't able to fetch a fresh file from the URL, or if fetching updated files was disabled; default is the "lib/public_suffix_list.dat" file included in the package.

Here's a complete example of how you might customize this behavior in your config.exs:

config :domainatrex,
  # Explicitly allow compile-time HTTP request to fetch the latest list of TLDs (default)
  fetch_latest: true,
  # Download the public suffix list from the official source (not necessarily tested with Domainatrex!)
  public_suffix_list_url: 'https://publicsuffix.org/list/public_suffix_list.dat',
  fallback_local_copy: "priv/my_app_custom_suffix_list.dat"
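
If compile-time network access is undesirable (for example in CI or container builds), the options above suggest a configuration along these lines. This is a minimal sketch using only the keys documented above; the `import Config` line assumes a project on Elixir 1.9 or later.

```elixir
import Config

config :domainatrex,
  # Skip the compile-time HTTP request and rely on the local copy instead
  fetch_latest: false
```

Because the list is read at compile time, a change to this setting probably only takes effect after the dependency is recompiled, for example with `mix deps.compile domainatrex --force`.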

domainatrex's Issues

3.0.1 on Hex is broken (removes the `lib/` directory)

Hi there! First off, thanks for Domainatrex—it's made our lives much easier at our Elixir-powered startup. ☺️

I downloaded the 3.0.1 release, but was surprised to get a bunch of errors of the form:

** (UndefinedFunctionError) function Domainatrex.parse/1 is undefined (module Domainatrex is not available)

Digging deeper, the diff on Hex.pm from 3.0.0 to 3.0.1 shows the lib/ directory was removed entirely, presumably in a packaging mistake.

Domainatrex crashes on amazon aws links

I expect Domainatrex to be able to parse Amazon S3 links properly, so Domainatrex.parse("s3.amazonaws.com") should succeed. Instead it raises the following error:

 ** (MatchError) no match of right hand side value: []
 stacktrace:
   (domainatrex) lib/domainatrex.ex:55: Domainatrex.format_response/2
   (domainatrex) lib/domainatrex.ex:34: Domainatrex.match/1
   test/domainatrex_test.exs:19: (test)

Failing test case:

  test "s3" do
    domain = "s3.amazonaws.com"
    assert Domainatrex.parse(domain) == {:ok, %{domain: "amazon", subdomain: "s3", tld: "com"}}
  end

I think Domainatrex can probably ignore all private domains in the Public Suffix List, although there may be a better solution.

Autoload .dat

Is it possible to automate loading of the .dat file from the internet, the way Timex does with the TZ database?

Some generated clauses can never match

Compiling the library results in ~40 warnings like:

warning: this clause cannot match because a previous clause at line 109 always matches
  lib/domainatrex.ex:88

warning: this clause cannot match because a previous clause at line 109 always matches
  lib/domainatrex.ex:88

warning: this clause cannot match because a previous clause at line 109 always matches
  lib/domainatrex.ex:88

I'm not yet sure how to determine which entries in the PSL cause these or whether it's a problem.

Provide a way to disable fetching a fresh public suffix list on compile

Hi, the ability to fetch a public suffix list on compilation does sound useful, but the compile time of Domainatrex is already quite long (over 10s on Heroku) and I'd rather not make any HTTP requests while compiling. Is it possible to add a way to disable fetching the public suffix list on compile?

Fails on a domain suffix of length 6 at domainatrex.ex:73

The case statement at

case length(suffix) do

covers domain suffixes up to length 5, but apparently the list now fetched from https://raw.githubusercontent.com/publicsuffix/list/master/public_suffix_list.dat contains a domain suffix of length 6 (.design), and this causes Domainatrex to fail to compile:

==> domainatrex
Compiling 1 file (.ex)

== Compilation error in file lib/domainatrex.ex ==
** (CaseClauseError) no case clause matching: 6
    lib/domainatrex.ex:73: anonymous fn/1 in :elixir_compiler_1.__MODULE__/1
    (elixir 1.15.5) lib/enum.ex:984: Enum."-each/2-lists^foreach/1-0-"/2
    lib/domainatrex.ex:71: (module)
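
The failure above comes from a case expression that only handles a fixed set of `length(suffix)` values. One way to avoid that class of error is to test candidate suffixes of any length against a set. The following is a minimal sketch under stated assumptions, not Domainatrex's actual implementation: the `SuffixSketch` module and its `split/2` function are hypothetical, the suffix set is supplied by the caller, and the wildcard (`*.`) and exception (`!`) rules of the Public Suffix List are ignored.

```elixir
defmodule SuffixSketch do
  # Hypothetical module, not part of Domainatrex.
  # `suffixes` is a MapSet of plain suffix strings such as "com" or "id.au".
  def split(host, suffixes) do
    labels = String.split(host, ".")

    # Drop 1, 2, ... leading labels, so the longest candidate suffix is tried
    # first. The range stops before all labels are dropped, which guarantees
    # that at least one label remains to serve as the domain. For a
    # single-label host the range is empty and :error is returned.
    Enum.find_value(1..(length(labels) - 1)//1, :error, fn n ->
      {front, suffix_labels} = Enum.split(labels, n)
      suffix = Enum.join(suffix_labels, ".")

      if MapSet.member?(suffixes, suffix) do
        # The last label before the suffix is the domain; the rest is the subdomain.
        {subdomain_labels, [domain]} = Enum.split(front, -1)
        {:ok, %{domain: domain, subdomain: Enum.join(subdomain_labels, "."), tld: suffix}}
      end
    end)
  end
end
```

Because nothing here depends on how many labels a suffix has, a new longer entry in the list would not need a new clause. A host that consists only of a suffix, or that matches no suffix, yields `:error` instead of raising.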

Domainatrex.parse raises an unhandled exception when passed a TLD

Similar to #3, I get the following error on "is.it":

iex(23)> Domainatrex.parse("is.it")
** (MatchError) no match of right hand side value: []
    (domainatrex) lib/domainatrex.ex:65: Domainatrex.format_response/2
    (domainatrex) lib/domainatrex.ex:45: Domainatrex.match/1

It appears that if you pass Domainatrex a TLD directly, you get that MatchError.

:wave: Question about TLDs

Hi 👋, just wondering why parsing "foo.netlify.app" returns "netlify.app" TLD and "foo" domain. That's not entirely accurate, is it?

iex(2)> Domainatrex.parse("foo.netlify.app")
{:ok, %{domain: "foo", subdomain: "", tld: "netlify.app"}}

Build error `no case clause matching: 7`

Library version v3.0.3. This appears to be similar to #24.

#32 64.89 == Compilation error in file lib/domainatrex.ex ==
#32 64.89 ** (CaseClauseError) no case clause matching: 7
#32 64.89     lib/domainatrex.ex:73: anonymous fn/1 in :elixir_compiler_3.__MODULE__/1
#32 64.89     (elixir 1.15.7) lib/enum.ex:984: Enum."-each/2-lists^foreach/1-0-"/2
#32 64.89     lib/domainatrex.ex:71: (module)

Certificate problem when retrieving the PSL during compilation

18:45:22.741 [warning] Description: 'Server authenticity is not verified since certificate path validation is not enabled'
     Reason: 'The option {verify, verify_peer} and one of the options \'cacertfile\' or \'cacerts\' are required to enable this.'

Questionable terminology

Example from the documentation:

iex> Domainatrex.parse("blog.someone.id.au")
{:ok, %{domain: "someone", subdomain: "blog", tld: "id.au"}}

The field called tld should have a better name. In the above example, it is not even a TLD. Call it suffix or registration_domain.

(The field domain is also not ideal. Maybe registered_domain?)

Warnings when compiling

warning: Application.get_env/3 is discouraged in the module body, use Application.compile_env/3 instead
  lib/domainatrex.ex:7: Domainatrex
[several occurrences]

warning: this clause cannot match because a previous clause at line 109 always matches
  lib/domainatrex.ex:88
[several occurrences]

warning: :ssl.start/0 defined in application :ssl is used by the current application but the current application does not depend on :ssl. To fix this, you must do one of:

  1. If :ssl is part of Erlang/Elixir, you must include it under :extra_applications inside "def application" in your mix.exs

  2. If :ssl is a dependency, make sure it is listed under "def deps" in your mix.exs

  3. In case you don't want to add a requirement to :ssl, you may optionally skip this warning by adding [xref: [exclude: [:ssl]]] to your "def project" in mix.exs

  lib/domainatrex.ex:16: Domainatrex

% elixir --version
Erlang/OTP 25 [erts-13.1.5] [source] [64-bit] [smp:12:12] [ds:12:12:10] [async-threads:1] [jit]

Elixir 1.14.3 (compiled with Erlang/OTP 25)
