Comments (40)

bmartinn commented on May 26, 2024

It seems that for some reason the us-east-1 AMI wasn't public. Please try again now.

from clearml-server.

bmartinn commented on May 26, 2024

See matching thread here: #3
Feel free to vote with reaction 👍

TL;DR
TRAINS-server is for internal use only, the main goal is full transparency, so everyone in the company can log in (which is mostly for read access).

If you are using the AMI distribution, I highly recommend configuring the security-group of the instance to limit the access to the server for internal company IP group...

bmartinn commented on May 26, 2024

In the AMI deployment, where exactly is the trains-init script? I could not find it, not even in the docker instances (webserver/api).

The trains-init script is just an easier way to configure credentials for the SDK (i.e. the TRAINS python package). The TRAINS-server does not need any credentials configured.

Once you log in to the web-app (with any username, as currently there is no limitation), you can generate your key/secret credentials pair (which the TRAINS package uses, in a similar way to AWS credentials etc.) in the profile page (which is currently called 'admin' but will be changed in the next release to 'profile') at http://localhost:8080/admin

shlomiken commented on May 26, 2024

Thanks for the explanations.
Regarding limiting IP addresses: this is a problem, as we use SageMaker to train models, and I cannot tell which IP address SageMaker will use (in order to report the metrics you collect).

shlomiken commented on May 26, 2024

@bmartinn - I have put the instance in AWS behind a load balancer so that it is available to our SageMaker server, but I get 405 Method Not Allowed on POST. I believe this is because of the headers below, which are sent to my browser.
If I go directly to the instance I get these headers, which suggest that the local IP address is used as the allowed origin:
Access-Control-Allow-Credentials: true
Access-Control-Allow-Origin: http://10.121.1.213:8080
When working without the ELB everything works. Does this mean the server cannot be put behind an ELB?
Thanks,
Shlomi

bmartinn commented on May 26, 2024

The demoapp.trainsai.io is an almost exact replica of the pre-installed trains-server AMI and it works fine behind an ELB.

@shlomiken Can you please share the trains-server logs? (default location /opt/trains/logs)

shlomiken commented on May 26, 2024

trains.tar.gz
@bmartinn attached.
I believe this is a CORS issue, as when I request directly from the instance I see an OPTIONS request being sent to the server.
I might be missing something in the configuration that says "this is my domain name for this server".
BTW, on your server I see this response, which looks better than mine:
Access-Control-Allow-Origin: https://demoapp.trainsai.io

bmartinn commented on May 26, 2024

The webserver's CORS configuration uses origin "*", which means any request origin is allowed, and the response headers will simply echo the origin - this is the reason you see Access-Control-Allow-Origin: https://demoapp.trainsai.io in our demo server's response, and Access-Control-Allow-Origin: http://10.121.1.213:8080 in yours (when you do succeed reaching the server).
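
To make the echo behaviour concrete, here is a minimal Python sketch of how a server configured with origin "*" and credentials enabled could build its response headers. The function name `cors_headers` and its parameters are hypothetical; this is not the TRAINS webserver's actual code, only an illustration of why each deployment sees its own origin in the response.

```python
def cors_headers(request_origin, allowed_origins="*"):
    """Return CORS response headers for a request carrying an Origin header.

    Hypothetical helper: with allowed_origins == "*", the request origin is
    echoed back rather than sending a literal "*", because browsers reject a
    literal "*" on credentialed requests.
    """
    if request_origin is None:
        # Not a cross-origin request, so no CORS headers are needed.
        return {}
    if allowed_origins != "*" and request_origin not in allowed_origins:
        # Origin is not on the allow-list: send no CORS headers at all.
        return {}
    return {
        "Access-Control-Allow-Origin": request_origin,
        "Access-Control-Allow-Credentials": "true",
    }


# The same configuration yields different headers depending on the caller.
print(cors_headers("https://demoapp.trainsai.io"))
print(cors_headers("http://10.121.1.213:8080"))
```

Under this model the differing header values are expected and are not, by themselves, a sign of misconfiguration.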

Just to make sure, the responses we're talking about are those received by your browser when trying to access the webserver on port 8080, right?

This issue might be related to an HTTPS redirect issue, can you please describe how you configured the ELB?

shlomiken commented on May 26, 2024

Hi @bmartinn - the ELB is configured with two listeners, 8008 (API) and 80, which forward to the instance on port 8008 and 8080 respectively.
This is an Application LB.
Is there a special rule I need to add? Can you share your configuration?
Thanks,
Shlomi

bmartinn commented on May 26, 2024

I've just set up an AWS EC2 instance using the trains-server AMI to try and replicate your use-case. It works as expected, and my browser can successfully access the webserver with the URL http://<my-elb>.us-east-1.elb.amazonaws.com:8080.

I've used the following ELB configuration:

  • ELB Type: HTTP/HTTPS load balancer.
  • Listeners:
    • HTTP:8008 listener, forwarded to a target group:
      • Target type: Instance
      • Protocol: HTTP
      • Port 8008
    • HTTP:8080 listener, forwarded to a target group:
      • Target type: Instance
      • Protocol: HTTP
      • Port 8080

shlomiken commented on May 26, 2024

The difference is that I use port 80 to mask port 8080, because our security groups do not allow 8080 traffic (just like your demo server).
Can you please share the configuration of the demo server?
Maybe its conf files, without tokens?

bmartinn commented on May 26, 2024

You're correct, I've reproduced the issue right now and the reason is the different port used for accessing the webserver. The webapp assumes the webserver uses port 8080. Since in your case the webserver uses port 80, the webapp fails to access the apiserver.
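
As a rough illustration of this failure mode, the Python sketch below derives the apiserver address from the webapp address by swapping a fixed port. The function `guess_api_url` and the constants are hypothetical and are not taken from the webapp's source; they only model the "webserver is on 8080" assumption described above.

```python
from urllib.parse import urlsplit

WEB_PORT = 8080  # port the webapp is assumed to be served on
API_PORT = 8008  # port the apiserver listens on


def guess_api_url(web_url):
    """Derive the apiserver URL from the webapp URL by swapping the port.

    Returns None when the webapp is not served on WEB_PORT, which models
    the situation where the port-based assumption breaks down.
    """
    parts = urlsplit(web_url)
    # urlsplit reports port=None when the URL relies on the scheme default.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    if port != WEB_PORT:
        return None
    return f"{parts.scheme}://{parts.hostname}:{API_PORT}"


print(guess_api_url("http://10.121.1.213:8080"))    # direct access to the instance
print(guess_api_url("http://my-elb.example.com/"))  # port 80 via a load balancer
```

With a listener on port 80 the derivation has nothing to swap, so the browser never learns where the apiserver is.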

I'll add this issue as a configuration feature request.

The demo server uses subdomains (demoapp, demoapi) instead of port numbers to redirect traffic to the webserver and the apiserver (the webapp can detect these names and adjust accordingly).

Since in your case, sagemaker only requires access to port 8008 (which is the port used by the trains package), you can:

  • Use ELB with a single 8008 listener and configure sagemaker/trains to use the external ELB address with port 8008
  • Use the EC2 instance private IP port 8080 in your browser, like you currently do

shlomiken commented on May 26, 2024

@bmartinn - how can I use subdomains as well?

bmartinn commented on May 26, 2024

You can configure ELB to direct port 80 in subdomain app to port 8080 and port 80 in subdomain api to port 8008. Please make sure to use HTTP and not HTTPS as changing this will require reconfiguration of the webserver (which is part of the pre-built AMI).

shlomiken commented on May 26, 2024

@bmartinn sure, but how does the UI know which API server to address?
My problem is that the UI is also not functioning because of this issue - we have no problem with the SageMaker calls.

bmartinn commented on May 26, 2024

The UI can detect the app subdomain and use the api subdomain to access the apiserver.

shlomiken commented on May 26, 2024

@bmartinn - I have created 2 separate domains in AWS.
image
I have tried that (although it sounds weird), but the API requests still try to reach trainsapp.

bmartinn commented on May 26, 2024

Use app and api, as in app.trains.kenshoo-lab.com and api.trains.kenshoo-lab.com.

bmartinn commented on May 26, 2024

Apologies, it seems the code related to custom subdomains redirection is not part of the docker build. I'll include it in the next release (soon). For now, you can use the ports approach discussed earlier.

bmartinn commented on May 26, 2024

Moved issue to TRAINS-server for future reference.

bmartinn commented on May 26, 2024

@shlomiken did you manage to get the TRAINS-server working with your SageMaker pipeline?

shlomiken commented on May 26, 2024

@bmartinn - still no. I have not proceeded with this; I was waiting for your fix that will allow us to have the UI working with CORS / subdomains.

bmartinn commented on May 26, 2024

@shlomiken please check if the docker below solves the problem :)

sudo docker run -d --restart="always" --name="trains-webserver" --network="host" -v /opt/trains/logs:/var/log/trains allegroai/trains:0.9.0-12 webserver

shlomiken commented on May 26, 2024

@bmartinn - I don't run the docker myself. As a reminder, I used the AMI deployment.

bmartinn commented on May 26, 2024

@shlomiken sure, TRAINS-server 0.9.0-12 AMI IDs by regions:

us-east-1 - ami-0f6518bfa76238051
eu-north-1 - ami-05a6e3ee78b647f97
ap-south-1 - ami-0d3edec2c2bc0155a
eu-west-3 - ami-0a1c4262f942e656d
eu-west-2 - ami-0ea8ca5fce67531c5
eu-west-1 - ami-067c828528a5203ee
ap-northeast-2 - ami-0588366d62f372702
ap-northeast-1 - ami-0395d3e7aef0f1c39
sa-east-1 - ami-08017fcb4f3a3a3f9
ca-central-1 - ami-0399f2bf57c9c4aa3
ap-southeast-1 - ami-0a2b780c64670beb9
ap-southeast-2 - ami-0a627401b844c3960
eu-central-1 - ami-0e6f580ffbd6d2fd1
us-east-2 - ami-0245c5960be5a67b5
us-west-1 - ami-04320deaf0fab4e29
us-west-2 - ami-0f632602fb4f8e9bd

shlomiken commented on May 26, 2024

@bmartinn, is there a reason us-east-1 is not on the list?

shlomiken commented on May 26, 2024

@bmartinn - it looks like the image on us-east-1 is an old version (0.0.39)?
image
The ones you sent are 0.9.

bmartinn commented on May 26, 2024

Oops :)
us-east-1 ami-0f6518bfa76238051

shlomiken commented on May 26, 2024

@bmartinn - I cannot find it in us-east-1.

shlomiken commented on May 26, 2024

@bmartinn - I still get a problem in the UI: API requests are going to trainsapp.kenshoo-lab.com instead of trainsapi.kenshoo-lab.com
image

bmartinn commented on May 26, 2024

@shlomiken please note the Web-App will replace the "app" prefix with the "api" prefix.
In order to get the UI working, please try to set the subdomains as follows:

app.trains.kenshoo-lab.com
api.trains.kenshoo-lab.com
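
The prefix replacement can be sketched in Python as follows. This assumes the rule is a literal leading `app.` label; the function name `api_host_from_app_host` is hypothetical and the Web-App's real detection logic may differ.

```python
def api_host_from_app_host(app_host):
    """Map 'app.<domain>' to 'api.<domain>'.

    Returns None when the host does not start with the 'app.' label, in
    which case no API host can be derived under this assumed rule.
    """
    prefix = "app."
    if not app_host.startswith(prefix):
        return None
    return "api." + app_host[len(prefix):]


print(api_host_from_app_host("app.trains.kenshoo-lab.com"))
print(api_host_from_app_host("trainsapp.kenshoo-lab.com"))
```

Under this assumption a name such as trainsapp.kenshoo-lab.com has no `app.` label to replace, which would be consistent with the API requests staying on the app host.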

shlomiken commented on May 26, 2024

@bmartinn done that - unfortunately it is still not working, now it's 401 Unauthorized. Maybe it's the 3-part domain name?

image

bmartinn commented on May 26, 2024

We've investigated this issue and found that the culprit is probably the authentication cookie's domain configuration. We'll release a new version very soon to allow configuring these kinds of properties from outside the docker containers. I'll update when the new version is out.

bmartinn commented on May 26, 2024

Hi,
There's a new version out (0.10.0) that supports external configuration.
See the updated documentation's Configuration section for configuring the server, and the Upgrade section for upgrading the server to the latest version.

In order to configure the authentication cookie domain to your domain, you'll need to place a webserver.conf file in the mapped config directory with the following configuration:

auth {
    cookies {
        domain: "trains.kenshoo-lab.com"
    }
}

Please note:

  • You'll need to add this configuration file before running the new docker containers
  • Since you're using two sub-domains (app.trains...), you might need to use kenshoo-lab.com as the cookie's domain
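
For reference, the way a browser decides whether a cookie with a given Domain attribute is sent to a host can be approximated with the simplified RFC 6265 domain-match rule below. This is a minimal Python sketch that ignores public-suffix checks and other details; the function name `domain_matches` is hypothetical.

```python
def domain_matches(host, cookie_domain):
    """Simplified RFC 6265 domain match.

    The cookie is sent when the host equals the cookie domain or is a
    subdomain of it. A leading dot on the cookie domain is ignored.
    """
    host = host.lower()
    cookie_domain = cookie_domain.lower().lstrip(".")
    return host == cookie_domain or host.endswith("." + cookie_domain)


# A cookie scoped to the shared parent domain reaches both sub-domains.
for host in ("app.trains.kenshoo-lab.com", "api.trains.kenshoo-lab.com"):
    print(host, domain_matches(host, "trains.kenshoo-lab.com"))

# A request made to a bare IP address does not match a domain-scoped cookie.
print(domain_matches("10.121.1.213", "trains.kenshoo-lab.com"))
```

The cookie domain therefore has to be a common parent of every hostname the browser uses to reach the webserver and the apiserver.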

shlomiken commented on May 26, 2024

@bmartinn - Hi, I have just deployed the new version (again, I'm using the AMI, not docker directly).
I placed webserver.conf in the web server container as you specified above, and I still get 405 Method Not Allowed.
I think you are missing something.
The cookie domain should not be configured; it should be taken from the request URL (most servers in the world work like this).
Right now, for example, I cannot access the instance directly at port 8080; it just redirects back to the login page.

bmartinn commented on May 26, 2024

@shlomiken a new version is out (0.10.1), and we've replaced the webserver with NGINX, which should simplify things.

The configuration you should need with this new version, assuming you are using HTTPS and your primary domain is trains.kenshoo-lab.com:

  • An /opt/trains/config/apiserver.conf file, containing:
auth {
  cookies {
    httponly: true
    secure: true
    domain: ".trains.kenshoo-lab.com"
    max_age: 99999999999
  }
}
  • The following load balancer configuration:
    • Listeners:
      • (optional) HTTP listener, that redirects all traffic to HTTPS.
      • HTTPS listener for app. forwarded to 'AppTargetGroup'
      • HTTPS listener for api. forwarded to 'ApiTargetGroup'
      • HTTPS listener for files. forwarded to 'FilesTargetGroup'
    • Target groups:
      • AppTargetGroup: HTTP based target group, port 8080
      • ApiTargetGroup: HTTP based target group, port 8008
      • FilesTargetGroup: HTTP based target group, port 8081
    • Security and routing:
      • Load balancer: Make sure the load balancers are able to receive traffic from the relevant IP addresses (Security groups and Subnets definitions).
      • Instances: Make sure the load balancers are able to access the instances, using the relevant ports (Security groups definitions).
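
The listener and target group layout above amounts to a small host-based routing table. The Python sketch below models it for illustration only; the `route` function and `TARGET_GROUPS` table are hypothetical names, and a real Application Load Balancer expresses this through listener rules rather than code.

```python
# Sub-domain label -> (target group name, instance port), as listed above.
TARGET_GROUPS = {
    "app": ("AppTargetGroup", 8080),
    "api": ("ApiTargetGroup", 8008),
    "files": ("FilesTargetGroup", 8081),
}


def route(host, primary_domain="trains.kenshoo-lab.com"):
    """Return the (target group, port) for a request host, or None."""
    suffix = "." + primary_domain
    if not host.endswith(suffix):
        return None
    # Whatever precedes the primary domain must be exactly one known label.
    label = host[: -len(suffix)]
    return TARGET_GROUPS.get(label)


for host in ("app.trains.kenshoo-lab.com",
             "api.trains.kenshoo-lab.com",
             "files.trains.kenshoo-lab.com"):
    print(host, "->", route(host))
```

In this layout only the load balancer's HTTPS port faces clients; ports 8080, 8008 and 8081 need to be reachable from the load balancer alone.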

Make sure you either use our AMIs or run the docker containers with the updated docker run commands (see the updated Launching Docker Containers).

shlomiken commented on May 26, 2024

Thanks @bmartinn.
Can you be more specific about the version? I can see the version allegroai-trains-server-0.10.1-101 in AWS. Should I use it?
By having a config file, I guess you want me to create it on the AWS instance and then restart the service / instance?
I just realized something does not make sense in the config you sent.
We cannot have ports 8080, 8008 and 8081 open to the public (company security groups), which is why we went with an ELB in the first place. So why do I need 3 listeners on the ELB?
Can you help me with a configuration that works under this constraint?

Thanks,
Shlomi

bmartinn commented on May 26, 2024

You should use the allegroai-trains-server-0.10.1-1 AMIs (AMI id ami-0f2e1f2d006287666 for us-east-1).
As far as configuration goes, you're correct - after creating the instance based on the AMI, you should ssh into the machine, add the apiserver.conf, then docker stop, docker rm and docker run the trains-apiserver container according to the instructions in the Upgrade section.

bmartinn commented on May 26, 2024

@shlomiken was the problem solved?
We added a per-service configuration to the TRAINS python package (starting v0.10.3).
You can find a configuration example here
Is it working for you?
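
One possible way to pass per-service endpoints to a training job (for example on SageMaker) is through environment variables. The Python sketch below only builds such a mapping. The variable names (`TRAINS_API_HOST`, `TRAINS_WEB_HOST`, `TRAINS_FILES_HOST`, `TRAINS_API_ACCESS_KEY`, `TRAINS_API_SECRET_KEY`) are assumptions that should be verified against the TRAINS package documentation for your version, and the helper `trains_env` and all values shown are placeholders.

```python
def trains_env(domain, access_key, secret_key, scheme="https"):
    """Build an environment mapping with one URL per TRAINS service.

    The variable names are assumed, not confirmed by this thread; check
    them against the package documentation before relying on them.
    """
    return {
        "TRAINS_API_HOST": f"{scheme}://api.{domain}",
        "TRAINS_WEB_HOST": f"{scheme}://app.{domain}",
        "TRAINS_FILES_HOST": f"{scheme}://files.{domain}",
        "TRAINS_API_ACCESS_KEY": access_key,
        "TRAINS_API_SECRET_KEY": secret_key,
    }


# Placeholder credentials; real ones are generated in the web-app profile page.
env = trains_env("trains.kenshoo-lab.com", "<access_key>", "<secret_key>")
for name, value in env.items():
    print(f"{name}={value}")
```

The resulting mapping could be merged into the job's environment before the training script imports the package.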

bmartinn commented on May 26, 2024

Closing, due to lack of activity.
