cloud-berg's Issues

berg devbox example fails due to login authentication

I have ensured that gcloud auth login is completed before the run.

Message:

{
"error": {
"errors": [
{
"domain": "global",
"reason": "required",
"message": "Login Required",
"locationType": "header",
"location": "Authorization",
"debugInfo": "com.google.api.server.core.Fault: ImmutableErrorDefinition{base=LOGIN_REQUIRED, category=USER_ERROR, cause=com.google.api.server.core.Fault: LOGIN_REQUIRED Login Required, debugInfo=null, domain=global, extendedHelp=null, httpHeaders={WWW-Authenticate=[Bearer realm="https://accounts.google.com/"]}, httpStatus=unauthorized, internalReason=Reason{arguments={}, cause=null, code=null, createdByBackend=false, debugMessage=null, errorProtoCode=null, errorProtoDomain=null, filteredMessage=null, location=null, message=null, unnamedArguments=[]}, location=headers.Authorization, message=Login Required, reason=required, rpcCode=401} Login Required\n\tat com.google.api.server.auth.AuthenticatorInterceptor.addChallengeHeader(AuthenticatorInterceptor.java:264)\n\tat com.google.api.server.auth.AuthenticatorInterceptor.processErrorResponse(AuthenticatorInterceptor.java:231)\n\tat com.google.api.server.auth.GaiaMintInterceptor.processErrorResponse(GaiaMintInterceptor.java:764)\n\tat com.google.api.server.core.intercept.AroundInterceptorWrapper.processErrorResponse(AroundInterceptorWrapper.java:28)\n\tat com.google.api.server.stats.StatsBootstrap$InterceptorStatsRecorder.processErrorResponse(StatsBootstrap.java:312)\n\tat com.google.api.server.core.intercept.Interceptions$AroundInterception.handleErrorResponse(Interceptions.java:202)\n\tat com.google.api.server.core.intercept.Interceptions$AroundInterception.invoke(Interceptions.java:151)\n\tat com.google.api.server.core.protocol.http.rest.RestServlet.service(RestServlet.java:123)\n\tat javax.servlet.http.HttpServlet.service(HttpServlet.java:717)\n\tat com.google.api.server.core.protocol.http.ApiServlet.service(ApiServlet.java:51)\n\tat com.google.gse.FilteredServlet$ChainEnd.doFilter(FilteredServlet.java:212)\n\tat com.google.api.server.core.EventIdFilter.doFilter(EventIdFilter.java:49)\n\tat com.google.gse.FilteredServlet$Chain.doFilter(FilteredServlet.java:189)\n\tat 
com.google.loadbalancer.gslb.backend.ubb.UBBFilter.doFilter(UBBFilter.java:72)\n\tat com.google.gse.FilteredServlet$Chain.doFilter(FilteredServlet.java:189)\n\tat com.google.servlet.testing.ResponseInjectionFilter.doFilter(ResponseInjectionFilter.java:133)\n\tat com.google.gse.FilteredServlet$Chain.doFilter(FilteredServlet.java:189)\n\tat com.google.gse.FilteredServlet.service(FilteredServlet.java:158)\n\tat com.google.gse.internal.HttpConnectionImpl.runServletFromWithinSpan(HttpConnectionImpl.java:933)\n\tat com.google.gse.internal.HttpConnectionImpl.access$000(HttpConnectionImpl.java:74)\n\tat com.google.gse.internal.HttpConnectionImpl$1.runServletFromWithinSpan(HttpConnectionImpl.java:825)\n\tat com.google.gse.GSETraceHelper$TraceableServletRunnable$2.run(GSETraceHelper.java:468)\n\tat com.google.tracing.LocalTraceSpanRunnable.runInContext(LocalTraceSpanRunnable.java:55)\n\tat com.google.tracing.TraceContext$TraceContextRunnable$1.run(TraceContext.java:460)\n\tat com.google.tracing.TraceContext$AbstractTraceContextCallback.runInInheritedContextNoUnref(TraceContext.java:321)\n\tat com.google.tracing.TraceContext$AbstractTraceContextCallback.runInInheritedContext(TraceContext.java:311)\n\tat com.google.tracing.TraceContext$TraceContextRunnable.run(TraceContext.java:457)\n\tat com.google.tracing.LocalTraceSpanBuilder.internalContinueSpan(LocalTraceSpanBuilder.java:643)\n\tat com.google.gse.GSETraceHelper$TraceableServletRunnable.continueGfeTrace(GSETraceHelper.java:417)\n\tat com.google.gse.GSETraceHelper$TraceableServletRunnable.runWithTracingEnabled(GSETraceHelper.java:372)\n\tat com.google.gse.GSETraceHelper$TraceableServletRunnable.run(GSETraceHelper.java:338)\n\tat com.google.gse.internal.HttpConnectionImpl.runServlet(HttpConnectionImpl.java:827)\n\tat com.google.gse.internal.HttpConnectionImpl.run(HttpConnectionImpl.java:781)\n\tat com.google.gse.internal.DispatchQueueImpl$WorkerThread.run(DispatchQueueImpl.java:403)\nCaused by: 
com.google.api.server.core.Fault: LOGIN_REQUIRED Login Required\n\tat com.google.api.server.auth.NoAuthAuthenticationProcessor.process(NoAuthAuthenticationProcessor.java:20)\n\tat com.google.api.server.auth.GaiaMintApiAuthenticator.authenticate(GaiaMintApiAuthenticator.java:284)\n\tat com.google.api.server.auth.GaiaMintInterceptor.doAuthenticateSingleRequest(GaiaMintInterceptor.java:876)\n\tat com.google.api.server.auth.GaiaMintInterceptor.doAuthenticate(GaiaMintInterceptor.java:687)\n\tat com.google.api.server.auth.AuthenticatorInterceptor.authenticate(AuthenticatorInterceptor.java:361)\n\tat com.google.api.server.auth.GaiaMintInterceptor.authenticate(GaiaMintInterceptor.java:659)\n\tat com.google.api.server.auth.AuthenticatorInterceptor.processRequest(AuthenticatorInterceptor.java:191)\n\tat com.google.api.server.auth.GaiaMintInterceptor.processRequest(GaiaMintInterceptor.java:517)\n\tat com.google.api.server.core.intercept.AroundInterceptorWrapper.processRequest(AroundInterceptorWrapper.java:20)\n\tat com.google.api.server.stats.StatsBootstrap$InterceptorStatsRecorder.processRequest(StatsBootstrap.java:278)\n\tat com.google.api.server.core.intercept.Interceptions$AroundInterception.processRequest(Interceptions.java:159)\n\tat com.google.api.server.core.intercept.Interceptions$AroundInterception.invoke(Interceptions.java:135)\n\t... 27 more\n"
}
],
"code": 401,
"message": "Login Required"
}
}

Add basic integration tests

Not super clear to me what the best interface would be here.

Perhaps we could mock out check_call and start with a high-level test that ensures we call gcloud with reasonable arguments.
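A minimal sketch of what such a test might look like, assuming the launcher shells out through subprocess.check_call. The launch_instance function, its arguments, and the default zone are hypothetical stand-ins, not berg's actual code:

```python
import subprocess
from unittest import mock


def launch_instance(name, zone="us-west1-b"):
    """Hypothetical stand-in for berg's launcher; not berg's actual code."""
    cmd = ["gcloud", "compute", "instances", "create", name, "--zone", zone]
    subprocess.check_call(cmd)


def test_launch_calls_gcloud():
    # Replace check_call so no real gcloud process is started.
    with mock.patch("subprocess.check_call") as fake_check_call:
        launch_instance("berg-test")
    # The launcher should have shelled out exactly once.
    fake_check_call.assert_called_once()
    (cmd,), _kwargs = fake_check_call.call_args
    # High-level check only: right binary, right subcommand, right instance.
    assert cmd[:4] == ["gcloud", "compute", "instances", "create"]
    assert "berg-test" in cmd
    return cmd
```

Checking only the leading subcommand and the instance name keeps the test from breaking every time an unrelated flag is added.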

[unlikely to implement] Serialization helper

Follow up to #4

If serialization of the args becomes annoying (for example, if researchers frequently try to serialize strings that contain characters bash misinterprets), we could also let the user set flags within their executable function. I think that we likely don't need to do this though.

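# train.py
import argparse

import berg

def main():
  ...

if __name__ == '__main__':
  parser = argparse.ArgumentParser()
  FLAGS = parser.parse_args()
  # setup_flags is the helper proposed in this issue; it does not exist yet.
  berg.setup_flags(FLAGS)
  main()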

My current thinking is that this is more trouble than it is worth, and that serializing through CLI flags yields code that is easier to understand and more portable than this proposal.

Add berg to pip

It would be nice to be able to run

pip install berg

This could also be the default way for users to create new cloud images. They just set up the cloud box how they want it and then run

pip install berg

Add programmatic interface

People often want to start a job without piping args through a CLI / scripting in bash.

Seems like we could get this from a very simple API on top of the existing system:

# berg_launcher.py
import berg
import numpy as np

for lr in np.linspace(0.0, 1.0, 10):
  berg.run(
      "train.py",
      flags={'lr': lr},
      num_gpus=4,
  )

These arguments could then be serialized into CLI flags and fed into the training script.

This would result in ten instances being spun up, each running a command like the following:

train.py --lr 0.1

We could also potentially add a flags helper as in #8, but I think that serializing through CLI flags is likely to be simpler.
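As a rough illustration of the serialization step, here is one way a flags dict could be turned into a command line. The flags_to_command helper is hypothetical and not part of berg; it uses shlex.quote from the standard library so that values containing characters bash would otherwise interpret survive the round trip:

```python
import shlex


def flags_to_command(script, flags):
    """Build a shell-safe command line from a dict of flags (hypothetical helper)."""
    parts = [shlex.quote(script)]
    for name, value in flags.items():
        parts.append("--" + name)
        # Quote each value so spaces, quotes, and $ are passed through literally.
        parts.append(shlex.quote(str(value)))
    return " ".join(parts)


print(flags_to_command("train.py", {"lr": 0.1}))
print(flags_to_command("train.py", {"note": "it's $HOME"}))
```

Quoting at this layer would keep the training script a plain argparse program, which is the portability argument made above.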

Add TPU support

It would be nice to support starting up a TPU accelerator, connecting to it, and shutting it down after the job finishes

Consider running as configurable user

We currently run as root on the box because I didn't want to have to think about permissions.
Some downsides of this:

  1. Some programs don't work under root because they see it as unsafe (e.g. linux-brew)
  2. Some programs give warnings when run under root (e.g. jupyter)
  3. Users who don't know about sudo su tend to have trouble SSHing into a box and running things.

Perhaps we could have people run as arbitrary users and default to a berg user?

We could also just continue to run as root. I'm not sure this is worth changing.
