Comments (12)

asg0451 commented on June 5, 2024

ugh that's definitely the issue. thanks for the help lol, working fine now 😅

from khoj.

sabaimran commented on June 5, 2024

Yay for getting past the previous error!

@asg0451 , is the shutdown happening before you even connect, or only when you try pinging the server? If you're able to ssh into the host machine, are you able to ping it over localhost?

Do you mind also sharing your machine specs?

I haven't seen this error before, let me search around a bit.

sabaimran commented on June 5, 2024

Can you also ensure that the machine has access to the internet? By the error, it seems to be a DNS error, but I'm not sure what exactly it could be.

Also try using the env variable KHOJ_DEBUG=True to possibly get more verbose errors.

asg0451 commented on June 5, 2024

Thanks @sabaimran

The shutdown happens as soon as i connect to the machine, either via port forwarding to localhost and connecting as localhost:... or by connecting to its public-facing site (specified in khoj_domain env var above)

This is all on k8s so the pod definitely has access to dns, the internet, and it all should work lol. Other pods work just fine.

K8s on x86_64 (linux)

With KHOJ_DEBUG=True, the output is basically the same:

[15:58:18.693772] INFO     🚒 Initializing Khoj v1.9.0               main.py:108
[15:58:18.696562] INFO     📦 Initializing DB:                       main.py:109
                           Operations to perform:                                                                                                                                                 
                             Apply all migrations: admin, auth,                                                                                                                                   
                           contenttypes, database, sessions                                                                                                                                       
                           Running migrations:                                                                                                                                                    
                             No migrations to apply.                                                                                                                                              
[15:58:18.697729] DEBUG    🌍 Initializing Web Client:               main.py:110                                                                                                                  
                           180 static files copied to                                                                                                                                             
                           '/app/src/khoj/static'.                                                                                                                                                
[15:58:18.701942] INFO     🌘 Starting Khoj                          main.py:122                                                                                                                  
[15:58:18.914279] INFO     🚨 Khoj is not configured.           configure.py:197                                                                                                                  
                           Initializing it with a default                                                                                                                                         
                           config.                                                                                                                                                                
/usr/local/lib/python3.10/dist-packages/torch/_utils.py:776: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This sho
  return self.fget.__get__(instance, owner)()                                                                                                                                                     
[15:58:34.780223] INFO     No default conversation config found, __init__.py:447                                                                                                                  
                           skipping default agent creation                                                                                                                                        
[15:58:34.788484] INFO     📬 No-op...                          configure.py:248
[15:58:34.789352] INFO     🌖 Khoj is ready to use                   main.py:159
[15:58:34.796329] INFO     Started server process [1]               server.py:75                                                                                                                  
[15:58:34.797219] INFO     Waiting for application startup.             on.py:45                                                                                                                  
[15:58:34.798039] INFO     Application startup complete.                on.py:59                                                                                                                  
[15:58:34.799050] ERROR    [Errno -2] Name or service not known    server.py:156                                                                                                                  
[15:58:34.799859] INFO     Waiting for application shutdown.            on.py:64                                                                                                                  
[15:58:34.800547] INFO     Application shutdown complete.               on.py:75                

(you can see it start up at :18, then i opened it via localhost in my browser at :34 and it dies)

one other thing is that while it initially doesn't crash until i connect to it, afterwards it does seem to crashloop on its own. possibly due to other traffic reaching it from elsewhere, but i'm not sure. the logs are the same

asg0451 commented on June 5, 2024

actually, it does seem like maybe the connecting thing was a coincidence. it's definitely just crashlooping on its own :/

asg0451 commented on June 5, 2024

One additional clue is that for the first crash only i saw in the logs: Use offline chat model? (y/n): and some other stuff. so maybe the fact that it was started in a non-interactive way (no stdin) created an invalid config for all future starts?

sabaimran commented on June 5, 2024

Thanks for the debugging info!

For reference, this is what it looks like when it successfully starts up. Just did this on a clean install:

khoj-test-server-1    | [05:37:53.191273] INFO     🌖 Khoj is ready to use                   main.py:159
khoj-test-server-1    | [05:37:53.197530] INFO     Started server process [1]               server.py:75
khoj-test-server-1    | [05:37:53.198166] INFO     Waiting for application startup.             on.py:45
khoj-test-server-1    | [05:37:53.198730] INFO     Application startup complete.                on.py:59
khoj-test-server-1    | [05:37:53.199286] INFO     Uvicorn running on http://0.0.0.0:42110 server.py:206

Interestingly, it's never getting to a state where the uvicorn server even starts running on your machine. Given the logs, it's getting right up to that last step before it dies.

Running non-interactively shouldn't be an issue. As per the instructions, you should have set up KHOJ_ADMIN_EMAIL and KHOJ_ADMIN_PASSWORD in your environment for admin credentials to work.

Would you mind sending me your full docker-compose.yml (removing any personal identifying information or secrets)? You can send it to [email protected]. I'll check it for any possible formatting issues and test it locally. If I'm not able to reproduce it, it may be something particular to the k8s environment.

If you can share the specs (RAM, CPU, GPU, etc) of the machine, this can also rule out any issues related to machine constraints.

sabaimran commented on June 5, 2024

More specifically, the error is being thrown here in the uvicorn library:

                server = await loop.create_server(
                    create_protocol,
                    host=config.host,
                    port=config.port,
                    ssl=config.ssl,
                    backlog=config.backlog,
                )
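A call like this passes the configured host string on to address resolution, so a host value that is neither a valid address nor a resolvable name will make it fail. The sketch below is not uvicorn's code. It calls Python's socket.getaddrinfo directly, with AI_NUMERICHOST set so that no DNS lookup is attempted, to show how an address wrapped in literal quote characters gets rejected. The helper name is_numeric_host and the sample values are assumptions made for illustration.

```python
import socket


def is_numeric_host(host: str, port: int = 42110) -> bool:
    """Return True if `host` parses as a numeric IP address.

    AI_NUMERICHOST tells getaddrinfo not to attempt name resolution,
    so this check never touches the network.
    """
    try:
        socket.getaddrinfo(host, port, flags=socket.AI_NUMERICHOST)
    except socket.gaierror as err:
        # The exact errno and message depend on the platform.
        print(f"{host!r} rejected: {err}")
        return False
    return True


# A plain address is accepted.
print(is_numeric_host("0.0.0.0"))
# The same address with literal quote characters is not an address at all.
print(is_numeric_host('"0.0.0.0"'))
```

Without AI_NUMERICHOST the quoted value would be treated as a hostname to look up, which is consistent with a name resolution error showing up at bind time.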

@asg0451 , I have a repro! Are you providing the CLI args in your docker-compose.yml in this format? It must be like this, it can't be in the list sort of format you shared in the first message.

    command: --host="0.0.0.0" --port=42110 -v --anonymous-mode

asg0451 commented on June 5, 2024

Thanks @sabaimran . Here is the k8s manifest i'm using:

apiVersion: v1
kind: Namespace
metadata:
  name: khoj

---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: database
  namespace: khoj
spec:
  selector:
    matchLabels:
      app: database
  serviceName: database
  replicas: 1
  template:
    metadata:
      labels:
        app: database
    spec:
      containers:
        - name: database
          image: pgvector/pgvector:pg16
          env:
            - name: POSTGRES_USER
              value: postgres
            - name: POSTGRES_PASSWORD
              value: postgres
            - name: POSTGRES_DB
              value: postgres
          ports:
            - containerPort: 5432
              name: psql
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data/
          resources:
            requests:
              memory: "128Mi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "1"
          readinessProbe:
            exec:
              command:
                - pg_isready
                - -U
                - postgres
            initialDelaySeconds: 10
            periodSeconds: 30
            timeoutSeconds: 10
            successThreshold: 1
            failureThreshold: 3
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi

# headless svc
---
apiVersion: v1
kind: Service
metadata:
  name: database
  namespace: khoj
spec:
  clusterIP: None
  selector:
    app: database
  ports:
    - port: 5432
      targetPort: 5432

---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: khoj
  namespace: khoj
spec:
  selector:
    matchLabels:
      app: khoj
  serviceName: khoj
  replicas: 1
  template:
    metadata:
      labels:
        app: khoj
    spec:
      containers:
        - name: khoj
          image: ghcr.io/khoj-ai/khoj:latest
          ports:
            - containerPort: 42110
              name: web
          volumeMounts:
            - name: config
              mountPath: /root/.khoj
            - name: models
              mountPath: /root/.cache/torch/sentence_transformers
          env:
            - name: POSTGRES_DB
              value: postgres
            - name: POSTGRES_USER
              value: postgres
            - name: POSTGRES_PASSWORD
              value: postgres
            - name: POSTGRES_HOST
              value: database-0.database.khoj.svc.cluster.local
            - name: POSTGRES_PORT
              value: "5432"
            - name: KHOJ_DJANGO_SECRET_KEY
              value: secret
            - name: KHOJ_ADMIN_EMAIL
              value: [email protected]
            - name: KHOJ_ADMIN_PASSWORD
              value: password
            - name: KHOJ_DOMAIN
              value: "khoj.beagle-chickadee.ts.net"
            - name: KHOJ_DEBUG
              value: "True"
          resources:
            requests:
              memory: "1Gi"
              cpu: "100m"
            limits:
              memory: "2Gi"
              cpu: "1"
          args:
            - --host="0.0.0.0"
            - --port=42110
            - -v
            - --anonymous-mode
  volumeClaimTemplates:
    - metadata:
        name: models
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 30Gi
        storageClassName: local-path
    - metadata:
        name: config
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 2Gi

---
apiVersion: v1
kind: Service
metadata:
  name: khoj
  namespace: khoj
spec:
  selector:
    app: khoj
  ports:
    - port: 42110
      targetPort: 42110

you should be able to reproduce the issue by starting a kind or minikube local cluster and applying the above manifest with kubectl --context kind-kind apply -f <file.yaml>. then wait for the pods to start (the image pull takes a bit) with kubectl --context kind-kind -n khoj get pod/khoj-0, and watch the logs with kubectl --context kind-kind -n khoj logs -f pods/khoj-0

sabaimran commented on June 5, 2024

Ahh, okay @asg0451 , I have very limited experience with k8s, so take this with a grain of salt, but can you update your syntax to this format and try again? The quotes might be messing it up. I'm certain that the error is in the way the args are being passed, if you want to experiment with this.

args:
    - --host=0.0.0.0
    - --port=42110
    - -v
    - --anonymous-mode
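Why the quotes could matter can be sketched in a few lines of Python. This assumes two things that are not stated in the thread: that the single-string command form is tokenized with shell-like rules (which strip the quotes), and that each entry of a Kubernetes args list reaches the process verbatim, with no shell in between. The helper host_option and both variable names are hypothetical, chosen only for this illustration.

```python
import shlex

# Single-string form, as in the docker-compose example earlier in the thread.
compose_command = '--host="0.0.0.0" --port=42110 -v --anonymous-mode'

# List form, as in the Kubernetes manifest: each item is one argv entry.
manifest_args = ['--host="0.0.0.0"', "--port=42110", "-v", "--anonymous-mode"]


def host_option(argv):
    """Return the value given to --host, or None if the option is absent."""
    for arg in argv:
        if arg.startswith("--host="):
            return arg.split("=", 1)[1]
    return None


# Shell-like splitting removes the quotes around the address.
shell_split = shlex.split(compose_command)
print(repr(host_option(shell_split)))

# With no shell involved, the quote characters stay part of the value.
print(repr(host_option(manifest_args)))
```

Under those assumptions the server receives a host string that still contains the quote characters, which is why dropping them from the list form is worth trying.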

asg0451 commented on June 5, 2024

One last little thing -- i was unable to open django settings via my custom domain. after being presented with a login screen, i get this error.
Screenshot 2024-04-09 at 1 57 14 PM

workaround: port-forward from localhost

debanjum commented on June 5, 2024

> One last little thing -- i was unable to open django settings via my custom domain. after being presented with a login screen, i get this error. Screenshot 2024-04-09 at 1 57 14 PM
>
> workaround: port-forward from localhost

Yeah, a bunch of other folks have also complained about this. We need to check why it happens and how to resolve it for custom domains. Just haven't gotten around to it yet. Glad the port-forward from localhost provided a workaround.

I'll close this issue for now, given that your original issue of using k8s to self-host Khoj got resolved. But feel free to reopen it if I missed anything.
