khuedoan / homelab

Fully automated homelab from empty disk to running services with a single command.

Home Page: https://homelab.khuedoan.com

License: GNU General Public License v3.0

Makefile 4.96% Dockerfile 0.24% Jinja 11.02% Shell 5.14% HCL 13.55% Python 32.98% Go 25.81% Nix 3.49% Jsonnet 2.80%
kubernetes pxe homelab ansible docker helm k3s argocd terraform devops

homelab's Introduction

Khue's Homelab

This project utilizes Infrastructure as Code and GitOps to automate provisioning, operating, and updating self-hosted services in my homelab. It can be used as a highly customizable framework to build your own homelab.

What is a homelab?

A homelab is a laboratory at home where you can self-host, experiment with new technologies, practice for certifications, and so on. For more information, please see the r/homelab introduction and the Home Operations Discord community (formerly known as k8s-at-home).

Overview

Project status: ALPHA

This project is still in the experimental stage, and I don't use anything critical on it. Expect breaking changes that may require a complete redeployment. A proper upgrade path is planned for the stable release. More information can be found in the roadmap below.

Hardware

  • 4 × NEC SFF PC-MK26ECZDR (Japanese version of the ThinkCentre M700):
    • CPU: Intel Core i5-6600T @ 2.70GHz
    • RAM: 16GB
    • SSD: 128GB
  • TP-Link TL-SG108 switch:
    • Ports: 8
    • Speed: 1000Mbps

Features

  • Common applications: Gitea, Jellyfin, Paperless...
  • Automated bare metal provisioning with PXE boot
  • Automated Kubernetes installation and management
  • Installing and managing applications using GitOps
  • Automatic rolling upgrade for OS and Kubernetes
  • Automatically update apps (with approval)
  • Modular architecture, easy to add or remove features/components
  • Automated certificate management
  • Automatically update DNS records for exposed services
  • VPN (Tailscale or Wireguard)
  • Expose services to the internet securely with Cloudflare Tunnel
  • CI/CD platform
  • Private container registry
  • Distributed storage
  • Support multiple environments (dev, prod)
  • Monitoring and alerting
  • Automated offsite backups 🚧
  • Single sign-on
  • Infrastructure testing

Some demo videos and screenshots are shown here. They can't capture all of the project's features, but they are enough to give a sense of it.

Demo
Deploy with a single command (after updating the configuration files)
PXE boot
Observe network traffic with Hubble, built on top of Cilium and eBPF
Homepage powered by... Homepage
Monitoring dashboard powered by Grafana
Git server powered by Gitea
Matrix chat server
Continuous integration with Woodpecker CI
Continuous deployment with ArgoCD
ntfy displaying received alerts
Self-hosted AI powered by Ollama (experimental, not very fast because I don't have a GPU)

Tech stack

Name            Description
Ansible         Automate bare metal provisioning and configuration
ArgoCD          GitOps tool built to deploy applications to Kubernetes
cert-manager    Cloud native certificate management
Cilium          eBPF-based Networking, Observability and Security (CNI, LB, Network Policy, etc.)
Cloudflare      DNS and Tunnel
Docker          Ephemeral PXE server and convenient tools container
ExternalDNS     Synchronizes exposed Kubernetes Services and Ingresses with DNS providers
Fedora Server   Base OS for Kubernetes nodes
Gitea           Self-hosted Git service
Grafana         Observability platform
Helm            The package manager for Kubernetes
K3s             Lightweight distribution of Kubernetes
Kanidm          Modern and simple identity management platform
Kubernetes      Container-orchestration system, the backbone of this project
Loki            Log aggregation system
NGINX           Kubernetes Ingress Controller
ntfy            Notification service to send notifications to your phone or desktop
Prometheus      Systems monitoring and alerting toolkit
Renovate        Automatically update dependencies
Rook Ceph       Cloud-Native Storage for Kubernetes
Tailscale       VPN without port forwarding
Wireguard       Fast, modern, secure VPN tunnel
Woodpecker CI   Simple yet powerful CI/CD engine with great extensibility
Zot Registry    Private container registry

Get Started

Roadmap

See roadmap and open issues for a list of proposed features and known issues.

Contributing

Any contributions you make are greatly appreciated.

Please see contributing guide for more information.

License

Copyright © 2020 - 2024 Khue Doan

Distributed under the GPLv3 License. See license page or LICENSE.md file for more information.

Acknowledgements

References:

Here is a list of the contributors who have helped to improve this project. Big shout-out to them!

If you feel you're missing from this list, please feel free to add yourself in a PR.

Stargazers over time

homelab's People

Contributors

akwan, axon-kdoan, bluehatbrit, bourne-id, clashthebunny, crimrose, dotdiego, karpfediem, khuedoan, linhng98, locmai, matthewjohn, raedkit, renovate-bot, retx0, tangowithfoxtrot, trangmaiq

homelab's Issues

Unable to access resources behind cloudflared.

Describe the bug

A clear and concise description of what the bug is.

  • I have read the document

To reproduce

Steps to reproduce the behavior:

  1. Follow the instructions on the repo to deploy the lab and run the external layer.

Expected behavior

Access the lab via my domain.

Screenshots

If applicable, add screenshots to help explain your problem.

Additional context

I can get to the Hajimari page and see the resources found in the apps folder. ArgoCD and Git are not working; I get a cloudflared error (1033).

configure.py error "TypeError: 'type' object is not subscriptable"

python3 ./configure.py

Text editor (vim):
Enter your domain (khuedoan.com):
Enter seed repo (https://github.com/khuedoan/homelab):
Enter time zone (Asia/Ho_Chi_Minh):
Enter your Terraform Workspace, skip if you don't want to use external resources yet (khuedoan):
Traceback (most recent call last):
  File "./configure.py", line 27, in <module>
    def find_and_replace(pattern: str, replacement: str, paths: list[str]) -> None:
TypeError: 'type' object is not subscriptable
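
A likely cause, assuming the script was run with an interpreter older than Python 3.9: built-in generics such as list[str] only became subscriptable in Python 3.9 (PEP 585), so the annotation fails as soon as the def statement is evaluated. A minimal sketch of one workaround uses typing.List, which also loads on older interpreters. The function body here is a placeholder for illustration, not the project's implementation:

```python
from typing import List


def find_and_replace(pattern: str, replacement: str, paths: List[str]) -> None:
    """Placeholder body; only the signature matters for this illustration."""
    # typing.List[str] is accepted by every Python 3 release that ships typing,
    # whereas list[str] raises TypeError at definition time before Python 3.9.
    print(f"would replace {pattern!r} with {replacement!r} in {paths}")


find_and_replace("khuedoan.com", "example.com", ["bootstrap", "platform"])
```

Running the script with Python 3.9 or newer should avoid the error without any code change.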

Rocky 8.6 kickstart stalls with root password issue

Describe the bug

Deploying with Rocky 8.6 stalls, complaining that the root account is disabled.

  • I have read the document

To reproduce

Steps to reproduce the behavior:

  1. Run make tools, make configure, supply info for your environment
  2. Run make
  3. During kickstart, the installation stalls with "Root account is disabled"

Expected behavior

Kickstart should continue normally, without user intervention.

Screenshots

image

Additional context

You can move past this error by selecting option 8, providing a password, and then selecting b to begin installation. Installation will complete, but the user must also press Enter once it finishes. Supplying a root password in the kickstart template below appears to allow the installation to complete as expected.

/homelab/blob/master/metal/roles/pxe_server/templates/kickstart.ks.j2

using the following line:

rootpw --iscrypted <hashed password>

where hashed password is created using something like:

python -c 'import crypt,getpass;pw=getpass.getpass();print(crypt.crypt(pw) if (pw==getpass.getpass("Confirm: ")) else exit())'

how do you manage Data in case of cluster rebuild

Hi, I'm new to Kubernetes and have some questions about restoration and secret management, if you could help me:

  • If the cluster is rebuilt, how do I restore database data or reuse a persistent volume?
  • If I uninstall and reinstall a Helm release, it generates a new secret that no longer matches the database's current password. How do you manage this?
  • I could perhaps create a custom secret resource to avoid this, but what is the best way to create one?

Seafile is non-functional. Updating to newest release resolves issue.

After setting up your project on my own system I found that Seafile was throwing an HTTP error when I navigated to the URL. I believe it was 500, but I did not record it.

Pod logs showed that the Seafile pod was not able to authenticate to the mariadb instance.

Updating the helm chart version, and adding SEAFILE_SERVER_HOSTNAME to the values.yaml resolved this.

The changes I made are in this PR to my fork of your repository.

Rewrite configure script in pure Python

Right now it's just a thin Python wrapper for a bunch of shell scripts, which is very ugly. New version will need:

  • Pure Python
  • Input validation (see #29 (comment) for a possible case)
  • Preferably idempotent

Additional features:

  • Timezone
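
The requirements above (pure Python, input validation, idempotence) could be sketched roughly as follows. This is only an illustration under stated assumptions: validate_domain, replace_in_file, and the DOMAIN_RE rule are hypothetical names and choices, not code from the repository.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical validation rule: dot-separated labels that do not start or end with "-".
DOMAIN_RE = re.compile(
    r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)(\.(?!-)[A-Za-z0-9-]{1,63}(?<!-))+$"
)


def validate_domain(value: str) -> str:
    """Return the domain unchanged, or raise ValueError if it looks malformed."""
    if not DOMAIN_RE.match(value):
        raise ValueError(f"not a valid domain: {value!r}")
    return value


def replace_in_file(path: Path, old: str, new: str) -> bool:
    """Replace old with new in one file; return True only if the file changed."""
    text = path.read_text()
    updated = text.replace(old, new)
    if updated == text:
        return False  # nothing to do, so a repeated run leaves the file untouched
    path.write_text(updated)
    return True


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "values.yaml"
    sample.write_text("host: git.khuedoan.com\n")
    domain = validate_domain("example.com")
    print(replace_in_file(sample, "khuedoan.com", domain))  # first run changes the file
    print(replace_in_file(sample, "khuedoan.com", domain))  # second run is a no-op
```

Note that a repeated run is only a no-op when the new value does not itself contain the old one.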

"Bring your own cluster" mode

Hey there!

I've got a Kubernetes cluster I've deployed myself, and am not hugely keen on PXE booting. It would be super awesome if there was a way to use this project while skipping past the PXE boot and provisioning stuff.

In fact, there basically is already a way! Copy your kube configs into place, then...

make -C bootstrap instead of make, and bam... you've got a khuedoan homelab, kind of.

Maybe you could have a target like "make existing-cluster" or something which officially supports this deployment method?

Paramiko Error Reading SSH Protocol Banner

Nodes don't seem to connect. I am probably doing something wrong as I am still learning Ansible.

Does this mean the PXE boot into Rocky has completed?

In the file ./metal/group_vars/all.yml do I need to generate the ed25519 ssh key?

image

Thanks for any info!

Permission errors with tools container image on ubuntu host

Hi, huge fan of this project! I've been studying it for a few evenings to understand how I might adapt this for my home lab which is currently 2 Lenovo ThinkCentre M700's. My workstation OS is Ubuntu and I've been having some issues with the tools container, it seems while it'll boot into the container as the same user, etc, the container then has no permissions to a bunch of files around my system.

Some of the issues I've run into so far include:

  • Issues accessing ~/.ssh/config and keys
  • "Destination is not writable" when ansible does the get_url on the rocky linux ISO to the metal/roles/pxe_server/files/data/iso directory.

I've managed to get through some of these so far by converting the Dockerfile over to an Ubuntu base image and getting all the dependencies set up, but the Ansible issue above is still causing problems. I'm just curious, are you running on Arch like the container image is? I'm wondering if there are some differences between Ubuntu and Arch which might be causing the problem and will give me somewhere to look.

If so, it might also be worth flagging this in the docs / readme that "YMMV if you're not on arch". I'm going to continue trying to get things working, but wanted to check my assumption of your workstation OS first. If that is the problem, I might look at trying it in an Arch VM to dig a bit deeper.

Latest archlinux image breaks 'make tools'

Describe the bug

Running make tools as of May 31, 2022 will bring archlinux:latest tagged with 11bc1c5d6e6e (in Docker Hub).

  • I have read the document

To reproduce

Steps to reproduce the behavior:

  1. Run make tools which will bring :latest tagged with 11bc1c5d6e6e from Docker Hub.
  2. Error will be:
$ make tools
make -C tools
make[1]: Entering directory '[edited]/github/homelab/tools'
Sending build context to Docker daemon  4.608kB
Step 1/5 : FROM archlinux
 ---> c656f9acae22
Step 2/5 : RUN pacman --sync --refresh --noconfirm     reflector     && reflector     --save /etc/pacman.d/mirrorlist     --protocol https     --latest 20     --sort rate
 ---> Running in d6d6dbb88b5e
:: Synchronizing package databases...
 core downloading...
 extra downloading...
 community downloading...
resolving dependencies...
looking for conflicting packages...

Packages (4) gdbm-1.23-1  libnsl-2.0.0-2  python-3.10.4-2  reflector-2021.11-3

Total Download Size:   12.05 MiB
Total Installed Size:  55.23 MiB

:: Proceed with installation? [Y/n] 
:: Retrieving packages...
 python-3.10.4-2-x86_64 downloading...
 gdbm-1.23-1-x86_64 downloading...
 reflector-2021.11-3-any downloading...
 libnsl-2.0.0-2-x86_64 downloading...
checking keyring...
downloading required keys...
:: Import PGP key 59E43E106B247368, "Leonidas Spyropoulos <[email protected]>"? [Y/n] 
checking package integrity...
loading package files...
checking for file conflicts...
:: Processing package changes...
installing gdbm...
installing libnsl...
installing python...
Optional dependencies for python
    python-setuptools
    python-pip
    sqlite [installed]
    mpdecimal: for decimal
    xz: for lzma [installed]
    tk: for tkinter
installing reflector...
Optional dependencies for reflector
    rsync: rate rsync mirrors
:: Running post-transaction hooks...
(1/2) Reloading system manager configuration...
  Skipped: Current root is not booted.
(2/2) Arming ConditionNeedsUpdate...
/usr/bin/python3: /usr/lib/libc.so.6: version `GLIBC_2.34' not found (required by /usr/bin/python3)
/usr/bin/python3: /usr/lib/libm.so.6: version `GLIBC_2.35' not found (required by /usr/lib/libpython3.10.so.1.0)
/usr/bin/python3: /usr/lib/libc.so.6: version `GLIBC_2.34' not found (required by /usr/lib/libpython3.10.so.1.0)
The command '/bin/sh -c pacman --sync --refresh --noconfirm     reflector     && reflector     --save /etc/pacman.d/mirrorlist     --protocol https     --latest 20     --sort rate' returned a non-zero code: 1
make[1]: *** [Makefile:8: build] Error 1
make[1]: Leaving directory '[edited]/github/homelab/tools'
make: *** [Makefile:29: tools] Error 2

Expected behavior

A clean build. I tried the tag base-20220515.0.56491 (the tag immediately before :latest) and the build worked fine.

Additional context

Running on Arch Linux 5.18.1-arch1-1 and Docker 20.10.16 .

Building the tools containers to the test VM for the dev environment

When I run make tools to build the image based on the Arch distribution, I get this error and the image build stops. I have tried troubleshooting, without results.

Step 2/4 : RUN pacman --sync --refresh --noconfirm reflector && reflector --save /etc/pacman.d/mirrorlist --protocol https --latest 20 --sort rate
---> Running in 4687b34d956a
error: failed to initialize alpm library:
(root: /, dbpath: /var/lib/pacman/)
could not find or read directory

Make configure does not start

Describe the bug

After running the make tools command, the mounted directory is empty and "make configure" gives the error below:

Successfully tagged homelab-tools:latest
root@homelabdev:/home/xxx/dev/homelab# make configure
make: *** No rule to make target 'configure'.  Stop.
root@homelabdev:/home/xxx/dev/homelab# ls
root@homelabdev:/home/xxx/dev/homelab# cd ..
root@homelabdev:/home/xxx/dev# ls
homelab
  • I have read the document

To reproduce

Steps to reproduce the behavior:

  1. Run make tools
  2. wait for the container to build.
  3. run make configure
  4. See error

Expected behavior

The configure process should start

Additional context

Ubuntu 20.04.1 (vmware fusion on intel macOS Monterey)

dev0 is unreachable

I've been trying to run my fork of this amazing project, but so far I've had no luck. After fixing a few bumps with Docker, I'm now stuck at running the dev inventory via make dev. I assume we run make dev after the tools container becomes interactive (by running make tools). Then:

Which interface should be used to configure the metal/inventories/dev.yml file? I tried every interface I could find on my host machine after running ipconfig outside the tools container, but they all failed with a connection timeout or connection refused on port 22. Next I tried adding the private SSH key for Ansible (~/.ssh/id_ed25519) inside the tools container, which did not help.

Error

ansible-playbook \
	--inventory inventories/dev.yml \
	cluster.yml

PLAY [Create Kubernetes cluster] **************************************************************************************************

TASK [Gathering Facts] ************************************************************************************************************
fatal: [dev0]: UNREACHABLE! => {
    "changed": false,
    "unreachable": true
}

MSG:

Failed to connect to the host via ssh: ssh: connect to host <an inet from ipconfig result> port 22: Connection timed out

PLAY RECAP ************************************************************************************************************************
dev0                       : ok=0    changed=0    unreachable=1    failed=0    skipped=0    rescued=0    ignored=0   
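
Before involving Ansible, it can help to confirm that the address in the inventory accepts TCP connections on port 22 at all. The following is a minimal sketch of such a check; can_connect is a hypothetical helper, and the address in the usage line is a placeholder from the documentation range, not a value from this report:

```python
import socket


def can_connect(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers timeouts, refused connections, and unreachable hosts
        return False


# 192.0.2.10 is a placeholder; substitute the address from the inventory file.
print(can_connect("192.0.2.10", 22, timeout=1.0))
```

A False result from inside the tools container but True from the host would point at container networking rather than at the target machine.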

Question Regarding Node Booting and Installation Loop

Forgive me, as this seems like a novice question, but I was able to get a small display hooked up to one of my nodes to review the PXE booting. To my surprise, I had successfully configured my router and modified the repo. But I am not familiar with the OS installation process.

My installation seems to run over and over again.

Installing boot loader
..
Performing post-installation setup tasks
.
Configuring installed system
................
Writing network configuration
.
Creating users
Configuring addons
Executing com_redhat_kdump addon
Executing org_fedora_oscap addon
..
Generating initramfs
...
Storing configuration files and kickstarts
.
Running post-installation scripts

Then it reboots, loads rocky from PXE, and begins the process again.

Does it install to the NVMe drive every time it boots? Am I supposed to disable the PXE server and boot into the OS after installation? Perhaps one of my generated boot files isn't correct; where should I look? Thanks for all your help!

Unable to recognize "STDIN": no matches for kind "ServiceMonitor"...

Hey,

I've set up my VMs in proxmox, in the same way documented in #17.

I am, however, running into the following when running the make command:

service/argocd-redis unchanged
service/argocd-argocd-applicationset unchanged
deployment.apps/argocd-application-controller unchanged
deployment.apps/argocd-repo-server unchanged
deployment.apps/argocd-server unchanged
deployment.apps/argocd-dex-server unchanged
deployment.apps/argocd-redis unchanged
deployment.apps/argocd-argocd-applicationset unchanged
ingress.networking.k8s.io/argocd-server unchanged
unable to recognize "STDIN": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "STDIN": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "STDIN": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "STDIN": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
unable to recognize "STDIN": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
customresourcedefinition.apiextensions.k8s.io/applications.argoproj.io condition met
customresourcedefinition.apiextensions.k8s.io/applicationsets.argoproj.io condition met

This continues to run okay, but leaves me with only the following in the get-status script:

NAME        AGE
apps        104m
bootstrap   104m
platform    104m
system      104m
No resources found in argocd namespace.
NAMESPACE   NAME            CLASS   HOSTS                ADDRESS   PORTS     AGE
argocd      argocd-server   nginx   argocd.ashmcbri.de             80, 443   104m

The remainder don't get set up, presumably because of these errors.

Bare-metal provisioning

Hey, this is just a proposal/question.

Have you considered using tinkerbell for bare-metal? I like the solution you have right now, but, I was wondering if there is a specific reason you didn't choose any of the existing solutions out there already.

Node Storage Size Question

I have just started reading through your docs and they're great, so first off, thanks for all the work putting this together for everyone to use. I was wondering whether the 128GB disk size for the nodes is a hard requirement, or whether a node could function with a smaller disk. I don't have a specific target size, but something smaller than 128GB would better fit my use case.

iso_url and checksum have changed because of Rocky 8.6

Describe the bug

The iso_url and iso_checksum values defined in https://github.com/khuedoan/homelab/blob/master/metal/roles/pxe_server/defaults/main.yml are no longer valid as Rocky 8.6 is now available. The link https://download.rockylinux.org/pub/rocky/8/isos/x86_64/Rocky-8.5-x86_64-minimal.iso results in 404.

To reproduce

Steps to reproduce the behavior:

  1. Fork repository
  2. Run make tools, then run make configure providing information specific to your environment, commit changes.
  3. Run make tools, then run make to begin deployment process.
  4. When attempt is made to fetch Rocky image, playbook errors on not found.

Expected behavior

Download of Rocky image would be successful and deploy would complete as expected.

Additional context

I see two potential solutions:

Use newer Rocky 8.6 iso_url and iso_checksum:

iso_url: "https://download.rockylinux.org/pub/rocky/8/isos/x86_64/Rocky-8.6-x86_64-minimal.iso"
iso_checksum: "sha256:a9ece0e810275e881abfd66bb0e59ac05d567a5ec0bc2f108b9a3e90bef5bf94"

or update iso_url and iso_checksum to point to the old Rocky 8.5 ISO at its new location:

iso_url: "https://download.rockylinux.org/pub/rocky/8.5/isos/x86_64/Rocky-8.5-x86_64-minimal.iso"
iso_checksum: "sha256:4eb2ae6b06876205f2209e4504110fe4115b37540c21ecfbbc0ebc11084cb779"
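
Whichever pair is chosen, a downloaded ISO can be checked against the "<algorithm>:<hex digest>" format used above before committing the change. A minimal sketch, assuming the file is already on disk; matches_checksum is a hypothetical helper and the sample file is a stand-in, not a real ISO:

```python
import hashlib
import tempfile
from pathlib import Path


def matches_checksum(path: Path, checksum: str, chunk_size: int = 1 << 20) -> bool:
    """Check a file against a checksum written as '<algorithm>:<hex digest>'."""
    algorithm, _, expected = checksum.partition(":")
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        # Read in chunks so a large ISO is never held in memory all at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected.lower()


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.iso"
    sample.write_bytes(b"not a real ISO")
    good = "sha256:" + hashlib.sha256(b"not a real ISO").hexdigest()
    print(matches_checksum(sample, good))
```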

Inefficient disk usage by OS

I am not sure if this can be fixed during installation, as I am not very well versed in ansible/rocky/lvm.

I found that the root directory of the nodes is limited to 70GB in size in all nodes I deployed - two with 256GB drives and one with a 512GB drive - and the /home directory is given the majority of the disk space. As longhorn uses the root volume for PVCs this leaves the majority of the disk unusable by longhorn.

I was able to resolve this manually by running the following to recreate the volume for /home with a smaller size and granting the newly available disk space to the root volume.

# Install xfsdump and back up /home
yum -y install xfsdump
xfsdump -f /tmp/home.dump /home

# Unmount /home and remove its logical volume
umount /home
lvremove /dev/mapper/<vg name>-home

# Create a new volume for /home, limited to 5GB, and restore the backup to it
lvcreate -L 5G -n home <vg name>
mkfs.xfs /dev/mapper/<vg name>-home
mount -a
xfsrestore -f /tmp/home.dump /home

# Extend the root volume to use the newly available space
lvextend -l +100%FREE /dev/<vg name>/root
xfs_growfs /dev/mapper/<vg name>-root

Gitea SSH clone doesn't work with Cloudflare Tunnel using the same subdomain

The following command works in LAN but not from the internet:

git clone [email protected]:ops/homelab.git

  • Cloudflare Tunnel config:
    ingress:
      # It is safe to put a wildcard here
      # Please see https://homelab.khuedoan.com/reference/faq.html#is-it-safe-to-use-wildcard-in-cloudflare-tunnel-ingress-config
      - hostname: '*.khuedoan.com'
        service: https://ingress-nginx-controller.ingress-nginx
        originRequest:
          noTLSVerify: true
      - service: http_status:404
  • NGINX TCP config:
    tcp:
      22: gitea/gitea-ssh:22

Kickstart script curl request fails due to case-sensitive MAC address

Describe the bug

My PXE controller is a Linux (Arch Linux) box and the kickstart script download from it is failing early in the setup, e.g.

curl: (22) The requested URL returned error: 404 Not Found
Warning: anaconda: failed to fetch kickstart from http://192.168.x.x/init-config/84:xx:xx:xx:a0:dc.ks

NOTE: The 'x' and 'xx' values are for privacy (i.e. not actual values).

  • I have read the document

To reproduce

I suspect this happens when the controller uses a case-sensitive filesystem.

Expected behavior

That the .ks script would be downloaded from the controller without any intervention.

Additional context

As a workaround, I simply copied the generated .ks script to match the case used by curl, e.g.

$ cp ./metal/roles/pxe_server/files/data/init-config/84:xx:xx:xx:A8:DC.ks ./metal/roles/pxe_server/files/data/init-config/84:xx:xx:xx:a8:dc.ks
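
The workaround above amounts to making the generated file name match the lowercase MAC address that the installer requests (as seen in the 404 URL in this report). A minimal sketch of that normalization follows; kickstart_filename and MAC_RE are hypothetical names, and this is not how the project's Ansible role is written:

```python
import re

# Six colon-separated pairs of hexadecimal digits, in either case.
MAC_RE = re.compile(r"^[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}$")


def kickstart_filename(mac: str) -> str:
    """Return '<lowercase mac>.ks' for a colon-separated MAC address."""
    mac = mac.strip()
    if not MAC_RE.match(mac):
        raise ValueError(f"not a colon-separated MAC address: {mac!r}")
    return f"{mac.lower()}.ks"


print(kickstart_filename("84:AB:CD:EF:A8:DC"))  # prints 84:ab:cd:ef:a8:dc.ks
```

Lowercasing at generation time would make the result independent of how the MAC address was typed into the inventory.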

Localhost Not a Valid IP Address During "pxe_server" Task

Following the instructions in the documentation went smoothly. I updated all Ansible hosts and launched the homelab-tools container on a Debian machine (supports --host networking). But when running make -C metal I get the following error. I have tried adding "-c local" to the ansible-playbook boot.yml command, to no avail. Thanks for any insights.

image

Disable Vault dev mode

For production usage:

  • Persistent volume
  • Auto init
  • Auto unseal

Required for v0.1.0 beta release.

Raspberry Pi Support

Hello. Do you know if anyone has successfully used this project with raspberry pi's? If not, any thoughts on (at a high level) what kinds of changes might be needed to support them?

Thanks!

Proposal: Add option to use DNSMasq

Good day.

Issue statement

The use of dhcpd is great for air-gapped solutions where a new DHCP server is required. However, for home networks that do not have VLAN capability, or for users who would like to use their router's DHCP service, dhcpd results in duplicate DHCP servers, which can disrupt the network or limit the ability to auto-provision the metal stage of this project.

Proposed Solution: DHCP Proxy

Use a DHCP proxy service to add PXE features such as next-server to this project. This lets users keep existing DHCP servers, which may be locked down or unable to serve next-server/PXE settings, while still auto-provisioning hardware through PXE (with certain common configurations, such as static IP allocation or a reduced DHCP range on the DHCP server).

Proposed Application: DNSMasq

DNSMasq in Proxy mode interoperates with existing DHCP servers over IPv4 to add features such as next-server, TFTP, etc. where such hardware is either locked or unconfigurable for such services. This would be an opt-in change, configurable through the pxe_server defaults file.

Proposed Target Audience

Users who either do not want to create their own VLAN or lack the hardware to configure such services. Users who want to use common router services for DHCP and have router access to configure static IP and/or DHCP allocation ranges.

Additional Risks with Proposed Change

  • Additional Surface Area for Break-Out Attacks: Originally this project is locked to its own DHCP/VLAN, so any break-outs should be contained accordingly. Using common home networks increases the surface area of break-out attacks if the deployment is compromised.
    • Mitigation: Enrolment into this change is opt in only.

Proposed Next Steps

  1. Trial/Adopt/Halt - A discussion with all or a decision by the project maintainers to identify if this change should exist in this project or live on a fork.
  2. Documentation (This is in flight in any situation).

Automated offsite backup

  • Evaluate object storage providers (cost is a major factor, S3 compatible is a plus):
  • Evaluate backup tool:
    • Built-in backup feature in Longhorn (doesn't support multiple destinations)
    • K8up
    • Velero
  • Implement the above
  • Update docs

Vagrant Error | The following settings shouldn't exist: disk

I am trying to create the VMs with the command

VAGRANT_CWD=./metal vagrant up

but I get this error. Any ideas? I can remove the reboot setting, but what about disk? :/

There are errors in the configuration of this machine. Please fix
the following errors and try again:

vm:
* The following settings shouldn't exist: disk

shell provisioner:
* The following settings shouldn't exist: reboot

Can't bind to port 80 because it's already in use

I've just launched the script with my own configuration.

I didn't change anything, but I'm getting an error in the PXE logs:

http_1  | nginx: [emerg] bind() to [::]:80 failed (98: Address in use)
http_1  | 2022/02/13 21:07:24 [notice] 1#1: try again to bind() after 500ms
http_1  | 2022/02/13 21:07:24 [emerg] 1#1: bind() to 0.0.0.0:80 failed (98: Address in use)

Yet I don't have any nginx running, and nothing is listening on port 80 other than the one set up when running make.

Do you have any idea what the issue could be?

Wakeonlan works well but then the pxe install doesn't work.

Running the make

This is what's showing on the ports when i'm running the make command

capture_httpbug
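
One way to narrow down an "Address in use" error like the one reported here is to test, before starting the PXE containers, whether the port can be bound at all. A minimal sketch, assuming it runs on the same host as the containers; port_is_free is a hypothetical helper:

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:  # EADDRINUSE, or EACCES for privileged ports without root
            return False
    return True


print(port_is_free(80))
```

Because binding to port 80 normally needs elevated privileges, a False result from an unprivileged user does not by itself prove that another process holds the port.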

dhcpd.conf template does not render

I'm working through your netboot solution, and I have an interesting issue where dhcpd.conf.j2 does not render properly (no variables are substituted). All of the other templates work just fine, and the variables used in dhcpd.conf.j2 are available if I run a debug step:

# all variables are present with correct values
- debug:
    var: '{{item}}'
  with_items:
    - ansible_default_ipv4.address
    - ansible_default_ipv4.broadcast
    - ansible_default_ipv4.gateway
    - ansible_default_ipv4.netmask
    - ansible_default_ipv4.network

# no substitution occurs
- name: Render DHCP config
  template:
    src: dhcpd.conf.j2
    dest: "{{ role_path }}/files/dhcp/dhcpd.conf"
    mode: 0644

# substitution occurs correctly
- name: Render GRUB config
  template:
    src: grub.cfg.j2
    dest: "{{ role_path }}/files/data/boot/grub/grub.cfg"
    mode: 0644

Do you have any thoughts?

SSH tunnel as alternative to Argo

Hello, awesome project. Now I know what I want to have in my life 😊

I tried to ask you this on Reddit, but there are too many people there and you probably didn't see it 😅

Is it possible to run something like an email server, Minecraft, etc.? I didn't manage to do it with Argo tunnels.

So I have a proposal: add an SSH tunneling agent on the same layer as the Argo tunnels, so that people who need to self-host email or anything like that can do so.
It would also help users who don't like Cloudflare or the restrictions of the free Argo Tunnel.

(P.S. A side question about Jenkins: does this project have an alternative to Jenkins? 😄 Which one is it, and can I build AOSP on it?)

Thank you, awesome project ✌️

Feature proposal: automating alert and event response using robusta

Hi, I'd like to suggest using robusta (https://github.com/robusta-dev/robusta) for automating the response to alerts and other events in your cluster. It's a runbook automation platform for Kubernetes with built-in automations for common errors and quality of life automations like marking application updates on your Grafana graphs.

Robusta can also be installed with a built-in Prometheus/Grafana/alertmanager stack (based on kube-prometheus-stack) if you want an all in one monitoring solution.

I'm the maintainer and happy to answer any questions. It's worth pointing out that we also have a SaaS platform which probably is not relevant for this, but it's disabled by default and everything will work without it. (If you are interested though, it's free.)

Python configure.py throwing if nothing to replace

There seems to be an error thrown by Python when running find_and_replace function:

subprocess.CalledProcessError: Command '['git', 'grep', '--files-with-matches', 'https://github.com/khuedoan/homelab', '--', 'bootstrap', 'platform']' returned non-zero exit status 1.

def find_and_replace(pattern: str, replacement: str, paths: list[str]) -> None:
    files_with_matches = subprocess.run(
        ["git", "grep", "--files-with-matches", pattern, "--"] + paths,
        capture_output=True,
        text=True,
        check=True
    ).stdout.splitlines()

The bug seems to happen when no matches are found. (I replayed the configure.py just to change the domain, github repo was already the same)
I changed the check value to False to make it work again.
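
Setting check to False also hides genuine failures. An alternative sketch, assuming git grep follows the grep convention of exiting with status 1 when nothing matches (which is consistent with this report), treats only that status as "no files" and still raises on anything else. files_with_matches and its run parameter are hypothetical; the parameter exists only so the behaviour can be exercised without a repository:

```python
import subprocess
from typing import Callable, List


def files_with_matches(
    pattern: str,
    paths: List[str],
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> List[str]:
    """List files under paths that contain pattern, or [] when none do."""
    result = run(
        ["git", "grep", "--files-with-matches", pattern, "--"] + paths,
        capture_output=True,
        text=True,
        check=False,  # inspect the exit status ourselves instead of raising
    )
    if result.returncode == 1:
        return []  # no matches found, which is not an error for this use case
    result.check_returncode()  # any other non-zero status is a real failure
    return result.stdout.splitlines()
```

With this shape, re-running the configuration when a value is already in place simply finds no files to rewrite.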
