
WIP: Tink CRD Refactor

Tink CRD Refactor

Overview

Tinkerbell's backend is rooted in 3 Custom Resource Definitions (CRDs): Hardware, Workflow and Template. The CRDs were developed as part of the KRM proposal, which introduced a Kubernetes backend option to the Tinkerbell stack, and they mirrored the Postgres database schema (now deprecated and removed). Because the CRDs were a reflection of the Postgres schema, they inherited the schema's flaws. This proposal attempts to remediate those flaws by refactoring the CRDs.

Context

When users interact with Tinkerbell, the primary interface is kubectl and Tinkerbell's CRDs: Hardware, Workflow and Template. The CRDs are hard to understand because they contain duplicate fields, obsolete fields and unclear semantics, which in turn make the expected system behavior unclear.

Some specific issues with the CRDs are summarized as:

Hardware

  1. Network information is specified in both .Spec.Interfaces and .Spec.Metadata.Instance.Ips. Only .Spec.Interfaces is used.
  2. Disk information is specified in both .Spec.Disks and .Spec.Metadata.Instance.Storage. .Spec.Metadata.Instance.Storage is unused.
  3. Userdata can be specified in both .Spec.UserData and .Spec.Metadata.Instance.Userdata. .Spec.Metadata.Instance.Userdata is unused.
  4. .Spec.TinkVersion has no functional use.
  5. .Spec.Resources was intended for use in CAPT as part of its Hardware selection algorithm but is yet to be implemented.
  6. .Spec.Interfaces[].Netboot.{AllowPXE,AllowWorkflow} and .Spec.Metadata.Instance.AlwaysPxe are seemingly related but reside on different objects, and how they impact each other is unclear.
  7. .Spec.Metadata.Custom defines specific fields that are related to other parts of Hardware.
  8. .Status.State, .Spec.Metadata.Instance.State and .Spec.Metadata.State have unclear semantics. The .Spec state fields impact the machine provisioning process but nothing in the core Tinkerbell stack sets their values. .Status.State is unused.

Template

  1. Template defines a single field, .Spec.Data. The format of .Spec.Data is entirely ambiguous, requiring the user to understand implementation detail. In summary, .Spec.Data is composed of a list of tasks that can run on different machines; a task is composed of a list of actions that perform a function such as streaming a raw image. The multi-machine capability has no known use-cases.
  2. The Template object's .Status field is unused.

Workflow

  1. The .Spec.HardwareMap historically defines a template value used to render a task's WorkerAddr. The WorkerAddr should be the MAC of the machine that should run the task. This creates a hard to understand relationship between Workflow and Template that users must understand to successfully execute Workflows.
  2. The .Status.GlobalTimeout is unused and its origin is unclear (status fields are typically populated by Kubernetes controllers to build understanding of the current object state).
  3. Actions leverage the WorkflowState type that is intended to describe the overall state of the Workflow.

Users resort to Q&A in the Tinkerbell Slack to determine what fields are required and how tweaking them impacts the system. The CRDs should be simple enough and sufficiently documented to aid users in understanding how they can manipulate the system.

Goals/Non-goals

Goals

  • To de-duplicate Tink custom resource definition fields and data structures.
  • To provide clear behavioral expectations when manipulating custom resources.
  • To remove obsolete fields and data structures.

Non-goals

  • To change the existing relationship between Tink and Rufio.
  • To introduce additional object status data typically found on the .Status field.

Proposal

Hardware

The Hardware object represents a machine that can run workflows and is equivalent to the Hardware that currently exists. However, unlike the existing Hardware, it does not define a .Status field. This is because Hardware is never reconciled and merely exists to hold data and logically represent a machine for the Tinkerbell stack.

The NetworkInterface object describes both DHCP and netboot configuration. IPv6 is unsupported in the existing and proposed data models.

The DisableNetboot and DisableDHCP fields give application layers operating above the Tinkerbell stack, such as CAPT, a way to toggle netboot and DHCP per interface. This makes the state-related fields that impact netboot behavior in the existing system superfluous.

The OSIE is referenced from the NetworkInterface object to facilitate use-cases where booting from different interfaces attached to the same machine requires different OSIEs.

The Instance data structure contains auxiliary data that is generally unused by the Tinkerbell stack. It is heavily stripped down relative to the existing data model, which will necessitate a change in Hegel because Hegel will no longer be able to serve all the endpoints it currently exposes. A separate proposal will address any need to expose arbitrary data from Hegel.

// Hardware is a logical representation of a machine that can execute Workflows.
type Hardware struct {
	HardwareSpec
}

type HardwareSpec struct {
	// NetworkInterfaces defines the desired DHCP and netboot configuration for a network interface.
	NetworkInterfaces NetworkInterfaces

	// StorageDevices is a list of storage devices that will be available in the OSIE.
	// Optional.
	StorageDevices []StorageDevice

	// Instance describes instance specific data.
	Instance Instance

	// BMCRef references a Rufio Machine object. It exists in the current API and will not be changed
	// with this proposal.
	BMCRef LocalObjectReference
}

// NetworkInterfaces maps a MAC address to a NetworkInterface.
type NetworkInterfaces map[MAC]NetworkInterface

// MAC is a Media Access Control address.
type MAC string

// NetworkInterface is the desired configuration for a particular network interface.
type NetworkInterface struct {
	// IP to configure during DHCP.
	IP string

	// Netmask to configure during DHCP.
	Netmask string

	// Gateway to configure during DHCP.
	Gateway string

	// Hostname to configure during DHCP.
	// Optional.
	Hostname string

	// VLAN ID to configure during DHCP.
	// Optional.
	VLANID int

	// Nameservers to configure during DHCP.
	Nameservers []string

	// Timeservers to configure during DHCP.
	// Optional.
	Timeservers []string

	// DisableNetboot disables netbooting for this interface.
	// Default false.
	DisableNetboot bool

	// DisableDHCP disables DHCP for this interface. Implies DisableNetboot.
	// Default false.
	DisableDHCP bool

	// OSIE references an OSIE object.
	OSIE LocalObjectReference
}

// StorageDevice describes a storage device path that will be present in the OSIE.
type StorageDevice string

// Instance describes instance specific data. Instance specific data is typically dependent on the
// permanent OS that a piece of hardware runs. This data is often served by an instance metadata
// service such as Tinkerbell's Hegel. The core Tinkerbell stack does not leverage this data.
type Instance struct {
	// Userdata is data with a structure understood by the producer and consumer of the data.
	Userdata string

	// Vendordata is data with a structure understood by the producer and consumer of the data.
	Vendordata string
}
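
The interaction between DisableDHCP and DisableNetboot can be illustrated with a minimal sketch. The helper functions dhcpEnabled and netbootEnabled, the MAC addresses and the IP values below are hypothetical and exist only for illustration; the types are trimmed copies of the proposed ones. The sketch assumes that "Implies DisableNetboot" means netboot is served only when both flags are false.

```go
package main

import "fmt"

// Trimmed copies of the proposed types, limited to the fields this sketch uses.
type MAC string

type NetworkInterface struct {
	IP             string
	Netmask        string
	Gateway        string
	DisableNetboot bool
	DisableDHCP    bool
}

type NetworkInterfaces map[MAC]NetworkInterface

// dhcpEnabled reports whether DHCP should be served for the interface.
// Hypothetical helper, not part of the proposal.
func dhcpEnabled(n NetworkInterface) bool {
	return !n.DisableDHCP
}

// netbootEnabled reports whether netboot should be served for the interface.
// ASSUMPTION: DisableDHCP implies DisableNetboot, so both flags must be false.
func netbootEnabled(n NetworkInterface) bool {
	return !n.DisableDHCP && !n.DisableNetboot
}

func main() {
	// Example values only; none of these addresses come from the proposal.
	ifaces := NetworkInterfaces{
		"00:00:00:00:00:01": {IP: "192.0.2.10", Netmask: "255.255.255.0", Gateway: "192.0.2.1"},
		"00:00:00:00:00:02": {IP: "192.0.2.11", Netmask: "255.255.255.0", Gateway: "192.0.2.1", DisableNetboot: true},
		"00:00:00:00:00:03": {DisableDHCP: true},
	}

	// Iterate over a fixed slice because map iteration order is not deterministic.
	for _, mac := range []MAC{"00:00:00:00:00:01", "00:00:00:00:00:02", "00:00:00:00:00:03"} {
		n := ifaces[mac]
		fmt.Printf("%s dhcp=%t netboot=%t\n", mac, dhcpEnabled(n), netbootEnabled(n))
	}
}
```

Because both flags default to false, a zero-value NetworkInterface has DHCP and netboot enabled, which matches the "Default false" comments on the fields.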

OSIE

The OSIE object is a new CRD. It exists to ensure OSIE URLs can be re-used and easily updated across Hardware instances.

// OSIE describes an Operating System Initialization Environment. It is used by Tinkerbell
// to provision machines and should launch the Tink Worker component.
type OSIE struct {
	Spec OSIESpec
}

type OSIESpec struct {
	// KernelURL is a URL to a kernel image.
	KernelURL string

	// InitrdURL is a URL to an initrd image.
	InitrdURL string

	// KernelParams defines a set of key-value pairs that are passed to the Kernel on boot.
	// E.g. map[string]string{"console": "ttyS0,9600"} is passed as console=ttyS0,9600.
	// Optional.
	KernelParams map[string]string

	// IPXEScriptURL is a URL to an iPXE script served during netboot in place of the default iPXE scripts.
	// Optional.
	IPXEScriptURL string
}
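
As a rough sketch of how KernelParams might be turned into a kernel command line, the hypothetical helper kernelArgs below joins each entry as key=value. The function name and the choice to sort keys are assumptions made for illustration; the proposal only states that each pair is passed in key=value form.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// kernelArgs renders KernelParams as space separated key=value pairs.
// Keys are sorted because Go map iteration order is not deterministic.
// Hypothetical helper, not part of the proposal.
func kernelArgs(params map[string]string) string {
	keys := make([]string, 0, len(params))
	for k := range params {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	args := make([]string, 0, len(keys))
	for _, k := range keys {
		args = append(args, k+"="+params[k])
	}
	return strings.Join(args, " ")
}

func main() {
	// Prints: console=ttyS0,9600
	fmt.Println(kernelArgs(map[string]string{"console": "ttyS0,9600"}))
}
```

One property of a map worth keeping in mind: each key can appear only once, so a parameter that is repeated on a kernel command line cannot be expressed with this field as defined.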

Template

The Template defines a list of actions to be executed on a single machine. This is a simplification of the existing Template, which includes the concept of tasks that can be run on different machines.

Fields that no longer feature on actions include:

  • OnTimeout
  • OnFailure
  • Pid

Every field within a Template will support template values that conform to Go's template language. See https://pkg.go.dev/text/template for further explanation.

Templates have no .Status property as they are not reconciled.
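
To make the template value support concrete, here is a minimal sketch that renders one string field with Go's text/template package. The helper renderField, the data keys registry and tag, and the image name are hypothetical; the proposal does not yet define which data is exposed during rendering. The missingkey=error option is a choice made for this sketch so that a reference to absent data fails rather than rendering a placeholder.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// renderField renders a single Template field with Go's text/template.
// missingkey=error makes references to absent map keys fail instead of
// rendering "<no value>". Hypothetical helper, not part of the proposal.
func renderField(field string, data map[string]any) (string, error) {
	tmpl, err := template.New("field").Option("missingkey=error").Parse(field)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, data); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	// ASSUMPTION: illustrative data only; the exposed data shape is undefined.
	data := map[string]any{"registry": "registry.example.com", "tag": "v1"}

	image, err := renderField("{{ .registry }}/image2disk:{{ .tag }}", data)
	if err != nil {
		panic(err)
	}
	fmt.Println(image) // registry.example.com/image2disk:v1
}
```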

// Template defines a set of actions to be run on a target machine. The template is rendered
// prior to execution where it is exposed to Hardware and user defined data. All fields within
// TemplateSpec may contain template values. See https://pkg.go.dev/text/template for more details.
type Template struct {
	Spec TemplateSpec
}

type TemplateSpec struct {
	// Actions defines the set of actions to be run on a target machine. Actions are run sequentially
	// in the order they are specified. At least 1 action must be specified. Names of actions
	// must be unique within a Template.
	Actions []Action

	// Volumes to be mounted on all actions. If an action specifies the same volume it will take
	// precedence.
	// Optional.
	Volumes []Volume

	// Environment defines environment variables to be available in all actions. If an action
	// specifies the same environment variable it will take precedence.
	// Optional.
	Environment map[string]string
}

// Action defines an individual action to be run on a target machine.
type Action struct {
	// Name is a unique name for the action.
	Name string

	// Image is an OCI image.
	Image string

	// Command defines the command to use when launching the image.
	// Optional.
	Command string

	// Volumes defines the volumes to mount into the container.
	// Optional.
	Volumes []Volume

	// Environment defines environment variables used when launching the container.
	// Optional.
	Environment map[string]string
}

// Volume is a specification for mounting a volume in an action. Volumes take the form
// {VOLUME-NAME | HOST-DIR}:CONTAINER-DIR:OPTIONS. When specifying a VOLUME-NAME that does not exist
// it will be created for you.
//
// Examples
//
// Read-only bind mount
//   /etc/data:/data:ro
//
// Writable volume name bound to /data
//   shared_volume:/data
//
// See https://docs.docker.com/storage/volumes/ for additional details
type Volume string
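
The Volume format and the environment precedence rule can be sketched as follows. The ParsedVolume type and the parseVolume and mergeEnv helpers are hypothetical names introduced for illustration. The sketch assumes OPTIONS may be omitted (as in the shared_volume:/data example) and that precedence means an action's value replaces the Template's value for the same variable name.

```go
package main

import (
	"fmt"
	"strings"
)

// ParsedVolume holds the parts of a Volume string (hypothetical helper type).
type ParsedVolume struct {
	Source  string // VOLUME-NAME or HOST-DIR
	Target  string // CONTAINER-DIR
	Options string // e.g. "ro"; empty when omitted
}

// parseVolume splits {VOLUME-NAME | HOST-DIR}:CONTAINER-DIR[:OPTIONS].
// ASSUMPTION: OPTIONS is optional and neither of the first two parts is empty.
func parseVolume(v string) (ParsedVolume, error) {
	parts := strings.Split(v, ":")
	if len(parts) < 2 || len(parts) > 3 || parts[0] == "" || parts[1] == "" {
		return ParsedVolume{}, fmt.Errorf("invalid volume %q", v)
	}
	p := ParsedVolume{Source: parts[0], Target: parts[1]}
	if len(parts) == 3 {
		p.Options = parts[2]
	}
	return p, nil
}

// mergeEnv returns the Template environment overlaid with the action
// environment, so action values win when both define the same variable.
func mergeEnv(templateEnv, actionEnv map[string]string) map[string]string {
	merged := make(map[string]string, len(templateEnv)+len(actionEnv))
	for k, v := range templateEnv {
		merged[k] = v
	}
	for k, v := range actionEnv {
		merged[k] = v
	}
	return merged
}

func main() {
	v, err := parseVolume("/etc/data:/data:ro")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", v)

	// Variable names are illustrative only.
	env := mergeEnv(
		map[string]string{"LOG_LEVEL": "info", "MIRROR": "a"},
		map[string]string{"LOG_LEVEL": "debug"},
	)
	fmt.Println(env["LOG_LEVEL"], env["MIRROR"])
}
```

The proposal leaves open what makes two volumes "the same" when an action overrides a Template volume, so the sketch covers precedence only for environment variables.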

Workflow

// Workflow describes a set of actions to be run on a specific Hardware. Workflows execute
// once and should be considered ephemeral.
type Workflow struct {
	Spec   WorkflowSpec
	Status WorkflowStatus
}

type WorkflowSpec struct {
	// HardwareRef is a reference to a Hardware resource this workflow will execute on.
	// If no namespace is specified the Workflow's namespace is assumed.
	HardwareRef NamespacedReference

	// TemplateRef is a reference to a Template resource used to render workflow actions.
	// If no namespace is specified the Workflow's namespace is assumed.
	TemplateRef NamespacedReference

	// TemplateData is arbitrary user defined data that is available during template rendering.
	// The complete data structure should be marshalable using packages such as the standard
	// library json package.
	// Optional.
	TemplateData map[string]any

	// Timeout defines the time the workflow has to complete after the first action is requested.
	// When set to 0, no timeout is applied.
	// Optional.
	Timeout int32
}

// NamespacedReference is a cross namespace reference for an object. The object being referenced
// is context dependent.
type NamespacedReference struct {
	Name      string
	Namespace string
}

type WorkflowStatus struct {
	// Actions is the list of rendered actions and their status.
	Actions RenderedActions

	// State describes the current state of the workflow.
	State State

	// Result describes the result of the workflow.
	Result Result
}

// RenderedActions is a map of action name to ActionStatus.
type RenderedActions map[string]ActionStatus

// ActionStatus describes status information about an action.
type ActionStatus struct {
	// Rendered is the rendered action.
	Rendered Action

	// StartedAt is the time the action was started.
	StartedAt *metav1.Time

	// State describes the current state of the action.
	State State

	// Result describes the result of the action.
	Result Result

	// Reason is a human readable string describing the result of the action.
	Reason string
}

// State describes the point in time state of a workflow or action.
type State string

const (
	StatePending  State = "Pending"
	StateRunning  State = "Running"
	StateComplete State = "Complete"
)

// Result describes the result of a workflow or action.
type Result string

const (
	ResultSuccess Result = "Success"
	ResultFailure Result = "Failure"
	ResultTimeout Result = "Timeout"
)

Data available during template rendering

Custom functions

Impact to services

Tink Server and Worker

  • Proto contracts change.

To do

  • Cross namespace references in Workflows.
  • What protections should we provide for Console settings?
  • Are we OK using IPXEScriptURL for custom iPXE scripts?
  • Do we need anything else on Instance?
  • How will we support IPv6?
