Installing OKD on User-Provisioned Infrastructure: A Step-by-Step Guide

I. Introduction

A. Overview of OKD

OKD, formerly known as OpenShift Origin, is the community distribution of Kubernetes and serves as the upstream foundation for Red Hat's OpenShift Container Platform (OCP).1 It is an open-source collection of software components and design patterns optimized for continuous application development and multi-tenant deployment at scale.1 As the community-driven counterpart to OCP, OKD typically incorporates newer features and updates earlier, acting as a proving ground for technologies before they potentially enter the enterprise product.3

A key distinction lies in the underlying operating system and support model. OKD uses Fedora CoreOS (FCOS) as its immutable host operating system, particularly for control plane nodes.4 Unlike OCP, OKD does not come with commercial support from Red Hat or access to the full suite of Red Hat certified operators and middleware available through a subscription.3 Support relies on community channels like forums, mailing lists, and issue trackers.1

B. User-Provisioned Infrastructure (UPI) Installation

OKD offers flexibility in its installation methods. This guide focuses on the User-Provisioned Infrastructure (UPI) approach, sometimes referred to as a "Full Control" installation or as using platform: none or platform: baremetal in the configuration.9 In a UPI installation, the administrator assumes responsibility for preparing and managing all aspects of the underlying infrastructure. This includes provisioning physical servers or virtual machines, configuring networking (IP addressing, DNS, load balancing), setting up storage, and ensuring all prerequisites are met before initiating the OKD deployment.9

This contrasts with the Installer-Provisioned Infrastructure (IPI) method, where the openshift-install program automates the creation and configuration of the necessary infrastructure resources on supported platforms like AWS, GCP, or vSphere.9 The UPI method provides maximum control and customization but requires a deeper understanding of both OKD's requirements and the target infrastructure environment.

This guide details the steps for a UPI installation using the official openshift-install tool.2 It specifically excludes methods involving OpenShift Local (formerly CodeReady Containers or CRC), which provides a simplified, single-node OKD or OCP environment primarily for local development and testing on a workstation, often within a virtual machine managed by the crc tool.17 OpenShift Local is not intended for production or multi-node cluster deployments and has different setup procedures and limitations.18

C. Overview of the Installation Process

The UPI installation process using openshift-install follows a structured sequence of steps, relying heavily on Fedora CoreOS (FCOS) and its Ignition provisioning system.6 The major phases include:
Prerequisites: Ensuring the installer machine, target nodes (hardware, OS), networking (IPs, DNS, LBs), and tooling meet OKD's requirements.
Tool Acquisition: Downloading the correct versions of openshift-install, oc, and FCOS images.
Configuration: Creating the primary install-config.yaml file defining the cluster.
Manifests & Ignition: Generating Kubernetes manifests and then the node-specific Ignition configuration files (bootstrap.ign, master.ign, worker.ign).
Infrastructure Provisioning & Booting: Setting up the physical/virtual nodes and booting them with their respective Ignition configurations.
Bootstrapping: Monitoring the temporary bootstrap node as it establishes the permanent control plane.
Finalization: Waiting for all cluster operators to become stable and operational.
Verification: Accessing the cluster via CLI and web console to confirm successful installation.
II. Step 1: Finding Official Documentation and Release Information

Navigating the official resources correctly is the first step towards a successful OKD installation. Using mismatched documentation or tools is a common source of errors.24

A. Official Project Website

The main project website is okd.io.1 This site provides a high-level overview of OKD, its goals, community engagement links (like Slack channels and working groups), and pointers to getting started resources.1

B. Official Documentation Site

The definitive source for technical documentation is docs.okd.io.27 It is crucial to select the documentation version that precisely matches the OKD version intended for installation. The documentation is versioned (e.g., 4.16, 4.17, latest), and using guides from a different version can lead to incompatibilities or outdated procedures. Key sections for this guide include "Installing", particularly subsections related to "Bare Metal", "User-Provisioned Infrastructure", or "Installing on Any Platform".10

C. Official GitHub Repository and Releases

The primary code repository is okd-project/okd on GitHub.1 The crucial page for obtaining installation tools is the /releases sub-page: https://github.com/okd-project/okd/releases.2 This page hosts downloadable archives for the openshift-install binary and the oc (OpenShift CLI) client tools, specific to each tagged OKD release.2

Recent releases, particularly those based on CentOS Stream CoreOS (SCOS), might originate from the okd-project/okd-scos repository 1, with artifacts often mirrored or copied to the main okd/releases page for broader visibility.36 Always ensure the downloaded tools correspond to the target OKD installation version.

The documentation landscape for OKD is distributed across these three main resources: the project homepage (okd.io) for community and overview, the documentation site (docs.okd.io) for detailed technical procedures, and the GitHub repository (okd-project/okd) for code and release artifacts (install tools). Given the rapid development cycle and the differences between versions, consistently referencing the documentation and downloading tools that match the exact target OKD release version is paramount. Mixing versions (e.g., using the 4.16 installer with 4.17 documentation or FCOS image) is a frequent cause of installation failures, often manifesting in later stages with hard-to-diagnose errors.24 This guide assumes the user will select a target OKD version and consistently use the corresponding documentation and tools throughout.

III. Step 2: Meeting Installation Prerequisites

Before initiating the installation, a set of prerequisites must be met across the installer machine, the cluster nodes, networking, and tooling. Failure to meet these requirements often leads to installation failures that can be difficult to troubleshoot.39 The interdependent nature of these prerequisites means a failure in one area, such as DNS, can cascade and cause failures in subsequent steps, like node communication or API access. Therefore, thorough validation before starting the installation is highly recommended.

A. Installer Machine
Operating System: A machine running a recent version of Linux (like Fedora) or macOS is required to execute the openshift-install binary.16
Software: Standard utilities like tar, curl/wget, and an SSH client are needed. Depending on the specific infrastructure and methods used (especially if performing IPI or using local virtualization for testing), packages like libvirt, qemu-kvm, jq, and ipmitool might be necessary on the machine running the installer.43
Resources: Sufficient disk space is needed to store the downloaded tools, configuration files, logs, and potentially cached FCOS images. While the tools themselves are relatively small (around 500 MB 42), installation logs and generated assets require more space.
B. Cluster Node Hardware

A standard high-availability (HA) OKD cluster requires a minimum set of hosts 10:
1 Bootstrap Machine (Temporary): Used only during the installation process to deploy the control plane. It can be removed afterward. Must run FCOS.
3 Control Plane Machines (Masters): Run the core Kubernetes and OKD control plane services (API server, etcd, controller manager, etc.). Must run FCOS.
2+ Compute Machines (Workers): Run user application workloads. Typically run FCOS in UPI installations, but Fedora is also technically possible (though managing Fedora nodes falls outside the scope of OKD's automated FCOS updates).49
Three-node clusters, where the control plane nodes also run user workloads (mastersSchedulable=true) and there are zero dedicated compute nodes, are also a supported configuration, often for resource-constrained environments.55

The minimum resource requirements per node are critical for stability:
| Node Type | Operating System | Minimum vCPU/Cores | Minimum RAM (GB) | Minimum Disk (GB) | Minimum IOPS (Recommended) | References |
|---|---|---|---|---|---|---|
| Bootstrap | FCOS | 4 | 16 | 100 / 120¹ | 300 | 33 |
| Control Plane | FCOS | 4 | 16 | 100 / 120¹ | 300 (fast disk essential) | 33 |
| Compute | FCOS / Fedora | 2 | 8 | 100 / 120¹ | 300 | 33 |
¹ Note: Some documentation references 100GB 33, while others mention 120GB.16 Provisioning 120GB or more is safer.
CPU: 1 vCPU generally equates to 1 physical core if hyperthreading (SMT) is disabled.49 Specific CPU instruction set architectures (ISA) are required (e.g., x86-64-v2 for recent versions).49
Disk Performance: OKD, especially the etcd database running on control plane nodes, is highly sensitive to disk I/O performance. Fast storage (SSD/NVMe) with low latency (e.g., <10ms fsync for etcd) is strongly recommended.49 Insufficient disk performance can lead to cluster instability and timeouts.24 On cloud platforms where IOPS often scales with disk size, overprovisioning disk capacity might be necessary to achieve required performance.49
C. Network Requirements

Networking is often the most complex prerequisite area for UPI installations.
IP Addressing:

Each node (bootstrap, control plane, compute) requires a persistent IP address.
This can be achieved via:

DHCP: A DHCP server configured with static reservations mapping each node's network interface MAC address to a specific IP address.10 The DHCP server should also provide DNS server information.42
Static IP: Manually configuring the IP address, subnet mask, gateway, and DNS servers during the FCOS installation process (e.g., via kernel arguments or Ignition).10
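
As an illustration of the static-IP route, FCOS accepts dracut-style network kernel arguments at install time. The following sketch uses placeholder IP addresses, hostname, interface name, and Ignition URL; the full coreos-installer workflow is covered in Step 6:

# Placeholder values throughout: node IP, gateway, netmask, hostname, interface, DNS server, and Ignition URL.
sudo coreos-installer install /dev/sda \
  --ignition-url=http://192.168.1.10:8080/master.ign \
  --append-karg 'ip=192.168.1.21::192.168.1.1:255.255.255.0:master-0:ens192:none' \
  --append-karg 'nameserver=192.168.1.10'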

DNS Configuration: Correct DNS resolution is absolutely critical for cluster operation.10 Misconfiguration is a common failure point.64

Both forward (name-to-IP) and reverse (IP-to-name) lookups must work correctly for all cluster components.42
The following DNS records must be configured in the user's DNS zone before starting the installation:

| Record Type | Hostname Pattern | Target | Resolvable By | References |
|---|---|---|---|---|
| A/AAAA | api.<cluster_name>.<base_domain> | API Load Balancer VIP | Internal & External | 10 |
| A/AAAA | api-int.<cluster_name>.<base_domain> | API Load Balancer VIP | Internal Only | 42 |
| A/AAAA | *.apps.<cluster_name>.<base_domain> | Application Ingress Load Balancer VIP | Internal & External | 10 |
| A/AAAA | <node_type>-<index>.<cluster_name>.<base_domain> | Node IP Address | Internal Only | 10 |
| PTR | Node IP Address | Node FQDN | Internal Only | 42 |
| PTR | API LB VIP Address | api.<cluster_name>.<base_domain> | Internal Only | 42 |
| PTR | Apps LB VIP Address | (Typically not required) | - | |

  • <cluster_name> and <base_domain> are defined in install-config.yaml.
  • api and api-int are used for Kubernetes API access (external and internal, respectively).
  • *.apps is a wildcard record for exposing applications via OKD Routes.
  • <node_type>-<index> refers to individual nodes like bootstrap-0, master-0, worker-0.
  • DNS validation using tools like dig should be performed before installation.[42]
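
As a pre-flight check, the records above can be verified with dig before any node is booted. A minimal sketch, assuming a cluster named myupi under okd.example.com and placeholder IP addresses:

# Forward lookups: each should return the expected VIP or node IP
dig +short api.myupi.okd.example.com
dig +short api-int.myupi.okd.example.com
dig +short test.apps.myupi.okd.example.com   # any name under *.apps should resolve to the Ingress VIP
dig +short master-0.myupi.okd.example.com
# Reverse lookups: each should return the corresponding FQDN
dig +short -x 192.168.1.21                   # example node IP
dig +short -x 192.168.1.5                    # example API LB VIP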

Load Balancers: Two load balancers must be provisioned and configured.10 These can be hardware LBs, software LBs (like HAProxy), or cloud provider LBs configured manually.

| Load Balancer Name | Frontend Port(s) | Backend Port(s) | Backend Nodes (Initial Pool) | Mode Required | Persistence | Health Check Endpoint/Port (TCP Check Often Suffices) | References |
|---|---|---|---|---|---|---|---|
| API | 6443, 22623 | 6443, 22623 | Bootstrap, Control Plane x3 | Layer 4 (TCP/SSL Passthrough) | None / Stateless | API: /readyz on 6443; Machine Config: TCP 22623 | 10 |
| Application Ingress | 80 (HTTP), 443 (HTTPS) | 80, 443 | Compute (Worker) xN | Layer 4 (TCP/SSL Passthrough) | Session / Source IP¹ | /healthz/ready on 1936 (Ingress Controller health) | 10 |

  • Mode: Must operate at Layer 4 (TCP), not Layer 7 (HTTP/S). SSL termination should not happen at the LB (SSL Passthrough).[10, 33, 52, 68]
  • API LB Backend Pool: Initially includes the bootstrap node and all control plane nodes. The bootstrap node is removed after bootstrap completion.[10, 33, 52, 68]
  • Ingress LB Backend Pool: Includes all compute (worker) nodes where the Ingress Controller pods run (by default). In a 3-node cluster, this would be the control plane nodes.[68]
  • Persistence: The API LB should be stateless. The Ingress LB benefits from session persistence (e.g., source IP based) for better application performance.[10, 33, 52, 68]
  • Health Checks: Crucial for removing unhealthy nodes from the pool. Use the specified endpoints/ports or reliable TCP checks. API health check (/readyz) is particularly important for bootstrap.[10, 33, 52, 68]
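
For example, once the load balancer frontends are in place they can be probed from any machine on the network; the hostnames below are placeholders, and the backends only answer once the corresponding nodes are up:

curl -k https://api.myupi.okd.example.com:6443/readyz   # API health-check endpoint from the table
nc -zv api.myupi.okd.example.com 22623                  # Machine Config server: a plain TCP check suffices
curl -I http://test.apps.myupi.okd.example.com/         # Ingress frontend; expect an HTTP response once router pods run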

Network Connectivity & Ports:

Nodes require internet access for pulling container images (from quay.io for OKD) and potentially telemetry, unless a disconnected installation with a mirror registry is performed.10
Nodes need access to internal services: DNS servers, NTP servers (time synchronization is critical) 50, load balancers, and potentially an HTTP/HTTPS server hosting Ignition files.54
Firewalls between nodes or between nodes and required services must allow necessary traffic. Refer to OKD documentation for specific port requirements (e.g., ports 6443, 22623 for API/Machine Config; 80, 443 for Ingress; internal SDN ports like 4789 UDP).51
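
A few spot checks of these dependencies (run from a host on the machine network, or from a booted node) can save troubleshooting time later; the IP address is a placeholder:

curl -sI https://quay.io/v2/ | head -n 1   # outbound access to the OKD image registry
nc -zv 192.168.1.10 53                     # DNS server reachability
chronyc sources                            # on FCOS nodes: verify NTP sources are reachable and synchronized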

D. Required Tools

Ensure the following tools are available on the installer machine:
openshift-install binary: The core installation program for the target OKD version.2, 36
oc (OpenShift CLI) binary: The command-line client for interacting with the cluster post-installation. Must match the cluster version for full compatibility.16, 72
Fedora CoreOS (FCOS) Images: Access to download the appropriate FCOS image (ISO, PXE, VMDK, QCOW2, etc.) for the target infrastructure and OKD version.21, 73
SSH Client & Key Pair: An SSH client and a generated key pair (e.g., id_rsa or id_ed25519) for secure access to cluster nodes if needed for troubleshooting.10 The public key will be embedded in the Ignition configs.
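
If a key pair does not already exist, one can be generated as follows (the file path is just an example); the matching .pub content is what the installer asks for in Step 4:

ssh-keygen -t ed25519 -N '' -f ~/.ssh/okd_ed25519
# Optionally load the key into ssh-agent for convenience:
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/okd_ed25519
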
E. Infrastructure Capabilities

The chosen infrastructure must support:
Hardware: Provisioning of bare metal servers or VMs meeting the minimum CPU, RAM, and disk requirements.10
Booting & Provisioning: A method to boot the provisioned nodes using the chosen FCOS image (ISO, PXE, etc.) and reliably deliver the corresponding Ignition configuration file to each node upon its first boot.10, 76 This is a critical step detailed further in Step 6.
IV. Step 3: Downloading Installation Tools

With prerequisites understood, the next step is to acquire the necessary command-line tools and the base operating system image. Using versions of these tools that precisely match the target OKD release is essential for avoiding compatibility issues.16

A. Download openshift-install

The openshift-install binary orchestrates the creation of configuration files and, in IPI mode, the infrastructure itself. For UPI, it generates the manifests and Ignition files.
Navigate to OKD Releases: Go to the official OKD releases page on GitHub: https://github.com/okd-project/okd/releases.2
Select Version: Identify and click on the specific OKD release tag you intend to install (e.g., 4.17.0-okd-scos.0).
Download Archive: Under the "Assets" section for that release, find the openshift-install archive appropriate for your installer machine's operating system and architecture (e.g., openshift-install-linux-<version>.tar.gz or openshift-install-mac-<version>.tar.gz). Download this file.35 Example using curl:
OKD_VERSION="4.17.0-okd-scos.0" # Replace with your target version
OS_ARCH="linux" # Or "mac"
curl -L https://github.com/okd-project/okd/releases/download/${OKD_VERSION}/openshift-install-${OS_ARCH}-${OKD_VERSION}.tar.gz -o openshift-install.tar.gz

Extract Binary: Extract the archive:
tar zxvf openshift-install.tar.gz

Place in PATH: Move the extracted openshift-install binary to a directory included in your system's PATH environment variable (e.g., /usr/local/bin or ~/bin) and ensure it's executable 2:
sudo mv openshift-install /usr/local/bin/
sudo chmod +x /usr/local/bin/openshift-install

Verify: Check the installation: openshift-install version.
B. Download oc CLI Tool

The oc CLI is used for interacting with the OKD cluster after installation (managing resources, applications, users, etc.).
Use Same Release Page (Recommended): Navigate back to the same OKD GitHub release page used for openshift-install.36 Downloading oc from the same release bundle ensures version compatibility.16
Download Archive: Find the openshift-client archive for your OS and architecture (e.g., openshift-client-linux-<version>.tar.gz) under "Assets" and download it.36 Example:
# Use the same OKD_VERSION and OS_ARCH as before
curl -L https://github.com/okd-project/okd/releases/download/${OKD_VERSION}/openshift-client-${OS_ARCH}-${OKD_VERSION}.tar.gz -o oc.tar.gz

(Alternative, less recommended due to potential version mismatch: Use the mirror site https://mirror.openshift.com/pub/openshift-v4/clients/oc/latest/ or a version-specific directory there.71)
Extract Binaries: Extract the archive. This typically includes both oc and kubectl 44:
tar zxvf oc.tar.gz

Place in PATH: Move the extracted oc and kubectl binaries to a directory in your PATH and ensure they are executable 44:
sudo mv oc kubectl /usr/local/bin/
sudo chmod +x /usr/local/bin/oc /usr/local/bin/kubectl

Verify: Check the installation: oc version.
C. Obtain Fedora CoreOS (FCOS) Image

FCOS is the immutable operating system required for bootstrap and control plane nodes.5
Identify Required Image: The most reliable way to get the FCOS image compatible with your openshift-install version is to use the installer itself 35:
# Ensure openshift-install is in your PATH
openshift-install coreos print-stream-json

This command outputs JSON containing URLs for various FCOS artifacts (ISO, QCOW2, PXE kernel/initramfs/rootfs) specific to the installer's built-in release information. Parse this JSON (e.g., using jq) to find the URL for the image format you need. Example to get the bare metal ISO URL for x86_64:
ISO_URL=$(openshift-install coreos print-stream-json | jq -r '.architectures.x86_64.artifacts.metal.formats.iso.disk.location')
echo $ISO_URL

Using the image specified by the installer minimizes compatibility risks associated with FCOS versions and Ignition specifications.6
Download Image: Use the URL obtained in the previous step to download the required FCOS image file. Example using the $ISO_URL variable from above:
curl -L $ISO_URL -o fcos-metal.iso

(Alternative: Download from the FCOS website https://fedoraproject.org/coreos/download/.70 Select the appropriate stream (usually 'stable' or the one matching OKD docs), architecture, and image format (Bare metal ISO, PXE files, QEMU QCOW2, VMware OVA, etc.). This approach carries a slightly higher risk of version mismatch if not carefully cross-referenced with OKD compatibility notes.)
Verify Image (Optional but Recommended): The FCOS download page provides checksums (SHA256) and GPG signatures. Verify the downloaded image integrity and authenticity using tools like sha256sum and gpg.82
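
A minimal sketch of the integrity check, assuming the expected digest has been copied from the download page:

EXPECTED_SHA256="<sha256-value-from-the-download-page>"
echo "${EXPECTED_SHA256}  fcos-metal.iso" | sha256sum -c -
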
V. Step 4: Creating the Installation Configuration (install-config.yaml)

The install-config.yaml file serves as the primary input for the installation program, defining the fundamental parameters of the cluster. Getting this file correct is crucial, as errors here often lead to failures later in the process.55

A. Generating the Initial File

Create Installation Directory: Create a new, empty directory for this specific installation attempt. Do not reuse directories from previous failed or successful installations, as some generated assets like bootstrap certificates have short lifespans.16
mkdir ~/okd-install-upi
cd ~/okd-install-upi

Run create install-config: Execute the command within the new directory 16:

openshift-install create install-config --dir=.


Answer Prompts: The installer will prompt for basic information 16:

SSH Key: Select or provide the path to the public SSH key (e.g., ~/.ssh/id_ed25519.pub) to be used for accessing nodes.
Platform: Choose the target platform. For UPI on bare metal or generic VMs where you manage everything, select none. (Other options like aws, vsphere, gcp, baremetal trigger IPI logic or platform-specific UPI fields).
Base Domain: Enter the DNS base domain you configured (e.g., okd.example.com).
Cluster Name: Enter a name for the cluster (e.g., myupi). This will be prepended to the base domain for FQDNs.
Pull Secret: Paste the pull secret. For OKD UPI, use the dummy secret: {"auths":{"fake":{"auth":"aWQ6cGFzcwo="}}}.2

Note: Interactive mode might have issues on certain platforms or installer versions; manually creating the install-config.yaml based on the parameters below is always an option.89

B. Key Parameters for UPI (platform: none)

After the initial generation, review and edit the created install-config.yaml file. For a typical UPI installation, it should resemble the following structure, ensuring parameters match your infrastructure:

apiVersion: v1
baseDomain: okd.example.com # 1. Your configured base DNS domain
metadata:
  name: myupi # 2. Your chosen cluster name
platform:
  none: {} # 3. Indicates User-Provisioned Infrastructure
pullSecret: '{"auths":{"fake":{"auth":"aWQ6cGFzcwo="}}}' # 4. Dummy pull secret for OKD
sshKey: 'ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAI...' # 5. Content of your public SSH key
controlPlane: # 6. Control Plane (Master) node definition
  architecture: amd64
  hyperthreading: Enabled # Optional
  name: master
  replicas: 3 # Must be 3 for HA
compute: # 7. Compute (Worker) node definition
- architecture: amd64
  hyperthreading: Enabled # Optional
  name: worker
  replicas: 2 # Minimum 2 for HA (or 0 for 3-node cluster)
networking: # 8. Cluster network configuration
  clusterNetwork:
  - cidr: 10.128.0.0/14 # Pod network CIDR
    hostPrefix: 23
  machineNetwork:
  - cidr: 192.168.1.0/24 # CIDR matching your node infrastructure IPs
  networkType: OVNKubernetes # Or OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16 # Service network CIDR

Optional parameters below:

fips: false # Enable FIPS mode
publish: External # How to expose the cluster (External/Internal)
proxy: # Define HTTP/HTTPS proxy if needed
  httpProxy: http://<proxy_user>:<proxy_pass>@<proxy_ip>:<proxy_port>
  httpsProxy: https://<proxy_user>:<proxy_pass>@<proxy_ip>:<proxy_port>
  noProxy: example.com,.cluster.local
additionalTrustBundle: | # Add custom CA certs for proxy/registries
  -----BEGIN CERTIFICATE-----
  ...
  -----END CERTIFICATE-----

Explanation of Key Parameters:

  • baseDomain: Must match the DNS zone configured in prerequisites.16
  • metadata.name: Defines the cluster name used in hostnames and resources.16
  • platform.none: Specifies that the user is responsible for all infrastructure provisioning.57
  • pullSecret: For OKD, the provided dummy secret is sufficient unless pulling from other private registries.2
  • sshKey: The content of the public SSH key file (.pub) used for node access.16
  • controlPlane.replicas: Must be 3 for a high-availability cluster.
  • compute.replicas: Minimum of 2 for HA. Set to 0 for a 3-node cluster configuration where control plane nodes are schedulable.57
  • networking:
    • clusterNetwork: CIDR for Pod IPs. Should not overlap with existing networks.
    • serviceNetwork: CIDR for Service IPs. Should not overlap.
    • machineNetwork: Must contain the IP range(s) from which the cluster nodes (bootstrap, control plane, compute) get their IPs.57
    • networkType: Typically OVNKubernetes (default in recent versions) or OpenShiftSDN.

C. Backup the Configuration

Crucially, back up the customized install-config.yaml file to a safe location outside the installation directory. The openshift-install commands consume or modify this file during subsequent steps, and having a backup allows for easier restarts or recreation if needed.56

VI. Step 5: Generating Kubernetes Manifests and Ignition Configurations

With the install-config.yaml defining the cluster's core, the next steps involve transforming this high-level configuration into the detailed manifests and node-specific boot configurations needed by Kubernetes and FCOS.

A. Generate Kubernetes Manifests

The installer first converts the install-config.yaml into a collection of standard Kubernetes manifest files. These define various cluster resources and configurations.

Run create manifests: Execute the following command in the installation directory 91:

openshift-install create manifests --dir=.


Output: This creates a manifests/ subdirectory containing numerous YAML files (e.g., cluster-scheduler-02-config.yml, cvo-overrides.yaml) and an openshift/ subdirectory for injecting custom manifests.

Optional Customization Point: This stage offers an opportunity for advanced customization before the immutable Ignition configurations are generated.91 Users can modify the generated manifests in the manifests/ directory or add their own YAML files (like MachineConfig definitions) to the openshift/ directory. Common customizations include:

Making control plane nodes non-schedulable (by modifying manifests/cluster-scheduler-02-config.yml).58
Adding kernel arguments via a MachineConfig file placed in openshift/.92
Configuring custom NTP servers via a MachineConfig.92
Adjusting network operator configurations.67

Modifying these manifests requires a good understanding of OKD/Kubernetes internals, as incorrect changes can easily break the installation. It is generally recommended to avoid modifications unless necessary and well-understood.
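
If such a customization is genuinely needed, the kernel-argument example from the list above could be expressed as a MachineConfig manifest dropped into the openshift/ directory before generating Ignition configs. A minimal sketch; the file name, role label, and kernel argument are illustrative:

cat <<'EOF' > openshift/99-worker-kernel-args.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-worker-kernel-args
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  kernelArguments:
  - mitigations=auto,nosmt
EOF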

B. Generate Ignition Configurations

Ignition is the first-boot provisioning tool used by Fedora CoreOS.6 It reads a JSON configuration file (.ign) and configures the node accordingly, setting up storage, filesystems, systemd units, users, network configurations, etc. The openshift-install tool generates these Ignition files based on the install-config.yaml and the (potentially customized) manifests.
Run create ignition-configs: Execute the following command in the installation directory 10:

openshift-install create ignition-configs --dir=.

Output: This command consumes the install-config.yaml and the contents of the manifests/ and openshift/ directories. It generates the following critical files directly in the installation directory:

bootstrap.ign: The Ignition configuration for the temporary bootstrap node.94 It contains instructions to set up a minimal Kubernetes control plane, etcd, and an HTTP server to provide resources (including Ignition files) to the permanent control plane nodes.
master.ign: The Ignition configuration for the permanent control plane nodes.94 It instructs these nodes how to format disks, configure networking, and join the cluster initiated by the bootstrap node.
worker.ign: The Ignition configuration for the compute (worker) nodes.94 Similar to master.ign, but configures nodes for running user workloads.
metadata.json: Contains metadata about the cluster being installed.
auth/kubeconfig: The kubeconfig file needed to access the cluster API as the administrator after installation.77 Secure this file.
auth/kubeadmin-password: Contains the randomly generated password for the initial kubeadmin user.16 Secure this file.
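
As a quick sanity check before hosting or embedding these files, the Ignition configs can be validated as well-formed JSON (a minimal sketch; it assumes jq is installed):

for f in bootstrap.ign master.ign worker.ign; do
  jq empty "$f" && echo "$f: valid JSON"
done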

The generation of manifests serves as an intermediate step, allowing for specific customizations beyond the scope of install-config.yaml before the final, node-specific Ignition configurations are created. Once create ignition-configs is run, the boot configurations for the nodes are essentially fixed for the initial deployment.

VII. Step 6: Provisioning Infrastructure and Booting Nodes

With the Ignition configurations generated, the next phase involves preparing the actual infrastructure (servers or VMs) and booting them using the correct FCOS image and Ignition file. This stage is critical, as failure to deliver the correct Ignition configuration during the first boot will prevent nodes from joining the cluster correctly.

A. Final Infrastructure Preparation (DNS/LB)

Before booting any nodes, ensure the prerequisite DNS records and load balancers are fully configured, active, and resolvable/reachable within the network.10 The bootstrap and control plane nodes will immediately attempt to resolve and communicate with the API endpoints (api.<cluster>.<domain> and api-int.<cluster>.<domain>) via the configured load balancer upon booting.

B. Provisioning Nodes

Provision the physical servers or virtual machines intended for the bootstrap, control plane, and compute roles. Ensure they meet the minimum hardware requirements (CPU, RAM, disk) specified earlier. Configure network interfaces and note their MAC addresses if using DHCP reservations.

C. Booting Nodes with Ignition Configs

Each node must be booted using the appropriate FCOS image and provided with its corresponding Ignition file (bootstrap.ign for the bootstrap node, master.ign for control plane nodes, worker.ign for compute nodes). The method for delivering the Ignition config depends on the infrastructure:

Method 1: FCOS Installer ISO + HTTP(S) Server:

Boot the target node using the downloaded FCOS installer ISO.70
Host the bootstrap.ign, master.ign, and worker.ign files on an HTTP or HTTPS server accessible by the nodes during boot.70, 76
After the live ISO environment boots, run the coreos-installer command, specifying the URL to the node-specific Ignition file and the target installation disk.70
# Example for a master node, installing to /dev/sda
# Assumes master.ign is hosted at http://192.168.1.10:8080/master.ign
# Obtain the SHA512 hash first: sha512sum master.ign
IGNITION_URL="http://192.168.1.10:8080/master.ign"
IGNITION_HASH="sha512-..." # Replace with actual hash
sudo coreos-installer install /dev/sda --ignition-url=${IGNITION_URL} --ignition-hash=${IGNITION_HASH}

If using plain HTTP, the --ignition-hash argument is crucial for verifying the integrity of the downloaded Ignition file.103
After the installer finishes writing FCOS to the disk and injecting the Ignition configuration reference, reboot the node. FCOS will then fetch and apply the Ignition config on its first boot from the disk.

Method 2: PXE/iPXE Boot + HTTP(S)/TFTP Server:

Configure your PXE boot environment (DHCP options, TFTP server) to serve the FCOS PXE kernel (vmlinuz) and initramfs (initrd.img) files.70, 107
Host the .ign files on an accessible HTTP, HTTPS, or TFTP server.70
In the PXE configuration (e.g., pxelinux.cfg or iPXE script), append kernel arguments to specify the Ignition config URL for the specific node type booting.103 Key arguments include ignition.firstboot=1, ignition.platform.id=metal (for bare metal), and ignition.config.url=<URL_to_specific.ign>.108
# Example iPXE snippet for a worker node
kernel fedora-coreos-<version>-live-kernel-<arch> ignition.firstboot=1 ignition.platform.id=metal ignition.config.url=http://192.168.1.10:8080/worker.ign mitigations=auto,nosmt console=ttyS0,115200n8 console=tty0
initrd fedora-coreos-<version>-live-initramfs.<arch>.img fedora-coreos-<version>-live-rootfs.<arch>.img
boot

Method 3: Virtual Machine User Data / Metadata:

For virtualized environments (like vSphere, OpenStack, KVM/libvirt), the hypervisor often provides a mechanism to pass "user data" or "metadata" to the guest VM during creation.
Ignition can read configuration from these sources (e.g., vSphere guestinfo.ignition.config.data, OpenStack metadata service, QEMU fw_cfg).
The content of the appropriate .ign file can be base64 encoded and provided through this mechanism, often eliminating the need for an external HTTP server. Consult the specific hypervisor and FCOS documentation for details on platform IDs and user data formats.
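
As one concrete illustration of this approach, Fedora CoreOS on libvirt/QEMU can receive the Ignition config through fw_cfg at VM creation time. The following sketch assumes a local FCOS QCOW2 image and uses placeholder names, sizes, and paths; consult the FCOS documentation for your hypervisor's exact mechanism:

virt-install --connect qemu:///system \
  --name okd-master-0 --vcpus 4 --memory 16384 \
  --os-variant fedora-coreos-stable --import --graphics none \
  --disk size=120,backing_store=/var/lib/libvirt/images/fedora-coreos-qemu.qcow2 \
  --network network=default \
  --qemu-commandline="-fw_cfg name=opt/com.coreos/config,file=/var/lib/libvirt/ignition/master.ign"
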

D. Hosting Ignition Files (If Required)

If using methods 1 or 2 where Ignition configs are fetched via URL:
Setup HTTP Server: A simple way is to use Python's built-in HTTP server on the installer machine (ensure Python 3 is installed).70 Navigate to the installation directory (~/okd-install-upi in our example) where the .ign files reside and run:
python3 -m http.server 8080

Alternatively, use a dedicated web server like Apache or Nginx.
Accessibility: Ensure the server's IP address and port (e.g., 192.168.1.10:8080) are reachable from the network segment where the OKD nodes will boot. Adjust firewall rules on the server machine if necessary (see the example after this list).105
Security: Using plain HTTP transmits Ignition files unencrypted, potentially exposing sensitive data like SSH keys. Using HTTPS is more secure but requires managing TLS certificates and potentially trusting a custom CA within the FCOS boot process.110 For HTTP, always use --ignition-hash with coreos-installer or ensure the ignition.config.url includes verification if passed via kernel args.103
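
On a Fedora/RHEL-family installer machine using firewalld, opening the serving port for the duration of the installation might look like the following sketch (adapt the port and tooling to your environment):

sudo firewall-cmd --add-port=8080/tcp
sudo firewall-cmd --runtime-to-permanent   # optional: persist the rule
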
The reliable delivery of the correct Ignition file during the initial boot sequence is a frequent point of failure in UPI installations. Problems can arise from incorrect URLs, network connectivity issues, firewall blocks, inaccessible HTTP servers, or improperly formatted/encoded Ignition data passed via VM metadata. Careful planning and testing of the chosen delivery method are essential.

VIII. Step 7: Monitoring the Bootstrap Process

Once the bootstrap node and the initial control plane nodes are provisioned and successfully boot with their respective Ignition configurations, the cluster bootstrapping phase begins. This critical phase establishes the permanent control plane.

A. Running wait-for bootstrap-complete

On the installer machine, initiate the monitoring command 16:

openshift-install wait-for bootstrap-complete --dir=<installation_directory> --log-level=info
Purpose: This command polls the cluster's API server (initially hosted on the bootstrap node, then transitioning to the control plane nodes via the API load balancer) and monitors the status of critical components like etcd.9 It waits until the permanent control plane, running on the designated master nodes, is successfully initialized and has formed an etcd quorum.
Monitoring: The command provides high-level status updates. For detailed progress or troubleshooting:

Use --log-level=debug for more verbose output from the installer.39
Monitor the installer log file: <installation_directory>/.openshift_install.log.39
SSH into the bootstrap node (using the SSH key provided during config) and monitor the bootstrap service logs: ssh core@<bootstrap_ip> journalctl -b -f -u bootkube.service.39 Look for etcd connection errors initially, which should resolve as control plane nodes come online.39
SSH into the control plane nodes and monitor kubelet and crio service logs: ssh core@<master_ip> journalctl -b -f -u kubelet.service and ... -u crio.service.39
If the command fails with timeouts, it doesn't always indicate a fatal error, especially on slower infrastructure. The command can sometimes be re-run.111 However, persistent timeouts often point to underlying issues with node provisioning, networking, DNS, load balancing, or Ignition configuration.40 Use openshift-install gather bootstrap to collect detailed logs for analysis.40
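
A minimal sketch of the log-gathering command referenced above, with placeholder IP addresses; it collects logs from the bootstrap and control plane nodes into a local archive for analysis:

openshift-install gather bootstrap --dir=. \
  --bootstrap 192.168.1.20 \
  --master 192.168.1.21 --master 192.168.1.22 --master 192.168.1.23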

B. Decommissioning the Bootstrap Node

Successful completion of wait-for bootstrap-complete marks a major milestone: the permanent, high-availability control plane is operational on the master nodes, and the temporary bootstrap environment is no longer required.9
Shut Down: Power off and delete the bootstrap machine (VM or physical server).9
Update Load Balancer: Remove the bootstrap machine's IP address from the backend pool of the API Load Balancer (ports 6443 and 22623).10 The pool should now only contain the IP addresses of the three control plane nodes.
This transition from the temporary bootstrap control plane to the permanent HA control plane is arguably the most complex part of the installation. Failures occurring before this stage often relate back to fundamental infrastructure prerequisites (networking, DNS, LBs, Ignition delivery). Failures after this point are more commonly associated with cluster operator deployment or control plane component stability.

IX. Step 8: Finalizing Installation and Verification

With the bootstrap process complete and the bootstrap node removed, the final steps involve waiting for the cluster operators to stabilize and verifying cluster access and health.

A. Running wait-for install-complete

Execute the final monitoring command on the installer machine 16:

openshift-install wait-for install-complete --dir=<installation_directory> --log-level=info
Purpose: This command monitors the Cluster Version Operator (CVO) and other essential cluster operators.9 It waits until these operators report themselves as Available, not Progressing, and not Degraded, indicating that the cluster's core services have successfully rolled out and stabilized.
Monitoring & Timeouts: Similar to the bootstrap wait, this command has a timeout (e.g., 30 minutes mentioned in 114). On slower systems, or if complex operator configurations are involved, it might time out even if the cluster is ultimately healthy.114 The command primarily checks operator status; nodes might still be coming to full readiness or some operators might still be performing final synchronization tasks even after the command reports success or times out.114 Use --log-level=debug for details.
B. Accessing the Cluster

Export Kubeconfig: The installer generated a kubeconfig file containing API server details and initial administrative credentials. Set the KUBECONFIG environment variable to point to this file:
export KUBECONFIG=<installation_directory>/auth/kubeconfig

(Replace <installation_directory> with the actual path, e.g., ~/okd-install-upi/auth/kubeconfig).16 This tells oc how to connect to your new cluster.

Login with kubeadmin:

Retrieve Password: Get the temporary administrative password from the generated file 16:

cat <installation_directory>/auth/kubeadmin-password

(Alternatively, the password might be found in the .openshift_install.log file 97.)
Login via CLI: Use the oc command to log in 16:

oc login -u kubeadmin -p <password_from_file>

(The API server URL is typically embedded in the kubeconfig file.)
kubeadmin User: This user is created automatically during installation with full cluster-admin privileges.98 It is intended for initial cluster access only. For security best practices, this user should be removed after configuring a proper identity provider (like HTPasswd, LDAP, GitHub, etc.) and granting cluster-admin privileges to one or more regular user accounts.98 To remove it:

Configure at least one identity provider (see OKD documentation on authentication).
Grant the cluster-admin role to a user from that provider: oc adm policy add-cluster-role-to-user cluster-admin <username>.
Log out as kubeadmin and log in as the new administrative user.
Delete the kubeadmin secret: oc delete secret kubeadmin -n kube-system.98 This action is irreversible without reinstalling the cluster.98
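
As an illustration of that sequence, a minimal HTPasswd identity provider could be configured as sketched below; the user name, password, secret name, and provider name are placeholders, and the OKD authentication documentation should be followed for production setups:

# Create a local htpasswd file (requires the htpasswd utility, e.g., from httpd-tools)
# and store it as a secret in the openshift-config namespace.
htpasswd -c -B -b users.htpasswd admin 'ChangeMe123!'
oc create secret generic htpass-secret --from-file=htpasswd=users.htpasswd -n openshift-config

# Register the HTPasswd identity provider with the cluster OAuth configuration.
cat <<'EOF' | oc apply -f -
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
  - name: local_users
    mappingMethod: claim
    type: HTPasswd
    htpasswd:
      fileData:
        name: htpass-secret
EOF

# Grant cluster-admin to the new user, then log in as that user before removing kubeadmin.
oc adm policy add-cluster-role-to-user cluster-admin admin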

C. Verifying Cluster Status

Even after wait-for install-complete succeeds, perform manual checks to confirm cluster health:

Check Nodes: Verify all control plane and compute nodes are registered and Ready 16:

oc get nodes

The output should list all expected nodes with STATUS as Ready.

Check Cluster Operators: Examine the status of core cluster operators 16:

oc get clusteroperators

or shorthand:

oc get co

All operators should ideally show AVAILABLE=True, PROGRESSING=False, DEGRADED=False. PROGRESSING=True might occur briefly after installation but should resolve. DEGRADED=True indicates an issue with that operator requiring investigation.

Access Web Console:

Find URL: The console URL is usually printed by the installer 97 or can be found using oc 100:

oc whoami --show-console

It typically follows the pattern https://console-openshift-console.apps.<cluster_name>.<base_domain>.
Login: Open the URL in a web browser. Accept any self-signed certificate warnings initially. Log in using the username kubeadmin and the password retrieved earlier.16

While wait-for install-complete signals the end of the automated installation phases monitored by the installer, operators might still be finalizing their setup.114 Therefore, relying on oc get nodes and oc get co provides a more definitive confirmation that the cluster components are truly ready and stable before deploying applications.

X. Conclusion

A. Summary

This guide has detailed the step-by-step process for installing OKD, the community distribution of Kubernetes, on user-provisioned infrastructure (bare metal or virtual machines) using the official openshift-install tool. The process involves careful preparation of prerequisites (hardware, networking, DNS, load balancers), downloading the correct installation tools and FCOS images, generating configuration files (install-config.yaml, manifests, Ignition configs), provisioning and booting nodes with Ignition, monitoring the critical bootstrap phase, and finally verifying the cluster's operational status. Successful completion results in a functional, multi-node OKD cluster managed by the administrator.

B. Next Steps

With the basic OKD cluster installed and verified, several crucial post-installation tasks should be addressed:
Configure Identity Provider & Remove kubeadmin: This is the most critical immediate security step. Configure integration with an external identity provider (LDAP, HTPasswd, OAuth, etc.) and grant administrative privileges to regular users. Subsequently, remove the temporary kubeadmin user as previously described.98
Configure Persistent Storage: By default, pods have ephemeral storage. For stateful applications, configure persistent storage solutions. This might involve setting up an NFS provisioner, deploying a storage platform like Ceph using the Rook operator, or configuring CSI drivers for underlying storage infrastructure.
Review Monitoring and Logging: Familiarize yourself with OKD's built-in monitoring stack (Prometheus, Grafana) and logging components (often based on Loki or Elasticsearch/Fluentd/Kibana). Configure alerts and log forwarding as needed.
Explore Cluster Operators: Investigate available operators via OperatorHub (if default sources were not disabled) to extend cluster functionality (e.g., service mesh, serverless, database operators).
Deploy Applications: Begin deploying containerized applications using oc commands, YAML manifests, or the OKD web console's developer perspective.29
Consult Documentation: Refer to the official OKD documentation (docs.okd.io) for detailed guidance on cluster administration, security hardening, network configuration, storage management, and application development workflows.27