Comments (4)
@misdoro does this reproduce? Could you please share which interfaces are available after boot and which drivers are loaded?
For us it is 100% reproducible on an AMI image we are building internally, when started on t3 class AWS instances.
The only non-lo network adapter is managed by the ena driver, and it gets recognized by the kernel a few seconds after cloud-init local is started.
For the moment we've implemented a work-around that delays cloud-init local until after the network adapter is recognized,
but I'm wondering if cloud-init could have a more official way to handle network adapters that appear late during the boot process.
The work-around in question:
/etc/systemd/system/cloud-init-local.service.d/10-wait-for-net-device.conf
# cloud-init-local must wait for at least one network interface device to exist
# before attempting to download EC2 instance metadata.
#
# These systemd unit directives implement this policy along with
# /etc/udev/rules.d/10-ec2imds.rules
[Unit]
Requires=dev-ec2imds.device
After=dev-ec2imds.device
/etc/udev/rules.d/10-ec2imds.rules
# cloud-init-local must wait for at least one network interface device to exist
# before attempting to download EC2 instance metadata.
#
# These udev rules implement this policy along with
# /etc/systemd/system/cloud-init-local.service.d/10-wait-for-net-device.conf
ACTION!="remove", SUBSYSTEM=="net", KERNEL!="lo", DRIVERS=="ena|vif", TAG+="systemd", ENV{SYSTEMD_ALIAS}+="/dev/ec2imds"
from cloud-init.
@misdoro does this reproduce? Could you please share which interfaces are available after boot and which drivers are loaded?
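One way to collect that information is to read it from sysfs. The following is only a minimal sketch, not something cloud-init ships: the function name `list_interface_drivers` is hypothetical, and it assumes the usual Linux layout where each entry under /sys/class/net has a `device/driver` symlink when a kernel driver is bound to it.

```python
import os


def list_interface_drivers(sysfs_net="/sys/class/net"):
    """Map each network interface name to its kernel driver name (or None).

    Hypothetical helper for illustration; sysfs_net is a parameter so the
    function can be pointed at a test directory.
    """
    drivers = {}
    for name in sorted(os.listdir(sysfs_net)):
        iface_dir = os.path.join(sysfs_net, name)
        # Skip plain files such as bonding_masters; interfaces are directories.
        if not os.path.isdir(iface_dir):
            continue
        driver_link = os.path.join(iface_dir, "device", "driver")
        # Virtual interfaces such as lo have no device/driver symlink.
        if os.path.islink(driver_link):
            drivers[name] = os.path.basename(os.readlink(driver_link))
        else:
            drivers[name] = None
    return drivers
```

Printing the result of `list_interface_drivers()` after boot would show each interface next to the driver bound to it, with `None` for interfaces such as lo that have no backing device.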
For us it is 100% reproducible on an AMI image we are building internally, when started on t3 class AWS instances.
The only non-lo network adapter is managed by the ena driver, and it gets recognized by the kernel a few seconds after cloud-init local is started.
Good to know, thank you. How would one reproduce this? How can you ensure that only an ena interface is available?
For the moment we've implemented a work-around that delays cloud-init local until after the network adapter is recognized,
but I'm wondering if cloud-init could have a more official way to handle network adapters that appear late during the boot process.
Cloud-init should handle this better. Can you please share more of the log? The whole cloud-init.log would be best, but if you feel the need to redact and the log contains a line like the following, it would be good to know what it says:
2024-05-14 14:50:46,817 - stages.py[DEBUG]: applying net config names for {'version': 1, 'config': [{'type': 'physical', 'name': 'enp5s0', 'subnets': [{'type': 'dhcp', 'control': 'auto'}]}]}
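If sharing the whole log is not an option, lines of that shape can be pulled out with a short script. This is a rough sketch under assumptions: the helper name `find_net_configs` is made up for illustration, and it assumes the logged value is a plain Python dict repr as in the sample line above, which is why `ast.literal_eval` is enough to parse it.

```python
import ast
import re

# Matches the message shown in the sample log line and captures the dict repr.
NET_CONFIG_RE = re.compile(r"applying net config names for (\{.*\})\s*$")


def find_net_configs(log_lines):
    """Return the network config dicts found in 'applying net config names' lines.

    Hypothetical helper for illustration; log_lines is any iterable of strings.
    """
    configs = []
    for line in log_lines:
        match = NET_CONFIG_RE.search(line)
        if match:
            # literal_eval only accepts Python literals, so nothing is executed.
            configs.append(ast.literal_eval(match.group(1)))
    return configs
```

On an instance it could be fed an open file handle for /var/log/cloud-init.log, and the interface names in each returned config would show which devices cloud-init expected to find.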
@misdoro Thanks again for reporting. If you can share any additional data about your image (more complete logs, reproducer), that would be extremely helpful.
For EC2, and probably other datasources as well, cloud-init-local.service needs to wait until at least one interface is available prior to proceeding into ephemeral network setup.
Current state
Cloud-init already does something similar, but with a different intent and outcome. When a network configuration is available, cloud-init currently polls until the interfaces named in that configuration exist. Once they are available, cloud-init renames the interfaces itself.
Problems
- Interface rename shouldn't actually be required in many cases (netplan, systemd, and friends are capable of doing rename). This logic predates current network backends.
- The Local service doesn't wait for physical devices to exist before attempting to bring up an ephemeral interface. This seems to work when kernel drivers are loaded by initramfs as a module or built into the kernel.
Proposed fix
- short term: add a poll for a single interface[1][2]
- long term: only do interface rename in renderers which require it (possibly eni, ifconfig, sysconfig?). Initially we should retain current functionality for untested renderers and potentially add an opt-out flag to allow testing the different network back ends for working rename support.
[1] not wanted for LXD, None, NoCloud, and any other datasources which do not require an interface to be available in Local stage
[2] udevadm settle causes unnecessary waiting. Polling at some frequency would probably be more appropriate.
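To make the short-term idea concrete, here is a rough sketch of what polling for a single interface could look like. It is not the actual cloud-init change: the function name `wait_for_physical_interface`, the timeout, and the polling interval are all assumptions chosen for illustration, and "physical" is approximated by the presence of a `device` entry under /sys/class/net/<name>.

```python
import os
import time


def wait_for_physical_interface(sysfs_net="/sys/class/net", timeout=10.0, interval=0.1):
    """Poll until a non-loopback interface backed by a device appears.

    Hypothetical helper for illustration. Returns the interface name,
    or None if no such interface shows up before the timeout expires.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            names = sorted(os.listdir(sysfs_net))
        except FileNotFoundError:
            names = []
        for name in names:
            # Device-backed NICs expose a "device" entry in sysfs; lo does not.
            if name != "lo" and os.path.exists(os.path.join(sysfs_net, name, "device")):
                return name
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)
```

Compared with udevadm settle, a loop like this returns as soon as the first matching interface exists instead of waiting for the whole udev queue to drain. Datasources listed in [1] would simply not call it.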