Walid Ghallab
8ef02095fe
Improve "Max total nodes in cluster reached" log.
2024-12-20 12:52:15 +00:00
Kuba Tużnik
4a89524f84
CA: enable the DRA feature gate whenever the DRA flag is passed
...
This is needed so that the scheduler code correctly includes and
executes the DRA plugin.
We could just use the feature gate instead of the DRA flag in CA
(the feature gates flag is already there, just not really used),
but I guess there could be use-cases for having DRA enabled in the
cluster but not in CA (e.g. DRA being tested in the cluster, CA only
operating on non-DRA nodes/pods).
2024-12-20 13:30:37 +01:00
Kuba Tużnik
99282c08cb
CA: automatically use BasicSnapshotStore when DRA is enabled
...
By default CA is built with DeltaSnapshotStore, which isn't integrated
with DRA yet.
2024-12-20 13:30:37 +01:00
Kuba Tużnik
a45e6b7003
CA: implement DRA integration tests for StaticAutoscaler
2024-12-20 13:30:36 +01:00
Kuba Tużnik
55388f1136
CA: plumb the DRA provider to SetClusterState callsites, grab and pass DRA snapshot
...
The new logic is flag-guarded, so it should be a no-op if DRA is disabled.
2024-12-20 13:30:36 +01:00
Kuba Tużnik
c5cb8a077d
CA: add DRA object handling logic to PredicateSnapshot
...
All added logic is behind the DRA flag guard, so this should be a no-op
if the flag is disabled.
2024-12-20 13:30:36 +01:00
Kuba Tużnik
714ab661ca
CA: implement calculating utilization for DRA resources
...
The logic is very basic and will likely need to be revised, but it's
something for initial testing. Utilization of a given Pool is calculated
as the number of allocated devices in the pool divided by the number of
all devices in the pool. For scale-down purposes, the max utilization
of all Node-local Pools is used.
The new logic is mostly behind the DRA flag guard, so this should be a no-op
if the flag is disabled. The only difference should be that FilterOutUnremovable
marks a Node as unremovable if calculating utilization fails. Not sure
why this wasn't the case before, but I think we need it for DRA - if CA sees an
incomplete picture of a resource pool, we probably don't want to scale
the Node down.
2024-12-20 13:30:36 +01:00
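A minimal Go sketch of the utilization rule described in the commit above: per-pool utilization is allocated devices divided by all devices, and the node-level value is the maximum over the Node-local pools. The names `poolUsage`, `poolUtilization` and `nodeDRAUtilization` are hypothetical and not taken from the Cluster Autoscaler code. Treating an empty or inconsistent pool as an error is an assumption, made only to illustrate the case where a failed calculation should keep the Node from being scaled down.

```go
package main

import "fmt"

// poolUsage is a hypothetical summary of one Node-local device pool.
type poolUsage struct {
	Name      string
	Allocated int // devices currently allocated from the pool
	Total     int // all devices published in the pool
}

// poolUtilization returns allocated/total for a single pool.
// Assumption: a pool with no devices, or with more allocated devices than
// it holds, is treated as an incomplete picture and reported as an error.
func poolUtilization(p poolUsage) (float64, error) {
	if p.Total <= 0 {
		return 0, fmt.Errorf("pool %q has no devices", p.Name)
	}
	if p.Allocated < 0 || p.Allocated > p.Total {
		return 0, fmt.Errorf("pool %q has inconsistent counts: %d/%d", p.Name, p.Allocated, p.Total)
	}
	return float64(p.Allocated) / float64(p.Total), nil
}

// nodeDRAUtilization returns the highest utilization among the given pools.
// Any per-pool error is propagated so the caller can mark the Node unremovable.
func nodeDRAUtilization(pools []poolUsage) (float64, error) {
	highest := 0.0
	for _, p := range pools {
		u, err := poolUtilization(p)
		if err != nil {
			return 0, err
		}
		if u > highest {
			highest = u
		}
	}
	return highest, nil
}

func main() {
	pools := []poolUsage{
		{Name: "gpu", Allocated: 3, Total: 4},
		{Name: "nic", Allocated: 1, Total: 8},
	}
	u, err := nodeDRAUtilization(pools)
	if err != nil {
		fmt.Println("utilization unknown, keep the node:", err)
		return
	}
	fmt.Printf("node DRA utilization: %.2f\n", u) // prints "node DRA utilization: 0.75"
}
```

Using the maximum rather than an average means a single heavily used pool is enough to keep a Node above the scale-down threshold.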
Kuba Tużnik
4e68a0c6ef
CA: sanitize and propagate DRA objects through NodeInfos in node_info utils
2024-12-20 13:30:36 +01:00
Kuba Tużnik
479d7ce3d6
CA: implement a Provider for dynamicresources.Snapshot
...
The Provider uses DRA object listers to create a Snapshot of the
DRA objects.
2024-12-20 13:30:36 +01:00
Kuba Tużnik
377639a8dc
CA: implement dynamicresources.Snapshot for storing and modifying the state of DRA objects
...
The Snapshot can hold all DRA objects in the cluster, and expose them
to the scheduler framework via the SharedDRAManager interface.
The state of the objects can be modified during autoscaling simulations
using the provided methods.
2024-12-20 13:30:10 +01:00
soer3n
833af67cbd
bump chart version
...
Signed-off-by: soer3n <srenhenning@googlemail.com>
2024-12-20 09:33:46 +01:00
Adrian Moisey
b4d8de06d9
Fix a few CVEs in the e2e tests
...
- https://pkg.go.dev/vuln/GO-2023-2113
- https://pkg.go.dev/vuln/GO-2024-3333
- https://pkg.go.dev/vuln/GO-2024-3321
Trivy discovered these. While not critical, since it's for e2e tests,
it's good to keep on top of these.
2024-12-19 17:52:37 +02:00
Adrian Moisey
7943e68bc8
Bump golang.org/x/net to v0.33.0
...
For this CVE: https://pkg.go.dev/vuln/GO-2024-3333
While it isn't used in the VPA, it's good to keep the number of CVEs
shipped with the VPA down
2024-12-19 17:50:26 +02:00
Marco Voelz
2edd0261b5
Add comment to ErrNodeInvalidOwner
2024-12-19 16:19:37 +01:00
Marco Voelz
4c98e459f1
Rename Error to fit linting rules
2024-12-19 16:14:04 +01:00
Kuba Tużnik
66d0aeb3cb
CA: implement utils for interacting with ResourceClaims
...
These utils will be used by various parts of the DRA logic in the
following commits.
2024-12-19 15:55:49 +01:00
Shiqi Wang
11740d1398
remove contact information
2024-12-19 09:23:05 -05:00
Marco Voelz
f335e9db73
Handle NodeInvalidOwnerError instead of returning it
2024-12-19 14:31:17 +01:00
Marco Voelz
c191b6dcad
Cleanup: use ptr.To for reference
2024-12-19 13:57:20 +01:00
Marco Voelz
d919930546
Make FakeControllerFetcher also return error for Node kind
2024-12-19 13:56:47 +01:00
Marco Voelz
bd74115003
Move NilControllerFetcher
2024-12-19 13:55:41 +01:00
Kubernetes Prow Robot
7df4f842a7
Merge pull request #7617 from ialidzhikov/nit/vpa-generate-crd-script
...
VPA: Clean up unused `cd` invocation in the `generate-crd-yaml.sh` script
2024-12-19 11:26:09 +01:00
Kubernetes Prow Robot
b8664f8b45
Merge pull request #7618 from omerap12/flags-docs
...
VPA: add flags to docs
2024-12-19 11:18:09 +01:00
Kubernetes Prow Robot
da31dff7a6
Merge pull request #7614 from DataDog/update-azure-instance-types
...
update azure static sku list
2024-12-17 20:54:52 +01:00
Omer Aplatony
037b377355
Moved flags to introduction section
...
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2024-12-17 19:14:51 +02:00
Omer Aplatony
d6fcca20f0
VPA: add flags to docs
...
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2024-12-17 15:10:25 +02:00
Kubernetes Prow Robot
7b648361c3
Merge pull request #7613 from walidghallab/err
...
Refactor NewAutoscalerError function.
2024-12-17 13:48:53 +01:00
Kubernetes Prow Robot
e4898a9563
Merge pull request #7611 from macsko/dont_accept_pr_twice_when_check_capacity_batch_processing_enabled
...
Don't accept ProvisioningRequest twice when checkCapacityBatchProcessing enabled
2024-12-17 10:58:53 +01:00
Ismail Alidzhikov
9b7d949a0e
VPA: Clean up unused `cd` invocation in the `generate-crd-yaml.sh` script
2024-12-17 09:57:29 +02:00
Rahul Rangith
9d44562d0e
Add test for recomputing similar nodegroups
2024-12-16 16:14:42 -05:00
Rahul Rangith
6ab0eb94f7
update azure static sku list
2024-12-16 15:01:28 -05:00
Kubernetes Prow Robot
de5d64fbcd
Merge pull request #7605 from ialidzhikov/enh/kubebuilder-tag
...
VPA: Add the `api-approved.kubernetes.io` annotation via kubebuilder tag
2024-12-16 19:18:51 +01:00
Kubernetes Prow Robot
6bbc649542
Merge pull request #7599 from omerap12/automated-flags
...
VPA: Add automated flag generator for all components
2024-12-16 19:00:52 +01:00
Kubernetes Prow Robot
fd14076561
Merge pull request #7610 from davidspek/chore/code-crd-gen
...
chore(vpa): run codegen
2024-12-16 18:56:51 +01:00
Walid Ghallab
720f5946fd
Refactor NewAutoscalerError function.
...
We will have two functions instead of one:
1. One that doesn't do formatting, like klog.Error
2. One that accepts formatting, like klog.Errorf
The main reason is to avoid go vet errors and to have clear interfaces,
so that go vet (or go test in Go 1.24, which treats these findings as
errors) catches accidental bugs.
2024-12-16 17:46:40 +00:00
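The split described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the actual Cluster Autoscaler code: `autoscalerError`, `errorType`, `newError` and `newErrorf` are hypothetical names, and only the general shape (one constructor that stores the message verbatim, one that formats) is taken from the commit message.

```go
package main

import "fmt"

// errorType is a hypothetical error category.
type errorType string

const internalError errorType = "internalError"

// autoscalerError is a hypothetical error value carrying a category and a message.
type autoscalerError struct {
	kind errorType
	msg  string
}

func (e autoscalerError) Error() string { return e.msg }

// newError stores msg verbatim, in the spirit of klog.Error.
// A '%' in msg is never interpreted as a formatting directive.
func newError(kind errorType, msg string) autoscalerError {
	return autoscalerError{kind: kind, msg: msg}
}

// newErrorf formats its arguments, in the spirit of klog.Errorf.
// Because it forwards format and args to fmt.Sprintf, go vet can check
// calls to it the same way it checks Printf-style functions.
func newErrorf(kind errorType, format string, args ...any) autoscalerError {
	return autoscalerError{kind: kind, msg: fmt.Sprintf(format, args...)}
}

func main() {
	// No formatting: the literal '%' is kept as written.
	plain := newError(internalError, "disk usage at 100%")
	fmt.Println(plain.Error())

	// Formatting: arguments are substituted into the format string.
	formatted := newErrorf(internalError, "node %s failed", "n1")
	fmt.Println(formatted.Error())
}
```

With a single formatting constructor, passing a message such as "disk usage at 100%" as the format string would be mangled by fmt.Sprintf; keeping the two entry points separate makes that mistake visible to tooling.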
Kubernetes Prow Robot
148ffa345b
Merge pull request #7520 from hetznercloud/refactor-placement-groups
...
refactor(hetzner): refactored placement group code
2024-12-16 13:36:51 +01:00
Kubernetes Prow Robot
c2972a8000
Merge pull request #7606 from towca/jtuznik/node-info-fix
...
CA: fix a nil map write in NodeInfo.AddPod()
2024-12-16 12:48:51 +01:00
Kubernetes Prow Robot
ae22146f60
Merge pull request #7449 from thiha-min-thant/failed-scale-ups-metrics
...
🐛 (metrics) Initialize metrics for autoscaler errors, scale events, and pod evictions
2024-12-16 11:10:51 +01:00
Maciej Skoczeń
2426d7f836
Don't accept ProvisioningRequest twice when checkCapacityBatchProcessing enabled
2024-12-16 09:57:18 +00:00
Kubernetes Prow Robot
9388ee32d4
Merge pull request #7549 from abdelrahman882/asynchronous_taint_nodes
...
Taint nodes for deletion asynchronously
2024-12-16 10:54:52 +01:00
David van der Spek
93e5f872b0
chore: run codegen
...
Signed-off-by: David van der Spek <david.vanderspek@flyrlabs.com>
2024-12-16 10:19:39 +01:00
Omer Aplatony
ce75c91874
Add final message
...
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2024-12-16 11:16:40 +02:00
Omer Aplatony
06558a4406
sub shell
...
Co-authored-by: Adrian Moisey <adrian@changeover.za.net>
2024-12-16 11:13:29 +02:00
Omer Aplatony
e227e20005
Remove the temporary file after merging
...
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2024-12-15 18:27:33 +02:00
Omer Aplatony
1611b35b12
sub shell to avoid cd -
...
Co-authored-by: Adrian Moisey <adrian@changeover.za.net>
2024-12-15 18:25:56 +02:00
Omer Aplatony
b7ac55b48e
Add BASH_SOURCE
...
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2024-12-15 11:52:56 +02:00
Omer Aplatony
4743dfbd9c
fix misspelling
...
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2024-12-15 11:22:55 +02:00
Omer Aplatony
863d49f7f4
Address issues and add merge script
...
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2024-12-15 11:19:50 +02:00
Adrian Moisey
f544d94fe7
Add generated API docs and link to them
2024-12-15 08:21:07 +02:00
Adrian Moisey
024b8d2345
Generate API docs
2024-12-15 08:18:09 +02:00