Add EndpointSlice consumer helper functions #131376

Open
wants to merge 1 commit into
base: master
from

Conversation

@mbergo mbergo commented Apr 18, 2025

This commit adds helper functions for EndpointSlice consumers to make it easier to transition from Endpoints to EndpointSlices. The new package provides:

  1. EndpointSliceConsumer - Core component that tracks EndpointSlices and provides a unified view of endpoints for a service
  2. EndpointSliceInformer - Informer-like interface for EndpointSlices
  3. EndpointSliceLister - Lister-like interface for EndpointSlices

These helpers handle the complexity of merging multiple slices for the same service and deduplicating endpoints that might appear in multiple slices.

Benefits:

  • Easier migration from Endpoints to EndpointSlices with familiar interfaces
  • Simplified handling of multiple slices without manual merging and deduplication
  • Improved performance by leveraging the scalability of the EndpointSlice API
  • Consistent view of endpoints even as they move between slices

Fixes #124777

What type of PR is this?

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?


Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


Signed-off-by: Mad Bergo <marcusbergo@gmail.com>
@k8s-ci-robot
Contributor

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. labels Apr 18, 2025
@k8s-ci-robot
Contributor

Please note that we're already in Test Freeze for the release-1.33 branch. This means every merged PR will be automatically fast-forwarded via the periodic ci-fast-forward job to the release branch of the upcoming v1.33.0 release.

Fast forwards are scheduled to happen every 6 hours, whereas the most recent run was: Fri Apr 18 13:34:58 UTC 2025.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. label Apr 18, 2025
@k8s-ci-robot
Contributor

Keywords which can automatically close issues and at(@) or hashtag(#) mentions are not allowed in commit messages.

The list of commits with invalid commit messages:

  • 354c12e Add EndpointSlice consumer helper functions

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Apr 18, 2025
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

@k8s-ci-robot
Contributor

Hi @mbergo. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

@k8s-ci-robot k8s-ci-robot added needs-priority Indicates a PR lacks a `priority/foo` label and requires one. sig/network Categorizes an issue or PR as relevant to SIG Network. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Apr 18, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mbergo
Once this PR has been reviewed and has the lgtm label, please assign danwinship for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested review from aroradaman and tnqn April 18, 2025 19:34
@mbergo
Author

mbergo commented Apr 19, 2025

/cc @kubernetes/sig-network-pr-reviews

@k8s-ci-robot
Contributor

@mbergo: GitHub didn't allow me to request PR reviews from the following users: kubernetes/sig-network-pr-reviews.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @kubernetes/sig-network-pr-reviews

@k8s-ci-robot
Contributor

@mbergo: Reiterating the mentions to trigger a notification:
@kubernetes/sig-network-pr-reviews

In response to this:

/cc @kubernetes/sig-network-pr-reviews

@mbergo
Author

mbergo commented Apr 19, 2025

maybe like this. cc: @kubernetes/sig-network-pr-reviews

@k8s-ci-robot
Contributor

@mbergo: Reiterating the mentions to trigger a notification:
@kubernetes/sig-network-pr-reviews

In response to this:

maybe like this. cc: @kubernetes/sig-network-pr-reviews

// deduplicating endpoints from all EndpointSlices for the service.
func (l *endpointSliceNamespaceLister) GetEndpoints(serviceName string) ([]discovery.Endpoint, error) {
// Get all EndpointSlices for the service
_, err := l.Get(serviceName)
Contributor

This is AI-generated, right?

/hold

Author

The first version was written manually, but I used my own AI from my IBM days for a review, which fixed 3 lines. Can you tell me what you saw that made you think that? Actually, it did nothing that I wouldn't have done nowadays, or when I was at Google working on this or Borg. But I am interested now, since it was trained only on my codebase and my style, with my own constraints from 23 years of doing this.

Nevertheless, nice catch @danwinship.

Author

Nevertheless, I have a 4h video for you of the time I spent on this; I just did not get any suggestions for modifications. I did not think of my AI review as more than a linter or LSP written by me.

Contributor

OK, I was worried that since you are a new contributor (i.e., not a member of the kubernetes GitHub org), and this has AI fingerprints, that maybe that meant you just let the AI write the whole thing and you didn't actually understand any of the code and wouldn't be able to usefully respond to code review. Glad that's not the case.

(The particular reason I pointed out this line is that it's a no-op; it gets the EndpointSlices but discards the result, and then a few lines below calls l.consumer.GetEndpoints() which then re-fetches the EndpointSlices again.)

Contributor

oh... ok, looking closer, this is pretty weird; it uses the underlying lister to get the slices, feeds them into the consumer via OnEndpointSliceAdd, and then asks the consumer to list the slices...

So this wouldn't actually work right, since nothing deletes slices from the consumer when they get deleted, so every call to GetEndpoints would end up returning both current and old endpoints.

We shouldn't actually be using an underlying discoverylisters.EndpointSliceNamespaceLister here; we should implement the API directly via an underlying consumer/informer.
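
To illustrate the stale-endpoint concern above, here is a minimal sketch of state that is driven by add, update, and delete events rather than being refilled on every read. It is not the PR's code: `EndpointSlice`, `sliceTracker`, and every method name are hypothetical, simplified stand-ins (plain strings replace the real `discovery.EndpointSlice` fields), and the sketch is not safe for concurrent use.

```go
package main

// EndpointSlice is a simplified, hypothetical stand-in for
// discovery.EndpointSlice with only the fields this sketch needs.
type EndpointSlice struct {
	Namespace string
	Name      string
	Service   string   // stand-in for the service-name label
	Addresses []string // stand-in for the slice's endpoints
}

// serviceKey identifies a Service by namespace and name.
type serviceKey struct {
	Namespace string
	Name      string
}

// sliceTracker (hypothetical) holds the current slices of each Service,
// indexed by slice name. It is not safe for concurrent use.
type sliceTracker struct {
	slices map[serviceKey]map[string]*EndpointSlice
}

func newSliceTracker() *sliceTracker {
	return &sliceTracker{slices: map[serviceKey]map[string]*EndpointSlice{}}
}

// onAdd records a slice under its Service.
func (t *sliceTracker) onAdd(s *EndpointSlice) {
	key := serviceKey{Namespace: s.Namespace, Name: s.Service}
	if t.slices[key] == nil {
		t.slices[key] = map[string]*EndpointSlice{}
	}
	t.slices[key][s.Name] = s
}

// onUpdate drops the old version first, so a slice that changed Service
// does not linger under its previous key.
func (t *sliceTracker) onUpdate(oldSlice, newSlice *EndpointSlice) {
	t.onDelete(oldSlice)
	t.onAdd(newSlice)
}

// onDelete forgets a slice; without this, reads would keep returning
// endpoints from slices that no longer exist.
func (t *sliceTracker) onDelete(s *EndpointSlice) {
	key := serviceKey{Namespace: s.Namespace, Name: s.Service}
	delete(t.slices[key], s.Name)
	if len(t.slices[key]) == 0 {
		delete(t.slices, key)
	}
}

// addresses returns every address currently tracked for a Service,
// in no particular order.
func (t *sliceTracker) addresses(namespace, service string) []string {
	var out []string
	for _, s := range t.slices[serviceKey{Namespace: namespace, Name: service}] {
		out = append(out, s.Addresses...)
	}
	return out
}
```

In a real implementation these three methods would presumably be registered as informer event handlers, so reads never need to go back to a separate lister.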

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 21, 2025
@aojea
Member

aojea commented Apr 21, 2025

Adding libraries to staging repos has proven not to be the best approach, since people expect these APIs to be frozen, and the experience is that there is no control over compatibility-breaking changes across versions, which puts people through a lot of pain to revendor the code ... I do not think staging repos are the best place for this kind of thing unless you are willing to provide stable APIs for these helpers.

@mbergo
Author

mbergo commented Apr 22, 2025

Adding libraries to staging repos has proven not to be the best approach, since people expect these APIs to be frozen, and the experience is that there is no control over compatibility-breaking changes across versions, which puts people through a lot of pain to revendor the code ... I do not think staging repos are the best place for this kind of thing unless you are willing to provide stable APIs for these helpers.


I think there is a deeper problem behind what you said: could all of this be due to the release rate you guys think is maintainable? Let's be a little audacious and compare this codebase with the Linux kernel. As an early contributor there too, I think Linux's merge windows are simply agile, and that is with well-matured guardrails in place.

Without a long debate, here is what I am saying: I came to help and to look into why k8s adoption has been crashing since I started with it. I am open to suggestions to improve the PR, but I only got criticism?

Any suggestions for a better approach, or at least a reason why you think people expect that from your project? cc: @aojea

@mbergo
Author

mbergo commented Apr 22, 2025

Overall, you guys are showing me one thing: it might be time to invest in a good K8s alternative. It is becoming hell to work with, with such subtle bugs caused by the approaches you are taking, which make it almost impossible to tell whether a problem is an error in the chaotic configuration architecture or a small bug in one of the many components that share responsibilities.

Release after release, no palpable, user-friendly improvements.

@danwinship
Contributor

re staging vs not staging, Antonio and I talked about this. I think for the initial version of this API, we should put it internal to k8s.io/kubernetes, and wait a few releases to make sure we're happy with the API before moving it to a staging repo where it would become harder to change the API if it turned out there were problems


// EndpointSliceConsumer provides a unified view of endpoints for services
// across multiple EndpointSlice objects.
type EndpointSliceConsumer struct {
Contributor

I don't think there's any advantage in having this type be exposed; it should just be an implementation detail of the informer and lister.

(I'm not sure we really need the informer and lister as separate types either, and if you squashed them together, then there's no need for the Consumer as its own type at all.)

result := make([]*discovery.EndpointSlice, 0, len(slices))
for _, slice := range slices {
	result = append(result, slice.DeepCopy())
}
Contributor

Standard informer/lister semantics are that they don't copy the data, and just document that you aren't allowed to modify the return values. That makes them a little bit tricky, but it's much better for memory usage...
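
As a rough sketch of the no-copy convention described above: the read path hands out the cached objects themselves, and a caller that needs to change one copies it first. All names here (`sliceCache`, `list`, `withExtraAddress`, the lowercase `deepCopy`) are hypothetical, and `EndpointSlice` is a simplified stand-in rather than the real API type.

```go
package main

// EndpointSlice is a simplified, hypothetical stand-in for
// discovery.EndpointSlice.
type EndpointSlice struct {
	Name      string
	Addresses []string
}

// deepCopy plays the role of the generated DeepCopy method on API types.
func (s *EndpointSlice) deepCopy() *EndpointSlice {
	out := &EndpointSlice{Name: s.Name}
	out.Addresses = append([]string(nil), s.Addresses...)
	return out
}

// sliceCache (hypothetical) stands in for the informer's local store.
type sliceCache struct {
	items []*EndpointSlice
}

// list returns the cached objects themselves, without copying.
// Callers must treat the returned values as read-only.
func (c *sliceCache) list() []*EndpointSlice {
	return c.items
}

// withExtraAddress shows the caller-side pattern: copy first, then
// modify only the copy, leaving the shared cached object untouched.
func withExtraAddress(s *EndpointSlice, addr string) *EndpointSlice {
	mutable := s.deepCopy()
	mutable.Addresses = append(mutable.Addresses, addr)
	return mutable
}
```

The cost of the copy is then paid only by the callers that actually mutate, instead of on every read.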

// Sort slices by name for consistent results
sort.Slice(result, func(i, j int) bool {
	return result[i].Name < result[j].Name
})
Contributor

likewise, don't do this; if the caller wants them sorted, they can do that, but they may not care at all


// GetEndpoints returns all endpoints for a service, merging and deduplicating
// endpoints from all EndpointSlices for the service.
func (c *EndpointSliceConsumer) GetEndpoints(serviceNN types.NamespacedName) []discovery.Endpoint {
Contributor

I don't think this will end up working as an API, because it discards the Ports... (In most cases, all of the EndpointSlices have identical ports, but if you're upgrading pods and moving them from an old port to a new port, then there will be one slice with the pods using the old port number and one slice with the pods using the new port number, and they'd have different Ports and you need to know which Endpoints go with which Ports.)

(Also, again, this function necessarily has to do a bunch of memory allocations, whereas many EndpointSlice consumers would be able to just iterate over the return value from GetEndpointSlices and do what they need to do without needing equivalent allocations.)
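
One way to picture both points above is a visitor that walks the slices directly and reports each address together with the port declared by its own slice, without building a merged list. This is only a sketch under stated assumptions: `forEachBackend`, `Port`, and `EndpointSlice` are hypothetical, and the real `discovery.EndpointPort` and `discovery.Endpoint` types carry more fields (and pointer-typed ones) than these stand-ins.

```go
package main

// Port is a simplified, hypothetical stand-in for discovery.EndpointPort.
type Port struct {
	Name string
	Port int32
}

// EndpointSlice is a simplified, hypothetical stand-in for
// discovery.EndpointSlice: the ports apply to every address in the slice.
type EndpointSlice struct {
	Name      string
	Ports     []Port
	Addresses []string
}

// forEachBackend calls visit once per (address, port) pair for the named
// port. Each address is paired with the port number of the slice it
// belongs to, so slices with different port numbers stay distinct, and
// no intermediate merged list is allocated.
func forEachBackend(slices []*EndpointSlice, portName string, visit func(address string, port int32)) {
	for _, s := range slices {
		for _, p := range s.Ports {
			if p.Name != portName {
				continue
			}
			for _, addr := range s.Addresses {
				visit(addr, p.Port)
			}
		}
	}
}
```

During a rollout from port 8080 to 9090, for example, the callback would see the old pods paired with 8080 and the new pods paired with 9090, which a flat list of endpoints cannot express.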

		err = fmt.Errorf("expected EndpointSlice name and namespace to be set: %v", endpointSlice)
	}
	return types.NamespacedName{Namespace: endpointSlice.Namespace, Name: serviceName}, endpointSlice.Name, err
}
Contributor

Not having a discovery.LabelServiceName isn't an "error", it just means the slice doesn't correspond to a Service (so we don't need to track it, but we don't need to log anything).

endpointSlice.Namespace == "" || endpointSlice.Name == "" is not possible for an object that came from the apiserver, so you don't need to worry about that.
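
A minimal sketch of what the two remarks above might look like in code: the lookup returns an `ok` flag instead of an error, and there are no empty-name checks. The names `serviceFor` and `namespacedName` and the trimmed-down `EndpointSlice` type are hypothetical; the label key is the value of `discovery.LabelServiceName`.

```go
package main

// labelServiceName mirrors the value of discovery.LabelServiceName.
const labelServiceName = "kubernetes.io/service-name"

// EndpointSlice is a simplified, hypothetical stand-in for
// discovery.EndpointSlice with only metadata fields.
type EndpointSlice struct {
	Namespace string
	Name      string
	Labels    map[string]string
}

// namespacedName is a local stand-in for types.NamespacedName.
type namespacedName struct {
	Namespace string
	Name      string
}

// serviceFor reports which Service a slice belongs to. A missing label
// is not an error: ok is false and the caller simply skips the slice,
// without logging anything.
func serviceFor(s *EndpointSlice) (svc namespacedName, ok bool) {
	name, found := s.Labels[labelServiceName]
	if !found || name == "" {
		return namespacedName{}, false
	}
	return namespacedName{Namespace: s.Namespace, Name: name}, true
}
```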

)

// This example demonstrates how to use the EndpointSliceConsumer directly.
func Example_directUsage() {
Contributor

We don't generally mix example code in with implementation code. I think we don't generally provide examples at all...


// nodeName is the name of the node this consumer is running on.
// Used to determine if an endpoint is local.
nodeName string
Contributor

Caring about local endpoints only is specific to kube-proxy and shouldn't be part of the generic API.

Comment on lines +177 to +181
// If we already have this endpoint, only replace it if the existing one
// is not local but the new one is
existingEp, exists := endpointMap[key]
isLocal := endpoint.NodeName != nil && *endpoint.NodeName == c.nodeName
existingIsLocal := exists && existingEp.NodeName != nil && *existingEp.NodeName == c.nodeName
Contributor

(we don't want any of this)

// NewEndpointSliceInformer creates a new EndpointSliceInformer.
func NewEndpointSliceInformer(
informerFactory informers.SharedInformerFactory,
nodeName string,
Contributor

As above, we don't want nodeName in the API.

Though, there may be a use case for having Namespace-specific informers/listers rather than watching all namespaces.

Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/network Categorizes an issue or PR as relevant to SIG Network. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

add EndpointSlice consumer helper functions
4 participants