
[Feature] Add Support on Log Probability Value from Returned Response of Gemini Models. #238


Closed
jacklanda opened this issue Mar 14, 2024 · 32 comments
Labels
component:python sdk (Issue/PR related to Python SDK), type:feature request (New feature request/enhancement)

Comments

@jacklanda

Description of the feature request:

How about providing support for retrieving the log probability of each token predicted by Google models like Gemini?

Something like the same functionality illustrated in this OpenAI post: Using Logprobs.

What problem are you trying to solve with this feature?

Getting the log probability of each generated token would help me (and likely other users) gauge the confidence of the model's predictions. Further, this feature would let users compute the perplexity of generated sentences to better assess the quality of the textual continuation.
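
For illustration, here is a minimal sketch of the kind of computation this would enable, assuming the API returned one log probability per generated token (the values below are made up):

import math

# Hypothetical per-token log probabilities returned alongside a generation.
token_logprobs = [-0.12, -1.30, -0.05, -2.41, -0.76]

# Perplexity = exp(average negative log-likelihood per token).
avg_nll = -sum(token_logprobs) / len(token_logprobs)
perplexity = math.exp(avg_nll)
print(f"perplexity = {perplexity:.2f}")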

Any other information you'd like to share?

No response

@jacklanda added the component:python sdk and type:feature request labels Mar 14, 2024
@jacklanda
Author

Any thoughts about it?

@singhniraj08 added the status:triaged label Mar 19, 2024
@Brian-Ckwu

I also think it would be extremely helpful if the API could provide the top-k log probabilities of each predicted token.

@krenova

krenova commented Apr 9, 2024

Really need this, as I found Gemini to be hallucinating quite a bit in a RAG application.

Btw, your title refers to Claude and not Gemini:
[Feature] Add Support on Log Probability Value from Returned Response of Claude Models

@jacklanda changed the title from "[Feature] Add Support on Log Probability Value from Returned Response of Claude Models." to "[Feature] Add Support on Log Probability Value from Returned Response of Gemini Models." Apr 9, 2024
@jacklanda
Author

Really need this, as I found Gemini to be hallucinating quite a bit in a RAG application.

Btw, your title refers to Claude and not Gemini: [Feature] Add Support on Log Probability Value from Returned Response of Claude Models

Thanks for the reminder 🤗

@simpleusername96

Many hallucination detection approaches rely on log probability as a key feature. It's one of the most essential elements when building a serious product with an LLM.

@anwang427

This would be extremely helpful!

@Said-Apollo

As others have previously mentioned, it would be great to get access to the logprobs in a similar fashion to how OpenAI does with its models. Based on that, we could then calculate, for example, the perplexity score and various other evaluation metrics.

@haugmarkus

Yep, this would make it possible to use Gemini in production.

@waveworks-ai

+1 - useful also for classification tasks.

@lavanyanemani96

+1

@luna-b20

+1, it is critical. ChatGPT and other major models now all support this feature.
Please help!

@michaelgfeldman

+1, it is critical. ChatGPT and other major models now all support this feature. Please help!

Can you please specify which major models support this feature? I'm also searching for alternatives, and right now I don't see any other players supporting this except OpenAI: no Anthropic, no Mistral...

@MarkDaoust
Collaborator

b/361194489

@MFajcik

MFajcik commented Aug 30, 2024

I also think it would be extremely helpful if the API could provide the top-k log probabilities of each predicted token.

Yes, this would allow evaluating Gemini with threshold-free evaluation metrics. That would be excellent.

@MarkDaoust
Collaborator

MarkDaoust commented Aug 30, 2024

Google's Vertex AI just launched this; hopefully that means it's coming soon here, but I don't have a timeline.

@michaelgfeldman

Google's Vertex AI just launched this; hopefully that means it's coming soon here, but I don't have a timeline.

Can you share a link?

@haugmarkus

Google's Vertex AI just launched this; hopefully that means it's coming soon here, but I don't have a timeline.

Can you share a link?

They have added a field 'avgLogprobs' to the response documentation at https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference#nodejs, but I am unable to get a response with such a field.

@MarkDaoust
Collaborator

Yeah, this API is related to, but separate from, Vertex. Hopefully this API will catch up soon.

@kauabh

kauabh commented Oct 4, 2024

+1

@MarkDaoust
Collaborator

This is fixed in the latest version:

code: #561
tutorial: https://github.com/google-gemini/cookbook/blob/main/quickstarts/New_in_002.ipynb
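
For anyone trying this out, here is a rough sketch of how the new parameters appear to be used with the google-generativeai SDK; the response attributes below (logprobs_result, chosen_candidates, log_probability) are assumptions based on the 002 release, so check the linked notebook if they differ:

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

model = genai.GenerativeModel("gemini-1.5-flash-002")
response = model.generate_content(
    "Hello there",
    generation_config=genai.GenerationConfig(
        response_logprobs=True,  # request per-token log probabilities
        logprobs=3,              # also return top-3 alternatives per position
    ),
)

# Print the chosen token and its log probability at each position
# (attribute names assumed, as noted above).
for entry in response.candidates[0].logprobs_result.chosen_candidates:
    print(entry.token, entry.log_probability)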

@github-actions bot removed the status:triaged label Oct 4, 2024
@FraPochetti

Just to make sure I understand this right: the logprobs are averaged over the entire output, correct?
No logprobs at the token level?

@jacklanda
Author

Just to make sure I understand this right: the logprobs are averaged over the entire output, correct? No logprobs at the token level?

I recommend you open another issue to discuss this question =)

@FraPochetti

Created a convo here actually :)

@miaojingang

The feature is available only in gemini-2.0-flash-lite and gemini-1.5-flash, per https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference.

Would it be possible to extend it to the 1.5-pro and 2.0 models?

@chancharikmitra

chancharikmitra commented Feb 27, 2025

Thanks for the note @miaojingang! Unfortunately, my experience is a bit different.

TL;DR: I did some digging, and here's very likely confirmation that logprobs do not work for Gemini at the moment:

def test_log_probs(client):
  # ML DEV discovery doc supports response_logprobs but the backend
  # does not.
  # TODO: update replay test json files when ML Dev backend is updated.
  with pytest_helper.exception_if_mldev(client, errors.ClientError):
    client.models.generate_content(
        model='gemini-1.5-flash',
        contents='What is your name?',
        config={
            'logprobs': 2,
            'presence_penalty': 0.5,
            'frequency_penalty': 0.5,
            'response_logprobs': True,
        },
    )

from: https://github.com/googleapis/python-genai/blob/0601b76204af78b5a5bac05bc188a92464277a7d/google/genai/tests/models/test_generate_content.py#L386

Perhaps it is in the works.


Long Answer

So there's some ambiguity in how this is being handled. To my knowledge, there was full support for logprobs just a few months ago in Gemini's generativeai version of the API. But now there is also a version called genai, which is different. If you print the attributes of the relevant config class in each version:

  1. generativeai version: printing the attributes of the GenerationConfig class gives the following:

['__annotations__', '__class__', '__dataclass_fields__', '__dataclass_params__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__match_args__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'candidate_count', 'frequency_penalty', 'max_output_tokens', 'presence_penalty', 'response_mime_type', 'response_schema', 'stop_sequences', 'temperature', 'top_k', 'top_p']

  2. genai version:
genConfig = types.GenerateContentConfig()
print(dir(genConfig))

yields:

['__abstractmethods__', '__annotations__', '__class__', '__class_getitem__', '__class_vars__', '__copy__', '__deepcopy__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__fields__', '__fields_set__', '__format__', '__ge__', '__get_pydantic_core_schema__', '__get_pydantic_json_schema__', '__getattr__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__pretty__', '__private_attributes__', '__pydantic_complete__', '__pydantic_computed_fields__', '__pydantic_core_schema__', '__pydantic_custom_init__', '__pydantic_decorators__', '__pydantic_extra__', '__pydantic_fields__', '__pydantic_fields_set__', '__pydantic_generic_metadata__', '__pydantic_init_subclass__', '__pydantic_parent_namespace__', '__pydantic_post_init__', '__pydantic_private__', '__pydantic_root_model__', '__pydantic_serializer__', '__pydantic_validator__', '__reduce__', '__reduce_ex__', '__replace__', '__repr__', '__repr_args__', '__repr_name__', '__repr_recursion__', '__repr_str__', '__rich_repr__', '__setattr__', '__setstate__', '__signature__', '__sizeof__', '__slots__', '__str__', '__subclasshook__', '__weakref__', '_abc_impl', '_calculate_keys', '_check_frozen', '_convert_literal_to_enum', '_copy_and_set_values', '_from_response', '_get_value', '_iter', 'audio_timestamp', 'automatic_function_calling', 'cached_content', 'candidate_count', 'construct', 'copy', 'dict', 'frequency_penalty', 'from_orm', 'http_options', 'json', 'labels', 'logprobs', 'max_output_tokens', 'media_resolution', 'model_computed_fields', 'model_config', 'model_construct', 'model_copy', 'model_dump', 'model_dump_json', 'model_extra', 'model_fields', 'model_fields_set', 'model_json_schema', 'model_parametrized_name', 'model_post_init', 'model_rebuild', 'model_validate', 'model_validate_json', 'model_validate_strings', 'parse_file', 'parse_obj', 'parse_raw', 'presence_penalty', 'response_logprobs', 'response_mime_type', 'response_modalities', 'response_schema', 'routing_config', 'safety_settings', 'schema', 'schema_json', 'seed', 'speech_config', 'stop_sequences', 'system_instruction', 'temperature', 'thinking_config', 'to_json_dict', 'tool_config', 'tools', 'top_k', 'top_p', 'update_forward_refs', 'validate']

The latter has response_logprobs! Promising, indeed. But alas, the following:

response = client.models.generate_content(
    model="gemini-2.0-flash-lite",
    contents="Tell me how the internet works, but pretend I'm a puppy who only understands squeaky toys.",
    config=types.GenerateContentConfig(
        temperature=0.4,
        top_p=0.95,
        top_k=20,
        candidate_count=1,
        seed=5,
        max_output_tokens=100,
        stop_sequences=["STOP!"],
        presence_penalty=0.0,
        frequency_penalty=0.0,
        response_logprobs=True
    )
)

print(response.text)

yields

ClientError: 400 INVALID_ARGUMENT. {'error': {'code': 400, 'message': 'Logprobs is not supported for the current model.', 'status': 'INVALID_ARGUMENT'}}

And the same for 1.5-flash as well. Not sure what else to try. @miaojingang, can you share how you are getting the logprob feature even from the models you listed? Versions and example code of what you are doing would be appreciated, as it's challenging to keep track of which version of the Gemini API this applies to. I've tried a few versions of the APIs but can't be certain that a different version won't work better.

Edit: I didn't spend too much time parsing it, but this commit seems very relevant. It looks like the devs tried to add logprobs back, but didn't quite get all the way? I'll have to look more deeply through it: googleapis/python-genai@5586f3d

@miaojingang

Thanks @chancharikmitra for the detailed notes. I'm new to all these different implementations and find it confusing.

I saw the generativeai example using the parameter was removed in google-gemini/cookbook#309. I got a timeout error the last time I tried it. A colleague got something working, and I will ask.

@slverpla

@miaojingang how did they make it work?

@wmpauli

wmpauli commented Mar 18, 2025

I was able to get this to work with

GENERATION_CONFIG = {
    "max_output_tokens": 8192,
    "temperature": 0,
    "top_p": 0,
    "response_logprobs": True,
    "logprobs": 5,
}

but only for gemini-1.5-flash.

According to the documentation, it currently only works for gemini-2.0-flash-lite-001 and gemini-1.5-flash:
https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference
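
For context, here is a hedged sketch of how a config like the one above might be passed through the Vertex AI Python SDK (the project, location, and prompt are placeholders, and the logprobs_result attribute on the candidate is an assumption based on the Vertex response schema):

import vertexai
from vertexai.generative_models import GenerativeModel

# Placeholders: substitute your own GCP project and region.
vertexai.init(project="my-project", location="us-central1")

GENERATION_CONFIG = {
    "max_output_tokens": 8192,
    "temperature": 0,
    "top_p": 0,
    "response_logprobs": True,
    "logprobs": 5,
}

model = GenerativeModel("gemini-1.5-flash")
response = model.generate_content(
    "What is the capital of France?",
    generation_config=GENERATION_CONFIG,
)

# Per-token log probabilities, if the backend returns them
# (attribute name assumed from the Vertex AI response schema).
print(response.candidates[0].logprobs_result)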

@tranhoangnguyen03

It has stopped working for gemini-2.0-flash-lite-001 and gemini-1.5-flash now.

@chancharikmitra

chancharikmitra commented Apr 29, 2025

Yes, agreed @tranhoangnguyen03. This seems to be a widespread source of confusion for those who care about this feature, have used it at some point, and now suddenly see it missing despite some remnants of it remaining in the Gemini documentation:

EDIT: The code below works if you use the Vertex AI API, NOT THE GOOGLE GEMINI API. My error was in confusing the two as I was sure their functionality would be the same.

This is a response I posted in another thread vercel/ai#5418 (comment).

Thanks for the reference. Gemini's documentation is a bit convoluted, as there are two versions of the API: genai (the new one) and generative-ai (the old one). I believe (not 100% sure) that support for the old one is slowly ramping down.

Now, I just tried all of the Gemini models using the new API with a Google Gemini key (I believe this is different from a Vertex AI key? It is a bit unclear, but I definitely got my API key specifically from the Gemini web app, so I am guessing it is for Gemini only and not a Vertex key).

TL;DR: response logprobs do not seem to be supported at all anymore, despite what the documentation claims. I think the docs are outdated. If someone is able to find something that works, I am eager to hear about it.

For reference, the code that does not yield logprobs for others to confirm taken from the gemini demo notebook: https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/gemini/getting-started/intro_gemini_2_5_pro.ipynb#scrollTo=37CH91ddY9kG

from google import genai
from google.genai.types import GenerateContentConfig

client = genai.Client(api_key=API_KEY)

MODEL_ID = "gemini-2.5-pro-preview-03-25" 

response = client.models.generate_content(
    model=MODEL_ID, contents="What's the largest planet in our solar system?", 
    config=GenerateContentConfig(
        response_logprobs=True,
        logprobs=1
    )
)
response
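
Following up on the edit above: if it's the Vertex AI backend that accepts logprobs, the same google-genai client can apparently be pointed at Vertex instead of an API key. A hedged sketch (the project and location are placeholders, and logprobs_result on the candidate is assumed from the SDK's response types):

from google import genai
from google.genai.types import GenerateContentConfig

# Route the google-genai SDK through Vertex AI rather than a Gemini API key.
client = genai.Client(vertexai=True, project="my-project", location="us-central1")

response = client.models.generate_content(
    model="gemini-1.5-flash",  # one of the models the docs list as logprobs-capable
    contents="What's the largest planet in our solar system?",
    config=GenerateContentConfig(response_logprobs=True, logprobs=1),
)

# Attribute name assumed from the SDK's response types.
print(response.candidates[0].logprobs_result)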

@victoraavila

+1 here. I am also not getting the logprobs for the Preview models. I am using the google.genai module:

{'error': {'code': 400, 'message': 'Logprobs is not enabled for models/gemini-2.5-pro-preview-03-25', 'status': 'INVALID_ARGUMENT'}}

{'error': {'code': 400, 'message': 'Logprobs is not enabled for models/gemini-2.5-flash-preview-04-17', 'status': 'INVALID_ARGUMENT'}}

@Tahlor

Tahlor commented May 7, 2025

It works on Vertex: 1 request per day, resets at 12am Pacific.
