
Variability in Responses with top_k=1 Parameter in Gemini Pro Model #192

Closed as not planned
@storybite

Description of the bug:

Greetings,

While experimenting with the GenerationConfig parameters in the Gemini Pro model, I've noticed an unexpected variability in the outputs generated with the top_k=1 setting, which contrasts with the near-consistent responses observed with top_p=0.

Detailed Explanation:
Testing showed that top_p=0 produces near-consistent outputs for the same input, which is expected because it narrows sampling to the most probable token at each step. The top_k=1 setting does not show the same consistency. This is surprising: top_k=1 should restrict sampling to the single most probable token at each step, which should likewise give near-consistent outputs for identical requests.
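To make the expectation concrete, here is a minimal sketch of the textbook definitions of top-k and top-p (nucleus) filtering over a toy next-token distribution. The function names and the `toy_probs` values are made up for illustration, and this is not a claim about how the Gemini backend applies these parameters. Under these definitions, both top_k=1 and top_p=0 leave only the single most probable token as a candidate, so sampling cannot vary.

```python
import random

def top_k_candidates(probs, k):
    """Return the indices of the k most probable tokens."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return ranked[:k]

def top_p_candidates(probs, p):
    """Return the smallest top-ranked set whose cumulative probability
    reaches p. At least one token is always kept."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return kept

def sample(probs, candidates, rng):
    """Draw one token index from the candidates, weighted by probability."""
    weights = [probs[i] for i in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

# Hypothetical next-token distribution over a 4-token vocabulary.
toy_probs = [0.1, 0.5, 0.25, 0.15]
rng = random.Random(0)

k1 = top_k_candidates(toy_probs, 1)
p0 = top_p_candidates(toy_probs, 0.0)

# With a single candidate, repeated sampling always picks the same token.
draws_k1 = {sample(toy_probs, k1, rng) for _ in range(100)}
draws_p0 = {sample(toy_probs, p0, rng) for _ in range(100)}

print(k1, p0, draws_k1, draws_p0)
```

If the service followed these definitions exactly, top_k=1 alone would behave like greedy decoding. The variability reported here suggests that something else differs in practice, but this sketch cannot say what.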

Below is the Python code snippet illustrating the tests conducted and highlighting the difference in response consistency:

import google.generativeai as genai

# Assumes the API key is already configured, e.g. through the GOOGLE_API_KEY
# environment variable or genai.configure(api_key=...).
model = genai.GenerativeModel(model_name='gemini-pro')
user_message = "Write a one-sentence poem"

# Testing for near-consistent response with top_p=0
print("\ntop_p=0:")
generation_config = genai.GenerationConfig(top_p=0)
for _ in range(3):
    response = model.generate_content(user_message, generation_config=generation_config)
    print(f'{"_"*20}\n{response.text}')

# Testing for variability with top_k=1
print("\ntop_k=1:")
generation_config = genai.GenerationConfig(top_k=1)
for _ in range(3):
    response = model.generate_content(user_message, generation_config=generation_config)
    print(f'{"_"*20}\n{response.text}')

Output:

top_p=0:
____________________
In the vast expanse, a star whispers its tale.
____________________
In the vast expanse, a star whispers its tale.
____________________
In the vast expanse, a star whispers its tale.

top_k=1:
____________________
In a world of colors, a heart beats, a story unfolds.
____________________
In cosmic expanse, a flicker of light, a tale untold.
____________________
In twilight's embrace, dreams whisper of distant stars.

Actual vs expected behavior:

Expected Behavior:
It is anticipated that top_k=1 would result in near-consistent responses for the same input, similar to the behavior observed with top_p=0.

Actual Behavior:
The top_k=1 parameter exhibits significant variability in responses for identical inputs, contrary to the near-consistency expected.
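For anyone who wants to quantify the difference over more than three runs, a small tally helper along these lines may be useful. This is only a sketch: `count_distinct_responses` and `fake_generate` are hypothetical names, and the stand-in generator simply returns a fixed string so that the snippet runs without an API key.

```python
from collections import Counter

def count_distinct_responses(generate, n=10):
    """Call generate() n times and tally each distinct response text."""
    return Counter(generate() for _ in range(n))

# Stand-in generator for illustration only. To test the real model, pass
# something like:
#   lambda: model.generate_content(
#       user_message, generation_config=generation_config).text
def fake_generate():
    return "In the vast expanse, a star whispers its tale."

tally = count_distinct_responses(fake_generate, n=3)
print(len(tally), tally.most_common(1))
```

A tally with one key means every run returned the same text. More keys mean the setting under test produced varying output.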

Any other information you'd like to share?

No response

Labels

component:python sdk (Issue/PR related to Python SDK)
status:awaiting user response (Awaiting a response from the author)
status:stale (Issue/PR will be closed automatically if there's no further activity)
type:bug (Something isn't working)
