Decoding LLM Parameters, Part 2: Top-P (Nucleus Sampling)

The second article in this multi-part series explores how parameter tuning on Top-P impacts creativity, precision, and diversity in LLM content generation.

Pavan Vemuri

Prince Bose

Tharakarama Reddy Yernapalli Sreenivasulu

Oct. 09, 24 · Tutorial


LLM Parameters

Like any machine learning model, large language models have various parameters that control the variability of the generated text. We have started a multi-part series to explain the impact of these parameters in detail. We will conclude the series by showing how to balance all of the parameters discussed for content generation.

Welcome to the second part, where we discuss another well-known parameter, "Top-P."

Top-P (Nucleus Sampling)

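If the goal is to control the diversity of the model output, then Top-P is the one for you. At each step, the model samples the next token only from the smallest set of candidates whose cumulative probability reaches P (the "nucleus"). A lower Top-P restricts the model to the most probable words, whereas a higher Top-P admits less probable words as well, increasing diversity and creativity.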
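To make the mechanism concrete, nucleus sampling conceptually sorts the candidate next tokens by probability, keeps the smallest set whose cumulative probability reaches Top-P, renormalizes, and samples from that set. The sketch below illustrates the filtering step on a toy distribution in plain Python. The token names and probabilities are invented for illustration, and `nucleus_filter` is a hypothetical helper, not a function from the transformers library.

```python
def nucleus_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p.

    probs: dict mapping token -> probability (assumed to sum to 1).
    Returns a dict with the kept tokens, renormalized to sum to 1.
    """
    # Sort tokens from most to least probable.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)

    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break  # the nucleus is complete

    # Renormalize so the kept probabilities form a valid distribution.
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Toy next-token distribution (made-up numbers, for illustration only).
toy_probs = {"exercise": 0.40, "sleep": 0.25, "meditation": 0.20,
             "music": 0.10, "dragons": 0.05}

for top_p in (0.1, 0.5, 0.9):
    print(top_p, list(nucleus_filter(toy_probs, top_p)))
```

With these toy numbers, a Top-P of 0.1 keeps only the single most likely token, while 0.9 keeps four of the five candidates, which is why the setting has such a visible effect on diversity.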

Let us look at Top-P in action with the following code and output.

Python

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load GPT-2 model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Add pad token to tokenizer (GPT-2 doesn't have it by default)
tokenizer.pad_token = tokenizer.eos_token

# Generate a completion for the prompt, sampling with the given top_p
def generate_with_top_p(prompt, top_p):
    inputs = tokenizer(prompt, return_tensors='pt', padding=True)

    # do_sample=True enables random sampling, so the text differs between runs
    outputs = model.generate(
        inputs.input_ids,
        attention_mask=inputs.attention_mask,
        do_sample=True,
        max_length=200,  # total length in tokens, prompt included
        top_p=top_p,
        pad_token_id=tokenizer.eos_token_id  # GPT-2 reuses EOS as the pad token
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


prompt = "What are some effective ways to manage stress in daily life?"

# List of top-p values and their descriptions
top_p_values = {
    0.1: "Very conservative: Generates highly probable and safe responses.",
    0.3: "Conservative: Generates probable responses with less risk.",
    0.5: "Balanced: A mix of safe and creative responses.",
    0.7: "Creative: Generates more diverse and creative responses.",
    0.9: "Very creative: Allows for highly diverse and less probable responses."
}

# Test top_p variations
for top_p, description in top_p_values.items():
    print(f"\nTop-p {top_p} ({description}):\n")
    print(generate_with_top_p(prompt, top_p=top_p))

  

Output:

PowerShell

python test_top_p.py

Top-p 0.1 (Very conservative: Generates highly probable and safe responses.):

What are some effective ways to manage stress in daily life?

The following are some of the most common ways to manage stress in daily life.

1. Avoiding stress

The most common way to avoid stress is to avoid it.

The most common way to avoid stress is to avoid it.

2. Avoiding stress

The most common way to avoid stress is to avoid it.

The most common way to avoid stress is to avoid it.

3. Avoiding stress

The most common way to avoid stress is to avoid it.

The most common way to avoid stress is to avoid it.

4. Avoiding stress

The most common way to avoid stress is to avoid it.

The most common way to avoid stress is to avoid it.

5. Avoiding stress

The most common way to avoid stress is to avoid it.

The most common way to avoid stress

Top-p 0.3 (Conservative: Generates probable responses with less risk.):

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective

Top-p 0.5 (Balanced: A mix of safe and creative responses.):

What are some effective ways to manage stress in daily life?

1. Stay on top of your body's natural stress levels

When you're stressed, your body's natural stress levels are low.

If you're stressed, your body's natural stress levels are high.

If you're stressed, your body's natural stress levels are low.

2. Avoid excessive exercise

Exercise can make you feel better.

Exercise can make you feel better.

3. Get up early to avoid fatigue

Exercise can make you feel better.

Exercise can make you feel better.

4. Avoid the temptation to take the wrong thing

Exercise can make you feel better.

Exercise can make you feel better.

5. Avoid eating the wrong foods

Exercise can make you feel better.

Exercise can make you feel better.

6. Avoid the temptation to

Top-p 0.7 (Creative: Generates more diverse and creative responses.):

What are some effective ways to manage stress in daily life?

I am talking about a very specific situation. The person I am talking about has been stressed, but has not been doing much work for a long time. I want to tell you, because this person has had a lot of stress in his life, that it is not something you can just go back to. But what I'm trying to say is, that if you don't have a job, you have to go back to work every day, so you can spend more time with your family. So I've been doing that for a long time now. And so, that is a very common occurrence.

But what do you think is the best way to deal with the stress?

You know, it's not easy to deal with it. It is very difficult to deal with the stress that we experience. So, that is a very good way to deal with it. So, I think it's the

Top-p 0.9 (Very creative: Allows for highly diverse and less probable responses.):

What are some effective ways to manage stress in daily life?

There are many things that can be done by daily meditation and practice. As a general rule of thumb, meditation can help you stay mindful of your own needs, goals, feelings, desires, emotions, and the life and emotions around you.

The purpose of meditation is to feel a deep desire to practice more, to be more mindful, and to be more productive. It also serves to enhance your overall well-being.

1. Be active, be creative, be mindful, and be optimistic.

This is where the first step towards meditation comes from. If we're looking for inspiration, there's a whole section on being "active" and "creative."

While I'm not sure I know much about meditation, I know some of its practitioners and some that I never met. My mom used to tell me that she'd always find a way to make her feel more connected and involved.

Now let's understand the output.

Top-P 0.1 - Very Conservative: The model samples only from the smallest set of next-word candidates whose combined probability reaches 10%. In practice, that set is often just the single most likely word, so there is a lot of repetition in the generated content. Hence, this response lacks diversity and is also uninformative most of the time.
Top-P 0.3 - Conservative: The candidate set now covers 30% of the probability mass, so it is slightly less conservative than the previous Top-P setting. As you can see from the output, this has not improved content generation, and the prompt was repeated throughout the completion. In this case, the repetition suggests that, for the model, the most probable continuation after the prompt is the prompt itself.
Top-P 0.5 - Balanced: This is where you see the model listing some numbered strategies for the first time. You still see some repetition in this setting. But the bottom line is that at this Top-P setting, the model starts to incorporate a broader range of words. The output is a mix of standard advice with some inconsistencies. This Top-P value allows for improved creativity but still struggles with depth of information.
Top-P 0.7 - Creative: In this case, the model can select from a broader range of words, and as you can see, the response shifts towards a narrative style. The content is more creative, as it now involves a scenario where a person is dealing with stress. The downside is the loss of focus: the emphasis is not on managing stress but on the difficulties of coping with it.
Top-P 0.9 - Very Creative: In this setting, the model has access to a wide range of vocabulary and ideas, including less probable words and concepts. This enables the model to use more expressive language. Again, the downside of being very creative is that the model deviates from the prompt in its quest to produce rich and varied content.
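One way to see why the low settings loop is to count how many candidates survive the Top-P cut. When the model is confident about the next token, even a moderate Top-P can leave a single candidate. The sketch below does this count for two made-up distributions, one peaked and one nearly flat, at the five Top-P values used above. The numbers and the helper `nucleus_size` are hypothetical and chosen only to illustrate the effect; they are not taken from GPT-2.

```python
def nucleus_size(probs, top_p):
    """Count how many of the most probable tokens are needed to reach top_p."""
    cumulative = 0.0
    for count, p in enumerate(sorted(probs, reverse=True), start=1):
        cumulative += p
        if cumulative >= top_p:
            return count
    return len(probs)  # only reached if probs sum to less than top_p


# Made-up next-token distributions over 10 candidate tokens.
peaked = [0.80, 0.08, 0.04, 0.02, 0.02, 0.01, 0.01, 0.01, 0.005, 0.005]
flat = [0.12, 0.11, 0.11, 0.10, 0.10, 0.10, 0.09, 0.09, 0.09, 0.09]

for top_p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"top_p={top_p}: peaked -> {nucleus_size(peaked, top_p)} token(s), "
          f"flat -> {nucleus_size(flat, top_p)} token(s)")
```

With these numbers, the peaked distribution keeps a single token for every setting up to 0.7, and at Top-P 0.1 even the nearly flat distribution keeps only one. Sampling from a single candidate is effectively deterministic, which is consistent with the loops seen at the low settings.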

The critical thing to note from the above exercise is how the content changes as the Top-P setting changes. It also suggests that Top-P is not the only parameter that must be tuned to control the variation and relevance of the content.

Now, as in the previous part of this series, let us look at Top-P's impact on two use cases: "Creative Story Generation" and "Technical Explanation."

Python

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load GPT-2 model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Add pad token to tokenizer (GPT-2 doesn't have it by default)
tokenizer.pad_token = tokenizer.eos_token

# Function to generate response based on top_p
def generate_with_top_p(prompt, top_p, max_length=250):
    inputs = tokenizer(prompt, return_tensors='pt')
    outputs = model.generate(
        inputs.input_ids,
        attention_mask=inputs.attention_mask,
        do_sample=True,
        max_length=max_length,
        top_p=top_p,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=tokenizer.eos_token_id,
        no_repeat_ngram_size=2  # No 2-token sequence may appear twice in the output
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

### USE CASE 1: CREATIVE STORY GENERATION ###

def creative_story_generation():
    prompt = ("In the mystical land of Eldoria, a young warrior found an ancient map "
              "that led to a hidden treasure guarded by dragons. He knew that courage and "
              "wisdom would be his allies on this perilous journey.")

    # Negative Impact: Low top_p for creative writing (less creative)
    print("\n=== Creative Story with Low top_p (0.2) - Negative Impact: ===")
    low_top_p_story = generate_with_top_p(prompt, top_p=0.2)
    print(low_top_p_story)

    # Perfect Impact: High top_p for creative writing (more creative)
    print("\n=== Creative Story with High top_p (0.95) - Perfect Impact: ===")
    high_top_p_story = generate_with_top_p(prompt, top_p=0.95)
    print(high_top_p_story)

### USE CASE 2: TECHNICAL EXPLANATION ###

def technical_explanation():
    prompt = ("Explain step by step how the internet works, focusing on how computers "
              "use IP addresses and data packets to communicate with each other.")

    # Negative Impact: High top_p for technical writing (less precise)
    print("\n=== Technical Explanation with High top_p (0.95) - Negative Impact: ===")
    high_top_p_explanation = generate_with_top_p(prompt, top_p=0.95)
    print(high_top_p_explanation)

    # Perfect Impact: Optimal top_p for technical writing (accurate)
    print("\n=== Technical Explanation with Optimal top_p (0.5) - Perfect Impact: ===")
    optimal_top_p_explanation = generate_with_top_p(prompt, top_p=0.5)
    print(optimal_top_p_explanation)

# Run both use cases
creative_story_generation()
technical_explanation()

  

Output:

PowerShell

python top_p_multiple.py

=== Creative Story with Low top_p (0.2) - Negative Impact: ===
In the mystical land of Eldoria, a young warrior found an ancient map that led to a hidden treasure guarded by dragons. He knew that courage and wisdom would be his allies on this perilous journey.

The Dragon King
...
 (The Book of the Dragon)
,
-
: The Dragon Lord is a legendary warrior who has been the focus of many legends. The dragon king is the most powerful of all the dragons in the world. In the magical land, he is known as the "Dragon King". He is also known to be the leader of a group of dragons called the Black Dragons. His name is derived from the dragon's name, "the dragon".
"The Black Dragon" is an important symbol of power and powerlessness. It is said that the black dragon is able to create a dragon that can defeat the strongest of his enemies. However, the true power of this dragon lies in his ability to manipulate the minds of others. This ability is called "The Dark Dragon". The Dark dragon has a powerful sense of self-preservation and is capable of manipulating others to his will. When he has control over others, his power is so great that he can destroy entire cities. As a result

=== Creative Story with High top_p (0.95) - Perfect Impact: ===
In the mystical land of Eldoria, a young warrior found an ancient map that led to a hidden treasure guarded by dragons. He knew that courage and wisdom would be his allies on this perilous journey.

Spirits are like gods. In this world, there are no gods without secrets. There are also no secrets about being a fighter or a thief. But every dragon has a special hidden skill, and he or she can use that skill to destroy and gain strength or hide something hidden in the secret. Many dragons are skilled at their martial arts, while most are unaware of the secrets of their true power. These dragons cannot only use these skills, but that will only allow them to escape the dragons' clutches. Because their training will be tested before they're even born, dragon fighting has never been so hard, even without training, so they should be able to break a dragon's body.

=== Technical Explanation with High top_p (0.95) - Negative Impact: ===
Explain step by step how the internet works, focusing on how computers use IP addresses and data packets to communicate with each other. If a person with the same identity as a user on the US government's private network uses the online address bar, then this data is sent to a server on a computer on your local network. Your IP address is a small byte in the string. The IP and network address are identical. Do you remember, you just want to do that instead of using IPs or numbers. In addition, remember that IP can be used to verify a particular IP for you and your computer. For instance, your name does not always match an address on our government network and you should have your public IP in this country. This does seem quite unusual and perhaps a bit bizarre.

There was a time in Silicon Valley when you could set your identity out. But in most of today's world, how do you set up your own address and how does one look for it? What about the public? The internet itself was different. It was just a set of rules around data flow that you were supposed to follow. Now, even in today the "internet in general" seems a little more complicated to define. Let's say

=== Technical Explanation with Optimal top_p (0.5) - Perfect Impact: ===
Explain step by step how the internet works, focusing on how computers use IP addresses and data packets to communicate with each other.

"We've been trying to understand how it works and what it means for the future," says James. "It's not just about the IP address, it's about how people communicate. It's also about what's going on with the data. We want to see how this works. What is the Internet going to look like in the next 10 years?"
, the director of the Computer Science and Artificial Intelligence Laboratory at the University of Michigan, says that while there's still a lot of work to be done, "we've got to start to think about it."
  

Now let us break down and analyze the output for creative story generation and technical explanation based on the Top-P settings and how the output was impacted.

To demonstrate the impact of Top-P more clearly, we used more detailed prompts that steer the output so that the effect is easy to observe.
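Note that the second script also passes no_repeat_ngram_size=2 to generate, so no two-token sequence can appear twice in the output. This suppresses the exact loops seen in the first experiment regardless of the Top-P value. Conceptually, before each step the decoder bans every token that would complete an n-gram already present in the text. The sketch below shows that idea on integer token IDs. `banned_next_tokens` is a hypothetical helper written for this illustration, not the library's internal implementation.

```python
def banned_next_tokens(generated, ngram_size):
    """Return the tokens that would repeat an n-gram already in `generated`."""
    if ngram_size < 1 or len(generated) < ngram_size - 1:
        return set()

    # The last (n-1) tokens form the prefix of the n-gram about to be completed.
    prefix_len = ngram_size - 1
    prefix = tuple(generated[len(generated) - prefix_len:])

    banned = set()
    # Slide over every n-gram seen so far; if it starts with the current
    # prefix, its final token is not allowed as the next token.
    for i in range(len(generated) - ngram_size + 1):
        ngram = tuple(generated[i:i + ngram_size])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


# Hypothetical token IDs: the bigrams (5, 7) and (5, 9) have already occurred,
# and the text currently ends in 5, so 7 and 9 may not follow.
history = [5, 7, 2, 5, 9, 5]
print(banned_next_tokens(history, ngram_size=2))
```

Because this constraint is active in both the low and high Top-P runs, the differences between those outputs come from Top-P, while the absence of verbatim repetition comes from the n-gram ban.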

Creative Story Generation

Low Top-P (Negative Impact): As you can see, with the lower Top-P the model is restricted to a narrow set of high-probability words and phrases, which leads to repetition and redundancy (the text keeps circling around "dragon," "king," and "power"). Creativity is also limited, as the model rarely introduces new ideas. But if you notice, the logical flow is still maintained, and the model stays on topic, which is typical of lower Top-P values.
High Top-P (Perfect Impact): In this case, the model introduces new concepts and adds a creative angle to the narration. A broader vocabulary is used, adding depth and richness to the text. However, the increased creativity comes at the cost of logical flow.

The contrast between the two narratives clearly shows the impact of Top-P, making it easy to understand how it affects creative writing.

Technical Explanation

High Top-P (Negative Impact): As you can see, a high Top-P negatively impacts technical explanations by disrupting the logical flow and drifting off topic. The model also introduces information that is irrelevant to the explanation.
Optimal Top-P (Perfect Impact): With the optimal Top-P, the output is more coherent and stays closer to the topic. The content aligns more with the prompt and balances accuracy and expression well. Because the model is limited to more probable words, the output is more predictable, although this alone does not guarantee that the information is correct.

Conclusion

With this experiment, we have showcased the importance of the Top-P parameter in controlling the randomness and creativity of the generated text. We first looked at how the output for a single prompt varies with Top-P, and then took a use-case-based approach to see how the right Top-P setting depends on the use case.

However, from the previous part and this part of the series, we have seen that no single parameter is enough on its own to ensure the quality of the generated content. That is why it is essential to look at the combined impact of all of these parameters, which we will do in the final part of this series.
