Is it possible to generate multiple sequences per token-batch sequence?

#7
by ArthurParkerhouse - opened

Interesting proposition! IMO it depends on the decoding method. Beam search (AFAIK) will generate deterministically for a given model (aside: I have not explored what happens if you use set_seed() and try different ones; maybe the outputs differ). If you generate text via sampling, it is possible to get different generated text(s) for a given input token batch. When I last tried sampling, the results were pretty trash. Granted, that was with the LED-MODELSIZE-book-summary series of models, so maybe things are different here. You can get details on how to generate via different methods here and in the transformers docs. I think the blog post details things, but essentially you could pass num_return_sequences=69 or whatever number of different summaries you find useful/have the compute for.
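For reference, here's a minimal sketch of sampling several summaries per input with `model.generate()`; the checkpoint name and parameter values are just examples, swap in whatever you're actually using:

```python
# Minimal sketch: sampling multiple candidate summaries per input with
# transformers. Checkpoint name and parameter values are illustrative.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "pszemraj/led-base-book-summary"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

inputs = tokenizer(
    "long document text goes here ...",
    return_tensors="pt",
    truncation=True,
)
outputs = model.generate(
    **inputs,
    do_sample=True,           # sampling, so repeated runs can differ
    top_p=0.95,
    num_return_sequences=4,   # several summaries for the same input
    max_new_tokens=256,
)
summaries = tokenizer.batch_decode(outputs, skip_special_tokens=True)
```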

Another way to get unique/different generations for a given input would be with the "new" contrastive search (which came out several months ago). For text-generation purposes (read: not summarization) I have found it to be superior to sampling; I haven't tried it for summarization, but it might do well. If you try that, let me know!

p.s.: the csearch blog post has most of the code you would need to try different decoding methods.
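In transformers, contrastive search is triggered by passing `penalty_alpha` together with `top_k`. A hedged sketch, reusing the model/tokenizer/inputs from the snippet above (values are illustrative):

```python
# Sketch of contrastive search: passing penalty_alpha together with top_k
# selects this decoding method in transformers.
outputs = model.generate(
    **inputs,
    penalty_alpha=0.6,   # degeneration penalty
    top_k=4,             # size of the candidate set at each step
    max_new_tokens=256,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```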

Thanks so much!

I was trying to get contrastive search to work in the batch tokenization script yesterday, but I'm not sure whether it was working properly. For a while it seemed to use only the top_k setting, and changing penalty_alpha didn't change the output at all.

Then I switched the order in which the arguments were listed: instead of (penalty_alpha=0.6, top_k=4), as per the csearch instructions, I listed top_k first (top_k=4, penalty_alpha=0.6). Only after listing the arguments in that order did changing the penalty_alpha setting (while keeping top_k the same) actually affect the output.

No clue why the order of those two specific keyword arguments changes how it works, though. I was using torch.manual_seed(3407) for repeatable examples.
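For anyone following along, a short sketch of the seeding approach mentioned above; transformers also ships a set_seed() helper (mentioned earlier in the thread) that seeds more than just torch:

```python
# Seeding for repeatable generations. torch.manual_seed() seeds torch only;
# transformers.set_seed() also seeds Python's random module and NumPy.
import torch
from transformers import set_seed

torch.manual_seed(3407)  # as used above
set_seed(3407)           # broader alternative
```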

I just discovered that Hugging Face has a separate forum section as well, which I just joined, and I'm planning on joining the Discord at some point, so hopefully soon I'll stop clogging up your discussion sections, lol.

No worries, and sorry for taking forever to respond. BTW, I have taken a look at some different inference params, and while a real analysis is still in progress, you can check out/compare some different params with long-t5 in the /long-t5 subfolder of my summarization gauntlet.

Closing for now, but feel free to comment and/or reopen if you have any issues or burning questions.

pszemraj changed discussion status to closed
