
Why would the tokenizer for an encoder-decoder machine-translation model use bos_token_id == eos_token_id? How does the model know when a sequence ends?
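For anyone who wants to look at this concretely, here is a minimal sketch, assuming the Hugging Face transformers API. The checkpoint name `Helsinki-NLP/opus-mt-en-de` is just an illustrative MarianMT translation model (Marian checkpoints are a well-known case where the config sets the BOS and EOS IDs to the same value), not necessarily the model the question is about:

```python
# Minimal sketch (Hugging Face transformers assumed; the checkpoint name
# below is illustrative, not necessarily the model in question).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

name = "Helsinki-NLP/opus-mt-en-de"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

# The tokenizer and the model config each carry their own view of the
# special tokens; either may leave bos unset, and the two IDs can coincide.
print("tokenizer bos/eos:", tokenizer.bos_token_id, tokenizer.eos_token_id)
print("config    bos/eos:", model.config.bos_token_id, model.config.eos_token_id)
print("decoder_start_token_id:", model.config.decoder_start_token_id)

# Generation is seeded with decoder_start_token_id and halts as soon as the
# model *emits* eos_token_id, so a shared BOS/EOS ID stays unambiguous:
# position (seed token vs. generated token) distinguishes the two roles.
inputs = tokenizer("How are you?", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(out[0], skip_special_tokens=False))
```

The underlying point the sketch illustrates: the decoder only ever needs EOS as an *output* signal, while its input is seeded separately via decoder_start_token_id, so reusing one ID for both BOS and EOS does not confuse end-of-sequence detection.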
