---
license: mit
---

# TinyLlama-NoPE-HeadScale8k

## Citation

```
@misc{wang2024length,
      title={Length Generalization of Causal Transformers without Position Encoding},
      author={Jie Wang and Tao Ji and Yuanbin Wu and Hang Yan and Tao Gui and Qi Zhang and Xuanjing Huang and Xiaoling Wang},
      year={2024},
      eprint={2404.12224},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```