
Phi-1.5-Tele Model Card

Model Summary

The language model Phi-1.5-Tele is a Transformer with 1.3 billion parameters, specialized in telecommunications. It is based on Microsoft's phi-1.5 and was continually pretrained on Tele-Data, a large-scale dataset of approximately 2.5 billion tokens of telecommunications material, including articles, standards, and general web content related to the telecommunications domain.

When assessed against telecommunications benchmarks such as Tele-Eval, Phi-1.5-Tele outperforms phi-1.5 by several percentage points. Additionally, Phi-1.5-Tele matches phi-1.5 on benchmarks of common sense, language understanding, and logical reasoning. The domain adaptation was thus achieved with minimal compromise to the performance of the original model.

Context Length

The model was trained on a context length of 2048 tokens.
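If an input may exceed this limit, the tokenizer can truncate it to fit the context window. Below is a minimal sketch; the over-long prompt is a made-up placeholder, and generation then proceeds as in the snippets further down.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("AliMaatouk/Phi-1.5-Tele")

long_prompt = "Summarize the following specification. " * 500  # placeholder over-long input
# Truncate to the model's 2048-token context window before calling generate().
inputs = tokenizer(long_prompt, return_tensors="pt", truncation=True, max_length=2048)
print(inputs["input_ids"].shape)  # sequence length is capped at 2048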

Usage

Phi-1.5-Tele is a base model best suited for fine-tuning on applications related to telecommunications. Although it has not been specifically fine-tuned to follow instructions, it can be prompted to answer questions and follow instructions using the following format:

Write me a poem about telecommunications.

Answer: Our world is a network of digital streams, 
Connecting every voice and thought,
Through the wires and fibers that transmit,
Bringing us closer to the end of the road.

where the model generates the text after "Answer:".
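Since the model is best suited for fine-tuning, below is a minimal sketch of continued fine-tuning with the Hugging Face Trainer. The two-example toy corpus, hyperparameters, and output directory are illustrative placeholders, not recommended settings; substitute your own telecommunications dataset.

from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "AliMaatouk/Phi-1.5-Tele"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # the tokenizer ships without a pad token
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")

# Toy stand-in corpus in the question/"Answer:" format; replace with real data.
texts = [
    "What does SINR stand for?\nAnswer: Signal-to-interference-plus-noise ratio.",
    "Name a 5G radio access technology.\nAnswer: New Radio (NR).",
]
dataset = Dataset.from_dict({"text": texts})

def tokenize(batch):
    # Stay within the model's 2048-token context window.
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)  # causal LM objective

args = TrainingArguments(
    output_dir="phi-1.5-tele-ft",      # placeholder path
    per_device_train_batch_size=1,
    num_train_epochs=1,
    logging_steps=1,
)
Trainer(model=model, args=args, train_dataset=tokenized, data_collator=collator).train()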

Sample Code

Below we share some code snippets on how to quickly get started with running the model. First, make sure to pip install transformers, then copy the snippet corresponding to your hardware and adapt it to your use case.

Running the model on a CPU

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the model and tokenizer; torch_dtype="auto" uses the dtype saved in the checkpoint.
model = AutoModelForCausalLM.from_pretrained("AliMaatouk/Phi-1.5-Tele", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("AliMaatouk/Phi-1.5-Tele")

# Prompt in the question/"Answer:" format described above.
prompt = "Write me a poem about telecommunications.\nAnswer:"
input_ids = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**input_ids, max_new_tokens=100)

# Decode only the newly generated tokens, skipping the prompt.
generated_tokens = outputs[0, len(input_ids['input_ids'][0]):]
response = tokenizer.decode(generated_tokens, skip_special_tokens=True)
print(response)

Running the model on a single / multi GPU

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# device_map="auto" (requires the accelerate package) places the model on the
# available GPU(s); torch_dtype="auto" uses the dtype saved in the checkpoint.
model = AutoModelForCausalLM.from_pretrained("AliMaatouk/Phi-1.5-Tele", torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("AliMaatouk/Phi-1.5-Tele")

prompt = "Write me a poem about telecommunications.\nAnswer:"
# Move the tokenized inputs to the GPU before generation.
input_ids = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids, max_new_tokens=100)

# Decode only the newly generated tokens, skipping the prompt.
generated_tokens = outputs[0, len(input_ids['input_ids'][0]):]
response = tokenizer.decode(generated_tokens, skip_special_tokens=True)
print(response)
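By default, generate performs greedy decoding. For more varied text such as the poem above, sampling can be enabled; a sketch continuing from either snippet above, with illustrative values that are not tuned for this model:

# Sampling instead of greedy decoding; temperature/top_p values are illustrative.
outputs = model.generate(
    **input_ids,
    max_new_tokens=100,
    do_sample=True,   # draw from the token distribution
    temperature=0.7,  # assumed value, not tuned for this model
    top_p=0.9,        # nucleus sampling cutoff
)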

Citation

You can find the paper with all details about the model at https://arxiv.org/abs/2409.05314. Please cite it as follows:

@misc{maatouk2024telellmsseriesspecializedlarge,
      title={Tele-LLMs: A Series of Specialized Large Language Models for Telecommunications}, 
      author={Ali Maatouk and Kenny Chirino Ampudia and Rex Ying and Leandros Tassiulas},
      year={2024},
      eprint={2409.05314},
      archivePrefix={arXiv},
      primaryClass={cs.IT},
      url={https://arxiv.org/abs/2409.05314}, 
}