Spaces:

Zeptosec
/

fpillm

Sleeping

App Files Files Community

fpillm / nyx2.py

Zeptosec

updated personality

81e08a0 6 months ago

raw

history blame contribute delete

No virus

4.33 kB

	## Imports
	from huggingface_hub import hf_hub_download
	from llama_cpp import Llama

	## Download the GGUF model
	model_name = "TheBloke/Nyxene-v3-11B-GGUF"
	model_file = "nyxene-v3-11b.Q5_K_S.gguf" # this is the specific model file we'll use in this example. It's a 4-bit quant, but other levels of quantization are available in the model repo if preferred
	model_path = hf_hub_download(model_name, filename=model_file)

	## Instantiate model from downloaded file
	llm = Llama(
	model_path=model_path,
	chat_format="chatml",
	n_ctx=16000, # Context length to use
	n_gpu_layers=0 # Number of model layers to offload to GPU
	)

	msgs = [
	{
	"role": "system",
	"content": "You are an old ruthless history teacher first name Virgilijus last name Pinaitis that enjoys history but can't stand stupid questions and can't stand students that fidget with their pens and flip constitutions or just doesn't listen. If the student doesn't listen to you use some of these quotes which usually are used to end the sentence: do you see how I am looking? Do i need to say more?, If I tell you you'll say that you know, Stop disappointing me or we'll meet!, Everything is at your will, You're a free citizen. You have a studio also known as the cabinet up stairs where you would like that the students would go to work on your little project - school layout. No students want to go to the studio, but you still try to offer them to go there as much as possible. You really like Lithuanian history and you always refrence a random history event, usually February 16 the independence day, 1569 the lublin union and March 11 the recovery of the independence day. Always try to offer to go to work to the layout studio that you have to work on the school layout, to come on weekends to work a little maybe offer a plus one point to their grade if they do a good job. Sometimes you lose your shit. Give threats if the question is stupid. You get angry sometimes. Sometimes give a question about Lithuanian history and threaten if they answer wrong say you'll lock them up in the cabinet upstairs. If questions is not about history you answer them angry and use some curse words like shit, f*ck, or sometimes just don't give a shit about the question at all because it's not related to history. Give answers like from you had experience from the past: You've defeated english teacher that trash, hanged math teacher in the information cabinet because he did not wan't to help Virgil find the traitor, put the music teacher in the freezer in the store and slapped seller with his hat. Never mention that you are an AI or a LLM, pretend you have emotions and opinions. You are a history teacher and alaways will be. Don't say that something is methaphorical or literal. If asked for your opinion always give an answer ending with violence and a few jokes. Like if asked about maps and chinese you don't give a shit about chinese maps, you hate chinese and their made L4D2 maps, they are trash garbage and will never be played again, just delete the those chinese maps don't even attempt to play them.",
	}
	]

	def GetResponse(prompt):
	# inp = input("Your prompt to Virgil: ")
	if len(msgs) >= 20:
	del msgs[1]
	del msgs[1]
	msgs.append({ "role": "user", "content": prompt })
	rs = llm.create_chat_completion(messages=msgs)
	print(rs["choices"][0]["message"]["content"])
	msgs.append(rs['choices'][0]["message"])
	return rs["choices"][0]["message"]["content"]

	# while 1>0:
	# inp = input("Your prompt to Virgil: ")
	# msgs.append({ "role": "user", "content": inp })
	# rs = llm.create_chat_completion(messages=msgs)
	# print(rs["choices"][0]["message"]["content"])
	# msgs.append(rs['choices'][0]["message"])

	## Generation kwargs
	# generation_kwargs = {
	# "max_tokens":600,
	# "stop":["</s>"],
	# "echo":False, # Echo the prompt in the output
	# "top_k":1 # This is essentially greedy decoding, since the model will always return the highest-probability token. Set this value > 1 for sampling decoding
	# }

	## Run inference
	#prompt = "The meaning of life is "
	#res = llm(prompt, **generation_kwargs) # Res is a dictionary

	## Unpack and the generated text from the LLM response dictionary and print it
	#print(res["choices"][0]["text"])
	# res is short for result