Files changed (1)

README.md CHANGED (+10 −22)
@@ -1,5 +1,5 @@
 ---
-license: apache-2.0
+license: other
 pipeline_tag: text-generation
 tags:
 - chemistry
@@ -8,20 +8,12 @@ language:
 - zh
 ---
 # ChemLLM-7B-Chat: LLM for Chemistry and Molecule Science
-
-> [!IMPORTANT]
-> Better using New version of ChemLLM!
-> [AI4Chem/ChemLLM-7B-Chat-1.5-DPO](https://huggingface.co/AI4Chem/ChemLLM-7B-Chat-1.5-DPO) or [AI4Chem/ChemLLM-7B-Chat-1.5-SFT](https://huggingface.co/AI4Chem/ChemLLM-7B-Chat-1.5-SFT)
-
-
 ChemLLM-7B-Chat, The First Open-source Large Language Model for Chemistry and Molecule Science, Build based on InternLM-2 with ❤
 [![Paper page](https://huggingface.co/datasets/huggingface/badges/resolve/main/paper-page-sm.svg)](https://huggingface.co/papers/2402.06852)
 
 <center><img src='https://cdn-uploads.huggingface.co/production/uploads/64bce15bafd1e46c5504ad38/wdFV6p3rTBCtskbeuVwNJ.png'></center>
 
 ## News
-- ChemLLM-1.5 released! Two versions are available [AI4Chem/ChemLLM-7B-Chat-1.5-DPO](https://huggingface.co/AI4Chem/ChemLLM-7B-Chat-1.5-DPO) or [AI4Chem/ChemLLM-7B-Chat-1.5-SFT](https://huggingface.co/AI4Chem/ChemLLM-7B-Chat-1.5-SFT).[2024-4-2]
-- ChemLLM-1.5 updated! Have a try on [Demo Site](https://chemllm.org/#/chat) or [API Reference](https://api.chemllm.org/docs).[2024-3-23]
 - ChemLLM has been featured by HuggingFace on [“Daily Papers” page](https://huggingface.co/papers/2402.06852).[2024-2-13]
 - ChemLLM arXiv preprint released.[ChemLLM: A Chemical Large Language Model](https://arxiv.org/abs/2402.06852)[2024-2-10]
 - News report from [Shanghai AI Lab](https://mp.weixin.qq.com/s/u-i7lQxJzrytipek4a87fw)[2024-1-26]
@@ -44,7 +36,7 @@ import torch
 model_name_or_id = "AI4Chem/ChemLLM-7B-Chat"
 
 model = AutoModelForCausalLM.from_pretrained(model_name_or_id, torch_dtype=torch.float16, device_map="auto", trust_remote_code=True)
-tokenizer = AutoTokenizer.from_pretrained(model_name_or_id,trust_remote_code=True)
+tokenizer = AutoTokenizer.from_pretrained(model_name_or_id, trust_remote_code=True)
 
 prompt = "What is Molecule of Ibuprofen?"
 
@@ -75,21 +67,17 @@ You can format it into this InternLM2 Dialogue format like,
 ```
 def InternLM2_format(instruction,prompt,answer,history):
     prefix_template=[
-        "<|im_start|>system\n",
-        "{}",
-        "<|im_end|>\n"
+        "<|system|>:",
+        "{}"
     ]
     prompt_template=[
-        "<|im_start|>user\n",
-        "{}",
-        "<|im_end|>\n"
-        "<|im_start|>assistant\n",
-        "{}",
-        "<|im_end|>\n"
+        "<|user|>:",
+        "{}\n",
+        "<|Bot|>:\n"
     ]
-    system = f'{prefix_template[0]}{prefix_template[1].format(instruction)}{prefix_template[2]}'
-    history = "".join([f'{prompt_template[0]}{prompt_template[1].format(qa[0])}{prompt_template[2]}{prompt_template[3]}{prompt_template[4].format(qa[1])}{prompt_template[5]}' for qa in history])
-    prompt = f'{prompt_template[0]}{prompt_template[1].format(prompt)}{prompt_template[2]}{prompt_template[3]}'
+    system = f'{prefix_template[0]}\n{prefix_template[-1].format(instruction)}\n'
+    history = "\n".join([f'{prompt_template[0]}\n{prompt_template[1].format(qa[0])}{prompt_template[-1]}{qa[1]}' for qa in history])
+    prompt = f'\n{prompt_template[0]}\n{prompt_template[1].format(prompt)}{prompt_template[-1]}'
     return f"{system}{history}{prompt}"
 ```
 And there is a good example for system prompt,
 
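The updated formatter in the last hunk can be checked as a standalone sketch. The template strings below mirror the `+` lines of the diff exactly; the snake_case function name and the dropped unused `answer` argument are editorial adjustments for a self-contained example, not the README's own code.

```python
# Sketch of the updated dialogue formatter, assembled from the "+" lines above.
# `history` is a list of (question, answer) pairs from earlier turns.
def internlm2_format(instruction, prompt, history):
    prefix_template = ["<|system|>:", "{}"]
    prompt_template = ["<|user|>:", "{}\n", "<|Bot|>:\n"]
    # System block: role tag on its own line, then the instruction.
    system = f"{prefix_template[0]}\n{prefix_template[-1].format(instruction)}\n"
    # Past turns: each user question followed by the bot's answer.
    past = "\n".join(
        f"{prompt_template[0]}\n{prompt_template[1].format(q)}{prompt_template[-1]}{a}"
        for q, a in history
    )
    # Current turn ends with the bot tag so the model continues from there.
    turn = f"\n{prompt_template[0]}\n{prompt_template[1].format(prompt)}{prompt_template[-1]}"
    return f"{system}{past}{turn}"

print(internlm2_format("You are a helpful chemistry assistant.",
                       "What is the SMILES of ibuprofen?", []))
```

With an empty history the result is just the system block plus the current turn, ending in `<|Bot|>:` so generation picks up as the assistant.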