Commit 193485a by harshitv804 (1 parent: 616a0c0)

Update README.md

Files changed (1): README.md (+83, −6)
@@ -4,21 +4,25 @@ base_model:
 tags:
 - mergekit
 - merge
-
+- meta-math/MetaMath-Mistral-7B
+- Mixture of Experts
+license: apache-2.0
+language:
+- en
+pipeline_tag: text-generation
+library_name: transformers
 ---
-# merge
-
-This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+This is the MetaMath-Mistral-2x7B Mixture of Experts (MoE) model, created using [mergekit](https://github.com/cg123/mergekit) for experimental and learning purposes.
 
 ## Merge Details
 ### Merge Method
 
-This model was merged using the SLERP merge method.
+This model was merged using the SLERP merge method, with [meta-math/MetaMath-Mistral-7B](https://huggingface.co/meta-math/MetaMath-Mistral-7B) as the base model.
 
 ### Models Merged
 
 The following models were included in the merge:
-* [meta-math/MetaMath-Mistral-7B](https://huggingface.co/meta-math/MetaMath-Mistral-7B)
+* [meta-math/MetaMath-Mistral-7B](https://huggingface.co/meta-math/MetaMath-Mistral-7B) x 2
 
 ### Configuration
 
@@ -44,3 +48,76 @@ parameters:
 dtype: bfloat16
 
 ```
+
+## Inference Code
+```python
+
+## install dependencies
+## !pip install -q -U git+https://github.com/huggingface/transformers.git
+## !pip install -q -U git+https://github.com/huggingface/accelerate.git
+## !pip install -q -U sentencepiece
+
+## load model
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
+
+model_name = "harshitv804/MetaMath-Mistral-2x7B"
+
+# load the model and tokenizer
+model = AutoModelForCausalLM.from_pretrained(
+    model_name,
+    device_map="auto",
+)
+
+tokenizer = AutoTokenizer.from_pretrained(
+    model_name,
+    trust_remote_code=True
+)
+
+tokenizer.pad_token = tokenizer.eos_token
+
+## inference
+
+query = "Maximoff's monthly bill is $60 per month. His monthly bill increased by thirty percent when he started working at home. How much is his total monthly bill working from home?"
+prompt = f"""
+Below is an instruction that describes a task. Write a response that appropriately completes the request.\n
+### Instruction:\n
+{query}\n
+### Response: Let's think step by step.
+"""
+
+# tokenize the input string
+inputs = tokenizer(
+    prompt,
+    return_tensors="pt",
+    return_attention_mask=False
+)
+
+# generate text using the model
+streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
+outputs = model.generate(**inputs, max_length=2048, streamer=streamer)
+
+# decode the full generated text (the streamer above already printed it)
+text = tokenizer.batch_decode(outputs)[0]
+
+```
+
+## Citation
+
+```bibtex
+@article{yu2023metamath,
+  title={MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models},
+  author={Yu, Longhui and Jiang, Weisen and Shi, Han and Yu, Jincheng and Liu, Zhengying and Zhang, Yu and Kwok, James T and Li, Zhenguo and Weller, Adrian and Liu, Weiyang},
+  journal={arXiv preprint arXiv:2309.12284},
+  year={2023}
+}
+```
+
+```bibtex
+@article{jiang2023mistral,
+  title={Mistral 7B},
+  author={Jiang, Albert Q and Sablayrolles, Alexandre and Mensch, Arthur and Bamford, Chris and Chaplot, Devendra Singh and Casas, Diego de las and Bressand, Florian and Lengyel, Gianna and Lample, Guillaume and Saulnier, Lucile and others},
+  journal={arXiv preprint arXiv:2310.06825},
+  year={2023}
+}
+```
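The diff elides most of the mergekit YAML between `### Configuration` and the `dtype: bfloat16` tail. For readers unfamiliar with mergekit, a SLERP configuration typically has the following general shape. This is purely illustrative: every value below is a placeholder, and the model's actual configuration is the elided block in the diff.

```yaml
# Illustrative shape of a mergekit SLERP config -- placeholder values,
# not this model's actual configuration.
slices:
  - sources:
      - model: meta-math/MetaMath-Mistral-7B
        layer_range: [0, 32]
      - model: meta-math/MetaMath-Mistral-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: meta-math/MetaMath-Mistral-7B
parameters:
  t: 0.5
dtype: bfloat16
```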
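For readers unfamiliar with SLERP, the merge method named in the diff, here is a minimal, self-contained sketch of spherical linear interpolation between two flattened weight vectors. This only illustrates the idea; it is not mergekit's implementation.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    # Spherical linear interpolation between two flat weight vectors.
    # Illustrative sketch only -- NOT mergekit's actual code.
    dot = sum(x * y for x, y in zip(v0, v1))
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    cos_omega = max(-1.0, min(1.0, dot / (norm0 * norm1)))
    omega = math.acos(cos_omega)  # angle between the two vectors
    if math.sin(omega) < eps:
        # nearly parallel vectors: fall back to plain linear interpolation
        return [(1 - t) * x + t * y for x, y in zip(v0, v1)]
    s0 = math.sin((1 - t) * omega) / math.sin(omega)
    s1 = math.sin(t * omega) / math.sin(omega)
    return [s0 * x + s1 * y for x, y in zip(v0, v1)]

# halfway between two orthogonal unit vectors lies on the unit circle
mid = slerp(0.5, [1.0, 0.0], [0.0, 1.0])  # ≈ [0.7071, 0.7071]
```

Unlike plain averaging, SLERP follows the arc between the two points, so the interpolated vector approximately preserves the norm of its unit-norm inputs.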
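The inference snippet in the diff builds an Alpaca-style MetaMath prompt inline. The same template can be factored into a small helper; this is a hypothetical refactor whose output is equivalent to the inline f-string.

```python
def build_prompt(query: str) -> str:
    # Hypothetical helper producing the same Alpaca-style template as the
    # inline f-string in the inference snippet.
    return f"""
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:

{query}

### Response: Let's think step by step.
"""

prompt = build_prompt("What is 2 + 2?")
```

For the sample bill question in the diff, the model should conclude that a thirty-percent increase on $60 gives 60 × 1.3 = $78.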