tyoyo committed
Commit: 5c01a58
Parent: ad90505

Update README.md

Files changed (1)
  1. README.md +101 -5
README.md CHANGED
@@ -1,25 +1,121 @@
  ---
  library_name: transformers
  tags:
  - llama-cpp
  ---

  # Llama-3-ELYZA-JP-8B-GGUF

  ## Use with llama.cpp
  Install llama.cpp through brew (works on Mac and Linux)

  ```bash
  brew install llama.cpp
  ```
- Invoke the llama.cpp server or the CLI.

- ### CLI:
  ```bash
- llama --hf-repo elyza/Llama-3-ELYZA-JP-8B-GGUF --hf-file Llama-3-ELYZA-JP-8B-q4_k_m.gguf -p "古代ギリシャを学ぶ上で知っておくべきポイントは?"
  ```

- ### Server:
  ```bash
- llama-server --hf-repo elyza/Llama-3-ELYZA-JP-8B-GGUF --hf-file Llama-3-ELYZA-JP-8B-q4_k_m.gguf -c 2048
  ```
 
  ---
  library_name: transformers
+ license: llama3
+ language:
+ - ja
+ - en
  tags:
  - llama-cpp
  ---

  # Llama-3-ELYZA-JP-8B-GGUF

+ ![Llama-3-ELYZA-JP-8B-image](./key_visual.png)
+
+ ## Model Description
+
+ **Llama-3-ELYZA-JP-8B** is a large language model trained by [ELYZA, Inc](https://elyza.ai/).
+ Based on [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct), it has been enhanced for Japanese usage through additional pre-training and instruction tuning.
+
+ For more details, please refer to [our blog post](https://note.com/elyza/n/n360b6084fdbd).
+
  ## Use with llama.cpp
  Install llama.cpp through brew (works on Mac and Linux)

  ```bash
  brew install llama.cpp
  ```
+ Invoke the llama.cpp server.

  ```bash
+ $ llama-server \
+   --hf-repo elyza/Llama-3-ELYZA-JP-8B-GGUF \
+   --hf-file Llama-3-ELYZA-JP-8B-q4_k_m.gguf \
+   --port 8080
  ```

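Before calling the API, it can help to confirm the server is up (the first launch also downloads the model). As a hedged aside, recent llama.cpp server builds expose a `/health` endpoint, though its availability depends on your build:

```bash
# Liveness check: recent llama.cpp server builds expose /health.
# Assumes the llama-server started above is listening on port 8080.
curl -s http://localhost:8080/health
```
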
+ Call the API using curl.
+
  ```bash
+ $ curl http://localhost:8080/v1/chat/completions \
+   -H "Content-Type: application/json" \
+   -d '{
+     "messages": [
+       { "role": "system", "content": "あなたは誠実で優秀な日本人のアシスタントです。特に指示が無い場合は、常に日本語で回答してください。" },
+       { "role": "user", "content": "古代ギリシャを学ぶ上で知っておくべきポイントは?" }
+     ],
+     "temperature": 0.6,
+     "max_tokens": -1,
+     "stream": false
+   }'
+ ```
+
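The completions endpoint returns an OpenAI-style JSON body with the reply text under `choices[0].message.content`. A minimal sketch for pulling out just that field, assuming `jq` is installed (for example via `brew install jq`):

```bash
# Print only the assistant's reply text from the JSON response.
# Assumes jq is installed and llama-server is still running on port 8080.
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "user", "content": "古代ギリシャを学ぶ上で知っておくべきポイントは?" }
    ]
  }' | jq -r '.choices[0].message.content'
```
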
+ Call the API using Python.
+
+ ```python
+ import openai
+
+ client = openai.OpenAI(
+     base_url="http://localhost:8080/v1",
+     api_key="dummy_api_key",
+ )
+
+ completion = client.chat.completions.create(
+     model="dummy_model_name",
+     messages=[
+         {"role": "system", "content": "あなたは誠実で優秀な日本人のアシスタントです。特に指示が無い場合は、常に日本語で回答してください。"},
+         {"role": "user", "content": "古代ギリシャを学ぶ上で知っておくべきポイントは?"}
+     ]
+ )
+
+ # Print the model's reply.
+ print(completion.choices[0].message.content)
+ ```
+
+ ## Use with Desktop App
+
+ Various desktop applications can handle GGUF models; here we introduce how to use the model in a local environment, without writing any code, using LM Studio.
+
+ - **Installation**: Download and install [LM Studio](https://lmstudio.ai/).
+ - **Downloading the Model**: Search for `elyza/Llama-3-ELYZA-JP-8B-GGUF` in the search bar on the home page 🏠, and download `Llama-3-ELYZA-JP-8B-q4_k_m.gguf`.
+ - **Start Chatting**: Click on 💬 in the sidebar, select `Llama-3-ELYZA-JP-8B-GGUF` from "Select a Model to load" in the header, and load the model. Now you can freely chat with the local LLM.
+ - **Setting Options**: You can set options from the sidebar on the right. Faster inference can be achieved by setting Quick GPU Offload Settings to Max in the GPU Settings.
+ - **For Developers, Starting the API Server**: Click `<->` in the left sidebar and move to the Local Server tab. Select the model and click Start Server to launch an OpenAI-compatible API server (see the sketch after this list).
+
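A minimal sketch of calling that server from the command line, since the endpoint is OpenAI-compatible like `llama-server` above. The port is an assumption: LM Studio commonly defaults to 1234, so substitute whatever address the Local Server tab actually shows.

```bash
# Query LM Studio's OpenAI-compatible local server.
# Port 1234 is an assumption (a common LM Studio default); check the
# Local Server tab for the real address.
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "user", "content": "古代ギリシャを学ぶ上で知っておくべきポイントは?" }
    ],
    "temperature": 0.6
  }'
```
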
+ ## Quantization Options
+
+ Currently, we only offer quantized models in the Q4_K_M format.
+
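If you need a level other than Q4_K_M, one possible route is building a GGUF yourself from the original weights. The following is only a hedged sketch, not an official recipe: it assumes a local llama.cpp checkout (whose conversion script is named `convert_hf_to_gguf.py` in recent versions; older checkouts use `convert-hf-to-gguf.py`) and the full-precision model from [elyza/Llama-3-ELYZA-JP-8B](https://huggingface.co/elyza/Llama-3-ELYZA-JP-8B) downloaded locally.

```bash
# Hedged sketch: build an alternative quantization (e.g. Q5_K_M) yourself.
# Assumes a llama.cpp checkout with its Python requirements installed and
# the original full-precision weights in ./Llama-3-ELYZA-JP-8B.
python convert_hf_to_gguf.py ./Llama-3-ELYZA-JP-8B \
  --outtype f16 \
  --outfile Llama-3-ELYZA-JP-8B-f16.gguf

# Requantize the f16 GGUF with llama.cpp's llama-quantize tool.
./llama-quantize Llama-3-ELYZA-JP-8B-f16.gguf \
  Llama-3-ELYZA-JP-8B-q5_k_m.gguf Q5_K_M
```

Higher-bit levels such as Q5_K_M or Q6_K preserve more quality at the cost of a larger file.
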
+ ## Developers
+
+ Listed in alphabetical order.
+
+ - [Masato Hirakawa](https://huggingface.co/m-hirakawa)
+ - [Shintaro Horie](https://huggingface.co/e-mon)
+ - [Tomoaki Nakamura](https://huggingface.co/tyoyo)
+ - [Daisuke Oba](https://huggingface.co/daisuk30ba)
+ - [Sam Passaglia](https://huggingface.co/passaglia)
+ - [Akira Sasaki](https://huggingface.co/akirasasaki)
+
+ ## License
+
+ [Meta Llama 3 Community License](https://llama.meta.com/llama3/license/)
+
+ ## How to Cite
+
+ ```tex
+ @misc{elyzallama2024,
+   title={elyza/Llama-3-ELYZA-JP-8B},
+   url={https://huggingface.co/elyza/Llama-3-ELYZA-JP-8B},
+   author={Masato Hirakawa and Shintaro Horie and Tomoaki Nakamura and Daisuke Oba and Sam Passaglia and Akira Sasaki},
+   year={2024},
+ }
+ ```
+
+ ## Citations
+
+ ```tex
+ @article{llama3modelcard,
+   title={Llama 3 Model Card},
+   author={AI@Meta},
+   year={2024},
+   url={https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
+ }
  ```