Steelskull committed on
Commit
88d3ae5
1 Parent(s): de9e8d3

Update README.md

Files changed (1)
  1. README.md +110 -49
README.md CHANGED
@@ -7,52 +7,113 @@ tags:
  - merge
 
  ---
- # merge
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the passthrough merge method.
-
- ### Models Merged
-
- The following models were included in the merge:
- * [unsloth/Mistral-Small-Instruct-2409](https://huggingface.co/unsloth/Mistral-Small-Instruct-2409)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- dtype: bfloat16
- merge_method: passthrough
- slices:
- - sources:
-   - layer_range: [0, 41]
-     model: unsloth/Mistral-Small-Instruct-2409
- - sources:
-   - layer_range: [19, 41]
-     model: unsloth/Mistral-Small-Instruct-2409
-     parameters:
-       scale:
-       - filter: o_proj
-         value: 0.0
-       - filter: down_proj
-         value: 0.0
-       - value: 1.0
- - sources:
-   - layer_range: [19, 41]
-     model: unsloth/Mistral-Small-Instruct-2409
-     parameters:
-       scale:
-       - filter: o_proj
-         value: 0.0
-       - filter: down_proj
-         value: 0.0
-       - value: 1.0
- - sources:
-   - layer_range: [41, 55]
-     model: unsloth/Mistral-Small-Instruct-2409
- ```
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+ <meta charset="UTF-8">
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
+ <title>BA-Zephyria-39b Data Card</title>
+ <link href="https://fonts.googleapis.com/css2?family=Quicksand:wght@400;500;600&display=swap" rel="stylesheet">
+ <style>
+ body, html {
+ height: 100%;
+ margin: 0;
+ padding: 0;
+ font-family: 'Quicksand', sans-serif;
+ background: linear-gradient(135deg, #0a1128 0%, #1c2541 100%);
+ color: #e0e1dd;
+ font-size: 16px;
+ }
+ .container {
+ width: 100%;
+ height: 100%;
+ padding: 20px;
+ margin: 0;
+ background-color: rgba(255, 255, 255, 0.05);
+ border-radius: 12px;
+ box-shadow: 0 4px 10px rgba(0, 0, 0, 0.3);
+ backdrop-filter: blur(10px);
+ border: 1px solid rgba(255, 255, 255, 0.1);
+ }
+ .header h1 {
+ font-size: 28px;
+ color: #4cc9f0;
+ margin: 0 0 20px 0;
+ text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.3);
+ }
+ .update-section h2 {
+ font-size: 24px;
+ color: #7209b7;
+ }
+ .update-section p {
+ font-size: 16px;
+ line-height: 1.6;
+ color: #e0e1dd;
+ }
+ .info img {
+ width: 100%;
+ border-radius: 10px;
+ margin-bottom: 15px;
+ }
+ a {
+ color: #4cc9f0;
+ text-decoration: none;
+ }
+ a:hover {
+ color: #f72585;
+ }
+ .button {
+ display: inline-block;
+ background-color: #3a0ca3;
+ color: #e0e1dd;
+ padding: 10px 20px;
+ border-radius: 5px;
+ cursor: pointer;
+ text-decoration: none;
+ }
+ .button:hover {
+ background-color: #7209b7;
+ }
+ pre {
+ background-color: #1c2541;
+ padding: 10px;
+ border-radius: 5px;
+ overflow-x: auto;
+ }
+ code {
+ font-family: 'Courier New', monospace;
+ color: #e0e1dd;
+ }
+ </style>
+ </head>
+ <body>
+ <div class="container">
+ <div class="header">
+ <h1>BA-Zephyria-39b [EXPERIMENTAL]</h1>
+ </div>
+ <div class="info">
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/64545af5ec40bbbd01242ca6/6W3orrbf8A68l-3p_JxN1.png">
+ <h2>Model Information</h2>
+ <p><strong>Base Model:</strong> unsloth/Mistral-Small-Instruct-2409</p>
+ <p><strong>Strategy:</strong> Balanced Approach</p>
+ <p><strong>Total Layers:</strong> 55</p>
+ <p><strong>Duplication Start:</strong> Layer 19 (34.5% of model)</p>
+ <p><strong>Duplicated Layers:</strong> 23 (41.8% of model)</p>
+ <p><strong>Unique Final Layers:</strong> 14 (25.5% of model)</p>
+ <h2>Model Characteristics</h2>
+ <ul>
+ <li>Combines benefits of early and mid duplication strategies</li>
+ <li>Balanced between unique initial layers, duplicated middle layers, and unique final layers</li>
+ <li>Versatile approach suitable for a wide range of tasks</li>
+ <li>Provides substantial unique layers at the end for task-specific adaptations</li>
+ </ul>
+ <h2>Configuration Visualization</h2>
+ <pre><code>
+ [ Unique ][ Duplicated ][ Unique ]
+ 0 ----------- 18 19 ------------ 41 42 ---------- 54
+ 34.5% 41.8% 23.7%
+ </code></pre>
+ </div>
+ </div>
+ </body>
+ </html>
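
The slice layout in the removed YAML can be tallied with a short script. The sketch below is illustrative only: it assumes each `layer_range` is half-open (`[start, end)`, so `[0, 41]` covers layers 0 through 40), and the helper names `slice_sizes` and `summarize` are hypothetical, not part of mergekit.

```python
# Layer ranges copied from the `slices` list in the removed README's YAML.
SLICES = [(0, 41), (19, 41), (19, 41), (41, 55)]


def slice_sizes(slices):
    """Return the layer count of each slice.

    Assumption: ranges are half-open, i.e. [start, end) excludes `end`.
    """
    return [end - start for start, end in slices]


def summarize(slices):
    """Return (rows, total), where each row is (start, end, count, percent of total)."""
    sizes = slice_sizes(slices)
    total = sum(sizes)
    rows = [
        (start, end, n, round(100 * n / total, 1))
        for (start, end), n in zip(slices, sizes)
    ]
    return rows, total


if __name__ == "__main__":
    rows, total = summarize(SLICES)
    for start, end, n, pct in rows:
        print(f"[{start:>2}, {end:>2})  {n:>3} layers  {pct:>5}% of merged stack")
    print(f"total: {total} layers")
```

Under the half-open assumption the four slices contribute 41, 22, 22, and 14 layers. If the ranges were instead read as inclusive, each count would be one larger.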