pszemraj committed on
Commit
f30bdc1
1 Parent(s): 9c32c23

Add BERTopic model

README.md ADDED
@@ -0,0 +1,83 @@
+
+ ---
+ tags:
+ - bertopic
+ library_name: bertopic
+ pipeline_tag: text-classification
+ ---
+
+ # BERTopic-summcomparer-gauntlet-v0p1-sentence-t5-xl-document_text
+
+ This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
+ BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.
+
+ ## Usage
+
+ To use this model, please install BERTopic:
+
+ ```
+ pip install -U bertopic
+ ```
+
+ You can use the model as follows:
+
+ ```python
+ from bertopic import BERTopic
+ topic_model = BERTopic.load("pszemraj/BERTopic-summcomparer-gauntlet-v0p1-sentence-t5-xl-document_text")
+
+ topic_model.get_topic_info()
+ ```
+
+ ## Topic overview
+
+ * Number of topics: 16
+ * Number of training documents: 630
+
+ <details>
+ <summary>Click here for an overview of all topics.</summary>
+
+ | Topic ID | Topic Keywords | Topic Frequency | Label |
+ |----------|----------------|-----------------|-------|
+ | -1 | convolutional - images - networks - superpixels - overfitting | 12 | -1_convolutional_images_networks_superpixels |
+ | 0 | bruno - guy - pdf - screentalk - he | 26 | 0_bruno_guy_pdf_screentalk |
+ | 1 | elsa - arendelle - kristoff - frozen - anna | 94 | 1_elsa_arendelle_kristoff_frozen |
+ | 2 | gillis - script - room - ll - artie | 73 | 2_gillis_script_room_ll |
+ | 3 | interpretation - explanation - theory - structure - merge | 72 | 3_interpretation_explanation_theory_structure |
+ | 4 | topics - topic - documents - corpus - document | 63 | 4_topics_topic_documents_corpus |
+ | 5 | nemo - dory - chum - gill - fish | 56 | 5_nemo_dory_chum_gill |
+ | 6 | films - film - identity - trauma - zinnemann | 54 | 6_films_film_identity_trauma |
+ | 7 | computational - data - pathology - medical - informatics | 47 | 7_computational_data_pathology_medical |
+ | 8 | images - captions - representations - embeddings - image | 26 | 8_images_captions_representations_embeddings |
+ | 9 | zaroff - rainsford - hunt - hunting - general | 24 | 9_zaroff_rainsford_hunt_hunting |
+ | 10 | cogvideo - interpolation - videos - coglm - frames | 24 | 10_cogvideo_interpolation_videos_coglm |
+ | 11 | assignment - essays - questions - projects - students | 17 | 11_assignment_essays_questions_projects |
+ | 12 | things - ll - some - lol - explain | 16 | 12_things_ll_some_lol |
+ | 13 | videos - arxiv - visual - preprint - generative | 13 | 13_videos_arxiv_visual_preprint |
+ | 14 | spectrograms - musecoder - melspectrogram - vocoding - spectrogram | 13 | 14_spectrograms_musecoder_melspectrogram_vocoding |
+
+ </details>
+
+ ## Training hyperparameters
+
+ * calculate_probabilities: True
+ * language: None
+ * low_memory: False
+ * min_topic_size: 10
+ * n_gram_range: (1, 1)
+ * nr_topics: None
+ * seed_topic_list: None
+ * top_n_words: 10
+ * verbose: True
+
+ ## Framework versions
+
+ * Numpy: 1.22.4
+ * HDBSCAN: 0.8.29
+ * UMAP: 0.5.3
+ * Pandas: 1.5.3
+ * Scikit-Learn: 1.2.2
+ * Sentence-transformers: 2.2.2
+ * Transformers: 4.29.2
+ * Numba: 0.56.4
+ * Plotly: 5.13.1
+ * Python: 3.10.11
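Reviewer note on the README table: each value in the Label column appears to be the topic ID followed by the first four keywords, joined with underscores. The sketch below splits a label back into its parts, assuming that layout holds. `parse_label` is a hypothetical helper for illustration, not part of BERTopic.

```python
def parse_label(label: str) -> tuple[int, list[str]]:
    """Split a label such as '5_nemo_dory_chum_gill' into (topic_id, keywords)."""
    # Everything before the first underscore is the topic ID (may be "-1").
    topic_id, _, rest = label.partition("_")
    # Assumption: the remaining keywords contain no underscores themselves.
    return int(topic_id), rest.split("_")


print(parse_label("-1_convolutional_images_networks_superpixels"))
```

Keywords that contain an underscore (topics.json lists `medical_` for topic 7, for example) would break this simple split, so treat it as a convenience for reading the table rather than a general parser.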
config.json ADDED
@@ -0,0 +1,15 @@
+ {
+ "calculate_probabilities": true,
+ "language": null,
+ "low_memory": false,
+ "min_topic_size": 10,
+ "n_gram_range": [
+ 1,
+ 1
+ ],
+ "nr_topics": null,
+ "seed_topic_list": null,
+ "top_n_words": 10,
+ "verbose": true,
+ "embedding_model": "sentence-transformers/sentence-t5-xl"
+ }
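Reviewer note on config.json: the file stores `n_gram_range` as a JSON array, while the README lists it as the tuple `(1, 1)`. The sketch below reads the values with the standard `json` module and restores the tuple form. `load_config` is a hypothetical helper, the JSON text is copied from the diff, and nothing here claims to be how BERTopic itself loads the file.

```python
import json

# Contents of config.json as added in this commit.
CONFIG_TEXT = """
{
  "calculate_probabilities": true,
  "language": null,
  "low_memory": false,
  "min_topic_size": 10,
  "n_gram_range": [1, 1],
  "nr_topics": null,
  "seed_topic_list": null,
  "top_n_words": 10,
  "verbose": true,
  "embedding_model": "sentence-transformers/sentence-t5-xl"
}
"""


def load_config(text: str) -> dict:
    """Parse the config and convert n_gram_range back to a tuple."""
    config = json.loads(text)
    # JSON has no tuple type, so the (min_n, max_n) pair arrives as a list.
    config["n_gram_range"] = tuple(config["n_gram_range"])
    return config


config = load_config(CONFIG_TEXT)
print(config["n_gram_range"], config["embedding_model"])
```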
ctfidf.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a9c9cec94da8d52bd72ea2ca76c52bf3bd1fd5edcfc20abf6fd84601d1425d16
+ size 528340
ctfidf_config.json ADDED
The diff for this file is too large to render. See raw diff
topic_embeddings.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5452fafd52d3715ea3325d8147988bd7ea8d9ad60ac7831f27a3422617a99ea6
+ size 49240
topics.json ADDED
@@ -0,0 +1,1429 @@
1
+ {
2
+ "topic_representations": {
3
+ "-1": [
4
+ [
5
+ "convolutional",
6
+ 0.812958836555481
7
+ ],
8
+ [
9
+ "images",
10
+ 0.789915144443512
11
+ ],
12
+ [
13
+ "networks",
14
+ 0.7750969529151917
15
+ ],
16
+ [
17
+ "superpixels",
18
+ 0.7669301629066467
19
+ ],
20
+ [
21
+ "overfitting",
22
+ 0.766011118888855
23
+ ],
24
+ [
25
+ "image",
26
+ 0.7607262134552002
27
+ ],
28
+ [
29
+ "segmentation",
30
+ 0.7580690383911133
31
+ ],
32
+ [
33
+ "neural",
34
+ 0.7571927905082703
35
+ ],
36
+ [
37
+ "input",
38
+ 0.7562020421028137
39
+ ],
40
+ [
41
+ "nn",
42
+ 0.7535852789878845
43
+ ]
44
+ ],
45
+ "0": [
46
+ [
47
+ "bruno",
48
+ 0.782376766204834
49
+ ],
50
+ [
51
+ "guy",
52
+ 0.7694101333618164
53
+ ],
54
+ [
55
+ "pdf",
56
+ 0.7633169889450073
57
+ ],
58
+ [
59
+ "screentalk",
60
+ 0.7479361295700073
61
+ ],
62
+ [
63
+ "he",
64
+ 0.7390514612197876
65
+ ],
66
+ [
67
+ "int",
68
+ 0.7379845380783081
69
+ ],
70
+ [
71
+ "his",
72
+ 0.7341611385345459
73
+ ],
74
+ [
75
+ "converted",
76
+ 0.7340353727340698
77
+ ],
78
+ [
79
+ "out",
80
+ 0.7325744032859802
81
+ ],
82
+ [
83
+ "voice",
84
+ 0.7307334542274475
85
+ ]
86
+ ],
87
+ "1": [
88
+ [
89
+ "elsa",
90
+ 0.8160275816917419
91
+ ],
92
+ [
93
+ "arendelle",
94
+ 0.7971220016479492
95
+ ],
96
+ [
97
+ "kristoff",
98
+ 0.7863443493843079
99
+ ],
100
+ [
101
+ "frozen",
102
+ 0.7767009139060974
103
+ ],
104
+ [
105
+ "anna",
106
+ 0.7665482759475708
107
+ ],
108
+ [
109
+ "olaf",
110
+ 0.7611944675445557
111
+ ],
112
+ [
113
+ "hans",
114
+ 0.7529194355010986
115
+ ],
116
+ [
117
+ "snow",
118
+ 0.7397163510322571
119
+ ],
120
+ [
121
+ "no",
122
+ 0.7334048748016357
123
+ ],
124
+ [
125
+ "sven",
126
+ 0.7317110300064087
127
+ ]
128
+ ],
129
+ "2": [
130
+ [
131
+ "gillis",
132
+ 0.780834436416626
133
+ ],
134
+ [
135
+ "script",
136
+ 0.7442724704742432
137
+ ],
138
+ [
139
+ "room",
140
+ 0.7441475987434387
141
+ ],
142
+ [
143
+ "ll",
144
+ 0.7397430539131165
145
+ ],
146
+ [
147
+ "artie",
148
+ 0.7382926940917969
149
+ ],
150
+ [
151
+ "norma",
152
+ 0.7371121048927307
153
+ ],
154
+ [
155
+ "house",
156
+ 0.7355591058731079
157
+ ],
158
+ [
159
+ "some",
160
+ 0.7345719933509827
161
+ ],
162
+ [
163
+ "no",
164
+ 0.7334718108177185
165
+ ],
166
+ [
167
+ "out",
168
+ 0.7313063144683838
169
+ ]
170
+ ],
171
+ "3": [
172
+ [
173
+ "interpretation",
174
+ 0.7614122629165649
175
+ ],
176
+ [
177
+ "explanation",
178
+ 0.7482936382293701
179
+ ],
180
+ [
181
+ "theory",
182
+ 0.7403494119644165
183
+ ],
184
+ [
185
+ "structure",
186
+ 0.7351824045181274
187
+ ],
188
+ [
189
+ "merge",
190
+ 0.7320129871368408
191
+ ],
192
+ [
193
+ "conditions",
194
+ 0.7313622236251831
195
+ ],
196
+ [
197
+ "simplest",
198
+ 0.7254719734191895
199
+ ],
200
+ [
201
+ "interesting",
202
+ 0.7216659784317017
203
+ ],
204
+ [
205
+ "system",
206
+ 0.7192807197570801
207
+ ],
208
+ [
209
+ "something",
210
+ 0.7187553644180298
211
+ ]
212
+ ],
213
+ "4": [
214
+ [
215
+ "topics",
216
+ 0.7710573673248291
217
+ ],
218
+ [
219
+ "topic",
220
+ 0.764340877532959
221
+ ],
222
+ [
223
+ "documents",
224
+ 0.7622215747833252
225
+ ],
226
+ [
227
+ "corpus",
228
+ 0.7556679844856262
229
+ ],
230
+ [
231
+ "document",
232
+ 0.7548444271087646
233
+ ],
234
+ [
235
+ "data",
236
+ 0.7541389465332031
237
+ ],
238
+ [
239
+ "words",
240
+ 0.7497113943099976
241
+ ],
242
+ [
243
+ "frequency",
244
+ 0.7443191409111023
245
+ ],
246
+ [
247
+ "vocabulary",
248
+ 0.7432523369789124
249
+ ],
250
+ [
251
+ "example",
252
+ 0.7350988388061523
253
+ ]
254
+ ],
255
+ "5": [
256
+ [
257
+ "nemo",
258
+ 0.8560476303100586
259
+ ],
260
+ [
261
+ "dory",
262
+ 0.8030993938446045
263
+ ],
264
+ [
265
+ "chum",
266
+ 0.7655107975006104
267
+ ],
268
+ [
269
+ "gill",
270
+ 0.7619814872741699
271
+ ],
272
+ [
273
+ "fish",
274
+ 0.7474814653396606
275
+ ],
276
+ [
277
+ "sharkbait",
278
+ 0.7334390878677368
279
+ ],
280
+ [
281
+ "swim",
282
+ 0.732296347618103
283
+ ],
284
+ [
285
+ "uh",
286
+ 0.7292873859405518
287
+ ],
288
+ [
289
+ "coral",
290
+ 0.728416919708252
291
+ ],
292
+ [
293
+ "no",
294
+ 0.7282388210296631
295
+ ]
296
+ ],
297
+ "6": [
298
+ [
299
+ "films",
300
+ 0.7558428645133972
301
+ ],
302
+ [
303
+ "film",
304
+ 0.7491594552993774
305
+ ],
306
+ [
307
+ "identity",
308
+ 0.7483522891998291
309
+ ],
310
+ [
311
+ "trauma",
312
+ 0.7400946617126465
313
+ ],
314
+ [
315
+ "zinnemann",
316
+ 0.7322919368743896
317
+ ],
318
+ [
319
+ "identities",
320
+ 0.7272083163261414
321
+ ],
322
+ [
323
+ "traces",
324
+ 0.7219871282577515
325
+ ],
326
+ [
327
+ "urban",
328
+ 0.721187174320221
329
+ ],
330
+ [
331
+ "between",
332
+ 0.7194730639457703
333
+ ],
334
+ [
335
+ "materiality",
336
+ 0.7193468809127808
337
+ ]
338
+ ],
339
+ "7": [
340
+ [
341
+ "computational",
342
+ 0.7724958658218384
343
+ ],
344
+ [
345
+ "data",
346
+ 0.7706875205039978
347
+ ],
348
+ [
349
+ "pathology",
350
+ 0.7677849531173706
351
+ ],
352
+ [
353
+ "medical",
354
+ 0.7668463587760925
355
+ ],
356
+ [
357
+ "informatics",
358
+ 0.7635326385498047
359
+ ],
360
+ [
361
+ "classification",
362
+ 0.7610385417938232
363
+ ],
364
+ [
365
+ "medical_",
366
+ 0.7591860890388489
367
+ ],
368
+ [
369
+ "data_",
370
+ 0.7576411962509155
371
+ ],
372
+ [
373
+ "icu",
374
+ 0.7570323944091797
375
+ ],
376
+ [
377
+ "images",
378
+ 0.7569260597229004
379
+ ]
380
+ ],
381
+ "8": [
382
+ [
383
+ "images",
384
+ 0.8044949173927307
385
+ ],
386
+ [
387
+ "captions",
388
+ 0.783771812915802
389
+ ],
390
+ [
391
+ "representations",
392
+ 0.7822390794754028
393
+ ],
394
+ [
395
+ "embeddings",
396
+ 0.7761370539665222
397
+ ],
398
+ [
399
+ "image",
400
+ 0.7710026502609253
401
+ ],
402
+ [
403
+ "embedding",
404
+ 0.766433835029602
405
+ ],
406
+ [
407
+ "conditioning",
408
+ 0.7635176777839661
409
+ ],
410
+ [
411
+ "encoder",
412
+ 0.7606396675109863
413
+ ],
414
+ [
415
+ "decoder",
416
+ 0.7513459920883179
417
+ ],
418
+ [
419
+ "classifier",
420
+ 0.7493197917938232
421
+ ]
422
+ ],
423
+ "9": [
424
+ [
425
+ "zaroff",
426
+ 0.7830722332000732
427
+ ],
428
+ [
429
+ "rainsford",
430
+ 0.7752155065536499
431
+ ],
432
+ [
433
+ "hunt",
434
+ 0.7669715285301208
435
+ ],
436
+ [
437
+ "hunting",
438
+ 0.758228063583374
439
+ ],
440
+ [
441
+ "general",
442
+ 0.7424396872520447
443
+ ],
444
+ [
445
+ "he",
446
+ 0.7389982342720032
447
+ ],
448
+ [
449
+ "ll",
450
+ 0.7381408214569092
451
+ ],
452
+ [
453
+ "hunter",
454
+ 0.7372696995735168
455
+ ],
456
+ [
457
+ "had",
458
+ 0.733473539352417
459
+ ],
460
+ [
461
+ "man",
462
+ 0.7315329909324646
463
+ ]
464
+ ],
465
+ "10": [
466
+ [
467
+ "cogvideo",
468
+ 0.8041282892227173
469
+ ],
470
+ [
471
+ "interpolation",
472
+ 0.7749236226081848
473
+ ],
474
+ [
475
+ "videos",
476
+ 0.7728286981582642
477
+ ],
478
+ [
479
+ "coglm",
480
+ 0.7700319290161133
481
+ ],
482
+ [
483
+ "frames",
484
+ 0.7670143246650696
485
+ ],
486
+ [
487
+ "iterations",
488
+ 0.7624410390853882
489
+ ],
490
+ [
491
+ "sequential",
492
+ 0.7571900486946106
493
+ ],
494
+ [
495
+ "cog",
496
+ 0.7526829242706299
497
+ ],
498
+ [
499
+ "pretraining",
500
+ 0.7518821954727173
501
+ ],
502
+ [
503
+ "model",
504
+ 0.7496122121810913
505
+ ]
506
+ ],
507
+ "11": [
508
+ [
509
+ "assignment",
510
+ 0.7819938659667969
511
+ ],
512
+ [
513
+ "essays",
514
+ 0.7542336583137512
515
+ ],
516
+ [
517
+ "questions",
518
+ 0.7518027424812317
519
+ ],
520
+ [
521
+ "projects",
522
+ 0.7412513494491577
523
+ ],
524
+ [
525
+ "students",
526
+ 0.7385664582252502
527
+ ],
528
+ [
529
+ "learning",
530
+ 0.7382364273071289
531
+ ],
532
+ [
533
+ "readings",
534
+ 0.7367334365844727
535
+ ],
536
+ [
537
+ "homework",
538
+ 0.7351245880126953
539
+ ],
540
+ [
541
+ "session",
542
+ 0.7344810962677002
543
+ ],
544
+ [
545
+ "required",
546
+ 0.729199230670929
547
+ ]
548
+ ],
549
+ "12": [
550
+ [
551
+ "things",
552
+ 0.7505715489387512
553
+ ],
554
+ [
555
+ "ll",
556
+ 0.7484922409057617
557
+ ],
558
+ [
559
+ "some",
560
+ 0.7481634616851807
561
+ ],
562
+ [
563
+ "lol",
564
+ 0.7392996549606323
565
+ ],
566
+ [
567
+ "explain",
568
+ 0.732305645942688
569
+ ],
570
+ [
571
+ "why",
572
+ 0.7317633628845215
573
+ ],
574
+ [
575
+ "can",
576
+ 0.7299840450286865
577
+ ],
578
+ [
579
+ "think",
580
+ 0.7281968593597412
581
+ ],
582
+ [
583
+ "thoughts",
584
+ 0.727929949760437
585
+ ],
586
+ [
587
+ "am",
588
+ 0.7273967862129211
589
+ ]
590
+ ],
591
+ "13": [
592
+ [
593
+ "videos",
594
+ 0.7792222499847412
595
+ ],
596
+ [
597
+ "arxiv",
598
+ 0.7715873718261719
599
+ ],
600
+ [
601
+ "visual",
602
+ 0.7565194964408875
603
+ ],
604
+ [
605
+ "preprint",
606
+ 0.7493596076965332
607
+ ],
608
+ [
609
+ "generative",
610
+ 0.7435239553451538
611
+ ],
612
+ [
613
+ "models",
614
+ 0.7431691884994507
615
+ ],
616
+ [
617
+ "ieeeicvf",
618
+ 0.7431351542472839
619
+ ],
620
+ [
621
+ "ieee",
622
+ 0.7425673007965088
623
+ ],
624
+ [
625
+ "generating",
626
+ 0.7353911995887756
627
+ ],
628
+ [
629
+ "video",
630
+ 0.7344731092453003
631
+ ]
632
+ ],
633
+ "14": [
634
+ [
635
+ "spectrograms",
636
+ 0.8015516996383667
637
+ ],
638
+ [
639
+ "musecoder",
640
+ 0.7912242412567139
641
+ ],
642
+ [
643
+ "melspectrogram",
644
+ 0.7880717515945435
645
+ ],
646
+ [
647
+ "vocoding",
648
+ 0.7843880653381348
649
+ ],
650
+ [
651
+ "spectrogram",
652
+ 0.7839890122413635
653
+ ],
654
+ [
655
+ "waveforms",
656
+ 0.7807684540748596
657
+ ],
658
+ [
659
+ "enhancement",
660
+ 0.7761921286582947
661
+ ],
662
+ [
663
+ "recordings",
664
+ 0.7706590294837952
665
+ ],
666
+ [
667
+ "waveform",
668
+ 0.7621763348579407
669
+ ],
670
+ [
671
+ "diffwave",
672
+ 0.7587536573410034
673
+ ]
674
+ ]
675
+ },
676
+ "topics": [
677
+ 7,
678
+ -1,
679
+ -1,
680
+ -1,
681
+ 8,
682
+ -1,
683
+ -1,
684
+ -1,
685
+ -1,
686
+ -1,
687
+ -1,
688
+ 7,
689
+ 7,
690
+ 7,
691
+ 0,
692
+ 0,
693
+ 0,
694
+ 0,
695
+ 0,
696
+ 0,
697
+ 0,
698
+ 0,
699
+ 0,
700
+ 0,
701
+ 0,
702
+ 0,
703
+ 0,
704
+ 0,
705
+ 0,
706
+ 0,
707
+ 0,
708
+ 0,
709
+ 0,
710
+ 0,
711
+ 0,
712
+ 0,
713
+ 0,
714
+ 0,
715
+ 0,
716
+ 0,
717
+ 0,
718
+ 0,
719
+ 0,
720
+ 0,
721
+ 0,
722
+ 0,
723
+ 0,
724
+ 0,
725
+ 0,
726
+ 0,
727
+ 0,
728
+ 0,
729
+ 0,
730
+ 0,
731
+ 0,
732
+ 0,
733
+ 0,
734
+ 0,
735
+ 0,
736
+ 0,
737
+ 0,
738
+ 0,
739
+ 0,
740
+ 0,
741
+ 0,
742
+ 0,
743
+ 0,
744
+ 0,
745
+ 0,
746
+ 0,
747
+ 0,
748
+ 0,
749
+ 0,
750
+ 0,
751
+ 0,
752
+ 0,
753
+ 0,
754
+ 0,
755
+ 0,
756
+ 0,
757
+ 0,
758
+ 0,
759
+ 0,
760
+ 0,
761
+ 0,
762
+ 0,
763
+ 0,
764
+ 0,
765
+ 0,
766
+ 0,
767
+ 0,
768
+ 0,
769
+ 0,
770
+ 0,
771
+ 0,
772
+ 0,
773
+ 0,
774
+ 0,
775
+ 0,
776
+ 0,
777
+ 0,
778
+ 0,
779
+ 0,
780
+ 7,
781
+ 7,
782
+ 7,
783
+ 7,
784
+ -1,
785
+ 7,
786
+ 7,
787
+ 7,
788
+ 7,
789
+ 7,
790
+ 7,
791
+ 8,
792
+ 7,
793
+ 7,
794
+ 7,
795
+ 7,
796
+ 7,
797
+ 7,
798
+ 9,
799
+ 9,
800
+ 9,
801
+ 9,
802
+ 9,
803
+ 9,
804
+ 9,
805
+ 9,
806
+ 9,
807
+ 9,
808
+ 9,
809
+ 9,
810
+ 9,
811
+ 9,
812
+ 9,
813
+ 9,
814
+ 9,
815
+ 9,
816
+ 9,
817
+ 9,
818
+ 9,
819
+ 9,
820
+ 9,
821
+ 9,
822
+ 5,
823
+ 5,
824
+ 5,
825
+ 5,
826
+ 5,
827
+ 5,
828
+ 5,
829
+ 5,
830
+ 5,
831
+ 5,
832
+ 5,
833
+ 5,
834
+ 5,
835
+ 5,
836
+ 5,
837
+ 5,
838
+ 5,
839
+ 5,
840
+ 5,
841
+ 5,
842
+ 5,
843
+ 5,
844
+ 5,
845
+ 5,
846
+ 5,
847
+ 5,
848
+ 5,
849
+ 5,
850
+ 5,
851
+ 5,
852
+ 5,
853
+ 5,
854
+ 5,
855
+ 5,
856
+ 5,
857
+ 5,
858
+ 5,
859
+ 5,
860
+ 5,
861
+ 5,
862
+ 5,
863
+ 5,
864
+ 5,
865
+ 5,
866
+ 5,
867
+ 5,
868
+ 5,
869
+ 5,
870
+ 5,
871
+ 5,
872
+ 5,
873
+ 5,
874
+ 5,
875
+ 5,
876
+ 0,
877
+ 11,
878
+ 4,
879
+ 4,
880
+ 11,
881
+ 11,
882
+ 11,
883
+ 11,
884
+ 11,
885
+ 11,
886
+ 11,
887
+ -1,
888
+ -1,
889
+ 4,
890
+ -1,
891
+ 11,
892
+ 4,
893
+ 11,
894
+ 4,
895
+ 4,
896
+ 4,
897
+ 4,
898
+ 4,
899
+ 4,
900
+ 11,
901
+ 12,
902
+ 12,
903
+ 12,
904
+ 12,
905
+ 12,
906
+ 12,
907
+ 12,
908
+ 12,
909
+ 12,
910
+ 12,
911
+ 12,
912
+ 12,
913
+ 12,
914
+ 6,
915
+ 6,
916
+ 6,
917
+ 6,
918
+ 6,
919
+ 6,
920
+ 6,
921
+ 6,
922
+ 6,
923
+ 6,
924
+ 6,
925
+ 6,
926
+ 6,
927
+ 6,
928
+ 6,
929
+ 6,
930
+ 6,
931
+ 6,
932
+ 6,
933
+ 6,
934
+ 6,
935
+ 6,
936
+ 6,
937
+ 6,
938
+ 6,
939
+ 6,
940
+ 6,
941
+ 6,
942
+ 6,
943
+ 6,
944
+ 6,
945
+ 6,
946
+ 6,
947
+ 6,
948
+ 0,
949
+ 6,
950
+ 6,
951
+ 6,
952
+ 6,
953
+ 6,
954
+ 6,
955
+ 6,
956
+ 6,
957
+ 6,
958
+ 6,
959
+ 6,
960
+ 6,
961
+ 10,
962
+ 13,
963
+ 10,
964
+ -1,
965
+ 10,
966
+ 10,
967
+ 10,
968
+ 10,
969
+ 10,
970
+ 10,
971
+ 10,
972
+ 10,
973
+ 10,
974
+ 10,
975
+ 10,
976
+ 10,
977
+ 13,
978
+ 13,
979
+ 13,
980
+ 13,
981
+ 10,
982
+ 10,
983
+ 10,
984
+ -1,
985
+ 11,
986
+ 11,
987
+ 11,
988
+ 4,
989
+ 4,
990
+ 4,
991
+ 4,
992
+ 4,
993
+ 4,
994
+ 4,
995
+ -1,
996
+ 4,
997
+ 4,
998
+ 4,
999
+ 4,
1000
+ 4,
1001
+ 4,
1002
+ 4,
1003
+ 4,
1004
+ 4,
1005
+ 4,
1006
+ 4,
1007
+ 4,
1008
+ 4,
1009
+ 4,
1010
+ 4,
1011
+ -1,
1012
+ 11,
1013
+ 6,
1014
+ 0,
1015
+ 2,
1016
+ 2,
1017
+ 2,
1018
+ 2,
1019
+ 2,
1020
+ 2,
1021
+ 2,
1022
+ 2,
1023
+ 2,
1024
+ 2,
1025
+ 2,
1026
+ 2,
1027
+ 2,
1028
+ 2,
1029
+ 2,
1030
+ 2,
1031
+ 2,
1032
+ 2,
1033
+ 2,
1034
+ 2,
1035
+ 2,
1036
+ 2,
1037
+ 2,
1038
+ 2,
1039
+ 2,
1040
+ 2,
1041
+ 2,
1042
+ 2,
1043
+ 2,
1044
+ 2,
1045
+ 2,
1046
+ 2,
1047
+ 2,
1048
+ 2,
1049
+ 2,
1050
+ 2,
1051
+ 2,
1052
+ 2,
1053
+ 2,
1054
+ 2,
1055
+ 2,
1056
+ 2,
1057
+ 2,
1058
+ 2,
1059
+ 2,
1060
+ 2,
1061
+ 2,
1062
+ 2,
1063
+ 2,
1064
+ 2,
1065
+ 2,
1066
+ 2,
1067
+ 2,
1068
+ 2,
1069
+ 2,
1070
+ 2,
1071
+ 2,
1072
+ 2,
1073
+ 2,
1074
+ 2,
1075
+ 2,
1076
+ 2,
1077
+ 2,
1078
+ 2,
1079
+ 2,
1080
+ 2,
1081
+ 2,
1082
+ 2,
1083
+ 2,
1084
+ 2,
1085
+ 0,
1086
+ 2,
1087
+ 2,
1088
+ 8,
1089
+ 8,
1090
+ 8,
1091
+ 8,
1092
+ 8,
1093
+ 8,
1094
+ 8,
1095
+ 8,
1096
+ 8,
1097
+ 8,
1098
+ 8,
1099
+ 8,
1100
+ 8,
1101
+ 8,
1102
+ 8,
1103
+ 8,
1104
+ 8,
1105
+ 8,
1106
+ 8,
1107
+ 13,
1108
+ 13,
1109
+ -1,
1110
+ 13,
1111
+ 13,
1112
+ 13,
1113
+ 13,
1114
+ 13,
1115
+ -1,
1116
+ 8,
1117
+ 8,
1118
+ 8,
1119
+ 0,
1120
+ 14,
1121
+ 14,
1122
+ 14,
1123
+ 14,
1124
+ 14,
1125
+ 14,
1126
+ 14,
1127
+ 14,
1128
+ 14,
1129
+ 14,
1130
+ 14,
1131
+ 13,
1132
+ -1,
1133
+ -1,
1134
+ 14,
1135
+ 3,
1136
+ 3,
1137
+ 3,
1138
+ 3,
1139
+ 3,
1140
+ 3,
1141
+ 3,
1142
+ 3,
1143
+ 3,
1144
+ 3,
1145
+ 3,
1146
+ 3,
1147
+ 3,
1148
+ 3,
1149
+ 3,
1150
+ 3,
1151
+ 3,
1152
+ 3,
1153
+ 3,
1154
+ 3,
1155
+ 3,
1156
+ 3,
1157
+ 3,
1158
+ 3,
1159
+ 3,
1160
+ 3,
1161
+ 3,
1162
+ 3,
1163
+ 3,
1164
+ 3,
1165
+ 3,
1166
+ 3,
1167
+ 3,
1168
+ 3,
1169
+ 3,
1170
+ 3,
1171
+ 3,
1172
+ 3,
1173
+ 3,
1174
+ 3,
1175
+ 3,
1176
+ 3,
1177
+ 3,
1178
+ 3,
1179
+ 3,
1180
+ 3,
1181
+ 3,
1182
+ 3,
1183
+ 3,
1184
+ 3,
1185
+ 3,
1186
+ 3,
1187
+ 3,
1188
+ 3,
1189
+ 3,
1190
+ 3,
1191
+ 3,
1192
+ 3,
1193
+ 3,
1194
+ 3,
1195
+ 3,
1196
+ 3,
1197
+ 3,
1198
+ 11,
1199
+ 4,
1200
+ 4,
1201
+ 4,
1202
+ 4,
1203
+ 4,
1204
+ 4,
1205
+ 4,
1206
+ 4,
1207
+ 4,
1208
+ 4,
1209
+ 4,
1210
+ 4,
1211
+ 4,
1212
+ 4,
1213
+ 4,
1214
+ 4,
1215
+ 7,
1216
+ 4,
1217
+ 4,
1218
+ 4,
1219
+ 4,
1220
+ 4,
1221
+ 4,
1222
+ 4,
1223
+ -1,
1224
+ 7,
1225
+ 7,
1226
+ 7,
1227
+ 4,
1228
+ -1,
1229
+ -1,
1230
+ -1,
1231
+ -1,
1232
+ 7,
1233
+ 7,
1234
+ 1,
1235
+ 1,
1236
+ 1,
1237
+ 1,
1238
+ 1,
1239
+ 1,
1240
+ 1,
1241
+ 1,
1242
+ 1,
1243
+ 1,
1244
+ 1,
1245
+ 1,
1246
+ 1,
1247
+ 1,
1248
+ 1,
1249
+ 1,
1250
+ 1,
1251
+ 1,
1252
+ 1,
1253
+ 1,
1254
+ 1,
1255
+ 1,
1256
+ 1,
1257
+ 1,
1258
+ 1,
1259
+ 1,
1260
+ 1,
1261
+ 1,
1262
+ 1,
1263
+ 1,
1264
+ 1,
1265
+ 1,
1266
+ 1,
1267
+ 1,
1268
+ 1,
1269
+ 1,
1270
+ 1,
1271
+ 1,
1272
+ 1,
1273
+ 1,
1274
+ 1,
1275
+ 1,
1276
+ 1,
1277
+ 1,
1278
+ 1,
1279
+ 1,
1280
+ 1,
1281
+ 1,
1282
+ 1,
1283
+ 1,
1284
+ 1,
1285
+ 1,
1286
+ 1,
1287
+ 1,
1288
+ 1,
1289
+ 1,
1290
+ 1,
1291
+ 1,
1292
+ 1,
1293
+ 1,
1294
+ 1,
1295
+ 1,
1296
+ 1,
1297
+ 1,
1298
+ 1,
1299
+ 1,
1300
+ 1,
1301
+ 1,
1302
+ 1,
1303
+ 1,
1304
+ 1,
1305
+ 1,
1306
+ 1
1307
+ ],
1308
+ "topic_sizes": {
1309
+ "7": 26,
1310
+ "-1": 26,
1311
+ "8": 24,
1312
+ "0": 94,
1313
+ "9": 24,
1314
+ "5": 54,
1315
+ "11": 16,
1316
+ "4": 56,
1317
+ "12": 13,
1318
+ "6": 47,
1319
+ "10": 17,
1320
+ "13": 13,
1321
+ "2": 72,
1322
+ "14": 12,
1323
+ "3": 63,
1324
+ "1": 73
1325
+ },
1326
+ "topic_mapper": [
1327
+ [
1328
+ -1,
1329
+ -1,
1330
+ -1
1331
+ ],
1332
+ [
1333
+ 0,
1334
+ 0,
1335
+ 1
1336
+ ],
1337
+ [
1338
+ 1,
1339
+ 1,
1340
+ 5
1341
+ ],
1342
+ [
1343
+ 2,
1344
+ 2,
1345
+ 6
1346
+ ],
1347
+ [
1348
+ 3,
1349
+ 3,
1350
+ 9
1351
+ ],
1352
+ [
1353
+ 4,
1354
+ 4,
1355
+ 0
1356
+ ],
1357
+ [
1358
+ 5,
1359
+ 5,
1360
+ 2
1361
+ ],
1362
+ [
1363
+ 6,
1364
+ 6,
1365
+ 12
1366
+ ],
1367
+ [
1368
+ 7,
1369
+ 7,
1370
+ 3
1371
+ ],
1372
+ [
1373
+ 8,
1374
+ 8,
1375
+ 11
1376
+ ],
1377
+ [
1378
+ 9,
1379
+ 9,
1380
+ 4
1381
+ ],
1382
+ [
1383
+ 10,
1384
+ 10,
1385
+ 7
1386
+ ],
1387
+ [
1388
+ 11,
1389
+ 11,
1390
+ 14
1391
+ ],
1392
+ [
1393
+ 12,
1394
+ 12,
1395
+ 8
1396
+ ],
1397
+ [
1398
+ 13,
1399
+ 13,
1400
+ 13
1401
+ ],
1402
+ [
1403
+ 14,
1404
+ 14,
1405
+ 10
1406
+ ]
1407
+ ],
1408
+ "topic_labels": {
1409
+ "-1": "-1_convolutional_images_networks_superpixels",
1410
+ "0": "0_bruno_guy_pdf_screentalk",
1411
+ "1": "1_elsa_arendelle_kristoff_frozen",
1412
+ "2": "2_gillis_script_room_ll",
1413
+ "3": "3_interpretation_explanation_theory_structure",
1414
+ "4": "4_topics_topic_documents_corpus",
1415
+ "5": "5_nemo_dory_chum_gill",
1416
+ "6": "6_films_film_identity_trauma",
1417
+ "7": "7_computational_data_pathology_medical",
1418
+ "8": "8_images_captions_representations_embeddings",
1419
+ "9": "9_zaroff_rainsford_hunt_hunting",
1420
+ "10": "10_cogvideo_interpolation_videos_coglm",
1421
+ "11": "11_assignment_essays_questions_projects",
1422
+ "12": "12_things_ll_some_lol",
1423
+ "13": "13_videos_arxiv_visual_preprint",
1424
+ "14": "14_spectrograms_musecoder_melspectrogram_vocoding"
1425
+ },
1426
+ "custom_labels": null,
1427
+ "_outliers": 1,
1428
+ "topic_aspects": {}
1429
+ }
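Reviewer note on topics.json: the `topics` array looks like one topic ID per training document, which is an assumption based on the key name rather than anything stated in the commit. The sketch below tallies such an array so the result can be compared by hand against `topic_sizes`. The sample data is a made-up miniature with the same key names, not the real file, and `count_assignments` is a hypothetical helper.

```python
import json
from collections import Counter

# Made-up miniature sample using the same keys as topics.json (not the real data).
SAMPLE = """
{
  "topics": [0, 0, 1, -1, 1, 0],
  "topic_labels": {"-1": "-1_a_b_c_d", "0": "0_e_f_g_h", "1": "1_i_j_k_l"}
}
"""


def count_assignments(text: str) -> dict[str, int]:
    """Count how many entries of the "topics" array carry each topic ID."""
    data = json.loads(text)
    counts = Counter(data["topics"])
    # JSON object keys are strings, so report the counts under string keys too.
    return {str(topic): n for topic, n in sorted(counts.items())}


print(count_assignments(SAMPLE))
```

Running the same function on the real file would only need `open("topics.json").read()` in place of `SAMPLE`. Whether the tallies should match `topic_sizes` exactly depends on how `topic_mapper` renumbers topics, which this sketch does not attempt to model.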