vladargunov committed on
Commit
44d4894
1 Parent(s): 7c1b950

Add new SentenceTransformer model.

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
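The pooling config above sets only `pooling_mode_mean_tokens` to true, so the sentence embedding is the average of the token embeddings. A minimal pure-Python sketch of that computation follows. It assumes padding positions are excluded through an attention mask; `mean_pool` and the toy vectors are illustrative names and values, not the library's implementation.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping padding positions (mask == 0).

    Hypothetical helper for illustration only; the real Pooling module
    operates on batched tensors.
    """
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid division by zero.
    count = max(count, 1)
    return [s / count for s in summed]


# Toy input: three token vectors of dimension 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

In the model itself the vectors would have 384 components, matching `word_embedding_dimension`.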
README.md ADDED
@@ -0,0 +1,625 @@
1
+ ---
2
+ language:
3
+ - en
4
+ library_name: sentence-transformers
5
+ tags:
6
+ - sentence-transformers
7
+ - sentence-similarity
8
+ - feature-extraction
9
+ - generated_from_trainer
10
+ - dataset_size:16158
11
+ - loss:CosineSimilarityLoss
12
+ base_model: sentence-transformers/all-MiniLM-L6-v2
13
+ datasets:
14
+ - bigbio/pubhealth
15
+ widget:
16
+ - source_sentence: 'The fruit (soursop, guyabano), leaves, and bark of the graviola
17
+ tree (A. muricata), have long been utilized as a folk remedy in parts of Africa
18
+ and South America for myriad conditions. Claims of their potential to “cure” cancer,
19
+ similarly, have long been a fixture in certain regions of the Internet — fringe
20
+ health websites and supplement hucksters, primarily. In their most exaggerated
21
+ form, such claims take the form of a widespread conspiracy alleging a pharmaceutical
22
+ coverup to squash evidence of viable, powerful, and universal cure for cancer
23
+ in favor of financial gain. The dubious Health Sciences Institute, (promoter of
24
+ a previously debunked claim that Hillary Clinton has worked to hide a cancer cure
25
+ dubbed “sour honey”) described the plant’s potential this way: Since the 1970s,
26
+ the bark, leaves, roots, fruit, and fruit seeds of the Amazonian Graviola tree
27
+ have been studied in numerous laboratory tests and have shown remarkable results
28
+ with this deadly disease. Several years ago, a major pharmaceutical company began
29
+ extensive independent research on it. They learned that certain extracts of the
30
+ tree actually seek out, attack, and destroy cancer cells. […] After more than
31
+ seven years of work behind closed doors, researchers at this company realized
32
+ they couldn’t duplicate the tree’s natural properties with a patentable substance.
33
+ So they shut down the entire project. It basically came down to this—if they couldn’t
34
+ make huge profits, they would keep the news of this possible cure a well-guarded
35
+ secret. But one researcher couldn’t bear that, and decided to risk his job with
36
+ the hope of saving lives. Indeed, there has been research on many parts of, and
37
+ chemicals within, the graviola tree with regard to their ability to kill cancerous
38
+ cells. In terms of a possible mechanism, most ideas revolve around unique chemicals
39
+ contained within the fruit — annonaceous acetogenins — that may present a novel
40
+ pathway to kill cancer cells. These chemicals are found only in the family of
41
+ plants Graviola belongs to (Annonaceae) and some research indicates they may have
42
+ the ability to inhibit chemicals that aid cellular respiration, which can cause
43
+ a “programmed death” of cancer cells. Perhaps most notably, this mechanism has
44
+ been explored using extracts from graviola material against human lung, colorectal,
45
+ and liver cancer cell lines. Such studies have found that extracts were indeed
46
+ able to cause programmed cell death as hypothesized. Other studies have shown
47
+ limited potential in reducing the proliferation of cancer cells in some animals
48
+ and cell lines as well. It is worth mentioning, however, that many chemicals that
49
+ show anti-cancer properties in laboratory settings do not translate to viable
50
+ cures or treatments for cancer. Investigations on laboratory animals, too, have
51
+ shown limited but somewhat positive results with regard to the plant’s anticancer
52
+ potential. Studies on rats and mice, respectively, have shown some anti-tumor
53
+ potential with prostate cancer and breast cancer, and studies on rats have, as
54
+ well, shown potential preventive effects for colon cancer. Outside of singular
55
+ case reports from people alleging benefits from the plant, no large scale clinical
56
+ human studies have been published on its efficacy as a legitimate treatment for
57
+ cancer (at least one clinical trial has been registered, however). As such, the view
58
+ of the UK based Cancer Research, and other Cancer groups, is as follows: There
59
+ have not been any studies [of Graviola] in humans. So we don’t know whether it
60
+ can work as a cancer treatment or not. Many sites on the internet advertise and
61
+ promote graviola capsules as a cancer cure but none of them are supported by any
62
+ reputable scientific cancer organisations. Both the United States Food and Drug
63
+ administration as well as the United States Federal Trade Commission have issued
64
+ warnings to groups selling graviola extract with claims of its cancer-curing potential.
65
+ In 2008, in a press release describing a “sweep” of graviola supplement sellers,
66
+ the FTC described their products as “bogus“. Outside of overblown claims, there
67
+ are also legitimate concerns about the safety of these products. Numerous studies
68
+ have suggested that the potentially active chemicals within the graviola tree
69
+ may be neurotoxic. Epidemiological studies of cultures that regularly use the
70
+ plant in traditional medicine have shown associations between the plant’s consumption
71
+ and Parkinson’s disease: Epidemiological studies, however, linked the consumption
72
+ of Annonaceae to a high prevalence of atypical parkinsonism, in Guadeloupe, in
73
+ parts of the Afro-Caribbean and Indian population in London and New Caledonia.
74
+ In several patients who desisted in their consumption of Annonaceae fruits, the
75
+ progression of atypical parkinsonism ceased […]. Chemical investigations of active
76
+ components within the plant reveal strong evidence of its neurotoxicity, as well:
77
+ The fruit pulp extract of A. muricata revealed the strongest neurotoxic effect,
78
+ with 67% cell death at a concentration of 1 µg/mL. A high reduction in cell viability
79
+ coupled with pronounced cell death was found at 0.1 µg/mL for an Annonaceous seed
80
+ extract. These results demonstrate that the intake of dietary supplements containing
81
+ plant material from Annonaceae may be hazardous to health in terms of neurotoxicity.'
82
+ sentences:
83
+ - U.S. President Donald Trump issued a pardon for the leader of the armed group
84
+ that held migrants at gunpoint in New Mexico.
85
+ - Thanks to the immigrants who illegally cross the U.S. Mexican border, and the
86
+ Democrats who refuse to stop them, the Measles virus has been declared a public
87
+ health emergency in 2019.
88
+ - '"""The animated film """"Incredibles 2"""" contains scenes that prompted an epilepsy
89
+ warning at movie theaters."""'
90
+ - source_sentence: '"""In a regular feature called """"How the Left Destroys the Nation,""""
91
+ a website founded by the leader of a far-right group posted this headline about
92
+ one state’s coronavirus response: """"Michigan Governor Bans Gardening, Sale Of
93
+ Fruit and Vegetable Seeds, Gardening Supplies Prohibited."""" The attack on Gov.
94
+ Gretchen Whitmer, a Democrat who has been touted as a potential running mate for
95
+ presumptive Democratic presidential nominee Joe Biden, was flagged as part of
96
+ Facebook’s efforts to combat news and misinformation on its News Feed. (Read
97
+ more about our partnership with Facebook.) That’s because it’s wrong. Whitmer
98
+ has issued orders directing people to stay home and limiting some commercial activity,
99
+ but this claim goes too far. The headline appears on the Geller Report, a website
100
+ by Pamela Geller. She is an activist who co-founded Stop Islamization of America,
101
+ also known as the American Freedom Defense Initiative. Below the headline is an
102
+ article that originally appeared in The Daily Caller, a conservative-leaning publication,
103
+ that reports on an executive order issued by Whitmer in response to the COVID-19
104
+ outbreak. The article does not say that the order bans gardening, but that it
105
+ does restrict the sale of gardening supplies. In reality, executive order 2020-42,
106
+ which went into effect April 9, 2020, requires larger stores to block off certain
107
+ areas of their sales floors as a way of limiting the number of people in those
108
+ stores. The order does not ban gardening or the sale of any product, including,
109
+ as we mentioned in a previous fact-check, American flags. The numbers of coronavirus
110
+ cases in Michigan have surged in recent weeks. As of April 14, the Wolverine State
111
+ ranked fourth — behind New York, New Jersey and Massachusetts, according to the
112
+ New York Times. Nearly half of Michigan’s cases are in Wayne County, which includes
113
+ Detroit, according to Johns Hopkins University. Both the state and the county
114
+ have a COVID-19 fatality rate of 6%. It’s in that climate that Whitmer issued
115
+ this order, subtitled the """"Temporary requirement to suspend activities that
116
+ are not necessary to sustain or protect life,"""" which extended and added to
117
+ a stay-at-home order issued March 23. Tiffany Brown, a spokeswoman for the governor,
118
+ told PolitiFact that Whitmer’s order does not ban Michiganders from buying any
119
+ item. The order says that stores larger than 50,000 square feet must close areas
120
+ — """"by cordoning them off, placing signs in aisles, posting prominent signs,
121
+ removing goods from shelves, or other appropriate means — that are dedicated to
122
+ the following classes of goods: Carpet or flooring, furniture, garden centers
123
+ and plant nurseries, and paint."""" Referring to that restriction at a news conference
124
+ announcing the order, Whitmer said: """"If you’re not buying food or medicine
125
+ or other essential items, you should not be going to the store."""" As to gardening,
126
+ a frequently asked questions document released by the governor’s office states:
127
+ """"The order does not prohibit homeowners from tending to their own yards as
128
+ they see fit."""" Grocery stores, of course, remain open. And neither the order
129
+ nor the FAQs mention any restriction on the sale of fruit or seeds. A headline
130
+ shared on social media inaccurately describes an order that Whitmer issued in
131
+ response to the coronavirus. The order does not prohibit gardening or the sale
132
+ of any particular product in Michigan. Stores in Michigan larger than 50,000 square
133
+ feet must close areas for garden centers and plant nurseries, as well as those
134
+ that sell carpet or flooring, furniture and paint."""'
135
+ sentences:
136
+ - Bushfires rage out of control across southeast Australia.
137
+ - Iran records 4,585 coronavirus deaths as restrictions eased.
138
+ - '"""The Republican budget plan """"says that 10 years from now, if you’re a 65-year-old
139
+ who’s eligible for Medicare, you should have to pay nearly $6,400 more than you
140
+ would today."""'
141
+ - source_sentence: 'An old hoax about Charles Manson being paroled that was started
142
+ by a known fake news website in June 2014 resurfaced in June 2017. The rumor stems
143
+ from a 2014 report that appeared at Empire News under the headline, “Charles Manson
144
+ Granted Parole,” that reports Manson had been granted parole due to prison overcrowding:
145
+ The ruling, issued by three judges overseeing the state’s efforts to ease the overcrowding,
146
+ gives California until February 2016 to achieve their goals. But, the judges said,
147
+ the state has to make elderly inmates and those with serious illnesses eligible
148
+ for parole immediately. Manson, who was denied parole in April of 2012 and wasn’t
149
+ scheduled for another parole hearing until 2027, was re-evaluated due to his age
150
+ and health and the Parole Board recommended his parole. The site’s disclaimer,
151
+ however, states that it’s content is “intended for entertainment purposes only,”
152
+ meaning that its reporting should not be taken as fact. It’s not clear why Charles
153
+ Manson parole rumors resurfaced in June 2017. Manson was denied parole by the
154
+ California Department of Corrections in 2012 and his next parole hearing was scheduled
155
+ for 2027, when Manson would be 92 years old. In January 2017, however, Manson
156
+ was transferred to a hospital for treatment of gastrointestinal bleeding, and
157
+ Manson’s condition was described as “serious” by family members. He had been transferred
158
+ back to prison by the time the rumor resurfaced. It’s possible that parole decisions
159
+ regarding the release of other former Manson Family members could have contributed
160
+ to Charles Manson parole rumors resurfacing. A panel recommended the release of  a
161
+ former Manson Family member named Bruce Davis who murdered musician Gary Hinman
162
+ and stuntman Donald “Shorty” Shea in 1969. The final decision, however, will rest
163
+ with California Gov. Jerry Brown, who had about five months to make a decision.
164
+ the Los Angeles Times reports. Meanwhile, an appeals panel postponed a decision
165
+ on wether or not to recommend the release of former Manson Family member Patricia
166
+ Krenwinkel in December 2016, Fox News reports. Krenwinkel was present at the 1969
167
+ murder of Sharon Tate and four others. But regardless of developments with other
168
+ members of the Manson Family, all Charles Manson parole rumors should be considered
169
+ “fiction” until at least 2027, when his next hearing is scheduled. Comments'
170
+ sentences:
171
+ - '"""Common usage of the phrase """"Always a bridesmaid but never a bride"""" originated
172
+ with an advertising campaign for Listerine mouthwash."""'
173
+ - Colorado governor signs recreational marijuana regulations into law.
174
+ - State to consider 6 conditions to treat with medical pot.
175
+ - source_sentence: 'A “Chicken Soup”-like tale warning us against the folly of judging
176
+ people solely by appearances hit the Internet in mid-1998. As usual, the framework
177
+ of the tale bore some general resemblance to the truth, but details were greatly
178
+ altered so as to turn it into something quite different from the real story: The
179
+ President of Harvard made a mistake by prejudging people and it cost him dearly.
180
+ A lady in a faded gingham dress and her husband, dressed in a homespun threadbare
181
+ suit, stepped off the train in Boston, and walked timidly without an appointment
182
+ into the president’s outer office. The secretary could tell in a moment that such
183
+ backwoods, country hicks had no business at Harvard and probably didn’t even deserve
184
+ to be in Cambridge. She frowned. “We want to see the president,” the man said
185
+ softly. “He’ll be busy all day,” the secretary snapped. “We’ll wait,” the lady
186
+ replied. For hours, the secretary ignored them, hoping that the couple would finally
187
+ become discouraged and go away. They didn’t. And the secretary grew frustrated
188
+ and finally decided to disturb the president, even though it was a chore she always
189
+ regretted to do. “Maybe if they just see you for a few minutes, they’ll leave,”
190
+ she told him. And he signed in exasperation and nodded. Someone of his importance
191
+ obviously didn’t have the time to spend with them, but he detested gingham dresses
192
+ and homespun suits cluttering up his outer office. The president, stern-faced
193
+ with dignity, strutted toward the couple. The lady told him, “We had a son that
194
+ attended Harvard for one year. He loved Harvard. He was happy here. But about
195
+ a year ago, he was accidentally killed. And my husband and I would like to erect
196
+ a memorial to him, somewhere on campus.” The president wasn’t touched; he was
197
+ shocked. “Madam,” he said gruffly, “We can’t put up a statue for every person
198
+ who attended Harvard and died. If we did, this place would look like a cemetery.”
199
+ “Oh, no,” the lady explained quickly, “We don’t want to erect a statue. We thought
200
+ we would like to give a building to Harvard.” The president rolled his eyes. He
201
+ glanced at the gingham dress and homespun suit, then exclaimed, “A building! Do
202
+ you have any earthly idea how much a building costs? We have over seven and a
203
+ half million dollars in the physical plant at Harvard.” For a moment the lady
204
+ was silent. The president was pleased. He could get rid of them now. And the lady
205
+ turned to her husband and said quietly, “Is that all it costs to start a University?
206
+ Why don’t we just start our own?” Her husband nodded. The president’s face wilted
207
+ in confusion and bewilderment. And Mr. and Mrs. Leland Stanford walked away, traveling
208
+ to Palo Alto, California, where they established the University that bears their
209
+ name, a memorial to a son that Harvard no longer cared about. The very premise
210
+ of the tale was completely implausible. Leland Stanford (1824-93) was one of the
211
+ most prominent men of his time in America: He was a wealthy railroad magnate who
212
+ built the Central Pacific Railroad (and drove the gold spike to symbolize the
213
+ completion of the first transcontinental rail line at Promontory Summit, Utah,
214
+ in 1869), as well as a Republican Party leader who served as California’s eighth
215
+ governor (1862-63) and later represented that state in the U.S. Senate (1885-93).
216
+ He was an imposing figure, hardly the type of man to dress in a “homespun threadbare
217
+ suit,” walk “timidly” into someone’s office without an appointment, and sit cooling
218
+ his heels “for hours” until someone deigned to see him. Harvard’s president would
219
+ had to have been an ignorant buffoon not to recognize Stanford’s name and promptly
220
+ greet him upon hearing of his arrival: Moreover, the Stanfords’ only son (Leland
221
+ Stanford, Jr.) died of typhoid fever at age 15, in Florence, Italy. His death
222
+ would hardly have been described as “accidental,” nor had he spent a year studying
223
+ at Harvard while barely into his teens: The family was in Italy in 1884 when
224
+ Leland contracted typhoid fever. He was thought to be recovering, but on March
225
+ 13 at the Hotel Bristol in Florence, Leland’s bright and promising young life
226
+ came to an end, a few weeks before his 16th birthday. Stanford, who had remained
227
+ at Lelands’ bedside continuously, fell into a troubled sleep the morning the boy
228
+ died. When he awakened he turned to his wife and said, “The children of California
229
+ shall be our children.” These words were the real beginning of Stanford University.
230
+ The closest this story came to reality was in its acknowledgement that in 1884,
231
+ a few month’s after their son’s death, the Stanfords did pay a visit to Harvard
232
+ and met with that institution’s president, Charles Eliot. However, the couple
233
+ did not go there with the purpose of donating a building to Harvard as a memorial
234
+ to their dead son — they intended to establish some form of educational facility
235
+ of their own in northern California, and so they visited several prominent Eastern
236
+ schools to gather ideas and suggestions about what they might build, as Stanford’s
237
+ website described the meeting: The Stanfords … visited Cornell, Yale, Harvard
238
+ and Massachusetts Institute of Technology. They talked with President Eliot of
239
+ Harvard about three ideas: a university at Palo Alto, a large institution in San
240
+ Francisco combining a lecture hall and a museum, and a technical school. They
241
+ asked him which of these seemed most desirable and President Eliot answered, a
242
+ university. Mrs. Stanford then asked him how much the endowment should be, in
243
+ addition to land and buildings, and he replied, not less than $5 million. A silence
244
+ followed and Mrs. Stanford looked grave. Finally, Mr. Stanford said with a smile,
245
+ “Well, Jane, we could manage that, couldn’t we?” and Mrs. Stanford nodded her
246
+ assent. They settled on creating a great university, one that, from the outset,
247
+ was untraditional: coeducational, in a time when most were all-male; nondenominational,
248
+ when most were associated with a religious organization; avowedly practical, producing
249
+ “cultured and useful citizens” when most were concerned only with the former.
250
+ Although they consulted with several of the presidents of leading institutions,
251
+ the founders were not content to model their university after eastern schools.
252
+ The Stanfords did found their university, modeled after Cornell and located on
253
+ the grounds of their horse-trotting farm, in memory of their son (hence the school’s
254
+ official name of “Leland Stanford Junior University”) — not because they were
255
+ rudely rebuffed by Harvard’s president, but rather because it was what they had
256
+ planned all along. The “rudely-spurned university endowment” theme of the Stanford
257
+ story has reportedly played out at least once in real life. In July 1998, William
258
+ Lindsay of Las Vegas said he contacted an unnamed Scottish institution of higher
259
+ learning by telephone and told them he intended to give some money to a university
260
+ in Scotland. Taking him for a crank, the person he spoke to rudely dismissed him.
261
+ His next call to Glasgow University met with a warmer reception, and in March
262
+ 2000 that school received a check for £1.2 million, enough to endow a professorship
263
+ in Lindsay’s name.'
264
+ sentences:
265
+ - Early study results suggest 2 Ebola treatments saving lives.
266
+ - '"""Honduras """"bans citizens from owning guns"""" and has the """"highest homicide
267
+ rate in the entire world."""" Switzerland, with a similar population, """"requires
268
+ citizens to own guns"""" and has the """"lowest homicide rate in the entire world."""'
269
+ - Pat Robertson asserted the Orlando nightclub shooting was God's punishment for
270
+ legalizing same-sex marriage.
271
+ - source_sentence: '"""A chain message circulating on messaging apps claims the United
272
+ States is about to enter a period of federally mandated quarantine. The source:
273
+ """"my aunt’s friend"""" who works for the government. There is no evidence of
274
+ this. The message, which a reader sent us a screenshot of on March 16, appears
275
+ in a group chat on iMessage. The sender claims to have information from """"my
276
+ aunt''s friend"""" who works for the Centers for Disease Control and Prevention
277
+ and """"just got out of a meeting with Trump."""" """"He’s announcing tomorrow
278
+ that the U.S. is going into quarantine for the next 14 days,"""" the message reads.
279
+ """"Meaning everyone needs to stay in their homes/where they are."""" We’ve seen
280
+ screenshots of similar messages circulating on WhatsApp, a private messaging app
281
+ that’s popular abroad. Misinformation tends to get passed around via chain messages
282
+ during major news events, so we looked into this one. (Screenshots) There is no
283
+ evidence that the federal government is set to announce a nationwide lockdown
284
+ like the ones seen in France, Italy and Spain. President Donald Trump and the
285
+ National Security Council have both refuted the claim. So far, officials have
286
+ advised Americans to practice """"social distancing,"""" or avoiding crowded public
287
+ spaces. In a press conference March 16, Trump outlined several recommendations
288
+ to prevent the spread of the coronavirus. Among them is avoiding gatherings of
289
+ 10 or more people. """"My administration is recommending that all Americans, including
290
+ the young and healthy, work to engage in schooling from home when possible, avoid
291
+ gathering in groups of more than 10 people, avoid discretionary travel and avoid
292
+ eating and drinking in bars, restaurants and public food courts,"""" he said.
293
+ In response to a question, he said the administration is not considering a national
294
+ curfew or quarantine. He reiterated that point in another press conference March
295
+ 17. """"It’s a very big step. It’s something we talk about, but we haven’t decided
296
+ to do that,"""" he said. Andrew Cuomo ordered a one-mile containment zone on March
297
+ 10. Large gathering spots were closed for 14 days and National Guard troops are
298
+ delivering food to people. In the San Francisco Bay Area, local officials on March
299
+ 16 announced sweeping measures to try to contain the coronavirus. Residents of
300
+ six counties have been ordered to """"shelter in place"""" in their homes and
301
+ stay away from others as much as possible for the next three weeks. The move falls
302
+ short of a total lockdown. At the federal level, the CDC does have the power to
303
+ quarantine people who may have come in contact with someone infected by the coronavirus,
304
+ but most quarantines are done voluntarily. And decisions are usually left up to
305
+ states and localities. We reached out to the CDC for comment on the chain message,
306
+ but we haven’t heard back. The chain message is inaccurate. If you receive a chain
307
+ message that you want us to fact-check, send a screenshot to [email protected]."""'
308
+ sentences:
309
+ - Texas guard Andrew Jones diagnosed with leukemia.
310
+ - Treadmill classes mix it up with workhorse of the gym.
311
+ - Drug overdoses are now the second-most common cause of death in New Hampshire.
312
+ pipeline_tag: sentence-similarity
313
+ ---
314
+
315
+ # SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
316
+
317
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) on the [bigbio/pubhealth](https://huggingface.co/datasets/bigbio/pubhealth) dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
318
+
319
+ ## Model Details
320
+
321
+ ### Model Description
322
+ - **Model Type:** Sentence Transformer
323
+ - **Base model:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) <!-- at revision 8b3219a92973c328a8e22fadcfa821b5dc75636a -->
324
+ - **Maximum Sequence Length:** 256 tokens
325
+ - **Output Dimensionality:** 384 dimensions
326
+ - **Similarity Function:** Cosine Similarity
327
+ - **Training Dataset:**
328
+ - [bigbio/pubhealth](https://huggingface.co/datasets/bigbio/pubhealth)
329
+ - **Language:** en
330
+ <!-- - **License:** Unknown -->
331
+
332
+ ### Model Sources
333
+
334
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
335
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
336
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
337
+
338
+ ### Full Model Architecture
339
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
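The final `Normalize()` module in the architecture above scales each embedding to unit L2 norm, which means cosine similarity between two outputs reduces to a plain dot product. The sketch below illustrates that equivalence in pure Python; the helper names and the two-dimensional toy vectors are assumptions made for the example, not part of the model.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity computed directly from unnormalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Toy vectors standing in for two pooled embeddings.
a = [3.0, 4.0]
b = [4.0, 3.0]
unit_a, unit_b = l2_normalize(a), l2_normalize(b)

# Both values are approximately 0.96.
print(dot(unit_a, unit_b), cosine(a, b))
```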
347
+
348
+ ## Usage
349
+
350
+ ### Direct Usage (Sentence Transformers)
351
+
352
+ First install the Sentence Transformers library:
353
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
357
+
358
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("vladargunov/pubhealth-sentence-similarity")
+ # Run inference
+ sentences = [
+     '"""A chain message circulating on messaging apps claims the United States is about to enter a period of federally mandated quarantine. The source: """"my aunt’s friend"""" who works for the government. There is no evidence of this. The message, which a reader sent us a screenshot of on March 16, appears in a group chat on iMessage. The sender claims to have information from """"my aunt\'s friend"""" who works for the Centers for Disease Control and Prevention and """"just got out of a meeting with Trump."""" """"He’s announcing tomorrow that the U.S. is going into quarantine for the next 14 days,"""" the message reads. """"Meaning everyone needs to stay in their homes/where they are."""" We’ve seen screenshots of similar messages circulating on WhatsApp, a private messaging app that’s popular abroad. Misinformation tends to get passed around via chain messages during major news events, so we looked into this one. (Screenshots) There is no evidence that the federal government is set to announce a nationwide lockdown like the ones seen in France, Italy and Spain. President Donald Trump and the National Security Council have both refuted the claim. So far, officials have advised Americans to practice """"social distancing,"""" or avoiding crowded public spaces. In a press conference March 16, Trump outlined several recommendations to prevent the spread of the coronavirus. Among them is avoiding gatherings of 10 or more people. """"My administration is recommending that all Americans, including the young and healthy, work to engage in schooling from home when possible, avoid gathering in groups of more than 10 people, avoid discretionary travel and avoid eating and drinking in bars, restaurants and public food courts,"""" he said. In response to a question, he said the administration is not considering a national curfew or quarantine. He reiterated that point in another press conference March 17. """"It’s a very big step. It’s something we talk about, but we haven’t decided to do that,"""" he said. Andrew Cuomo ordered a one-mile containment zone on March 10. Large gathering spots were closed for 14 days and National Guard troops are delivering food to people. In the San Francisco Bay Area, local officials on March 16 announced sweeping measures to try to contain the coronavirus. Residents of six counties have been ordered to """"shelter in place"""" in their homes and stay away from others as much as possible for the next three weeks. The move falls short of a total lockdown. At the federal level, the CDC does have the power to quarantine people who may have come in contact with someone infected by the coronavirus, but most quarantines are done voluntarily. And decisions are usually left up to states and localities. We reached out to the CDC for comment on the chain message, but we haven’t heard back. The chain message is inaccurate. If you receive a chain message that you want us to fact-check, send a screenshot to [email\xa0protected]."""',
+     'Drug overdoses are now the second-most common cause of death in New Hampshire.',
+     'Treadmill classes mix it up with workhorse of the gym.',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # (3, 384)
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # torch.Size([3, 3])
+ ```
379
+
380
+ <!--
381
+ ### Direct Usage (Transformers)
382
+
383
+ <details><summary>Click to see the direct usage in Transformers</summary>
384
+
385
+ </details>
386
+ -->
387
+
388
+ <!--
389
+ ### Downstream Usage (Sentence Transformers)
390
+
391
+ You can finetune this model on your own dataset.
392
+
393
+ <details><summary>Click to expand</summary>
394
+
395
+ </details>
396
+ -->
397
+
398
+ <!--
399
+ ### Out-of-Scope Use
400
+
401
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
402
+ -->
403
+
404
+ <!--
405
+ ## Bias, Risks and Limitations
406
+
407
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
408
+ -->
409
+
410
+ <!--
411
+ ### Recommendations
412
+
413
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
414
+ -->
415
+
416
+ ## Training Details
417
+
418
+ ### Training Dataset
419
+
420
+ #### bigbio/pubhealth
421
+
422
+ * Dataset: [bigbio/pubhealth](https://huggingface.co/datasets/bigbio/pubhealth)
423
+ * Size: 16,158 training samples
424
+ * Columns: <code>sentence2</code>, <code>sentence1</code>, and <code>score</code>
425
+ * Approximate statistics based on the first 1000 samples:
426
+ | | sentence2 | sentence1 | score |
427
+ |:--------|:-------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:-----------------------------|
428
+ | type | string | string | int |
429
+ | details | <ul><li>min: 91 tokens</li><li>mean: 246.21 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 21.43 tokens</li><li>max: 96 tokens</li></ul> | <ul><li>0: 100.00%</li></ul> |
430
+ * Samples:
431
+ | sentence2 | sentence1 | score |
432
+ |:----------|:----------|:------|
433
+ | <code>"""Hillary Clinton is in the political crosshairs as the author of a new book alleges improper financial ties between her public and personal life. At issue in conservative author Peter Schweizer’s forthcoming book Clinton Cash are donations from foreign governments to the Clinton Foundation during the four years she served as secretary of state. George Stephanopoulos used an interview with Schweizer on ABC This Week to point out what other nonpartisan journalists have found: There is no """"smoking gun"""" showing that donations to the foundation influenced her foreign policy decisions. Still, former Republican House Speaker Newt Gingrich says the donations are """"clearly illegal"""" under federal law. In his view, a donation by a foreign government to the Clinton Foundation while Clinton was secretary of state is the same as money sent directly to her, he said, even though she did not join the foundation’s board until she left her post. """"The Constitution of the United States says you cannot take money from foreign governments without explicit permission of the Congress. They wrote that in there because they knew the danger of corrupting our system by foreign money is enormous,"""" Gingrich said. """"You had a sitting secretary of state whose husband radically increased his speech fees, you have a whole series of dots on the wall now where people gave millions of dollars — oh, by the way, they happen to get taken care of by the State Department."""" He continued, """"My point is they took money from foreign governments while she was secretary of State. That is clearly illegal."""" PunditFact wanted to know if a criminal case against Clinton is that open and shut. Is what happened """"clearly illegal""""? A spokesman for the Clinton Foundation certainly disagreed, calling Gingrich’s accusation """"a baseless leap"""" because Clinton was not part of her husband’s foundation while serving as a senator or secretary of state. 
We did not hear from Gingrich by our deadline. Foundation basics Former President Clinton started the William J. Clinton Foundation in 2001, the year after Hillary Clinton won her first term as a New York senator. The foundation works with non-governmental organizations, the private sector and governments around the world on health, anti-poverty, HIV/AIDS and climate change initiatives. Spokesman Craig Minassian said it’s reasonable for the foundation to accept money from foreign governments because of the global scope of its programs, and the donations are usually in the form of tailored grants for specific missions. Hillary Clinton was not part of her husband’s foundation while she was a senator or secretary of state. Her appointment to the latter post required Senate confirmation and came with an agreement between the White House and Clinton Foundation that the foundation would be more transparent about its donors. According to the 2008 memorandum of understanding, the foundation would release information behind new donations and could continue to collect donations from countries with which it had existing relationships or running grant programs. If countries with existing contributions significantly stepped up their contributions, or if a new foreign government wanted to donate, the State Department would have to approve. Clinton took an active role in fundraising when she left the State Department and the foundation became the Bill, Hillary & Chelsea Clinton Foundation in 2013. But she left the board when she announced her run for the presidency in April 2015. The Emoluments Clause So how does Gingrich come up with the claim that Clinton Foundation donations are """"clearly illegal"""" and unconstitutional? The answer is something known as the Emoluments Clause. A few conservative websites have made similar arguments in recent days, including the Federalist blog. 
The Emoluments Clause, found in Article 1, Section 9 of the Constitution, reads in part: """"No Title of Nobility shall be granted by the United States: And no Person holding any Office of Profit or Trust under them, shall, without the Consent of the Congress, accept of any present, Emolument, Office, or Title, of any kind whatever, from any King, Prince, or foreign State."""" The framers came up with this clause to prevent the government and leaders from granting or receiving titles of nobility and to keep leaders free of external influence. (An emolument, per Merriam-Webster Dictionary, is """"the returns arising from office or employment usually in the form of compensation or perquisites."""") Lest you think the law is no longer relevant, the Pentagon ethics office in 2013 warned employees the """"little known provision"""" applies to all federal employees and military retirees. There’s no mention of spouses in the memo. J. Peter Pham, director of the Atlantic Council’s Africa Center, said interpretation of the clause has evolved since its adoption at the Constitutional Convention, when the primary concern was about overseas diplomats not seeking gifts from foreign powers they were dealing with. The Defense Department memo, in his view, goes beyond what the framers envisioned for the part of the memo dealing with gifts. """"I think that, aside from the unambiguous parts, the burden would be on those invoking the clause to show actual causality that would be in violation of the clause,"""" Pham said. Expert discussion We asked seven different constitutional law experts on whether the Clinton Foundation foreign donations were """"clearly illegal"""" and a violation of the Emoluments Clause. We did not reach a consensus with their responses, though a majority thought the layers of separation between the foundation and Hillary Clinton work against Gingrich. 
The American system often distinguishes between public officers and private foundations, """"even if real life tends to blur some of those distinctions,"""" said American University law professor Steve Vladeck. Vladeck added that the Emoluments Clause has never been enforced. """"I very much doubt that the first case in its history would be because a foreign government made charitable donations to a private foundation controlled by a government employee’s relative,"""" he said. """"Gingrich may think that giving money to the Clinton Foundation and giving money to then-Secretary Clinton are the same thing. Unfortunately for him, for purposes of federal regulations, statutes, and the Constitution, they’re formally — and, thus, legally — distinct."""" Robert Delahunty, a University of St. Thomas constitutional law professor who worked in the Justice Department’s Office of Legal Counsel from 1989 to 2003, also called Gingrich’s link between Clinton and the foreign governments’ gifts to the Clinton Foundation as """"implausible, and in any case I don’t think we have the facts to support it."""" """"The truth is that we establish corporate bodies like the Clinton Foundation because the law endows these entities with a separate and distinct legal personhood,"""" Delahunty said. John Harrison, University of Virginia law professor and former deputy assistant attorney general in the Office of Legal Counsel from 1990 to 1993, pointed to the Foreign Gifts Act, 5 U.S.C. 7432, which sets rules for how the Emoluments Clause should work in practice. 
The statute spells out the minimal value for acceptable gifts, and says it applies to spouses of the individuals covered, but """"it doesn’t say anything about receipt of foreign gifts by other entities such as the Clinton Foundation."""" """"I don’t know whether there’s any other provision of federal law that would treat a foreign gift to the foundation as having made to either of the Clintons personally,"""" Harrison said, who added that agencies have their own supplemental rules for this section, and he did not know if the State Department addressed this. Other experts on the libertarian side of the scale thought Gingrich was more right in his assertion. Clinton violates the clause because of its intentionally broad phrasing about gifts of """"any kind whatever,"""" which would cover indirect gifts via the foundation, said Dave Kopel, a constitutional law professor at Denver University and research director at the libertarian Independence Institute. Kopel also brought up bribery statutes, which would require that a gift had some influence in Clinton’s decision while secretary of state. Delahunty thought Kopel’s reasoning would have """"strange consequences,"""" such as whether a state-owned airline flying Bill Clinton to a conference of former heads of state counted as a gift to Hillary Clinton. Our ruling Gingrich said the Clinton Foundation """"took money from from foreign governments while (Hillary Clinton) was secretary of state. It is clearly illegal. … The Constitution says you can’t take this stuff."""" A clause in the Constitution does prohibit U.S. officials such as former Secretary of State Hillary Clinton from receiving gifts, or emoluments, from foreign governments. But the gifts in this case were donations from foreign governments that went to the Clinton Foundation, not Hillary Clinton. She was not part of the foundation her husband founded while she was secretary of state. Does that violate the Constitution? 
Some libertarian-minded constitutional law experts say it very well could. Others are skeptical. What’s clear is there is room for ambiguity, and the donations are anything but """"clearly illegal."""" The reality is this a hazy part of U.S. constitutional law."</code> | <code>Britain plans for opt-out organ donation scheme to save lives.</code> | <code>0</code> |
434
+ | <code>The story does discuss costs, but the framing is problematic. The story, based on a conversation with one source, the study’s lead investigator, says, “It’s difficult at this point to predict costs. However, he expects costs will not approach those for Provenge, the pricey treatment vaccine for prostate cancer approved by the FDA in 2010. Provenge costs $93,000 for the one-month, three-dose treatment. Medicare covers it.” This tells readers that, no matter what the drug costs, Medicare likely will cover it. We appreciate the effort to bring cost information into the story, but this type of information is misleading. The story does explain that only one patient remains cancer free following the study. It then details how for most of the patients cancer continued to progress after 2 months. It says that the median overall survival in both the breast cancer and ovarian cancer patients was less than 16 months. But the story is framed in such a way to highlight the one potentially positive outcome of the study and to downplay the negative. We read more sooner about the one patient who may have responded well to the vaccine than we do about the 25 other patients who did not. The story mentions side effects in a satisfactory way. Technically, the story provides readers with much of the information they would need to assess the validity of the study, but it comes out in bits and pieces. For example, we only find out near the end of the story that “The woman, who remains disease-free, had a previous treatment with a different treatment vaccine. ‘That might have primed her immune system,’ Gulley speculates. She also had only one regimen of chemotherapy, perhaps keeping her immune system stronger.” This casts much doubt on the study’s design, and it would have been nice to have seen some outside expertise brought in to either discuss those design problems or to torpedo the story altogether. 
Again, the story deserves high marks for being very specific in the lead and throughout the story. It says, that the vaccine is “for breast and ovarian cancer that has spread to other parts of the body” in the lead and later details the particular circumstances of the study cohort. It says, “The patients had already undergone a variety of treatments but the cancer was progressing. Twenty one of the 26 had undergone three or more chemotherapy regimens.” This is the root of the story’s main shortcoming. Almost all of the information in the story comes from one source: Dr. James Gulley, who oversaw the study. Gulley is quite enthusiastic about this vaccine, despite the evidence, and the story needed more perspectives to put this vaccine into a broader context. At the very end, there are a few comments from Dr. Vincent K. Tuohy, who also is working on a breast cancer vaccine. Because of his competing research, he seems to have a conflict, but even putting that aside, his comments were not used to their best effect. There was no comparison in the story to existing alternatives. The median survival, for example, is presented without the context of how long these patients might have lived had they been undergoing standard chemotherapy and radiation treatments. We give high marks to the story for saying right in the lead that the findings are from “a preliminary study in 26 patients.” That tells readers both that the findings need to be interpreted with caution and that the treatment is not available to most people. The concept of vaccines for breast/ovarian cancer is indeed novel, and the story acknowledges that other vaccines are being studied. The story does not rely on a news release.</code> | <code>Virus raises specter of gravest attacks in modern US times.</code> | <code>0</code> |
435
+ | <code>"""Although the story didn’t cite the cost of appendectomy – emergency or urgent surgery – and we wish it had, we nonetheless will give it a satisfactory score because it at least cited what the editorial writer wrote, """"A secondary benefit is the savings to the hospital generated by minimizing staff and anesthesiologist presence late in the evening and during the wee hours of the morning."""" As with our harms score above, although the story didn’t give absolute numbers, in this case we think it was sufficient for it to report that """"The scientists found no significant difference among the groups in the patients’ condition 30 days after surgery or in the length of their operation or hospital stay."""" Although the story didn’t give absolute numbers, in this case we think it was sufficient for it to report that """"The scientists found no significant difference among the groups in the patients’ condition 30 days after surgery or in the length of their operation or hospital stay."""" Despite running less than 300 words, this story did an adequate job in explaining the quality of the evidence, including pointing out limitations. No disease-mongering here. The story meets the bare minimum requirement for this criterion in that it at least cited what an editorial stated. The focus of the story was on a study comparing emergency appendectomy with surgery done up to 12 hours later or beyond. This is the whole focus of the story – and one we applaud – when it begins:  """"Appendectomy is the most common emergency surgery in the world, but it doesn’t have to be."""" There were no claims made about the novelty of this research, and we may have wished for a bit more context on this. Nonetheless, the potential for guiding future care decisions was made clear. Not applicable. 
Given that the story only pulled excerpts from the journal article and the accompanying editorial, and didn’t include any fresh quotes from interviews, we can’t be sure of the extent to which it may have been influenced by a news release."""</code> | <code>Legionnaires’ case identified at Quincy veterans’ home.</code> | <code>0</code> |
436
+ * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
437
+ ```json
438
+ {
439
+ "loss_fct": "torch.nn.modules.loss.MSELoss"
440
+ }
441
+ ```
442
+
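As a rough illustration of what this loss configuration means: `CosineSimilarityLoss` compares the cosine similarity of the two sentence embeddings in each pair with the gold `score`, and with `MSELoss` as `loss_fct` the error is the mean squared difference. The sketch below uses plain Python lists in place of model embeddings; the helper names and toy vectors are assumptions made for this example, not part of the library.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_mse(pairs, scores):
    """Mean squared error between pairwise cosine similarities and gold scores."""
    errors = [
        (cosine_similarity(u, v) - score) ** 2
        for (u, v), score in zip(pairs, scores)
    ]
    return sum(errors) / len(errors)


# Toy stand-ins for (sentence1, sentence2) embeddings; real ones have 384 dimensions.
toy_pairs = [
    ([1.0, 0.0], [1.0, 0.0]),  # identical direction, cosine = 1
    ([1.0, 0.0], [0.0, 1.0]),  # orthogonal, cosine = 0
]
toy_scores = [1.0, 0.0]

loss = cosine_similarity_mse(toy_pairs, toy_scores)
print(loss)
```

A pair whose gold score is 0 therefore pushes its two embeddings toward orthogonality, which is worth keeping in mind given that the first 1000 samples summarized above all carry score 0.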
443
+ ### Training Hyperparameters
444
+ #### Non-Default Hyperparameters
445
+
446
+ - `per_device_train_batch_size`: 128
447
+ - `learning_rate`: 2e-05
448
+ - `num_train_epochs`: 10
449
+ - `warmup_ratio`: 0.1
450
+ - `batch_sampler`: no_duplicates
451
+
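To make the interaction between `learning_rate`, `warmup_ratio` and the linear scheduler more concrete, here is a minimal sketch of a linear warmup followed by linear decay. It is not the trainer's own code: the function name is hypothetical, and the step counts are assumptions (1,270 total steps, i.e. 127 steps per epoch over 10 epochs, with about 10% of them used for warmup).

```python
def linear_warmup_lr(step, base_lr, warmup_steps, total_steps):
    """Learning rate at an optimizer step: linear ramp-up, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


BASE_LR = 2e-05      # `learning_rate` from the list above
TOTAL_STEPS = 1270   # assumption: 127 steps per epoch x 10 epochs
WARMUP_STEPS = 127   # assumption: 10% of TOTAL_STEPS (`warmup_ratio`: 0.1)

for step in (0, 64, 127, 700, 1270):
    print(step, linear_warmup_lr(step, BASE_LR, WARMUP_STEPS, TOTAL_STEPS))
```

Under these assumptions the rate peaks at the end of the first epoch and then falls steadily toward zero by the final step.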
452
+ #### All Hyperparameters
453
+ <details><summary>Click to expand</summary>
454
+
455
+ - `overwrite_output_dir`: False
456
+ - `do_predict`: False
457
+ - `eval_strategy`: no
458
+ - `prediction_loss_only`: True
459
+ - `per_device_train_batch_size`: 128
460
+ - `per_device_eval_batch_size`: 8
461
+ - `per_gpu_train_batch_size`: None
462
+ - `per_gpu_eval_batch_size`: None
463
+ - `gradient_accumulation_steps`: 1
464
+ - `eval_accumulation_steps`: None
465
+ - `learning_rate`: 2e-05
466
+ - `weight_decay`: 0.0
467
+ - `adam_beta1`: 0.9
468
+ - `adam_beta2`: 0.999
469
+ - `adam_epsilon`: 1e-08
470
+ - `max_grad_norm`: 1.0
471
+ - `num_train_epochs`: 10
472
+ - `max_steps`: -1
473
+ - `lr_scheduler_type`: linear
474
+ - `lr_scheduler_kwargs`: {}
475
+ - `warmup_ratio`: 0.1
476
+ - `warmup_steps`: 0
477
+ - `log_level`: passive
478
+ - `log_level_replica`: warning
479
+ - `log_on_each_node`: True
480
+ - `logging_nan_inf_filter`: True
481
+ - `save_safetensors`: True
482
+ - `save_on_each_node`: False
483
+ - `save_only_model`: False
484
+ - `restore_callback_states_from_checkpoint`: False
485
+ - `no_cuda`: False
486
+ - `use_cpu`: False
487
+ - `use_mps_device`: False
488
+ - `seed`: 42
489
+ - `data_seed`: None
490
+ - `jit_mode_eval`: False
491
+ - `use_ipex`: False
492
+ - `bf16`: False
493
+ - `fp16`: False
494
+ - `fp16_opt_level`: O1
495
+ - `half_precision_backend`: auto
496
+ - `bf16_full_eval`: False
497
+ - `fp16_full_eval`: False
498
+ - `tf32`: None
499
+ - `local_rank`: 0
500
+ - `ddp_backend`: None
501
+ - `tpu_num_cores`: None
502
+ - `tpu_metrics_debug`: False
503
+ - `debug`: []
504
+ - `dataloader_drop_last`: False
505
+ - `dataloader_num_workers`: 0
506
+ - `dataloader_prefetch_factor`: None
507
+ - `past_index`: -1
508
+ - `disable_tqdm`: False
509
+ - `remove_unused_columns`: True
510
+ - `label_names`: None
511
+ - `load_best_model_at_end`: False
512
+ - `ignore_data_skip`: False
513
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: False
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: proportional
+
+ </details>
+
+ ### Training Logs
+ | Epoch  | Step | Training Loss |
+ |:------:|:----:|:-------------:|
+ | 0.7874 | 100  | 0.0603        |
+ | 1.5748 | 200  | 0.131         |
+ | 2.3622 | 300  | 0.1188        |
+ | 3.1496 | 400  | 0.1173        |
+ | 3.9370 | 500  | 0.0551        |
+ | 4.7244 | 600  | 0.0622        |
+ | 5.5118 | 700  | 0.0454        |
+ | 6.2992 | 800  | 0.0521        |
+ | 7.0866 | 900  | 0.0478        |
+ | 7.8740 | 1000 | 0.0403        |
+ | 8.6614 | 1100 | 0.035         |
+ | 9.4488 | 1200 | 0.0386        |
+
+
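The fractional epoch values in the Training Logs table imply a fixed number of optimizer steps per epoch. The sketch below works through that arithmetic, assuming each logged epoch is simply `step / steps_per_epoch`. The dataset size of 16,158 comes from the `dataset_size:16158` tag in the card metadata; the batch-size inference is an assumption, not something the log states.

```python
import math

# (epoch, step) pairs copied from the Training Logs table.
logged = [(0.7874, 100), (3.9370, 500), (9.4488, 1200)]

# Each row should give the same steps-per-epoch value after rounding.
steps_per_epoch = {round(step / epoch) for epoch, step in logged}

# From the `dataset_size:16158` tag in the model card metadata.
dataset_size = 16158

# Batch sizes for which ceil(dataset_size / batch_size) matches the log.
# Assumption: one optimizer step per batch and no dropped samples.
candidates = [
    b for b in range(1, 1025)
    if {math.ceil(dataset_size / b)} == steps_per_epoch
]

print(steps_per_epoch, candidates)
```

Treat the result as a plausibility check only: the `no_duplicates` batch sampler or gradient accumulation could make the real relationship between dataset size and step count differ from this simple model.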
+ ### Framework Versions
+ - Python: 3.10.13
+ - Sentence Transformers: 3.0.1
+ - Transformers: 4.41.2
+ - PyTorch: 2.1.2
+ - Accelerate: 0.30.1
+ - Datasets: 2.19.2
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,26 @@
+ {
+   "_name_or_path": "sentence-transformers/all-MiniLM-L6-v2",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 384,
+   "initializer_range": 0.02,
+   "intermediate_size": 1536,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 6,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.41.2",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.0.1",
+     "transformers": "4.41.2",
+     "pytorch": "2.1.2"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:158a5195687680cab093367c230b775f743d85b565b59e626a207ed75151db0d
+ size 90864192
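As a sanity check, the LFS pointer size can be compared with a parameter count derived from `config.json`. The sketch below assumes a standard BERT layout (word, position and token-type embeddings, six encoder layers, and a pooler head) stored as float32. The per-component formulas are the usual ones for that architecture; they are not read from the checkpoint itself.

```python
# Values copied from config.json in this commit.
vocab_size, hidden, intermediate = 30522, 384, 1536
max_positions, type_vocab, num_layers = 512, 2, 6

def linear(n_in, n_out):
    """Weights plus bias of a dense layer."""
    return n_in * n_out + n_out

layer_norm = 2 * hidden  # scale and shift vectors

embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
encoder_layer = (
    4 * linear(hidden, hidden)      # query, key, value, attention output
    + linear(hidden, intermediate)  # feed-forward up-projection
    + linear(intermediate, hidden)  # feed-forward down-projection
    + 2 * layer_norm                # one after attention, one after feed-forward
)
pooler = linear(hidden, hidden)     # assumed present, as in a stock BertModel

total_params = embeddings + num_layers * encoder_layer + pooler
tensor_bytes = total_params * 4     # float32, per "torch_dtype"

file_size = 90864192                # "size" from the LFS pointer
print(total_params, tensor_bytes, file_size - tensor_bytes)
```

Under these assumptions the count comes to roughly 22.7 million parameters, and the few kilobytes left over would be consistent with a safetensors metadata header rather than missing or extra weights.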
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
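`modules.json` chains three stages: Transformer, Pooling, and Normalize, and `1_Pooling/config.json` enables mean pooling only. The following is a minimal pure-Python sketch of what the last two stages compute, assuming masked mean pooling followed by L2 normalization. The helper names and the toy two-dimensional vectors are illustrative; the library itself operates on batched tensors of width 384.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1."""
    dim = len(token_embeddings[0])
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m]
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Toy stand-in for the Transformer output: three token vectors of width 2.
# The last position is padding and must not influence the result.
tokens = [[6.0, 0.0], [0.0, 8.0], [9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled, sentence_embedding)
```

Because the output has unit length, the dot product of two such embeddings equals their cosine similarity, which is presumably why the Normalize stage is part of the pipeline.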
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 256,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,64 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "max_length": 128,
+   "model_max_length": 256,
+   "never_split": null,
+   "pad_to_multiple_of": null,
+   "pad_token": "[PAD]",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "[SEP]",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
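The tokenizer settings above (`model_max_length` of 256, right-side truncation and padding, and the special-token ids 0, 101 and 102) together determine the shape of a single encoded input. The sketch below illustrates that, assuming a single-sentence input that is padded to the full maximum length. `build_input` is a hypothetical helper, not part of the tokenizer API, and in practice batches are often padded only to the longest sequence.

```python
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0  # ids from "added_tokens_decoder"
MAX_LEN = 256                          # "model_max_length"

def build_input(wordpiece_ids, max_len=MAX_LEN):
    """Truncate on the right, add [CLS]/[SEP], then pad on the right."""
    body = wordpiece_ids[: max_len - 2]  # leave room for the two special tokens
    ids = [CLS_ID] + body + [SEP_ID]
    mask = [1] * len(ids)
    padding = max_len - len(ids)
    return ids + [PAD_ID] * padding, mask + [0] * padding

# Hypothetical WordPiece ids, for illustration only.
long_ids, long_mask = build_input(list(range(1000, 1300)))
short_ids, short_mask = build_input([2000, 2001], max_len=6)
print(len(long_ids), sum(long_mask), short_ids, short_mask)
```

A 300-token input keeps only its first 254 WordPiece ids once the two special tokens are accounted for, so text beyond that point does not contribute to the embedding.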
vocab.txt ADDED
The diff for this file is too large to render. See raw diff