Report for Sigma/financial-sentiment-analysis

#65
by inoki-giskard - opened

Hey Team!🤗✨
We’re thrilled to share some amazing evaluation results that’ll make your day!🎉📊

We have identified 8 potential vulnerabilities in your model based on an automated scan.

This automated analysis evaluated the model on the dataset financial_phrasebank (subset sentences_50agree, split train).

👉 Robustness issues (3)

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | major 🔴 | | Fail rate = 0.396 | Transform to title case | 396/1000 tested samples (39.6%) changed prediction after perturbation |

🔍✨ Examples

When feature “text” is perturbed with the transformation “Transform to title case”, the model changes its prediction in 39.6% of the cases. We expected the predictions not to be affected by this transformation.

| | text | Transform to title case(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 996 | These moderate but significant changes resulted in a significant 24-32 % reduction in the estimated CVD risk . | These Moderate But Significant Changes Resulted In A Significant 24-32 % Reduction In The Estimated Cvd Risk . | LABEL_2 (p = 1.00) | LABEL_1 (p = 1.00) |
| 4662 | Cash flow after investments amounted to EUR45m , down from EUR46m . | Cash Flow After Investments Amounted To Eur45M , Down From Eur46M . | LABEL_0 (p = 1.00) | LABEL_1 (p = 1.00) |
| 300 | The stock rose for a second day on Wednesday bringing its two-day rise to GBX12 .0 or 2.0 % . | The Stock Rose For A Second Day On Wednesday Bringing Its Two-Day Rise To Gbx12 .0 Or 2.0 % . | LABEL_2 (p = 1.00) | LABEL_1 (p = 1.00) |

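To reproduce this kind of check locally, the fail rate can be sketched as the share of samples whose predicted label changes once the text is transformed. The sketch below is only an illustration: `fail_rate` and `toy_predict` are hypothetical names, and `toy_predict` is a deliberately case-sensitive stand-in, not the scanned model. Python's built-in `str.title` happens to produce the same casing as the perturbed examples above (for instance `EUR45m` becomes `Eur45M`).

```python
from typing import Callable, Sequence


def fail_rate(
    predict: Callable[[str], str],
    texts: Sequence[str],
    transform: Callable[[str], str],
) -> float:
    """Fraction of texts whose predicted label changes after `transform`."""
    if not texts:
        raise ValueError("need at least one text")
    changed = sum(predict(t) != predict(transform(t)) for t in texts)
    return changed / len(texts)


def toy_predict(text: str) -> str:
    # Hypothetical, case-sensitive keyword classifier used only for the demo.
    if "rose" in text:
        return "LABEL_2"
    if "down" in text:
        return "LABEL_0"
    return "LABEL_1"


samples = [
    "Cash flow after investments amounted to EUR45m , down from EUR46m .",
    "The stock rose for a second day on Wednesday bringing its two-day rise to GBX12 .0 or 2.0 % .",
    "YIT says the acquisition is a part of its strategy for expansion in Central and Eastern European markets .",
]

# The same helper covers both casing perturbations from the report.
title_rate = fail_rate(toy_predict, samples, str.title)
upper_rate = fail_rate(toy_predict, samples, str.upper)
print(f"title case: {title_rate:.3f}, uppercase: {upper_rate:.3f}")
```

To run it against the real model, `toy_predict` would be replaced by a function that returns the model's top label for a given string.
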
| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | major 🔴 | | Fail rate = 0.392 | Transform to uppercase | 392/1000 tested samples (39.2%) changed prediction after perturbation |

🔍✨ Examples

When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 39.2% of the cases. We expected the predictions not to be affected by this transformation.

| | text | Transform to uppercase(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 996 | These moderate but significant changes resulted in a significant 24-32 % reduction in the estimated CVD risk . | THESE MODERATE BUT SIGNIFICANT CHANGES RESULTED IN A SIGNIFICANT 24-32 % REDUCTION IN THE ESTIMATED CVD RISK . | LABEL_2 (p = 1.00) | LABEL_1 (p = 1.00) |
| 4662 | Cash flow after investments amounted to EUR45m , down from EUR46m . | CASH FLOW AFTER INVESTMENTS AMOUNTED TO EUR45M , DOWN FROM EUR46M . | LABEL_0 (p = 1.00) | LABEL_1 (p = 1.00) |
| 300 | The stock rose for a second day on Wednesday bringing its two-day rise to GBX12 .0 or 2.0 % . | THE STOCK ROSE FOR A SECOND DAY ON WEDNESDAY BRINGING ITS TWO-DAY RISE TO GBX12 .0 OR 2.0 % . | LABEL_2 (p = 1.00) | LABEL_1 (p = 1.00) |

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Robustness | major 🔴 | | Fail rate = 0.109 | Add typos | 109/1000 tested samples (10.9%) changed prediction after perturbation |

🔍✨ Examples

When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 10.9% of the cases. We expected the predictions not to be affected by this transformation.

| | text | Add typos(text) | Original prediction | Prediction after perturbation |
|---|---|---|---|---|
| 982 | The financial impact is estimated to be an annual improvement of EUR2 .0 m in the division 's results , as of fiscal year 2008 . | The fjinancial impafct is wstimated to be an annaul improvement of EUR2 .0 m in the ivision 's results , az of fisca year 2008 | LABEL_2 (p = 1.00) | LABEL_1 (p = 0.98) |
| 1289 | NASDAQ-listed Yahoo Inc has introduced a new service that enables Malaysians to take their favorite Internet content and services with them on their mobile phones . | NASDAQ-listed Yahoo Inc has intrlduced a new servjce that enablez Malaysians to take their favorite Internet content and serfices with them on their mobile phones . | LABEL_2 (p = 0.74) | LABEL_1 (p = 1.00) |
| 4561 | The Baltimore Police and Fire Pension , which has about $ 1.5 billion , lost about $ 3.5 million in Madoff Ponzi scheme . | The Baltimore Pokice and Fire Penxsion , which ha about $ 1.5 bioilon , olst about $ 3.5 million in Madoff Ponzi scheme . | LABEL_0 (p = 1.00) | LABEL_1 (p = 1.00) |

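The perturbed examples above show swapped, dropped, inserted and substituted letters. A rough way to generate similar noise for your own robustness tests is sketched below. This is a simplified assumption, not the scan's actual typo generator: the hypothetical `add_typos` helper only swaps, drops or doubles letters, and it leaves digits, spaces and punctuation untouched.

```python
import random


def add_typos(text: str, rate: float = 0.1, seed: int = 0) -> str:
    """Return `text` with random letter-level typos (swap, drop, double).

    Simplified illustration: only alphabetic characters are perturbed.
    """
    rng = random.Random(seed)  # seeded so results are reproducible
    out = []
    i = 0
    while i < len(text):
        c = text[i]
        if c.isalpha() and rng.random() < rate:
            op = rng.choice(("swap", "drop", "double"))
            if op == "swap" and i + 1 < len(text) and text[i + 1].isalpha():
                # Transpose this letter with the next one.
                out.extend([text[i + 1], c])
                i += 2
                continue
            if op == "drop":
                # Skip the letter entirely.
                i += 1
                continue
            if op == "double":
                # Emit the letter twice.
                out.extend([c, c])
                i += 1
                continue
            # A swap that was not possible falls through and keeps the letter.
        out.append(c)
        i += 1
    return "".join(out)


original = "The financial impact is estimated to be an annual improvement of EUR2 .0 m"
print(add_typos(original, rate=0.1, seed=42))
```

Feeding the output to the model and comparing labels with the unperturbed prediction gives a fail rate comparable in spirit to the one reported above, although the exact number will depend on the typo generator used.
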
👉 Performance issues (5)

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Performance | major 🔴 | `avg_digits(text)` < 0.038 | Balanced Accuracy = 0.731 | | -10.48% than global |

🔍✨ Examples

For records in the dataset where `avg_digits(text)` < 0.038, the Balanced Accuracy is 10.48% lower than the global Balanced Accuracy.

| | text | avg_digits(text) | label | Predicted label |
|---|---|---|---|---|
| 12 | A purchase agreement for 7,200 tons of gasoline with delivery at the Hamina terminal , Finland , was signed with Neste Oil OYj at the average Platts index for this September plus eight US dollars per month . | 0.0193237 | LABEL_2 | LABEL_1 (p = 1.00) |
| 21 | ( Filippova ) A trilateral agreement on investment in the construction of a technology park in St Petersburg was to have been signed in the course of the forum , Days of the Russian Economy , that opened in Helsinki today . | 0 | LABEL_2 | LABEL_1 (p = 0.99) |
| 42 | Nyrstar has also agreed to supply to Talvivaara up to 150,000 tonnes of sulphuric acid per annum for use in Talvivaara 's leaching process during the period of supply of the zinc in concentrate . | 0.0307692 | LABEL_2 | LABEL_1 (p = 0.99) |

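For readers who want to recompute the slice metric, balanced accuracy is the mean of the per-class recalls. The sketch below also assumes the reported deviation is relative to the global value, that is `(slice - global) / global`. That reading is an assumption, but it is consistent with the report: the four Balanced Accuracy findings (0.731, 0.737, 0.741 and 0.753) then all imply a global value of roughly 0.816. The helper names are illustrative.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def balanced_accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Mean of per-class recall over the classes present in y_true."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        hits[truth] += int(truth == pred)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def relative_deviation(slice_metric: float, global_metric: float) -> float:
    """Deviation of a slice metric relative to the global metric."""
    return (slice_metric - global_metric) / global_metric


def implied_global(slice_metric: float, deviation: float) -> float:
    """Invert relative_deviation to recover the global metric."""
    return slice_metric / (1 + deviation)


# (slice balanced accuracy, reported deviation) pairs taken from the report.
findings = [(0.731, -0.1048), (0.737, -0.0972), (0.741, -0.0926), (0.753, -0.0775)]
for value, dev in findings:
    print(f"slice={value:.3f} deviation={dev:+.2%} implied global={implied_global(value, dev):.4f}")
```
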
| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Performance | medium 🟡 | `avg_word_length(text)` >= 4.597 AND `avg_word_length(text)` < 4.707 | Balanced Accuracy = 0.737 | | -9.72% than global |

🔍✨ Examples

For records in the dataset where `avg_word_length(text)` >= 4.597 AND `avg_word_length(text)` < 4.707, the Balanced Accuracy is 9.72% lower than the global Balanced Accuracy.

| | text | avg_word_length(text) | label | Predicted label |
|---|---|---|---|---|
| 42 | Nyrstar has also agreed to supply to Talvivaara up to 150,000 tonnes of sulphuric acid per annum for use in Talvivaara 's leaching process during the period of supply of the zinc in concentrate . | 4.6 | LABEL_2 | LABEL_1 (p = 0.99) |
| 79 | TELECOMWORLDWIRE-7 April 2006-TJ Group Plc sells stake in Morning Digital Design Oy Finnish IT company TJ Group Plc said on Friday 7 April that it had signed an agreement on selling its shares of Morning Digital Design Oy to Edita Oyj . | 4.64286 | LABEL_2 | LABEL_1 (p = 0.61) |
| 150 | According to Deputy MD Pekka Silvennoinen the aim is double turnover over the next three years . | 4.70588 | LABEL_2 | LABEL_1 (p = 0.77) |

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Performance | medium 🟡 | `avg_word_length(text)` >= 4.707 AND `avg_word_length(text)` < 5.213 | Balanced Accuracy = 0.741 | | -9.26% than global |

🔍✨ Examples

For records in the dataset where `avg_word_length(text)` >= 4.707 AND `avg_word_length(text)` < 5.213, the Balanced Accuracy is 9.26% lower than the global Balanced Accuracy.

| | text | avg_word_length(text) | label | Predicted label |
|---|---|---|---|---|
| 62 | `` The new agreement is a continuation to theagreement signed earlier this year with the Lemminkainen Group , whereby Cramo acquired the entire construction machine fleet ofLemminkainen Talo Oy Ita - ja Pohjois Suomo , and signed asimilar agreement , '' said Tatu Hauhio , managing director ofCramo Finland . | 5.18 | LABEL_1 | LABEL_2 (p = 0.92) |
| 68 | The contract covers HDO platform , AC800 and CXE880 optical Fttb nodes designed to increase the forward and return path capacity of the transmission networks . | 5.15385 | LABEL_2 | LABEL_1 (p = 0.99) |
| 74 | Finnish real estate investor Sponda Plc said on Wednesday 12 March that it has signed agreements with Danske Bank A-S , Helsinki Branch for a 7-year EUR150m credit facility and with Ilmarinen Mutual Pension Insurance Company for a 7-year EUR50m credit facility . | 5.11628 | LABEL_1 | LABEL_2 (p = 1.00) |

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Performance | medium 🟡 | `avg_whitespace(text)` < 0.162 AND `avg_whitespace(text)` >= 0.148 | Balanced Accuracy = 0.753 | | -7.75% than global |

🔍✨ Examples

For records in the dataset where `avg_whitespace(text)` < 0.162 AND `avg_whitespace(text)` >= 0.148, the Balanced Accuracy is 7.75% lower than the global Balanced Accuracy.

| | text | avg_whitespace(text) | label | Predicted label |
|---|---|---|---|---|
| 47 | The agreement was signed with Biohit Healthcare Ltd , the UK-based subsidiary of Biohit Oyj , a Finnish public company which develops , manufactures and markets liquid handling products and diagnostic test systems . | 0.153488 | LABEL_2 | LABEL_1 (p = 0.77) |
| 62 | `` The new agreement is a continuation to theagreement signed earlier this year with the Lemminkainen Group , whereby Cramo acquired the entire construction machine fleet ofLemminkainen Talo Oy Ita - ja Pohjois Suomo , and signed asimilar agreement , '' said Tatu Hauhio , managing director ofCramo Finland . | 0.159091 | LABEL_1 | LABEL_2 (p = 0.92) |
| 68 | The contract covers HDO platform , AC800 and CXE880 optical Fttb nodes designed to increase the forward and return path capacity of the transmission networks . | 0.157233 | LABEL_2 | LABEL_1 (p = 0.99) |

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Performance | medium 🟡 | `text_length(text)` >= 100.500 AND `text_length(text)` < 108.500 | Precision = 0.805 | | -5.26% than global |

🔍✨ Examples

For records in the dataset where `text_length(text)` >= 100.500 AND `text_length(text)` < 108.500, the Precision is 5.26% lower than the global Precision.

| | text | text_length(text) | label | Predicted label |
|---|---|---|---|---|
| 153 | After the takeover , Cramo will become the second largest rental services provider in the Latvian market . | 106 | LABEL_2 | LABEL_1 (p = 0.89) |
| 298 | The increase in capital stock has been registered in the Finnish Trade Register on 20 November 2006 . | 101 | LABEL_2 | LABEL_1 (p = 1.00) |
| 314 | YIT says the acquisition is a part of its strategy for expansion in Central and Eastern European markets . | 106 | LABEL_2 | LABEL_1 (p = 0.99) |
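
The slice features used in the performance findings can be approximated with a few lines of Python, which may help when filtering the dataset to inspect a slice. The definitions below are assumptions inferred from the reported values, not a copy of the scan's implementation. `text_length` as a character count and `avg_word_length` as the mean length of whitespace-separated tokens reproduce the values shown for rows 153 and 150. `avg_digits` and `avg_whitespace` are assumed to be per-character ratios and have not been checked against the report.

```python
def text_length(text: str) -> int:
    """Number of characters in the text."""
    return len(text)


def avg_word_length(text: str) -> float:
    """Mean length of whitespace-separated tokens (punctuation tokens included)."""
    words = text.split()
    return sum(len(w) for w in words) / len(words) if words else 0.0


def avg_digits(text: str) -> float:
    """Share of characters that are digits (assumed definition)."""
    return sum(c.isdigit() for c in text) / len(text) if text else 0.0


def avg_whitespace(text: str) -> float:
    """Share of characters that are whitespace (assumed definition)."""
    return sum(c.isspace() for c in text) / len(text) if text else 0.0


row_153 = "After the takeover , Cramo will become the second largest rental services provider in the Latvian market ."
row_150 = "According to Deputy MD Pekka Silvennoinen the aim is double turnover over the next three years ."

print(text_length(row_153))                # 106, as in the text_length table
print(round(avg_word_length(row_150), 5))  # 4.70588, as in the avg_word_length table
```

With these helpers, a slice such as `text_length(text) >= 100.5 and text_length(text) < 108.5` can be selected from the dataset and scored separately from the rest.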

Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.

💡 What's Next?

  • Check out the Giskard Space and improve your model.
  • The Giskard community is always buzzing with ideas. 🐢🤔 What do you want to see next? Your feedback is our favorite fuel, so drop your thoughts in the community forum! 🗣️💬 Together, we're building something extraordinary.

🙌 Big Thanks!

We're grateful to have you on this adventure with us. 🚀🌟 Here's to more breakthroughs, laughter, and code magic! 🥂✨ Keep hugging that code and spreading the love! 💻 #Giskard #Huggingface #AISafety 🌈👏 Your enthusiasm, feedback, and contributions are what we seek. 🌟 Keep being awesome!
