teklia-team committed
Commit ae14f1d
1 parent: 58700dc

Add model description and evaluation to README

Files changed (1)
  1. README.md +39 -0
README.md CHANGED
@@ -12,3 +12,42 @@ metrics:
  - AP@[.5,.95]
  ---
+
+
+ # Generic page detection
+
+ The generic page detection model detects individual pages in document images.
+
+ ## Model description
+
+ The model has been trained using the Doc-UFCN library on the [Horae](https://github.com/oriflamms/HORAE/) and [READ-BAD](https://github.com/ctensmeyer/pagenet) datasets.
+ Training images were resized so that their largest dimension is 768 pixels, keeping the original aspect ratio.
+
+ ## Evaluation results
+
+ The model achieves the following results:
+
+ | dataset | set      | IoU   | F1    | AP@[.5] | AP@[.75] | AP@[.5,.95] |
+ | ------- | -------- | ----- | ----- | ------- | -------- | ----------- |
+ | HOME    | test     | 93.92 | 95.84 | 98.98   | 98.98    | 97.61       |
+ | Horae   | test     | 96.68 | 98.31 | 99.76   | 98.49    | 98.08       |
+ | Horae   | test-300 | 95.66 | 97.27 | 98.87   | 98.45    | 97.38       |
+
+ ## How to use
+
+ To use this model, please refer to the [Doc-UFCN library page](https://pypi.org/project/doc-ufcn/).
+
+ # Cite us!
+
+ ```bibtex
+ @inproceedings{boillet2020,
+   author = {Boillet, Mélodie and Kermorvant, Christopher and Paquet, Thierry},
+   title = {{Multiple Document Datasets Pre-training Improves Text Line Detection With
+     Deep Neural Networks}},
+   booktitle = {2020 25th International Conference on Pattern Recognition (ICPR)},
+   year = {2021},
+   month = Jan,
+   pages = {2134-2141},
+   doi = {10.1109/ICPR48806.2021.9412447}
+ }
+ ```
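
The model description added in this commit says the training images have their largest dimension equal to 768 pixels, with the original aspect ratio kept. As a minimal sketch of that arithmetic, and not the Doc-UFCN library's actual preprocessing code, the hypothetical helper `resize_dimensions` below computes the target size for a given image. The rounding rule, and the choice to scale smaller images up as well as larger ones down, are assumptions made for illustration.

```python
def resize_dimensions(width: int, height: int, target: int = 768) -> tuple[int, int]:
    """Return (new_width, new_height) so that the largest side equals `target`.

    Hypothetical helper: the README only states the 768-pixel constraint,
    so the rounding behaviour here is an assumption.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    # A single scale factor for both sides keeps the aspect ratio.
    scale = target / max(width, height)

    # Round to the nearest pixel; never let the short side collapse to 0.
    new_width = max(1, round(width * scale))
    new_height = max(1, round(height * scale))
    return new_width, new_height


if __name__ == "__main__":
    # A 3000x2000 landscape scan is reduced so its width becomes 768.
    print(resize_dimensions(3000, 2000))
```

The resulting pair can be passed to any image library's resize call before inference, so that inputs roughly match the scale the model saw during training.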