Commit 8b4abc3 by nevmenandr
1 parent: 2d3e2cc

Update README.md

  1. README.md +1 -1
README.md CHANGED
@@ -20,7 +20,7 @@ The texts for the training corpus are taken from two datasets published in the [
 
  Казакова, Елена, 2023, "[Забытые романы русских писателей из фондов Пушкинского Дома (1857–1917)](https://dataverse.pushdom.ru/dataset.xhtml?persistentId=doi:10.31860/openlit-2023.12-C007)", https://doi.org/10.31860/openlit-2023.12-C007, Репозиторий открытых данных по русской литературе и фольклору, V2, UNF:6:DCGrSrMDXXtoRfHBDWfS4A== [fileUNF]
 
- Only texts published after 1835 (the era of realism) remain in the corpus.
+ Only texts published after 1835 (the era of realism) remain in the corpus. Texts presented in old orthography have been converted to modern orthography with the help of a [package](https://pypi.org/project/prereform2modern/).
 
  The texts are marked up using the Russian version of the [booknlp](https://github.com/booknlp/booknlp) library, which highlighted the characters of the fictional works.
 
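
The added README sentence describes converting pre-reform Russian orthography to modern spelling. As a rough illustration of what such a conversion involves, here is a minimal sketch that deliberately does not use the prereform2modern package (its API is not shown in this commit). The function name `modernize` and the rule set are assumptions made for the example; they cover only the retired letters and the word-final hard sign, so the sketch is a simplified stand-in and not the method used to build the corpus.

```python
import re

# Illustrative subset of the 1918 spelling reform: letters that were retired
# and their modern replacements. Real converters handle many more cases
# (prefixes, adjective endings, exceptions); this table is an assumption
# made for the sketch.
LETTER_MAP = str.maketrans({
    "ѣ": "е", "Ѣ": "Е",
    "і": "и", "І": "И",
    "ѳ": "ф", "Ѳ": "Ф",
    "ѵ": "и", "Ѵ": "И",
})

# The hard sign at the end of a word was dropped by the reform.
# \b only matches when the hard sign is the last letter of a word,
# so a word-internal hard sign (as in "объявление") is kept.
FINAL_HARD_SIGN = re.compile(r"[ъЪ]\b")


def modernize(text: str) -> str:
    """Apply the simplified rules above to a pre-reform Russian string."""
    text = FINAL_HARD_SIGN.sub("", text)
    return text.translate(LETTER_MAP)


if __name__ == "__main__":
    print(modernize("Онъ читалъ въ лѣсу"))
```

For real texts a dedicated tool is the safer choice, because simple letter substitution misses the grammatical and lexical cases that a full converter has to handle.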