Docspell v0.19.0 Release Notes
-
Jan 25, 2021
This release comes with major improvements to the text analysis module. It is now much more configurable, gives better results, and can learn tags from all categories. Additionally, more languages for document processing have been added, and it is now easier to add more. Please open an issue if you want more languages to be included.
- text analysis improvements (#263, #570)
- docspell can now learn from all your tag categories
- the detection for correspondents/concerned entities has been improved by using the classifier for this, too
- all text analysis steps are now configurable, which makes it possible to adapt them better to your data and machine.
- The docs have been updated with some details on this.
- more languages (#488)
- Adds: Spanish, Italian, Portuguese, Czech, Dutch, Danish, Finnish, Norwegian, Swedish, Russian, Romanian
- languages have different support for text-analysis, but there is some basic support for all
- there is extended support for English, German and French through Stanford CoreNLP models (as before)
- scan mailbox change (#576)
- The change from last version (#551) has been moved behind a flag in the "scan mailbox settings". Please review your scan mailbox tasks in your user settings.
- The scan mailbox settings form view has been organized into tabs, as it grew too large for a single form.
- nix tools package fixed (#584)
- If you are using the docspell tools package for nix, it has now been fixed in that all scripts are available. They are now all prefixed by `ds-` (except the `ds` script).
- fix deleting organization (#578)
- Due to the new relationship of a person to an organization, deleting an organization that is linked to a person was not possible. This is now fixed.
- base url fix (#579)
- The `baseurl` setting is optional, but when specified it was required to omit a trailing slash. This is now fixed in that it is always rendered without the trailing slash to the client, no matter what is in the config.
- tag category case sensitive search fix (#568)
- This was a bug introduced by the last release. Tag categories can now be spelled upper- or lower-case. In 0.18.0 you had to spell them lowercase, otherwise the search didn't work.
- adds a workaround for mails that don't specify their used charset (#591)
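
The base url fix above comes down to normalizing the configured value before it is handed to the client. The following is only a minimal Python sketch of that idea, not docspell's actual code; the function name `normalize_base_url` is hypothetical and chosen for illustration.

```python
from typing import Optional


def normalize_base_url(configured: Optional[str]) -> Optional[str]:
    """Return the base URL without trailing slashes, or None if it is unset.

    Hypothetical helper: it only illustrates the behaviour described in the
    release notes (the setting is optional, and a trailing slash in the
    config no longer matters).
    """
    if configured is None or configured.strip() == "":
        # The setting is optional, so "not configured" stays "not configured".
        return None
    # Strip surrounding whitespace first, then any trailing slashes.
    return configured.strip().rstrip("/")
```

With this sketch, a value written with or without a trailing slash yields the same string, so the client-side URL handling does not have to care how the config was written.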
Breaking Changes
- The joex configuration changed around text analysis. If you had some custom settings there, please review them against the new default config.
- When using the nix package manager: the tools package renamed the scripts to be better distinguishable, since they all end up in `$PATH`. They are now prefixed by `ds-`.
- The path of the consumedir script changed in the consumedir docker image
- The settings of the scan-mailbox task have been extended by another flag. It controls when to apply the post-processing (moving or deleting). If you were relying on all mails (even those excluded by a subject filter) being moved away, you need to check your scan-mailbox task settings.
REST Api Changes
- the data structure for `ClassifierSettings` changed to allow specifying a blacklist or whitelist of tag categories, and the `enabled` flag has been removed.
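
To make the blacklist/whitelist idea concrete, here is a small Python sketch of how such a setting could select the tag categories to learn from. The class `CategoryFilter` and its field names are hypothetical and are not the actual `ClassifierSettings` JSON structure; please consult the REST API documentation for the real field names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CategoryFilter:
    """Hypothetical stand-in for a blacklist/whitelist of tag categories."""

    list_type: str  # assumed values: "whitelist" or "blacklist"
    categories: List[str] = field(default_factory=list)

    def select(self, available: List[str]) -> List[str]:
        """Return the categories from `available` that pass the filter."""
        listed = set(self.categories)
        if self.list_type == "whitelist":
            # Only explicitly listed categories are used.
            return [c for c in available if c in listed]
        if self.list_type == "blacklist":
            # Everything except the listed categories is used.
            return [c for c in available if c not in listed]
        raise ValueError(f"unknown list type: {self.list_type!r}")
```

In this sketch an empty blacklist selects every category and an empty whitelist selects none. Whether docspell treats these cases the same way is an assumption.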
Configuration Changes
- joex
- the config regarding text analysis changed: there are new config options, like `nlp.mode`, and `max-due-date-years` has been moved inside `text-analysis`. Please have a look at the new default config if you changed something there.
- The `regex-ner` section has changed: the `enabled` flag has been removed. You can now limit the number of entries to apply using `max-entries`, and `0` means to disable it.
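
The semantics of `max-entries` described above (a limit on the number of entries, with `0` disabling the feature) can be sketched in a few lines of Python. This is only an illustration of the rule as stated in these notes; the function `limit_entries` is hypothetical, and how docspell actually picks which entries to keep is not covered here.

```python
from typing import List, Sequence


def limit_entries(entries: Sequence[str], max_entries: int) -> List[str]:
    """Apply a `max-entries`-style limit to a sequence of entries.

    Hypothetical helper: a limit of 0 (or less) disables the feature and
    yields no entries; otherwise at most `max_entries` are kept. Taking the
    first N is an assumption made for this sketch.
    """
    if max_entries <= 0:
        return []
    return list(entries[:max_entries])
```

Replacing the former `enabled` flag with a numeric limit means one setting covers both switching the feature off and bounding its cost.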