Docspell v0.19.0 Release Notes

  • Jan 25, 2021

    This release comes with major improvements to the text analysis module. It is now much more configurable, produces better results and can learn tags from all categories. Additionally, more languages for document processing have been added and it is now easier to add more. Please open an issue if you want more languages to be included.

    • text analysis improvements (#263, #570)
      • docspell can now learn from all your tag categories
      • the detection for correspondents/concerned entities has been improved by using the classifier for this, too
      • all text analysis steps are now configurable, which makes it possible to adapt them better to your data and machine.
      • The docs have been updated with some details on this.
    • more languages (#488)
      • Adds: Spanish, Italian, Portuguese, Czech, Dutch, Danish, Finnish, Norwegian, Swedish, Russian, Romanian
      • languages have different support for text-analysis, but there is some basic support for all
      • there is extended support for English, German and French through Stanford CoreNLP nlp models (as before)
    • scan mailbox change (#576)
      • The change from last version (#551) has been moved behind a flag in the "scan mailbox settings". Please review your scan mailbox tasks in your user settings.
      • The scan mailbox settings form view has been organized into tabs, as it grew too large for a single form.
    • nix tools package fixed (#584)
      • If you are using the docspell tools package for nix, it has been fixed so that all scripts are available. They are now all prefixed by ds- (except the ds script).
    • fix deleting organization (#578)
      • Due to the new relationship of a person to an organization, deleting an organization that is associated with a person was not possible. This is now fixed.
    • base url fix (#579)
      • The baseurl setting is optional, but when specified, it previously had to be written without a trailing slash. This is now fixed: the base url is always rendered to the client without the trailing slash, no matter what is in the config.
    • ๐Ÿท tag category case sensitive search fix (#568)
      • This was a bug introduced by the last release. When tag categories can now be spelled upper- or lower-case. In 0.18.0 you had to spell them lowercase, otherwise the search doesn't work.
    • โž• adds a workaround for mails that don't specify their used charset (#591)
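
The behaviour described for the base url fix (#579) can be illustrated with a minimal sketch. This is not docspell's actual code; the function name `normalize_base_url` is hypothetical and only shows the idea of accepting a configured value with or without a trailing slash.

```python
from typing import Optional


def normalize_base_url(base_url: Optional[str]) -> Optional[str]:
    """Return the configured base url without trailing slashes.

    Hypothetical helper for illustration only. The setting is optional,
    so None (not configured) is passed through unchanged.
    """
    if base_url is None:
        return None
    # Remove every trailing "/" so both spellings yield the same value.
    return base_url.rstrip("/")


if __name__ == "__main__":
    print(normalize_base_url("https://docs.example.com/"))
    print(normalize_base_url("https://docs.example.com"))
```

With a normalization like this, both spellings in the config lead to the same value being sent to the client.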

    Breaking Changes

    • The joex configuration changed around text analysis. If you had some custom settings there, please review them against the new default config.
    • When using the nix package manager: the tools package renamed the scripts to make them easier to distinguish, since they all end up in $PATH. They are now prefixed by ds-.
    • The path of the consumedir script changed in the consumedir docker image.
    • The settings of the scan-mailbox task have been extended by another flag. It controls when to apply the post-processing (moving or deleting). If you were relying on all mails (even those excluded by a subject filter) being moved away, you need to check your scan-mailbox task settings.

    REST Api Changes

    • the data structure for ClassifierSettings changed to allow specifying a blacklist or whitelist of tag categories; the enabled flag has been removed.
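
To illustrate what a blacklist or whitelist of tag categories means in practice, here is a small sketch. The class `CategoryList`, its fields `list_type` and `categories`, and the function `categories_to_learn` are hypothetical names chosen for this example; they are not taken from the docspell REST API, whose exact field names should be looked up in the API documentation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CategoryList:
    # Hypothetical structure: "whitelist" or "blacklist" plus the listed names.
    list_type: str
    categories: List[str] = field(default_factory=list)


def categories_to_learn(all_categories: List[str], setting: CategoryList) -> List[str]:
    """Select the tag categories a classifier would be trained on."""
    listed = set(setting.categories)
    if setting.list_type == "whitelist":
        # Only explicitly listed categories are used.
        return [c for c in all_categories if c in listed]
    if setting.list_type == "blacklist":
        # Everything except the listed categories is used.
        return [c for c in all_categories if c not in listed]
    raise ValueError(f"unknown list type: {setting.list_type}")


if __name__ == "__main__":
    cats = ["doctype", "project", "year"]
    print(categories_to_learn(cats, CategoryList("blacklist", ["year"])))
```

In this sketch an empty blacklist selects every category and an empty whitelist selects none.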

    Configuration Changes

    • joex
      • the config regarding text analysis changed: there are new config options, like nlp.mode, and max-due-date-years has been moved inside text-analysis. Please have a look at the new default config if you changed something there.
      • The regex-ner section has changed: the enabled flag has been removed. You can now limit the number of entries to apply using max-entries; a value of 0 disables it.
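
The max-entries semantics of the regex-ner section can be sketched as follows. The function `limit_entries` is a hypothetical illustration, not docspell code, and it assumes that the first entries are the ones kept when the limit applies; the notes above do not say how entries are selected.

```python
from typing import List


def limit_entries(entries: List[str], max_entries: int) -> List[str]:
    """Apply a max-entries style limit to a list of entries.

    Assumption for this sketch: the first `max_entries` items are kept.
    A value of 0 disables the feature, so nothing is returned.
    """
    if max_entries < 0:
        raise ValueError("max_entries must not be negative")
    if max_entries == 0:
        # 0 takes over the role of the removed enabled flag: feature is off.
        return []
    return entries[:max_entries]


if __name__ == "__main__":
    names = ["Acme Corp", "John Doe", "Example Ltd"]
    print(limit_entries(names, 2))
    print(limit_entries(names, 0))
```

Folding the on/off switch into the limit means a single setting covers both cases.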