Changelog History

  • v0.22.0 Changes

    January 23, 2020

    🐳 Docker image/tag: semitechnologies/weaviate:0.22.0
    👀 See also: example docker compose files in English and Dutch.

    Contains Breaking Change!

    🚀 Note: While this release contains no API-level breaking changes, the internals have changed so much that we recommend not simply replacing your existing Weaviate container with the new one. Instead, you should create a new cluster and reimport your things and actions. See the changelog below for more detailed reasons why.

    💥 Breaking Changes

    👌 Improve cross-reference storing strategy (#1069)
    🚀 Prior to this release, Weaviate would build an automated cache of referenced objects. This led to very fast response times for nested queries, at the cost of large disk usage. We have since learned that disk usage can be so excessive in heavily connected graphs that the benefits don't outweigh the costs. In addition, configuring cache boundaries led to unnecessary complexity.

    The major goal of 0.22.0 was to replace automated denormalization caching with a smarter strategy without losing the snappiness of cached results and the overall low latencies of queries our users have come to appreciate.

    🚀 We believe we have found a good strategy with this release: smarter query strategies keep inter-container traffic to a minimum and use our backing storage in a way that performs well.

    This boils down to the following advantages that 0.22.0 provides over 0.21.x:

    • Feature parity: No feature was lost in the rewrite. If it worked with 0.21.x, it works with 0.22.x. If you think otherwise, please open an issue.
    • Much smaller disk footprint: Since we no longer excessively denormalize references, the disk footprint is much smaller. Essentially, the size on disk is now (object size + vector size + index overheads) * desiredReplication. The number of cross-references no longer has a direct impact on disk space (other than storing the link itself, which is effectively the size of the bytes in a weaviate://... beacon).
    • No depth limit on nested filters: Prior to this release, a filter on a cross-ref prop, such as path: ["inCity", "City", "inCountry", "Country", "name"], had a limit. It would only work within a cache boundary. This limitation is now gone and you can filter as deep as you like. Please note that an excessively deep query will have a performance impact.
    • Smaller CPU impact during imports: Prior to this release, we spent a share of the available resources on building a denormalized cache asynchronously after importing a connected object. Without having to build such a cache, more performance on imports is available for storing, vectorizing and indexing objects.
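
As a rough illustration of the sizing formula in the list above, here is a minimal Python sketch. The function name, the parameter names and all byte figures are made up for the example, and multiplying by an object count is an assumption; only the formula itself comes from the release notes.

```python
def estimated_disk_bytes(object_bytes, vector_bytes, index_overhead_bytes,
                         desired_replication, object_count=1):
    """Estimate disk usage as
    (object size + vector size + index overheads) * desiredReplication,
    scaled by the number of objects (the scaling is an assumption)."""
    per_object = object_bytes + vector_bytes + index_overhead_bytes
    return per_object * desired_replication * object_count


# Hypothetical figures: 2 KiB objects, 600-dimensional float32 vectors,
# 512 bytes of index overhead per object, replication factor 3.
total = estimated_disk_bytes(
    object_bytes=2048,
    vector_bytes=600 * 4,
    index_overhead_bytes=512,
    desired_replication=3,
    object_count=1_000_000,
)
print(f"{total / 1024**3:.2f} GiB")
```

Note that the number of cross-references does not appear as an input, which matches the claim that it no longer has a direct impact on disk space.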

    🐎 Please note that caching was previously done at import time. We recommend not to try to upgrade a 0.21.x cluster, but instead creating a new cluster and reimporting. This is the only way to guarantee your cluster won't have cache leftovers which can impact performance.

    🆕 New Features

    none

    🛠 Fixes

    • #967 became obsolete through this change
  • v0.21.12 Changes

    January 17, 2020

    🐳 Docker image/tag: semitechnologies/weaviate:0.21.12
    👀 See also: example docker compose files in English and Dutch.

    💥 Breaking Changes

    none

    🆕 New Features

    none

    🛠 Fixes

    • 👌 Improved Contextionary Weighting Algorithm
      🚀 This release updates the default contextionary version to ...v0.4.6, which includes an improved weighting algorithm. Prior to this release, the occurrence-based weighting was done with a linear algorithm. This often led to unimportant words getting too much weight. The latest version uses a logarithmic approach, which improved the accuracy of classifications done with Weaviate.

    ⚡️ The example docker-compose files linked above have already been updated. If you're not using them, make sure to update the contextionary version accordingly in your setup.

    ⚡️ This change is non-breaking. Keep in mind that object vectorization happens at import time, so if you want all your objects to benefit from the updated algorithm, you should reimport them.

    If you aren't happy with the results and would like to use the classic linear approach, you can force the contextionary to do so, by setting the environment variable OCCURRENCE_WEIGHT_STRATEGY=linear for the contextionary (!) service. It defaults to log.
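
To give a feel for why a logarithmic scheme can tame the weight of frequent words, here is a small Python sketch. It is not the contextionary's actual formula, which these notes do not spell out: both weighting functions, the premise that more frequent words should weigh less, and the sample counts are assumptions made for illustration. Only the variable name OCCURRENCE_WEIGHT_STRATEGY, its values linear and log, and the log default come from the text above.

```python
import math


def strategy_from_env(env):
    """Read the weighting strategy; the release notes say it defaults to log."""
    return env.get("OCCURRENCE_WEIGHT_STRATEGY", "log")


def occurrence_weight(occurrences, max_occurrences, strategy="log"):
    """Illustrative weight in [0, 1]: the more frequent a word, the lower it is.

    Expects 1 <= occurrences <= max_occurrences and max_occurrences > 1.
    This is a hypothetical formula, not the contextionary's real one.
    """
    if strategy == "linear":
        return 1.0 - occurrences / max_occurrences
    if strategy == "log":
        return 1.0 - math.log(occurrences) / math.log(max_occurrences)
    raise ValueError(f"unknown strategy: {strategy!r}")


# A fairly common word: 10,000 occurrences, where the most frequent
# word in the (made-up) corpus occurs 1,000,000 times.
for strategy in ("linear", "log"):
    print(strategy, round(occurrence_weight(10_000, 1_000_000, strategy), 2))
```

In this toy model the linear scheme still gives the fairly common word a weight of about 0.99, while the logarithmic scheme brings it down to about 0.33. Passing `os.environ` to `strategy_from_env` would pick up the variable from the process environment.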

  • v0.21.11 Changes

    January 16, 2020

    🐳 Docker image/tag: semitechnologies/weaviate:0.21.11
    👀 See also: example docker compose files in English and Dutch.

    💥 Breaking Changes

    none

    🆕 New Features

    🔀 Entity Merging (#975)
    🔀 Entity merging allows you to deduplicate results. If you have several objects which describe the same physical entity, e.g. "Google Inc." and "Google Incorporated" (they both describe the real-world company "Google"), you can hide duplicates or even let Weaviate merge duplicates into a single entity.

    Usage

    Usage is best described in the following three example screenshots.

    🔀 No grouping/merging
    👀 First up is the behavior without any grouping or merging strategy. As you can see, there are a lot of duplicates:

    [Screenshot: results without grouping or merging]

    Grouping strategy closest
    🏗 With strategy closest, Weaviate tries to build groups based on your results. For each group it will show the results closest to your search query. Note that there is also a force field: the higher the force, the more likely Weaviate is to group two objects together. A force: 1.0 would mean that every single item, no matter how different, should be grouped. A force: 0 means that only exactly identical items should be grouped. The example below uses force: 0.1, as that yielded the best results. You can see that company names are no longer duplicated:

    [Screenshot: results with grouping strategy closest, force: 0.1]

    🔀 Grouping strategy merge
    🔀 The example above hides duplicates. This isn't an issue if every single field is identical. But what if you need to know the original values? Strategy merge keeps the contents of the original fields. String fields contain all original values as shown below, numerical fields display a mean, and reference fields contain all the references from all merged objects:

    [Screenshot: results with grouping strategy merge]
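
The field-level rules of the merge strategy described above can be sketched in a few lines of Python. This is only a model of the described behavior, not Weaviate's implementation: the function name, the sample objects, the beacon values and the choice to represent merged string values as a list are all assumptions. It also assumes every object has the same fields.

```python
from statistics import mean


def merge_objects(objects):
    """Merge objects that describe the same entity, field by field."""
    merged = {}
    for key in objects[0]:
        values = [obj[key] for obj in objects]
        if all(isinstance(v, (int, float)) for v in values):
            # Numerical fields display a mean.
            merged[key] = mean(values)
        elif all(isinstance(v, list) for v in values):
            # Reference fields keep all references from all merged objects.
            merged[key] = [ref for v in values for ref in v]
        else:
            # String fields keep all original values (shown here as a list;
            # the real output format is not specified in these notes).
            merged[key] = list(values)
    return merged


# Made-up duplicates of the same real-world company.
companies = [
    {"name": "Google Inc.", "employees": 100,
     "inCity": ["weaviate://localhost/things/a"]},
    {"name": "Google Incorporated", "employees": 200,
     "inCity": ["weaviate://localhost/things/b"]},
]
merged = merge_objects(companies)
print(merged)
```

Deciding which objects end up in the same group is a separate step that depends on vector distance and the force setting, and is not modeled here.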

    Best Practices

    To get the best possible results, please keep the following things in mind:

    • The grouping/merging is done internally based on vector distance. It is thus important that the items to be merged are as close to each other as possible. If your items use a lot of words which are not recognized by the contextionary, those words do not influence the vector position. In this case, consider extending the contextionary using the REST API (/c11y/extensions), so that it understands more words from your objects.
    • You get the best possible results if noise is removed in vectorization. We thus strongly recommend setting vectorizeClassName: false and vectorizePropertyName: false for each property. Those settings were introduced in 0.21.10.

    🛠 Fixes

    none

  • v0.21.10

    January 15, 2020
  • v0.21.9 Changes

    January 09, 2020

    🐳 Docker image/tag: semitechnologies/weaviate:0.21.9
    👀 See also: example docker compose files in English and Dutch.

    💥 Breaking Changes

    🆕 New Features

    🛠 Fixes

  • v0.21.8 Changes

    January 07, 2020

    🐳 Docker image/tag: semitechnologies/weaviate:0.21.8
    👀 See also: example docker compose files in English and Dutch.

    💥 Breaking Changes

    none

    🆕 New Features

    none

    🛠 Fixes

    Missing limit parameter in Aggregation (#1058)
    It wasn't possible to limit (or increase) the number of groups when making a grouped aggregation. This release addresses that by adding a limit: <int> option to Aggregate { Things { Class() } }.

    Missing limit parameter in stringProp aggregation's field topOccurrences (#992)
    🚀 It wasn't possible to limit (or increase) the number of string prop results using stringProp { topOccurrences { value occurs } }, which always defaulted to 5. This release introduces a limit: <int> field, so that the size of the result buckets can be freely set by the user, e.g. stringProp { topOccurrences(limit: 2000) { value occurs } }
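
For readers who build their queries programmatically, the topOccurrences limit could be filled in along these lines. This is a sketch: the helper name and the City/name class and property are hypothetical, and the query shape simply follows the fragments quoted above. Sending the query to a GraphQL endpoint is left out.

```python
def top_occurrences_query(class_name, prop_name, limit):
    """Build an Aggregate query asking for `limit` top occurrences of a string prop."""
    return (
        "{ Aggregate { Things { "
        f"{class_name} {{ {prop_name} {{ "
        f"topOccurrences(limit: {int(limit)}) {{ value occurs }} "
        "} } } } }"
    )


# Hypothetical class "City" with a string property "name".
query = top_occurrences_query("City", "name", 2000)
print(query)
```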

  • v0.21.7 Changes

    December 19, 2019

    🐳 Docker image/tag: semitechnologies/weaviate:0.21.7
    👀 See also: example docker compose files in English and Dutch.

    💥 Breaking Changes

    none

    🆕 New Features

    none

    🛠 Fixes

    • Aggregate not being able to count string props (#974)
      This issue was also duplicated as #993. Previously "null" was returned in the count field of a string prop. Now the actual count is returned.
  • v0.21.6 Changes

    December 18, 2019

    🐳 Docker image/tag: semitechnologies/weaviate:0.21.6
    👀 See also: example docker compose files in English and Dutch.

    💥 Breaking Changes

    none

    🆕 New Features

    • 👍 Better defaults for Replication/Sharding (#1014)
      0️⃣ This affects the underlying Elasticsearch database ("esvector"). This release sets reasonable defaults, which can be overridden in the config using vectorIndex.numberOfShards: <int> as well as vectorIndex.autoExpandReplicas: <string>. The fields behave like their equivalents in Elasticsearch.

    🛠 Fixes

    Code base is now compatible with Go 1.13 (#1056)
    This mostly resolved Go module incompatibilities.

    Supernodes are no longer cached (#1053)
    This is a simple fix that will need a more elaborate solution later on. Right now, a supernode is simply not cached at all. When traversing the graph, the references are thus resolved in real time. This means a supernode currently cannot be used in a "search by reference" where filter. For example, a filter on a class City with path: ["hasRestaurants", "Restaurant", "name"] will not find those Cities which are considered supernodes.

    0️⃣ By default, any class with at least 100 outgoing references is considered a supernode, but this setting can be overridden by setting vectorIndex.supernodeThreshold: <int>.
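
The threshold rule can be pictured with a short Python sketch. The function names and the "cached" / "real-time" labels are invented for illustration; only the default of 100 and the "at least" comparison come from the text above.

```python
DEFAULT_SUPERNODE_THRESHOLD = 100  # default stated in these release notes


def is_supernode(outgoing_reference_count, threshold=DEFAULT_SUPERNODE_THRESHOLD):
    """A node counts as a supernode once it has at least `threshold` outgoing references."""
    return outgoing_reference_count >= threshold


def resolution_mode(outgoing_reference_count, threshold=DEFAULT_SUPERNODE_THRESHOLD):
    """Supernodes are not cached, so their references are resolved in real time."""
    if is_supernode(outgoing_reference_count, threshold):
        return "real-time"
    return "cached"


print(resolution_mode(99))
print(resolution_mode(100))
print(resolution_mode(150, threshold=200))
```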

    "No value given" error when using valueText filter (#1048)
    This error only affected the GraphQL where filter. The issue was not present in the REST where filter (currently only available on the classifications API).

  • v0.21.5 Changes

    December 06, 2019

    🐳 Docker image/tag: semitechnologies/weaviate:0.21.5
    👀 See also: example docker compose files in English and Dutch.

    💥 Breaking Changes

    none

    🆕 New Features

    Set where filters in all classification types (#985)
    This feature allows narrowing down objects in a classification through setting where filters.

    A total of three filters can be set (all three are optional).

    • sourceWhere: This limits the to-be-classified (or "unclassified") items which will be processed during a classification run.
    • trainingSetWhere: This limits the training set. This filter can only be used with classification types which rely on a training set, such as "type": "knn"
    • targetWhere: This limits the potential targets (or "labels") of a classification. This filter can only be used on classification types which don't have a training set, but rather produce a direct relationship between source and target, such as "type": "contextual"

    👀 For more elaborate examples on when to use which filter, see this post.

    The type/structure of the where filter object for all three options is identical to that of the existing where filters currently present in the GraphQL API.
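
To make the pairing of filters and classification types concrete, here is a small Python sketch that checks a request body against the rules listed above. The helper name, the sample bodies and the class and property names inside the filters are hypothetical, and the operator/valueString fields are assumed to mirror the GraphQL where filter. Only the knn and contextual types named in the text are covered.

```python
# Which where filters each classification type accepts, per the list above.
ALLOWED_FILTERS = {
    "knn": {"sourceWhere", "trainingSetWhere"},
    "contextual": {"sourceWhere", "targetWhere"},
}
ALL_FILTERS = {"sourceWhere", "trainingSetWhere", "targetWhere"}


def misplaced_filters(body):
    """Return the where-filter keys in `body` that don't apply to its type."""
    allowed = ALLOWED_FILTERS[body["type"]]
    return sorted(key for key in body if key in ALL_FILTERS and key not in allowed)


# Hypothetical knn classification: both filters fit the type.
knn_body = {
    "type": "knn",
    "sourceWhere": {
        "path": ["inCity", "City", "name"],
        "operator": "Equal",
        "valueString": "Amsterdam",
    },
    "trainingSetWhere": {
        "path": ["verified"],
        "operator": "Equal",
        "valueBoolean": True,
    },
}

# Hypothetical contextual classification with a filter that doesn't apply.
contextual_body = {"type": "contextual", "trainingSetWhere": knn_body["trainingSetWhere"]}

print(misplaced_filters(knn_body))
print(misplaced_filters(contextual_body))
```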

    🛠 Fixes

    • 0️⃣ Incorrect defaults in classification of type contextual (#1045)
      0️⃣ Prior to this version, the optional field k would always default to 3. This is indeed the desired behavior for a classification of type knn. However, it would also be set for contextual, where the field doesn't make sense. This fix makes sure that the defaults are only set where appropriate.
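
The corrected defaulting behavior amounts to something like the following Python sketch. The function name and the dictionary-based parameters are assumptions for illustration and not the actual Weaviate code; the default of 3 for k and the knn/contextual distinction come from the fix description above.

```python
def with_defaults(params):
    """Return a copy of `params` with type-specific defaults filled in."""
    result = dict(params)
    if result.get("type") == "knn":
        # k only makes sense for knn, so only default it there.
        result.setdefault("k", 3)
    return result


print(with_defaults({"type": "knn"}))
print(with_defaults({"type": "contextual"}))
```
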
  • v0.21.4

    December 05, 2019