Weaviate/CHANGELOG and Weaviate Releases

Changelog History

Page 1

v0.22.20 Changes
November 27, 2020
🐳 Docker image/tag: semitechnologies/weaviate:0.22.20
🔖 See also: example docker-compose files in English, Dutch, German, Czech, Italian. If you need to configure additional settings, you can also generate a custom docker-compose.yml file using the documentation.

💥 Breaking Changes

none

🆕 New Features
🆕 New classification (knn-only) ref meta fields added (#1244)
👀 See #1244 for details. When using the _classification underscore prop, the ref meta (i.e. the fields that are part of each reference object) now contains new count and distance fields. Note: These fields are only set if the object was affected by a kNN classification. The new fields are overallCount, winningCount, losingCount, meanWinningDistance, meanLosingDistance, closestOverallDistance, closestWinningDistance and closestLosingDistance.
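
As an illustration of what these fields could describe, here is a minimal sketch that derives them from the k nearest neighbors of one classified object. The field meanings are inferred from their names only (the source defers to #1244 for the definitions), and the helper `knn_ref_meta` and its input shape are hypothetical.

```python
from statistics import mean


def knn_ref_meta(neighbors, winning_label):
    """Summarize kNN neighbors as (label, distance) pairs.

    Assumption: "winning" means neighbors that voted for the chosen label,
    "losing" means all other neighbors.
    """
    winning = [dist for label, dist in neighbors if label == winning_label]
    losing = [dist for label, dist in neighbors if label != winning_label]
    overall = [dist for _, dist in neighbors]
    return {
        "overallCount": len(overall),
        "winningCount": len(winning),
        "losingCount": len(losing),
        "meanWinningDistance": mean(winning) if winning else None,
        "meanLosingDistance": mean(losing) if losing else None,
        "closestOverallDistance": min(overall) if overall else None,
        "closestWinningDistance": min(winning) if winning else None,
        "closestLosingDistance": min(losing) if losing else None,
    }


# Hypothetical neighbors of one object: two voted "politics", one "sports".
neighbors = [("politics", 0.2), ("politics", 0.4), ("sports", 0.5)]
meta = knn_ref_meta(neighbors, "politics")
```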
Standalone mode reaches feature parity with ES-based mode
🚀 It is still possible to switch between modes. Standalone is not considered production ready yet, as some minor issues (see Milestone "Standalone") are still open. A production-ready release, which removes all ES features entirely, is planned as v0.23.0 with an ETA of late December 2020.

👀 For details see #1265, #1249, #1271, #1272, #1281, #1250, #1291, #1278, #1286, #1308

🆕 New Deprecations
- winningDistance and losingDistance in _classification ref meta deprecated
  👀 They have been replaced with meanLosingDistance and meanWinningDistance, see this deprecation notice for details.
🛠 Fixes

none
v0.22.19 Changes
October 19, 2020
🐳 Docker image/tag: semitechnologies/weaviate:0.22.19
🔖 See also: example docker-compose files in English, Dutch, German, Czech, Italian. If you need to configure additional settings, you can also generate a custom docker-compose.yml file using the documentation.

💥 Breaking Changes

none

🆕 New Features
- Introduce certainty in Get {} with explore set (#1258)
  🚀 Prior to this release the certainty (i.e. how close a result is to the search query) was only available on Explore {}, but not on Get {} with an additional explore parameter set. This release introduces the _certainty underscore prop, which makes it available there as well.
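
A minimal sketch of what such a query might look like when built from Python. The class name `Article`, the property `title`, and the `Get { Things { ... } }` nesting are assumptions for illustration and are not taken from these release notes; only the `explore` parameter and the `_certainty` prop come from the text above.

```python
import json


def build_certainty_query(class_name, prop, concepts):
    """Build a Get {} query with explore set that also requests _certainty."""
    # A JSON list of strings is also valid GraphQL list syntax.
    concept_list = json.dumps(concepts)
    return (
        "{ Get { Things { "
        f"{class_name}(explore: {{concepts: {concept_list}}}) "
        f"{{ {prop} _certainty }} "
        "} } }"
    )


query = build_certainty_query("Article", "title", ["fashion"])
# Body for an HTTP POST to the GraphQL endpoint.
payload = json.dumps({"query": query})
```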
🛠 Fixes
- 👌 Improvements in moveAwayFrom (#1267)
  🚀 This release fixes issues where setting the moveAwayFrom parameter in Explore {} and Get {} with explore would lead to bad results that were in some cases no longer related to the original search term. The fix uses an improved formula (see #1267 for details) to calculate the new position after moving.
v0.22.18 Changes
September 28, 2020
🐳 Docker image/tag: semitechnologies/weaviate:0.22.18
🔖 See also: example docker-compose files in English, Dutch, German, Czech, Italian. If you need to configure additional settings, you can also generate a custom docker-compose.yml file using the documentation.

💥 Breaking Changes

none

🆕 New Features

none

🛠 Fixes
- 🛠 Fix error message on GraphQL {Explore} (gh-1256)
  accidentally introduced in 0.22.17
v0.22.17 Changes
September 16, 2020
🐳 Docker image/tag: semitechnologies/weaviate:0.22.17
🔖 See also: example docker-compose files in English, Dutch, German, Czech, Italian. If you need to configure additional settings, you can also generate a custom docker-compose.yml file using the documentation.

💥 Breaking Changes

none

🆕 New Features

none

🗄 Deprecations
🗄 Deprecate cardinality in schema property (#1240, #1142)
🗄 The cardinality field is now deprecated. There are no more restrictions: regardless of the content of this field, reference properties can have 0..n references, and primitive properties always hold a single value (i.e. they are either set or not set).

🚚 This change is non breaking as the field is not removed, but only ignored. If you explicitly still set a value for cardinality, weaviate will log a deprecation message to stdout.

🛠 Fixes
- ✂ Remove obsolete and unused telemetry logging (#1232)
- 🛠 Fix issue where periodic "telemetry logging" error message could occur
  🔧 Related to the removal of this unused/obsolete feature there was a chance that such log messages would be printed depending on the configuration. In other words, it was possible to accidentally turn this feature on even though it shouldn't have been. This commit now removes the feature entirely, so it can't be accidentally turned on anymore.

v0.22.16 Changes

September 15, 2020

🐳 Docker image/tag: semitechnologies/weaviate:0.22.16
🐳 See also: example docker-compose files in English, Dutch, German, Czech, Italian.

💥 Breaking Changes

none

🆕 New Features

🔧 Specify configuration through environment variables (#1230)
🚀 Prior to this release, Weaviate required a config file. That was not the cloud-native way to handle config. Furthermore, users who wanted to try out Weaviate using the docker-compose setup had to download two files. Now all config can be managed through the docker-compose file itself.

The following environment variables can be set:

| Variable | Description | Type | Example Value |
| --- | --- | --- | --- |
| `ORIGIN` | Set the http(s) origin for Weaviate | `string - HTTP origin` | |
| `CONFIGURATION_STORAGE_URL` | Service-Discovery for the (etcd) config store. | `string - URL` | `http://etcd:2379` |
| `CONTEXTIONARY_URL` | Service-Discovery for the contextionary container | `string - URL` | `http://contextionary` |
| `ESVECTOR_URL` | Service-Discovery for the Elasticsearch instance | `string - URL` | `http://esvector:9200` |
| `ESVECTOR_NUMBER_OF_SHARDS` | Configure default number of ES shards | `int` | `1` |
| `ESVECTOR_AUTO_EXPAND_REPLICAS` | Whether ES should auto expand replicas | `string` | `1-3` |
| `STANDALONE_MODE` | Turn on experimental standalone mode | `string - true/false` | `false` |
| `PERSISTENCE_DATA_PATH` | Only if `STANDALONE_MODE=true`: Where should Weaviate Standalone store its data? | `string - file path` | `/var/lib/weaviate` |
| `AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED` | Allow users to interact with weaviate without auth | `string - true/false` | `true` |
| `AUTHENTICATION_OIDC_ENABLED` | Enable OIDC Auth | `string - true/false` | `false` |
| `AUTHENTICATION_OIDC_ISSUER` | OIDC Token Issuer | `string - URL` | `https://myissuer.com` |
| `AUTHENTICATION_OIDC_CLIENT_ID` | OIDC Client ID | `string` | `my-client-id` |
| `AUTHENTICATION_OIDC_USERNAME_CLAIM` | OIDC Username Claim | `string` | `email` |
| `AUTHENTICATION_OIDC_GROUPS_CLAIM` | OIDC Groups Claim | `string` | `groups` |
| `AUTHORIZATION_ADMINLIST_ENABLED` | Enable AdminList Authorization mode | `string - true/false` | `true` |
| `AUTHORIZATION_ADMINLIST_USERS` | Users with admin permission | `string - comma-separated list` | `[email protected],[email protected]` |
| `AUTHORIZATION_ADMINLIST_READONLY_USERS` | Users with read-only permission | `string - comma-separated list` | `[email protected],[email protected]` |
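
To make the value types in the table concrete, the following sketch shows how string-typed environment variables of this kind could be turned into typed settings. It is not Weaviate's own parsing code: the helper names, the fallback values, and the example addresses are assumptions for illustration.

```python
def parse_bool(value, default=False):
    """Interpret a "true"/"false" string; fall back to default when unset."""
    if value is None:
        return default
    return value.strip().lower() == "true"


def parse_list(value):
    """Split a comma-separated string into a list, ignoring empty items."""
    if not value:
        return []
    return [item.strip() for item in value.split(",") if item.strip()]


def load_config(env):
    """Read a few of the variables from the table above out of a mapping."""
    return {
        "standalone_mode": parse_bool(env.get("STANDALONE_MODE")),
        "persistence_data_path": env.get("PERSISTENCE_DATA_PATH"),
        "anonymous_access": parse_bool(
            env.get("AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED")
        ),
        # Fallback of 1 is a placeholder, not a documented default.
        "es_shards": int(env.get("ESVECTOR_NUMBER_OF_SHARDS", "1")),
        "admin_users": parse_list(env.get("AUTHORIZATION_ADMINLIST_USERS")),
    }


# Hypothetical environment; in a real process one would pass os.environ.
example_env = {
    "STANDALONE_MODE": "true",
    "PERSISTENCE_DATA_PATH": "/var/lib/weaviate",
    "AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED": "false",
    "AUTHORIZATION_ADMINLIST_USERS": "alice@example.com, bob@example.com",
}
config = load_config(example_env)
```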

More CRUD capabilities in Standalone-Mode (#1221)
📚 If you are running the experimental standalone mode (a preview of some of the features we will introduce in 1.0.0), more CRUD capabilities were added. See #1221 for details and the "Standalone Mode" status page for an up-to-date feature/limitations overview.

🛠 Fixes

✂ Remove broken/unused/impossible endpoints (#1235)
🚀 Prior to this release, there were some REST endpoints that were never implemented or always failed, due to the action not being possible. They have been cleaned up in this release. This removal of endpoints is considered non-breaking, as they were never usable to begin with. See #1235 for which endpoints were removed.

v0.22.15 Changes
August 28, 2020
🐳 Docker image/tag: semitechnologies/weaviate:0.22.15
👀 See also: example docker compose files in English, German, Dutch, Italian and Czech.

💥 Breaking Changes

none

🆕 New Features
Optional Compound Splitting in Contextionary

Motivation

Sometimes Weaviate's Contextionary does not understand words that are compounded out of words it would otherwise understand. The impact is far greater in languages that allow for arbitrary compounding (such as Dutch or German) than in languages where compounding is not very common (such as English).

Effect

🚚 Imagine you import an object of class Post with content "This is a thunderstormcloud". The arbitrarily compounded word thunderstormcloud is not present in the Contextionary. So your object's position will be made up of only the words it recognizes: "post", "this" ("is" and "a" are removed as stopwords).

👀 If you check how this content was vectorized using the _interpretation feature, you will see something like the following:
```
"_interpretation": { "source": [{ "concept": "post", "occurrence": 62064610, "weight": 0.3623903691768646 }, { "concept": "this", "occurrence": 932425699, "weight": 0.10000000149011612 }] }
```
To overcome this limitation the optional Compound Splitting Feature can be enabled in the Contextionary. It will understand the arbitrary compounded word and interpret your object as follows:
```
"_interpretation": { "source": [{ "concept": "post", "occurrence": 62064610, "weight": 0.3623903691768646 }, { "concept": "this", "occurrence": 932425699, "weight": 0.10000000149011612 }, { "concept": "thunderstormcloud (thunderstorm, cloud)", "occurrence": 5756775, "weight": 0.5926488041877747 }] }
```
Note that the newly found word (made up of the parts thunderstorm and cloud) has the highest weight in the vectorization. So the meaning that would have been lost without Compound Splitting can now be recognized.
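
The general idea of splitting an unknown word into known parts can be sketched as follows. This is a simple dictionary-based, longest-prefix-first splitter written for illustration; it is not the Contextionary's actual algorithm, and the function name, the minimum part length, and the tiny vocabulary are assumptions.

```python
def split_compound(word, vocabulary, min_part_len=3):
    """Return the known parts that make up word, or None if no split exists."""
    if word in vocabulary:
        return [word]
    # Try the longest possible prefix first, then shorter ones.
    for i in range(len(word) - min_part_len, min_part_len - 1, -1):
        prefix = word[:i]
        if prefix in vocabulary:
            rest = split_compound(word[i:], vocabulary, min_part_len)
            if rest is not None:
                return [prefix] + rest
    return None


# Hypothetical vocabulary standing in for the words the Contextionary knows.
vocabulary = {"post", "this", "thunder", "storm", "thunderstorm", "cloud"}
parts = split_compound("thunderstormcloud", vocabulary)
unknown = split_compound("xyzzyq", vocabulary)
```

Because every unrecognized word triggers a search like this, the extra work at import time grows with the number of unknown words in the dataset.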

Trade-Off Import speed vs Word recognition

0️⃣ Compound Splitting runs on any word that is otherwise not recognized. Depending on your dataset this can lead to a significantly longer import time (up to 100% longer). Therefore, you should carefully evaluate whether the higher precision in recognition or the faster import time is more important to your use case. As the benefit is larger in some languages (e.g. Dutch, German) than in others (e.g. English), this feature is turned off by default.

How to activate

To turn on compound splitting simply change the environment variable ENABLE_COMPOUND_SPLITTING to true on the contextionary container. For example, on the English language docker-compose files, the variable can be found in this line.
Classification Performance Improvements
🚀 Prior to this release Classifications (both kNN and contextual) were single threaded and thus did not utilize all available resources. With this release both classification types use as many threads as there are CPU cores available. This can speed up classifications considerably on larger machines.

🛠 Fixes

none
v0.22.14 Changes
July 14, 2020
🐳 Docker image/tag: semitechnologies/weaviate:0.22.14
👀 See also: example docker compose files in English, German, Dutch, Italian and Czech.

💥 Breaking Changes

none

🆕 New Features
👌 Improved noise filtering in _nearestNeighbors and _semanticPath (semi-technologies/contextionary#35)
Prior to this release the features relying on nearest neighbors (namely _nearestNeighbors and _semanticPath) might have contained "noise words", i.e. concatenations of random words with no easily recognizable meaning. These words are present in the Contextionary training space, but are extremely rare and therefore distributed seemingly randomly. As a result an "ordinary" result might have such a noise word as its immediate neighbor.

To combat this noise, a neighbor filtering feature was introduced in the contextionary, which ignores words of the configured bottom percentile, ranked by occurrence in the respective training set. By default this value is set to the bottom 5th percentile. This setting can be overridden. To set another value, e.g. to ignore the bottom 10th percentile, provide the environment variable NEIGHBOR_OCCURRENCE_IGNORE_PERCENTILE=10 to the contextionary container. An example of this variable set correctly can be found here (English version).

Note that this feature requires at least contextionary version ...-v0.4.15.
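
A minimal sketch of percentile-based neighbor filtering, assuming a simple mapping from word to occurrence count. The function name, the cutoff computation, and the sample data are illustrative assumptions, not the contextionary's implementation.

```python
def filter_rare_neighbors(neighbors, occurrences, ignore_percentile=5):
    """Drop neighbors whose occurrence count is in the bottom percentile."""
    counts = sorted(occurrences.values())
    # Number of lowest-ranked words that fall into the ignored share.
    cutoff_index = int(len(counts) * ignore_percentile / 100)
    if cutoff_index < len(counts):
        threshold = counts[cutoff_index]
    else:
        threshold = float("inf")
    return [word for word in neighbors if occurrences.get(word, 0) >= threshold]


# Hypothetical training-set occurrences; "applexqzv" plays the noise word.
occurrences = {
    "apple": 5000, "company": 9000, "phone": 7000, "device": 4000,
    "store": 3000, "music": 6000, "laptop": 2500, "tablet": 2000,
    "screen": 3500, "applexqzv": 2,
}
neighbors = ["company", "applexqzv", "phone"]
filtered = filter_rare_neighbors(neighbors, occurrences, ignore_percentile=10)
```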

🛠 Fixes
🔧 Check if required contextionary container is running and configured at startup (gh-1190)
🔧 Prior to this version, it was possible to run a new version of weaviate together with an outdated and incompatible contextionary version. This (invalid) combination would start up fine, but eventually lead to runtime errors when a contextionary method was called that was incompatible with the configured contextionary container.

🔧 This fix makes this impossible. If a contextionary version is configured that does not meet the required conditions, Weaviate will no longer start up. The logs will show one of the following options:

Option 1: Wrong version
```
FATA[0001] insufficient contextionary version, cannot start up action=startup_check_contextionary contextionaryVersion=en0.16.0-v0.4.14 requiredMinimumContextionaryVersion=0.4.15
exit status 1
```
Option 2: Happy path
```
INFO[0001] found a valid contextionary version action=startup_check_contextionary contextionaryVersion=en0.16.0-v0.4.15 requiredMinimumContextionaryVersion=0.4.15
```
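
The comparison behind such a startup check can be sketched as follows. The parsing assumes the version string has the shape seen in the log lines above (a language/model prefix, then `-v`, then a three-part version); the function names are hypothetical and this is not Weaviate's own code.

```python
def parse_version(version):
    """Turn "0.4.15" into (0, 4, 15) so parts compare numerically."""
    return tuple(int(part) for part in version.split("."))


def contextionary_version_ok(full_version, required_minimum):
    """Check e.g. "en0.16.0-v0.4.15" against a minimum such as "0.4.15"."""
    # Keep only the part after the last "-v".
    _, _, software_version = full_version.rpartition("-v")
    return parse_version(software_version) >= parse_version(required_minimum)


too_old = contextionary_version_ok("en0.16.0-v0.4.14", "0.4.15")
valid = contextionary_version_ok("en0.16.0-v0.4.15", "0.4.15")
```

Comparing integer tuples rather than strings avoids ranking 0.4.9 above 0.4.15.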
v0.22.13 Changes
July 10, 2020
🐳 Docker image/tag: semitechnologies/weaviate:0.22.13
👀 See also: example docker compose files in English, German, Dutch, Italian and Czech.

💥 Breaking Changes

🆕 New Features
➕ Add _semanticPath underscore prop to Get{} queries with explore set (#1168)
The semantic path is the path between a search term and each result. For example if the search term is iphone and the first result is an article about "Microsoft Inc.", then the path would be how to get from the search term to the result, for example ["iphone", "apple", "company", "microsoft"]

🏗 The semantic path can be queried using the _semanticPath GraphQL property alongside the schema-defined properties. Building a semantic path is only possible if an explore: {} parameter is set, as the explore term represents the beginning of the path and each search result represents the end of the path. If no explore: {} parameter is set, the query is just a list query and not a search query, so no path can be built. Since explore: {} queries are currently only possible in GraphQL, _semanticPath is not available in the REST API.

Notes
1. Due to the relatively high cost of the underlying algorithm, we recommend limiting requests including a _featureProjection in high-load situations where response time matters. Avoid parallel requests including a _featureProjection, so that some threads stay available to serve other, time-critical requests. The maximum supported request limit (to be set through the limit: <int> parameter) is 25.
2. The semantic path feature requires a contextionary version of at least ...-v0.4.14 which is used in the provided docker-compose files linked above.
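
A sketch of a query builder that respects the limit of 25 mentioned in the notes. The class name `Article`, the property `title`, the `Get { Things { ... } }` nesting, and the `path { concept }` sub-selection are assumptions for illustration and are not confirmed by these release notes.

```python
import json

MAX_SEMANTIC_PATH_LIMIT = 25  # maximum request limit stated in the notes


def build_semantic_path_query(class_name, prop, concepts, limit):
    """Build a Get {} query with explore set that requests _semanticPath."""
    if not 1 <= limit <= MAX_SEMANTIC_PATH_LIMIT:
        raise ValueError(
            f"limit must be between 1 and {MAX_SEMANTIC_PATH_LIMIT}"
        )
    concept_list = json.dumps(concepts)
    return (
        "{ Get { Things { "
        f"{class_name}(explore: {{concepts: {concept_list}}}, limit: {limit}) "
        f"{{ {prop} _semanticPath {{ path {{ concept }} }} }} "
        "} } }"
    )


query = build_semantic_path_query("Article", "title", ["iphone"], limit=10)
```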
🛠 Fixes
v0.22.12 Changes
June 26, 2020
🐳 Docker image/tag: semitechnologies/weaviate:0.22.12
👀 See also: example docker compose files in English, German, Dutch, Italian and Czech.

💥 Breaking Changes

none

🆕 New Features
🆕 New Underscore Prop for visualization: _featureProjection (#1178, #1139)
🚀 This release adds a new optional property ("underscore prop") to list results (REST GET /v1/{kinds}, GraphQL Get {}). The feature projection is intended to reduce the dimensionality of the object's vector into something easily suitable for visualizing, such as 2d or 3d. The underlying algorithm is exchangeable, the first algorithm to be provided is t-SNE.

🔋 The feature can be used without any params and tries to pick reasonable defaults. To do so, use the include parameter on REST (GET /v1/{kinds}/?include=_featureProjection) or the `_featureProjection { vector }` parameter in GraphQL, which appears alongside the schema-defined properties.

Optional Parameters

0️⃣ To tweak the feature projection, optional parameters (currently GraphQL-only) can be provided. The values and their defaults are:

| Parameter | Type | Default | Implication |
| --- | --- | --- | --- |
| dimensions | int | 2 | Target dimensionality, usually 2 or 3 |
| algorithm | string | tsne | Algorithm to be used, currently supported: tsne |
| perplexity | int | min(5, len(results)-1) | The t-SNE perplexity value, must be smaller than n-1, where n is the number of results to be visualized |
| learningRate | int | 25 | The t-SNE learning rate |
| iterations | int | 100 | The number of iterations the t-SNE algorithm runs. Higher values lead to more stable results at the cost of a larger response time |
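
The defaults listed above can be summarized in a small helper. This is only a sketch of the table, not code from Weaviate; the function name and the override mechanism are hypothetical.

```python
def feature_projection_params(result_count, **overrides):
    """Return the documented defaults, optionally overridden by the caller."""
    params = {
        "dimensions": 2,
        "algorithm": "tsne",
        # The default perplexity depends on the number of results.
        "perplexity": min(5, result_count - 1),
        "learningRate": 25,
        "iterations": 100,
    }
    unknown = set(overrides) - set(params)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    params.update(overrides)
    return params


small = feature_projection_params(3)
three_d = feature_projection_params(50, dimensions=3)
```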

Limitations and Restrictions
- There is no request size limit specific to _featureProjection queries (other than the global 10,000 items request limit). However, due to the O(n²) complexity of the t-SNE algorithm, response time grows quadratically with the request size. We recommend keeping the request size at or below 100 items, as we have noticed drastic increases in response time thereafter.
- Feature Projection happens in real-time, per query. The dimensions returned have no meaning across queries.
- Currently only root elements (not resolved cross-references) are taken into consideration for the featureProjection.
- Due to the relatively high cost of the underlying algorithm, we recommend to limit requests including a _featureProjection in high-load situations where response time matters. Avoid parallel requests including a _featureProjection, so that some threads stay available to serve other, time-critical requests.
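
To give a feel for the O(n²) growth mentioned in the first item, the sketch below counts the pairwise comparisons among n items. It illustrates the growth rate only and makes no claim about how the t-SNE implementation used here is structured.

```python
def pairwise_comparisons(n):
    """Number of distinct pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


for n in (100, 200, 1000):
    print(n, pairwise_comparisons(n))
# prints:
# 100 4950
# 200 19900
# 1000 499500
```

Doubling the request size from 100 to 200 items roughly quadruples the number of pairs.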
Example

The screenshot below shows a visualization done on a subset of the 20 newsgroup dataset with the article's main category used as label. The chart was created in Python using matplotlib.pyplot's scatter feature.

🛠 Fixes

none
v0.22.11 Changes
June 24, 2020
🐳 Docker image/tag: semitechnologies/weaviate:0.22.11
👀 See also: example docker compose files in English, German, Dutch, Italian and Czech.

💥 Breaking Changes

none

🆕 New Features
➕ Add _nearestNeighbors underscore prop to REST and GraphQL API (#1169)
Display information about the neighboring concepts of an object in a single (GET /v1/{kind}/{id}) or list (GET /v1/{kinds}) REST response by setting the ?include=_nearestNeighbors property. The same can be achieved in a GraphQL Get {} query by requesting the _nearestNeighbors{} prop alongside the schema-defined props.

Note: Displaying nearest neighbors leads to n additional searches, where n is the number of search results. Use carefully in high-load situations or scale up Weaviate accordingly.
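
A small sketch of building the REST URLs described above. The base URL, the kind name `things`, and the example ID are assumptions for illustration; only the path shapes and the `include=_nearestNeighbors` parameter come from the text.

```python
from urllib.parse import urlencode


def object_url(base_url, kind, object_id=None, include="_nearestNeighbors"):
    """Build a list URL (no object_id) or a single-object URL."""
    path = f"/v1/{kind}"
    if object_id:
        path += f"/{object_id}"
    query = urlencode({"include": include})
    return f"{base_url.rstrip('/')}{path}?{query}"


# Hypothetical local instance and object ID.
list_url = object_url("http://localhost:8080", "things")
single_url = object_url(
    "http://localhost:8080/", "things", "6f1b4c3e-0000-0000-0000-000000000000"
)
```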

🛠 Fixes
- Missing classification distances in meta/_classification responses (#1176)
  🚀 Release 0.22.8 introduced an issue where the kNN-classification distances were sometimes not shown when the ?meta=true (deprecated) or the ?include=_classification flags were set. This release fixes this issue.

Weaviate changelog

Weaviate is an open-source vector database that stores both objects and vectors, combining vector search and structured filtering with the fault tolerance and scalability of a cloud-native database.
