Weaviate v0.22.12 Release Notes
Release Date: 2020-06-26
Docker image/tag:
semitechnologies/weaviate:0.22.12
See also: example docker compose files in English, German, Dutch, Italian and Czech.
Breaking Changes
none
New Features
New Underscore Prop for visualization: `_featureProjection` (#1178, #1139)
This release adds a new optional property ("underscore prop") to list results (REST `GET /v1/{kinds}`, GraphQL `Get {}`). The feature projection is intended to reduce the dimensionality of the object's vector into something easily suitable for visualizing, such as 2D or 3D. The underlying algorithm is exchangeable; the first algorithm provided is t-SNE.
The feature can be used without any parameters and tries to pick reasonable defaults. To do so, use the `include` parameter on REST (`GET /v1/{kinds}/?include=_featureProjection`) or the `_featureProjection { vector }` parameter in GraphQL, which appears alongside the schema-defined properties.
Optional Parameters
To tweak the feature projection, optional parameters (currently GraphQL-only) can be provided. The values and their defaults are:
| Parameter | Type | Default | Implication |
| --- | --- | --- | --- |
| `dimensions` | `int` | `2` | Target dimensionality, usually `2` or `3` |
| `algorithm` | `string` | `tsne` | Algorithm to be used, currently supported: `tsne` |
| `perplexity` | `int` | `min(5, len(results)-1)` | The t-SNE perplexity value, must be smaller than `n-1` where `n` is the number of results to be visualized |
| `learningRate` | `int` | `25` | The t-SNE learning rate |
| `iterations` | `int` | `100` | The number of iterations the t-SNE algorithm runs. Higher values lead to more stable results at the cost of a larger response time |

Limitations and Restrictions
- There is no request size limit (other than the global 10,000 items request limit) for a `_featureProjection` query. However, due to the O(n²) complexity of the t-SNE algorithm, response time grows roughly quadratically with the request size. We recommend keeping the request size at or below 100 items, as we have noticed drastic increases in response time beyond that.
- Feature projection happens in real time, per query. The dimensions returned have no meaning across queries.
- Currently only root elements (not resolved cross-references) are taken into consideration for the `_featureProjection`.
- Due to the relatively high cost of the underlying algorithm, we recommend limiting requests that include a `_featureProjection` in high-load situations where response time matters. Avoid parallel requests that include a `_featureProjection`, so that some threads stay available to serve other, time-critical requests.
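The optional parameters and the request-size recommendation above can be combined in a small helper. The following is a minimal Python sketch, not an official client. The function names, the class name `Article`, the property `title`, the host `http://localhost:8080`, the `limit` argument, and the nesting of the class under `Things` are all assumptions made for illustration. It only builds the request strings; it does not contact a Weaviate instance.

```python
from typing import List, Optional

# Recommendation from these release notes, not a hard server limit.
RECOMMENDED_MAX_ITEMS = 100


def rest_list_url(base_url: str, kind: str) -> str:
    """Build the REST list URL that includes the projection with defaults."""
    return f"{base_url}/v1/{kind}/?include=_featureProjection"


def build_feature_projection_query(
    class_name: str,
    properties: List[str],
    limit: int = RECOMMENDED_MAX_ITEMS,
    dimensions: int = 2,
    algorithm: str = "tsne",
    perplexity: Optional[int] = None,
    learning_rate: int = 25,
    iterations: int = 100,
) -> str:
    """Build a GraphQL Get query requesting `_featureProjection { vector }`.

    The query shape (class under `Things`, `limit` argument) is an assumption.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    # Cap the request size at the recommended maximum to keep t-SNE cheap.
    limit = min(limit, RECOMMENDED_MAX_ITEMS)

    args = [
        f"dimensions: {dimensions}",
        f'algorithm: "{algorithm}"',
        f"learningRate: {learning_rate}",
        f"iterations: {iterations}",
    ]
    if perplexity is not None:
        # The notes require perplexity < n-1, with n the number of results.
        # The limit is only an upper bound on n; fewer results may come back.
        if perplexity >= limit - 1:
            raise ValueError("perplexity must be smaller than n-1")
        args.append(f"perplexity: {perplexity}")
    # When perplexity is None it is omitted so the server picks its default.

    props = " ".join(properties)
    return (
        "{ Get { Things { "
        f"{class_name}(limit: {limit}) {{ {props} "
        f"_featureProjection({', '.join(args)}) {{ vector }} "
        "} } } }"
    )


# Hypothetical usage: a 3D projection over at most 50 `Article` objects.
print(rest_list_url("http://localhost:8080", "things"))
print(build_feature_projection_query("Article", ["title"], limit=50, dimensions=3))
```

Capping the limit rather than rejecting larger values is a design choice of this sketch; a caller who accepts the longer response time could lift the cap instead.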
Example
The screenshot below shows a visualization done on a subset of the 20 newsgroup dataset with the article's main category used as label. The chart was created in Python using `matplotlib.pyplot`'s `scatter` feature.
Fixes
none