Weaviate v0.22.12 Release Notes

Release Date: 2020-06-26
  • Docker image/tag: semitechnologies/weaviate:0.22.12
    See also: example Docker Compose files in English, German, Dutch, Italian and Czech.

    Breaking Changes

    none

    New Features

    New Underscore Prop for visualization: _featureProjection (#1178, #1139)
    This release adds a new optional property ("underscore prop") to list results (REST GET /v1/{kinds}, GraphQL Get {}). The feature projection reduces the dimensionality of each object's vector to something that is easy to visualize, such as 2D or 3D. The underlying algorithm is exchangeable; the first algorithm provided is t-SNE.

    The feature can be used without any parameters, in which case it tries to pick reasonable defaults. To do so, use the include parameter on REST (GET /v1/{kinds}/?include=_featureProjection) or the _featureProjection { vector } field in GraphQL, which appears alongside the schema-defined properties.
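The two request forms described above can be sketched as follows. This is a minimal sketch that only builds the request strings and sends nothing. The base URL http://localhost:8080, the class name Article, the property title and both helper names are assumptions for illustration, as is the Get { Things { ... } } nesting of the GraphQL query.

```python
from urllib.parse import urlencode

# Assumption: a local Weaviate instance; adjust to your deployment.
BASE_URL = "http://localhost:8080"


def rest_list_url(kind: str, base_url: str = BASE_URL) -> str:
    """Build the REST list URL that asks for the _featureProjection prop."""
    query = urlencode({"include": "_featureProjection"})
    return f"{base_url}/v1/{kind}/?{query}"


def graphql_query(kind: str, class_name: str, props: list) -> str:
    """Build a GraphQL Get query with _featureProjection next to schema props.

    The kind and class nesting is an assumed query shape, not taken from
    the release notes.
    """
    fields = " ".join(props)
    return (
        f"{{ Get {{ {kind} {{ {class_name} "
        f"{{ {fields} _featureProjection {{ vector }} }} }} }} }}"
    )


if __name__ == "__main__":
    print(rest_list_url("things"))
    print(graphql_query("Things", "Article", ["title"]))
```

Keeping request construction separate from sending makes it easy to inspect the exact query before running a comparatively expensive projection.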

    Optional Parameters

    To tweak the feature projection, optional parameters (currently GraphQL-only) can be provided. The parameters and their defaults are:

    0๏ธโƒฃ | Parameter | Type | Default | Implication | | --- | --- | --- | --- | | dimensions | int | 2 | Target dimensionality, usually 2 or 3 | ๐Ÿ‘ | algorithm | string | tsne | Algorithm to be used, currently supported: tsne | | perplexity | int | min(5, len(results)-1) | The t-SNE perplexity value, must be smaller than the n-1 where n is the number of results to be visualized | | learningRate | int | 25 | The t-SNE learning rate | | iterations | int | 100 | The number of iterations the t-SNE algorithm runs. Higher values lead to more stable results at the cost of a larger response time |

    Limitations and Restrictions

    • A _featureProjection query has no request size limit of its own; only the global limit of 10,000 items per request applies. However, because the t-SNE algorithm has O(n^2) complexity, response time grows quadratically with request size. We recommend keeping the request size at or below 100 items, as we have noticed drastic increases in response time beyond that.
    • Feature Projection happens in real-time, per query. The dimensions returned have no meaning across queries.
    • Currently only root elements (not resolved cross-references) are taken into consideration for the _featureProjection.
    • Due to the relatively high cost of the underlying algorithm, we recommend limiting requests that include a _featureProjection in high-load situations where response time matters. Avoid parallel requests including a _featureProjection, so that some threads stay available to serve other, time-critical requests.
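One way a client could follow the recommendations above is to cap the request size at 100 items and let only one projection request run at a time. The sketch below is a client-side assumption, not part of Weaviate; the names clamp_limit and run_projection_request, and the send callable, are hypothetical.

```python
import threading

# Recommended upper bound from the notes above.
RECOMMENDED_MAX_ITEMS = 100

# Allow a single in-flight projection request per client process, so that
# parallel projection requests do not occupy all server threads.
_projection_slot = threading.Semaphore(1)


def clamp_limit(requested: int) -> int:
    """Keep the request size between 1 and the recommended maximum."""
    return max(1, min(requested, RECOMMENDED_MAX_ITEMS))


def run_projection_request(send, limit: int):
    """Call send(limit) with a clamped limit while holding the single slot.

    `send` is any callable that performs the actual request (hypothetical).
    """
    with _projection_slot:
        return send(clamp_limit(limit))


if __name__ == "__main__":
    # Stand-in for a real request: just echo the limit that would be used.
    print(run_projection_request(lambda n: n, 250))
```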

    Example

    The screenshot below shows a visualization of a subset of the 20 Newsgroups dataset, with each article's main category used as the label. The chart was created in Python using matplotlib.pyplot's scatter function.

    [image: scatter plot of the projected 20 Newsgroups subset, labeled by main category]
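A chart of this kind could be produced roughly as follows. This is a sketch under assumptions: each result object is taken to carry its 2D projection at obj["_featureProjection"]["vector"], the label key mainCategory and the sample objects are hypothetical, and matplotlib must be installed for the plotting step.

```python
def to_scatter_inputs(objects, label_key):
    """Split result objects into x values, y values and labels."""
    xs, ys, labels = [], [], []
    for obj in objects:
        vector = obj["_featureProjection"]["vector"]
        xs.append(vector[0])
        ys.append(vector[1])
        labels.append(obj[label_key])
    return xs, ys, labels


def plot_projection(objects, label_key):
    """Draw a scatter plot with one color per distinct label."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    xs, ys, labels = to_scatter_inputs(objects, label_key)
    categories = sorted(set(labels))
    color_ids = [categories.index(label) for label in labels]
    plt.scatter(xs, ys, c=color_ids, cmap="tab20")
    plt.show()


if __name__ == "__main__":
    # Hypothetical sample objects, not real query results.
    sample = [
        {"mainCategory": "sci", "_featureProjection": {"vector": [0.5, -1.0]}},
        {"mainCategory": "rec", "_featureProjection": {"vector": [-2.0, 3.5]}},
    ]
    plot_projection(sample, "mainCategory")
```

Because the projection is computed per query, points from different queries should not be combined in one chart.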

    Fixes

    none