Changelog History
-
v0.22.20 Changes
November 27, 2020
Docker image/tag:
semitechnologies/weaviate:0.22.20
See also: example docker-compose files in English, Dutch, German, Czech, Italian. If you need to configure additional settings, you can also generate a custom docker-compose.yml file using the documentation.
Breaking Changes
none
New Features
New classification (kNN-only) ref meta fields added (#1244)
See #1244 for details. When using the _classification underscore prop, the ref meta (i.e. the fields that are part of each reference object) now contains new distance fields. Note: these fields are only set if the object was affected by a kNN classification. The new fields are overallCount, winningCount, losingCount, meanWinningDistance, meanLosingDistance, closestOverallDistance, closestWinningDistance and closestLosingDistance.
Standalone mode reaches feature parity with ES-based mode
It is still possible to switch between modes. Standalone is not considered production ready yet, as some minor issues (see Milestone "Standalone") are not yet resolved. A production-ready release, which removes all ES features entirely, is planned as v0.23.0 with an ETA of late December 2020.
For details see #1265, #1249, #1271, #1272, #1281, #1250, #1291, #1278, #1286, #1308
New Deprecations
winningDistance and losingDistance in _classification ref meta deprecated
They have been replaced with meanLosingDistance and meanWinningDistance, see this deprecation notice for details.
Fixes
none
-
v0.22.19 Changes
October 19, 2020
Docker image/tag:
semitechnologies/weaviate:0.22.19
See also: example docker-compose files in English, Dutch, German, Czech, Italian. If you need to configure additional settings, you can also generate a custom docker-compose.yml file using the documentation.
Breaking Changes
none
New Features
- Introduce certainty in Get {} with explore set (#1258)
Prior to this release the certainty (i.e. how close a result is to the search query) could only be set on Explore {}, but not on Get {} with an additional explore parameter set. This release introduces the _certainty underscore prop, which adds this ability.
Fixes
- Improvements in moveAwayFrom (#1267)
This release fixes issues where setting the moveAwayFrom parameter in Explore and Get {} with explore would lead to bad results that were in some cases no longer related to the original search term. The fix uses an improved formula (see #1267 for details) to calculate the new position after moving.
-
v0.22.18 Changes
September 28, 2020
Docker image/tag:
semitechnologies/weaviate:0.22.18
See also: example docker-compose files in English, Dutch, German, Czech, Italian. If you need to configure additional settings, you can also generate a custom docker-compose.yml file using the documentation.
Breaking Changes
none
New Features
none
Fixes
- Fix error message on GraphQL {Explore} (gh-1256) accidentally introduced in 0.22.17
-
v0.22.17 Changes
September 16, 2020
Docker image/tag:
semitechnologies/weaviate:0.22.17
See also: example docker-compose files in English, Dutch, German, Czech, Italian. If you need to configure additional settings, you can also generate a custom docker-compose.yml file using the documentation.
Breaking Changes
none
New Features
none
Deprecations
Deprecate cardinality in schema property (#1240, #1142)
The cardinality field has been deprecated. There are no more restrictions: regardless of the content of this field, reference properties can have 0..n references, and primitive properties always hold just a single value (i.e. they are either set or not set).
This change is non-breaking, as the field is not removed but only ignored. If you still explicitly set a value for cardinality, Weaviate will log a deprecation message to stdout.
Fixes
- Remove obsolete and unused telemetry logging (#1232)
- Fix issue where periodic "telemetry logging" error message could occur
Related to the removal of this unused/obsolete feature: depending on the configuration, such log messages could be printed. In other words, it was possible to accidentally turn this feature on even though it shouldn't have been. This release removes the feature entirely, so it can no longer be turned on accidentally.
-
v0.22.16 Changes
September 15, 2020
Docker image/tag:
semitechnologies/weaviate:0.22.16
See also: example docker-compose files in English, Dutch, German, Czech, Italian.
Breaking Changes
none
New Features
Specify configuration through environment variables (#1230)
Prior to this release, Weaviate required a config file. That was not the cloud-native way to handle config. Furthermore, users who wanted to try out Weaviate using the docker-compose setup had to download two files. Now all config can be managed through the docker-compose file itself.
The following environment variables can be set:
| Variable | Description | Type | Example Value |
| --- | --- | --- | --- |
| ORIGIN | Set the http(s) origin for Weaviate | string - HTTP origin | |
| CONFIGURATION_STORAGE_URL | Service-Discovery for the (etcd) config store. | string - URL | http://etcd:2379 |
| CONTEXTIONARY_URL | Service-Discovery for the contextionary container | string - URL | http://contextionary |
| ESVECTOR_URL | Service-Discovery for the Elasticsearch instance | string - URL | http://esvector:9200 |
| ESVECTOR_NUMBER_OF_SHARDS | Configure default number of ES shards | int | 1 |
| ESVECTOR_AUTO_EXPAND_REPLICAS | Whether ES should auto expand replicas | string | 1-3 |
| STANDALONE_MODE | Turn on experimental standalone mode | string - true/false | false |
| PERSISTENCE_DATA_PATH | Only if STANDALONE_MODE=true: Where should Weaviate Standalone store its data? | string - file path | /var/lib/weaviate |
| AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED | Allow users to interact with weaviate without auth | string - true/false | true |
| AUTHENTICATION_OIDC_ENABLED | Enable OIDC Auth | string - true/false | false |
| AUTHENTICATION_OIDC_ISSUER | OIDC Token Issuer | string - URL | https://myissuer.com |
| AUTHENTICATION_OIDC_CLIENT_ID | OIDC Client ID | string | my-client-id |
| AUTHENTICATION_OIDC_USERNAME_CLAIM | OIDC Username Claim | string | email |
| AUTHENTICATION_OIDC_GROUPS_CLAIM | OIDC Groups Claim | string | groups |
| AUTHORIZATION_ADMINLIST_ENABLED | Enable AdminList Authorization mode | string - true/false | true |
| AUTHORIZATION_ADMINLIST_USERS | Users with admin permission | string - comma-separated list | [email protected],[email protected] |
| AUTHORIZATION_ADMINLIST_READONLY_USERS | Users with read-only permission | string - comma-separated list | [email protected],[email protected] |
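As a rough illustration of how a process might consume the variables listed above, the following Python sketch reads a few of them from the environment. It is not Weaviate's own config loader: the function name load_weaviate_env, the layout of the returned dictionary, and the handling of unset variables (booleans fall back to false, lists to empty) are assumptions made for this example.

```python
import os
from typing import Dict, List, Mapping, Optional, Union

ConfigValue = Union[str, bool, List[str], None]


def _as_bool(value: Optional[str]) -> bool:
    # The table types booleans as "string - true/false".
    # Assumption: anything other than "true" (or unset) counts as false.
    return value is not None and value.strip().lower() == "true"


def _as_list(value: Optional[str]) -> List[str]:
    # Comma-separated list, as used by the admin list variables.
    # Empty entries are dropped.
    if not value:
        return []
    return [item.strip() for item in value.split(",") if item.strip()]


def load_weaviate_env(
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, ConfigValue]:
    """Hypothetical helper: collect a subset of the documented variables."""
    env = os.environ if env is None else env
    return {
        "origin": env.get("ORIGIN"),
        "contextionary_url": env.get("CONTEXTIONARY_URL"),
        "standalone_mode": _as_bool(env.get("STANDALONE_MODE")),
        "persistence_data_path": env.get("PERSISTENCE_DATA_PATH"),
        "anonymous_access": _as_bool(
            env.get("AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED")
        ),
        "admin_users": _as_list(env.get("AUTHORIZATION_ADMINLIST_USERS")),
    }
```

Passing a plain dictionary instead of relying on os.environ keeps the helper easy to try out without touching the real environment.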
More CRUD capabilities in Standalone-Mode (#1221)
If you are running the experimental standalone mode (a preview of some of the features we will introduce in 1.0.0), more CRUD capabilities were added. See #1221 for details and the "Standalone Mode" status page for an up-to-date feature/limitations overview.
Fixes
- Remove broken/unused/impossible endpoints (#1235)
Prior to this release, there were some REST endpoints that were never implemented or always failed because the action was not possible. They have been cleaned up in this release. This removal of endpoints is considered non-breaking, as they were never usable to begin with. See #1235 for which endpoints were removed.
-
v0.22.15 Changes
August 28, 2020
Docker image/tag:
semitechnologies/weaviate:0.22.15
See also: example docker compose files in English, German, Dutch, Italian and Czech.
Breaking Changes
none
New Features
Optional Compound Splitting in Contextionary
Motivation
Sometimes Weaviate's Contextionary does not understand words which are compounded out of words it would otherwise understand. The impact is far greater in languages that allow for arbitrary compounding (such as Dutch or German) than in languages where compounding is not very common (such as English).
Effect
Imagine you import an object of class Post with content "This is a thunderstormcloud". The arbitrarily compounded word thunderstormcloud is not present in the Contextionary. So your object's position will be made up of the only words it recognizes: "post", "this" ("is" and "a" are removed as stopwords).
If you check how this content was vectorized using the _interpretation feature, you will see something like the following:
"_interpretation": { "source": [{ "concept": "post", "occurrence": 62064610, "weight": 0.3623903691768646 }, { "concept": "this", "occurrence": 932425699, "weight": 0.10000000149011612 }] }
To overcome this limitation, the optional Compound Splitting feature can be enabled in the Contextionary. It will understand the arbitrarily compounded word and interpret your object as follows:
"_interpretation": { "source": [{ "concept": "post", "occurrence": 62064610, "weight": 0.3623903691768646 }, { "concept": "this", "occurrence": 932425699, "weight": 0.10000000149011612 }, { "concept": "thunderstormcloud (thunderstorm, cloud)", "occurrence": 5756775, "weight": 0.5926488041877747 }] }
Note that the newly found word (made up of the parts thunderstorm and cloud) has the highest weight in the vectorization. So the meaning that would have been lost without Compound Splitting can now be recognized.
Trade-Off Import speed vs Word recognition
Compound Splitting runs on any word that is otherwise not recognized. Depending on your dataset this can lead to a significantly longer import time (up to 100% longer). Therefore, you should carefully evaluate whether the higher precision in recognition or the faster import time is more important to your use case. As the benefit is larger in some languages (e.g. Dutch, German) than in others (e.g. English), this feature is turned off by default.
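To make the idea of compound splitting more concrete, here is a minimal Python sketch of a dictionary-based splitter. The Contextionary's actual algorithm is not described in these notes, so this is only one plausible approach: the function name split_compound, the longest-prefix-first strategy, and the min_part_len cutoff are all assumptions for illustration.

```python
from typing import List, Optional, Set


def split_compound(
    word: str, vocabulary: Set[str], min_part_len: int = 3
) -> Optional[List[str]]:
    """Split `word` into known vocabulary words, or return None.

    Hypothetical sketch: tries the longest known prefix first and
    recurses on the remainder until the whole word is covered.
    """
    word = word.lower()
    if word in vocabulary:
        return [word]
    # Leave at least `min_part_len` characters for the remainder.
    for end in range(len(word) - min_part_len, min_part_len - 1, -1):
        prefix = word[:end]
        if prefix in vocabulary:
            rest = split_compound(word[end:], vocabulary, min_part_len)
            if rest is not None:
                return [prefix] + rest
    return None


vocabulary = {"thunder", "storm", "thunderstorm", "cloud", "post", "this"}
parts = split_compound("thunderstormcloud", vocabulary)
```

With this toy vocabulary the unknown word is resolved into thunderstorm and cloud, mirroring the interpretation shown above. The extra lookups for every unrecognized word also hint at why import time can grow when the feature is enabled.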
How to activate
To turn on compound splitting, simply change the environment variable ENABLE_COMPOUND_SPLITTING to true on the contextionary container. For example, on the English language docker-compose files, the variable can be found in this line.
Classification Performance Improvements
Prior to this release, classifications (both kNN and contextual) were single-threaded and thus did not utilize all available resources. With this release both classification types will use as many threads as there are CPU cores available. This can speed up classifications considerably on larger machines.
Fixes
none
-
v0.22.14 Changes
July 14, 2020
Docker image/tag:
semitechnologies/weaviate:0.22.14
See also: example docker compose files in English, German, Dutch, Italian and Czech.
Breaking Changes
none
New Features
Improved noise filtering in _nearestNeighbors and _semanticPath (semi-technologies/contextionary#35)
Prior to this release the features relying on nearest neighbors (namely _nearestNeighbors and _semanticPath) might have contained "noise words", i.e. concatenations of random words with no easily recognizable meaning. These words are present in the Contextionary training space, but are extremely rare and therefore distributed seemingly randomly. As a result an "ordinary" result might have such a noise word as its immediate neighbor.
To combat this noise, a neighbor filtering feature was introduced in the contextionary, which ignores words of the configured bottom percentile, ranked by occurrence in the respective training set. By default this value is set to the bottom 5th percentile. This setting can be overridden. To set another value, e.g. to ignore the bottom 10th percentile, provide the environment variable NEIGHBOR_OCCURRENCE_IGNORE_PERCENTILE=10 to the contextionary container. An example of this variable set correctly can be found here (English version).
Minimum required contextionary version for this feature: ...-v0.4.15
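The percentile-based filtering described above can be sketched in a few lines of Python. This is an illustration of the general idea only, not the contextionary's implementation: the function name filter_rare_neighbors, the occurrence table passed as a dictionary, and the exact way the cutoff is computed are assumptions.

```python
from typing import Dict, List


def filter_rare_neighbors(
    neighbors: List[str],
    occurrences: Dict[str, int],
    ignore_percentile: int = 5,
) -> List[str]:
    """Drop neighbors whose occurrence lies in the bottom percentile.

    Hypothetical sketch: `occurrences` maps every vocabulary word to how
    often it appeared in the training set.
    """
    if not occurrences or ignore_percentile <= 0:
        return list(neighbors)
    ranked = sorted(occurrences.values())
    # Index of the first occurrence count that is NOT ignored.
    cutoff_index = min(len(ranked) - 1, len(ranked) * ignore_percentile // 100)
    threshold = ranked[cutoff_index]
    # Words missing from the table are treated as never seen (assumption).
    return [w for w in neighbors if occurrences.get(w, 0) >= threshold]
```

Raising ignore_percentile from 5 to 10 corresponds to the NEIGHBOR_OCCURRENCE_IGNORE_PERCENTILE=10 example: more of the rarest words are excluded from neighbor results.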
Fixes
Check if required contextionary container is running and configured at startup (gh-1190)
Prior to this version, it was possible to run a new version of Weaviate together with an outdated and incompatible contextionary version. This (invalid) combination would start up fine, but eventually lead to runtime errors when a contextionary method was called that was incompatible with the configured contextionary container.
This fix makes this impossible. If a contextionary version is configured that does not meet the required conditions, Weaviate will no longer start up. The logs will show one of the following options:
Option 1: Wrong version
FATA[0001] insufficient contextionary version, cannot start up action=startup_check_contextionary contextionaryVersion=en0.16.0-v0.4.14 requiredMinimumContextionaryVersion=0.4.15 exit status 1
Option 2: Happy path
INFO[0001] found a valid contextionary version action=startup_check_contextionary contextionaryVersion=en0.16.0-v0.4.15 requiredMinimumContextionaryVersion=0.4.15
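The kind of comparison behind these two log lines can be sketched as follows. This Python snippet is an assumption-laden illustration, not Weaviate's startup code: the helper names are hypothetical, and the version format (a language/model prefix followed by -v and a dotted version) is inferred only from the contextionaryVersion values shown in the logs above.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    # "0.4.15" -> (0, 4, 15); compared numerically, not as strings.
    return tuple(int(part) for part in version.split("."))


def meets_minimum(contextionary_version: str, required_minimum: str) -> bool:
    """Hypothetical check: is the part after '-v' at least the minimum?"""
    _prefix, separator, suffix = contextionary_version.rpartition("-v")
    if not separator:
        raise ValueError(
            "unexpected version format: %r" % contextionary_version
        )
    return parse_version(suffix) >= parse_version(required_minimum)
```

Comparing integer tuples rather than raw strings matters here: as strings, "0.4.9" would sort after "0.4.15".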
-
v0.22.13 Changes
July 10, 2020
Docker image/tag:
semitechnologies/weaviate:0.22.13
See also: example docker compose files in English, German, Dutch, Italian and Czech.
Breaking Changes
New Features
Add _semanticPath underscore prop to Get{} queries with explore set (#1168)
The semantic path is the path between a search term and each result. For example, if the search term is iphone and the first result is an article about "Microsoft Inc.", then the path describes how to get from the search term to the result, for example ["iphone", "apple", "company", "microsoft"].
The semantic path can be queried using the _semanticPath GraphQL property alongside the schema-defined properties. Building a semantic path is only possible if an explore: {} parameter is set, as the explore term represents the beginning of the path and each search result represents the end of the path. If no explore: {} parameter is set, the query is just a list query and not a search query, therefore no path can be built. Since explore: {} queries are currently only possible in GraphQL, the _semanticPath is not available in the REST API.
Notes
- Due to the relatively high cost of the underlying algorithm, we recommend limiting requests including a _featureProjection in high-load situations where response time matters. Avoid parallel requests including a _featureProjection, so that some threads stay available to serve other, time-critical requests. The maximum supported request limit (to be set through the limit: <int> parameter) is 25.
- The semantic path feature requires a contextionary version of at least ...-v0.4.14, which is used in the provided docker-compose files linked above.
Fixes
-
v0.22.12 Changes
June 26, 2020
Docker image/tag:
semitechnologies/weaviate:0.22.12
See also: example docker compose files in English, German, Dutch, Italian and Czech.
Breaking Changes
none
New Features
New Underscore Prop for visualization: _featureProjection (#1178, #1139)
This release adds a new optional property ("underscore prop") to list results (REST GET /v1/{kinds}, GraphQL Get {}). The feature projection is intended to reduce the dimensionality of the object's vector into something easily suitable for visualizing, such as 2d or 3d. The underlying algorithm is exchangeable; the first algorithm to be provided is t-SNE.
The feature can be used without any params and tries to pick reasonable defaults. To do so, use the include parameter on REST (GET /v1/{kinds}/?include=_featureProjection) or the _featureProjection { vector } parameter in GraphQL, which appears alongside the schema-defined properties.
Optional Parameters
To tweak the feature projection, optional parameters (currently GraphQL-only) can be provided. The values and their defaults are:
| Parameter | Type | Default | Implication |
| --- | --- | --- | --- |
| dimensions | int | 2 | Target dimensionality, usually 2 or 3 |
| algorithm | string | tsne | Algorithm to be used, currently supported: tsne |
| perplexity | int | min(5, len(results)-1) | The t-SNE perplexity value, must be smaller than n-1 where n is the number of results to be visualized |
| learningRate | int | 25 | The t-SNE learning rate |
| iterations | int | 100 | The number of iterations the t-SNE algorithm runs. Higher values lead to more stable results at the cost of a larger response time |
Limitations and Restrictions
- There is no request size limit (other than the global 10,000 items request limit) which can be used on a _featureProjection query. However, due to the O(n^2) complexity of the t-SNE algorithm, response time grows quadratically with the request size. We recommend keeping the request size at or below 100 items, as we have noticed drastic increases in response time thereafter.
- Feature Projection happens in real-time, per query. The dimensions returned have no meaning across queries.
- Currently only root elements (not resolved cross-references) are taken into consideration for the featureProjection.
- Due to the relatively high cost of the underlying algorithm, we recommend limiting requests including a _featureProjection in high-load situations where response time matters. Avoid parallel requests including a _featureProjection, so that some threads stay available to serve other, time-critical requests.
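The documented defaults and the size recommendation can be summarized in a small Python sketch. The default values are taken from the parameter table above; the helper names and the use of the pair count as a rough cost proxy for the O(n^2) behaviour are assumptions for illustration, not part of Weaviate.

```python
from typing import Dict, Union


def default_feature_projection_params(
    result_count: int,
) -> Dict[str, Union[int, str]]:
    """Defaults as listed in the parameter table (hypothetical helper)."""
    return {
        "dimensions": 2,
        "algorithm": "tsne",
        # The perplexity default depends on how many results are projected.
        "perplexity": min(5, result_count - 1),
        "learningRate": 25,
        "iterations": 100,
    }


def pairwise_comparisons(result_count: int) -> int:
    # Number of distinct result pairs; a rough proxy for quadratic cost.
    return result_count * (result_count - 1) // 2


small = pairwise_comparisons(100)
large = pairwise_comparisons(10_000)
```

Going from 100 to 10,000 results multiplies the number of pairs by roughly ten thousand, which is one way to see why staying near the recommended 100 items keeps response times manageable.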
Example
The screenshot below shows a visualization done on a subset of the 20 newsgroup dataset with the article's main category used as label. The chart was created in Python using matplotlib.pyplot's scatter feature.
Fixes
none
-
v0.22.11 Changes
June 24, 2020
Docker image/tag:
semitechnologies/weaviate:0.22.11
See also: example docker compose files in English, German, Dutch, Italian and Czech.
Breaking Changes
none
New Features
Add _nearestNeighbors underscore prop to REST and GraphQL API (#1169)
Display information about the neighboring concepts of an object in a single (GET /v1/{kind}/{id}) or list (GET /v1/{kinds}) REST response by setting the ?include=_nearestNeighbors query parameter. The same can be achieved in a GraphQL Get {} query by requesting the _nearestNeighbors{} prop alongside the schema-defined props.
Note: Displaying nearest neighbors leads to n additional searches, where n is the number of search results. Use carefully in high-load situations or scale up Weaviate accordingly.
Fixes
- Missing classification distances in meta/_classification responses (#1176)
Release 0.22.8 introduced an issue where the kNN-classification distances were sometimes not shown when the ?meta=true (deprecated) or the ?include=_classification flags were set. This release fixes this issue.