This Week in Elasticsearch and Apache Lucene - June 30, 2015
Welcome to This Week in Elasticsearch and Apache Lucene! With this weekly series, we're bringing you an update on all things Elasticsearch and Apache Lucene at Elastic, including the latest on commits, releases and other learning resources.
Top News
Have an #Elasticsearch upgrade on the horizon? Get some tips on leveling up your cluster from @ryanjernst next week: https://t.co/L2vFS0YucM
— elastic (@elastic) June 25, 2015
Elasticsearch Core
- Internal: Remove XContentParser.map[Ordered]AndClose() (#11846, 2.0.0)
- Internal: Consolidate shard level abstractions (#11847, 2.0.0)
- Exceptions: Render structured exception in multi search (#11849, 2.0.0)
- Core: Allow IBM J9 2.8+ in version check (#11850, 2.0.0)
- Mappings: Remove close() from Mapper (#11863, 2.0.0)
- Translog: Mark translog as upgraded in the engine even if a legacy generation exists (#11860, 2.0.0)
- Translog: Make translog file name parsing strict (#11875, 2.0.0)
- Cluster State: Rename cluster state uuid to updateId (#11862, 2.0.0)
- Internal: Rename caches (#11893, 2.0.0)
- Docs: clarification of allocation awareness w.r.t. rack failures (#11908, 2.0.0)
- Recovery: Fix RecoveryState timestamps (#11871, 2.0.0, 1.7.0, 1.6.1)
- Cache: Store filter cache statistics at the shard level instead of index (#11886, 2.0.0)
- Translog: Don't convert possibly corrupted bytes to UTF-8 (#11911, 2.0.0)
- Testing: Add mocking for securitymanager environment (#11913, 2.0.0)
- Security Manager: Steps to remove dangerous security permissions (#11898, 2.0.0)
- Nested: Add min score mode. (#11909, 2.0.0)
- Packaging: Load plugins into classpath in bootstrap (#11918, 2.0.0)
- Testing: Fix running of rest tests against external cluster (#11906, 2.0.0)
- Snapshot/Restore: Extract all shard-level snapshot operation into dedicated SnapshotShardsService (#11756, 2.0.0)
- Cache: Give the filter cache a smaller maximum number of cached filters (#11833, 2.0.0)
- Internal: Fix FieldDataTermsFilter.equals() (#11835, 2.0.0, 1.7.0, 1.6.1)
- Search: Always return metadata in get/search APIs (#11816, 2.0.0)
- Parsing: add list parse methods to XContentParser (#10455, 2.0.0)
- Packaging: Fix endless looping if starting fails (#11836, 2.0.0, 1.7.0, 1.6.1)
- Mapping: Restrict fields with the same name in different types to have the same core settings (#11812, 2.0.0)
- Internal: Add a null-check for XContentBuilder#field for BigDecimals (#11790, 2.0.0, 1.7.0, 1.6.1)
- Highlighting: Fix exception for plain highlighter and huge terms for Lucene 4.x (#11683, 1.7.0, 1.6.1)
- Aggregations: Add cumulative sum aggregation (#11825, 2.0.0)
- Dates: Allow for backwards compatibility for unix timestamp in pre 2.x indices (#11515, 2.0.0)
- Logging: add the ability to specify an alternate logging configuration location (#10852, 2.0.0)
- Snapshot/Restore: Aborting snapshot might not abort snapshot of shards in very early stages in the snapshot process (#11839, 1.7.0, 1.6.1)
- Logging: ClusterStateObserver should log on trace on timeout (#11722, 2.0.0)
- Network: Make sure messages are fully read even in case of EOS markers (#11768, 2.0.0)
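
For readers unfamiliar with the idea behind the cumulative sum aggregation (#11825), here is a minimal plain-Python sketch of the arithmetic: a running total across ordered buckets. It does not call Elasticsearch or show its request syntax; the bucket values and variable names are made up for illustration.

```python
from itertools import accumulate

# Hypothetical per-bucket metric values, e.g. one value per month from a
# date histogram. These numbers are invented for illustration only.
monthly_sales = [550.0, 60.0, 375.0]

# Running total across buckets: each output value is the sum of the
# bucket's own value and the values of all earlier buckets.
cumulative_sales = list(accumulate(monthly_sales))

print(cumulative_sales)  # [550.0, 610.0, 985.0]
```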
Apache Lucene
- Deleted documents are now handled just like a filter, made possible by two-phase iterators
- Silly typo in GermanStemmer might cause invalid stems
- Lucene's SSD detector was not working with the latest fast NVMe based drives, affecting yours truly!
- Reduce per-field RAM cost by using simple FieldInfo array instead of TreeMap for non-sparse cases
- BigramDictionary now closes open files even on exception
- Geo3D's GeoCircle can now handle a world-globe diameter
- Nested conjunctions are now flattened
- Intersecting an automaton with the terms dictionary is a bit faster
- Forbid new Java 1.8 java.time APIs that use default Locale or Timezone
- More iterations towards a new GeoPointDistanceQuery
- Some small reductions in RAM cost for FSTs
- Geo3D will soon compute arc distance from a point to a shape
- Should Geo3D live in Lucene's sandbox module for now?
- Various places are missing try/finally blocks to close files on exception, and null checking
- IndexWriter lists all files in the directory twice on init now
- Should Lucene create a directory in the filesystem when you instantiate Directory?
- BookendFilter offers "exact" matching even on tokenized fields
- MMapDirectory's checkUnmapSupported could incorrectly return false (LUCENE-6618: https://issues.apache.org/jira/browse/LUCENE-6618)
- Move query-time boosting to a dedicated BoostQuery
- SegmentInfo.toString should confess how that segment was sorted
- Highlighter doesn't preserve each token's payload
- MockDirectoryWrapper should more carefully throw its exceptions
- Can we restrict what Similarity.coord is allowed to return?
- MappingCharFilter sometimes produces wrong offsets, but it's not clear how to fix it
- Make SynonymFilter's graph a bit more correct for multi-token synonyms
- Unordered spans should use the same definition of width as ordered spans
- An IBM J9 fix is coming for a JVM bug affecting unicode file names
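
As a rough illustration of what "nested conjunctions are now flattened" means in the list above, the sketch below rewrites a AND (b AND (c)) into a AND b AND c. It is written in plain Python rather than against Lucene's classes; Term, And, and flatten are hypothetical names, and this is not how Lucene implements the change internally.

```python
from dataclasses import dataclass


@dataclass
class Term:
    """A leaf clause matching a single term (hypothetical type)."""
    text: str


@dataclass
class And:
    """A conjunction: every clause must match (hypothetical type)."""
    clauses: list


def flatten(node):
    """Return an equivalent tree with nested And nodes merged into their parent."""
    if isinstance(node, Term):
        return node
    flat = []
    for clause in node.clauses:
        child = flatten(clause)
        if isinstance(child, And):
            # AND is associative, so the child's clauses can be hoisted.
            flat.extend(child.clauses)
        else:
            flat.append(child)
    return And(flat)


nested = And([Term("a"), And([Term("b"), And([Term("c")])])])
print(flatten(nested))
```

A flat conjunction lets all clauses be considered together instead of one nesting level at a time, which is the general motivation for this kind of rewrite.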
Watch This Space
Stay tuned to this blog, where we'll share more on the whole ELK ecosystem, including news, learning resources and cool use cases!