Cassandra Documentation

You are viewing the documentation for a prerelease version.

Time Window Compaction Strategy (TWCS)

The Unified Compaction Strategy (UCS) is the recommended compaction strategy for most workloads starting with Cassandra 5.0. If you are creating new tables, use UCS.

The TimeWindowCompactionStrategy (TWCS) is the recommended compaction strategy for time-series workloads and for workloads whose data expires through a Time-To-Live (TTL). If the workload has data grouped by timestamp, TWCS can be used to group SSTables by time window, and then drop entire SSTables when they expire. Thus, disk space can be reclaimed much more reliably than when using SizeTieredCompactionStrategy (STCS) or LeveledCompactionStrategy (LCS).

TWCS compacts SSTables using a series of time window buckets. During the active time window, TWCS compacts all SSTables flushed from memory into larger SSTables using STCS. At the end of the time window, all of these SSTables are compacted into a single SSTable, and SSTables are assigned to a window based on their maximum timestamp. Once the major compaction for a time window is completed, no further compaction of that data will ever occur. The process then starts over with the SSTables written in the next time window. Each TWCS time window can contain a different amount of data.
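The bucketing step can be illustrated with a minimal Python sketch. It is not Cassandra's implementation: it assumes microsecond write timestamps and windows aligned to the Unix epoch in UTC, and the names window_start and micros are hypothetical helpers introduced only for this example.

```python
from datetime import datetime, timezone

# Seconds per unit; the unit names mirror the compaction_window_unit option.
UNIT_SECONDS = {"MINUTES": 60, "HOURS": 3600, "DAYS": 86400}


def window_start(max_timestamp_micros: int, unit: str = "DAYS", size: int = 1) -> datetime:
    """Return the UTC start of the window that holds an SSTable's maximum timestamp.

    Assumption: windows are aligned to the Unix epoch.
    """
    window_seconds = UNIT_SECONDS[unit] * size
    ts_seconds = max_timestamp_micros // 1_000_000
    return datetime.fromtimestamp(ts_seconds - ts_seconds % window_seconds, tz=timezone.utc)


def micros(dt: datetime) -> int:
    """Convert a datetime to whole microseconds since the epoch."""
    return int(dt.timestamp()) * 1_000_000


a = micros(datetime(2024, 5, 1, 3, 0, tzinfo=timezone.utc))
b = micros(datetime(2024, 5, 1, 22, 0, tzinfo=timezone.utc))
c = micros(datetime(2024, 5, 2, 1, 0, tzinfo=timezone.utc))

print(window_start(a) == window_start(b))  # True: same 1-day window
print(window_start(b) == window_start(c))  # False: c falls in the next window
```

SSTables whose maximum timestamps map to the same window start are the ones that would end up compacted together.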

TimeWindowCompactionStrategy Operational Concerns

The primary motivation for TWCS is to separate data on disk by timestamp and to allow fully expired SSTables to drop more efficiently. One potential way this optimal behavior can be subverted is if data is written to SSTables out of order, with new data and old data in the same SSTable.

Out of order data can appear in two ways:

  • If the user mixes old data and new data in the traditional write path, the data will be commingled in the memtables and flushed into the same SSTable, where it will remain commingled.

  • If the user’s read requests for old data cause read repairs that pull old data into the current memtable, that data will be commingled and flushed into the same SSTable.

While TWCS tries to minimize the impact of commingled data, users should attempt to avoid this behavior. Specifically, users should avoid queries that explicitly set the timestamp via CQL USING TIMESTAMP. Additionally, users should run frequent repairs (which stream data in such a way that it does not become commingled).
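Why commingled data matters can be seen in a simplified Python model. This is a sketch under stated assumptions, not Cassandra's expiry logic: it treats an SSTable as a list of cells with a write time and a TTL in seconds, and it ignores gc_grace_seconds, tombstones, and the check for data shadowed in other SSTables. Cell and fully_expired are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    write_time: int  # seconds since epoch (simplified for the example)
    ttl: int         # seconds


def fully_expired(sstable: list[Cell], now: int) -> bool:
    """True only when every cell in the SSTable has passed its expiry time."""
    return all(cell.write_time + cell.ttl <= now for cell in sstable)


DAY = 86400
TTL = 7 * DAY
now = 100 * DAY

# An SSTable holding only old data, and one where a recent write was mixed in.
old_only = [Cell(write_time=90 * DAY, ttl=TTL), Cell(write_time=91 * DAY, ttl=TTL)]
commingled = old_only + [Cell(write_time=99 * DAY, ttl=TTL)]

print(fully_expired(old_only, now))    # True: the whole SSTable can be dropped
print(fully_expired(commingled, now))  # False: one recent cell keeps it on disk
```

In this model a single recent cell is enough to keep the expired cells around it on disk until the newest cell also expires.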

The SizeTieredCompactionStrategy is the default compaction strategy. Any other compaction strategy must be defined in the cassandra.yaml file.

TWCS Options

Subproperty Description

enabled

Enables background compaction. Default value: true

tombstone_compaction_interval

The minimum number of seconds that must pass after an SSTable is created before Cassandra considers it for tombstone compaction. An SSTable is eligible for tombstone compaction if it exceeds the tombstone_threshold ratio. Default value: 86400

tombstone_threshold

The ratio of garbage-collectable tombstones to all contained columns. If the ratio exceeds this limit, Cassandra starts compaction on that SSTable alone, to purge the tombstones. Default value: 0.2

unchecked_tombstone_compaction

If set to true, allows Cassandra to run tombstone compaction without pre-checking which SSTables are eligible for this operation. Even without this pre-check, Cassandra checks an SSTable to make sure it is safe to drop tombstones. Default value: false

log_all

Activates advanced logging for the entire cluster. Default value: false

compaction_window_unit

Time unit used to define the bucket size. The value is based on the Java TimeUnit. For the list of valid values, see the Java API TimeUnit page located at docs.oracle.com/javase/8/docs/api/java/util/concurrent/TimeUnit.html. Default: days

compaction_window_size

Units per bucket. Default value: 1

timestamp_resolution

The timestamp resolution used when inserting data. Valid values are Java TimeUnit names such as MILLISECONDS or MICROSECONDS. Do not change this value unless you write mutations with USING TIMESTAMP (or the equivalent set directly in the client). Default: microseconds

expired_sstable_check_frequency_seconds

Determines how often, in seconds, to check for expired SSTables. Default: 600 (10 minutes)

unsafe_aggressive_sstable_expiration

Expired SSTables will be dropped without checking whether their data shadows data in other SSTables. This is a potentially risky option that can lead to data loss or deleted data reappearing, going beyond what unchecked_tombstone_compaction does for single SSTable compaction. Due to the risk, the JVM must also be started with the option: -Dcassandra.unsafe_aggressive_sstable_expiration=true. Default: false
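The interaction between tombstone_compaction_interval and tombstone_threshold described above can be sketched in Python. This is an illustration of the two documented conditions only, not Cassandra's code: the function name and its first two parameters are hypothetical, and the safety checks Cassandra performs before dropping tombstones are left out.

```python
def eligible_for_tombstone_compaction(
    sstable_age_seconds: float,
    droppable_tombstone_ratio: float,
    tombstone_compaction_interval: int = 86400,  # documented default
    tombstone_threshold: float = 0.2,            # documented default
) -> bool:
    """Apply the two documented conditions to a single SSTable."""
    # The SSTable must have existed for at least the configured interval.
    old_enough = sstable_age_seconds >= tombstone_compaction_interval
    # The share of garbage-collectable tombstones must exceed the threshold.
    enough_tombstones = droppable_tombstone_ratio > tombstone_threshold
    return old_enough and enough_tombstones


print(eligible_for_tombstone_compaction(2 * 86400, 0.3))  # True
print(eligible_for_tombstone_compaction(3600, 0.9))       # False: too young
print(eligible_for_tombstone_compaction(2 * 86400, 0.2))  # False: ratio does not exceed 0.2
```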

Taken together, the operator can specify windows of virtually any size, and TWCS will work to create a single SSTable for writes within that window. For efficiency during writing, the newest window will be compacted using STCS.

Ideally, operators should select a compaction_window_unit and compaction_window_size pair that produces approximately 20-30 windows. If writing with a 90-day TTL, for example, a 3-day window (30 windows) would be a reasonable choice, setting the options to 'compaction_window_unit':'DAYS' and 'compaction_window_size':3.
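The sizing guideline can be worked through with a short Python sketch. It lists the whole-day window sizes that give between 20 and 30 windows for a given TTL and then formats an ALTER TABLE statement with the resulting options. The helper names and the table name metrics.sensor_readings are hypothetical, and the statement is only printed, not sent to a cluster.

```python
def candidate_window_sizes(ttl_days: int, low: int = 20, high: int = 30) -> list[int]:
    """Whole-day window sizes for which ttl_days / size falls within [low, high]."""
    return [size for size in range(1, ttl_days + 1) if low <= ttl_days / size <= high]


def twcs_alter_statement(table: str, unit: str, size: int) -> str:
    """Format a CQL statement that sets TWCS with the given window options."""
    return (
        f"ALTER TABLE {table} WITH compaction = {{"
        f"'class': 'TimeWindowCompactionStrategy', "
        f"'compaction_window_unit': '{unit}', "
        f"'compaction_window_size': {size}}};"
    )


sizes = candidate_window_sizes(90)
print(sizes)  # [3, 4]: 90/3 = 30 windows, 90/4 = 22.5 windows

# metrics.sensor_readings is a made-up table name for illustration.
print(twcs_alter_statement("metrics.sensor_readings", "DAYS", sizes[0]))
```

For a 90-day TTL the sketch yields 3-day and 4-day windows, which agrees with the 3-day example in the text.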

Changing TimeWindowCompactionStrategy Options

Operators wishing to enable TWCS on existing data should consider running a major compaction first, placing all existing data into a single (old) window. Subsequent newer writes will then create typical SSTables as expected.

Operators wishing to change compaction_window_unit or compaction_window_size can do so, but the change may trigger additional compactions as adjacent windows are joined together. If the window size is decreased (for example, from 24 hours to 12 hours), then the existing SSTables will not be modified. TWCS cannot split existing SSTables into multiple windows.
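The effect of widening the window can be illustrated with a small Python sketch that groups SSTables by the window holding their maximum timestamp. It is a simplified model, not Cassandra's implementation: timestamps are plain seconds, windows are assumed to be aligned to the epoch, and the SSTable names and the bucket function are hypothetical.

```python
from collections import defaultdict

HOUR = 3600


def bucket(sstable_max_ts: dict[str, int], window_seconds: int) -> dict[int, list[str]]:
    """Group SSTable names by the start of the window holding their max timestamp."""
    buckets: dict[int, list[str]] = defaultdict(list)
    for name, max_ts in sstable_max_ts.items():
        buckets[max_ts - max_ts % window_seconds].append(name)
    return dict(buckets)


# Hypothetical SSTables, one per 12-hour window, keyed to their max timestamp.
sstables = {"a": 6 * HOUR, "b": 18 * HOUR, "c": 30 * HOUR, "d": 42 * HOUR}

print(bucket(sstables, 12 * HOUR))  # four windows with one SSTable each
print(bucket(sstables, 24 * HOUR))  # two windows with two SSTables each
```

Moving from 12-hour to 24-hour windows places adjacent SSTables in the same bucket, which is what makes them candidates for the additional compactions mentioned above. In the opposite direction each existing SSTable still maps to a single bucket as a whole, consistent with TWCS not splitting SSTables.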