Genie Yuan, Head of Solutions Engineering, Asia-Pacific (APAC) & Japan, Couchbase
Managing large volumes of high-velocity data brings a unique set of challenges, from the database's ability to ingest the data to the total cost of ownership of the solution as data volumes expand. Next-generation storage engines make a compelling case for storing such data at optimal cost, without hampering performance for analysis.
Today, numerous organisations run write-heavy, data-intensive applications to support a large client or customer base. To support industry-leading use cases that deliver next-level experiences, it is critical to have a mature storage engine optimised for high performance with large datasets. These use cases include IoT devices that stream varied, high-velocity information from multiple sites.
As data volumes have grown, analytics clusters must be enlarged to keep pace, which drives up costs. This is where a next-generation storage engine designed to remain highly performant – even with huge datasets that do not fit in memory – can make the difference. Such an engine excels in use cases where disk access is paramount, because it is optimised to run on limited amounts of memory even with vast datasets. A new breed of solution like this really shines with datasets that will not fit into available memory and that require maximum data compression.
The Couchbase Magma storage engine does exactly this, and it is already being used by early adopters with these specific needs. It is ideal for use cases that rely primarily on disk access, performing as well as the underlying disk subsystem allows. Magma also performs well when memory is constrained and datasets are large, requiring a minimum memory-to-data ratio of just 1%.
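To put that ratio into concrete terms, here is a minimal sketch in plain Python, using simple arithmetic rather than any Couchbase API, of the minimum memory quota a 1% memory-to-data ratio implies at a few dataset sizes. The 10% figure used for comparison is an assumption about the default Couchstore engine, included purely for illustration.

```python
def min_memory_gb(dataset_gb: float, ratio: float) -> float:
    """Minimum memory quota implied by a given memory-to-data ratio."""
    return dataset_gb * ratio

# Illustrative dataset sizes, in GB.
datasets_gb = [1_000, 10_000, 50_000]  # 1 TB, 10 TB, 50 TB

for size in datasets_gb:
    magma = min_memory_gb(size, 0.01)        # Magma: 1% memory-to-data ratio
    couchstore = min_memory_gb(size, 0.10)   # Assumed 10% ratio for the default engine
    print(f"{size / 1000:>4.0f} TB of data -> ~{magma:,.0f} GB RAM with Magma, "
          f"~{couchstore:,.0f} GB RAM at a 10% ratio")
```

Even as a rough estimate, the gap in memory footprint at the tens-of-terabytes scale is what ultimately translates into fewer nodes and a lower total cost of ownership.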
Infosys, for example, has used Couchbase Magma to significantly improve storage efficiency and, crucially, lower total cost of ownership. As a result, Infosys achieved a more than threefold reduction in the hardware and storage capacity it required, generating substantial savings for the company.
Today, write-heavy, data-intensive applications such as ad serving, the Internet of Things, messaging, and online gaming generate large amounts of data.
Unsurprisingly, this has driven up the requirements for storing and retrieving large volumes of data, and with it the need for distributed databases that can scale out horizontally by adding more nodes. With a storage engine built for efficiency, however, organisations can serve the demands of these highly scalable applications without having to add as many nodes, improving storage efficiency and, most importantly, keeping ownership costs in check.