An Introduction to CockroachDB as a Promising NewSQL Database

CockroachDB is an increasingly popular database in the NewSQL paradigm: a distributed SQL database built on a consistent, transactional key-value store. Its main advantage is that it scales horizontally and survives machine, disk, rack, and even data center failures with minimal latency disruption and no manual intervention.

CockroachDB also supports strongly consistent ACID transactions and offers a familiar SQL API for structuring, manipulating, and querying data. It is largely inspired by Google’s Spanner, and its source code is publicly available. This article uses an FAQ format to answer the most frequently asked questions about the CockroachDB NewSQL database.

Why is CockroachDB an ideal choice?

CockroachDB is well suited to modern applications that demand reliability, availability, data integrity, and quick response times regardless of scale. It is built to replicate, rebalance, and recover automatically with minimal configuration and operational overhead. Specific use cases where CockroachDB fits well are:

  • Multi-datacenter and multi-region database deployments
  • Replicated or distributed OLTP
  • Cloud database migrations
  • Cloud-native infrastructure initiatives

As for speed, CockroachDB can return single-row reads in less than 2 ms and single-row writes in less than 4 ms. It also supports a wide range of standard SQL and operational practices for optimizing query performance. However, CockroachDB is not yet ideal for heavy analytics (OLAP) workloads.

Is CockroachDB easy to install?

Installing CockroachDB is as easy as downloading a single binary or running a Docker image or Kubernetes configuration. Other install methods include running a Homebrew recipe on macOS and building from source on Linux or macOS.

What is the scalability of CockroachDB?

As noted in the introduction, CockroachDB scales horizontally with limited operational overhead. You can run it on your home computer, on a single server, or on an enterprise development cluster, and it runs on public or private clouds with the same scalability. Expanding capacity is as simple as pointing a new node at a running cluster.

At the key-value level, CockroachDB starts with a single empty range. As you add data, that range grows until it reaches a size threshold (512 MiB by default). At that point, the data splits into two ranges, each covering a contiguous segment of the key-value space. The process continues as data grows: existing ranges keep splitting so that every range stays relatively small and consistent in size.
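
The splitting behaviour can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not CockroachDB's actual split logic: the function name `split_ranges`, the "cut at the middle key" rule, and the sample key sizes are all hypothetical.

```python
RANGE_MAX_BYTES = 512 * 1024 * 1024  # 512 MiB, the default threshold mentioned above


def split_ranges(key_sizes, max_bytes=RANGE_MAX_BYTES):
    """Toy model: start with one range and split any oversized range in two.

    key_sizes: list of (key, size_in_bytes) pairs, sorted by key.
    Returns a list of ranges; each range is a contiguous slice of key_sizes.
    Assumption: a range is cut at its middle key, which is a simplification.
    """
    ranges = [list(key_sizes)]  # a single range covering the whole key space
    i = 0
    while i < len(ranges):
        rng = ranges[i]
        total = sum(size for _, size in rng)
        if total > max_bytes and len(rng) > 1:
            mid = len(rng) // 2
            # Replace the oversized range with two contiguous halves,
            # then re-check the left half on the next iteration.
            ranges[i:i + 1] = [rng[:mid], rng[mid:]]
        else:
            i += 1
    return ranges


# Hypothetical data: ten keys of 100 MiB each (1000 MiB in total).
MIB = 1024 * 1024
sample = [(f"key{n:02d}", 100 * MIB) for n in range(10)]
for rng in split_ranges(sample):
    print(rng[0][0], "to", rng[-1][0], sum(size for _, size in rng) // MIB, "MiB")
```

With no data at all, the model returns one empty range, matching the starting state described above.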

If you do not want to take on the overhead of database administration in-house at your startup or MSME, it is advisable to explore RemoteDBA's cost-effective database consulting offerings.

How effectively can CockroachDB survive failures?

As a modern NewSQL database, CockroachDB is built to survive the hardware and software failures you would expect, from server restarts to complete data center outages. It does so without the confusing artifacts typical of other distributed systems, by using strongly consistent replication and automated repair after failures.

CockroachDB replicates data for availability and guarantees consistency between replicas using the Raft consensus algorithm, a popular alternative to Paxos. You can define where replicas are placed in various ways, depending on the types of failures you want to protect against and your network topology.
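
Because Raft needs a majority of replicas to agree, the number of simultaneous failures a replicated range can tolerate follows from simple arithmetic. The helper names below are hypothetical and exist only for illustration.

```python
def quorum(replicas):
    """Smallest majority of `replicas` voters, as majority-based consensus requires."""
    return replicas // 2 + 1


def tolerated_failures(replicas):
    """How many replicas can be lost while a majority is still reachable."""
    return replicas - quorum(replicas)


for n in (3, 5, 7):
    print(f"{n} replicas: quorum {quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

This is why replica counts are usually odd: going from 3 to 4 replicas raises the quorum from 2 to 3 without tolerating any additional failure.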

Spreading servers across datacenters in different geographic regions helps CockroachDB tolerate failures. However, the round-trip latency between client and server locations usually affects the database experience. It is therefore important to consider the latency requirements of each table and choose the most appropriate topology for locating its data, so that performance and resilience are both optimized.

How consistent is CockroachDB?

CockroachDB provides serializable SQL transactions, the highest isolation level defined by the SQL standard. It achieves this by combining the Raft consensus algorithm (an alternative to Paxos) for writes with a time-based synchronization algorithm for reads. Stored data is versioned with MVCC (multi-version concurrency control), so a read limits its scope to the data visible at its transaction's timestamp.
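
The idea of versioned reads can be sketched in a few lines. This is a generic, in-memory illustration of MVCC and not CockroachDB's storage engine; the class `MVCCStore` and the use of plain integers as timestamps are assumptions made for the example.

```python
class MVCCStore:
    """Toy multi-version store: every write keeps its timestamp."""

    def __init__(self):
        self._versions = {}  # key -> list of (timestamp, value), oldest first

    def put(self, key, value, ts):
        versions = self._versions.setdefault(key, [])
        versions.append((ts, value))
        versions.sort(key=lambda version: version[0])  # keep oldest first

    def get(self, key, read_ts):
        """Return the newest value written at or before read_ts, else None."""
        latest = None
        for ts, value in self._versions.get(key, []):
            if ts > read_ts:
                break  # later versions are invisible to this reader
            latest = value
        return latest


store = MVCCStore()
store.put("balance", 100, ts=10)
store.put("balance", 80, ts=20)
print(store.get("balance", read_ts=15))  # reader at ts=15 sees the ts=10 version
print(store.get("balance", read_ts=25))  # reader at ts=25 sees the ts=20 version
```

A reader at timestamp 15 is unaffected by the write at timestamp 20, which is what lets reads proceed against a stable snapshot.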

To ensure that a write does not interfere with concurrent reads, CockroachDB maintains a timestamp cache that remembers when data was last read by ongoing transactions. This approach gives users serializable consistency across concurrent transactions.
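
A minimal sketch of the timestamp-cache idea follows. It assumes integer logical timestamps and a rule that a write landing at or below a key's latest read timestamp is moved just above it; the class and method names are hypothetical, and the real mechanism is considerably more involved.

```python
class TimestampCache:
    """Toy cache of the latest read timestamp seen for each key."""

    def __init__(self):
        self._last_read = {}  # key -> highest timestamp at which it was read

    def record_read(self, key, ts):
        self._last_read[key] = max(ts, self._last_read.get(key, 0))

    def write_timestamp(self, key, proposed_ts):
        """Return a timestamp for the write that is later than any recorded read.

        Assumption: a conflicting write is pushed to last_read + 1 rather than
        rejected, so it can never change what an earlier reader already saw.
        """
        last_read = self._last_read.get(key, 0)
        if proposed_ts > last_read:
            return proposed_ts
        return last_read + 1


cache = TimestampCache()
cache.record_read("balance", ts=10)
print(cache.write_timestamp("balance", proposed_ts=5))   # pushed past the read at ts=10
print(cache.write_timestamp("balance", proposed_ts=12))  # already later, left unchanged
```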

The CAP (Consistency, Availability, and Partition Tolerance) theorem states that a distributed system cannot provide more than two of these three guarantees simultaneously. Given this limitation, CockroachDB is a CP (consistent and partition-tolerant) system. CockroachDB is also highly available, although ‘availability’ here means something different from its definition in the CAP theorem. In the CAP theorem, availability is a binary property; for High Availability, CockroachDB treats it as a spectrum.

Why CockroachDB SQL for enterprise applications?

At its core, CockroachDB is a distributed, consistent (CP), highly available, transactional database, but its external API is standard SQL. Developers will find familiar relational concepts such as schemas, tables, columns, and indexes, and can structure, manipulate, and query data using well-known SQL. CockroachDB also supports the PostgreSQL wire protocol, so your enterprise application can talk to CockroachDB simply by using the PostgreSQL driver for its language.
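
As an illustration, a Python application could connect through an ordinary PostgreSQL driver. The sketch below builds a connection URL using CockroachDB's default SQL port (26257), default database (`defaultdb`), and `root` user. The helper names and the host are hypothetical, `psycopg2` is a third-party driver that must be installed separately, and `fetch_one` needs a reachable cluster, so it is defined here but not called.

```python
from urllib.parse import quote, urlencode


def cockroach_dsn(host, database="defaultdb", user="root",
                  port=26257, sslmode="verify-full"):
    """Build a PostgreSQL-style connection URL for a CockroachDB node."""
    query = urlencode({"sslmode": sslmode})
    return f"postgresql://{quote(user)}@{host}:{port}/{quote(database)}?{query}"


def fetch_one(dsn, sql):
    """Run one query through psycopg2 and return the first row."""
    import psycopg2  # imported lazily so the rest of the sketch runs without it

    with psycopg2.connect(dsn) as conn:
        with conn.cursor() as cur:
            cur.execute(sql)
            return cur.fetchone()


# "localhost" is a placeholder host for a local test node.
print(cockroach_dsn("localhost", sslmode="disable"))
```

Against a running node, `fetch_one(cockroach_dsn("localhost", sslmode="disable"), "SELECT 1")` would exercise the same code path as any PostgreSQL client.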

Do CockroachDB transactions comply with ACID semantics?

Every CockroachDB transaction guarantees ACID semantics across arbitrary rows and tables, even when the data is distributed. For atomicity, every transaction is “all or nothing.” For consistency, SQL operations never expose intermediate states; the database moves from one valid state to another. For isolation, CockroachDB transactions implement the strongest ANSI isolation level, serializable. For durability, each acknowledged write is persisted on a majority of replicas by default through the Raft consensus algorithm.
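
The “all or nothing” property can be demonstrated with a small transfer between two accounts. To keep the example runnable without a cluster, it uses Python's built-in `sqlite3` module as a stand-in for CockroachDB; the table, the helper names, and the balances are hypothetical, and only the SQL-level behaviour is being illustrated.

```python
import sqlite3


def open_bank():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates apply or neither does."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
        (balance,) = conn.execute("SELECT balance FROM accounts WHERE id = ?", (src,)).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")  # triggers the rollback


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


conn = open_bank()
transfer(conn, 1, 2, 30)
print(balances(conn))  # the successful transfer is visible
try:
    transfer(conn, 1, 2, 500)
except ValueError:
    pass
print(balances(conn))  # unchanged: the failed transfer left no partial update
```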

This is just a starter FAQ for those who are new to CockroachDB. As a comparatively recent addition to the NewSQL family, this database is continually improving, and we can expect more features and functionality from it in the coming years.