Thursday, December 12, 2019

An in-depth review of the RavenDB Cloud

The RavenDB Cloud may be an excellent choice for developers looking for a SaaS database in the cloud.
The RavenDB team recently launched the RavenDB Cloud. Because I have used RavenDB for some time and have even blogged a little about it, I think the Raven Cloud deserves an in-depth review.

About RavenDB

I’m a big fan of RavenDB. Over the last 5 years using it in production, I haven't noticed any performance issue or blocking behaviour worth mentioning. The database itself is fast and reliable, has a lot of interesting features (like indexes, transformers, ETLs, scheduling, and an extensive, very well documented API) and provides a robust and friendly C# API.

If you want to know more about my RavenDB experience, please click here.

It's also important to remember that the Raven Cloud runs on top of RavenDB 4, which by itself imposes some breaking changes on 3.5 users. So, before going forward, let’s take a look at what RavenDB 4 brings to the table.

RavenDB 4 Enhancements

RavenDB 4 came with many welcome enhancements over RavenDB 3.5. I highlight some of my favourites below:
  • Management Studio – much faster, beautiful and intuitive. It also brings new features (including clustering support) and is more reliable and simpler to use.
  • Speed – Raven 4 is indeed much faster than 3.5. All the Raven 4 features I tested (database imports, queries, patches and custom implementations) showed significant improvements when compared to RavenDB 3.5.
  • Security – RavenDB 4 offers encrypted storage and backups. Another big shift is authentication via certificates instead of usernames/passwords.
  • New Server Dashboard – the new dashboard offers a holistic overview of the cluster, nodes and databases deployed on a cloud account.
  • Clustering – Raven 4 works in a cluster fashion instead of an instance fashion. This brings better performance, stability and consistency for applications.
  • Auto-Backups – you can set up the database to run periodic backups (incremental or not). This feature is also present on the RavenDB Cloud, helping to reduce the reliance on custom jobs and/or scripts. Backups can be encrypted and uploaded to different servers, including blob storage on AWS and Azure.
  • Ongoing Tasks – the new Manage Ongoing Tasks interface simplifies the configuration and deployment of important services such as ETL, SQL Replication and Backups.
  • RQL – RQL is the new Raven Query Language, a mix of LINQ and JavaScript. Much clearer and more intuitive than Lucene. See the Queries section for more information.

The RavenDB Cloud

With all that information, let’s now review some of the most interesting aspects of the RavenDB Cloud.

Licensing

Currently, the RavenDB Cloud offers 3 types of licenses: Free, Developer and Production with significant differences between them. Notably:
  • Only one free version per cloud account
  • The free version runs only on AWS US East 1
  • Some features (such as SQL replication) are only available on the Production version
If you would like to test it out, remember that apart from the free cloud offering, you could always use a local version of the RavenDB database.
Other aspects of the free version can be seen below (as of Dec 2019):
And what could you actually do with it? In summary, this is what a free RavenDB Cloud license offers:
  • Create a new RavenDB instance
  • Manage the RavenDB server and databases
  • Configure some aspects of the database (others were not available with the free version)
  • Import an existing database
  • Query data using a simple console tool
  • Partially test the SQL replication feature
  • Partially test the backup feature

Azure and AWS Integration

The RavenDB Cloud can also be deployed on the most popular regions of either AWS or Azure. That means the database could potentially sit in the same datacenter as your application, which would significantly reduce the latency between your services and the RavenDB Cloud.

For example, the below image shows available regions for the Azure data center upon creation of my free cluster:

Pricing

The pricing for Raven Cloud varies according to the tier and time utilization. For example, this was the estimated pricing from the Development Tier for AWS and Azure on Dec 09, 2019:

Production Pricing

Obviously, prices for production will differ from those. Please check the pricing page for up-to-date information on that.

Testing the RavenDB Cloud

With all that information, let’s now test some aspects of the RavenDB Cloud.

Creating a Raven Cloud Instance

The first step to access a Raven Cloud database instance is to create an account on RavenDB.net. You should get an email with a token to access your cloud portal, from which you’re able to create and manage your cloud cluster.

I created my RavenDB Cloud cluster as shown below by clicking Add Product and choosing a cloud provider (AWS):
Then, you will be prompted to review your request:
You will also get the current specs for that instance. Mine were:

Note that as of the creation of this post, AWS US East (N. Virginia) was the only region available for free cloud accounts. The added latency can significantly skew your tests if your application is running on a different data center, cloud provider and/or region of the world.

Deployment

With all the above information submitted, the deployment process starts. It took me around 2 minutes to have the new cluster provisioned. Once deployed, you should see it on the Raven Cloud portal:

Accessing the Instance

To access that cluster, click the Manage button, which takes us to the new Management Studio also available on RavenDB 4. New users will be required to install a new certificate. Once logged in, here’s what the new RavenDB Studio looks like:
Some of the interesting features that the new studio provides are:
  • a nice overview of our databases on the right. From there you can quickly view failed indexes, alerts and errors.
  • database telemetry including cpu, memory, storage and indexing
  • database management
  • a global overview of the database cluster

Creating Databases

Let's now create and import databases. The process to create a new database is simple. Click Databases -> New Database:

Enter the database name and some other properties and the database is quickly created. The deployment of a new database takes less than 5 seconds. An empty database uses approximately 60 MB on disk.

Costs

Unfortunately I can't provide an estimate on costs. But keep in mind that when factoring your costs it's important to consider not only the estimated price / utilization tier but also costs for:
  • networking: networking costs will vary and likely increase your costs.
  • backups: backup costs will also vary and will probably increase your costs.
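To make the point concrete, here is a minimal sketch of how I'd rough out a monthly bill. Everything in it is an assumption for illustration: the function name, the idea of an hourly instance rate, and all the figures are hypothetical and are not RavenDB prices.

```python
def estimate_monthly_cost(hourly_rate, hours, network_gb, network_rate, backup_cost):
    """Add the instance cost to the networking and backup extras."""
    instance = hourly_rate * hours          # tier price x time utilization
    networking = network_gb * network_rate  # outbound traffic, billed per GB
    return round(instance + networking + backup_cost, 2)


# Hypothetical figures only; check the pricing page for real numbers.
total = estimate_monthly_cost(
    hourly_rate=0.10, hours=730, network_gb=50, network_rate=0.09, backup_cost=1.50
)
print(total)
```

The instance price is only one of three terms, so a tier that looks cheap on paper can still produce a noticeably larger bill once traffic and backups are counted.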

Replication

Replication is configured directly through Raven Studio, either on creation or later in the settings. The database administrator sets how many nodes of the cluster they’d like to use, chooses between dynamic and manual replication, and Raven handles the rest. This is an important feature as it allows your database to stay up if one node within your cluster fails.

Importing Databases

I also tested the database import and, more importantly, whether it would be easy to migrate data between Raven 3.5 and Raven 4. Luckily the process of importing databases didn’t change much and RavenDB 4 accepts most of the imported data successfully.

This is what the import process looks like:
My Free Raven Cloud instance imported a database of 1.5 million records in just over 2 minutes and 40 seconds:
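As a quick back-of-the-envelope check, the import figures above translate into a rough throughput. This is just arithmetic on the two numbers quoted, treating the duration as exactly 2 minutes 40 seconds, so the result is a slight overestimate.

```python
records = 1_500_000
elapsed_seconds = 2 * 60 + 40  # 2 minutes 40 seconds

# Average import rate over the whole run
throughput = records / elapsed_seconds
print(f"{throughput:,.0f} records/second")
```

That works out to a little over nine thousand records per second on the free tier.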

Managing your Server

Server management is done on Studio -> Manage your Server. There, you'll have access to tools such as cluster, client configuration, logs, certificates, backups, traffic, storage, running queries and more. Below you can see what will be available for you when managing your cluster:

Database Tools

You will also find interesting database-specific resources under Settings > Manage Ongoing Tasks (some of which are beyond the scope of this article):

Backups

Backups are handled directly from Management Portal -> Database -> Settings -> Manage Ongoing Tasks tool:

On the Raven Cloud, I was pleasantly surprised to see that backups to Amazon S3 buckets and Azure Blob Storage are offered by default:
Once set up, you'll see that the automatic backup runs periodically:

Other important considerations according to the backup documentation are:
  • The Free and Production tiers are regularly and automatically backed up.
  • The mandatory backup task stores a full backup every 24 hours and an incremental backup every 30 minutes.
  • Backups created by the mandatory backup routine are stored in RavenDB's own cloud.
  • You have no direct access to these backups, but you can view and restore them using your portal's Backups tab and the Management Studio.
  • You can also define your own custom backup tasks, as you would with an on-premises RavenDB server.
  • RavenDB offers 1 GB of backup storage per product per month for free.
  • Backup storage usage is measured once a day, and you'll be charged each month based on your average daily usage.
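The billing rule in the last two items can be sketched as follows. This is my reading of it, not RavenDB's actual billing code: I'm assuming the free 1 GB is subtracted from the monthly average of the daily measurements, and the function name and usage figures are hypothetical.

```python
def billable_backup_gb(daily_usage_gb, free_gb=1.0):
    """Average the daily measurements, then subtract the free allowance."""
    average = sum(daily_usage_gb) / len(daily_usage_gb)
    return max(0.0, average - free_gb)


# Hypothetical month: 10 days at 0.5 GB, then 20 days at 2 GB
usage = [0.5] * 10 + [2.0] * 20
print(billable_backup_gb(usage))
```

Because the charge follows the average rather than the peak, a short spike in backup size costs much less than keeping large backups around all month.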

External Replication

The External Replication feature is available on Raven 4 and can be configured in the Management Studio by selecting a Database -> Settings -> Manage Ongoing Tasks -> Add Task -> External Replication.

The below screenshot details how it can be configured.

SQL Replication

The Raven Cloud also supports SQL replication (called SQL ETL in Raven 4). Unfortunately this feature is not available on the free version used for this evaluation. From what I could test, the feature didn’t change much from Raven 3.5. The results of the partial tests can be observed below.
The next step is to write a transformation script. For example:
Once all this is done, Raven will automatically replicate its records to the remote SQL database. This is excellent for reporting and, if you need it, for querying your NoSQL data from a traditional SQL database.

Scheduled Backups

Apart from automatic backups I also tested scheduled backups via the Raven Cloud (Database -> Settings -> Manage Ongoing Tasks -> Add Task). Scheduled backups are very customizable. We can specify the schedule, location and other settings:

Scaling

Being clustered by default, RavenDB 4 can be easily scaled via the Portal. The documentation describes in detail how it can be configured. Below is a screenshot provided by RavenDB on how it should work (the feature isn't available on the free tier):

Security

The RavenDB Cloud offers strong security features including:
  • Authentication: RavenDB uses X.509 certificate-based authentication. All access happens via certificates, and all traffic to the instances is encrypted using HTTPS / TLS 1.2 / X.509 certificates.
  • IP restriction: you can choose which IP addresses your server can be contacted by.
  • Database Encryption: implemented at the storage level, with XChaCha20-Poly1305 authenticated encryption using 256 bit keys.
  • Encryption at Rest: the raw data is encrypted and unreadable without possession of the secret key.
  • Encrypted Backups: your mandatory backup routines produce encrypted backup files.
For more information, please read RavenDB on the Cloud: Security.
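To illustrate what the IP restriction item means in practice, here is a small sketch of an allow-list check. It only demonstrates the idea; it is not how RavenDB implements it. I'm assuming the allowed addresses are expressed as CIDR ranges, and the function name and addresses are hypothetical (taken from the documentation-only ranges).

```python
from ipaddress import ip_address, ip_network


def is_allowed(client_ip, allowed_ranges):
    """Return True when the client address falls inside any allowed range."""
    address = ip_address(client_ip)
    return any(address in ip_network(cidr) for cidr in allowed_ranges)


# Hypothetical allow-list: one office subnet and one single build server
allowed = ["203.0.113.0/24", "198.51.100.7/32"]
print(is_allowed("203.0.113.42", allowed))
print(is_allowed("192.0.2.1", allowed))
```

The restriction works alongside certificate authentication rather than replacing it: a client from an allowed address still needs a valid certificate.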

Cluster Api

The Raven Cloud also makes a Cluster API available for managing the cluster. This allows your devops team to script cluster operations.

For example, we can dynamically add a node to a cluster by running the following PowerShell script:
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
$clientCert = Get-PfxCertificate -FilePath <path-to-pfx-cert>
Invoke-WebRequest -Method Put -Uri "http://<server-url>/admin/cluster/node?url=<node-url>&tag=<node-tag>&watcher=<is-watcher>&assignedCores=<assigned-cores>" -Certificate $clientCert

Or with the equivalent in cURL:
curl -X PUT "http://<server-url>/admin/cluster/node?url=<node-url>&tag=<node-tag>&watcher=<is-watcher>&assignedCores=<assigned-cores>" --cert <path-to-pem-cert>
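For teams that prefer to script this from Python, the same request could be assembled as in the sketch below. It reuses the endpoint and the query parameters from the commands above (url, tag, watcher, assignedCores). The function names, the host names, the lowercase true/false encoding of watcher, treating assignedCores as optional, and the use of a PEM certificate file are all my assumptions. Nothing is sent until send is called.

```python
import ssl
import urllib.request
from urllib.parse import urlencode


def build_add_node_request(server_url, node_url, tag, watcher=False, assigned_cores=None):
    """Build (but do not send) the PUT request that adds a node to the cluster."""
    params = {"url": node_url, "tag": tag, "watcher": str(watcher).lower()}
    if assigned_cores is not None:
        params["assignedCores"] = assigned_cores
    full_url = f"{server_url.rstrip('/')}/admin/cluster/node?{urlencode(params)}"
    return urllib.request.Request(full_url, method="PUT")


def send(request, pem_path):
    """Send the request, authenticating with a client certificate in PEM format."""
    context = ssl.create_default_context()
    context.load_cert_chain(pem_path)  # PEM file holding certificate and private key
    with urllib.request.urlopen(request, context=context) as response:
        return response.status


# Hypothetical hosts; replace them with your own cluster URLs.
request = build_add_node_request(
    "https://a.example.test", "https://b.example.test", tag="B", assigned_cores=2
)
print(request.get_method(), request.full_url)
```

Building the URL with urlencode avoids the quoting problems that the ampersands cause when the address is typed straight into a shell.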

Database Api

Since we’re talking devops, it’s important to note that the Cluster API only manages nodes on the cluster. By itself, that feature still isn’t sufficient to start a new working environment, since databases, indexes and data would also be required.

Querying the Cloud Database

A major change happened in Raven 4: RQL replaces Lucene as the default language for queries and patches. If you don’t run RavenDB yet, you shouldn’t be alarmed. But for folks migrating from RavenDB 3.5, this could have a significant impact and require big changes to the code.

The good news is that RQL comes with important changes and multiple improvements to the new Management Studio. The UI is now friendlier and faster, and it is simpler to query and export data.

RQL’s syntax is a mix of .NET’s LINQ and JavaScript. Overall, it’s elegant, clean and simple to use. It also makes querying and patching the Raven database simpler. However, for users currently relying on Lucene, this may represent a risk as those queries will have to be migrated to RQL (and subsequently, extensively tested).

Further Reference:
Breaking the language barrier
Querying: RQL - Raven Query Language

Running queries on the Portal

It's very simple to run queries from RavenDB Management Studio. Let's see some examples.

Example 1 – Querying a Collection

Querying a collection using RQL from RavenDB's Management Studio is straightforward. .NET developers should recognize the language since it's very similar to LINQ:
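To give a feel for the syntax, here is a small sketch that composes a basic RQL query (from, where, select) with named parameters. The helper function is hypothetical and not part of any RavenDB client, and the Users collection and its fields are made up for illustration.

```python
def build_rql(collection, where=None, select=None):
    """Compose a simple RQL query string plus its parameter dictionary."""
    query = f"from {collection}"
    params = {}
    if where:
        clauses = []
        for i, (field, value) in enumerate(where.items()):
            name = f"p{i}"
            clauses.append(f"{field} = ${name}")  # values are passed as $parameters
            params[name] = value
        query += " where " + " and ".join(clauses)
    if select:
        query += " select " + ", ".join(select)
    return query, params


# Hypothetical collection and fields
query, params = build_rql("Users", where={"Country": "Canada"}, select=["Name", "Email"])
print(query)   # from Users where Country = $p0 select Name, Email
print(params)  # {'p0': 'Canada'}
```

The generated text can be pasted into the Studio query window once the parameter is replaced with a literal value.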

Customizing Queries

The Management Studio also brings enhancements to the query viewer. Now we can more easily choose which columns should be returned in the results table:

Patching Data

The patching system also changed a little, requiring patches to be written in the simpler RQL language. For example, this is how we apply a simple patch using the new syntax:
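As an illustration of the shape of such a patch, the sketch below assembles an RQL update statement: a from clause with an alias, an optional where filter, and a JavaScript block inside update { }. The helper, the Users collection and the IsActive field are hypothetical, so treat it as a sketch and test any real patch on a copy of your data first.

```python
def build_patch(collection, alias, script, where=None):
    """Assemble an RQL patch: from <collection> as <alias> [where ...] update { ... }."""
    parts = [f"from {collection} as {alias}"]
    if where:
        parts.append(f"where {where}")
    parts.append(f"update {{ {script} }}")  # the script body is plain JavaScript
    return " ".join(parts)


# Hypothetical collection and fields
patch = build_patch("Users", "u", "u.IsActive = true;", where="u.Country = 'Canada'")
print(patch)  # from Users as u where u.Country = 'Canada' update { u.IsActive = true; }
```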

Managing the database using the C# api

Another set of use cases appears when we consider a database in the cloud. For example, can we script database creations, drops, imports and restores? The short answer is yes! Managing your Raven database using the C# API is simple and fun. I made a simple demo available on GitHub to demonstrate it. Below, you can see a sample of those operations, including the connection using the provided certificate:


Risks

With every new technology, there are risks. It’s important to understand that even if RavenDB is a mature technology (and their developers are bright!), there are risks that should be considered with this and any new platform. I highlight:
  • Costs: I wasn't able to determine the overall cost, mainly because all values provided by RavenDB are estimates. If cost is an important requirement for the migration, a more in-depth evaluation should be performed.
  • Performance: I didn't invest much time testing the performance of the Raven Cloud. The good news is that Raven 4 is much faster than 3.5 and is potentially offered on the same cloud provider / region as your application.
  • Hidden Costs: as previously said, all prices listed are estimates. It’s probable that other costs will be added to your bill at the end of the month.
  • RQL: RQL is the new way to run queries against the Raven 4 database. However, due to the number and complexity of some of our queries relying on Lucene (the old advanced way of querying the Raven database), migrating all the complex queries to RQL will be a challenge in terms of the time and testing effort required.
  • Major changes on the API: ignore this if you're new to RavenDB. But if you were using RavenDB 3.5, there are significant changes in the RavenDB 4 API. Potentially broken dependencies include business logic (if implicitly coupled to RavenDB 3.5), indexes, tests and tools. Also, Lucene queries, Map-Reduce indexes, patches and logic that contains bulk-insert operations will likely have to be upgraded.
  • NServiceBus: the Raven 4 API requires libraries that conflict with NServiceBus. So a RavenDB upgrade may also require an NServiceBus upgrade.

For Further Investigation

My short experience with the RavenDB Cloud was exciting. However, I would like to highlight other topics that could be researched in the future:
  • Full Cost Estimate – all the costs in this post are estimates and are subject to variation. Most of these estimates were provided by RavenDB on the Raven Cloud website. It’s highly probable that in a real production environment, costs will be higher. But, for what the Raven Cloud provides, I still find their prices very attractive.
  • Performance Benchmarks – I didn’t run any formal performance benchmark when testing the Raven Cloud. Still, during this exercise I noticed that both the local and the cloud versions of RavenDB 4 showed a good increase in overall performance.
  • Security – No security tests were performed, as they were outside the scope of this evaluation. My understanding is that security is way beefier on Raven 4. But how secure is it?
  • SQL Integration – The free version doesn’t support SQL replication. It’s a very important feature for those that need some sort of reporting. Probably a good reason to go to the dev/prod subscriptions.
  • Backup/Restore – The backup/restore feature wasn’t tested because the only available option for the AWS free version was S3 storage. Worth investigating if you are considering using the Raven Cloud in production. My experience with a local install of Raven 4 is that it’s reliable and super fast!
  • Smuggler – The smuggler tool is available on the Raven 4 API. I built a simple console tool to manage databases and import/export data. The source code is available here.
  • Cluster API – since the free version does not include clustering, I couldn’t test the API. However, since the Raven APIs are extensive and well documented, I don’t expect any problems with that.

Conclusion

This ends the Raven Cloud evaluation. Thanks for reading thus far!

Hopefully you enjoyed this quick look at some of the most important features of the RavenDB Cloud. I hope I answered the most common questions regarding this technology and that you now have more information to consider it going forward. RavenDB remains a very strong alternative in the NoSQL market and its cloud brings significant benefits for teams looking to reduce costs.

I dare to say that, for all it provides, RavenDB Cloud is a strong contender against MongoDB Atlas, Elasticsearch and Azure Cosmos DB.

References

RavenDB Cloud
RavenDB on the Cloud: Overview 
RavenDB on the Cloud: Security
Cluster: Cluster API
Breaking the language barrier
Querying: RQL - Raven Query Language
Raven Cloud - Pricing
List of Differences in Client API between 3.x and 4.0
Apache Lucene 

See Also

A simple introduction to RavenDB
Installing and Running RavenDB on Windows and Linux
Running RavenDB on Docker

For more posts about RavenDB, please click here.
Do you have any comments on this post? Contact me @BrunoHilden