Teamscale in the Cloud - Making Teamscale Scale
Over the past years we’ve seen customers load hundreds of projects into single Teamscale installations, used by many users. And while Teamscale can handle the analysis of dozens of projects on a single machine, this often means running the hardware at maximum capacity, even on very large machines. With the release of version 4.1, Teamscale is now cloud ready. To keep up with the increasing number of users and projects handled by single Teamscale instances, we’ve been hard at work on a horizontally scalable deployment for Teamscale. It is now possible to deploy Teamscale onto many machines, making use of the extra resources to handle larger workloads.
In this blog post, I will present an architectural overview of how Teamscale can be deployed for such a setup.
Architectural Overview - Components
A distributed Teamscale deployment is made up of multiple components, which normally run in a single process for a standard installation:
- Rest API / Web Interface Nodes
- Analysis Nodes
- Scheduling Node
- Distributed Database
Each of these components, except for the scheduling node, can be scaled out to multiple machines to achieve maximum performance and reliability. The deployment and the involved communication paths are depicted in the following image:
Rest API / Web Interface Nodes
This layer is used by all external clients, i.e., the IDE clients for Eclipse, Visual Studio, and IntelliJ as well as the web interface, to access the data provided by Teamscale. It can be scaled to multiple nodes to provide proper load balancing and ensure quick responses even during peak load times.
Analysis Nodes
The analysis nodes do the heavy lifting and analysis work. These machines should be expected to be under high CPU and IO load during analysis and communicate heavily with the database. This is probably the first layer where multiple machines are useful, to keep up with the analysis load of hundreds of projects.
Scheduling Node
The scheduling node is the only component that cannot be scaled to multiple instances right now. Since this component is currently far from being the bottleneck and can easily be restarted in case of a hardware outage, there is no need to scale it to multiple machines. This might change for even larger deployments in the future.
Distributed Database
The database is under high load from both the analysis nodes and the REST API and thus has to be able to handle many (large) requests per second. We currently use Apache Cassandra as our main distributed database, a widely used and battle-tested solution.
Message Queue and Global Locking
If you inspected the image closely, you probably noticed the message queue and global locking facilities. They are used for inter-node communication, ensuring a consistent state across machines. We use Hazelcast for this purpose; it is integrated into each of the other components and runs inside the main process, so it does not need to be started separately.
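To illustrate the pattern, here is a small single-process analogy sketch in Python: a "global" lock guards shared state while an event queue notifies other parties of changes. This is only a conceptual stand-in using the standard library, not Hazelcast's actual API; in the real deployment, the lock and queue span multiple machines.

```python
import threading
from queue import Queue

# Shared state: a single-process stand-in for cluster-wide state.
state = {"analyzed_projects": 0}
state_lock = threading.Lock()   # stand-in for a distributed (global) lock
events = Queue()                # stand-in for the cluster-wide message queue

def analysis_worker(project):
    # Acquire the "global" lock before mutating shared state, then
    # publish an event so other nodes could react to the change.
    with state_lock:
        state["analyzed_projects"] += 1
    events.put(f"finished:{project}")

# Hypothetical project names, purely for illustration.
threads = [threading.Thread(target=analysis_worker, args=(p,))
           for p in ["core", "ui", "backend"]]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(state["analyzed_projects"])  # → 3
```

The same two primitives, lifted to the cluster level, are what keep all Teamscale nodes in a consistent state.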
Deployment through Docker
All components are delivered as part of our general Docker image, which is available on Docker Hub. This means the distributed deployment can easily be run on anything that supports Docker images, such as Google Cloud or a private Kubernetes cluster. Even if you do not run Teamscale on multiple machines, running it via the Docker image can be advantageous, as it makes it easy to update Teamscale when patch releases come out. For more information, have a look at the README on Docker Hub and talk to us in case of questions.
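As a starting point, a single-node setup might look like the following docker-compose sketch. The image name, tag, port, and volume path here are assumptions for illustration; check the README on Docker Hub for the authoritative values.

```yaml
# docker-compose.yml -- minimal single-node sketch (values are assumptions,
# see the README on Docker Hub for the authoritative configuration)
version: "3"
services:
  teamscale:
    image: cqse/teamscale:4.1   # hypothetical image name and tag
    ports:
      - "8080:8080"             # assumed default web/REST port
    volumes:
      - ./storage:/var/teamscale  # hypothetical persistent storage path
```

From there, the analysis and REST API services can be replicated across machines as described above.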
Do I Need a Distributed Deployment?
We are often asked whether our customers need a distributed deployment or whether they can use one big server. Realistically, a single-server installation is sufficient for most customers. Unless you are planning on having at least a few dozen big projects (i.e., multiple millions of lines of code each) and hundreds of users, it is generally much easier to maintain Teamscale as a single installation on one (big) machine, and it is advisable to stick to this as long as possible. If you are planning to grow beyond this size, let us know and we’ll be happy to assist you in setting up a proper distributed installation.
Running a distributed deployment of Teamscale allows us to scale to even bigger installations and support even larger teams on our mission to improve software quality.
Let me know what you think, and which big features we should tackle next, in the comments below!