Multitenancy and Google App Engine (GAE) Java

In a multi-tenant application, a single instance serves more than one organization while keeping each tenant's data and configuration virtually isolated from the other tenants. Because the hardware, operating system, and in some cases the application code are the same for all tenants, the application is easier to maintain and monitor, and incremental changes can be made based on aggregate data from tenants. It also provides economies of scale, so services can be offered to tenants at lower cost. The multitenancy principle is not new; it has been around for at least a few decades. With the emergence of cloud computing, however, multi-tenant architecture is gaining more ground among cloud applications.

When you create an account on a storage service like Dropbox, you are assigned 2GB of space out of the petabytes of storage the service has. Your 2GB is exclusively yours, and no one else can access it unless you share it. But the space assigned to each user potentially sits on the same physical storage device or set of devices. So you are one of the tenants of Dropbox's multi-tenant storage system!

Levels of Multitenancy

Hardware and Resource multitenancy

This is the simplest form of multitenancy, and it has existed for quite some time. When you get storage on Amazon S3 for your application, you are getting space on shared storage, and you are one of many tenants of that infrastructure. Most IaaS consumers are tenants of the infrastructure they use.

Data multitenancy

Data multitenancy is achieved at the datastore level. The degree of isolation may vary with an organization's security and compliance constraints and its purpose. Organizations that want security and complete isolation of data might choose an isolated database setup of their own. Other variations include one schema per organization within the same database, or a separate set of tables for each organization. A detailed treatment of this subject can be found here, but it is not the main focus of our discussion. With data multitenancy, the application code that accesses the data is the same for every tenant. The application might be hosted on multiple servers for scaling, but the UI screens, logic, and customizations remain identical across tenants; only the data differs.
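
As a rough illustration of the "separate set of tables for each organization" variation, the sketch below keeps one in-memory list per tenant. The class and method names (`TenantTables`, `addCustomer`, `customersOf`) are hypothetical and stand in for a real database; the point is only that the access code is identical for every tenant and the tenant identifier decides which table is touched.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a database that keeps one "customers"
// table per tenant. The access code is the same for every tenant.
class TenantTables {
    private final Map<String, List<String>> customersByTenant = new HashMap<>();

    // Adds a row to the given tenant's own table, creating the table on first use.
    void addCustomer(String tenantId, String customerName) {
        customersByTenant.computeIfAbsent(tenantId, id -> new ArrayList<>()).add(customerName);
    }

    // Returns only the given tenant's rows; other tenants' tables are never read.
    List<String> customersOf(String tenantId) {
        List<String> rows = customersByTenant.get(tenantId);
        return rows == null ? Collections.emptyList() : Collections.unmodifiableList(rows);
    }
}
```

Two tenants that each add a customer get back only their own rows from `customersOf`, and a tenant with no data gets an empty list.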

Application multitenancy

Multitenancy woven throughout an application is the most complex type and the hardest to achieve. How can each tenant have different logic and screens that access different data while still sharing common resources? Let's look at the high-level architecture of such a system:

A complete multi-tenant system achieves multitenancy at the data level in a way similar to the data multitenancy discussed earlier. At the application level, a run-time engine combines tenant-specific metadata and customization data with kernel code to produce a tenant-specific application. The same logic might be applied at the data level when each tenant has a different kind of schema or objects. With object databases, it is far easier to have different schemas or objects for each tenant, or even different attributes on the same object, to meet the needs of different tenants. The complexity of a fully woven multi-tenant application comes from applying tenant filters throughout the application while still delivering on the promise of scale and speed. Mechanisms such as metadata caching are used to achieve that. The best practical example of multitenancy woven through an application is Salesforce.
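
A minimal sketch of the run-time engine idea, assuming tenant customizations can be expressed as key/value metadata layered over shared defaults. `TenantRuntime`, `settingsFor`, and the setting names used with it are hypothetical and not taken from any particular platform; the per-tenant cache illustrates the kind of metadata caching described in the paragraph above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical run-time engine: merges shared "kernel" defaults with
// tenant-specific metadata and caches the merged result per tenant.
class TenantRuntime {
    private final Map<String, String> kernelDefaults;
    private final Map<String, Map<String, String>> tenantMetadata;
    private final Map<String, Map<String, String>> cache = new ConcurrentHashMap<>();
    final AtomicInteger merges = new AtomicInteger(); // counts cache misses

    TenantRuntime(Map<String, String> kernelDefaults,
                  Map<String, Map<String, String>> tenantMetadata) {
        this.kernelDefaults = kernelDefaults;
        this.tenantMetadata = tenantMetadata;
    }

    // Returns the effective settings for a tenant, merging only on first access.
    Map<String, String> settingsFor(String tenantId) {
        return cache.computeIfAbsent(tenantId, id -> {
            merges.incrementAndGet();
            Map<String, String> merged = new HashMap<>(kernelDefaults);
            // Tenant entries override the kernel defaults with the same key.
            merged.putAll(tenantMetadata.getOrDefault(id, new HashMap<>()));
            return merged;
        });
    }
}
```

A tenant with custom metadata sees its overrides plus any untouched defaults, a tenant without metadata sees the defaults only, and repeated lookups for the same tenant are served from the cache.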

Multitenancy in Google App Engine

GAE currently provides the Namespace API to achieve multitenancy at the data level. The Namespace API is available for the Datastore, Task Queues, and Memcache. The Blobstore API does not support namespaces yet, so binary assets have to be compartmentalized by a mechanism designed into the application itself. The Namespace API also supports Google Apps domains, so if your tenants are going to be Google Apps users, you can set their domain name as the namespace for their data. There are various strategies for choosing a namespace, for example per user or per Google Apps domain. The namespace can be set before a request enters the application by using filters configured in the deployment descriptor. Within a request, the namespace can also be switched to another namespace to fetch common data, and then unset or set back to the previous namespace.
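
Because binary assets are not partitioned by the Namespace API, the application needs its own mechanism. One possible approach, sketched here under assumptions, is an index keyed by namespace plus asset name that maps to an opaque blob key. `TenantBlobIndex` and its methods are hypothetical, blob keys are treated as plain strings, and no App Engine classes are used, so this only models the bookkeeping.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical application-level index that scopes binary assets by tenant.
// Blob keys are treated as opaque strings; no App Engine classes are used.
class TenantBlobIndex {
    private final Map<String, String> blobKeyByScopedName = new HashMap<>();

    // Builds the lookup key from the tenant's namespace and the asset name.
    private static String scoped(String namespace, String assetName) {
        return namespace + "/" + assetName;
    }

    // Records which blob holds the given tenant's asset.
    void register(String namespace, String assetName, String blobKey) {
        blobKeyByScopedName.put(scoped(namespace, assetName), blobKey);
    }

    // Returns the blob key for this tenant's asset, or null if the tenant has none.
    String lookup(String namespace, String assetName) {
        return blobKeyByScopedName.get(scoped(namespace, assetName));
    }
}
```

In a real application the index could itself be stored in the namespaced Datastore, so that the mapping from asset name to blob key is partitioned along with the rest of the tenant's data.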

Sample use cases

Setting the namespace to the current Google Apps domain: this splits the data by Google Apps domain. It can be done in a filter, so that every request entering the application is namespace-aware. The following piece of code checks whether the namespace is null and, if so, sets it to the current Google Apps domain.

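[sourcecode language="java"]
// Set the namespace only if one has not already been set for this request.
if (NamespaceManager.get() == null) {
    NamespaceManager.set(NamespaceManager.getGoogleAppsNamespace());
}
[/sourcecode]

If the data is split for each user, then the user ID can be used as the namespace:

[sourcecode language="java"]
// Assumes a user is signed in; getCurrentUser() returns null otherwise.
NamespaceManager.set(UserServiceFactory.getUserService().getCurrentUser().getUserId());
[/sourcecode]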
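
The choice between a domain-level and a user-level namespace can be isolated in one small helper. The sketch below is plain Java and does not call the App Engine SDK: `NamespaceResolver` and `resolve` are hypothetical names, and the caller is assumed to pass in the Google Apps domain and user ID obtained elsewhere. The validation uses the pattern `[0-9A-Za-z._-]{0,100}`, which is the format App Engine documents for namespace strings.

```java
import java.util.regex.Pattern;

// Hypothetical helper that picks a namespace string for a request.
// It prefers the Google Apps domain and falls back to the user ID.
class NamespaceResolver {
    // Format App Engine documents for valid namespace strings.
    private static final Pattern VALID = Pattern.compile("[0-9A-Za-z._-]{0,100}");

    static String resolve(String appsDomain, String userId) {
        String candidate = (appsDomain != null && !appsDomain.isEmpty()) ? appsDomain : userId;
        // Reject a missing value or one containing characters outside the pattern.
        if (candidate == null || !VALID.matcher(candidate).matches()) {
            throw new IllegalArgumentException("Not a valid namespace: " + candidate);
        }
        return candidate;
    }
}
```

The returned string is what a filter would then pass to `NamespaceManager.set`. Rejecting invalid values early keeps a malformed domain or ID from reaching the datastore layer.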

A common namespace can be shared across tenants to hold data that is the same for all of them. For example, the zip codes of all US cities can be retrieved from a common namespace:

[sourcecode language="java"]
// Remember the tenant's namespace, switch to the shared one, then switch back.
String currentNs = NamespaceManager.get();
NamespaceManager.set("COMMON_NS");
// Get data which is needed from common Namespace
NamespaceManager.set(currentNs);
[/sourcecode]
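
If the code that reads the common data throws an exception, the tenant's namespace should still be restored, otherwise the rest of the request would run against the wrong partition. One way to guarantee that is a small wrapper with `try`/`finally`. The sketch below uses a hypothetical `CurrentNamespace` class backed by a `ThreadLocal` as a stand-in for `NamespaceManager`, so that it runs without the App Engine SDK; `runIn` is likewise an illustrative name.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for NamespaceManager so the sketch runs without the SDK:
// it stores the current namespace per thread.
class CurrentNamespace {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    static String get() {
        return CURRENT.get();
    }

    static void set(String namespace) {
        CURRENT.set(namespace);
    }

    // Runs the work in the given namespace and always restores the previous one,
    // even when the work throws.
    static <T> T runIn(String namespace, Supplier<T> work) {
        String previous = get();
        set(namespace);
        try {
            return work.get();
        } finally {
            set(previous);
        }
    }
}
```

With the real API, the same shape would wrap `NamespaceManager.get()` and `NamespaceManager.set(...)` around the common-data lookup.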

Separate namespaces can be used to separate data between dev, QA, and staging environments, and GAE versions can be used to host the different stages of the codebase for the respective environments.
The codebase of each module of an application can be deployed as a different version, such as mail.APP_NAME.appspot.com, and data specific to that module can be stored in its own namespace.
Common data can be stored in a common namespace, and namespaces can be switched when needed.

Frequently Asked Questions (FAQs) about Google App Engine (GAE) and Multitenancy

What is the architecture of Google App Engine (GAE)?

Google App Engine (GAE) is built on a sandbox model which provides a secure and isolated environment for your applications. It uses a distributed architecture, meaning that your application is run on multiple servers simultaneously. This ensures high availability and redundancy. The architecture is designed to scale automatically in response to the incoming traffic. It also includes services for data storage, caching, and other tasks, so you can focus on writing your application code without worrying about server management or infrastructure.

What are the key features of GAE?

GAE offers several key features including automatic scaling, built-in services and APIs, no server management, and support for several programming languages. Automatic scaling allows your application to handle increased traffic without any manual intervention. Built-in services and APIs provide functionalities like data storage, user authentication, caching, and more. With GAE, you don’t need to worry about server management or infrastructure, as Google handles these aspects. It also supports several programming languages including Java, Python, PHP, and Go.

What are the advantages of using GAE?

GAE offers several advantages such as easy scalability, cost-effectiveness, and robust security. Its automatic scaling feature allows your application to handle increased traffic seamlessly. It’s cost-effective as you only pay for the resources you use. Moreover, GAE is built on Google’s secure and reliable infrastructure, ensuring robust security for your applications.

Are there any limitations of GAE?

While GAE offers several benefits, it does have some limitations. For instance, it supports a limited set of programming languages. Also, since it’s a fully managed platform, you have less control over the underlying infrastructure. Additionally, there might be some restrictions on the usage of certain APIs and services.

How does multitenancy work in GAE?

Multitenancy in GAE allows you to serve multiple tenants (users or groups of users) from a single instance of your application. This is achieved by using the Namespace API, which allows you to partition data across tenants. Data stored under one namespace is not returned by queries made under another, so each tenant sees only its own data as long as the application sets the namespace correctly.

What is the role of Java in GAE?

Java is one of the supported programming languages in GAE. You can build and deploy Java applications on GAE, leveraging its automatic scaling, built-in services, and robust security. GAE also supports the Java Servlet standard and provides a Java SDK for developing applications.

How does GAE handle data storage?

GAE provides several options for data storage, including Google Cloud Datastore, Google Cloud SQL, and Google Cloud Storage. These services offer scalable, reliable, and durable storage options for your applications. You can choose the one that best fits your application’s needs.

What is the runtime environment in GAE?

The runtime environment in GAE is the environment in which your application runs. It includes the programming language, the operating system, and the server software. GAE provides several pre-configured runtime environments, including Java, Python, PHP, and Go.

How secure is GAE?

GAE is built on Google’s secure and reliable infrastructure. It provides several security features including secure data storage, user authentication, and access control. Moreover, Google’s security team continuously monitors the infrastructure to detect and prevent security threats.

How does GAE compare to other cloud platforms?

GAE offers several unique features that set it apart from other cloud platforms. These include automatic scaling, built-in services and APIs, no server management, and robust security. However, it also has some limitations, such as support for a limited set of programming languages and less control over the underlying infrastructure. Therefore, the choice between GAE and other cloud platforms depends on your specific needs and requirements.