Introduction to the Google Professional Cloud Architect Exam
Professional Cloud Architect Certification Exam objectives covered in this chapter include the following:
Section 1: Designing and planning a cloud solution architecture
- ✓ 1.1 Designing a solution infrastructure that meets business requirements. Considerations include:
- Business use cases and product strategy
- Cost optimization
- Supporting the application design
- Integration
- Movement of data
- Tradeoffs
- Build, buy, or modify
- Success measurements (e.g., Key Performance Indicators (KPI), Return on Investment (ROI), metrics)
- Compliance and observability
This Study Guide is designed to help you acquire the technical knowledge and analytical skills you will need to pass the Google Cloud Professional Architect certification exam. The exam evaluates your ability to assess business requirements, identify technical requirements, map those requirements to solutions using Google Cloud products, and then monitor and maintain those solutions. This breadth of topics alone is enough to make it a challenging exam. Add to that the need for “soft” skills, such as working with colleagues to understand their business requirements, and you have an exam that is difficult to pass.
The Google Cloud Professional Architect exam is not a body-of-knowledge exam. You can know the Google Cloud product documentation in detail, memorize most of what you read in this guide, and complete multiple online courses, but that alone will not guarantee that you pass. Rather, you will be required to exercise judgment. You will have to understand how business requirements constrain your options for choosing a technical solution, and you will be asked the kinds of questions a business sponsor might ask about implementing their project.
This chapter will review the following:
- Exam objectives
- Scope of the exam
- Case studies written by Google and used as the basis for some exam questions
- Additional resources to help in your exam preparation
Exam Objectives
The Google Cloud Professional Cloud Architect exam will test your architect skills, including the following:
- Planning a cloud solution
- Managing a cloud solution
- Securing systems and processes
- Complying with government and industry regulations
- Understanding technical requirements and business considerations
- Maintaining solutions deployed to production, including monitoring
It is clear from the exam objectives that the full lifecycle of solution development is covered from inception and planning through monitoring and maintenance.
Analyzing Business Requirements
An architect starts the planning phase by collecting information, starting with business requirements. You might be tempted to start with technical details about the current solution. You might want to ask technical questions so that you can start eliminating options. You may even think that you’ve solved this kind of problem before and you just have to pick the right architecture pattern. Resist those inclinations if you have them. All architecture design decisions are made in the context of business requirements.
Business requirements define the operational landscape in which you will develop a solution. Example business requirements are as follows:
- Reducing capital expenditures
- Accelerating the pace of software development
- Reporting on service-level objectives
- Reducing time to recover from an incident
- Improving compliance with industry regulations
Business requirements may be about costs, customer experience, or operational improvements. A common trait of business requirements is that they are rarely satisfied by a single technical decision.
Reducing Operational Expenses
Reducing operational expenses may be satisfied by a combination of managed services, preemptible virtual machines, and the use of autoscalers.
Managed services reduce the workload on systems administrators and DevOps engineers because they eliminate some of the work required when managing your own implementation of a platform. A database administrator, for example, would not have to spend time performing backups or patching operating systems if they used Cloud SQL instead of running a database on Compute Engine instances or in their own data center.
Preemptible VMs are low-cost instances that can run up to 24 hours before being preempted, or shut down. They are a good option for batch processing and other tasks that are easily recovered and restarted.
Autoscaling enables engineers to deploy an adequate number of instances needed to meet the load on a system. When demand is high, instances are increased, and when demand is low, the number of instances is reduced. With autoscaling, organizations can stop purchasing infrastructure adequate to meet peak capacity and can instead adjust their infrastructure to meet the immediate need.
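To make the cost argument concrete, here is a quick back-of-the-envelope comparison. The hourly rates below are hypothetical placeholders, not current GCP prices; the point is the shape of the calculation, not the numbers.

```python
# Illustrative cost comparison between on-demand and preemptible VMs.
# Both hourly rates are hypothetical, not actual GCP list prices.
ON_DEMAND_HOURLY = 0.0475    # assumed on-demand rate in $/hour
PREEMPTIBLE_HOURLY = 0.0100  # assumed preemptible rate in $/hour

def monthly_cost(hourly_rate: float, instances: int, hours: float = 730) -> float:
    """Cost of running `instances` VMs for a month (~730 hours)."""
    return hourly_rate * instances * hours

batch_fleet = 20  # instances dedicated to restartable batch work

on_demand = monthly_cost(ON_DEMAND_HOURLY, batch_fleet)
preemptible = monthly_cost(PREEMPTIBLE_HOURLY, batch_fleet)
savings = on_demand - preemptible

print(f"On-demand:   ${on_demand:,.2f}/month")
print(f"Preemptible: ${preemptible:,.2f}/month")
print(f"Savings:     ${savings:,.2f} ({savings / on_demand:.0%})")
```

The same arithmetic extends to autoscaling: instead of a fixed fleet size, multiply each hour’s rate by the number of instances actually running during that hour.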
Accelerating the Pace of Development
Successful businesses are constantly innovating. Agile software development practices are designed to support rapid development, testing, deployment, and feedback.
A business that wants to accelerate the pace of development may turn to managed services to reduce the operational workload on its operations teams. Managed services also allow engineers to implement capabilities, such as image processing and natural language processing, that they could not build themselves without domain expertise on the team.
Continuous integration and continuous deployment are additional practices within software development. The idea is that it’s best to integrate small amounts of new code frequently so that it can be tested and deployed rather than trying to release a large number of changes at one time. Small releases are easier to review and debug. They also allow developers to get feedback from colleagues and customers about features, performance, and other factors.
As an architect, you may have to work with monolithic applications that are difficult to update in small increments. In that case, there may be an implied business requirement to consider decomposing the monolithic application into a microservice architecture. If there is an interest in migrating to a microservices architecture, then you will need to decide if you should migrate the existing application into the cloud as is, known as lift and shift, or you should begin transforming the application during the cloud migration.
There is no way to make this decision without considering business requirements. If the business needs to move to the cloud as fast as possible, whether to avoid a large capital expenditure on new equipment or to avoid committing to a long-term lease in a colocation data center, or if the organization wants to minimize change during the migration, then lift and shift is the better choice. Most important, you have to assess whether the application can run in the cloud with minimal modification; otherwise, a lift-and-shift migration is not possible.
If the monolithic application is dependent on deprecated components and written in a language that is no longer supported in your company, then rewriting the application or using a third-party application is a reasonable choice.
Reporting on Service-Level Objectives
The operational groups of a modern business depend on IT applications. A finance department needs access to accounting systems. A logistics analyst needs access to data about how well the fleet of delivery vehicles is performing. The sales team constantly queries and updates the customer management system. Different business units will have different business requirements around the availability of applications and services.
A finance department may only need access to accounting systems during business hours. In that case, upgrades and other maintenance can happen during off-hours and would not require the accounting system to be available during that time. The customer management system, however, is typically used 24 hours a day, every day. The sales team expects the application to be available all the time. This means that support engineers need to find ways to update and patch the customer management system while minimizing downtime.
Requirements about availability are formalized in service-level objectives (SLOs). SLOs can be defined in terms of availability, such as being available 99.9 percent of the time. A database system may have SLOs around durability, the ability to retrieve stored data. For example, the human resources department may have to store personnel data reliably for seven years, and the storage system must guarantee that there is less than a 1 in 10 billion chance of an object being lost. Interactive systems have performance-related SLOs. A web application SLO may require an average page load time of 2 seconds with a 95th percentile of 4 seconds.
Logging and monitoring data are used to demonstrate compliance with SLOs. Stackdriver Logging collects information about significant events, such as a disk running out of space. Monitoring services collect metrics from infrastructure, such as average CPU utilization over a period of time or the number of bytes written to a network in a defined time span. Developers can create reports and dashboards from log details and metrics to monitor compliance with SLOs. The metrics used to measure compliance are known as service-level indicators (SLIs).
Reducing Time to Recover from an Incident
An incident, in the context of IT services, is a disruption that causes a service to be degraded or unavailable. An incident can be caused by a single factor, such as an incorrect configuration. Often, though, there is no single root cause; instead, a series of failures and errors contributes to a service failure.
For example, consider an engineer on call who receives a notification that customer data is not being processed correctly by an application. In this case, a database is failing to complete a transaction because a disk is out of space, which causes the application writing to the database to block while the application repeatedly retries the transaction in rapid succession. The application stops reading from a message queue, which causes messages to accumulate until the maximum size of the queue is reached, at which point the message queue starts to drop data.
Once an incident begins, systems engineers and system administrators need information about the state of components and services. To reduce the time to recover, it is best to collect metrics and log events and then make them available to engineers at any time, especially during an incident response.
The incident might have been avoided if database administrators created alerts on free disk space or if the application developer chose to handle retries using exponential backoff instead of simply retrying as fast as possible until it succeeds. Alerting on the size of the message queue could have notified the operations team of a potential problem in time to make adjustments before data was dropped.
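The retry behavior described above is worth seeing in code. This is a minimal sketch of exponential backoff with jitter; the function name and parameters are illustrative, not from any particular library.

```python
import random
import time

def retry_with_backoff(operation, max_attempts=5, base_delay=0.5, max_delay=32.0):
    """Call `operation`, retrying on failure with exponential backoff and jitter.

    Waiting base_delay * 2**attempt (plus random jitter), rather than retrying
    as fast as possible, keeps a failing dependency from being hammered while
    it recovers.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the failure to the caller
            delay = min(base_delay * (2 ** attempt), max_delay)
            time.sleep(delay + random.uniform(0, delay))
```

Capped, jittered delays also spread retries out across clients, so a recovering service is not hit by a synchronized wave of requests.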
Improving Compliance with Industry Regulations
Many businesses are subject to government and industry regulations. Regulations range from protecting the privacy of customer data to ensuring the integrity of business transactions and financial reporting. Major regulations include the following:
- Health Insurance Portability and Accountability Act (HIPAA), a healthcare regulation
- Children’s Online Privacy Protection Act (COPPA), a privacy regulation
- Sarbanes-Oxley Act (SOX), a financial reporting regulation
- Payment Card Industry Data Security Standard (PCI DSS), a data protection regulation for credit card processing
- General Data Protection Regulation (GDPR), a European Union privacy protection regulation
Complying with privacy regulations usually requires controls on who can access and change protected data. As an architect, you will have to develop schemes of controls that meet regulations. Fine-grained access controls may further restrict who can update data. When granting access, follow security best practices, such as granting only the permissions needed to perform one’s job and separating high-risk duties across multiple roles.
Business requirements define the context in which architects make design decisions. On the Google Cloud Professional Architect exam, you must understand business requirements and how they constrain technical options and specify characteristics required in a technical solution.
Business Terms to Know
Brick and Mortar A term to describe businesses with physical presence, especially retail stores.
Capital Expenditure (Capex) Funds spent to acquire assets, such as computer equipment, vehicles, and land. Capital expenditures are used to purchase assets that will have a useful life of at least a few years. The other major type of expenditure is operational expenditures.
Compliance Implementing controls and practices to meet the requirements of regulations.
Digital Transformation Major changes in businesses as they adopt information technologies to develop new products, improve customer service, optimize operations, and make other major improvements enabled by technology. Brick-and-mortar retailers using mobile technologies to promote products and engage with customers is an example of digital transformation.
Governance Procedures and practices used to ensure that policies and principles of organizational operations are followed. Governance is the responsibility of directors and executives within an organization.
Key Performance Indicator (KPI) A measure that provides information about how well a business or organization is achieving an important or key objective. For example, an online gaming company may have KPIs related to the number of new players acquired per week, total number of player hours, and operational costs per player.
Line of Business The parts of a business that deliver a particular class of products and services. For example, a bank may have consumer banking and business banking lines, while an equipment manufacturer may have industrial as well as agricultural lines of business. Different lines of business within a company will have some business and technical requirements in common as well as their own distinct needs.
Operational Expenditures (Opex) An expense paid for from the operating budget, not the capital budget.
Operating Budget A budget allocating funds to meet the costs of labor, supplies, and other expenses related to performing the day-to-day operations of a business. Contrast this to capital expenditure budgets, which are used for longer-term investments.
Service-Level Agreement (SLA) An agreement between a provider of a service and a customer using the service. SLAs define responsibilities for delivering a service and consequences when responsibilities are not met.
Service-Level Indicator (SLI) A metric that reflects how well a service-level objective is being met. Examples include latency, throughput, and error rate.
Service-Level Objective (SLO) An agreed-upon target for a measurable attribute of a service that is specified in a service-level agreement.
Analyzing Technical Requirements
Technical requirements specify features of a system that relate to functional and nonfunctional performance. Functional features include providing ACID transactions in a database, which guarantees that transactions are atomic, consistent, isolated, and durable; ensuring at least once delivery in a messaging system; and encrypting data at rest. Nonfunctional features are the general features of a system, including scalability, reliability, and maintainability.
Functional Requirements
The exam will require you to understand functional requirements related to computing, storage, and networking. The following are some examples of the kinds of issues you will be asked about on the exam.
Understanding Compute Requirements
Google Cloud has a variety of computing services, including Compute Engine, App Engine, and Kubernetes Engine. As an architect, you should be able to determine when each of these platforms is the best option for a use case. For example, if there is a technical requirement to use a virtual machine running a particular hardened version of Linux, then Compute Engine is the best option. Sometimes, though, the choice is not so obvious.
If you want to run containers in the Google Cloud Platform (GCP), you could choose either App Engine Flexible or Kubernetes Engine for a managed service. If you already have application code running in App Engine and you intend to run a small number of containers, then App Engine Flexible is a good option. If you plan to deploy and manage a large number of containers and want to use a service mesh like Istio to secure and monitor microservices, Kubernetes Engine is a better option.
Understanding Storage Requirements
There are even more options when it comes to storage. There are a number of factors to consider when choosing a storage option, including how the data is structured, how it will be accessed and updated, and for how long it will be stored.
Let’s look at how you might decide which data storage service to use given a set of requirements. Structured data fits well with both relational and NoSQL databases. If SQL is required, then your choices are Cloud SQL, Spanner, BigQuery, or running a relational database yourself in Compute Engine. If you require a global, strongly consistent transactional data store, then Spanner is the best choice, while Cloud SQL is a good choice for regional-scale databases. If the application using the database requires a flexible schema, then you should consider NoSQL options. Cloud Datastore is a good option when a document store is needed, while Bigtable is well suited for ingesting large volumes of data at low latency.
Of course, you could run a NoSQL database in Compute Engine. If a service needs to ingest time-series data at low latency and one of the business requirements is to maximize the use of managed services, then Bigtable should be used. If there is no requirement to use managed services, you might consider deploying Cassandra to a cluster in Compute Engine. This would be a better choice, for example, if you are planning a lift-and-shift migration to the cloud and are currently running Cassandra in an on-premises data center.
When long-term archival storage is required, then Cloud Storage is probably the best option. Since Cloud Storage has four types of storage, you will have to consider access patterns and reliability requirements. If the data is frequently accessed, then regional or multiregional class storage is appropriate. If high availability of access to the data is a concern or if data will be accessed from different areas of the world, you should consider multiregional storage. If data will be infrequently accessed, then Nearline or Coldline storage is a good choice. Nearline storage is designed for data that won’t be accessed more than once a month. Coldline storage is well suited for data that will be accessed not more than once a year.
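These decision rules can be captured in a few lines. The helper below is a hypothetical simplification of the guidance above; real storage-class decisions should also weigh retrieval charges and minimum storage durations.

```python
def suggest_storage_class(accesses_per_month: float, global_access: bool = False) -> str:
    """Suggest a Cloud Storage class from an expected access pattern.

    A simplified heuristic: frequently accessed data belongs in regional or
    multi-regional storage, data touched less than monthly in Nearline, and
    data touched less than yearly in Coldline.
    """
    if accesses_per_month >= 1:
        return "multi-regional" if global_access else "regional"
    if accesses_per_month >= 1 / 12:  # at least roughly once a year
        return "nearline"
    return "coldline"

print(suggest_storage_class(30, global_access=True))  # multi-regional
print(suggest_storage_class(0.5))                     # nearline
print(suggest_storage_class(0.01))                    # coldline
```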
Understanding Network Requirements
Networking topics that architects deal with tend to fall into two categories: structuring virtual private clouds and supporting hybrid cloud computing.
Virtual private clouds (VPCs) isolate a Google Cloud Platform customer’s resources. Architects should know how to configure VPCs to meet requirements about who can access specific resources, the kinds of traffic allowed in or out of the network, and communications between VPCs. To develop solutions to these high-level requirements, architects need to understand basic networking components such as the following:
- Firewalls and firewall rules
- Domain name services (DNS)
- CIDR blocks and IP addressing
- Autogenerated and custom subnets
- VPC peering
Many companies and organizations adopting cloud computing also have their own data centers. Architects need to understand the options for networking between on-premises data centers and the Google Cloud Platform network. Options include using a virtual private network, a dedicated interconnect, or a partner interconnect. Virtual private networks are a good choice when bandwidth demands are not high and data is allowed to traverse the public Internet. Partner interconnects provide a minimum of 50 Mbps of bandwidth, and data is transmitted through the partner’s network rather than the public Internet. Dedicated interconnects are used when bandwidth requirements are 10 Gbps or greater.
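Bandwidth requirements become more tangible when translated into transfer times. The following sketch estimates how long a bulk transfer takes at the connection speeds mentioned above (decimal units, ignoring protocol overhead).

```python
def transfer_hours(data_gb: float, bandwidth_mbps: float) -> float:
    """Hours needed to move `data_gb` gigabytes over a `bandwidth_mbps` link."""
    megabits = data_gb * 8 * 1000  # 1 GB = 8,000 megabits in decimal units
    return megabits / bandwidth_mbps / 3600

# Moving 10 TB over a 50 Mbps partner interconnect vs. a 10 Gbps
# dedicated interconnect:
print(f"{transfer_hours(10_000, 50):.0f} hours")      # roughly 444 hours
print(f"{transfer_hours(10_000, 10_000):.1f} hours")  # roughly 2.2 hours
```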
Nonfunctional Requirements
Nonfunctional requirements often follow from business requirements. They include the following:
- Availability
- Reliability
- Scalability
- Durability
Availability is a measure of the time that services are functioning correctly and accessible to users. Availability requirements are typically stated as the percentage of time a service should be up and running, such as 99.99 percent. Fully supported Google Cloud services have SLAs for availability, so you can use them to help guide your architectural decisions. Note that alpha and beta products typically do not have SLAs.
Reliability is a concept closely related to availability. Reliability is a measure of the probability that a service will continue to function under some load for a period of time. The level of reliability that a service can achieve is highly dependent on the availability of the systems upon which it depends.
Scalability is the ability of a service to adapt its infrastructure to the load on the system. When load decreases, some resources may be shut down. When load increases, resources can be added. Autoscalers and instance groups are often used to ensure scalability when using Compute Engine. One of the advantages of services like Cloud Storage and App Engine is that scalability is managed by GCP, which reduces the operational overhead on DevOps teams.
Durability measures the likelihood that a stored object will be retrievable in the future. Cloud Storage has a 99.999999999 percent (eleven 9s) durability guarantee, which means that it is extremely unlikely that you will lose an object stored in Cloud Storage. Note, though, that as the number of objects stored increases, the probability that at least one of them is lost also increases.
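The last point follows from simple probability. Assuming independent object losses at the stated per-object durability, the chance of losing at least one object grows with the number of objects stored:

```python
def p_any_object_lost(per_object_durability: float, num_objects: int) -> float:
    """Probability that at least one of `num_objects` is lost,
    assuming independent losses at the given per-object durability."""
    return 1 - per_object_durability ** num_objects

eleven_nines = 1 - 1e-11  # 99.999999999 percent

print(p_any_object_lost(eleven_nines, 1))      # ~1e-11 for a single object
print(p_any_object_lost(eleven_nines, 10**9))  # ~0.01 for a billion objects
```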
The Google Cloud Professional Cloud Architect exam tests your ability to understand both business requirements and technical requirements, which is reasonable since those skills are required to function as a cloud architect.
Exam Case Studies
Note
Exam case studies are reprinted with permission from Google LLC. They are subject to change as are the exams themselves. Please visit the Google website to check for the latest Google Cloud Professional Architect exam case studies.
The Google Cloud Professional Cloud Architect Certification exam uses three case studies as the basis for some questions on the exam. Become familiar with the case studies before the exam in order to save time while taking the test.
Each case study includes a company overview, solution concept, technical requirements, business requirements, and an executive statement. As you read each case study, be sure that you understand the driving business considerations and the solution concept. These provide constraints on the possible solutions.
When existing infrastructure is described, think of what GCP services could be used as a replacement if needed. For example, Cloud SQL can be used to replace an on-premises MySQL server, Cloud Dataproc can replace self-managed Spark and Hadoop clusters, and Cloud Pub/Sub can be used instead of RabbitMQ.
Read for the technical implications of the business statements—they may not be stated explicitly. For example, in the Dress4Win case study, the statement “Improve business agility and speed of innovation through rapid provisioning of new resources” could mean that you should plan to use infrastructure-as-code deployments and enable autoscaling when possible. There is also the statement “Our traffic patterns are highest in the mornings and weekend evenings; during other times, 80% of our capacity is sitting idle,” which describes an opportunity to use autoscaling to optimize resource usage. Both are business statements that imply additional requirements the architect needs to identify even though they are never stated explicitly.
The three case studies are available online here:
Dress4Win
Mountkirk Games
TerramEarth
Dress4Win Case Study
Company Overview
Dress4Win is a web-based company that helps their users organize and manage their personal wardrobe using a web app and mobile application. The company also cultivates an active social network that connects their users with designers and retailers. They monetize their services through advertising, e-commerce, referrals, and a freemium app model. The application has grown from a few servers in the founder’s garage to several hundred servers and appliances in a collocated data center. However, the capacity of their infrastructure is now insufficient for the application’s rapid growth. Because of this growth and the company’s desire to innovate faster, Dress4Win is committing to a full migration to a public cloud.
Solution Concept
For the first phase of their migration to the cloud, Dress4Win is moving their development and test environments. They are also building a disaster recovery site because their current infrastructure is at a single location. They are not sure which components of their architecture can be migrated as is and which components need to be changed before migrating them.
Existing Technical Environment
The Dress4Win application is served out of a single data center location. All servers run Ubuntu LTS v16.04.
Databases:
MySQL: 1 server for user data, inventory, and static data:
- MySQL 5.7
- 8 core CPUs
- 128 GB of RAM
- 2x 5 TB HDD (RAID 1)
Redis: 3 server cluster for metadata, social graph, and caching. Each server consists of:
- Redis 3.2
- 4 core CPUs
- 32GB of RAM
Compute:
40 web application servers providing microservices-based APIs and static content.
- Tomcat - Java
- Nginx
- 4 core CPUs
- 32 GB of RAM
20 Apache Hadoop/Spark servers:
- Data analysis
- Real-time trending calculations
- Eight core CPUs
- 128 GB of RAM
- 4x 5 TB HDD (RAID 1)
3 RabbitMQ servers for messaging, social notifications, and events:
- Eight core CPUs
- 32GB of RAM
Miscellaneous servers:
- Jenkins, monitoring, bastion hosts, and security scanners
- Eight core CPUs
- 32GB of RAM
Storage appliances:
- iSCSI for VM hosts
- Fiber channel SAN - MySQL databases
- 1 PB total storage; 400 TB available
- NAS - image storage, logs, backups
- 100 TB total storage; 35 TB available
Business Requirements
- Build a reliable and reproducible environment with scaled parity of production.
- Improve security by defining and adhering to a set of security and Identity and Access Management (IAM) best practices for the cloud.
- Improve business agility and speed of innovation through rapid provisioning of new resources.
- Analyze and optimize architecture for performance in the cloud.
Technical Requirements
- Easily create non-production environments in the cloud.
- Implement an automation framework for provisioning resources in cloud. Implement a continuous deployment process for deploying applications to the on-premises data center or cloud.
- Support failover of the production environment to the cloud during an emergency.
- Encrypt data on the wire and at rest.
- Support multiple private connections between the production data center and cloud environment.
Executive Statement
Our investors are concerned about our ability to scale and contain costs with our current infrastructure. They are also concerned that a competitor could use a public cloud platform to offset their up-front investment and free them to focus on developing better features. Our traffic patterns are highest in the mornings and weekend evenings; during other times, 80 percent of our capacity is sitting idle.
Our capital expenditure is now exceeding our quarterly projections. Migrating to the cloud will likely cause an initial increase in spending, but we expect to transition completely before our next hardware refresh cycle. Our total cost of ownership (TCO) analysis over the next five years for a public cloud strategy achieves a cost reduction of 30–50 percent over our current model.
Mountkirk Games Case Study
Company Overview
Mountkirk Games makes online, session-based, multiplayer games for mobile platforms. They build all of their games using some server-side integration. Historically, they have used cloud providers to lease physical servers.
Due to the unexpected popularity of some of their games, they have had problems scaling their global audience, application servers, MySQL databases, and analytics tools.
Their current model is to write game statistics to files and send them through an ETL tool that loads them into a centralized MySQL database for reporting.
Solution Concept
Mountkirk Games is building a new game, which they expect to be very popular. They plan to deploy the game’s backend on Google Compute Engine so that they can capture streaming metrics, run intensive analytics, take advantage of its autoscaling server environment, and integrate with a managed NoSQL database.
Business Requirements
- Increase to a global footprint
- Improve uptime (downtime is loss of players)
- Increase efficiency of the cloud resources they use
- Reduce latency to all customers
Technical Requirements
Requirements for Game Backend Platform
- Dynamically scale up or down based on game activity.
- Connect to a transactional database service to manage user profiles and game state.
- Store game activity in a time series database service for future analysis.
- As the system scales, ensure that data is not lost due to processing backlogs.
- Run hardened Linux distro.
Requirements for Game Analytics Platform
- Dynamically scale up or down based on game activity.
- Process incoming data on the fly directly from the game servers.
- Process data that arrives late because of slow mobile networks.
- Allow queries to access at least 10 TB of historical data.
- Process files that are regularly uploaded by users’ mobile devices.
Executive Statement
Our last successful game did not scale well with our previous cloud provider, resulting in lower user adoption and affecting the game’s reputation. Our investors want more key performance indicators (KPIs) to evaluate the speed and stability of the game, as well as other metrics that provide deeper insight into usage patterns so that we can adapt the game to target users. Additionally, our current technology stack cannot provide the scale we need, so we want to replace MySQL and move to an environment that provides autoscaling, low-latency load balancing, and frees us up from managing physical servers.
TerramEarth Case Study
Company Overview
TerramEarth manufactures heavy equipment for the mining and agricultural industries. About 80 percent of their business is from mining and 20 percent is from agriculture. They currently have over 500 dealers and service centers in 100 countries. Their mission is to build products that make their customers more productive.
Solution Concept
There are 20 million TerramEarth vehicles in operation that collect 120 fields of data per second. Data is stored locally on the vehicle, and it can be accessed for analysis when a vehicle is serviced. The data is downloaded via a maintenance port. This same port can be used to adjust operational parameters, allowing the vehicles to be upgraded in the field with new computing modules.
Approximately 200,000 vehicles are connected to a cellular network, allowing TerramEarth to collect data directly. At a rate of 120 fields of data per second, with 22 hours of operation per day, TerramEarth collects a total of about 9 TB of data per day from these connected vehicles.
Existing Technical Environment
TerramEarth’s existing architecture is composed of Linux and Windows-based systems that reside in a single U.S. west coast-based data center. These systems gzip CSV files from the field, upload them via FTP, and place the data in their data warehouse. Because this process takes time, aggregated reports are based on data that is three weeks old.
With this data, TerramEarth has been able to stock replacement parts preemptively and reduce unplanned downtime of their vehicles by 60 percent. However, because the data is stale, some customers are without their vehicles for up to four weeks while they wait for replacement parts.
Business Requirements
- Decrease unplanned vehicle downtime to less than one week
- Support the dealer network with more data on how their customers use their equipment to position new products and services better
- Have the ability to partner with different companies, especially seed and fertilizer suppliers in the fast-growing agricultural business, to create compelling joint offerings for their customers
Technical Requirements
- Expand beyond a single data center to decrease latency to the American Midwest and east coast
- Create a backup strategy
- Increase security of data transfer from equipment to the data center
- Improve data in the data warehouse
- Use customer and equipment data to anticipate customer needs
Application 1: Data ingest
A custom Python application reads uploaded data files from a single server and writes to the data warehouse.
Compute
Windows Server 2008 R2
- 16 CPUs
- 128 GB of RAM
- 10 TB local HDD storage
Application 2: Reporting
An off-the-shelf application that business analysts use to run a daily report to see what equipment needs repair. Only 2 analysts of a team of 10 (5 west coast, 5 east coast) can connect to the reporting application at a time.
Compute
Off-the-shelf application. License tied to number of physical CPUs.
- Windows Server 2008 R2
- 16 CPUs
- 32 GB of RAM
- 500 GB HDD
Data warehouse
- A single PostgreSQL server
- RedHat Linux
- 64 CPUs
- 128 GB of RAM
- 4x 6TB HDD in RAID 0
Executive Statement
Our competitive advantage has always been in our manufacturing process, with our ability to build better vehicles for lower cost than our competitors. However, new products with different approaches are constantly being developed, and I’m concerned that we lack the skills to undergo the next wave of transformations in our industry. My goals are to build our skills while addressing immediate market needs through incremental innovations.
Summary
The Google Cloud Professional Architect exam covers several broad areas, including the following:
- Planning a cloud solution
- Managing a cloud solution
- Securing systems and processes
- Complying with government and industry regulations
- Understanding technical requirements and business considerations
- Maintaining solutions deployed to production, including monitoring
These areas require business as well as technical skills. For example, since architects regularly work with nontechnical colleagues, it is important for architects to understand issues such as reducing operational expenses, accelerating the pace of development, maintaining and reporting on service-level agreements, and assisting with regulatory compliance. In the realm of technical knowledge, architects are expected to understand functional requirements around computing, storage, and networking as well as nonfunctional characteristics of services, such as availability and scalability.
The exam includes three case studies, and some exam questions reference the case studies. Questions about the case studies may be business or technical questions.
Exam Essentials
Assume every word matters in case studies and exam questions. Some technical requirements are stated explicitly, but some are implied in business statements. Review the business requirements as carefully as the technical requirements in each case study. Similarly, when reading an exam question, pay attention to all of the statements. What may look like extraneous background information at first may turn out to be information that you need in order to choose between two options.
Study and analyze case studies before taking the exam. Become familiar with the case studies before the exam to save time while taking the test. You don’t need to memorize the case studies, as you’ll have access to them during the test. Watch for numbers that indicate the scale of the problem. If you need to transmit more than 10 Gbps, then you should consider a Cloud Interconnect solution over a VPN solution, which works up to about 3 Gbps.
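A quick back-of-the-envelope calculation shows why link capacity matters at this scale. The sketch below assumes a sustained link rate with no protocol overhead (a simplification), and uses TerramEarth’s roughly 9 TB of daily data as the payload:

```python
def transfer_hours(terabytes, gbps):
    """Hours to move `terabytes` (decimal TB) over a sustained `gbps` link,
    ignoring protocol overhead and congestion."""
    bits = terabytes * 1e12 * 8
    return bits / (gbps * 1e9) / 3600

# Moving one day's worth of TerramEarth data (~9 TB):
print(f"VPN-class link (3 Gbps):          {transfer_hours(9, 3):.1f} h")   # ~6.7 h
print(f"Interconnect-class link (10 Gbps): {transfer_hours(9, 10):.1f} h")  # ~2.0 h
```

Even under these idealized assumptions, a VPN-class link spends much of the day moving a single day’s data, which is the kind of scale signal the exam expects you to notice in a case study.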
Understand what is needed in the near term and what may be needed in the future. For example, in the TerramEarth case study, 200,000 vehicles are equipped with cellular communications equipment that can collect data daily. What would change about your design if all 20 million vehicles in operation reported their data daily? This requirement is not stated, and not even implied, but it is the kind of planning for the future that architects are expected to do.
Understand how to plan a migration. Migrations are high-risk operations. Data can be lost, and services may be unavailable. Know how to plan to run new and old systems in parallel so that you can compare results. Be able to identify lower-risk migration steps so that they can be scheduled first. Plan for incremental migrations.
Know agile software development practices. You won’t have to write code for this exam, but you will need to understand continuous integration/continuous deployment and maintaining development, test, staging, and production environments. Understand what is meant by an infrastructure-as-code service and how that helps accelerate development and deployment.
Keep in mind that solutions may involve non-Google services or applications. Google has many services, but sometimes the best solution involves a third-party solution. For example, Jenkins and Spinnaker are widely used tools to support continuous integration and deployment. Google Cloud has a code repository, but many developers use GitHub. Sometimes businesses are locked into existing solutions, such as a third-party database. The business may want to migrate to another database solution, but the cost may be too high for the foreseeable future.
Review Questions
You have been tasked with interviewing line-of-business owners about their needs for a new cloud application. Which of the following do you expect to find?
- A comprehensive list of defined business and technical requirements
- That their business requirements do not have a one-to-one correlation with technical requirements
- Business and technical requirements in conflict
- Clear consensus on all requirements
You have been asked by stakeholders to suggest ways to reduce operational expenses as part of a cloud migration project. Which of the following would you recommend?
- Managed services, preemptible machines, access controls
- Managed services, preemptible machines, autoscaling
- NoSQL databases, preemptible machines, autoscaling
- NoSQL databases, preemptible machines, access controls
Some executives are questioning your recommendation to employ continuous integration/continuous deployment (CI/CD). What reasons would you give to justify your recommendation?
- CI/CD supports small releases, which are easier to debug and enable faster feedback.
- CI/CD is used only with preemptible machines and therefore saves money.
- CI/CD fits well with waterfall methodology but not agile methodologies.
- CI/CD limits the number of times code is released.
The finance director has asked your advice about complying with a document retention regulation. What kind of service-level objective (SLO) would you recommend to ensure that the finance director will be able to retrieve sensitive documents for at least the next seven years? When a document is needed, the finance director will have up to seven days to retrieve it. The total storage required will be approximately 100 GB.
- High availability SLO
- Durability SLO
- Reliability SLO
- Scalability SLO
You are facilitating a meeting of business and technical managers to solicit requirements for a cloud migration project. The term incident comes up several times. Some of the business managers are unfamiliar with this term in the context of IT. How would you describe an incident?
- A disruption in the ability of a DevOps team to complete work on time
- A disruption in the ability of the business managers to approve a project plan on schedule
- A disruption that causes a service to be degraded or unavailable
- A personnel problem on the DevOps team
You have been asked to consult on a cloud migration project that includes moving private medical information to a storage system in the cloud. The project is for a company in the United States. What regulation would you suggest that the team review during the requirements-gathering stages?
- General Data Protection Regulations (GDPR)
- Sarbanes–Oxley (SOX)
- Payment Card Industry Data Security Standard (PCI DSS)
- Health Insurance Portability and Accountability Act (HIPAA)
You are in the early stages of gathering business and technical requirements. You have noticed several references about needing up-to-date and consistent information regarding product inventory. Inventory is managed on a global scale, and the warehouses storing inventory are located in North America, Africa, Europe, and India. Which managed database solution in Google Cloud would you include in your set of options for an inventory database?
- Cloud Storage
- BigQuery
- Cloud Spanner
- Microsoft SQL Server
A developer at Mountkirk Games is interested in how architects decide which database to use. The developer describes a use case that requires a document store. The developer would rather not manage database servers or have to run backups. What managed service would you suggest the developer consider?
- Cloud Datastore
- Cloud Spanner
- Cloud Storage
- BigQuery
Members of your company’s legal team are concerned about using a public cloud service because other companies, organizations, and individuals will be running their systems in the same cloud. You assure them that your company’s resources will be isolated and not network-accessible to others because of what networking resource in Google Cloud?
- CIDR blocks
- Direct connections
- Virtual private clouds
- Cloud Pub/Sub
What two business drivers are behind Dress4Win’s interest in moving to the cloud?
- Insufficient infrastructure capacity and desire to be more agile
- Insufficient infrastructure and competitors moving to the cloud
- Competitors moving to the cloud and desire to be more agile
- Insufficient infrastructure and short-term cost savings
Dress4Win is considering replacing its self-managed MySQL database with a managed service. Which Google Cloud service would you recommend that they consider?
- Cloud Dataproc
- Cloud Dataflow
- Cloud SQL
- PostgreSQL
Which of the following requirements from a customer makes you think the application should run in Compute Engine and not App Engine?
- Dynamically scale up or down based on game activity
- Connect to a database
- Run a hardened Linux distro on a virtual machine
- Don’t lose data
Consider the TerramEarth case study. What aspect of that case study prompts you to consider potentially significant changes to requirements in the future?
- Dealers will want more reports about their customers.
- Of 20 million pieces of equipment, only 200,000 have cellular connections; 19,800,000 additional pieces of equipment may someday transmit data in real time instead of downloading it in batches.
- TerramEarth is in a competitive industry.
- TerramEarth would like to partner with other companies to improve overall service to their customers.
Mountkirk Games wants to store player game data in a time-series database. Which Google Cloud managed database would you recommend?
- Bigtable
- BigQuery
- Cloud Storage
- Cloud Dataproc
The game analytics platform for Mountkirk Games requires analysts to be able to query up to 10 TB of data. What is the best managed database solution for this requirement?
- Cloud Spanner
- BigQuery
- Cloud Storage
- Cloud Dataprep