
Simulate the real exam with 50 questions and a 120-minute time limit. Study with AI-verified answers and detailed explanations.
AI-Powered
Every answer is cross-validated across three leading AI models to ensure top accuracy. Detailed per-option explanations and in-depth question analysis are provided.
Your team deployed a regression model that predicts hourly water usage for industrial chillers. Four months after launch, a vendor firmware update changed sensor sampling and units for three input features, and the live feature distributions diverged: 5 of 18 features now have a population stability index > 0.25, 27% of temperature readings fall outside the training range, and production RMSE increased from 0.62 to 1.45. How should you address the input differences in production?
Correct. The evidence (high PSI, many values outside training range, RMSE jump) indicates data drift/skew driven by upstream firmware changes. Automated monitoring with alerts is required to detect and quantify ongoing drift, and an automated retraining pipeline using recent production data (with corrected preprocessing/unit normalization) is the standard operational response to restore performance and reduce future downtime.
Incorrect. Feature selection may reduce sensitivity to some drifting inputs, but it does not fix the root cause: the meaning/units/sampling of inputs changed. Removing “low-importance” features also won’t address that 27% of temperature values are out of range or that multiple key features have shifted. You still need monitoring and likely preprocessing updates and retraining on correctly interpreted data.
Incorrect. Hyperparameter tuning (e.g., L2 regularization) addresses overfitting/underfitting, not a production data contract change. With unit changes and distribution shift, tuning regularization may slightly stabilize predictions but will not restore accuracy reliably. The correct approach is to detect drift, validate feature semantics, update transformations, and retrain with representative recent data.
Incorrect. A fixed monthly retraining schedule is weaker than event-driven monitoring and response. It can leave the system degraded for weeks after a sudden upstream change. Feature selection still doesn’t solve unit/sampling changes. Best practice is continuous monitoring with alerting and automated pipelines that retrain when drift/performance thresholds are exceeded, with evaluation gates before deployment.
Core Concept: This scenario tests production ML monitoring for data skew/drift and the operational response when upstream systems change. In Google Cloud, this maps to Vertex AI Model Monitoring (feature skew/drift, out-of-distribution detection, performance monitoring) plus an automated retraining pipeline (Vertex AI Pipelines/Cloud Composer/Cloud Build) to continuously adapt models.

Why the Answer is Correct: A vendor firmware update changed sampling and units for multiple input features, causing clear distribution shift (PSI > 0.25 on 5/18 features, 27% of temperature readings outside the training range) and a large performance regression (RMSE 0.62 → 1.45). This is not primarily a modeling/regularization problem; it's a data contract and data drift problem. The correct response is (1) detect and alert on skew/drift and (2) update the model (and often the preprocessing) using recent, correctly interpreted production data. Automated monitoring prevents silent degradation, and a retraining pipeline shortens mean time to recovery when upstream changes recur.

Key Features / Best Practices:
- Use Vertex AI Model Monitoring to track feature skew/drift (training vs serving), set thresholds, and route alerts to Cloud Monitoring.
- Log prediction requests/responses and ground truth (when available) to enable performance monitoring (RMSE) and root-cause analysis.
- Implement robust feature engineering with explicit unit normalization and schema validation (e.g., TFDV/Great Expectations) to catch unit changes early.
- Automate retraining with Vertex AI Pipelines, including data extraction, validation, training, evaluation gates, and safe rollout (canary/rollback).

Common Misconceptions: It's tempting to "fix" the model with feature selection or stronger regularization, but those do not address incorrect units/sampling or out-of-range inputs. If the feature semantics changed, the model is effectively receiving different variables than it was trained on.

Exam Tips: When you see PSI drift, out-of-training-range rates, and degraded metrics, prioritize monitoring + data validation + retraining/refresh. For upstream changes, also consider updating preprocessing/feature store transformations and establishing data contracts with vendors. In the Architecture Framework, this aligns with Operational Excellence (monitoring/automation) and Reliability (rapid detection and recovery).
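The PSI figures cited in the scenario can be reproduced with a short calculation. Below is a minimal numpy sketch, assuming the common 10-bin layout and a small floor to avoid log(0); the Celsius-to-Fahrenheit unit change is an illustrative stand-in for the firmware change, not taken from the question:

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """PSI of a serving sample (actual) against the training sample (expected)."""
    edges = np.histogram_bin_edges(expected, bins=bins)  # bins from training data
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    a_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    e_pct = np.clip(e_pct, 1e-6, None)  # floor empty bins to avoid log(0)
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

rng = np.random.default_rng(0)
train_temp = rng.normal(20.0, 2.0, 10_000)   # training-time readings (Celsius)
serving_temp = train_temp * 1.8 + 32.0       # same sensor after a C -> F unit change

print(population_stability_index(train_temp, train_temp))           # 0.0 (no drift)
print(population_stability_index(train_temp, serving_temp) > 0.25)  # True
```

A PSI above roughly 0.25 is conventionally treated as a significant shift, which is why 5 of 18 features crossing that threshold is a strong drift signal.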
Want to work through every question on the go?
Download Cloud Pass for free: practice exams, study progress tracking, and more.
Study period: 1 month
Just want to say a massive thank you to the entire Cloud Pass team for helping me pass my exam first time. I won't lie, it wasn't easy, especially the way the real exam is worded; however, the way the practice questions teach you why your option was wrong really helps to frame your mind, to understand what the question is asking for, and the solutions your mind should be focusing on. Thanks once again.
Study period: 1 month
Good question banks and explanations that helped me practise for and pass the exam.
Study period: 1 month
I worked through the questions right after the lectures and got about an 80% correct rate, then passed the exam with a high score. The app served me well.
Study period: 1 month
Good mix of theory and practical scenarios
Study period: 1 month
I used the app mainly to review the fundamentals—data preparation, model tuning, and deployment options on GCP. The explanations were simple and to the point, which really helped before the exam.


You are building an end-to-end scikit-learn MLOps workflow in Vertex AI Pipelines (Kubeflow Pipelines) that ingests 50 GB of CSV data from Cloud Storage, performs data cleaning, feature selection, model training, and model evaluation, then writes a .pkl model artifact to a versioned path in a GCS bucket. You are iterating on multiple versions of the feature selection and training components, submitting each version as a new pipeline run in us-central1 on n1-standard-4 CPU-only executors; each end-to-end run currently takes about 80 minutes. You want to reduce iteration time during development without increasing your GCP costs. What should you do?
Skipping or commenting out components can reduce runtime, but it is a manual, error-prone workflow that changes the pipeline graph and may bypass important validations. It also doesn’t leverage Vertex AI Pipelines’ built-in orchestration best practices. In team settings, this harms reproducibility and makes it harder to compare runs because different runs execute different subsets of steps.
Step caching is designed for exactly this scenario: repeated pipeline runs where only a subset of components change. With caching enabled, unchanged steps (like ingestion and cleaning) can reuse prior outputs, cutting iteration time while typically lowering costs because fewer tasks execute. This is the most direct, MLOps-aligned approach within Vertex AI Pipelines/Kubeflow Pipelines.
Dataflow can be excellent for large-scale ETL, but migrating feature processing to Dataflow is a redesign that adds operational overhead and may increase costs (Dataflow job charges, potential always-on resources, and additional integration). It also doesn’t inherently solve iteration speed if the pipeline still retriggers expensive processing; caching would still be needed.
Adding a T4 GPU increases cost and often provides little to no benefit for scikit-learn training, which is typically CPU-bound and not GPU-accelerated. Even if training sped up, the overall 80-minute runtime likely includes significant data ingestion/cleaning time, so a GPU would not address the main bottleneck and violates the “without increasing costs” requirement.
Core Concept: This question tests Vertex AI Pipelines (Kubeflow Pipelines) execution optimization during iterative development, specifically pipeline/step caching (a.k.a. reuse of execution results) to avoid recomputing unchanged components.

Why the Answer is Correct: Enabling step caching allows Vertex AI Pipelines to reuse outputs from prior runs when a component's inputs, container image, command, and relevant metadata have not changed. In an iterative workflow where you repeatedly modify only feature selection and training, the expensive upstream steps (e.g., ingesting 50 GB from Cloud Storage, cleaning, and any stable preprocessing) can be skipped automatically, reducing end-to-end runtime without adding compute resources. Because you are not increasing machine sizes or adding accelerators, costs typically decrease (fewer CPU-minutes consumed) while iteration speed improves.

Key Features / Best Practices: Vertex AI Pipelines supports caching at the task/component level. Best practice is to:
1) Ensure deterministic components (same inputs -> same outputs) and stable base images.
2) Version inputs explicitly (e.g., GCS URIs with generation numbers or versioned paths) so cache behavior is predictable.
3) Avoid embedding timestamps/randomness in component logic or output paths unless intentionally invalidating the cache.
4) Use pipeline parameters for feature-selection configuration so only the affected steps invalidate.
This aligns with the Google Cloud Architecture Framework principles of cost optimization and operational excellence by reducing wasteful recomputation.

Common Misconceptions: It's tempting to "comment out" steps (Option A), but that changes the pipeline definition and can break dependencies, reduce test coverage, and doesn't scale as a disciplined MLOps practice. Moving to Dataflow (Option C) may improve performance but introduces additional services and can increase costs/complexity; it's not the most direct solution for iteration speed "without increasing costs." Adding a GPU (Option D) increases cost and may not help scikit-learn CPU-bound training.

Exam Tips: For questions about faster iteration in pipelines, first consider caching, modular components, and parameterization before scaling hardware. On the exam, "reduce time without increasing cost" strongly signals reuse/caching rather than bigger machines, GPUs, or service migrations.
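Step caching can be thought of as a content-addressed cache keyed on a component's name, image, and inputs. The toy sketch below mimics that idea locally; it is an illustration only, not the actual Vertex AI implementation (in practice you enable the managed feature on the pipeline job rather than writing this yourself):

```python
import hashlib
import json

_cache = {}  # cache_key -> stored output (stands in for the pipeline metadata store)

def run_step(name, image, inputs, fn):
    """Re-run a step only when its container image or inputs change."""
    key = hashlib.sha256(
        json.dumps({"name": name, "image": image, "inputs": inputs},
                   sort_keys=True).encode()
    ).hexdigest()
    if key in _cache:
        return _cache[key], True    # cache hit: prior output reused, no compute
    out = fn(inputs)
    _cache[key] = out
    return out, False               # cache miss: step actually executed

# First run executes ingestion; a second run with identical inputs is skipped,
# while the modified training step still executes.
_, hit1 = run_step("ingest", "img:v1", {"uri": "gs://bucket/data.csv"}, lambda i: "rows")
_, hit2 = run_step("ingest", "img:v1", {"uri": "gs://bucket/data.csv"}, lambda i: "rows")
_, hit3 = run_step("train", "img:v2", {"features": "v2"}, lambda i: "model")
print(hit1, hit2, hit3)  # False True False
```

This is why deterministic components and explicitly versioned inputs matter: anything nondeterministic in the inputs (timestamps, random paths) changes the key and silently defeats the cache.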
Your team must deliver an ML solution on Google Cloud to triage warranty claim emails for a global appliance manufacturer into 8 categories within 4 weeks. You are required to use TensorFlow to maintain full control over the model's code, serving, and deployment, and you will orchestrate the workflow with Kubeflow Pipelines. You have 30,000 labeled examples and want to accelerate delivery by leveraging existing resources and managed services instead of training a brand-new model from scratch. How should you build the classifier?
Natural Language API provides pretrained NLP capabilities (sentiment, entity extraction, and a limited content classification taxonomy). It is fast to integrate but offers minimal control over model architecture, training, and deployment. It also may not support custom 8-category warranty triage labels. This conflicts with the requirement to use TensorFlow and maintain full control over code, serving, and deployment, making it unsuitable here.
AutoML Natural Language can train a custom text classifier quickly from labeled data and is often a strong choice for rapid delivery. However, it is a managed training and serving solution where you do not maintain full control over the TensorFlow model code and deployment mechanics. While it can be orchestrated in pipelines, it violates the explicit requirement for full TensorFlow control, so it is not the best answer.
Transfer learning with an established text classification model (pretrained language model or embedding backbone) lets you fine-tune on 30,000 labeled emails quickly and reliably. You keep full control by implementing training in TensorFlow, packaging the model, and deploying with custom serving (Vertex AI custom containers or GKE/KServe). This aligns with the 4-week timeline, leverages existing resources, and fits Kubeflow Pipelines orchestration.
Using an established text classification model “as-is” is unlikely to work because the target labels are specific (8 warranty triage categories) and won’t match the pretrained model’s original label set or taxonomy. Even if the model outputs generic categories, it won’t map cleanly to your business classes without adaptation. The requirement emphasizes leveraging existing resources, but still implies customization; transfer learning is needed.
Core concept: This question tests when to use transfer learning with TensorFlow on Google Cloud (Vertex AI/legacy AI Platform) versus fully managed "no/low-code" NLP services, under constraints requiring full control of model code, serving, and deployment, and pipeline orchestration with Kubeflow Pipelines.

Why the answer is correct: You have 30,000 labeled emails and only 4 weeks, so training a modern NLP model from scratch is unnecessary and risky. The requirement to "use TensorFlow to maintain full control over the model's code, serving, and deployment" rules out managed black-box training/serving approaches (Natural Language API classification and AutoML Natural Language). The best fit is to start from an established text classification model (for example, a pretrained Transformer encoder or a TF Hub text embedding/classifier backbone) and fine-tune it on your 8 warranty categories. This is classic transfer learning: it accelerates convergence, reduces data requirements, and improves accuracy/time-to-market. You can implement training in TensorFlow, package the model artifact, and deploy it on Vertex AI Prediction (or GKE) with custom containers, all orchestrated via Kubeflow Pipelines.

Key features / best practices: Use pretrained language representations (e.g., BERT-style encoders or TF Hub text embeddings) and fine-tune a classification head for 8 classes. Build a Kubeflow Pipeline with components for data validation, preprocessing (tokenization), training, evaluation (precision/recall per class, confusion matrix), and conditional deployment. Use Vertex AI custom training jobs (or GKE) for reproducibility, and Vertex AI Model Registry + endpoints (or KFServing/KServe) for controlled serving. Account for global email language considerations (multilingual models if needed) and monitor drift.

Common misconceptions: Managed APIs (Natural Language API) feel fast, but they don't provide full control over model code and deployment. AutoML is also fast, but it abstracts training and typically doesn't satisfy "full control" requirements. Using a pretrained model "as-is" rarely matches domain-specific labels like warranty triage categories.

Exam tips: When a question explicitly requires TensorFlow control and custom deployment, prefer custom training/transfer learning over AutoML/APIs. When labels are domain-specific, expect fine-tuning rather than zero-shot or off-the-shelf classification. Map "accelerate delivery" + "limited data" to transfer learning.
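The fine-tuning pattern can be sketched in miniature: freeze a backbone encoder and train only a new 8-class head on its embeddings. Everything below (the random "encoder", the synthetic emails) is a numpy stand-in for illustration, not a real pretrained model:

```python
import numpy as np

rng = np.random.default_rng(42)
n, n_vocab, n_emb, n_classes = 400, 100, 16, 8

# Frozen stand-in for a pretrained text encoder: raw count vector -> embedding.
W_backbone = 0.1 * rng.normal(size=(n_vocab, n_emb))  # never updated

def encode(x):
    return np.tanh(x @ W_backbone)

# Synthetic "emails": each class over-uses its own 10-word vocabulary slice.
labels = rng.integers(0, n_classes, n)
X = rng.poisson(0.3, (n, n_vocab)).astype(float)
for i, y in enumerate(labels):
    X[i, y * 10:(y + 1) * 10] += 3.0

# Transfer learning: train only a new softmax head on the frozen embeddings.
Z = encode(X)
onehot = np.eye(n_classes)[labels]
W_head = np.zeros((n_emb, n_classes))
for _ in range(500):
    logits = Z @ W_head
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    W_head -= 0.5 * Z.T @ (p - onehot) / n  # cross-entropy gradient step

accuracy = (np.argmax(Z @ W_head, axis=1) == labels).mean()
print(round(accuracy, 2))
```

With a real pretrained encoder the same structure applies: the backbone weights stay frozen (or are fine-tuned at a small learning rate) while the small task-specific head is trained on the 30,000 labeled examples.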
You are building an anomaly detection model for an industrial IoT platform using Keras and TensorFlow. The last 24 months of sensor events (~900 million rows, ~2.6 TB) are stored in a single partitioned table in BigQuery, and you need to apply feature scaling, categorical encoding, and time-window aggregations in a cost-effective and efficient way before training. The trained model will be used to run weekly batch inference directly in BigQuery against newly ingested partitions. How should you implement the preprocessing workflow?
Dataproc/Spark can preprocess large datasets, but exporting transformed Parquet to Cloud Storage introduces extra data movement, storage management, and pipeline operations. It also risks training/serving skew because weekly inference is required to run directly in BigQuery; you would need to re-implement the same feature logic in BigQuery or continuously export new partitions. This is typically less cost-effective and less consistent than doing feature engineering in BigQuery.
Loading 2.6 TB (900M rows) into a local pandas DataFrame is infeasible due to memory and compute constraints and would be extremely slow and costly. It also breaks scalability and operational best practices for production ML. This option ignores distributed processing and BigQuery’s strengths, and it would not support ongoing weekly inference in BigQuery without duplicating preprocessing logic elsewhere.
BigQuery SQL is ideal for feature scaling, categorical encoding, and time-window aggregations at this scale using partition pruning, clustering, and window functions. Keeping preprocessing in BigQuery reduces data movement and supports consistent feature definitions for both training and weekly batch inference on new partitions. Using the TensorFlow I/O BigQuery connector to feed a tf.data pipeline enables scalable training input without exporting massive intermediate files.
Dataflow/Beam is strong for streaming and ETL, but writing preprocessed data as CSV is inefficient (large size, slow parsing, poor typing) and increases storage and pipeline overhead. Like option A, it also complicates training/serving consistency because inference must run in BigQuery; you would still need equivalent SQL feature logic or repeated exports for new partitions, increasing cost and operational complexity.
Core Concept: This question tests scalable feature engineering and training data input pipelines when the source of truth is BigQuery and inference will run in BigQuery. It emphasizes pushing preprocessing to the data (BigQuery SQL) and using efficient, distributed ingestion into TensorFlow.

Why the Answer is Correct: Option C aligns the entire workflow with BigQuery as the central analytical engine. BigQuery is well-suited for large-scale transformations (2.6 TB, 900M rows) using partition pruning, clustering, window functions, and SQL-based feature engineering. Doing scaling, categorical encoding, and time-window aggregations in BigQuery is cost-effective because you can restrict scans to relevant partitions (e.g., the last 24 months) and materialize features into a derived table or view. For training, the TensorFlow I/O BigQuery connector (or equivalent BigQuery-to-tf.data integration) enables streaming data into a tf.data pipeline without exporting massive intermediate files, supporting shuffling, batching, and parallel reads. This also keeps the feature logic consistent with weekly batch inference "directly in BigQuery" (e.g., via BigQuery ML remote models or by applying the same SQL feature view to new partitions).

Key Features / Best Practices:
- Use partitioned tables and WHERE filters on partition columns to minimize bytes scanned and cost.
- Use window functions (e.g., SUM/AVG over time windows) and APPROX functions where appropriate for performance.
- Materialize engineered features into a partitioned/clustered feature table to avoid recomputation and improve repeatability.
- Ensure training/serving consistency by reusing the same SQL feature definitions for both training and weekly inference.
- Follow Google Cloud Architecture Framework principles: optimize cost (partition pruning), performance (BigQuery's distributed execution), and operational excellence (a single source of feature truth).

Common Misconceptions: Spark/Dataflow pipelines can be powerful, but exporting large intermediate datasets often increases operational overhead, storage costs, and risks training/serving skew if inference is done in BigQuery with different logic. CSV exports are especially inefficient at this scale.

Exam Tips: When data is already in BigQuery and inference will run in BigQuery, prefer SQL-based feature engineering and avoid unnecessary ETL exports. Look for answers that minimize data movement, leverage partitioning/clustering, and keep preprocessing logic consistent across training and serving.
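In production the time-window aggregation would run as a SQL window function in BigQuery (e.g., AVG(value) OVER (ORDER BY ts RANGE BETWEEN 30 PRECEDING AND CURRENT ROW)). As a language-neutral illustration of the same trailing-window semantics, here is a small Python sketch; the event times and 30-second window are made up:

```python
from collections import deque

def rolling_window_mean(events, window_s):
    """Trailing-window mean per event over (timestamp, value) pairs,
    matching the inclusive RANGE frame semantics of a SQL window function.
    events must be sorted by timestamp."""
    window = deque()   # (ts, value) pairs currently inside the window
    total = 0.0
    out = []
    for ts, value in events:
        window.append((ts, value))
        total += value
        while window[0][0] < ts - window_s:   # evict rows older than the frame
            total -= window.popleft()[1]
        out.append(total / len(window))
    return out

events = [(0, 1.0), (10, 3.0), (25, 5.0), (40, 7.0)]
print(rolling_window_mean(events, window_s=30))  # → [1.0, 2.0, 3.0, 5.0]
```

Pushing this computation into BigQuery instead of client code is what keeps the training and weekly-inference feature definitions identical and avoids scanning 2.6 TB outside the warehouse.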
You are building an MLOps workflow for a smart-city traffic analytics project that stitches together data preprocessing, model training, and model deployment across different Google Cloud services. Traffic cameras upload 40–60 JSONL files (~50 MB each) per hour into a Cloud Storage bucket named gs://city-traffic-raw, with bursty arrivals. You have already written code for each task, and you now need an orchestration layer that runs only when new files have arrived since the last successful run, while minimizing always-on compute costs for orchestration. What should you do?
This option best matches the requirements because Vertex AI Pipelines is the managed orchestration service purpose-built for ML workflows spanning preprocessing, training, and deployment. Cloud Scheduler is a lightweight, low-cost trigger that avoids maintaining an always-on orchestration environment, and the first pipeline step can check a stored watermark or manifest from the last successful run to determine whether any new files have arrived. That design satisfies the requirement to run only when there is new data while still minimizing orchestration cost. It also handles bursty arrivals better than per-object event triggering because the workflow can batch work into periodic runs.
This is not an appropriate pattern because Cloud Functions do not typically deploy a new Cloud Composer DAG in response to storage events; DAGs are normally pre-authored and scheduled within an existing Composer environment. More importantly, Cloud Composer is a long-lived Airflow environment with ongoing baseline cost, which conflicts with the requirement to minimize always-on orchestration compute. A per-file trigger would also react to each object arrival rather than naturally reasoning about all files since the last successful run. For an ML workflow on Google Cloud, Vertex AI Pipelines is the more suitable orchestrator.
A Cloud Storage-triggered Cloud Function would fire on every object creation event, so with 40–60 files per hour and bursty arrivals, this could launch many separate pipeline runs. That does not inherently satisfy the requirement to run only when new files have arrived since the last successful run as a single coordinated batch unless additional debouncing, locking, and aggregation logic is built. The option does not mention such coordination, so it is incomplete and operationally risky. It may also create duplicate or overlapping runs during bursts.
Cloud Composer can orchestrate the workflow, but it requires an always-on Airflow environment with persistent scheduler and worker resources. That directly conflicts with the requirement to minimize always-on compute costs for orchestration. Using a GCSObjectUpdateSensor also implies a heavier polling/sensor-based pattern than necessary for this use case. While technically feasible, it is less cost-effective and less aligned with Google Cloud's managed ML orchestration approach than Vertex AI Pipelines plus a lightweight trigger.
Core concept: This question is about choosing an orchestration mechanism for an ML workflow that minimizes always-on orchestration cost while ensuring the workflow runs only when there is new data to process. The key tradeoff is between event-driven triggers and managed orchestration services such as Vertex AI Pipelines versus always-on workflow engines like Cloud Composer.

Why correct: Option A is the best answer because Vertex AI Pipelines is the native managed orchestration service for ML workflows on Google Cloud, and Cloud Scheduler is a lightweight managed trigger rather than an always-on compute environment. By adding a first pipeline step that compares current bucket contents against the last successful run state, the pipeline can determine whether new files have arrived and proceed only when needed. This design satisfies the requirement to process only new data while avoiding the continuous baseline cost of Cloud Composer.

Key features:
- Vertex AI Pipelines provides managed ML workflow orchestration, metadata tracking, retries, and integration with training and deployment services.
- Cloud Scheduler is inexpensive and serverless from the user's perspective, making it suitable for periodic checks without maintaining orchestration infrastructure.
- A watermark, manifest, or last-processed timestamp can be stored to identify files that arrived since the previous successful run.
- The pipeline can short-circuit early when no new files are detected, reducing unnecessary downstream compute.

Common misconceptions:
- Event-driven triggers on every object creation sound attractive, but they can create excessive pipeline runs when many files arrive in bursts unless additional aggregation logic is introduced.
- Cloud Composer is powerful, but it is not cost-optimal when the requirement explicitly emphasizes minimizing always-on orchestration cost.
- A Cloud Storage trigger alone does not inherently solve the need to reason about the last successful run across a batch of files.

Exam tips: On Google Cloud ML exam questions, prefer Vertex AI Pipelines for ML orchestration unless there is a strong reason to use Composer. When the requirement emphasizes minimizing always-on infrastructure, avoid Composer and sensor-based polling. If the workflow must process only newly arrived data since the last successful run, look for an explicit state-checking or watermarking step.
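The state-checking first step can be as simple as a watermark comparison. A minimal sketch, assuming the bucket listing is available as (name, updated-timestamp) pairs and the watermark from the last successful run is persisted somewhere durable (all names below are invented for illustration):

```python
def new_files_since(objects, last_watermark):
    """Return files newer than the last successful run, plus the new watermark.
    objects: iterable of (name, updated_ts) pairs from a bucket listing."""
    fresh = [(name, ts) for name, ts in objects if ts > last_watermark]
    if not fresh:
        return [], last_watermark          # short-circuit: nothing to process
    fresh.sort(key=lambda o: o[1])         # process in arrival order
    return [name for name, _ in fresh], fresh[-1][1]

listing = [("cam1/a.jsonl", 100), ("cam2/b.jsonl", 160), ("cam3/c.jsonl", 210)]
files, wm = new_files_since(listing, last_watermark=150)
print(files, wm)  # ['cam2/b.jsonl', 'cam3/c.jsonl'] 210
```

Advancing the watermark only after a successful run is what lets a failed run's files be picked up again on the next scheduled trigger, and why bursts of 40–60 files collapse into a single batched execution.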
You work for a real-time multiplayer gaming company. You must design a system that stores and manages player telemetry features (e.g., positions, actions, and matches completed) and server locations over time. The system must provide sub-50 ms online retrieval of the latest features to feed a fraud-detection model for live inference, while the data science team must retrieve a point-in-time consistent snapshot of historical features (e.g., as-of a given timestamp) for training and backtesting. The solution should handle ingestion of approximately 200 million feature rows per day, support feature versioning, and require minimal operational effort. What should you do?
Cloud Bigtable can deliver very low-latency key/value reads and can handle high write throughput, so it may appear suitable for online feature retrieval. However, it lacks native feature-store capabilities: point-in-time consistent historical retrieval, feature definitions/metadata, and managed feature versioning workflows. You would need to design row-key schemas, maintain multiple versions, manage retention, and build offline training extracts yourself, increasing operational effort and risk of leakage.
Vertex AI Feature Store is designed for exactly this use case: centralized feature management with an online store for low-latency retrieval of the latest features for live inference and an offline store for historical feature access for training/backtesting with point-in-time correctness. It supports feature governance, reuse, and versioning/management patterns while minimizing operational overhead through a managed service, aligning with best practices to prevent training-serving skew.
Vertex AI Datasets help organize and version training datasets, but they are not an online feature serving system. They do not provide sub-50 ms key-based retrieval of the latest features for real-time inference, nor do they provide feature-store semantics like point-in-time feature retrieval across entities. You would still need a separate online store and custom pipelines for feature computation, serving, and historical reconstruction.
BigQuery timestamp-partitioned tables are strong for offline analytics and can store 200M rows/day efficiently, but BigQuery is not intended for ultra-low-latency online feature serving. The BigQuery Storage Read API is optimized for high-throughput batch reads (e.g., training pipelines), not per-request millisecond lookups for live inference. Achieving sub-50 ms consistently would typically require caching or a dedicated online store, increasing complexity.
Core Concept: This question tests selecting the right managed "feature store" pattern: low-latency online feature serving for real-time inference plus point-in-time correct historical retrieval for training/backtesting (to avoid training-serving skew and label leakage), at high ingestion scale with minimal ops.

Why the Answer is Correct: Vertex AI Feature Store is purpose-built to store, manage, and serve ML features. It supports an online store optimized for millisecond retrieval of the latest feature values (meeting sub-50 ms needs) and an offline store for historical feature access used in training and backtesting. Critically, it is designed to provide point-in-time feature retrieval semantics so data scientists can build "as-of timestamp" datasets that are consistent with what would have been known at that time. It also supports feature definitions/metadata and feature versioning/management workflows, reducing operational burden compared to building custom pipelines.

Key Features / Configurations / Best Practices:
- Online serving: entity-keyed lookup for latest feature values with low latency; integrate with live inference (e.g., Vertex AI endpoints) without custom caching layers.
- Offline access: export/query historical features for training; supports time-based correctness to reduce leakage.
- Feature management: centralized feature definitions, monitoring/metadata, and reuse across teams (aligns with the Google Cloud Architecture Framework: operational excellence, reliability, and security via managed services and governed reuse).
- Scale: designed for high-throughput ingestion (hundreds of millions of rows/day is a common feature-store use case), with managed scaling and reduced SRE overhead.

Common Misconceptions:
- Bigtable can meet low-latency online reads, but it does not natively provide point-in-time consistent historical feature retrieval and feature-store semantics; you'd need to engineer versioning, TTLs, backfills, and "as-of" joins yourself.
- BigQuery is excellent for offline analytics/training, but it is not intended for sub-50 ms per-request online serving at scale; the Storage Read API is for high-throughput batch reads, not low-latency key-based serving.
- Vertex AI Datasets are for managing training data artifacts, not for online feature serving or point-in-time feature retrieval.

Exam Tips: When you see requirements for (1) low-latency online feature lookup, (2) historical point-in-time correctness for training/backtesting, and (3) minimal ops with feature governance/versioning, the canonical answer is Vertex AI Feature Store. Choose Bigtable only when the question is purely about key/value low-latency storage without feature-store requirements.
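The two read paths a feature store must support can be sketched with a tiny in-memory structure. This illustrates latest vs. as-of semantics only, not how Vertex AI Feature Store is implemented; timestamps are assumed unique per entity:

```python
import bisect

class FeatureHistory:
    """Timestamped feature values per entity, supporting latest and as-of reads."""

    def __init__(self):
        self._rows = {}  # entity_id -> sorted list of (ts, value)

    def write(self, entity_id, ts, value):
        # Timestamps are assumed unique per entity, so tuples sort by ts alone.
        bisect.insort(self._rows.setdefault(entity_id, []), (ts, value))

    def read_latest(self, entity_id):
        """Online serving path: most recent value, O(1)."""
        return self._rows[entity_id][-1][1]

    def read_as_of(self, entity_id, ts):
        """Offline/backtest path: last value known at or before ts."""
        rows = self._rows[entity_id]
        i = bisect.bisect_right(rows, (ts, float("inf")))
        if i == 0:
            return None                      # feature did not exist yet at ts
        return rows[i - 1][1]

h = FeatureHistory()
h.write("player42", 100, {"matches": 3})
h.write("player42", 200, {"matches": 7})
print(h.read_latest("player42"))       # {'matches': 7}
print(h.read_as_of("player42", 150))   # {'matches': 3}
```

Returning the value known "as of" the training label's timestamp, rather than the latest value, is exactly what prevents the label leakage the explanation warns about.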
You are setting up a weekly demand-forecasting workflow for a nationwide grocery chain: you train a custom model on 85 GB of historical sales data stored in Cloud Storage and produce about 6 million batch predictions per run. Compliance requires an auditable end-to-end lineage that links the exact training data snapshot, the resulting model artifact, and each weekly batch prediction job for at least 90 days. What should you do to ensure this lineage is automatically captured across training and prediction?
Partially addresses governance by using Vertex AI-managed resources, but it’s not the strongest/most explicit guarantee of automatic end-to-end lineage across both training and weekly batch prediction. “Vertex AI training pipeline” is ambiguous (could be interpreted as a training job rather than Vertex AI Pipelines). Without explicitly using Vertex AI Pipelines/Metadata-integrated components, you may not get a complete lineage graph linking data snapshot, model artifact, and prediction job executions automatically.
Correct. Vertex AI Pipelines orchestrates the workflow and automatically records executions, parameters, and artifacts in Vertex AI Metadata. Using a custom training job component captures the training inputs (including the referenced GCS snapshot) and outputs (model artifact). Using the batch predict component captures the batch prediction job configuration and output artifacts. This provides an auditable lineage chain across weekly runs with minimal custom tracking code.
BigQuery can help with data governance, but this option relies on custom SDK prediction routines and a standalone custom training job, which typically requires manual logging to achieve end-to-end lineage. You might capture some metadata in BigQuery or logs, but it won’t automatically create a unified lineage graph linking the exact training snapshot, the registered model artifact, and each batch prediction job in a consistent, audit-friendly way.
Vertex AI Experiments is useful for tracking metrics, parameters, and comparisons during model development, and Model Registry helps manage model versions. However, Experiments does not automatically capture full lineage across batch prediction jobs and their outputs. You would still need additional orchestration/metadata wiring to link the exact training data snapshot to the model and to each weekly batch prediction execution in an auditable lineage graph.
Core Concept: This question tests Vertex AI lineage/metadata capture across an end-to-end ML workflow. In Google Cloud, auditable lineage is best achieved by running training and batch prediction as steps in Vertex AI Pipelines (Kubeflow Pipelines on Vertex AI), which automatically records executions, inputs/outputs, and artifacts in Vertex AI Metadata (MLMD).
Why the Answer is Correct: Compliance requires an auditable linkage between (1) the exact training data snapshot, (2) the produced model artifact, and (3) each weekly batch prediction job, retained for 90 days. Vertex AI Pipelines provides automatic, system-managed tracking of pipeline runs and component executions, including artifact URIs (for example, Cloud Storage paths), parameters, and produced artifacts (model, metrics, batch prediction outputs). When you use standard pipeline components (the custom training job component and the batch prediction component), Vertex AI records the relationships in Metadata without you building your own logging/lineage system. This creates a queryable lineage graph tying the dataset version/snapshot reference used at training time to the resulting model and to each subsequent batch prediction run.
Key Features / Best Practices:
- Use Vertex AI Pipelines for orchestration and reproducibility; each weekly run is a pipeline execution with immutable recorded inputs/outputs.
- Ensure the pipeline passes explicit data snapshot identifiers (for example, a dated GCS prefix or object generation) as parameters so the exact training data reference is captured.
- Use the Vertex AI Batch Prediction job component so prediction job configuration and output locations are captured as artifacts.
- Retention: Vertex AI Metadata stores lineage for auditing; align project-level retention/governance policies to meet the 90-day requirement.
Common Misconceptions:
- “Managed dataset + training pipeline + batch prediction” (Option A) sounds right, but “Vertex AI training pipeline” is ambiguous; lineage is most reliably and automatically captured when both training and prediction are executed within Vertex AI Pipelines/Metadata, not merely by using separate Vertex AI services.
- Vertex AI Experiments (Option D) tracks experiment runs/metrics, but it is not a complete, automatic end-to-end lineage solution for batch prediction jobs.
Exam Tips: When you see requirements like “auditable lineage,” “end-to-end traceability,” and “automatically captured,” think Vertex AI Pipelines + Vertex AI Metadata. Prefer built-in pipeline components (training and batch prediction) over ad-hoc SDK scripts, because components integrate with Metadata and produce a consistent lineage graph.
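The best practice of passing an explicit, dated snapshot identifier as a pipeline parameter can be sketched in plain Python. This is only an illustration of the parameter-building step (the bucket name and prefix layout are hypothetical, not a Vertex AI API):

```python
from datetime import date

def snapshot_uri(bucket: str, run_date: date) -> str:
    """Build an immutable, dated Cloud Storage prefix for the weekly
    training snapshot. Passing this value as a pipeline parameter means
    Vertex AI Metadata records exactly which data trained each model."""
    return f"gs://{bucket}/sales/snapshot_date={run_date.isoformat()}/"

# Each weekly pipeline run receives this URI as an explicit input,
# so the lineage graph ties run -> snapshot -> model -> predictions.
uri = snapshot_uri("demand-forecast-data", date(2024, 6, 3))
print(uri)  # gs://demand-forecast-data/sales/snapshot_date=2024-06-03/
```

Using a dated, write-once prefix (or a GCS object generation number) rather than a mutable "latest/" path is what makes the recorded lineage auditable: the reference in Metadata always points at the exact bytes that were used.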
Your analytics guild is preparing a time-boxed 3-week prototype, and you must provide a shared Vertex AI Workbench user-managed notebook VM in us-central1 for exactly 8 external contractors while preventing the other 500 project users from opening or running the environment. You will provision the notebook instance yourself and need to follow least-privilege and ensure that notebook code can call Vertex AI APIs during experiments. What should you do to configure access correctly?
This is the best answer among the options because it creates a dedicated service account for the notebook environment instead of reusing the default Compute Engine service account. Granting Service Account User to the 8 contractors limits who can use that runtime identity, which helps prevent the other 500 project users from operating the environment. Granting Vertex AI User to the contractors enables them to work with Vertex AI resources during the prototype, and the dedicated service account design aligns with least-privilege and easier post-project cleanup.
This option relies on the default Compute Engine service account, which is generally discouraged for least-privilege designs because it is commonly reused across workloads. Even if it can technically allow notebook code to call Vertex AI APIs, it increases blast radius and makes access harder to isolate for a short-lived contractor prototype. The question explicitly emphasizes least-privilege, so a dedicated service account is the better pattern.
This option is flawed because it grants Notebook Viewer to the contractors, which is a read-only role and does not let them open and run a shared notebook environment for active experimentation. Although using a dedicated service account with Vertex AI User is a good idea, the human-access portion is insufficient, so the overall configuration is not correct.
This option is incorrect because a user-managed notebook instance should not run under an individual contractor's personal identity. Tying the runtime to one lead contractor creates operational risk, poor continuity, and weak separation between human and workload identities. It also does not properly grant the rest of the contractors the permissions needed to use the notebook environment or ensure a clean least-privilege design.
Core concept: For Vertex AI Workbench user-managed notebooks, you must distinguish between the VM's runtime identity and the human users who need to access the notebook. The attached service account determines what notebook code can call in Google Cloud APIs, while IAM roles granted to the contractors determine whether they can open and use the notebook environment. In a least-privilege design, you should use a dedicated service account for the notebook rather than the default Compute Engine service account.
Why correct: Option A is the best available answer because it uses a dedicated service account and allows only the 8 contractors to act as that service account via Service Account User. It also grants Vertex AI User to the contractors so they can interact with Vertex AI resources during experimentation, and avoids using the default Compute Engine service account. Among the choices, it is the only one that both uses a dedicated service account and avoids the clearly incorrect read-only notebook access pattern in C.
Key features:
- A dedicated service account for the notebook VM reduces blast radius and simplifies cleanup after the 3-week prototype.
- Service Account User on that service account restricts who can use the notebook runtime identity.
- Vertex AI User enables the contractors to work with Vertex AI resources needed during experiments.
- Avoiding the default Compute Engine service account is a standard least-privilege best practice.
Common misconceptions:
- Granting Notebook Viewer alone does not let users run or fully use a notebook instance; it is read-only and insufficient for active experimentation.
- Granting Vertex AI permissions only to the service account does not by itself guarantee the human users can access and operate the notebook environment.
- Using the default Compute Engine service account is convenient but usually violates least-privilege because it is shared and often overused.
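The access design above can be summarized as a small set of IAM bindings. In this sketch the role IDs are the real predefined Google Cloud roles, but the project, service account, and contractor identities are hypothetical placeholders:

```python
# Hedged sketch of the intended IAM bindings for the 3-week prototype.
# Role IDs are real predefined roles; all principals are placeholders.
CONTRACTORS = [f"user:contractor{i}@example.com" for i in range(1, 9)]

bindings = [
    # On the dedicated notebook service account: only the 8 contractors
    # may act as the VM's runtime identity.
    {"resource": "serviceAccount:nb-prototype-sa@my-project.iam.gserviceaccount.com",
     "role": "roles/iam.serviceAccountUser",
     "members": CONTRACTORS},
    # At the project level: the contractors can use Vertex AI resources
    # during experiments.
    {"resource": "project:my-project",
     "role": "roles/aiplatform.user",
     "members": CONTRACTORS},
]

# The other 500 project users appear in neither binding, so they cannot
# act as the notebook's service account or open the environment.
print([b["role"] for b in bindings])
```

The key design point is that the two grants have different scopes: Service Account User is bound on the service account resource itself (not project-wide), which is what keeps the runtime identity restricted to the 8 contractors.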
Exam tips: - For Workbench questions, separate human access to the notebook from API permissions used by code running in the notebook. - Prefer a dedicated service account over the default Compute Engine service account whenever least-privilege is emphasized. - Be wary of Viewer roles in hands-on notebook scenarios; they often do not provide enough access to open or run the environment.
You are training custom models with Vertex AI Training to classify defects in 12-megapixel manufacturing photos, and each week you swap in new neural architectures from research to benchmark them on the same fixed 600 GB dataset; you want automatic retraining to occur only when code changes are pushed to the main branch, keep full version control of code and build artifacts, and minimize costs by avoiding always-on orchestration or manual steps. What should you do to meet these requirements?
Cloud Functions can react to Cloud Storage object finalize events and could trigger Vertex AI training. However, the requirement is to retrain only when code changes are pushed to the main branch and to keep full version control of code and build artifacts. A bucket-based workflow is weaker for Git branch semantics, commit traceability, and artifact provenance, and can trigger on non-meaningful object updates.
Manually running gcloud to submit Vertex AI training jobs does not meet the requirement to minimize manual steps and ensure automatic retraining on code pushes. It also reduces consistency and auditability because humans may forget to retrain, use the wrong parameters, or submit from an untracked local state, undermining reproducibility and governance.
Cloud Build can be connected to a Git-based source repository and configured with triggers that fire only when code is pushed to the main branch. In the build steps, you can build and version the training container image, store it in Artifact Registry, and then submit a Vertex AI custom training job that uses the fixed dataset. This approach gives strong traceability from commit SHA to built artifact to training execution, and it avoids manual steps or the cost of an always-running orchestration environment.
Cloud Composer (managed Airflow) can poll for changes and launch training, but it is an always-on orchestration service with continuous environment costs. A daily polling DAG also violates the “only when code changes are pushed to main” intent because it is schedule-based and can introduce delay. Composer is better for complex, multi-stage pipelines, not simple Git-triggered retraining.
Core concept: This question is about event-driven ML retraining using CI/CD. The retraining should happen only when source code changes are pushed to the main branch, while preserving version control for code and build artifacts and avoiding always-on orchestration costs.
Why correct: Cloud Build is the best fit because it can be connected to a source code repository and configured with branch-based triggers so that a build runs only on pushes to main. That build can package the training code, build and version a container image in Artifact Registry, and then submit a Vertex AI custom training job against the fixed 600 GB dataset. This provides reproducibility and traceability from commit to artifact to training run without requiring manual intervention or a continuously running orchestration service.
Key features:
- Event-driven triggers based on Git pushes to a specific branch.
- Integration with Artifact Registry for versioned build artifacts such as training container images.
- Ability to invoke Vertex AI Training from the build pipeline using gcloud or API calls.
- Pay-per-use execution model, which is cheaper than maintaining an always-on workflow engine for this simple trigger pattern.
Common misconceptions:
- Cloud Storage object events are not the same as Git branch-aware source control events and do not provide the same code provenance.
- Manual job submission with gcloud is operationally simple but does not satisfy automation requirements.
- Cloud Composer is useful for complex DAG orchestration, but it is unnecessary and more expensive for a straightforward source-triggered retraining workflow.
Exam tips: When the requirement is 'retrain on code push' with artifact versioning and low operational overhead, think CI/CD tooling such as Cloud Build triggers plus Artifact Registry and Vertex AI Training. Prefer event-driven builds over polling or always-on orchestration when the workflow is simple.
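To make the commit-to-artifact-to-training chain concrete, here is a sketch of the trigger's build steps, expressed as Python dicts mirroring the cloudbuild.yaml structure. The builder images and the `gcloud ai custom-jobs create` command are real; the project, repository, image names, and region are hypothetical:

```python
# Hedged sketch: shape of a cloudbuild.yaml for Git-triggered retraining.
# $COMMIT_SHA is a built-in Cloud Build substitution, which gives every
# image (and training run) traceability back to the exact commit.
IMAGE = "us-central1-docker.pkg.dev/my-project/ml-images/defect-trainer:$COMMIT_SHA"

build_steps = [
    # 1. Build the training container from the pushed commit.
    {"name": "gcr.io/cloud-builders/docker",
     "args": ["build", "-t", IMAGE, "."]},
    # 2. Version the artifact in Artifact Registry, tagged by commit SHA.
    {"name": "gcr.io/cloud-builders/docker",
     "args": ["push", IMAGE]},
    # 3. Submit a Vertex AI custom training job that uses that exact image
    #    against the fixed dataset in Cloud Storage.
    {"name": "gcr.io/google.com/cloudsdktool/cloud-sdk",
     "entrypoint": "gcloud",
     "args": ["ai", "custom-jobs", "create",
              "--region=us-central1",
              "--display-name=defect-train-$COMMIT_SHA",
              "--worker-pool-spec=machine-type=n1-standard-8,"
              "replica-count=1,container-image-uri=" + IMAGE]},
]

for step in build_steps:
    print(step["name"])
```

The trigger itself is configured on the repository with a branch filter such as `^main$`, so pushes to other branches build nothing and no orchestration environment runs between pushes.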
You are organizing a 24-hour internal ML sprint for a team of 12 data scientists who need to explore and prototype PySpark and Spark SQL transformations on 40 TB of Parquet data stored in Cloud Storage. The environment must be accessible via web-based notebooks, support distributed Spark execution out of the box, and require minimal setup with no manual package installs. What is the fastest way to provide a robust, scalable notebook environment for this sprint?
Vertex AI Workbench managed notebooks provide a strong web-based notebook experience, but they do not inherently provide a distributed Spark runtime. To run PySpark at scale, you typically need to connect to a Dataproc cluster or configure Spark yourself, plus manage kernels/connectors. That adds setup steps and potential friction for a 24-hour sprint, violating the “distributed Spark out of the box” and “minimal setup” requirements.
Colab Enterprise is convenient for managed notebooks, but distributed Spark on 40 TB is not its primary, turnkey use case. You would still need a Spark backend (commonly Dataproc) and additional configuration to ensure stable cluster resources, networking, and consistent dependencies for 12 users. It can work, but it’s not the fastest, most robust path compared to Dataproc’s native Spark + notebook components.
Dataproc is Google Cloud’s managed Spark service and is designed for large-scale PySpark/Spark SQL workloads reading from Cloud Storage. Enabling the Jupyter optional component gives immediate, browser-based notebooks running directly on the cluster with Spark preconfigured. This meets all constraints: web notebooks, distributed execution out of the box, minimal setup, and easy scaling for 12 data scientists exploring 40 TB of Parquet.
A single Compute Engine VM with manually installed Spark and Jupyter is the slowest and riskiest approach for a time-boxed sprint. It requires manual installation, dependency management, and operational work (security, user access, scaling). It also does not provide robust distributed Spark execution unless you build and manage a multi-node Spark cluster yourself, which contradicts the “minimal setup” requirement.
Core Concept: This question tests selecting the fastest, lowest-friction environment for interactive, web-based notebooks that can run distributed PySpark/Spark SQL at scale against large datasets in Cloud Storage. The key services are Dataproc (managed Spark/Hadoop) and notebook front ends (Jupyter).
Why the Answer is Correct: A Dataproc cluster with the Jupyter optional component provides a ready-to-use, web-accessible notebook UI that is already integrated with a properly configured Spark runtime (drivers/executors, YARN, Spark SQL, connectors). For a 24-hour sprint on 40 TB of Parquet in Cloud Storage, Dataproc is purpose-built: it can scale horizontally, read Parquet efficiently, and supports Spark SQL out of the box. It also minimizes setup: no manual package installs, no custom kernels, and no ad hoc cluster wiring. You can create the cluster in minutes, enable autoscaling, and give the team immediate access.
Key Features / Best Practices:
- Dataproc optional components: Jupyter/JupyterLab provides browser notebooks hosted on the cluster.
- Native Spark + Spark SQL: preinstalled and configured; consistent environment for all 12 users.
- Cloud Storage connector: standard for Dataproc, enabling direct reads of Parquet from gs:// without copying data.
- Scalability: resize the cluster or use autoscaling policies to handle concurrent exploration; consider preemptible/spot workers for cost during a short sprint.
- IAM and network: use least privilege (Storage Object Viewer on the bucket), and consider private IP + IAP/authorized networks for notebook access.
Common Misconceptions:
- Vertex AI Workbench is excellent for notebooks, but it does not provide distributed Spark “out of the box”; you typically still need a Spark backend (often Dataproc) and additional configuration (kernels/connectors).
- Colab Enterprise is great for Python notebooks but is not the standard, turnkey solution for distributed Spark on large data without additional setup and constraints.
- A manual VM build is slow, brittle, and not scalable.
Exam Tips: When you see “PySpark/Spark SQL,” “distributed execution,” “minimal setup,” and “large data in Cloud Storage,” Dataproc is the default managed Spark answer. If the question emphasizes “web notebooks on the cluster” and “out of the box Spark,” look for Dataproc + Jupyter optional component. If it emphasizes “managed notebook + connect to Spark,” then Workbench + Dataproc might appear, but that is not minimal setup compared to Dataproc’s built-in notebook option.
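Two cluster settings do most of the work in this answer: the Jupyter optional component and Component Gateway for browser access. This sketch shows those fields in dict form (the field names match the Dataproc v1 API's ClusterConfig; the machine types and worker count are hypothetical sizing for the sprint):

```python
# Hedged sketch of a Dataproc ClusterConfig for the sprint. Field names
# follow the Dataproc v1 API; sizes and counts are illustrative only.
cluster_config = {
    "master_config": {"num_instances": 1,
                      "machine_type_uri": "n1-standard-8"},
    "worker_config": {"num_instances": 8,
                      "machine_type_uri": "n1-highmem-16"},
    # Jupyter comes preinstalled and wired to the cluster's Spark runtime,
    # so PySpark/Spark SQL kernels work with no manual package installs.
    "software_config": {"optional_components": ["JUPYTER"]},
    # Component Gateway exposes the notebook UI securely in the browser.
    "endpoint_config": {"enable_http_port_access": True},
}

# Equivalent gcloud flags when creating the cluster:
#   --optional-components=JUPYTER --enable-component-gateway
print(cluster_config["software_config"]["optional_components"])  # ['JUPYTER']
```

Once the cluster is up, notebooks read the Parquet directly via the preinstalled Cloud Storage connector, e.g. `spark.read.parquet("gs://bucket/path/")`, with no data copying.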