
Simulate the real exam with 65 questions and a 130-minute time limit. Study with AI-verified answers and detailed explanations.
AI-powered
All answers are cross-validated by three leading AI models to ensure the highest accuracy. Detailed per-option explanations and in-depth question analysis are provided.
A streaming media company runs six production studios across five AWS Regions. Each studio's compliance team uses a distinct IAM role, and all raw subtitle files and QC logs are consolidated in a single Amazon S3 data lake partitioned by aws_region (for example, s3://media-lake/raw/aws_region=eu-central-1/). With the least operational overhead, and without creating new buckets or duplicating data, the data engineering team must ensure that each studio can query only the records from its own Region via services like Amazon Athena. Which combination of steps should the team take? (Choose two.)
Incorrect. Lake Formation data filters are used to scope permissions (row/column filtering) on Data Catalog tables, not to register S3 prefixes as data locations. Registering data locations is a separate Lake Formation action (bucket or prefix registration). While prefixes can be registered, the mechanism is not “using data filters.” This option conflates two different Lake Formation features.
Correct. Registering the S3 bucket or the specific prefix as a Lake Formation data location is a prerequisite for Lake Formation-governed access to the underlying objects. It enables Lake Formation to manage access through its service-linked role and enforce permissions for services like Athena. This step supports centralized governance without creating new buckets or duplicating data.
Incorrect. You do not attach a Lake Formation data filter to an IAM role. Instead, you create a data filter in Lake Formation and then grant Lake Formation permissions to an IAM principal (role/user) referencing that filter. IAM policies can allow/deny API actions, but the row/partition restriction is enforced by Lake Formation permission grants, not by attaching filters to IAM roles.
Correct. Enabling fine-grained access control and creating a Region-based data filter (e.g., aws_region = 'us-east-1') allows Lake Formation to enforce row-level restrictions so each studio’s Athena queries only return records for its Region. Granting each studio’s IAM role permissions using the appropriate filter meets the requirement for least operational overhead and avoids new buckets or data duplication.
Incorrect. Creating separate buckets per Region violates the requirement to avoid new buckets and data duplication. Even if implemented with S3 prefix/bucket IAM policies, it provides coarse object-level access control rather than query-time row-level governance. It also increases operational overhead (more buckets, replication/ingestion changes, more policies) compared to centralized Lake Formation governance.
Core concept: This question tests AWS Lake Formation governance for an S3-based data lake queried by Athena, specifically fine-grained access control (FGAC) using data filters (row/column-level security) without duplicating data or creating new buckets.

Why the answer is correct: To restrict each studio to only its own Region's partition (aws_region=...), the data engineering team should use Lake Formation to centrally govern access to the shared table. First, the S3 bucket/prefix that contains the data must be registered as a Lake Formation data location so Lake Formation can enforce permissions and manage access through its service-linked role (and optionally via "data location permissions"). Second, enable FGAC, create a Region-based data filter (e.g., filter expression aws_region = 'eu-central-1'), and grant each studio's IAM role permissions on the table using the appropriate data filter. This ensures Athena queries return only rows for that Region, with minimal operational overhead and no data duplication.

Key AWS features and configurations:
- Lake Formation data locations: Register the S3 bucket/prefix used by the data lake so Lake Formation can control access to the underlying objects.
- Data filters: Provide row-level and column-level filtering for governed tables. For partitioned data, filters can effectively limit access to specific partitions (e.g., aws_region).
- Grants to IAM principals: You grant Lake Formation permissions (SELECT, DESCRIBE, etc.) to each studio's IAM role, scoped by the data filter.
- Athena integration: Athena uses the Glue Data Catalog/Lake Formation permissions when querying governed tables, enabling centralized governance rather than per-role S3 policies.

Common misconceptions: A is tempting because it mentions data filters and prefixes, but "register prefixes as data locations using data filters" mixes two separate constructs; data filters do not register S3 locations. C is incorrect because data filters are not attached to IAM roles; they are Lake Formation resources used in permission grants. E violates the constraints (no new buckets/duplication) and shifts governance to coarse S3 prefix policies rather than query-time FGAC.

Exam tips: When you see "single S3 data lake," "Athena," and "each team can only see a subset of rows/partitions," think Lake Formation FGAC with data filters (or LF-Tags) plus registering the S3 data location. Also remember: IAM controls who can call services; Lake Formation controls what data they can see in the data lake.
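The two correct steps can be sketched as Lake Formation API request payloads. This is a minimal sketch under stated assumptions: the database name media_db, table name raw_subtitles, account ID, and role ARN are hypothetical, and each dict would be passed to the corresponding boto3 lakeformation client call rather than executed here.

```python
# Sketch of the Lake Formation calls behind the two correct steps.
# All names (media_db, raw_subtitles, account ID, role ARN) are hypothetical.

# Step 1: register the S3 prefix as a Lake Formation data location
# (payload for lakeformation.register_resource).
register_request = {
    "ResourceArn": "arn:aws:s3:::media-lake/raw",
    "UseServiceLinkedRole": True,  # let Lake Formation manage access
}

# Step 2a: a row-level data filter scoped to one Region's partition
# (payload for lakeformation.create_data_cells_filter).
data_filter_request = {
    "TableData": {
        "TableCatalogId": "111122223333",
        "DatabaseName": "media_db",
        "TableName": "raw_subtitles",
        "Name": "eu_central_1_only",
        "RowFilter": {"FilterExpression": "aws_region = 'eu-central-1'"},
        "ColumnWildcard": {},  # all columns visible; only rows are restricted
    }
}

# Step 2b: grant SELECT to the studio's IAM role through that filter
# (payload for lakeformation.grant_permissions).
grant_request = {
    "Principal": {
        "DataLakePrincipalIdentifier": "arn:aws:iam::111122223333:role/eu-studio-compliance"
    },
    "Resource": {
        "DataCellsFilter": {
            "TableCatalogId": "111122223333",
            "DatabaseName": "media_db",
            "TableName": "raw_subtitles",
            "Name": "eu_central_1_only",
        }
    },
    "Permissions": ["SELECT"],
}
```

One filter and one grant per studio role covers all six studios without touching the bucket layout.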
Want to work through every question on the go?
Download Cloud Pass for free — practice exams, study progress tracking, and more.
Study period: 1 month
If you properly understand the questions as you work through them, you can pass too! Good luck!
Study period: 1 month
I passed the AWS Data Engineer Associate exam. Cloud Pass is the best app for helping candidates prepare well for any exam. Thanks!
Study period: 1 month
The question patterns are similar to the real exam.
Study period: 2 months
I passed with 813/1000!! Many of the questions were similar to the exam.
Study period: 1 month
The explanations made it great for studying. I'll be back.


A media analytics company plans to lift-and-shift to AWS its on-premises Kafka cluster (3 brokers, 24 partitions, ~2 MB/s average ingest with bursts to 12 MB/s, 50-KB messages) along with the consumer application that processes incremental CDC updates emitted from an on-premises MySQL database via Debezium. The team insists on a replatform (not refactor) strategy with minimal operational management while preserving Kafka APIs and automatic scaling. Which AWS service choice meets these requirements with the least management overhead?
Incorrect. Amazon Kinesis Data Streams is a fully managed streaming service with elastic scaling (via shard management or on-demand mode), but it is not Kafka-compatible. Migrating from Kafka/Debezium would require refactoring producers/consumers to Kinesis APIs and rethinking offsets, consumer groups, and partitioning semantics. It can meet the throughput needs, but it violates the requirement to preserve Kafka APIs under a replatform (not refactor) strategy.
Incorrect. An Amazon MSK provisioned cluster preserves Kafka APIs and is a common replatform target for lift-and-shift Kafka migrations. However, it requires more operational management than serverless: you must choose broker instance types/count, plan capacity for bursts, manage scaling operations, and handle partition/broker balancing. It is managed (patching, replacements), but it is not the least-overhead option when automatic scaling is explicitly required.
Incorrect. Amazon Kinesis Data Firehose is designed for delivery to destinations (S3, Redshift, OpenSearch, Splunk) with optional buffering and transformation, not as a general-purpose Kafka-compatible streaming platform. It does not provide Kafka broker semantics, topics/partitions, or consumer group coordination. Using Firehose would require redesigning the CDC pipeline and consumer behavior, making it a refactor and unsuitable for preserving Kafka APIs.
Correct. Amazon MSK Serverless provides Kafka API compatibility with the lowest operational burden. It automatically scales throughput and storage, removing the need to size and manage brokers while still supporting Kafka clients, topics, partitions, and consumer groups. This aligns directly with replatforming a Kafka-based CDC pipeline (Debezium + Kafka consumers) to AWS with minimal management and automatic scaling, making it the best choice.
Core Concept: This question tests selecting a managed streaming ingestion service when the workload requires Kafka protocol/API compatibility, minimal operational management, and automatic scaling under a replatform (not refactor) approach.

Why the Answer is Correct: Amazon MSK Serverless is the best fit because it preserves Apache Kafka APIs (producers/consumers, topics/partitions, consumer groups) while removing most cluster administration tasks (capacity planning, broker sizing, patching, scaling operations). The company is lift-and-shifting a Kafka cluster and a CDC consumer that already speaks Kafka (Debezium emits to Kafka topics). Replatforming to MSK Serverless keeps the application and Debezium integration patterns largely unchanged, while meeting the "minimal operational management" and "automatic scaling" requirements. The ingest profile (~2 MB/s average with bursts to ~12 MB/s, 50-KB messages) is well within typical MSK Serverless elastic throughput expectations, and serverless automatically scales read/write throughput and storage based on usage.

Key AWS Features: MSK Serverless provides Kafka-compatible endpoints, IAM-based authentication, encryption in transit and at rest, and automatic scaling of capacity without managing broker instances. It integrates with Amazon CloudWatch for metrics and logging, and supports common Kafka tooling. For CDC, Debezium can continue producing to Kafka topics; consumers can continue using the Kafka client libraries and consumer group semantics.

Common Misconceptions: Kinesis Data Streams and Firehose are often chosen for "managed streaming," but they require refactoring because they do not expose Kafka APIs/semantics (partitions vs. shards, offsets, and consumer groups differ). MSK provisioned preserves Kafka APIs, but it does not meet the "automatic scaling with least management overhead" requirement as strongly because you must size brokers, manage scaling events, and handle capacity planning.

Exam Tips: When you see "preserve Kafka APIs" and "minimal ops," think MSK. If the question also demands "automatic scaling" and "least management," prefer MSK Serverless over provisioned MSK. Choose Kinesis only when the question allows API changes/refactoring or explicitly asks for Kinesis-native ingestion/processing patterns.
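The replatform-vs-refactor distinction can be made concrete by comparing client configuration properties. This is a minimal sketch under stated assumptions: the hostnames, topic prefix, and group ID are hypothetical, and the MSK Serverless settings reflect the commonly documented IAM-auth client properties.

```python
# Sketch: replatforming Debezium + Kafka to MSK Serverless mostly means
# swapping connection/auth properties; Kafka semantics carry over unchanged.
# All hostnames and names below are hypothetical.

on_prem_kafka = {
    "bootstrap.servers": "kafka1.dc.local:9092",
    "security.protocol": "PLAINTEXT",
}

msk_serverless = {
    # MSK Serverless bootstrap endpoint (taken from the cluster's client info)
    "bootstrap.servers": "boot-xxxx.c1.kafka-serverless.us-east-1.amazonaws.com:9098",
    # MSK Serverless authenticates clients with IAM over TLS
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "AWS_MSK_IAM",
}

# Everything Kafka-specific stays the same (this is why it is a replatform,
# not a refactor): same connector, same topics, same consumer groups.
unchanged = {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "topic.prefix": "fleetdb",           # Debezium topic naming
    "group.id": "cdc-consumer-group",    # consumer group semantics preserved
}
```

With Kinesis or Firehose, the `unchanged` block would not survive: offsets, consumer groups, and Debezium's Kafka topics all have to be redesigned, which is exactly the refactor the team wants to avoid.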
A data engineer must optimize a smart-utility analytics pipeline that processes residential smart-meter readings. Apache Parquet files are delivered daily to an Amazon S3 bucket under the prefix s3://utility-raw/consumption/. Every Monday, the team runs ad hoc SQL to compute KPIs filtered by reading_date for multiple windows (last 7, 30, and 180 days). The dataset currently grows by about 15 GB per day and is expected to reach 60 GB per day within a year, and the solution must prevent query performance from degrading as data volume increases. Which approach meets these requirements most cost-effectively?
Correct. Partitioning by reading_date aligns with the query predicate, enabling Athena partition pruning so only the last 7/30/180 days of partitions are scanned. With Parquet, Athena also benefits from columnar reads and predicate pushdown, reducing bytes scanned and cost. Glue Data Catalog provides the table/partition metadata. This is serverless and pay-per-scan, making it highly cost-effective for weekly ad hoc queries.
Incorrect. While partitioning by reading_date is good, using Amazon Redshift adds cost and operational overhead (loading data from S3, maintaining tables, vacuum/analyze, or paying for Redshift Serverless). For weekly ad hoc KPIs, Athena on partitioned Parquet in S3 is usually cheaper and simpler. Redshift is better when you need consistently high concurrency/latency or complex warehouse workloads.
Incorrect. Partitioning by ingestion_date does not match the filter on reading_date, so Spark jobs may still scan large amounts of data unless additional indexing/partitioning is done. EMR also introduces cluster management and compute costs that are typically not justified for weekly ad hoc SQL KPIs. Spark is appropriate for heavy transformations/ML, not the most cost-effective option for simple date-filtered KPI queries.
Incorrect. Aurora is an OLTP relational database and is not designed for large-scale analytical scans over growing Parquet datasets in S3. You would need to ETL and load data into Aurora tables, increasing cost and complexity, and queries over hundreds of days of data would not be as cost-effective as scanning partition-pruned Parquet with Athena. Aurora also has ongoing instance/storage costs.
Core Concept: This question tests cost-effective, scalable querying of data in Amazon S3 using a serverless query engine (Amazon Athena) and partitioning with the AWS Glue Data Catalog. The key architectural principle is to minimize the data scanned per query as the dataset grows.

Why the Answer is Correct: The weekly KPIs are filtered by reading_date over rolling windows (7/30/180 days). Partitioning the Parquet dataset by reading_date (for example, consumption/reading_date=YYYY-MM-DD/) enables partition pruning, so Athena reads only the partitions that match the date predicates instead of scanning the full table. As daily volume grows from 15 GB/day to 60 GB/day, partition pruning prevents query performance and cost from degrading linearly with total historical data. Athena is pay-per-query (per TB scanned), so reducing scanned bytes is directly the most cost-effective approach.

Key AWS Features:
- Parquet + Athena: Columnar Parquet already reduces scan size via column projection and predicate pushdown; combined with partitions, it is highly efficient.
- AWS Glue Data Catalog: Stores the table/partition metadata used by Athena. You can add partitions via Glue crawlers, MSCK REPAIR TABLE, or partition projection (often best at scale to avoid managing millions of partitions).
- Partition design: Use reading_date (the query filter) rather than ingestion_date. Consider hierarchical partitions (year/month/day) if needed to limit partition counts.

Common Misconceptions: Redshift can run fast SQL, but it introduces always-on cluster/serverless costs and data loading/maintenance; for once-a-week ad hoc queries on S3 data, Athena is typically cheaper. EMR/Spark is powerful but operationally heavier and not as cost-effective for simple SQL KPIs. Aurora is not suited for large-scale analytical scans of Parquet in S3 and would require ETL/loading into a relational schema.

Exam Tips: When queries repeatedly filter on a specific field (here, reading_date), partition on that field. For S3 data lakes, the most cost-effective pattern for ad hoc SQL is often S3 + Parquet + Glue Catalog + Athena, with partition pruning (and optionally partition projection) to control both cost and performance as data grows.
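The recommended layout can be sketched as Athena DDL plus one of the weekly KPI queries, held here as Python strings. This is a minimal sketch under stated assumptions: the database/table names (utility.consumption) and columns (meter_id, kwh) are hypothetical; the bucket prefix comes from the question.

```python
# Sketch of the winning design: Parquet on S3 partitioned by the query
# predicate (reading_date), registered in the Glue Data Catalog.
# Database/table/column names are hypothetical.

create_table_ddl = """
CREATE EXTERNAL TABLE utility.consumption (
    meter_id STRING,
    kwh DOUBLE
)
PARTITIONED BY (reading_date DATE)
STORED AS PARQUET
LOCATION 's3://utility-raw/consumption/'
"""

# A weekly KPI query: the reading_date predicate lets Athena prune to the
# last 7 daily partitions instead of scanning all historical data.
kpi_query = """
SELECT reading_date, SUM(kwh) AS total_kwh
FROM utility.consumption
WHERE reading_date >= current_date - INTERVAL '7' DAY
GROUP BY reading_date
"""
```

Because the WHERE clause matches the partition key, bytes scanned (and therefore Athena cost) stay roughly constant per window even as the table grows from 15 GB/day toward 60 GB/day.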
A data platform team queries time-series telemetry in Amazon S3 with Amazon Athena using the AWS Glue Data Catalog. A single table has about 1.2 million partitions organized by year/month/day/hour under a prefix like s3://prod-telemetry/tenant_id={t}/year={YYYY}/month={MM}/day={DD}/hour={HH}, causing query planning to become a bottleneck. While keeping the data in S3, which solutions will remove the bottleneck and reduce Athena planning time? (Choose two.)
Correct. A Glue partition index improves performance of partition metadata retrieval for tables with very large numbers of partitions. When queries include predicates on partition keys (tenant_id, year/month/day/hour), Athena can use the index to find matching partitions faster and prune non-matching partitions during planning. This directly targets the query planning bottleneck caused by enumerating or scanning huge partition lists in the Glue Data Catalog.
Incorrect. Hive-style bucketing (rebucketing files by a commonly filtered column) can help certain query patterns (e.g., joins/aggregations) by reducing shuffle and improving parallelism, but it does not address the core issue: Athena’s planning overhead from millions of partitions in the Glue Data Catalog. Bucketing changes file layout within partitions, not the number of partitions or the need to resolve partition metadata.
Correct. Partition projection lets Athena compute partitions from a defined scheme (date ranges, enums, integers) and map them to S3 paths via a location template. This removes the need to store 1.2M partition entries in the Glue Data Catalog and avoids expensive partition listing during planning. It is a best-practice feature for time-series data with predictable partition patterns and very high partition counts.
Incorrect. Converting to Parquet is a strong optimization for Athena because it is columnar, supports predicate pushdown, and reduces scanned bytes, improving runtime and cost. However, it does not inherently reduce the number of partitions or the need for Athena to resolve partition metadata during planning. If planning is the bottleneck (not scan), Parquet alone will not remove it.
Incorrect. Combining many small objects into larger objects reduces S3 request overhead and can improve Athena runtime by reducing the number of splits and file-open operations. But it does not reduce Glue partition metadata volume or the partition enumeration that drives planning time. It’s a good optimization for the “small files problem,” not for “too many partitions in the catalog.”
Core concept: This question tests Athena query planning behavior with highly partitioned tables in the AWS Glue Data Catalog. With ~1.2M partitions, the bottleneck is not scan/compute but metadata and partition enumeration during planning. The goal is to keep data in S3 while reducing the number of partitions Athena must list or consider.

Why the answers are correct: A (Glue partition index + partition filtering) addresses the planning bottleneck by accelerating partition lookups in the Data Catalog. A partition index stores partition metadata in an indexed form so Athena can quickly find matching partitions for predicates (e.g., tenant_id and a time range) instead of scanning or listing huge partition sets. When partition filtering is enabled, Athena prunes partitions earlier and avoids expensive full partition enumeration. C (Athena partition projection) removes the need to store and retrieve millions of partition entries from the Glue Data Catalog at all. Instead, you define the partition scheme (tenant_id/year/month/day/hour) and valid ranges/patterns, and Athena computes the partition values and corresponding S3 paths at query time. This eliminates the "partition explosion" metadata overhead and typically yields the largest planning-time reduction for time-series layouts.

Key AWS features / best practices:
- AWS Glue Data Catalog partition indexes: improve partition retrieval performance for large partition counts.
- Athena partition projection: define projection types (integer, enum, date) and storage.location.template to map partition values to S3 prefixes; reduces or eliminates partition management operations (e.g., MSCK REPAIR TABLE).
- Predicate design: ensure queries include partition columns (tenant_id, year/month/day/hour, or derived timestamp filters) so pruning/projection is effective.

Common misconceptions:
- Converting to Parquet (D) improves scan efficiency and cost, but does not directly fix planning-time partition enumeration.
- Combining small files (E) helps runtime performance (fewer S3 GETs, fewer splits) but does not reduce partition metadata planning overhead.
- Bucketing (B) can help join/aggregation performance in some engines, but Athena's primary planning bottleneck here is partition metadata scale, not file distribution.

Exam tips: When you see "millions of partitions" and "planning-time bottleneck" with Athena/Glue, think metadata optimizations: partition projection (avoid catalog partitions) and partition indexes (speed up catalog partition lookups). File-format and small-file fixes are usually about scan/runtime, not planning.
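Partition projection is configured entirely through table properties, which can be sketched as a dict of Athena TBLPROPERTIES. This is a minimal sketch under stated assumptions: the tenant enum values and year range are hypothetical placeholders; the property names themselves are Athena's documented projection settings.

```python
# Sketch of Athena partition projection properties for the
# tenant_id/year/month/day/hour layout from the question. With projection
# enabled, Athena computes partitions at query time instead of reading
# ~1.2M partition entries from the Glue Data Catalog.
# Tenant values and the year range are hypothetical.

projection_tblproperties = {
    "projection.enabled": "true",
    "projection.tenant_id.type": "enum",
    "projection.tenant_id.values": "t001,t002,t003",
    "projection.year.type": "integer",
    "projection.year.range": "2020,2030",
    "projection.month.type": "integer",
    "projection.month.range": "1,12",
    "projection.month.digits": "2",
    "projection.day.type": "integer",
    "projection.day.range": "1,31",
    "projection.day.digits": "2",
    "projection.hour.type": "integer",
    "projection.hour.range": "0,23",
    "projection.hour.digits": "2",
    # Maps projected values onto the existing S3 prefix scheme.
    "storage.location.template":
        "s3://prod-telemetry/tenant_id=${tenant_id}/year=${year}"
        "/month=${month}/day=${day}/hour=${hour}",
}
```

These properties go in the table's TBLPROPERTIES (or the Glue table parameters); after that, no MSCK REPAIR TABLE or crawler runs are needed for new hourly prefixes.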
A mobility analytics startup ingests vehicle telemetry into an Amazon MSK cluster at 2,800 JSON events per second on average (bursts up to 11,000 events/s, ~1.8 KB per event). The data must be made available in Amazon Redshift with sub-minute freshness (SLA: under 45 seconds end-to-end) for operational dashboards, while optimizing storage cost by avoiding an extra durable raw copy outside the streaming source and keeping operational overhead to a minimum. Which solution best meets these requirements with the least operational effort?
Correct. Amazon Redshift supports streaming ingestion from Kafka-compatible sources such as Amazon MSK by using an external schema and a materialized view over the stream. This lets Redshift consume records directly from the topic and make them available for SQL analytics with low latency appropriate for operational dashboards. It avoids introducing Amazon S3 as an intermediate durable store, which the requirement explicitly wants to avoid for cost and architecture simplicity. It also minimizes operational effort because the team does not need to build and maintain separate consumers, ETL jobs, or event-driven loaders.
Incorrect. An AWS Glue streaming ETL job that writes Parquet files to Amazon S3 creates an additional durable raw copy outside the streaming source, which directly conflicts with the requirement to avoid extra storage cost. The use of hourly partitions is especially problematic because it introduces latency far beyond the under-45-second freshness SLA. Querying through Redshift Spectrum is also better suited to lake-style analytics than low-latency operational dashboards. This design adds operational overhead through job management, partition handling, and S3 file lifecycle concerns.
Incorrect. An external schema by itself is only the connection layer to the streaming source and does not mean a normal Redshift table can simply reference the stream directly as described. For Redshift streaming ingestion, the supported pattern is to define a materialized view over the external streaming source so Redshift can consume and store the stream data appropriately. Without that mechanism, the option is incomplete and technically misleading. It therefore does not represent a valid or best-practice solution for low-latency dashboard access.
Incorrect. Sending the stream to Amazon S3 first creates the extra durable copy that the question explicitly wants to avoid. Using S3 events and AWS Lambda to insert records into Redshift also adds significant operational complexity and is not an efficient loading pattern at the stated sustained and burst event rates. Lambda-based per-object or small-batch inserts can create scaling, retry, and transactional inefficiencies for Redshift. This architecture is therefore both more complex and less likely to meet the sub-minute SLA reliably.
Core concept: This question tests near-real-time ingestion from a streaming source (Amazon MSK/Kafka) into Amazon Redshift with minimal operational overhead and without creating an additional durable "raw" copy (for example, in Amazon S3). It aligns with modern Redshift streaming ingestion patterns using external schemas and materialized views.

Why the answer is correct: Option A best meets the <45-second end-to-end freshness SLA because Redshift can directly integrate with Kafka/MSK via an external schema and then use a materialized view to continuously ingest and refresh data into Redshift-managed storage. This avoids building and operating a separate ingestion pipeline (Glue/Lambda) and avoids persisting a second durable raw dataset in S3 solely for ingestion. Operational dashboards benefit because the data lands in Redshift tables (via the materialized view) and is queryable with low latency.

Key AWS features / best practices:
- Redshift streaming ingestion from Kafka/MSK using an external schema (authentication commonly via IAM or AWS Secrets Manager).
- Materialized views for incremental refresh/continuous ingestion semantics, enabling sub-minute availability.
- Reduced moving parts: no S3 staging, no custom consumers, no micro-batching orchestration.
- Scales to bursts more predictably than Lambda-per-object patterns and avoids small-file issues.

Common misconceptions:
- Using S3 as a landing zone (options B and D) is a common pattern, but it creates an extra durable copy and typically introduces latency (partitioning cadence, file commit timing, event triggers) that can violate a 45-second SLA.
- An "external schema" alone (option C) does not automatically materialize streaming data into Redshift tables; you need a mechanism (such as a materialized view) to ingest and refresh the data and make it performant for dashboards.

Exam tips: When you see requirements like "sub-minute freshness," "least operational effort," and "avoid an extra durable raw copy," favor native managed integrations (Redshift streaming ingestion with materialized views) over DIY pipelines (Glue/Lambda + S3). Also watch for options that introduce micro-batch intervals (hourly partitions) or event-driven fan-out that increases operational complexity and latency.
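The external-schema-plus-materialized-view pattern can be sketched as two SQL statements, held here as Python strings. This is a minimal sketch under stated assumptions: the schema name, IAM role ARN, cluster ARN, topic name, and view name are all hypothetical, and the exact value-decoding expression (e.g., whether kafka_value needs an explicit VARBYTE-to-text conversion) should be checked against the Redshift streaming ingestion documentation.

```python
# Sketch of Redshift streaming ingestion from MSK: an external schema over
# the Kafka cluster, plus a materialized view that consumes the topic.
# All ARNs, schema/topic/view names are hypothetical.

create_external_schema = """
CREATE EXTERNAL SCHEMA telemetry
FROM MSK
IAM_ROLE 'arn:aws:iam::111122223333:role/redshift-streaming-role'
AUTHENTICATION iam
CLUSTER_ARN 'arn:aws:kafka:us-east-1:111122223333:cluster/vehicle-telemetry/abc123'
"""

create_materialized_view = """
CREATE MATERIALIZED VIEW vehicle_events
AUTO REFRESH YES
AS SELECT
    kafka_timestamp,
    JSON_PARSE(kafka_value) AS event   -- parse the JSON payload into SUPER
FROM telemetry."vehicle-telemetry"
"""
```

Dashboards then query vehicle_events (or tables derived from it) directly; no S3 staging, Glue job, or Lambda loader exists to operate or pay for.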
A transportation logistics startup ingests vehicle telemetry and order-tracking events into an Amazon DynamoDB table configured for provisioned capacity. Traffic is highly predictable: every weekday from 06:45 to 10:00 local time the workload spikes to 6x the baseline, while from Friday 20:00 through Sunday 23:00 usage drops to about 10% of the weekday peak. The team needs to maintain single-digit millisecond latency during peaks and minimize spend during off-hours. Which solution will meet these requirements in the most cost-effective way?
Incorrect. Setting provisioned capacity permanently to peak ensures performance, but it is not cost-effective. You would pay for the maximum RCU/WCU 24/7 even though usage drops significantly after 10:00 and over the weekend (down to ~10% of the weekday peak). This directly conflicts with the requirement to minimize spend during off-hours. It is a common "safe" choice but wastes capacity most of the time.
Incorrect. Splitting into two tables does not inherently reduce cost or improve latency. You still need the same aggregate RCU/WCU to handle the same total read/write volume, so you pay roughly the same (or more) while adding complexity: dual writes/reads, routing logic, potential hot-key issues per table, and operational overhead. This is not a standard DynamoDB cost-optimization technique for predictable time-based demand.
Correct. Scheduled scaling with AWS Application Auto Scaling is ideal for predictable traffic patterns. You can increase RCU/WCU shortly before 06:45 on weekdays to ensure capacity is ready for the spike (maintaining low latency), then decrease after 10:00 and keep capacity low through the weekend to reduce cost. This matches the requirement precisely: predictable performance during peaks and minimized spend during off-hours, without paying peak rates continuously.
Incorrect. On-demand capacity mode automatically accommodates traffic without explicit provisioning and can maintain low latency, but it is usually most cost-effective for unpredictable or highly variable workloads. With a very predictable schedule and long off-peak periods, provisioned capacity with scheduled scaling typically costs less because you pay for exactly the planned capacity rather than per-request pricing. On-demand is simpler operationally, but not the most cost-effective here.
Core Concept: This question tests Amazon DynamoDB capacity modes (provisioned vs. on-demand) and how to optimize cost while preserving predictable low-latency performance. It also targets AWS Application Auto Scaling features for DynamoDB, especially scheduled scaling for known traffic patterns.

Why the Answer is Correct: The workload is highly predictable, with clear time windows: a weekday morning spike (6x baseline) and a weekend trough (~10% of weekday peak). In provisioned mode, you pay for allocated RCU/WCU regardless of use, so the most cost-effective approach is to provision exactly what you need when you need it. AWS Application Auto Scaling scheduled actions let you preemptively increase capacity before 06:45 Monday through Friday to ensure single-digit millisecond latency during the spike, then reduce capacity after 10:00 and keep it low over the weekend to minimize spend. This avoids reactive scaling delays and ensures capacity is in place ahead of demand.

Key AWS Features:
- DynamoDB provisioned capacity: predictable performance when capacity is sufficient.
- Application Auto Scaling scheduled actions: time-based scaling (cron-like) to adjust table or GSI RCU/WCU at specific times.
- Target tracking scaling (often paired): can still handle minor intra-window variability, but scheduled actions are the key for known spikes.
- Best practice: also scale any GSIs independently, since they have separate capacity.

Common Misconceptions:
- "On-demand is always cheapest": on-demand is excellent for unpredictable or spiky workloads, but for highly predictable patterns, provisioned + scheduled scaling is typically more cost-efficient.
- "Just set max capacity": guarantees performance but wastes money during long off-peak periods.
- "Split tables to split load": does not reduce the total required capacity and adds operational complexity.

Exam Tips: When you see DynamoDB with predictable, time-based peaks and a requirement to minimize cost, think "provisioned + scheduled scaling." Choose on-demand when traffic is unknown or unpredictable, or when you want to avoid capacity planning entirely. Also remember to account for GSIs and to schedule scaling ahead of the spike to avoid throttling and latency increases.
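The scheduled-scaling setup can be sketched as Application Auto Scaling request payloads (for register_scalable_target and put_scheduled_action). This is a minimal sketch under stated assumptions: the table name, capacity numbers, and cron times are hypothetical, and the cron schedules are interpreted in UTC unless a timezone is configured, so real schedules must be offset to the fleet's local time.

```python
# Sketch of DynamoDB scheduled scaling via Application Auto Scaling.
# Table name, capacities, and times are hypothetical; cron here is UTC.
# The same pattern is repeated for ReadCapacityUnits and for any GSIs.

scalable_target = {
    "ServiceNamespace": "dynamodb",
    "ResourceId": "table/vehicle-telemetry",
    "ScalableDimension": "dynamodb:table:WriteCapacityUnits",
    "MinCapacity": 100,   # baseline
    "MaxCapacity": 600,   # 6x peak headroom
}

scale_up_before_peak = {
    "ServiceNamespace": "dynamodb",
    "ScheduledActionName": "weekday-morning-scale-up",
    "ResourceId": "table/vehicle-telemetry",
    "ScalableDimension": "dynamodb:table:WriteCapacityUnits",
    "Schedule": "cron(30 6 ? * MON-FRI *)",  # shortly before the 06:45 spike
    "ScalableTargetAction": {"MinCapacity": 600, "MaxCapacity": 600},
}

scale_down_after_peak = {
    "ServiceNamespace": "dynamodb",
    "ScheduledActionName": "weekday-post-peak-scale-down",
    "ResourceId": "table/vehicle-telemetry",
    "ScalableDimension": "dynamodb:table:WriteCapacityUnits",
    "Schedule": "cron(15 10 ? * MON-FRI *)",  # after the 10:00 peak ends
    "ScalableTargetAction": {"MinCapacity": 100, "MaxCapacity": 600},
}
```

Raising MinCapacity ahead of the spike is what guarantees the capacity is already provisioned when traffic arrives, rather than waiting for reactive target-tracking to catch up.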
A fintech startup runs an Amazon Aurora MySQL-Compatible DB cluster (port 3306) in two private subnets (subnet-10.0.1.0/24 in us-east-1a and subnet-10.0.2.0/24 in us-east-1b) with no route to an internet gateway. The DB security group (sg-db) currently allows inbound traffic only from itself on TCP 3306. A developer created an AWS Lambda function with default networking (no VPC) to insert, update, and delete rows. The team must allow the function to connect to the cluster endpoint privately, without traversing the public internet or using a NAT, and with the least operational overhead. Which combination of steps meets the requirement? (Choose two.)
Incorrect. Enabling “Publicly accessible” for Aurora is intended for access from public networks and typically requires public subnets and routing to an internet gateway. It violates the requirement to connect privately without traversing the public internet and increases the attack surface. Also, the cluster is currently in private subnets with no IGW route, so this change alone would not provide the required connectivity.
Incorrect. Security groups do not provide an inbound source selector called “Lambda function invocations.” Security group rules allow sources by CIDR, prefix list, or another security group. To allow Lambda, you must reference the Lambda ENIs’ security group (or use the same SG) rather than relying on an invocation-based permission concept (which applies to Lambda permissions, not VPC network access).
Correct. A Lambda function created with default networking runs outside your VPC and cannot directly reach private VPC-only endpoints. Configuring the function to run in the same VPC and private subnets causes Lambda to create ENIs with private IPs in those subnets, enabling private routing to the Aurora cluster endpoint over the VPC local network, without NAT or public internet traversal.
Correct. The DB security group currently only allows inbound from itself on TCP 3306. By attaching the same security group (sg-db) to the Lambda function’s ENIs and keeping a self-referencing inbound rule on sg-db for 3306, you allow connections from Lambda to Aurora with minimal rule management. This is a common least-ops pattern for tightly scoped intra-SG communication.
Incorrect. Network ACL changes are not required for this scenario and add operational overhead. NACLs are stateless and would require both inbound and outbound rules, careful ephemeral port handling, and ongoing maintenance. The default NACL typically already allows all traffic; the real blocker here is that Lambda is not in the VPC and the DB SG only allows self-referenced inbound traffic.
Core concept: This question tests private connectivity from AWS Lambda to an Amazon Aurora MySQL DB cluster in private subnets, focusing on VPC networking, security groups, and how Lambda gains VPC access via elastic network interfaces (ENIs).
Why the answer is correct: Because the Aurora cluster is in private subnets with no route to an internet gateway and the requirement forbids public internet traversal and NAT, the Lambda function must run inside the same VPC (or a connected network) to reach the private cluster endpoint. Configuring Lambda for VPC access (Option C) causes Lambda to create ENIs in the specified subnets, giving it private IPs that can route to the Aurora endpoint over the VPC’s local routing. However, routing alone is not sufficient: the DB security group currently only allows inbound from itself on TCP 3306. To allow the Lambda ENIs to connect with minimal operational overhead, you can attach the same security group (sg-db) to the Lambda ENIs and keep a self-referencing inbound rule on sg-db for TCP 3306 (Option D). With both Lambda and Aurora using sg-db, the self-referencing rule permits traffic from any resource associated with sg-db to any other resource associated with sg-db on port 3306.
Key AWS features and best practices:
- Lambda VPC integration: selecting private subnets and security groups; Lambda creates and manages ENIs.
- Security groups are stateful; you typically only need an inbound rule on the DB SG (responses are automatically allowed).
- Using SG-to-SG referencing (including self-referencing) is a common least-ops pattern for intra-VPC access control.
Common misconceptions:
- Making the DB “Publicly accessible” does not meet the requirement and would require IGW/NAT/public routing; it also increases exposure.
- Network ACL changes are rarely needed for this pattern; NACLs are stateless and add operational complexity.
- There is no native “allow Lambda invocations” inbound rule type for security groups; access is controlled by IP/CIDR or security group references.
Exam tips: When Lambda must reach private resources (RDS/Aurora, ElastiCache, internal ALBs), the usual answer is: place Lambda in the VPC subnets that can route to the target, then open the target SG to the Lambda SG (or use the same SG with self-reference). Avoid NAT unless Lambda needs outbound internet access (e.g., to call public APIs).
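The effect of the self-referencing rule can be pictured with a small Python sketch. This is an illustrative toy model of how inbound security group rules are evaluated, not an AWS API; the function `inbound_allowed` and the rule tuples are invented for the example:

```python
# Toy model of security group inbound evaluation (illustrative only;
# real evaluation happens in the VPC data plane, not via any API here).

def inbound_allowed(dest_sgs, src_sgs, port, rules):
    """rules: list of (dest_sg, source_sg, port) inbound allow rules.

    Traffic is permitted if any rule's destination SG is attached to
    the target, its source SG is attached to the sender, and the port
    matches. Security groups are stateful, so the response path needs
    no separate rule.
    """
    return any(
        d in dest_sgs and s in src_sgs and p == port
        for (d, s, p) in rules
    )

# The single self-referencing rule on sg-db for TCP 3306.
rules = [("sg-db", "sg-db", 3306)]

# Lambda ENIs and Aurora instances both attached to sg-db: allowed.
print(inbound_allowed({"sg-db"}, {"sg-db"}, 3306, rules))      # True
# A Lambda attached to a different SG that no rule references: denied.
print(inbound_allowed({"sg-db"}, {"sg-other"}, 3306, rules))   # False
```

This is why attaching sg-db to the Lambda ENIs requires no new rules: the existing self-reference already covers every member of the group.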
A logistics company stores a 120-million-row table named shipments in Amazon Redshift that includes a column called port_code, and analysts need a SQL query that returns all rows where port_code begins with 'NY' or 'LA'; which query meets this requirement?
Incorrect. "$(NY|LA).*" uses "$" which anchors the match to the end of the string, not the beginning. That means it would only match if the pattern occurs at the end position (and the rest of the expression is also inconsistent with that intent). It does not implement “port_code begins with NY or LA”. The alternation "(NY|LA)" is fine, but the anchor is wrong.
Correct. "^(NY|LA).*" anchors the match at the start of port_code using "^" and then matches either "NY" or "LA" via alternation "|". The trailing ".*" allows any remaining characters after the prefix. This is the standard regex form for “starts with NY or LA” in Redshift when using the "~" operator.
Incorrect. It uses "$" (end-of-string anchor) rather than "^" (start-of-string anchor), so it cannot enforce “begins with”. Additionally, "&" is not the regex operator for OR; alternation is "|". As written, it will not correctly match values starting with NY or LA and may not behave as intended in Redshift’s POSIX regex engine.
Incorrect. Although it uses "^" to anchor at the start, it incorrectly uses "&" instead of "|" for alternation. Regex alternation for “either/or” is "|". "^(NY&LA).*" would attempt to match the literal sequence "NY&LA" (or otherwise fail depending on regex interpretation), not “NY or LA”.
Core Concept: This question tests Amazon Redshift SQL pattern matching using regular expressions. In Redshift, the POSIX regular expression match operator is "~". Understanding regex anchors (start/end of string) and alternation is key for writing correct filters on large tables.
Why the Answer is Correct: The requirement is: return rows where port_code begins with 'NY' or 'LA'. The correct regex must anchor the match to the start of the string and allow either prefix. Option B uses "^(NY|LA).*" where "^" anchors to the beginning of the string, "(NY|LA)" means either NY or LA, and ".*" matches the remainder of the string. This exactly implements “begins with NY or LA”.
Key AWS Features / Best Practices: In Amazon Redshift, regex predicates can be used in WHERE clauses, but they can be more CPU-expensive than simpler operators. For prefix checks, LIKE is often clearer and may be more efficient (e.g., port_code LIKE 'NY%' OR port_code LIKE 'LA%'). However, the question specifically provides regex options, so selecting the correct anchored regex is the goal. On very large tables (120M rows), also consider table design: choose appropriate sort keys (e.g., on port_code if it is a common filter) and distribution style to reduce scan cost, and use compression encodings. These are Data Store Management considerations because they affect query performance and storage layout.
Common Misconceptions: A frequent mistake is confusing "$" (end-of-string anchor) with "^" (start-of-string anchor). Using "$" would attempt to match patterns at the end of the string, which does not satisfy “begins with”. Another misconception is using "&" to mean OR; in regex, alternation is "|". "&" is not a standard OR operator in POSIX regex and will not express “NY or LA”.
Exam Tips: Memorize regex anchors: "^" = starts with, "$" = ends with. For “starts with X or Y”, look for "^(X|Y)". If the exam offers LIKE-based answers, prefer LIKE for simple prefix/suffix matching unless regex is explicitly required. Always validate whether the question is about correctness of results vs. performance; here it’s correctness of the regex.
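The anchor and alternation behavior is easy to verify with Python's re module. Python's engine is not identical to Redshift's POSIX engine, but "^", "$", and "|" behave the same for this case; the sample port codes are invented for illustration:

```python
import re

# Hypothetical sample of port_code values.
ports = ["NYC", "LAX", "MNY", "SEA", "LA"]

# "^(NY|LA).*" — anchored at the start: "begins with NY or LA".
# (Equivalent Redshift predicate: WHERE port_code ~ '^(NY|LA).*')
starts = [p for p in ports if re.search(r"^(NY|LA).*", p)]
print(starts)  # ['NYC', 'LAX', 'LA']

# A "$"-anchored pattern matches the END of the string instead,
# which is why the "$" options in this question are wrong.
ends = [p for p in ports if re.search(r"(NY|LA)$", p)]
print(ends)  # ['MNY', 'LA']
```

Note how "MNY" slips into the second result: the end anchor selects values ending in NY or LA, which is not what the analysts asked for.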
A marine research vessel streams vibration, salinity, and gyro readings from 24 onboard sensor arrays, each sending 150 KB of JSON every 12 seconds through a shipboard gateway to AWS over TLS; an operations job polls an Amazon S3 bucket every 45 seconds to pick up the latest files for aggregation, and you must choose an ingestion design that delivers the arriving data into S3 with the least end-to-end latency while sustaining the throughput. Which solution will deliver the data to the S3 bucket with the least latency?
This is the best answer because Kinesis Data Firehose is the AWS service designed to deliver streaming data into Amazon S3 with minimal operational overhead and near-real-time behavior. The incoming rate is only about 300 KB/second, which is well within the capacity of Kinesis Data Streams and Firehose. Compared with the custom KCL option that explicitly buffers for 10 seconds before writing, Firehose’s managed delivery path provides the lowest-latency valid S3 landing option among the choices. It also avoids the complexity of building, scaling, checkpointing, and error-handling a custom consumer application.
This option is architecturally incorrect because Amazon Kinesis Data Streams does not deliver records directly to Amazon S3 on its own. A consumer such as a KCL application, AWS Lambda, or another processing service must read from the stream and write the data to S3. Although 4 shards would be sufficient for the throughput, the absence of a delivery mechanism to S3 makes the option invalid. On the exam, any answer claiming Streams writes directly to S3 without a consumer should be eliminated.
This is a workable architecture, but it is not the lowest-latency choice presented. The option explicitly says the KCL consumer writes to S3 with a 10-second application buffer, which means records wait up to that long before becoming S3 objects. A custom consumer also adds operational burden for scaling, checkpointing, retries, object naming, and failure handling. Since Firehose is purpose-built for S3 delivery and avoids that explicit 10-second application delay, this option is not the best answer.
This option adds Amazon Managed Service for Apache Flink even though there is no requirement for stream transformation, enrichment, or analytics before landing the data in S3. It also configures Firehose with a 60-second buffer interval, which clearly increases end-to-end latency beyond the other valid choices. The extra processing layer adds complexity and potential delay without solving any stated business need. Therefore, it is not appropriate when the goal is the least-latency delivery into S3.
Core concept: This question tests the lowest-latency way to land streaming data in Amazon S3 while sustaining a modest ingestion rate. The key comparison is between Amazon Kinesis Data Firehose, which is purpose-built to deliver streaming data to S3, and custom consumers from Amazon Kinesis Data Streams that must batch records before writing S3 objects.
Why correct: The vessel generates 24 arrays × 150 KB every 12 seconds, which is about 300 KB/second total. That throughput is easily supported by Kinesis services. For delivery into S3, Firehose is the managed service specifically designed for near-real-time delivery to S3 and can buffer on time and size; using its default buffering still results in lower latency than a custom KCL consumer that explicitly waits 10 seconds before writing. Therefore, the Kinesis Data Streams + Firehose pattern is the best fit for least-latency S3 delivery among the provided options.
Key features: Kinesis Data Streams provides durable, scalable ingestion for streaming records. Kinesis Data Firehose natively delivers to S3 without requiring you to build and operate a consumer application, and it handles batching, retries, and scaling automatically. This reduces operational overhead while still providing near-real-time delivery.
Common misconceptions: A common trap is assuming a custom KCL application is always lower latency because it gives more control. In reality, writing to S3 efficiently still requires batching, and the option explicitly states a 10-second application buffer, which is slower than Firehose’s default low-latency buffering behavior for S3 delivery in this context. Another misconception is that Kinesis Data Streams can write directly to S3 without a consumer, which is not true.
Exam tips: When the destination is S3 and the question asks for the least latency among listed architectures, prefer the native managed delivery service unless another option clearly specifies a shorter valid buffering configuration. Also verify whether an option is architecturally complete; Kinesis Data Streams alone does not deliver to S3. Eliminate solutions that add unnecessary processing layers such as Flink when no transformation requirement exists.
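The 300 KB/second figure is simple arithmetic worth reproducing, since sizing questions like this recur on the exam. A quick Python check, using the standard 1 MB/s per-shard ingest quota for Kinesis Data Streams:

```python
# Back-of-envelope throughput check from the scenario's numbers.
arrays = 24          # onboard sensor arrays
payload_kb = 150     # KB of JSON per array per interval
interval_s = 12      # seconds between sends

kb_per_second = arrays * payload_kb / interval_s
print(kb_per_second)  # 300.0 (KB/s aggregate)

# A single Kinesis Data Streams shard ingests up to 1 MB/s,
# so even one shard covers this rate; 4 shards leave wide headroom.
shard_capacity_kb = 1024
print(round(kb_per_second / shard_capacity_kb, 2))  # 0.29 (fraction of one shard)
```

At roughly 29% of one shard's capacity, throughput is never the constraint here; the deciding factor is which option delivers to S3 with the least buffering delay.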
An IoT analytics team maintains a centralized AWS Glue Data Catalog for telemetry files arriving in multiple Amazon S3 buckets across two AWS accounts, and they must keep the catalog updated incrementally within 10 minutes of new object writes without building custom code or long-running infrastructure; S3 event notifications are already configured to publish ObjectCreated events to an Amazon SQS standard queue dedicated to catalog updates; which combination of steps should the team take to meet these requirements with the least operational overhead? (Choose two.)
Correct. AWS Glue supports event-driven crawlers for Amazon S3 that consume object event notifications delivered through Amazon SQS. Because the S3 buckets are already publishing ObjectCreated events to a dedicated SQS queue, this directly satisfies the requirement to react to new writes without custom code. It is a managed integration that minimizes operational overhead and can keep the catalog updated within the required time window.
Correct. The crawler must be configured to perform incremental catalog maintenance rather than repeatedly recrawling all data. Using the crawler's incremental update behavior ensures that only newly added folders, partitions, or changed metadata are processed, which reduces runtime and helps meet the 10-minute freshness target. This is part of the low-operations Glue-native solution and avoids unnecessary full scans of large telemetry datasets.
Incorrect. Although AWS Lambda is serverless, this option requires custom code to parse SQS messages, interpret S3 object events, and call Glue Data Catalog APIs correctly. That creates ongoing maintenance for retries, idempotency, schema evolution, and error handling across multiple accounts. The question explicitly asks for a solution without building custom code and with the least operational overhead.
Incorrect. Manually starting a crawler is operationally intensive and cannot reliably guarantee updates within 10 minutes of every new object write. It does not scale for frequent telemetry arrivals across multiple buckets and accounts. This directly conflicts with the automation and low-overhead requirements.
Incorrect. AWS Step Functions would add orchestration complexity and still would not by itself update the Data Catalog without additional custom tasks or integrations. The team would need to manage state machines, permissions, retries, and likely Lambda functions or other components to process SQS messages. That is more operationally heavy than using the native Glue crawler event-driven capability.
Core concept: This question is about keeping an AWS Glue Data Catalog current for new S3 objects using the most managed, low-operations approach. The key is to use AWS Glue crawler capabilities that support event-driven crawling from Amazon SQS and incremental catalog updates, rather than building custom processing logic.
Why correct: An event-driven crawler can consume the existing S3 ObjectCreated notifications from the SQS queue, and configuring the crawler for incremental updates ensures it processes only newly added data or partitions efficiently.
Key features: AWS Glue supports S3 event-driven crawlers with Amazon SQS as the event source, and crawlers can be configured to update the Data Catalog incrementally instead of performing full recrawls.
Common misconceptions: A scheduled crawler alone is not event-driven and may either miss the freshness target or cause unnecessary repeated scans, while Lambda or Step Functions would introduce custom code and more operational responsibility.
Exam tips: When a question emphasizes no custom code and least operational overhead, prefer native managed integrations such as Glue crawler event consumption from SQS and crawler recrawl/update settings over orchestration or bespoke API updates.
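A sketch of how the two correct steps map onto the Glue crawler API. The EventQueueArn field on an S3 target and the RecrawlPolicy value CRAWL_EVENT_MODE are real Glue crawler parameters; the crawler name, role ARN, bucket path, queue ARN, and 5-minute schedule below are placeholder assumptions for illustration (this builds the request payload only and does not call AWS):

```python
# Hypothetical boto3 glue.create_crawler payload showing the two
# settings this question hinges on (event source + incremental mode).
crawler_config = {
    "Name": "telemetry-crawler",                       # placeholder name
    "Role": "arn:aws:iam::111122223333:role/GlueCrawlerRole",  # placeholder
    "DatabaseName": "telemetry",
    "Targets": {
        "S3Targets": [
            {
                "Path": "s3://example-telemetry-bucket/raw/",  # placeholder
                # Event-driven crawling: the crawler consumes the S3
                # ObjectCreated notifications already flowing into the
                # dedicated SQS queue.
                "EventQueueArn": "arn:aws:sqs:us-east-1:111122223333:catalog-updates",
            }
        ]
    },
    # Incremental behavior: process only queued events since the last
    # run instead of recrawling the full dataset.
    "RecrawlPolicy": {"RecrawlBehavior": "CRAWL_EVENT_MODE"},
    # Each scheduled run drains the queue, keeping catalog freshness
    # inside the 10-minute window without long-running infrastructure.
    "Schedule": "cron(0/5 * * * ? *)",
}
print(crawler_config["RecrawlPolicy"]["RecrawlBehavior"])  # CRAWL_EVENT_MODE
```

With this configuration there is no custom event-parsing code to maintain: Glue owns reading the queue, deduplicating events, and updating partitions in the Data Catalog.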