AWS Certified Generative AI Developer - Professional (AIP-C01)


85+ practice questions with AI-verified answers

Free questions and answers
Detailed AI-powered explanations
Real exam-style questions, as close as possible to the actual exam
Browse all 85+ questions

Powered by AI

Answers and explanations verified by three AI models

Every AWS Certified Generative AI Developer - Professional (AIP-C01) answer is verified by three leading AI models to ensure maximum accuracy. Get detailed per-option explanations and an in-depth analysis of each question.

GPT Pro
Claude Opus
Gemini Pro
Per-option explanations
In-depth question analysis
Accuracy through a three-model consensus

Exam domains

Foundation Model Integration, Data Management, and Compliance (31% weighting)
Implementation and Integration (26% weighting)
AI Safety, Security, and Governance (20% weighting)
Operational Efficiency and Optimization for GenAI Applications (12% weighting)
Testing, Validation, and Troubleshooting (11% weighting)

Practice questions

Question 1 (Select 3)

A finance company is developing an AI assistant to help clients plan investments and manage their portfolios. The company identifies several high-risk conversation patterns such as requests for specific stock recommendations or guaranteed returns. High-risk conversation patterns could lead to regulatory violations if the company cannot implement appropriate controls. The company must ensure that the AI assistant does not provide inappropriate financial advice, generate content about competitors, or make claims that are not factually grounded in the company's approved financial guidance. The company wants to use Amazon Bedrock Guardrails to implement a solution. Which combination of steps will meet these requirements? (Choose three.)

Denied topics in Amazon Bedrock Guardrails are the right control for blocking specific conversational patterns described in natural language, such as 'requests for specific stock recommendations' or 'guaranteed returns.' You define the topic with a name, a definition, and a few sample phrases, and Guardrails uses the model to detect and block prompts (and outputs) that match the topic semantically. This is more robust than keyword matching because users phrase the same intent in many different ways, so it directly addresses the requirement to keep the assistant from giving inappropriate financial advice.

Content filters target a fixed set of harm categories (hate, insults, sexual content, violence, misconduct, prompt attack). 'Specific stock recommendations' or 'guaranteed returns' do not fit those categories, so a content filter would not reliably catch them. Denied topics is the correct mechanism for enforcing domain-specific conversational restrictions.

Content filters cannot accept arbitrary lists of strings such as competitor names — they classify against built-in harm categories. The right tool for an exact list is a custom word filter, which is option D.

Custom word filters apply exact-string matching, which is exactly what is needed for a closed list of competitor names. Setting both the input action and the output action to block ensures the assistant neither accepts prompts that mention competitors nor produces responses that contain them, satisfying the 'do not generate content about competitors' requirement while avoiding the false negatives a category-based filter would produce.

A low grounding score threshold means the system tolerates responses that are only weakly grounded in the source content, which is the inverse of the requirement. To enforce 'factually grounded' responses you raise the threshold (option F), making the guardrail reject any response below the higher bar.

Bedrock Guardrails contextual grounding check evaluates whether a generated response is grounded in the source/context. The grounding score threshold sets the minimum acceptable groundedness — a higher threshold rejects more weakly grounded answers, which is what the company wants when it must keep responses 'factually grounded in the company's approved financial guidance.' A low threshold (option E) would do the opposite and let ungrounded responses through.
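
The three controls discussed above map onto specific fields of the CreateGuardrail API. A minimal boto3 sketch, assuming illustrative topic definitions, competitor names, and a 0.85 grounding threshold (all hypothetical values):

```python
import boto3

bedrock = boto3.client("bedrock")

# Sketch only: topic definitions, the word list, and the threshold are illustrative.
response = bedrock.create_guardrail(
    name="investment-assistant-guardrail",
    # Denied topic for stock tips / guaranteed-return requests (semantic matching)
    topicPolicyConfig={
        "topicsConfig": [{
            "name": "SpecificInvestmentAdvice",
            "definition": "Requests for specific stock recommendations or guaranteed returns.",
            "examples": ["Which stock should I buy?", "Guarantee me a 10% return."],
            "type": "DENY",
        }]
    },
    # Exact-string matching for the competitor list
    wordPolicyConfig={
        "wordsConfig": [{"text": "ExampleCompetitor Inc"}]
    },
    # Contextual grounding check: reject responses that score below the threshold
    contextualGroundingPolicyConfig={
        "filtersConfig": [{"type": "GROUNDING", "threshold": 0.85}]
    },
    blockedInputMessaging="Sorry, I can't help with that request.",
    blockedOutputsMessaging="Sorry, I can't provide that response.",
)
print(response["guardrailId"], response["version"])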

Question 2

A healthcare company is using Amazon Bedrock to build a Retrieval Augmented Generation (RAG) application that helps practitioners make clinical decisions. The application must achieve high accuracy for patient information retrievals, identify hallucinations in generated content, and reduce human review costs. Which solution will meet these requirements?

Comprehend with Step Functions and entity-recognition confidence does not measure hallucination — it measures how confident an entity classifier is, which is unrelated to whether an FM-generated answer faithfully reflects the source. Tracking entity scores would miss factually wrong but well-formed clinical statements.

Building a custom fine-tuned medical evaluator and parallelizing it with Lambda is heavier than necessary and works against the requirement to reduce human review costs: it only shifts spending toward developing and maintaining the evaluation layer. Bedrock's built-in evaluation already provides judge-model precision and hallucination metrics out of the box.

CloudWatch Synthetics generates synthetic traffic for canary monitoring; it does not score retrieval accuracy or detect hallucinations against clinical sources. Comparing against expected outcomes only catches drift on a fixed test set and gives no per-response hallucination signal.

A hybrid evaluation system that uses LLM-as-a-judge for first-pass screening and reserves human review for the edge cases is the standard pattern for medical RAG: it produces graded scores at scale (high accuracy + hallucination scoring) while keeping clinician time focused on the responses that actually matter, which is exactly the cost-reduction the company wants. Pairing this with the Bedrock built-in evaluation, which natively reports retrieval precision and hallucination rate, removes the need to build those metrics from scratch, and SageMaker Feature Store gives a versioned home for evaluation datasets so that runs are comparable over time.
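
One way to picture the first-pass screening is a small routing function. A hedged sketch, assuming a hypothetical judge prompt, model choice, and review threshold (none of these are prescribed by the question):

```python
import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

JUDGE_MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"  # hypothetical choice
REVIEW_THRESHOLD = 0.8  # hypothetical: below this, escalate to a clinician

def judge_response(question: str, retrieved_context: str, answer: str) -> float:
    """First-pass LLM-as-a-judge: score how well the answer is grounded in the context."""
    prompt = (
        "Rate from 0.0 to 1.0 how faithfully the ANSWER is supported by the CONTEXT. "
        "Return only the number.\n\n"
        f"QUESTION: {question}\nCONTEXT: {retrieved_context}\nANSWER: {answer}"
    )
    resp = bedrock_runtime.converse(
        modelId=JUDGE_MODEL_ID,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    # Sketch: assumes the judge returns a bare number as instructed
    return float(resp["output"]["message"]["content"][0]["text"].strip())

def route(question: str, context: str, answer: str) -> dict:
    score = judge_response(question, context, answer)
    if score < REVIEW_THRESHOLD:
        return {"status": "HUMAN_REVIEW", "score": score}   # edge case: clinician reviews
    return {"status": "AUTO_APPROVED", "score": score}
```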

Question 3

A company is developing a customer support application that uses Amazon Bedrock foundation models (FMs) to provide real-time AI assistance to the company's employees. The application must display AI-generated responses character by character as the responses are generated. The application needs to support thousands of concurrent users with minimal latency. The responses typically take 15 to 45 seconds to finish. Which solution will meet these requirements?

Amazon API Gateway WebSocket APIs are designed for persistent bidirectional connections, which lets the backend push partial output to the client as soon as the model produces it. Pairing the WebSocket API with a Lambda integration that calls the Amazon Bedrock InvokeModelWithResponseStream API gives true token-by-token streaming: each chunk returned by Bedrock is forwarded over the open WebSocket as it arrives, producing the character-by-character behavior the requirements demand. WebSocket APIs also handle connection management at scale, supporting tens of thousands of concurrent clients without custom infrastructure, and Lambda scales horizontally to absorb the burst of incoming connections during 15–45 second model responses.
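
A minimal sketch of the Lambda behind the WebSocket route, assuming a hypothetical model ID and request shape: each Bedrock stream chunk is pushed back over the caller's open connection as soon as it arrives.

```python
import json
import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

def handler(event, context):
    # The WebSocket route passes the caller's connection ID and prompt.
    connection_id = event["requestContext"]["connectionId"]
    domain = event["requestContext"]["domainName"]
    stage = event["requestContext"]["stage"]
    prompt = json.loads(event["body"])["prompt"]

    # Client used to push data back over the open WebSocket connection
    apigw = boto3.client(
        "apigatewaymanagementapi",
        endpoint_url=f"https://{domain}/{stage}",
    )

    # Streaming invocation: chunks arrive as the model generates them
    response = bedrock_runtime.invoke_model_with_response_stream(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",  # hypothetical model choice
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": prompt}],
        }),
    )
    for stream_event in response["body"]:
        chunk = stream_event.get("chunk")
        if chunk:
            # Forward each partial payload to the browser immediately
            apigw.post_to_connection(ConnectionId=connection_id, Data=chunk["bytes"])
    return {"statusCode": 200}
```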

REST API + InvokeModel + 100 ms client polling cannot stream. The standard InvokeModel API blocks until the full response is generated and returns it as a single body, so polling buys nothing while the model is still generating; once it finishes the entire answer arrives at once, defeating the character-by-character requirement. The polling cadence also adds wasted client requests and scales much worse than maintaining one open WebSocket per user.

Calling Bedrock directly from the browser with IAM user credentials would force long-lived AWS credentials to be embedded in client code — a fundamental security antipattern: those credentials would have full Bedrock access, could not be safely rotated, and would be invisible to backend audit and quota controls. Even temporary credentials retrieved through a separate auth flow would still bypass server-side guardrails, throttling, and prompt management, which is unacceptable for a customer-facing application.

Caching complete responses in DynamoDB and serving them through paginated GET requests is the opposite of streaming: by definition the cache holds only the finished response, so the user sees nothing until the full 15–45 seconds elapse. Pagination just splits the already-final text into pages; it does not produce real-time output, and the cache hit rate is low for AI-assistance traffic where most prompts are unique.

Question 4

A media company must use Amazon Bedrock to implement a robust governance process for AI-generated content. The company needs to manage hundreds of prompt templates. Multiple teams use the templates across multiple AWS Regions to generate content. The solution must provide version control with approval workflows that include notifications for pending reviews. The solution must also provide detailed audit trails that document prompt activities and consistent prompt parameterization to enforce quality standards. Which solution will meet these requirements?

Bedrock Studio is the developer/builder UI; it does not provide centralized version control or audit history for prompts at the scale described. Storing approval status in DynamoDB and enforcing it with Lambda is the opposite of LEAST development — it builds a workflow that Prompt Management gives natively.

Amazon Bedrock Prompt Management is the AWS-managed service for storing prompt templates with built-in versioning, parameterized variables, and per-version access control. Using IAM policies to gate who can promote a version covers the approval-workflow requirement without bolting on a separate workflow engine, and AWS CloudTrail automatically captures every Prompt Management API call as an immutable audit record. Together these three services satisfy versioning, approvals (via permissions), audit trail, and parameterization with no custom orchestration.
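
A sketch of how a parameterized template and a frozen version might look through the bedrock-agent API; the prompt name, variables, and model are illustrative, and exact field nesting should be checked against the current SDK documentation.

```python
import boto3

bedrock_agent = boto3.client("bedrock-agent")

# Sketch: prompt name, variables, and model are illustrative.
prompt = bedrock_agent.create_prompt(
    name="press-release-summary",
    defaultVariant="v1",
    variants=[{
        "name": "v1",
        "templateType": "TEXT",
        "modelId": "amazon.nova-pro-v1:0",
        "templateConfiguration": {
            "text": {
                "text": "Summarize {{article}} for the {{audience}} team in 3 bullet points.",
                "inputVariables": [{"name": "article"}, {"name": "audience"}],
            }
        },
    }],
)

# Freezing a numbered version creates the immutable artifact that teams reference;
# every one of these API calls is recorded by CloudTrail for the audit trail.
version = bedrock_agent.create_prompt_version(promptIdentifier=prompt["id"])
print(prompt["id"], version["version"])
```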

S3 + tags is a documents-as-prompts pattern that lacks first-class prompt parameterization and versioning. EventBridge notifications cover the alerting piece, but the missing prompt-template engine means the team would still have to roll their own variable substitution and validation logic.

SageMaker Canvas is for no-code ML modeling, not prompt management. CloudFormation can version templates as files but is not designed to host hundreds of prompt variants with per-team approval flow, and AWS Config evaluates resource compliance, not prompt approvals.

Question 5 (Select 2)

A company is using Amazon Bedrock to design an application to help researchers apply for grants. The application is based on an Amazon Nova Pro foundation model (FM). The application contains four required inputs and must provide responses in a consistent text format. The company wants to receive a notification in Amazon Bedrock if a response contains bullying language. However, the company does not want to block all flagged responses. The company creates an Amazon Bedrock flow that takes an input prompt and sends it to the Amazon Nova Pro FM. The Amazon Nova Pro FM provides a response. Which additional steps must the company take to meet these requirements? (Choose two.)

Amazon Bedrock Prompt Management lets you define a parameterized prompt with named variables, pin the foundation model (Nova Pro) and inference parameters, and specify the expected output format — exactly what is needed when the four required application inputs must be passed as variables and the response must follow a consistent text format. Once defined, the prompt is referenced from the prompts node of the Bedrock flow, so the input → model → output mapping is centrally managed rather than hard-coded into the flow definition.

Setting the action to 'block' would prevent the response whenever the filter triggers, which contradicts the explicit requirement that the company does not want to block all flagged responses. The hate category is also a poor match for general bullying language, which is more accurately covered by the insults category in option D.

A prompt router selects between multiple foundation models for each invocation; it is not a mechanism to declare required inputs or constrain output format. The flow already pins Nova Pro, so introducing a router adds complexity without addressing the parameterization or content-filtering requirements.

An insults content filter set to the 'detect' (rather than 'block') action is what produces a notification when bullying language appears without preventing the response from being delivered — matching the requirement to be alerted but not block all flagged responses. Bullying language falls under the insults category in Bedrock Guardrails, and 'detect' is the action that raises an auditable signal while still letting the response flow through.

Application inference profiles control where and how a model is invoked (cross-Region inference, monitoring tags) — they are not a prompt-templating feature. Encoding the output format inside a profile description and tagging input variables is a misuse of the profile; Prompt Management (option A) is the supported way to define inputs, model, and output format.


Question 6

A company is developing an internal generative AI (GenAI) assistant that uses Amazon Bedrock to summarize corporate documents for multiple business units. The GenAI assistant must generate responses in a consistent format that includes a document summary, classification of business risks, and terms that are flagged for review. The GenAI assistant must adapt the tone of responses for each user's business unit, such as legal, human resources, or finance. The GenAI assistant must block hate speech, inappropriate topics, and sensitive information such as personal health information. The company needs a solution to centrally manage prompt variants across business units and teams. The company wants to minimize ongoing orchestration efforts and maintenance for post-processing logic. The company also wants to have the ability to adjust content moderation criteria for the GenAI assistant over time. Which solution will meet these requirements with the LEAST maintenance overhead?

Amazon Bedrock Prompt Management directly supports reusable templates and per-business-unit prompt variants — exactly the centralization the company wants. Pairing the templates with Bedrock Guardrails category filters (covering hate speech and other harmful content) and sensitive-term lists for PHI/PII handles the moderation layer with managed services, leaving very little ongoing orchestration. Both services are tunable from the console or API, so adjusting moderation criteria over time is a configuration change, not a code change.

Audience-based threshold tuning and an internal admin API are non-standard inventions, not features of Bedrock Guardrails. Building this would mean writing and maintaining a custom admin layer, which is more — not less — orchestration overhead.

Stuffing instructions into API calls, validating responses with Step Functions, and post-filtering with Comprehend creates a pipeline of moving parts (DynamoDB rules, Step Functions, Comprehend) that has to be maintained per business unit. Bedrock Guardrails on the inference call already handles category filtering with no orchestration.

Per-business-unit prompts in DynamoDB plus two Lambdas for selection and post-processing is exactly the custom orchestration the prompt asks the company to avoid. Comprehend is general-purpose NLP and is not specialized for the harmful-content categories Guardrails covers natively.

Question 7

A financial services company is building a customer support application that retrieves relevant financial regulation documents from a database based on semantic similarities to user queries. The application must integrate with Amazon Bedrock to generate responses. The application must be able to search documents that are in English, Spanish, and Portuguese. The application must filter documents by metadata such as publication date, regulatory agency, and document type. The database stores approximately 10 million document embeddings. To minimize operational overhead, the company wants a solution that minimizes management and maintenance effort. The application must provide low-latency responses for real-time customer interactions. Which solution will meet these requirements?

Amazon OpenSearch Serverless is a managed vector store that handles both vector similarity search and metadata filtering at the 10-million-document scale described, with the scaling and operational pieces taken care of for you. Connecting it as the vector store for an Amazon Bedrock Knowledge Base lets the application use RetrieveAndGenerate against an Anthropic Claude foundation model with built-in multilingual embedding support. The combination matches the trifecta the question asks for: low-latency vector search, metadata filtering on publication date / agency / document type, and minimal management overhead.
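
A sketch of the RetrieveAndGenerate call with metadata filtering layered on top of vector search; the knowledge base ID, model ARN, and metadata keys are illustrative.

```python
import boto3

bedrock_agent_runtime = boto3.client("bedrock-agent-runtime")

# Sketch: knowledge base ID, model ARN, and metadata keys are illustrative.
response = bedrock_agent_runtime.retrieve_and_generate(
    input={"text": "What disclosure rules apply to retail investment funds?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KBEXAMPLE123",
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0",
            "retrievalConfiguration": {
                "vectorSearchConfiguration": {
                    "numberOfResults": 5,
                    # Metadata filtering on top of vector similarity
                    "filter": {
                        "andAll": [
                            {"equals": {"key": "regulatory_agency", "value": "SEC"}},
                            {"equals": {"key": "document_type", "value": "regulation"}},
                        ]
                    },
                }
            },
        },
    },
)
print(response["output"]["text"])
```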

Aurora PostgreSQL with pgvector is self-managed enough that the team would need to size instances, tune indexes, and handle maintenance/patching at 10M embeddings — that's higher operational overhead than serverless, not lower. Custom SQL similarity also misses the natural integration with Bedrock Knowledge Bases.

S3 Vectors with non-filterable metadata fields cannot satisfy the explicit requirement to filter by publication date, regulatory agency, and document type. Filtering is described as core to the use case, so a non-filterable store is disqualified.

Amazon Neptune Analytics is a graph database. While it supports vector indexing, its operational model is designed for graph workloads and multi-hop relationship analytics — overkill for straightforward semantic similarity over independent documents and not the LEAST overhead option.

Question 8

A medical company is building a generative AI (GenAI) application that uses RAG to provide evidence-based medical information. The application uses Amazon OpenSearch Service to retrieve vector embeddings. Users report that searches frequently miss results that contain exact medical terms and acronyms and return too many semantically similar but irrelevant documents. The company needs to improve retrieval quality and maintain low end user latency, even as the document collection grows to millions of documents. Which solution will meet these requirements with the LEAST operational overhead?

Hybrid search combines vector similarity (semantic understanding) with traditional keyword/BM25 matching (exact terms and acronyms). This is a documented configuration in Amazon Bedrock Knowledge Bases and Amazon OpenSearch Service, so enabling it does not require new infrastructure — it is a search-mode change against the existing vector store. The combination directly addresses the failure mode the team observes: vector-only search misses queries that need exact medical terminology or acronyms, while keyword-only search misses semantically similar phrasing. Hybrid covers both, with no extra moving parts and minimal latency impact, satisfying the LEAST operational overhead requirement.
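
Because hybrid search is a retrieval setting rather than new infrastructure, switching it on can be as small as an override on the existing query. A sketch, with an illustrative knowledge base ID and query:

```python
import boto3

bedrock_agent_runtime = boto3.client("bedrock-agent-runtime")

# Sketch: switching the existing knowledge base retrieval to hybrid search.
response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId="KBEXAMPLE123",
    retrievalQuery={"text": "treatment guidelines for COPD exacerbation"},
    retrievalConfiguration={
        "vectorSearchConfiguration": {
            "numberOfResults": 10,
            # HYBRID combines semantic (vector) scoring with keyword matching,
            # so exact medical terms and acronyms are matched as well.
            "overrideSearchType": "HYBRID",
        }
    },
)
for result in response["retrievalResults"]:
    print(result["score"], result["content"]["text"][:80])
```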

Increasing embedding dimensionality from 384 to 1,536 makes both index storage and similarity computation more expensive without addressing the actual problem, which is the lack of exact-term matching. Adding a Lambda post-filter is bolt-on logic that treats the symptom by re-ranking results rather than fixing retrieval itself.

Replacing OpenSearch with Amazon Kendra is a full migration of the search backend — far from minimal operational overhead. Query expansion handles acronyms but loses the team's existing investment in their vector search setup.

A two-stage retrieve-then-rerank pipeline with a SageMaker model adds two new components to operate (the SageMaker endpoint and the orchestration around it) and increases latency. It is more capable but not the LEAST operational overhead option.

Question 9

A company runs a generative AI (GenAI)-powered summarization application in an application AWS account that uses Amazon Bedrock. The application architecture includes an Amazon API Gateway REST API that forwards requests to AWS Lambda functions that are attached to private VPC subnets. The application summarizes sensitive customer records that the company stores in a governed data lake in a centralized data storage account. The company has enabled Amazon S3, Amazon Athena, and AWS Glue in the data storage account. The company must ensure that calls that the application makes to Amazon Bedrock use only private connectivity between the company's application VPC and Amazon Bedrock. The company's data lake must provide fine-grained column-level access across the company's AWS accounts. Which solution will meet these requirements?

Interface VPC endpoints for the Bedrock runtime keep all model invocations on AWS PrivateLink, never traversing the public internet, which is exactly what the company means by 'only private connectivity between the application VPC and Amazon Bedrock.' Running the Lambda functions in private subnets behind those endpoints, plus IAM conditions that pin invocations to approved endpoints and roles, completes the network-and-identity controls. For the data lake side, AWS Lake Formation LF-tag-based access control is AWS's recommended mechanism for fine-grained, cross-account column-level grants on tables registered in the Glue Data Catalog — directly satisfying the column-level cross-account requirement.
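
A sketch of the two network-and-identity pieces: an interface endpoint for the Bedrock runtime and a policy condition that pins invocations to it. All resource IDs and the Region are illustrative.

```python
import json
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Interface endpoint so Bedrock runtime calls stay on AWS PrivateLink (values illustrative)
endpoint = ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-0123456789abcdef0",
    ServiceName="com.amazonaws.us-east-1.bedrock-runtime",
    SubnetIds=["subnet-0123456789abcdef1", "subnet-0123456789abcdef2"],
    SecurityGroupIds=["sg-0123456789abcdef3"],
    PrivateDnsEnabled=True,
)
vpce_id = endpoint["VpcEndpoint"]["VpcEndpointId"]

# IAM policy fragment that allows model invocation only through the approved endpoint
invoke_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"],
        "Resource": "*",
        "Condition": {"StringEquals": {"aws:SourceVpce": vpce_id}},
    }],
}
print(json.dumps(invoke_policy, indent=2))
```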

Routing Bedrock traffic through a NAT gateway sends it over the public internet, violating the private-only requirement. S3 ACLs are coarse-grained and cannot do column-level grants, and a weekly CloudTrail review is too infrequent for an enforceable control.

An S3 gateway endpoint does not give private connectivity for Bedrock — it covers S3 only. Calling Bedrock through public endpoints contradicts the requirement, and database-level grants cannot enforce column-level access.

VPC endpoints for both services do address private connectivity, but IAM path-based policies cannot deliver fine-grained column-level access; that requires Lake Formation LF-tags. Sending CloudTrail to CloudWatch Logs without metric filters or alarms also weakens audit posture.

Question 10

A retail company has a generative AI (GenAI) product recommendation application that uses Amazon Bedrock. The application suggests products to customers based on browsing history and demographics. The company needs to implement fairness evaluation across multiple demographic groups to detect and measure bias in recommendations between two prompt approaches. The company wants to collect and monitor fairness metrics in real time. The company must receive an alert if the fairness metrics show a discrepancy of more than 15% between demographic groups. The company must receive weekly reports that compare the performance of the two prompt approaches. Which solution will meet these requirements with the LEAST custom development effort?

Building post-processing analysis in EventBridge-triggered Lambda functions to compute fairness metrics is exactly the custom development the question asks to minimize. There is no native fairness measurement; the team would have to write and maintain the metric pipeline themselves.

Bedrock Guardrails content filters target harmful content categories (hate, insults, etc.) — they do not measure demographic fairness. The InvocationsIntervened metric counts guardrail interventions and is unrelated to fairness across demographic groups.

Amazon SageMaker Clarify is AWS's purpose-built bias and fairness analysis service — it produces statistical fairness metrics across demographic groups (DPL, DI, KL, etc.) that are exactly the metrics the company needs to compare two prompt approaches. Publishing those metrics to Amazon CloudWatch and using composite alarms lets the discrepancy threshold (15%) be expressed as a metric math rule that triggers in real time. CloudWatch dashboards then deliver the weekly performance comparison as a stock visualization rather than custom development, satisfying the LEAST custom development requirement.
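
One way the 15% discrepancy rule can be expressed is as a CloudWatch metric-math alarm over per-group metrics published from the Clarify results. A sketch with illustrative namespace, metric, and dimension names; a PromptVariant dimension could be added the same way for the weekly A/B comparison.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Publish a per-group fairness metric computed from a SageMaker Clarify job
def publish_group_metric(group: str, positive_rate: float):
    cloudwatch.put_metric_data(
        Namespace="GenAI/Fairness",
        MetricData=[{
            "MetricName": "PositiveRecommendationRate",
            "Dimensions": [{"Name": "DemographicGroup", "Value": group}],
            "Value": positive_rate,
        }],
    )

# Metric-math alarm that fires when the relative gap between groups exceeds 15%
cloudwatch.put_metric_alarm(
    AlarmName="fairness-gap-over-15-percent",
    EvaluationPeriods=1,
    ComparisonOperator="GreaterThanThreshold",
    Threshold=15.0,
    Metrics=[
        {"Id": "g1", "ReturnData": False, "MetricStat": {
            "Metric": {"Namespace": "GenAI/Fairness",
                       "MetricName": "PositiveRecommendationRate",
                       "Dimensions": [{"Name": "DemographicGroup", "Value": "group_a"}]},
            "Period": 300, "Stat": "Average"}},
        {"Id": "g2", "ReturnData": False, "MetricStat": {
            "Metric": {"Namespace": "GenAI/Fairness",
                       "MetricName": "PositiveRecommendationRate",
                       "Dimensions": [{"Name": "DemographicGroup", "Value": "group_b"}]},
            "Period": 300, "Stat": "Average"}},
        {"Id": "gap", "Label": "FairnessGapPercent", "ReturnData": True,
         "Expression": "100 * ABS(g1 - g2) / MAX([g1, g2])"},
    ],
)
```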

Bedrock model evaluation jobs are batch evaluations and do not run in real time. InvocationsIntervened is again a guardrail metric, not a fairness metric; tagging it with demographic dimensions does not produce bias measurements.


Question 11

A company has deployed an AI assistant as a React application that uses AWS Amplify, an AWS AppSync GraphQL API, and Amazon Bedrock Knowledge Bases. The application uses the GraphQL API to call the Amazon Bedrock RetrieveAndGenerate API for knowledge base interactions. The company configures an AWS Lambda resolver to use the RequestResponse invocation type. Application users report frequent timeouts and slow response times. Users report these problems more frequently for complex questions that require longer processing. The company needs a solution to fix these performance issues and enhance the user experience. Which solution will meet these requirements?

AWS Amplify AI Kit is purpose-built for streaming generative-AI responses through an AWS AppSync GraphQL API, which is exactly the existing stack. Switching the resolver to a streaming pattern means the client renders tokens as they arrive instead of waiting for RetrieveAndGenerate to complete in a single RequestResponse Lambda invocation, eliminating both the timeout symptom and the perceived slowness for long-running queries. Because the AI Kit is an Amplify primitive, the change requires no replatforming away from AppSync.

Increasing the Lambda timeout and adding exponential-backoff retries treats the symptom, not the cause. Users still see no output during the entire processing window for complex queries, and retries on a synchronous resolver compound latency rather than reduce it.

Routing into an SQS queue for asynchronous processing breaks the synchronous client UX expected from a chat-style assistant. The user no longer gets a streaming response — they get an eventual one, which is worse for the experience the team is trying to fix.

Migrating from AppSync to a separate API Gateway WebSocket API discards the existing GraphQL stack and rebuilds the whole transport layer, which is far more invasive than enabling Amplify AI Kit's streaming on the resolver they already have.

Question 12

An ecommerce company operates a global product recommendation system that needs to switch between multiple foundation models (FM) in Amazon Bedrock based on regulations, cost optimization, and performance requirements. The company must apply custom controls based on proprietary business logic, including dynamic cost thresholds, AWS Region-specific compliance rules, and real-time A/B testing across multiple FMs. The system must be able to switch between FMs without deploying new code. The system must route user requests based on complex rules including user tier, transaction value, regulatory zone, and real-time cost metrics that change hourly and require immediate propagation across thousands of concurrent requests. Which solution will meet these requirements?

Updating Lambda environment variables in the console is itself a deploy event (a function-configuration update), so it does not satisfy 'switch between FMs without deploying new code' for production-grade rollouts. There is no native validation, rollback, or staged deployment.

API Gateway request-transformation templates with stage variables are designed for routing API requests, not for evaluating per-request rules like user tier and transaction value. Stage variables are also a deploy-step change, not real-time configuration.

AWS AppConfig is the AWS-managed runtime configuration service: routing rules can be updated independently of code, deployments include validation and safe-rollout strategies, and the AppConfig Agent caches the configuration at the Lambda runtime so per-request reads are local-memory fast at thousands of concurrent invocations. Putting the routing logic in a Lambda function that reads the AppConfig configuration and selects the Bedrock foundation model lets dynamic cost thresholds, regulatory-zone rules, and A/B traffic splits all be expressed as configuration that propagates without redeploying code — which is the central requirement.
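
A sketch of the routing Lambda reading its rules from the AppConfig Lambda extension's local endpoint; the application, environment, and profile names, the rule keys, and the event shape are all illustrative.

```python
import json
import urllib.request

import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

# The AWS AppConfig Lambda extension serves the cached configuration locally.
# Application/environment/profile names below are illustrative.
APPCONFIG_URL = ("http://localhost:2772/applications/recsys"
                 "/environments/prod/configurations/model-routing")

def get_routing_rules() -> dict:
    with urllib.request.urlopen(APPCONFIG_URL) as resp:
        return json.loads(resp.read())

def select_model(user_tier: str, transaction_value: float, region_zone: str) -> str:
    rules = get_routing_rules()  # refreshed by the extension without a redeploy
    if region_zone in rules.get("restricted_zones", []):
        return rules["compliance_model_id"]
    if transaction_value > rules.get("high_value_threshold", 1000):
        return rules["premium_model_id"]
    return rules["default_model_id"]

def handler(event, context):
    model_id = select_model(event["userTier"], event["transactionValue"], event["regionZone"])
    response = bedrock_runtime.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": event["prompt"]}]}],
    )
    return {"model": model_id,
            "text": response["output"]["message"]["content"][0]["text"]}
```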

Lambda authorizers run before the integration target is selected and are not the right place to make per-request model routing decisions. Routing to model-specific Lambdas also fans out the architecture and makes A/B testing across FMs harder than handling routing in a single Lambda.

Question 13

A financial services company is developing a Retrieval Augmented Generation (RAG) application to help investment analysts query complex financial relationships across multiple investment vehicles, market sectors, and regulatory environments. The dataset contains highly interconnected entities that have multi-hop relationships. The analysts must be able to examine the relationships holistically to provide accurate investment guidance. The application must deliver comprehensive answers that capture indirect relationships between financial entities. The application must produce responses in less than 3 seconds. Which solution will meet these requirements with the LEAST operational overhead?

Amazon Bedrock Knowledge Bases with GraphRAG and Amazon Neptune Analytics is purpose-built for the multi-hop, highly interconnected queries the analysts need: GraphRAG combines vector retrieval with graph-relationship traversal so that indirect connections between entities (investment vehicles → sectors → regulatory environments) surface in answers, not just direct semantic matches. Both services are managed, so there is no relationship-traversal code to write or graph store to operate, which is the LEAST operational overhead path. Sub-3-second latency is achievable because Neptune Analytics serves graph queries from memory.

Driving multi-hop relationships through chained Lambda functions over a flat vector store rebuilds the graph traversal that GraphRAG already provides, multiplying both latency and operational surface (function code, error handling, sequencing).

OpenSearch Serverless k-NN handles vector similarity but not relationship modeling. Putting the relationship layer in EC2 Auto Scaling means the team operates an application tier that GraphRAG would handle natively.

DynamoDB has no native semantic or graph search. A custom indexing system over DynamoDB combined with SageMaker for generation is a from-scratch build that ignores the managed graph-RAG primitives Bedrock provides.

Question 14

An elevator service company has developed an AI assistant application by using Amazon Bedrock. The application generates elevator maintenance recommendations to support the company's elevator technicians. The company uses Amazon Kinesis Data Streams to collect the elevator sensor data. New regulatory rules require that a human technician must review all AI-generated recommendations. The company needs to establish human oversight workflows to review and approve AI recommendations. The company must store all human technician review decisions for audit purposes. Which solution will meet these requirements?

A custom Lambda + SQS approval workflow is exactly the kind of bespoke orchestration that .waitForTaskToken is designed to replace: the team would re-implement state, retries, and timeout handling that Step Functions already provides.

AWS Step Functions natively supports human-in-the-loop pauses through the .waitForTaskToken integration pattern: the state machine emits a token, hands it to the human-review system, and pauses indefinitely until SendTaskSuccess or SendTaskFailure is called with that token. This maps cleanly onto the technician review requirement, gives the workflow a single durable record of the decision, and persisting the approval payload to DynamoDB satisfies the long-lived audit trail. Step Functions provides built-in retries, error handling, and visualization, so the moving parts are minimal.
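
A sketch of the reviewer-facing callback that records the decision and resumes the paused execution with the task token; the table name and event fields are illustrative.

```python
import json
from datetime import datetime, timezone

import boto3

sfn = boto3.client("stepfunctions")
dynamodb = boto3.resource("dynamodb")
decisions_table = dynamodb.Table("TechnicianReviewDecisions")  # illustrative table name

def record_decision(event, context):
    """Invoked when a technician approves or rejects an AI recommendation."""
    task_token = event["taskToken"]          # emitted by the .waitForTaskToken state
    decision = event["decision"]             # "APPROVE" or "REJECT"
    recommendation_id = event["recommendationId"]

    # Durable audit record of the human decision
    decisions_table.put_item(Item={
        "recommendationId": recommendation_id,
        "decision": decision,
        "reviewer": event["technicianId"],
        "reviewedAt": datetime.now(timezone.utc).isoformat(),
    })

    # Resume the paused Step Functions execution
    if decision == "APPROVE":
        sfn.send_task_success(taskToken=task_token,
                              output=json.dumps({"approved": True}))
    else:
        sfn.send_task_failure(taskToken=task_token,
                              error="RecommendationRejected",
                              cause=f"Rejected by {event['technicianId']}")
```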

AWS Glue is a data-integration/ETL service, not a workflow engine for human approvals; using a Glue workflow to gate technician review is misusing the service and lacks the long-running task-token pattern.

EventBridge can route events but has no native concept of pausing for a human decision, and Glue jobs and ElastiCache are the wrong tools for storing audit decisions (a cache is volatile by design).

Question 15

A financial services company uses an AI application to process financial documents by using Amazon Bedrock. During business hours, the application handles approximately 10,000 requests each hour, which requires consistent throughput. The company uses the CreateProvisionedModelThroughput API to purchase provisioned throughput. Amazon CloudWatch metrics show that the provisioned capacity is unused while on-demand requests are being throttled. The company finds the following code in the application:

```python
response = bedrock_runtime.invoke_model(
    modelId="anthropic.claude-v2",
    body=json.dumps(payload),
)
```

The company needs the application to use the provisioned throughput and to resolve the throttling issues. Which solution will meet these requirements?

Adding more model units does not help when no traffic is actually hitting the provisioned capacity in the first place. The provisioned pool is unused, not undersized.

Amazon Bedrock provisioned throughput is associated with the ARN that CreateProvisionedModelThroughput returns — not with the foundation model ID. The code in the prompt invokes 'anthropic.claude-v2', which is the on-demand model ID, so every request is being served from the on-demand pool (and getting throttled), while the provisioned capacity sits idle. Replacing modelId with the provisioned ARN is what binds the application to the dedicated capacity and resolves both the unused-throughput and the throttling symptom in one change.
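
A minimal sketch of the fix: pass the provisioned model ARN returned by CreateProvisionedModelThroughput as modelId (the ARN, payload, and prompt below are illustrative).

```python
import json
import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

# ARN returned by CreateProvisionedModelThroughput (value is illustrative)
PROVISIONED_MODEL_ARN = (
    "arn:aws:bedrock:us-east-1:111122223333:provisioned-model/abc123def456"
)

payload = {
    "prompt": "\n\nHuman: Summarize this filing.\n\nAssistant:",
    "max_tokens_to_sample": 300,
}

response = bedrock_runtime.invoke_model(
    modelId=PROVISIONED_MODEL_ARN,   # bind the call to the dedicated capacity
    body=json.dumps(payload),
)
```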

Exponential backoff masks throttling on the on-demand path but never causes the application to use the provisioned throughput it has already paid for. It also adds latency rather than removing the cause of the throttling.

Switching to InvokeModelWithResponseStream changes the response shape (streaming vs unary) but still routes against the same model ID, so the provisioned capacity remains untouched.


Question 16

A financial services company uses multiple foundation models (FMs) through Amazon Bedrock for its generative AI (GenAI) applications. To comply with a new regulation for GenAI use with sensitive financial data, the company needs a token management solution. The token management solution must proactively alert when applications approach model-specific token limits. The solution must also process more than 5,000 requests each minute and maintain token usage metrics to allocate costs across business units. Which solution will meet these requirements?

Token limits in Amazon Bedrock are model-specific, so estimating token usage with the appropriate per-model tokenizer in a Lambda function before sending the request is the only way to alert proactively (i.e., before the request is rejected). Publishing those estimates as Amazon CloudWatch metrics gives a single place to set thresholds and alarms, and storing detailed usage in Amazon DynamoDB supports per-business-unit cost allocation. The combination directly maps to the three requirements (proactive alerts, throughput at 5,000 requests per minute, and cross-business-unit cost reporting) using only managed AWS services.
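
A hedged sketch of the pre-invocation check: a rough characters-per-token estimate (a real implementation would use each model's own tokenizer), a CloudWatch metric for proactive alarms, and a DynamoDB item for cost allocation. Table, namespace, and limit values are illustrative.

```python
import time

import boto3

cloudwatch = boto3.client("cloudwatch")
dynamodb = boto3.resource("dynamodb")
usage_table = dynamodb.Table("TokenUsageByBusinessUnit")  # illustrative table name

# Hypothetical per-model limit and a rough 4-characters-per-token estimate
MODEL_TOKEN_LIMITS = {"anthropic.claude-3-sonnet-20240229-v1:0": 200_000}

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def check_and_record(model_id: str, prompt: str, business_unit: str) -> bool:
    tokens = estimate_tokens(prompt)
    limit = MODEL_TOKEN_LIMITS[model_id]

    # Publish the estimate so a CloudWatch alarm can fire before the limit is hit
    cloudwatch.put_metric_data(
        Namespace="GenAI/TokenUsage",
        MetricData=[{
            "MetricName": "EstimatedInputTokens",
            "Dimensions": [{"Name": "ModelId", "Value": model_id},
                           {"Name": "BusinessUnit", "Value": business_unit}],
            "Value": tokens,
        }],
    )
    # Detailed record for per-business-unit cost allocation
    usage_table.put_item(Item={
        "businessUnit": business_unit,
        "timestamp": int(time.time() * 1000),
        "modelId": model_id,
        "estimatedTokens": tokens,
    })
    return tokens < limit  # caller sends the Bedrock request only if True
```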

Bedrock Guardrails do not have token-quota policies; their concerns are content moderation, denied topics, and PII handling. Building token alerting on guardrail metrics misuses the service and misses the proactive-alerting requirement.

An SQS dead-letter queue captures failed requests after the fact, which is reactive — the requirement explicitly asks for proactive alerts before token limits are reached. Log-based analysis through Logs Insights is also slower than per-request CloudWatch metrics.

API Gateway usage plans count requests, not tokens. Token usage is content-dependent and varies per request even within the same usage plan, so request-count throttling cannot enforce token-level limits accurately.

Question 17

A retail company is developing a customer service application that must process 10,000 daily queries about products, orders, and warranties. The application must be able to respond to queries about 50,000 product documents that are updated every day. The application must integrate with an order management API to check the status of orders and to help process returns. The application must maintain context throughout multi-turn interactions with customers. The company must collect complete audit trails for application responses. Which solution will meet these requirements with the LEAST operational overhead?

Fine-tuning a model per product category is heavy and expensive, does not give you order-management API integration, and conflicts with the daily-update requirement (the catalog would drift away from the model after every fine-tune).

Continued pre-training is even heavier than fine-tuning, and again does not address API-tool use or maintainability under daily document changes. Custom Lambda + REST API on top is hand-built orchestration that Bedrock agents already provide.

SageMaker AI containers, Kendra, and Step Functions are a workable but custom alternative — they require the team to operate model containers and string together orchestration that Bedrock agents handle out of the box.

An Amazon Bedrock agent with action groups can call the order management API directly, while an associated Bedrock Knowledge Base provides RAG over the 50,000 daily-updated product documents — the agent picks whether to retrieve from documentation or invoke the order action per user turn. Multi-turn context is built into Bedrock agents through session state, and enabling trace events captures the agent's reasoning, retrieval, and action invocations for the audit trail requirement. The whole stack is managed, with no orchestration layer to maintain.
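
A sketch of invoking such an agent with a persistent session and tracing enabled; the agent and alias IDs, session ID, and prompt are illustrative.

```python
import boto3

bedrock_agent_runtime = boto3.client("bedrock-agent-runtime")

# Sketch: agent and alias IDs are illustrative. Reusing the same sessionId
# across turns keeps the agent's multi-turn context.
response = bedrock_agent_runtime.invoke_agent(
    agentId="AGENT123456",
    agentAliasId="ALIAS123456",
    sessionId="customer-42-session-1",
    inputText="Where is my order 98765, and can I return the blender?",
    enableTrace=True,  # emits reasoning / retrieval / action traces for the audit trail
)

answer = ""
for stream_event in response["completion"]:
    if "chunk" in stream_event:
        answer += stream_event["chunk"]["bytes"].decode("utf-8")
    elif "trace" in stream_event:
        # Persist trace events (e.g., to S3 or CloudWatch Logs) for auditing
        pass
print(answer)
```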

Question 18

A company provides a service that helps users from around the world discover new restaurants. The service has 50 million monthly active users. The company wants to implement a semantic search solution across a database that contains 20 million restaurants and 200 million reviews. The company currently stores the data in a PostgreSQL database. The solution must support complex natural language queries and return results for at least 95% of queries within 500 ms. The solution must maintain data freshness for restaurant details that update hourly. The solution must also scale cost-effectively during peak usage periods. Which solution will meet these requirements with the LEAST development effort?

Pure keyword search with custom analyzers cannot satisfy 'complex natural language queries' — that is precisely the case where semantic similarity matters and lexical matching falls short.

OpenSearch Service is built for this scale: 20 million restaurants and 200 million reviews fit comfortably with proven sub-500 ms p95 latency on properly sized indexes. Generating embeddings with a Bedrock foundation model and storing them as vector fields lets the same FM convert user queries to vectors at search time, so semantic similarity uses identical embedding semantics on both sides. k-NN search returns the top semantically similar results, and OpenSearch's index updates absorb the hourly freshness requirement. The development effort is mostly configuration of the vector index and the embedding pipeline.
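
A sketch of the query path: embed the user's query with a Bedrock embedding model and run a k-NN search against the same vector field used at indexing time. Endpoint, index, and field names are illustrative, and authentication setup is omitted.

```python
import json
import boto3
from opensearchpy import OpenSearch

bedrock_runtime = boto3.client("bedrock-runtime")
# Endpoint, index, and field names are illustrative; auth configuration omitted.
opensearch = OpenSearch(
    hosts=[{"host": "search-restaurants.us-east-1.es.amazonaws.com", "port": 443}],
    use_ssl=True,
)

def embed(text: str) -> list:
    # Titan Text Embeddings V2 via Bedrock
    resp = bedrock_runtime.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(resp["body"].read())["embedding"]

def semantic_search(query: str, k: int = 10):
    query_vector = embed(query)  # same embedding model used at indexing time
    return opensearch.search(
        index="restaurants",
        body={
            "size": k,
            "query": {"knn": {"embedding_vector": {"vector": query_vector, "k": k}}},
        },
    )

results = semantic_search("cozy late-night ramen spot with vegetarian options")
for hit in results["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["name"])
```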

Aurora PostgreSQL with pgvector struggles at hundreds of millions of vectors: index build time, query latency, and operational tuning all become real burdens at this scale, and you carry the always-on instance cost.

Bedrock Knowledge Bases plus a custom ingestion pipeline is heavier than necessary: KB has practical ingestion and document-count limits that make 220M+ items awkward, and the Retrieve API does not expose the vector index for custom search-tuning.

Question 19 (Select 2)

A company uses Amazon Bedrock to generate technical content for customers. The company has recently experienced a surge in hallucination outputs when the company's model generates summaries of long technical documents. The model outputs include inaccurate or fabricated details. The company's current solution uses a large foundation model (FM) with a basic one-shot prompt that includes the full document in a single input. The company needs a solution that will reduce hallucinations and meet factual accuracy goals. The solution must process more than 1,000 documents each hour and deliver summaries within 3 seconds for each document. Which combination of solutions will meet these requirements? (Choose two.)

Zero-shot chain-of-thought prompting forces the model to reason step by step and verify intermediate facts before producing the summary, which is one of the most-validated techniques for reducing hallucinations on long-document summarization tasks. It is a prompt-side change, so it adds essentially no latency or infrastructure and can be measured against the existing one-shot baseline.

Retrieval Augmented Generation with a Bedrock Knowledge Base grounds the summary in source content rather than asking the FM to recall it. Semantic chunking keeps each unit of meaning intact (a section, a finding) and tuned embeddings improve retrieval relevance, so the model is summarizing what it can actually see rather than what it might have memorized.
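
A sketch combining the two chosen techniques: a chain-of-thought style prompt applied to retrieved chunks rather than the full document. The chunk source, prompt wording, model, and temperature are illustrative.

```python
import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

def summarize(chunks: list[str]) -> str:
    # e.g., the top chunks returned by a knowledge base query for this document
    context = "\n\n".join(chunks)
    prompt = (
        "Using ONLY the excerpts below, think step by step: list the key claims, "
        "check that each claim appears in the excerpts, then write a 5-sentence "
        "summary containing only verified claims.\n\n"
        f"EXCERPTS:\n{context}"
    )
    resp = bedrock_runtime.converse(
        modelId="amazon.nova-pro-v1:0",
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"temperature": 0.2},  # low temperature, not high
    )
    return resp["output"]["message"]["content"][0]["text"]
```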

Bedrock Guardrails cannot match arbitrary 'hallucination patterns' — guardrails operate on harm categories, denied topics, PII, and grounding scores. Pattern-matching hallucinated content via filters is not a supported capability.

Increasing the temperature parameter makes the model produce more varied (and less deterministic) outputs, which empirically increases hallucinations rather than decreasing them. This is the wrong direction.

Continuing to summarize the entire document in one pass is the existing approach that is already failing on long technical documents — repeating it changes nothing about the failure mode.

Question 20

A company is building a generative AI (GenAI) application that produces content based on a variety of internal and external data sources. The company wants to ensure that the generated output is fully traceable. The application must support data source registration and enable metadata tagging to attribute content to its original source. The application must also maintain audit logs of data access and usage throughout the pipeline. Which solution will meet these requirements?

Lake Formation does centralize permissions over Catalog tables but does not register heterogeneous data sources or apply metadata at the source level the way the question describes. Tagging objects directly in S3 is also limited to S3 and does not extend to other sources.

CloudWatch Logs is for application/operational logging, not audit. Pairing it with Glue Catalog leaves the audit-trail requirement unmet because CloudWatch retention, immutability, and per-API granularity are weaker than CloudTrail.

S3 object tagging covers attribution only for S3 objects; Glue Data Catalog managing schemas plus CloudTrail for S3 access leaves cross-service audit (Bedrock, Lambda, Glue, etc.) out of scope. The audit picture is incomplete.

AWS Glue Data Catalog is the AWS service for registering data sources and attaching metadata to them — including provenance and source attribution tags — across S3, RDS, Redshift, and other services. AWS CloudTrail captures management and data-plane API events across services, which gives you a single, immutable audit log of who accessed what and when throughout the GenAI pipeline. The pair satisfies registration, metadata tagging, and end-to-end auditability with no custom infrastructure.
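
A sketch of the two halves: registering a source table with attribution metadata in the Glue Data Catalog, and querying CloudTrail for the corresponding API activity. The database, table, parameter keys, and S3 location are illustrative.

```python
import boto3

glue = boto3.client("glue")
cloudtrail = boto3.client("cloudtrail")

# Register a source in the Glue Data Catalog with attribution metadata
glue.create_table(
    DatabaseName="genai_sources",
    TableInput={
        "Name": "partner_press_feed",
        "Parameters": {              # metadata tags used for source attribution
            "source_system": "partner-press-api",
            "license": "external-syndicated",
            "owner_team": "content-ops",
        },
        "StorageDescriptor": {
            "Location": "s3://example-bucket/partner-press/",
            "Columns": [{"Name": "article_id", "Type": "string"},
                        {"Name": "body", "Type": "string"}],
        },
    },
)

# CloudTrail already records the API activity; recent catalog events can be pulled back
events = cloudtrail.lookup_events(
    LookupAttributes=[{"AttributeKey": "EventName", "AttributeValue": "CreateTable"}],
    MaxResults=10,
)
for e in events["Events"]:
    print(e["EventTime"], e["EventName"], e.get("Username"))
```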

