Databricks

Databricks Certified Generative AI Engineer Associate: Certified Generative AI Engineer Associate

87+ past exam questions (with AI-verified answers)

Actual past exam questions

Detailed explanations

Closest to the real exam

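AI-powered

Triple AI-verified answers and explanations

All Databricks Certified Generative AI Engineer Associate: Certified Generative AI Engineer Associate answers are cross-verified by three leading AI models to ensure the highest accuracy. Detailed per-option explanations and in-depth question analysis are provided.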

GPT Pro

Claude Opus

Gemini Pro

Detailed per-option explanations

In-depth question analysis

Three-model consensus accuracy

Exam domains

Design Applications: 14% of the exam

Data Preparation: 14% of the exam

Application Development: 30% of the exam

Assembling and Deploying Applications: 22% of the exam

Evaluation and Monitoring: 12% of the exam

Governance: 8% of the exam

Practice questions

Question 1

(Choose two)

A Generative AI Engineer has created a RAG application to look up answers to questions about a series of fantasy novels that are being asked on the author’s web forum. The fantasy novel texts are chunked and embedded into a vector store with metadata (page number, chapter number, book title), retrieved with the user’s query, and provided to an LLM for response generation. The Generative AI Engineer used their intuition to pick the chunking strategy and associated configurations but now wants to more methodically choose the best values. Which TWO strategies should the Generative AI Engineer take to optimize their chunking strategy and parameters? (Choose two.)

Question analysis

Core concept: This question tests systematic evaluation and tuning of Retrieval-Augmented Generation (RAG) retrieval quality, specifically chunking strategy (how to split documents) and chunking parameters (chunk size, overlap, separators). In Databricks GenAI workflows, chunking is treated as a retriever hyperparameter that should be optimized using offline evaluation (ground-truth Q/A sets) and/or model-based judging.

Why the answers are correct: C is correct because the most defensible way to optimize chunking is to define an information-retrieval metric (e.g., Recall@k, Precision@k, MRR, NDCG) and run controlled experiments varying chunk boundaries (paragraphs, sections, chapters), chunk size, and overlap. Chunking directly affects whether the retriever returns the passages that contain the answer and how much irrelevant text is included. Using retrieval metrics on a labeled set of questions with known relevant passages provides objective, repeatable selection of the best configuration.

E is correct because LLM-as-a-judge is a practical evaluation approach when exact relevance labels are expensive. You can have an LLM score whether retrieved chunks are the “most appropriate” evidence for answering historical forum questions (or a curated eval set). This produces a scalar metric that can be optimized across chunking configurations. It aligns with modern RAG evaluation practices: judge the evidence quality and/or answer groundedness, then tune chunking to maximize those scores.

Key features / best practices:
- Build an evaluation dataset: questions + expected answer or expected source passages.
- Evaluate retrieval separately from generation (retrieval metrics) and optionally end-to-end (judge answer quality/groundedness).
- Sweep chunk size/overlap and splitting heuristics; keep other variables fixed to isolate chunking impact.
- Use consistent k (top-k) and reranking settings during experiments.

Common misconceptions:
- Changing embedding models (A) can improve retrieval, but it does not methodically optimize chunking parameters; it confounds variables.
- Query classification and metadata filtering (B) can help retrieval, but it’s a different optimization lever than chunking.
- Asking an LLM for “best token count” (D) is not an evaluation; it guesses a size without measuring retrieval success.

Exam tips: When asked to “methodically choose best chunking values,” look for answers involving (1) explicit evaluation metrics and (2) systematic experimentation/ablation. Databricks exam questions often distinguish between tuning retrieval (chunking, embeddings, indexing) and adding pipeline complexity (classifiers/filters) that doesn’t directly validate chunking quality.
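
The metric-driven sweep described in this analysis can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a Databricks API: the word-based chunker, the lexical-overlap ranker (a stand-in for a real embedding retriever), the "answer phrase appears in the chunk" relevance rule, and all function names are hypothetical choices made for the example.

```python
import re


def tokenize(text):
    """Lowercase word tokens; a toy stand-in for a real tokenizer."""
    return re.findall(r"[a-z0-9']+", text.lower())


def chunk_words(text, size, overlap):
    """Split text into chunks of `size` words, sharing `overlap` words between neighbours."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]


def rank_chunks(query, chunks):
    """Return chunk indices ordered by shared-token count (assumed toy retriever)."""
    q = set(tokenize(query))
    scores = [len(q & set(tokenize(c))) for c in chunks]
    return sorted(range(len(chunks)), key=lambda i: (-scores[i], i))


def evaluate_config(text, eval_set, size, overlap, k=3):
    """Hit rate@k and MRR@k for one chunking configuration.

    eval_set holds (question, answer_phrase) pairs; a chunk counts as
    relevant when it contains the answer phrase (an assumed labeling rule).
    """
    if not eval_set:
        raise ValueError("eval_set must not be empty")
    chunks = chunk_words(text, size, overlap)
    hits, rr_sum = 0, 0.0
    for question, answer_phrase in eval_set:
        top_k = rank_chunks(question, chunks)[:k]
        rank = next((pos for pos, idx in enumerate(top_k, start=1)
                     if answer_phrase.lower() in chunks[idx].lower()), None)
        if rank is not None:
            hits += 1
            rr_sum += 1.0 / rank
    n = len(eval_set)
    return {"hit_rate@k": hits / n, "mrr@k": rr_sum / n}


if __name__ == "__main__":
    # Hypothetical corpus and labeled questions, invented for this sketch.
    novel = ("the young mage Elira crossed the frozen pass at dawn and in the "
             "valley below she found the silver gate that only opens for those "
             "who carry the ember key")
    labeled = [("what opens the silver gate", "ember key"),
               ("who crossed the frozen pass", "mage Elira")]
    # Sweep chunk size and overlap while holding k and the retriever fixed.
    for size in (5, 10, 20):
        for overlap in (0, 2):
            print(size, overlap, evaluate_config(novel, labeled, size, overlap))
```

Holding `k` and the ranker fixed while varying only `size` and `overlap` keeps the comparison attributable to chunking. In a real pipeline the ranker would be the actual vector index, and the relevance rule would come from labeled passages or an LLM judge.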

Question 2

A company has a typical RAG-enabled, customer-facing chatbot on its website.

diagram

Select the correct sequence of components a user's questions will go through before the final output is returned. Use the diagram above for reference.

Question 3

A Generative AI Engineer interfaces with an LLM with prompt/response behavior that has been trained on customer calls inquiring about product availability. The LLM is designed to output “In Stock” if the product is available or only the term “Out of Stock” if not. Which prompt will work to allow the engineer to respond to call classification labels correctly?

Question 4

When developing an LLM application, it’s crucial to ensure that the data used for training the model complies with licensing requirements to avoid legal risks. Which action is NOT appropriate to avoid legal risks?

Question 5

A Generative AI Engineer is developing a chatbot designed to assist users with insurance-related queries. The chatbot is built on a large language model (LLM) and is conversational. However, to maintain the chatbot’s focus and to comply with company policy, it must not provide responses to questions about politics. Instead, when presented with political inquiries, the chatbot should respond with a standard message: “Sorry, I cannot answer that. I am a chatbot that can only answer questions around insurance.” Which framework type should be implemented to solve this?

Question 6

(Choose two)

A Generative AI Engineer is responsible for developing a chatbot to enable their company’s internal HelpDesk Call Center team to more quickly find related tickets and provide resolution. While creating the GenAI application work breakdown tasks for this project, they realize they need to start planning which data sources (either Unity Catalog volume or Delta table) they could choose for this application. They have collected several candidate data sources for consideration:

- call_rep_history: a Delta table with primary keys representative_id, call_id. This table is maintained to calculate representatives’ call resolution from fields call_duration and call start_time.
- transcript Volume: a Unity Catalog Volume of all recordings as *.wav files, but also a text transcript as *.txt files.
- call_cust_history: a Delta table with primary keys customer_id, call_id. This table is maintained to calculate how much internal customers use the HelpDesk to make sure that the charge back model is consistent with actual service use.
- call_detail: a Delta table that includes a snapshot of all call details updated hourly. It includes root_cause and resolution fields, but those fields may be empty for calls that are still active.
- maintenance_schedule: a Delta table that includes a listing of both HelpDesk application outages as well as planned upcoming maintenance downtimes.

They need sources that could add context to best identify ticket root cause and resolution. Which TWO sources do that? (Choose two.)

Question 7

A Generative AI Engineer is creating an LLM-based application. The documents for its retriever have been chunked to a maximum of 512 tokens each. The Generative AI Engineer knows that cost and latency are more important than quality for this application. They have several context length levels to choose from. Which will fulfill their need?

Question 8

A Generative AI Engineer is designing a RAG application for answering user questions on technical regulations as they learn a new sport. What are the steps needed to build this RAG application and deploy it?

Question analysis

Core concept: This question tests the end-to-end lifecycle of a Retrieval-Augmented Generation (RAG) application on Databricks: preparing a knowledge corpus, indexing it in Databricks Vector Search, performing retrieval at query time to ground an LLM’s response, then evaluating and deploying the final chain (retriever + prompt + model) using Databricks Model Serving.

Why the answer is correct: A correct RAG flow has two phases: (1) offline/ingestion: ingest documents, chunk/clean, embed, and index into a vector store; and (2) online/inference: user query → retrieve relevant chunks → provide them as context to the LLM → generate an answer. Evaluation should occur after you have a working end-to-end pipeline (retrieval + generation), because you need to measure answer quality (groundedness, relevance, correctness) and retrieval quality (recall/precision) on representative questions. Option B places evaluation after the LLM generates responses, which matches how RAG systems are evaluated in practice.

Key features / best practices (Databricks-specific):
- Ingestion: use Auto Loader / Delta for scalable ingestion; chunk documents appropriately for embedding.
- Indexing: create embeddings (e.g., via foundation model endpoints or embedding models) and store them in a Delta table; build a Databricks Vector Search index (managed or self-managed) for similarity search.
- Retrieval: at query time, embed the user question, retrieve top-k chunks, optionally apply filters (metadata, regulation version), and pass retrieved text into the prompt.
- Evaluation: use offline evaluation sets and metrics (answer correctness, faithfulness/groundedness, context relevance). Track results with MLflow to compare prompt/model/index changes.
- Deployment: package the RAG chain as a model (often via MLflow) and deploy with Databricks Model Serving for low-latency, scalable endpoints.

Common misconceptions: A common trap is placing evaluation before generation (you can’t evaluate answer quality without answers) or omitting the query-time retrieval step. Another confusion is thinking “LLM retrieves documents” directly; in RAG, retrieval is performed by the application using Vector Search, then the LLM is conditioned on retrieved context.

Exam tips: Look for the canonical ordering: ingest/index first; query-time retrieval then generation; evaluation after end-to-end behavior exists; deploy last. If evaluation appears before generation, it’s usually incorrect for RAG application evaluation.
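
To make the ordering concrete, here is a minimal, library-free Python sketch of the two phases followed by an evaluation pass. Everything in it is an assumption for illustration: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for a Databricks Vector Search index, and `echo_llm` is a hypothetical placeholder for a served model. Deployment is omitted.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: bag-of-words counts (stand-in for an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


# Phase 1 (offline): ingest -> chunk -> embed -> index.
def build_index(documents, chunk_size=20):
    index = []
    for doc_id, text in documents.items():
        words = text.split()
        for start in range(0, len(words), chunk_size):
            chunk = " ".join(words[start:start + chunk_size])
            index.append({"doc_id": doc_id, "text": chunk, "vector": embed(chunk)})
    return index


# Phase 2 (online): embed the query -> retrieve top-k -> prompt -> generate.
def retrieve(index, question, k=2):
    qv = embed(question)
    return sorted(index, key=lambda e: cosine(qv, e["vector"]), reverse=True)[:k]


def build_prompt(question, chunks):
    context = "\n".join(f"- {c['text']}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


def answer(index, question, llm, k=2):
    chunks = retrieve(index, question, k)
    return llm(build_prompt(question, chunks)), chunks


# Evaluation comes after the end-to-end chain exists.
def retrieval_accuracy(index, eval_set, k=1):
    """Fraction of questions whose expected doc_id appears in the top-k chunks."""
    hits = sum(any(c["doc_id"] == expected for c in retrieve(index, q, k))
               for q, expected in eval_set)
    return hits / len(eval_set)


def echo_llm(prompt):
    """Hypothetical LLM stub: returns the first context line of the prompt."""
    return prompt.splitlines()[1][2:]


if __name__ == "__main__":
    rules = {  # invented sample regulations
        "offside": "A player is offside when nearer to the goal line than the ball and the second last opponent",
        "corner": "A corner kick is awarded when the ball crosses the goal line last touched by a defender",
    }
    idx = build_index(rules)
    reply, used = answer(idx, "When is a corner kick awarded?", echo_llm, k=1)
    print(reply)
    print(retrieval_accuracy(idx, [("When is a corner kick awarded?", "corner")]))
```

The point of the sketch is the call order: `build_index` runs before any query, `retrieve` runs before the model is called, and `retrieval_accuracy` can only be computed once both exist.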

Question 9

A Generative AI Engineer just deployed an LLM application at a digital marketing company that assists with answering customer service inquiries. Which metric should they monitor for their customer service LLM application in production?

Question 10

A Generative AI Engineer is building a Generative AI system that suggests the best matched employee team member to newly scoped projects. The team member is selected from a very large team. The match should be based upon project date availability and how well their employee profile matches the project scope. Both the employee profile and project scope are unstructured text. How should the Generative AI Engineer architect their system?

Question analysis

Core concept: This question tests retrieval-augmented matching for unstructured text at scale, combined with structured constraints (date availability). In Databricks GenAI application design, the standard pattern is: (1) apply hard filters/constraints using deterministic tools (availability), then (2) use embeddings + vector search to semantically retrieve the best matches from a large corpus (employee profiles), optionally with metadata filtering.

Why D is correct: Employee profiles and project scopes are unstructured text, so semantic similarity via embeddings is the most robust way to match “fit” beyond exact keywords. Because the team is very large, iterating through all employees and scoring each one is inefficient and does not leverage vector indexes. The system should first call a tool that returns available employees for the project dates (structured query against a schedule/HR system). Then, embed employee profiles into a vector store (Databricks Vector Search) and query using the project scope embedding. Critically, use filtering (metadata filters or pre-filtered candidate IDs) so retrieval only considers available employees. This yields scalable, low-latency top-k candidates that can then be optionally re-ranked or summarized by an LLM.

Key features / best practices:
- Use Databricks Vector Search with an embedding model (e.g., foundation model embeddings) to index employee profile chunks.
- Store metadata alongside vectors (employee_id, role, skills, location, and availability flags or a join key) to enable filtered retrieval.
- Apply “hard constraints first” (availability) to reduce the search space, then semantic retrieval for relevance.
- Optionally add a second-stage re-ranker (LLM or cross-encoder) on the top-k results for higher precision, but the core architecture remains filtered vector retrieval.

Common misconceptions:
- Keyword extraction/matching (B) seems simpler but fails on synonyms, nuanced scope descriptions, and varied writing styles.
- Brute-force similarity scoring over all employees (C) is conceptually valid but not scalable for a very large team.
- Embedding project scopes and querying with employee profiles (A) reverses the typical direction and doesn’t naturally support “find best employee for a project”; it also complicates filtering by availability.

Exam tips: When you see “very large corpus” + “unstructured text matching,” default to embeddings + vector search. When you also see “date availability” or other structured constraints, use tools/filters to enforce constraints, then perform vector retrieval with metadata filtering to get the best candidates efficiently.
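
The “hard constraints first, then semantic retrieval” pattern can be sketched as follows. This is a toy illustration, not the Databricks Vector Search API: the `Employee` record, the bag-of-words `embed` function, and the linear scan over the filtered candidates are all assumptions. In a real system the profile vectors would be precomputed in a vector index and the availability constraint passed as a metadata filter, so no full scan occurs.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class Employee:
    """Hypothetical record: profile text plus structured availability metadata."""
    employee_id: str
    profile: str
    available_from: date
    available_to: date


def embed(text):
    """Toy embedding: bag-of-words counts (stand-in for an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def available_ids(employees, start, end):
    """Deterministic 'tool' step: employees free for the whole project window."""
    return {e.employee_id for e in employees
            if e.available_from <= start and e.available_to >= end}


def match(employees, scope, start, end, k=3):
    """Filter on availability first, then rank the survivors by similarity."""
    allowed = available_ids(employees, start, end)
    scope_vec = embed(scope)
    candidates = [e for e in employees if e.employee_id in allowed]
    ranked = sorted(candidates,
                    key=lambda e: cosine(scope_vec, embed(e.profile)),
                    reverse=True)
    return [e.employee_id for e in ranked[:k]]


if __name__ == "__main__":
    team = [  # invented sample data
        Employee("e1", "data engineer spark delta pipelines",
                 date(2025, 1, 1), date(2025, 6, 30)),
        Employee("e2", "spark streaming delta pipelines engineer",
                 date(2025, 7, 1), date(2025, 12, 31)),
        Employee("e3", "frontend designer css react",
                 date(2025, 1, 1), date(2025, 12, 31)),
    ]
    print(match(team, "build spark delta pipelines",
                date(2025, 2, 1), date(2025, 3, 31), k=2))
```

Here `e2` has a closely matching profile but is never scored, because the availability filter removes it before any similarity is computed.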

Question 11

A Generative AI Engineer has a provisioned throughput model serving endpoint as part of a RAG application and would like to monitor the serving endpoint’s incoming requests and outgoing responses. The current approach is to include a micro-service in between the endpoint and the user interface to write logs to a remote server. Which Databricks feature should they use instead which will perform the same task?

Question 12

A Generative AI Engineer is building a system which will answer questions on latest stock news articles. Which will NOT help with ensuring the outputs are relevant to financial news?

Question 13

A Generative AI Engineer has been asked to build an LLM-based question-answering application. The application should take into account new documents that are frequently published. The engineer wants to build this application with the least cost and least development effort and have it operate at the lowest cost possible. Which combination of chaining components and configuration meets these requirements?

Question 14

A Generative AI Engineer is using the code below to test setting up a vector store:

from databricks.vector_search.client import VectorSearchClient

vsc = VectorSearchClient()

vsc.create_endpoint(
    name="vector_search_test",
    endpoint_type="STANDARD"
)

Assuming they intend to use Databricks managed embeddings with the default embedding model, what should be the next logical function call?

Question 15

A Generative AI Engineer wants to build an LLM-based solution to help a restaurant improve its online customer experience with bookings by automatically handling common customer inquiries. The goal of the solution is to minimize escalations to human intervention and phone calls while maintaining a personalized interaction. To design the solution, the Generative AI Engineer needs to define the input data to the LLM and the task it should perform. Which input/output pair will support their goal?

Question 16

What is an effective method to preprocess prompts using custom code before sending them to an LLM?

Question 17

A Generative AI Engineer is developing an LLM application that users can use to generate personalized birthday poems based on their names. Which technique would be most effective in safeguarding the application, given the potential for malicious user inputs?

Question 18

Which indicator should be considered to evaluate the safety of the LLM outputs when qualitatively assessing LLM responses for a translation use case?

Question 19

A Generative AI Engineer is developing a patient-facing healthcare-focused chatbot. If the patient’s question is not a medical emergency, the chatbot should solicit more information from the patient to pass to the doctor’s office and suggest a few relevant pre-approved medical articles for reading. If the patient’s question is urgent, direct the patient to calling their local emergency services. Given the following user input: “I have been experiencing severe headaches and dizziness for the past two days.” Which response is most appropriate for the chatbot to generate?

Question 20

(Choose two)

After changing the response generating LLM in a RAG pipeline from GPT-4 to a model with a shorter context length that the company self-hosts, the Generative AI Engineer is getting the following error:

{
"error_code": "BAD_REQUEST", "message": "Bad request: rpc error:
code = InvalidArgument desc = prompt token count (4595) cannot
exceed 4096."
}

What TWO solutions should the Generative AI Engineer implement without changing the response generating model? (Choose two.)