<p>The grader rejects retrieved documents. The system reformulates the query and tries again. The new retrieval also fails the grader. The system loops. Without a retry cap and loop detection, this runs until you hit your rate limit or your daily cost cap. I saw this cost a client $340 in a single afternoon because one ambiguous user query triggered a loop that ran 87 iterations.</p>
<p><strong>Fix:</strong> Hard-cap the retry count at 3. After 3 failed retrievals, either generate from whatever you have or return a graceful "I don't have sufficient information" response. Never let the graph run without a termination condition. In the example provided, <code>retry_count</code> is incremented each time the grader rejects the documents and the query is reformulated for another retrieval attempt.</p>
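<p>A minimal, framework-free sketch of that cap, plus a simple form of loop detection. Everything here is an assumption for illustration: <code>run_with_retry_cap</code>, <code>RetrievalState</code>, and the four callables (<code>retrieve</code>, <code>grade</code>, <code>reformulate</code>, <code>generate</code>) are hypothetical stand-ins for your own graph nodes, not any library's API.</p>

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set

MAX_RETRIES = 3  # hard cap from the fix above
FALLBACK = "I don't have sufficient information to answer that."


@dataclass
class RetrievalState:
    """Hypothetical state object; field names are illustrative."""
    query: str
    retry_count: int = 0
    seen_queries: Set[str] = field(default_factory=set)
    documents: List[str] = field(default_factory=list)


def run_with_retry_cap(
    question: str,
    retrieve: Callable[[str], List[str]],
    grade: Callable[[str, List[str]], bool],
    reformulate: Callable[[str], str],
    generate: Callable[[str, List[str]], str],
) -> str:
    state = RetrievalState(query=question)
    while True:
        # Remember every query we have tried (normalized) for loop detection.
        state.seen_queries.add(state.query.strip().lower())
        state.documents = retrieve(state.query)

        if grade(question, state.documents):
            return generate(question, state.documents)

        # The grader rejected the documents: count the failed attempt.
        state.retry_count += 1
        if state.retry_count >= MAX_RETRIES:
            return FALLBACK  # termination condition: cap reached

        new_query = reformulate(state.query)
        if new_query.strip().lower() in state.seen_queries:
            return FALLBACK  # reformulation repeated an earlier query
        state.query = new_query
```

<p>This version returns the fallback message when the cap is hit. Generating from whatever was retrieved last is the other option the fix mentions; that would mean returning <code>generate(question, state.documents)</code> at the cap instead.</p>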
<h3>2. The Overly Generous Grader</h3>
<p>This occurs when the grading logic accepts marginally relevant documents as sufficient context. This leads to downstream hallucinations because the generator tries to answer from insufficient or irrelevant information.</p>
<p><strong>Fix:</strong> Implement strict binary grading: "relevant" or "irrelevant". Ensure that only clearly pertinent documents are marked as "relevant".</p>
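<p>One way strict binary grading might look in code, as a sketch only. The names <code>parse_grade</code> and <code>filter_relevant</code> are hypothetical, and <code>grade_fn</code> stands in for whatever model call returns the grader's raw text. The assumption is that any verdict other than an exact "relevant" counts as a rejection.</p>

```python
from typing import Callable, List


def parse_grade(raw: str) -> bool:
    """Return True only for an unambiguous 'relevant' verdict.

    Hedged answers such as 'partially relevant' or 'maybe' are
    treated as 'irrelevant' rather than given the benefit of the doubt.
    """
    verdict = raw.strip().strip(".\"'").lower()
    return verdict == "relevant"


def filter_relevant(
    question: str,
    documents: List[str],
    grade_fn: Callable[[str, str], str],  # hypothetical: returns the grader's raw text
) -> List[str]:
    """Keep only the documents the grader marks as exactly 'relevant'."""
    return [doc for doc in documents if parse_grade(grade_fn(question, doc))]
```

<p>Defaulting to rejection means a malformed or wordy grader response sends the query back for another retrieval attempt instead of passing weak context to the generator.</p>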