- Skip CLI version check at worker init (saves ~300ms/request)
- Pre-warm bundled CLI binary at startup to warm OS page caches
- Parallelize E2B setup, system prompt fetch, and transcript download with asyncio.gather() (saves ~200-500ms)
- Enable Langfuse prompt caching with configurable TTL (default 300s)
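The gather-based parallelization described above can be sketched as follows; the three coroutines are hypothetical stand-ins for the SDK's real setup steps (`_setup_e2b`, the Langfuse prompt fetch, and `_fetch_transcript`), simulated here with short sleeps:

```python
import asyncio

# Hypothetical stand-ins for the three setup steps named in the PR summary.
async def setup_e2b():
    await asyncio.sleep(0.01)  # simulate sandbox provisioning
    return "sandbox-ready"

async def fetch_system_prompt():
    await asyncio.sleep(0.01)  # simulate the Langfuse fetch (prompt cache TTL applies here)
    return "base system prompt"

async def fetch_transcript():
    await asyncio.sleep(0.01)  # simulate the transcript download
    return ["prior message"]

async def parallel_setup():
    # gather runs all three concurrently, so total latency is roughly the
    # slowest single step instead of the sum of all three.
    return await asyncio.gather(
        setup_e2b(), fetch_system_prompt(), fetch_transcript()
    )

sandbox, prompt, transcript = asyncio.run(parallel_setup())
```

The saved 200-500ms quoted above is the difference between summed sequential latencies and the slowest step.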
Walkthrough

The PR adds a configurable Langfuse prompt cache TTL, implements CLI pre-warming during worker startup, and refactors the copilot SDK to run E2B setup, system-prompt fetch, and transcript download concurrently, plus adjusts message compression helper return types and transcript resume handling.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant SDK as SDK Service
    participant E2B as E2B Sandbox
    participant Langfuse as Langfuse/Prompt Store
    participant Transcript as Transcript Storage

    Client->>SDK: stream_chat_completion_sdk()

    rect rgba(100,150,200,0.5)
        note over SDK: Parallel setup (asyncio.gather)
        par E2B Setup
            SDK->>E2B: _setup_e2b()
            E2B-->>SDK: sandbox ready
        and Fetch System Prompt
            SDK->>Langfuse: fetch prompt (cache_ttl=config.langfuse_prompt_cache_ttl)
            Langfuse-->>SDK: base_system_prompt
        and Transcript Download
            SDK->>Transcript: _fetch_transcript()
            Transcript-->>SDK: transcript data
        end
    end

    rect rgba(150,150,100,0.5)
        note over SDK: Processing and composition
        SDK->>SDK: validate/process transcript
        SDK->>SDK: build query message (returns was_compacted)
        SDK->>SDK: _compress_messages (returns was_compacted)
        SDK->>SDK: compose system prompt with tool supplements
    end

    SDK-->>Client: stream response
```
Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
🔍 PR Overlap Detection

This check compares your PR against all other open PRs targeting the same branch to detect potential merge conflicts early.

🟡 Medium Risk — Some Line Overlap

These PRs have some overlapping changes:
🟢 Low Risk — File Overlap Only

These PRs touch the same files but different sections
Summary: 0 conflict(s), 1 medium risk, 6 low risk (out of 7 PRs with file overlap)

Auto-generated on push. Ignores:
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
autogpt_platform/frontend/src/app/(platform)/copilot/tools/CreateAgent/CreateAgent.tsx (1)
110-116: ⚠️ Potential issue | 🟠 Major

Clarification answers can be lost due to raw-vs-normalized keyword mismatch.
Line 116 builds the message with raw question keywords, while Line 201 renders inputs with normalized keywords. When normalization changes keys, answer lookup fails and empty values are sent.
💡 Proposed fix
```diff
 export function CreateAgentTool({ part }: Props) {
@@
   const output = getCreateAgentToolOutput(part);
+  const clarificationQuestions =
+    output && isClarificationNeededOutput(output)
+      ? normalizeClarifyingQuestions(output.questions ?? [])
+      : [];
@@
   function handleClarificationAnswers(answers: Record<string, string>) {
-    const questions =
-      output && isClarificationNeededOutput(output)
-        ? (output.questions ?? [])
-        : [];
-
-    onSend(buildClarificationAnswersMessage(answers, questions, "create"));
+    onSend(
+      buildClarificationAnswersMessage(
+        answers,
+        clarificationQuestions,
+        "create",
+      ),
+    );
   }
@@
   {output && isClarificationNeededOutput(output) && (
     <ClarificationQuestionsCard
-      questions={normalizeClarifyingQuestions(output.questions ?? [])}
+      questions={clarificationQuestions}
       message={output.message}
       onSubmitAnswers={handleClarificationAnswers}
     />
   )}
```

Also applies to: 199-202
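The raw-vs-normalized key mismatch in this finding is language-agnostic; a minimal Python sketch (with a hypothetical `normalize` helper mirroring what `normalizeClarifyingQuestions` is described as doing) shows how a lookup with raw keywords misses answers stored under normalized keys:

```python
def normalize(keyword):
    # Hypothetical normalizer: trim, lowercase, collapse inner whitespace.
    return " ".join(keyword.strip().lower().split())

questions = ["Agent Name "]  # raw keywords as sent by the backend
# The form stores answers under normalized keys, as the rendered inputs do.
answers = {normalize(q): "My Agent" for q in questions}

# Bug: a lookup keyed by the raw keyword misses the normalized entry.
missed = [q for q in questions if q not in answers]
# Fix: normalize at lookup time too, so both sides agree on the key.
found = {q: answers.get(normalize(q)) for q in questions}
```

The proposed fix above applies the same idea by normalizing once and reusing the result for both rendering and message building.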
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `autogpt_platform/frontend/src/app/(platform)/copilot/tools/CreateAgent/CreateAgent.tsx` around lines 110-116: the handler handleClarificationAnswers is passing raw question keywords to buildClarificationAnswersMessage while the form inputs use normalized keys, causing lookups to miss answers; fix by normalizing the question keys before building the message (or convert the answers map to raw keys expected by buildClarificationAnswersMessage) so the keys match. Locate handleClarificationAnswers, buildClarificationAnswersMessage and the normalization logic used when rendering inputs (the code around isClarificationNeededOutput and where inputs are created) and apply the same normalizeKey utility (or inverse mapping) to the questions or answers so submitted answers align with the message builder.

autogpt_platform/frontend/src/app/(platform)/copilot/tools/EditAgent/EditAgent.tsx (1)
99-105: ⚠️ Potential issue | 🟠 Major

Use the same normalized question keys for both UI input and message building.
Line 105 builds the message from raw `output.questions`, but answers are keyed from normalized questions used in Line 175. If backend keywords contain casing/whitespace/duplicates, submitted answers won't map correctly and can be sent as empty.

💡 Proposed fix
```diff
 export function EditAgentTool({ part }: Props) {
   const text = getAnimationText(part);
   const { onSend } = useCopilotChatActions();
@@
   const output = getEditAgentToolOutput(part);
+  const clarificationQuestions =
+    output && isClarificationNeededOutput(output)
+      ? normalizeClarifyingQuestions(output.questions ?? [])
+      : [];
@@
   function handleClarificationAnswers(answers: Record<string, string>) {
-    const questions =
-      output && isClarificationNeededOutput(output)
-        ? (output.questions ?? [])
-        : [];
-
-    onSend(buildClarificationAnswersMessage(answers, questions, "edit"));
+    onSend(
+      buildClarificationAnswersMessage(
+        answers,
+        clarificationQuestions,
+        "edit",
+      ),
+    );
   }
@@
   {output && isClarificationNeededOutput(output) && (
     <ClarificationQuestionsCard
-      questions={normalizeClarifyingQuestions(output.questions ?? [])}
+      questions={clarificationQuestions}
       message={output.message}
       onSubmitAnswers={handleClarificationAnswers}
     />
   )}
```

Also applies to: 173-176
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `autogpt_platform/frontend/src/app/(platform)/copilot/tools/EditAgent/EditAgent.tsx` around lines 99-105: handleClarificationAnswers is building the clarification message from raw output.questions while the UI keys answers use a normalized form, so answers can mis-map; before calling buildClarificationAnswersMessage, transform output.questions into the same normalized-key form used when rendering inputs (apply the exact normalization helper/logic used for input keys: trim, lowercase, collapse whitespace/duplicates) and produce an array of question objects keyed by that normalized key (deduplicate by normalized key, keeping the first occurrence), then pass that normalized questions array into buildClarificationAnswersMessage; do the same normalization update at the other call site that builds/sends clarification answers so both UI input keys and message-building use the identical normalization logic (reference: handleClarificationAnswers, buildClarificationAnswersMessage, output.questions and the input key normalization logic used when rendering the clarification inputs).

autogpt_platform/backend/backend/data/workspace.py (1)
326-342: ⚠️ Potential issue | 🟠 Major

Fix the docstring-implementation mismatch and add explicit user ID scoping for guideline compliance.
The docstring claims the function "only fetches the `sizeBytes` column to minimise data transfer," but the Prisma query lacks a `select` parameter, so it loads all columns from `UserWorkspaceFile`. Apply field projection (`select={"sizeBytes": True}`) to match the documented intent and reduce data transfer.

Additionally, this function violates the coding guideline for `autogpt_platform/backend/backend/data/**/*.py` requiring user ID checks. While the current architecture uses a 1:1 workspace-to-user relationship and all callers validate the user before calling this function, adding an explicit `user_id` parameter to the function signature ensures defense-in-depth and aligns with the stated guideline. Either add the parameter for validation, or document in a comment why it is not needed (e.g., "User scoping is enforced via workspace uniqueness constraint").

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `autogpt_platform/backend/backend/data/workspace.py` around lines 326-342: the docstring and implementation disagree and the function lacks explicit user scoping: update get_workspace_total_size to accept a user_id parameter (e.g., def get_workspace_total_size(workspace_id: str, user_id: str) -> int) and apply it in the Prisma query filter to scope results to that user (use UserWorkspaceFile.prisma().find_many(where={"workspaceId": workspace_id, "userId": user_id, "isDeleted": False})). Also add the field projection select={"sizeBytes": True} to the same find_many call to only fetch sizeBytes and update the docstring to reflect the new parameter and the projection. Ensure references to UserWorkspaceFile.prisma().find_many and get_workspace_total_size are used to locate changes.

autogpt_platform/backend/backend/copilot/tools/__init__.py (1)
114-121: ⚠️ Potential issue | 🟠 Major

Normalize anonymous IDs to `None` before calling tool execution.

This path forwards `user_id` verbatim into tools. Add a guard so analytics-only `anon_*` IDs are never treated as real tool user IDs.

🔧 Suggested fix
```diff
     track_tool_called(
         user_id=user_id,
         session_id=session.session_id,
         tool_name=tool_name,
         tool_call_id=tool_call_id,
     )
-
-    return await tool.execute(user_id, session, tool_call_id, **parameters)
+    effective_user_id = (
+        user_id if user_id and not user_id.startswith("anon_") else None
+    )
+    return await tool.execute(effective_user_id, session, tool_call_id, **parameters)
```

Based on learnings: In `autogpt_platform/backend/backend/copilot/tools/`, anonymous users must pass `user_id=None` to tool execution; `anon_*` is analytics-only and must not be forwarded as a real user ID.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `autogpt_platform/backend/backend/copilot/tools/__init__.py` around lines 114-121: the code currently forwards user_id directly into tool.execute; normalize analytics-only anonymous IDs by converting any user_id that starts with "anon_" (or matches the anon pattern used in your system) to None before calling tool.execute and before passing into track_tool_called if appropriate; update the logic around track_tool_called and the return await tool.execute(user_id, session, tool_call_id, **parameters) call so that the variable passed to tool.execute is None for anon_* IDs (leave stored analytics ids unchanged where needed, but ensure tool.execute receives None).

autogpt_platform/backend/backend/copilot/sdk/transcript.py (1)
380-391: ⚠️ Potential issue | 🟠 Major

Prevent stale/skipped uploads from regressing transcript metadata.

When `content_skipped` is `True`, metadata is still overwritten with the current `message_count`. In stale-upload races, this can decrease the stored watermark while transcript content remains newer, which can break downstream gap-fill behavior.

💡 Suggested fix
```diff
-    # Always update metadata (even when content is skipped) so message_count
-    # stays current. The gap-fill logic in _build_query_message relies on
-    # message_count to avoid re-compressing the same messages every turn.
-    try:
-        meta = {"message_count": message_count, "uploaded_at": time.time()}
-        mwid, mfid, mfname = _meta_storage_path_parts(user_id, session_id)
-        await storage.store(
-            workspace_id=mwid,
-            file_id=mfid,
-            filename=mfname,
-            content=json.dumps(meta).encode("utf-8"),
-        )
-    except Exception as e:
-        logger.warning(f"[Transcript] Failed to write metadata for {session_id}: {e}")
+    # Avoid regressing watermark on stale/skipped uploads.
+    # (Alternative: load existing meta and store max(existing, message_count)).
+    if not content_skipped:
+        try:
+            meta = {"message_count": message_count, "uploaded_at": time.time()}
+            mwid, mfid, mfname = _meta_storage_path_parts(user_id, session_id)
+            await storage.store(
+                workspace_id=mwid,
+                file_id=mfid,
+                filename=mfname,
+                content=json.dumps(meta).encode("utf-8"),
+            )
+        except Exception as e:
+            logger.warning(f"[Transcript] Failed to write metadata for {session_id}: {e}")
```

autogpt_platform/backend/backend/copilot/sdk/service.py (1)
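The alternative mentioned in the suggested fix (storing the max of the existing and incoming counts) can be sketched as a pure helper; `next_metadata` is a hypothetical name, and the dict shape follows the `meta` payload in the finding:

```python
import time

def next_metadata(existing, message_count):
    # Monotonic watermark: a stale upload can never lower message_count,
    # so gap-fill logic downstream never sees the count go backwards.
    prior = (existing or {}).get("message_count", 0)
    return {
        "message_count": max(prior, message_count),
        "uploaded_at": time.time(),
    }
```

This variant keeps metadata fresh on every turn (unlike skipping the write entirely) while still being safe against stale-upload races.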
1162-1202: ⚠️ Potential issue | 🟠 Major

Avoid double transcript uploads per successful turn.

The upload block here runs once, and the `finally` block runs it again (see Line 1287+). This duplicates storage traffic and adds latency.

💡 Proposed fix
```diff
     captured_transcript = CapturedTranscript()
     sdk_cwd = ""
+    transcript_uploaded = False
 ...
     if raw_transcript:
-        await asyncio.shield(
-            _try_upload_transcript(
-                user_id,
-                session_id,
-                raw_transcript,
-                message_count=len(session.messages),
-            )
-        )
+        transcript_uploaded = await asyncio.shield(
+            _try_upload_transcript(
+                user_id,
+                session_id,
+                raw_transcript,
+                message_count=len(session.messages),
+            )
+        )
 ...
-    if config.claude_agent_use_resume and user_id:
+    if config.claude_agent_use_resume and user_id and not transcript_uploaded:
```
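The flag-guard idea in this fix can be reduced to a small runnable sketch; `try_upload` and `finish_turn` are hypothetical stand-ins for `_try_upload_transcript` and the surrounding turn-completion code:

```python
import asyncio

# Hypothetical uploader standing in for _try_upload_transcript.
async def try_upload(calls):
    calls.append("upload")
    return True  # report success so the caller can skip the fallback path

async def finish_turn(raw_transcript, use_resume):
    calls = []
    uploaded = False
    if raw_transcript:
        # Shielded so a client disconnect cannot cancel the upload mid-flight.
        uploaded = await asyncio.shield(try_upload(calls))
    # The finally-block fallback only fires when the primary upload didn't run.
    if use_resume and not uploaded:
        await try_upload(calls)
    return calls

once = asyncio.run(finish_turn("transcript data", use_resume=True))
```

With the guard in place, a successful turn produces exactly one upload, while the fallback still covers turns where the primary path never ran.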
Verify each finding against the current code and only fix it if needed. In `@autogpt_platform/backend/backend/copilot/sdk/service.py` around lines 1162 - 1202, The transcript upload is being executed twice (once in the main block and again in the finally block), causing duplicate uploads; modify the logic around _try_upload_transcript (the asyncio.shield call using user_id, session_id, raw_transcript, message_count=len(session.messages)) so the upload only runs once per turn — e.g., introduce and set a local flag (e.g., transcript_uploaded) after a successful await asyncio.shield(_try_upload_transcript(...)) and check that flag in the finally block (or remove the redundant upload there) before invoking _try_upload_transcript again; ensure the same conditions (raw_transcript truthiness) are used so upload happens only when appropriate.
🟠 Major comments (22)
autogpt_platform/frontend/src/app/(platform)/copilot/components/ScaleLoader/ScaleLoader.module.css-20-20 (1)
20-20: ⚠️ Potential issue | 🟠 Major

Move this loader styling to Tailwind/design-token based styling.
Line 20 adds new styling in a CSS module, which conflicts with the frontend styling standard. Please migrate this animation behavior to Tailwind utilities (or Tailwind config keyframes/animation tokens) instead of extending module CSS.
As per coding guidelines: `autogpt_platform/frontend/**/*.{js,jsx,ts,tsx,css}` must "Use Tailwind CSS only for styling, use design tokens, and use Phosphor Icons only".
Verify each finding against the current code and only fix it if needed. In `autogpt_platform/frontend/src/app/(platform)/copilot/components/ScaleLoader/ScaleLoader.module.css` at line 20: remove the animation-fill-mode declaration from ScaleLoader.module.css and instead declare a Tailwind animation token that includes the fill-mode in tailwind.config (e.g. theme.extend.keyframes.{scale: {...}} and theme.extend.animation.{ 'scale-loader': 'scale 400ms ease 0s 1 normal backwards' } or your chosen name/duration), then replace the module CSS usage with the Tailwind class (e.g. animate-scale-loader) in the ScaleLoader component and remove the CSS-module animation rule; reference ScaleLoader.module.css, the ScaleLoader component class usage, and tailwind.config to locate the changes.

autogpt_platform/backend/backend/copilot/sdk/e2b_file_tools_test.py-13-13 (1)
13-13: ⚠️ Potential issue | 🟠 Major

Make these filesystem tests hermetic and collision-safe.
Line 13 + fixed `encoded` names write into a shared home path, and cleanup is partial (`os.unlink`/`os.rmdir`), so reruns or parallel workers can collide and cause flaky failures. Use unique per-test directories and recursive cleanup.

Suggested refactor pattern
```diff
 import os
+import shutil
+import uuid

 import pytest
@@
 class TestReadLocal:
-    def _make_tool_results_file(self, encoded: str, filename: str, content: str) -> str:
+    def _make_tool_results_file(
+        self, filename: str, content: str, encoded: str | None = None
+    ) -> tuple[str, str, str]:
         """Create a tool-results file and return its path."""
+        encoded = encoded or f"copilot-e2b-test-{uuid.uuid4().hex}"
+        project_dir = os.path.join(_SDK_PROJECTS_DIR, encoded)
         tool_results_dir = os.path.join(_SDK_PROJECTS_DIR, encoded, "tool-results")
         os.makedirs(tool_results_dir, exist_ok=True)
         filepath = os.path.join(tool_results_dir, filename)
         with open(filepath, "w") as f:
             f.write(content)
-        return filepath
+        return encoded, filepath, project_dir
@@
-        encoded = "-tmp-copilot-e2b-test-read"
-        filepath = self._make_tool_results_file(
-            encoded, "result.txt", "line 1\nline 2\nline 3\n"
-        )
+        encoded, filepath, project_dir = self._make_tool_results_file(
+            "result.txt", "line 1\nline 2\nline 3\n"
+        )
@@
         finally:
             _current_project_dir.reset(token)
-            os.unlink(filepath)
+            shutil.rmtree(project_dir, ignore_errors=True)
```

Also applies to: 75-83, 84-99, 106-120, 132-149
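The hermetic-test pattern above boils down to two ingredients: a `uuid`-unique directory per test and recursive teardown. A self-contained sketch (helper and variable names are illustrative, not the test file's actual API):

```python
import os
import shutil
import tempfile
import uuid

def make_tool_results_file(base_dir, filename, content):
    # Unique per-test project dir: reruns and parallel workers cannot collide.
    project_dir = os.path.join(base_dir, f"copilot-e2b-test-{uuid.uuid4().hex}")
    results_dir = os.path.join(project_dir, "tool-results")
    os.makedirs(results_dir, exist_ok=True)
    path = os.path.join(results_dir, filename)
    with open(path, "w") as f:
        f.write(content)
    return path, project_dir

base = tempfile.mkdtemp()
path, project_dir = make_tool_results_file(base, "result.txt", "line 1\n")
created = os.path.exists(path)
# Recursive cleanup replaces the partial os.unlink/os.rmdir teardown.
shutil.rmtree(base, ignore_errors=True)
cleaned = not os.path.exists(base)
```

In pytest, the `tmp_path` fixture gives the same isolation with automatic cleanup.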
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `autogpt_platform/backend/backend/copilot/sdk/e2b_file_tools_test.py` at line 13: replace the shared home-based constant _SDK_PROJECTS_DIR and the tests that write files with unique, hermetic per-test directories (use tempfile.TemporaryDirectory or pytest tmp_path fixture) so encoded names are written under a test-scoped directory instead of "~/.claude/projects"; update each test that touches files (the blocks around lines 75-83, 84-99, 106-120, 132-149) to create and use that unique directory, and perform teardown with shutil.rmtree (or rely on TemporaryDirectory context manager) for recursive cleanup to avoid collisions and flakiness.

autogpt_platform/frontend/src/app/(platform)/copilot/helpers.ts-31-50 (1)
31-50: ⚠️ Potential issue | 🟠 Major

Dedup can drop newer assistant updates and collapse distinct tool states.
Current logic keeps the first message per `id` and fingerprints assistant messages using only `text`/`toolCallId`. In resumed/streamed flows, that can discard newer payloads (same `id`, richer `parts`) and treat state/output transitions as duplicates.

💡 Suggested fix
```diff
 export function deduplicateMessages(messages: UIMessage[]): UIMessage[] {
-  const seenIds = new Set<string>();
+  const lastIndexById = new Map<string, number>();
+  messages.forEach((msg, index) => lastIndexById.set(msg.id, index));
   let lastAssistantFingerprint = "";

-  return messages.filter((msg) => {
-    if (seenIds.has(msg.id)) return false;
-    seenIds.add(msg.id);
+  return messages.filter((msg, index) => {
+    // Keep the latest copy when the same ID appears multiple times.
+    if (lastIndexById.get(msg.id) !== index) return false;

     if (msg.role === "assistant") {
-      const fingerprint = msg.parts
-        .map(
-          (p) =>
-            ("text" in p && p.text) ||
-            ("toolCallId" in p && p.toolCallId) ||
-            "",
-        )
-        .join("|");
+      // Include stateful/tool-result fields so updates are not collapsed.
+      const fingerprint = JSON.stringify(
+        msg.parts.map((p) => ({
+          state: "state" in p ? p.state : undefined,
+          text: "text" in p ? p.text : undefined,
+          toolCallId: "toolCallId" in p ? p.toolCallId : undefined,
+          output: "output" in p ? p.output : undefined,
+          errorText: "errorText" in p ? p.errorText : undefined,
+        })),
+      );
       if (fingerprint && fingerprint === lastAssistantFingerprint) return false;
       if (fingerprint) lastAssistantFingerprint = fingerprint;
     } else {
       lastAssistantFingerprint = "";
     }
     return true;
   });
 }
```
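The keep-the-latest-per-id strategy at the core of this fix is easy to verify in isolation; a minimal Python sketch over plain dicts (illustrative stand-ins for `UIMessage` objects):

```python
def deduplicate_keep_latest(messages):
    # Record the last index at which each id appears, then keep only those
    # occurrences, preserving the original order of the survivors. This way
    # a resumed stream's richer payload wins over the stale first copy.
    last_index = {m["id"]: i for i, m in enumerate(messages)}
    return [m for i, m in enumerate(messages) if last_index[m["id"]] == i]

msgs = [
    {"id": "a", "parts": ["partial"]},
    {"id": "b", "parts": ["hi"]},
    {"id": "a", "parts": ["partial", "tool-result"]},  # newer copy of "a"
]
deduped = deduplicate_keep_latest(msgs)
```

Note the keep-latest rule reorders survivors relative to keep-first (the surviving "a" now follows "b"), which matches the chronology of the stream.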
Verify each finding against the current code and only fix it if needed. In `autogpt_platform/frontend/src/app/(platform)/copilot/helpers.ts` around lines 31-50: the current dedupe in messages.filter uses seenIds and keeps the first msg per msg.id and builds fingerprints from only text/toolCallId, which can drop newer assistant updates or collapse distinct tool states; change the dedupe to prefer the latest message per id (e.g., iterate messages in reverse or build a map keyed by msg.id and take the last entry) so newer/streamed payloads are preserved, and strengthen the assistant fingerprint logic in the assistant branch (in the block computing fingerprint from msg.parts) to include the full serialized part objects or any tool/state fields (not just text/toolCallId) so different tool states aren't treated as identical; update lastAssistantFingerprint usage to operate on the chronological sequence after deduping (or while iterating reversed) so it compares adjacent assistant messages correctly.

autogpt_platform/frontend/src/app/(platform)/copilot/helpers/convertChatSessionToUiMessages.ts-53-82 (1)
53-82: ⚠️ Potential issue | 🟠 Major

Prevent silent message loss when attached-file parsing fails.
`extractFileParts` strips the attached block before confirming any files were parsed. If line parsing misses (e.g., unit or format drift), both `cleanText` and `fileParts` can end up empty, and the message is dropped at Line 182. Keep original content when no file entries are reconstructed, and loosen the file-line pattern.

Proposed fix
```diff
-const FILE_LINE_RE = /^- (.+) \(([^,]+),\s*[\d.]+ KB\), file_id=([0-9a-f-]+)$/;
+const FILE_LINE_RE =
+  /^- (.+) \(([^,]+),\s*[\d.]+\s*(?:B|KB|MB|GB)\), file_id=([0-9a-fA-F-]+)$/;

 function extractFileParts(content: string): {
   cleanText: string;
   fileParts: FileUIPart[];
 } {
   const match = content.match(ATTACHED_FILES_RE);
   if (!match) return { cleanText: content, fileParts: [] };

   const cleanText = content.replace(match[0], "").trim();
   const lines = match[1].trim().split("\n");
   const fileParts: FileUIPart[] = [];
@@
-  return { cleanText, fileParts };
+  if (fileParts.length === 0) {
+    return { cleanText: content, fileParts: [] };
+  }

+  return { cleanText, fileParts };
 }
```
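The loosened pattern and the empty-result fallback can be checked independently of the TypeScript code; a Python sketch using the same regex shape (the exact line format is an assumption taken from the diff above):

```python
import re

# Loosened pattern: name (mime, size with a B/KB/MB/GB unit), file_id=<uuid-ish>.
FILE_LINE_RE = re.compile(
    r"^- (.+) \(([^,]+),\s*[\d.]+\s*(?:B|KB|MB|GB)\), file_id=([0-9a-fA-F-]+)$"
)

lines = [
    "- report.pdf (application/pdf, 12.5 KB), file_id=3f2a1b9c-1111-2222-3333-0123456789ab",
    "- photo.png (image/png, 2 MB), file_id=ABCDEF12-3456-7890-abcd-ef0123456789",
]
parsed = [m.groups() for line in lines if (m := FILE_LINE_RE.match(line))]

def extract(content, parts):
    # Fallback rule from the fix: if nothing parsed, keep the original text
    # untouched instead of returning an empty (droppable) message.
    return content if not parts else "cleaned"
```

The original `KB`-only, lowercase-hex pattern would reject both sample lines above.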
Verify each finding against the current code and only fix it if needed. In `autogpt_platform/frontend/src/app/(platform)/copilot/helpers/convertChatSessionToUiMessages.ts` around lines 53-82: extractFileParts currently removes the attached-files block up front and uses a strict FILE_LINE_RE so if no lines parse you lose message text; change the logic in extractFileParts to first attempt to parse lines without mutating content, loosen FILE_LINE_RE to accept broader file_id and size formats (e.g., optional KB/bytes and more UUID/hex variants), and only strip the ATTACHED_FILES_RE block from cleanText when you successfully reconstructed at least one FileUIPart; use getGetWorkspaceDownloadFileByIdUrl and push reconstructed parts into fileParts as before, but if fileParts is empty return the original content as cleanText to avoid silent message loss.

autogpt_platform/backend/backend/copilot/executor/processor.py-150-156 (1)
150-156: ⚠️ Potential issue | 🟠 Major

`coro.close()` is unsafe after scheduling; cancel the future instead.

At Line 155, `coro.close()` runs for all exceptions, including when `run_coroutine_threadsafe` already scheduled the coroutine. In timeout/error paths this can raise and skip cleanup completion.

Suggested fix
```diff
-        coro = shutdown_workspace_storage()
-        try:
-            future = asyncio.run_coroutine_threadsafe(coro, self.execution_loop)
-            future.result(timeout=5)
-        except Exception as e:
-            coro.close()  # Prevent "coroutine was never awaited" warning
+        coro = shutdown_workspace_storage()
+        future = None
+        try:
+            future = asyncio.run_coroutine_threadsafe(coro, self.execution_loop)
+            future.result(timeout=5)
+        except Exception as e:
+            if future is None:
+                # Submission failed; coroutine was never scheduled.
+                coro.close()
+            elif not future.done():
+                future.cancel()
             error_msg = str(e) or type(e).__name__
             logger.warning(
                 f"[CoPilotExecutor] Worker {self.tid} cleanup error: {error_msg}"
             )
```
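The close-only-if-unscheduled pattern from this fix can be exercised end to end with a loop running in a background thread; `submit_with_timeout` is a hypothetical helper, not the processor's actual API:

```python
import asyncio
import threading

def submit_with_timeout(coro, loop, timeout):
    # Close the coroutine only when submission itself failed; once it has
    # been scheduled on the loop, cancel the future instead of closing.
    future = None
    try:
        future = asyncio.run_coroutine_threadsafe(coro, loop)
        return future.result(timeout=timeout)
    except Exception:
        if future is None:
            coro.close()      # never scheduled: closing is safe
        elif not future.done():
            future.cancel()   # scheduled: closing here could raise
        return None

loop = asyncio.new_event_loop()
threading.Thread(target=loop.run_forever, daemon=True).start()

fast = submit_with_timeout(asyncio.sleep(0, "done"), loop, timeout=5)
slow = submit_with_timeout(asyncio.sleep(30), loop, timeout=0.05)
loop.call_soon_threadsafe(loop.stop)
```

The timeout path takes the `future.cancel()` branch: the coroutine is cancelled on the loop rather than closed from the submitting thread.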
Verify each finding against the current code and only fix it if needed. In `autogpt_platform/backend/backend/copilot/executor/processor.py` around lines 150-156: the code calls coro.close() unconditionally which is unsafe if the coroutine was already scheduled via asyncio.run_coroutine_threadsafe; change the logic in the shutdown sequence around shutdown_workspace_storage(): call asyncio.run_coroutine_threadsafe(coro, self.execution_loop) and store the returned future, then on failures from future.result(...) cancel the scheduled work using future.cancel() (and optionally await/handle cancellation) instead of calling coro.close(); only call coro.close() if run_coroutine_threadsafe itself raised (meaning the coro was never scheduled). Ensure references to shutdown_workspace_storage, self.execution_loop, future, coro.close(), and future.cancel() are used so the fix is applied in the correct block.

autogpt_platform/frontend/src/app/(platform)/copilot/useCopilotStream.ts-149-165 (1)
149-165: ⚠️ Potential issue | 🟠 Major

Guard `refetchSession()` in `onFinish` to prevent dropped reconnect flow.

If `refetchSession()` rejects, `onFinish` exits early via unhandled rejection and the reconnect decision is skipped.

Suggested fix
```diff
-      const result = await refetchSession();
-      const d = result.data;
+      let d: unknown;
+      try {
+        const result = await refetchSession();
+        d = result.data;
+      } catch {
+        handleReconnect(sessionId);
+        return;
+      }
```
Verify each finding against the current code and only fix it if needed. In `autogpt_platform/frontend/src/app/(platform)/copilot/useCopilotStream.ts` around lines 149-165: the onFinish handler currently calls refetchSession() without guarding against rejections, so a thrown/rejected promise can abort onFinish and skip the backend reconnect logic; wrap the refetchSession call in a try/catch (or await with .catch) inside onFinish, ensure result/data checks only run when the call succeeds, and still call handleReconnect(sessionId) when backendActive is true; reference the refetchSession(), onFinish, backendActive variable/property checks, and handleReconnect(sessionId) when making the change.

.pre-commit-config.yaml-38-136 (1)
38-136: ⚠️ Potential issue | 🟠 Major

Install guards only watch lockfiles, allowing manifest/lock drift to bypass pre-commit validation.
At lines 44, 61, 78, 95, 113, and 130, the guards only match lockfiles (`poetry.lock`, `pnpm-lock.yaml`). A developer can modify `pyproject.toml` or `package.json` without updating the corresponding lockfile, and the hooks will skip execution. While poetry/pnpm validate consistency when run locally, this pre-commit gap contradicts the codebase's own comment at line 32 ("It's also a good idea to check that poetry.lock is consistent with pyproject.toml").

🔧 Proposed fix
```diff
-fi | grep -qE "^autogpt_platform/(backend|autogpt_libs)/poetry\.lock$" || exit 0;
+fi | grep -qE "^autogpt_platform/(backend|autogpt_libs)/(poetry\.lock|pyproject\.toml)$" || exit 0;

-fi | grep -qE "^autogpt_platform/autogpt_libs/poetry\.lock$" || exit 0;
+fi | grep -qE "^autogpt_platform/autogpt_libs/(poetry\.lock|pyproject\.toml)$" || exit 0;

-fi | grep -qE "^autogpt_platform/frontend/pnpm-lock\.yaml$" || exit 0;
+fi | grep -qE "^autogpt_platform/frontend/(pnpm-lock\.yaml|package\.json)$" || exit 0;

-fi | grep -qE "^classic/(original_autogpt|forge)/poetry\.lock$" || exit 0;
+fi | grep -qE "^classic/(original_autogpt|forge)/(poetry\.lock|pyproject\.toml)$" || exit 0;

-fi | grep -qE "^classic/forge/poetry\.lock$" || exit 0;
+fi | grep -qE "^classic/forge/(poetry\.lock|pyproject\.toml)$" || exit 0;

-fi | grep -qE "^classic/benchmark/poetry\.lock$" || exit 0;
+fi | grep -qE "^classic/benchmark/(poetry\.lock|pyproject\.toml)$" || exit 0;
```
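The broadened guard pattern can be sanity-checked outside pre-commit; a small Python sketch using the same regex (the file list is illustrative) confirms a manifest-only change now triggers the hook:

```python
import re

# Broadened guard from the proposed fix: match the manifest as well as the lockfile.
GUARD = re.compile(
    r"^autogpt_platform/(backend|autogpt_libs)/(poetry\.lock|pyproject\.toml)$"
)

changed_files = [
    "autogpt_platform/backend/pyproject.toml",  # manifest-only edit now matches
    "autogpt_platform/backend/poetry.lock",
    "autogpt_platform/frontend/package.json",   # different project: covered by its own guard
]
triggers = [p for p in changed_files if GUARD.match(p)]
```

Under the original lockfile-only pattern, the `pyproject.toml` entry would not trigger and drift would slip through.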
Verify each finding against the current code and only fix it if needed. In `.pre-commit-config.yaml` around lines 38-136: the install-guard grep only watches lockfiles so changing manifests can bypass hooks; for each install hook (e.g., alias poetry-install-platform-libs, pnpm-install-platform-frontend, poetry-install-classic-autogpt, poetry-install-classic-forge, poetry-install-classic-benchmark and the other poetry-install entry) update the entry grep -qE pattern to include both the lockfile and its manifest (e.g., for poetry use "pyproject\.toml|poetry\.lock", for pnpm use "package\.json|pnpm-lock\.yaml") so the hook runs when either file changes, keeping the rest of the bash logic intact.

autogpt_platform/backend/backend/util/request.py-222-227 (1)
222-227: ⚠️ Potential issue | 🟠 Major

`parse_url` scheme detection is too broad and fails on URLs with `://` in query parameters.

The current substring check at line 226 (`if "://" not in url`) rejects valid URLs like `example.com?next=https://foo.com`, since `://` appears in the query string. This causes `urlparse()` to fail, leaving scheme and netloc empty, which then fails the downstream `ALLOWED_SCHEMES` validation. Use parsed scheme and netloc instead of substring detection.

💡 Proposed fix
```diff
 def parse_url(url: str) -> URL:
     """Canonicalizes and parses a URL string."""
     url = url.strip("/ ").replace("\\", "/")
-    # Ensure scheme is present for proper parsing.
-    # Avoid regex to sidestep CodeQL py/polynomial-redos on user-controlled
-    # input. We only need to detect "scheme://"; urlparse() and the
-    # ALLOWED_SCHEMES check downstream handle full validation.
-    if "://" not in url:
+    # Ensure scheme is present for proper parsing without regex.
+    parsed = urlparse(url)
+    if not parsed.scheme or not parsed.netloc:
         url = f"http://{url}"
     return urlparse(url)
```
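A standalone sketch of the parsed-scheme approach shows why it handles the `://`-in-query case that the substring check rejects; `urlparse` only treats a prefix as a scheme when it is a valid scheme token, so the colon inside the query string is ignored:

```python
from urllib.parse import urlparse

def parse_url(url):
    # Decide whether a scheme is missing from urlparse results rather than
    # from a "://" substring check, which misfires on query-string URLs.
    url = url.strip("/ ").replace("\\", "/")
    parsed = urlparse(url)
    if not parsed.scheme or not parsed.netloc:
        parsed = urlparse(f"http://{url}")
    return parsed

p = parse_url("example.com?next=https://foo.com")
```

Here the first `urlparse` yields an empty scheme (the prefix `example.com?next=https` contains characters invalid in a scheme), so the default `http://` is correctly prepended.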
Verify each finding against the current code and only fix it if needed. In `autogpt_platform/backend/backend/util/request.py` around lines 222-227: the current simple substring check in parse_url incorrectly treats any occurrence of "://" (even in query params) as a valid scheme; replace that logic by using urllib.parse.urlparse to inspect parsed.scheme and parsed.netloc: if parsed.scheme and parsed.netloc are present, use the parsed result; otherwise prepend "http://" to the original input, re-run urlparse, and use that parsed result for downstream ALLOWED_SCHEMES checks and further processing so scheme/netloc are derived from parsing not substring matching.

autogpt_platform/backend/backend/copilot/tools/bash_exec.py-87-89 (1)
87-89: ⚠️ Potential issue | 🟠 Major

Validate and clamp `timeout` before use.

Current parsing can throw on malformed input, and values are not bounded to the documented max (120s).
🔧 Suggested fix
```diff
-        timeout: int = int(kwargs.get("timeout", 30))
+        raw_timeout = kwargs.get("timeout", 30)
+        try:
+            timeout = int(raw_timeout)
+        except (TypeError, ValueError):
+            timeout = 30
+        timeout = max(1, min(timeout, 120))
```
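Extracted as a helper, the parse-then-clamp logic is easy to test in isolation (`sanitize_timeout` is a hypothetical name; the 30s default and 120s cap come from the finding):

```python
def sanitize_timeout(raw, default=30, max_timeout=120):
    # Tolerate malformed input (None, "abc") and clamp to the documented
    # [1, 120] second range before the value reaches the sandbox call.
    try:
        timeout = int(raw)
    except (TypeError, ValueError):
        timeout = default
    return max(1, min(timeout, max_timeout))
```

Catching `TypeError` as well as `ValueError` matters: `int(None)` raises `TypeError`, which the bare `int(...)` call in the original code would propagate to the caller.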
Verify each finding against the current code and only fix it if needed. In `autogpt_platform/backend/backend/copilot/tools/bash_exec.py` around lines 87-89: validate and clamp the timeout extracted from kwargs: when reading timeout (currently assigned to variable timeout) parse it inside a try/except to avoid ValueError on malformed input, fall back to the default 30 on parse failure, coerce non-positive values to the default, and clamp the resulting integer to a maximum of 120 before use; keep the command assignment as-is but ensure any downstream use of timeout relies on this sanitized value so malformed or out-of-range inputs cannot raise or bypass the documented 120s cap.

autogpt_platform/frontend/src/app/api/workspace/files/upload/route.ts-23-27 (1)
23-27: ⚠️ Potential issue | 🟠 Major

Add a timeout to the backend upload proxy call.
This outbound call can block indefinitely under backend slowness and hold request workers. Add a timeout signal and return a 504 on timeout.
🔧 Suggested fix
```diff
     const response = await fetch(uploadUrl.toString(), {
       method: "POST",
       headers,
       body: formData,
+      signal: AbortSignal.timeout(30_000),
     });

     if (!response.ok) {
       const errorText = await response.text();
       return new NextResponse(errorText, {
         status: response.status,
       });
     }

     const data = await response.json();
     return NextResponse.json(data);
   } catch (error) {
+    if (error instanceof Error && error.name === "TimeoutError") {
+      return NextResponse.json(
+        { error: "Upload request timed out" },
+        { status: 504 },
+      );
+    }
     console.error("File upload proxy error:", error);
     return NextResponse.json(
       {
         error: "Failed to upload file",
         detail: error instanceof Error ? error.message : String(error),
       },
       { status: 500 },
     );
```
Verify each finding against the current code and only fix it if needed. In `@autogpt_platform/frontend/src/app/api/workspace/files/upload/route.ts` around lines 23 - 27, The outbound fetch to uploadUrl currently can hang indefinitely; wrap the fetch in an AbortController with a timeout (e.g., configurable ms) and pass controller.signal into fetch(options) (the call that uses uploadUrl, headers, body: formData). Start a timer that calls controller.abort() after the timeout and clear the timer on success; catch the abort/timeout case and return an HTTP 504 response (instead of letting the request hang). Ensure other errors still propagate/return appropriate responses and that the timer is cleared in all code paths.
autogpt_platform/backend/backend/copilot/tools/e2b_sandbox.py-144-146 (1)
144-146: ⚠️ Potential issue | 🟠 Major
Don’t delete the Redis sandbox mapping before kill succeeds.
If connect/kill fails after deleting the key, the sandbox may keep running but the ID is lost, so retry cleanup is impossible.
Proposed fix
```diff
     sandbox_id = raw if isinstance(raw, str) else raw.decode()
-    await redis.delete(redis_key)
     if sandbox_id == _CREATING:
+        await redis.delete(redis_key)
         return False

     try:
@@
         await asyncio.wait_for(_connect_and_kill(), timeout=10)
+        await redis.delete(redis_key)
         logger.info(
             "[E2B] Killed sandbox %.12s for session %.12s",
             sandbox_id,
```
Also applies to: 150-170
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autogpt_platform/backend/backend/copilot/tools/e2b_sandbox.py` around lines 144 - 146, The Redis key for the sandbox mapping is being deleted before the sandbox process is confirmed killed (see variables/methods sandbox_id, raw, redis_key and the call await redis.delete); move the deletion so it only happens after a successful connect/kill sequence: first decode raw into sandbox_id, attempt to connect and kill the sandbox (handle and surface any errors), and only upon successful kill call await redis.delete(redis_key); if kill fails, leave the key intact so retry/cleanup can later recover the sandbox ID.
autogpt_platform/backend/backend/copilot/tools/e2b_sandbox.py-31-31 (1)
31-31: ⚠️ Potential issue | 🟠 Major
Lock wait duration is shorter than lock TTL, causing avoidable failures under contention.
Workers only wait ~10s, but the creation lock can remain valid for 60s. Under load, a requester can raise `RuntimeError` even though another worker is still creating the sandbox.
Proposed fix
```diff
 _CREATION_LOCK_TTL = 60
-_MAX_WAIT_ATTEMPTS = 20  # 20 * 0.5s = 10s max wait
+_POLL_INTERVAL_SECONDS = 0.5
+_MAX_WAIT_ATTEMPTS = int(_CREATION_LOCK_TTL / _POLL_INTERVAL_SECONDS)
@@
-    for _ in range(_MAX_WAIT_ATTEMPTS):
-        await asyncio.sleep(0.5)
+    for _ in range(_MAX_WAIT_ATTEMPTS):
+        await asyncio.sleep(_POLL_INTERVAL_SECONDS)
```
Also applies to: 85-113
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autogpt_platform/backend/backend/copilot/tools/e2b_sandbox.py` at line 31, The current _MAX_WAIT_ATTEMPTS (20) makes the lock wait (~10s) shorter than the lock TTL (~60s), causing RuntimeError under contention; update the waiting logic so the total wait time is at least the creation lock TTL (e.g., compute attempts = ceil(lock_ttl / poll_interval) or set _MAX_WAIT_ATTEMPTS to match TTL/poll_interval) and apply the same change to the related wait loop used by acquire_creation_lock/create_sandbox (the loop referenced around lines 85-113) so callers wait long enough before raising an error.
autogpt_platform/frontend/src/app/(platform)/copilot/tools/RunMCPTool/helpers.tsx-46-54 (1)
46-54: ⚠️ Potential issue | 🟠 Major
Tighten the setup-requirements guard to prevent a runtime crash.
Right now the guard can return `true` for `type === setup_requirements` even when `setup_info` is missing/malformed, which can break at Line 188 when reading `output.setup_info.agent_name`.
Proposed fix
```diff
 export function isSetupRequirementsOutput(
   output: RunMCPToolOutput,
 ): output is SetupRequirementsResponse {
-  return (
-    output.type === ResponseType.setup_requirements ||
-    ("setup_info" in output &&
-      output.setup_info !== null &&
-      typeof output.setup_info === "object")
-  );
+  const candidate = output as {
+    setup_info?: { agent_name?: unknown } | null;
+  };
+  return (
+    output.type === ResponseType.setup_requirements &&
+    typeof candidate.setup_info?.agent_name === "string"
+  );
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autogpt_platform/frontend/src/app/(platform)/copilot/tools/RunMCPTool/helpers.tsx` around lines 46 - 54, The type guard is too permissive: isSetupRequirementsOutput currently returns true for ResponseType.setup_requirements even if output.setup_info is missing or malformed, which leads to a crash when code later accesses output.setup_info.agent_name; update isSetupRequirementsOutput to only return true when output.setup_info exists, is a non-null object, and contains the expected properties (e.g., agent_name) so that both the ResponseType.setup_requirements branch and the "setup_info" property check validate the same shape before narrowing to SetupRequirementsResponse.
autogpt_platform/backend/backend/util/truncate.py-82-85 (1)
82-85: ⚠️ Potential issue | 🟠 Major
Fast-path string truncation can exceed the requested size limit.
Line 82 introduces a bypass that returns `_truncate_string_middle(value, size_limit)` directly, but that helper adds annotation text; the result can be larger than `size_limit`. This breaks the function contract and can leak oversized payloads downstream.
💡 Proposed fix
```diff
-    # Fast path: plain strings don't need the binary search machinery.
-    if isinstance(value, str):
-        return _truncate_string_middle(value, size_limit)
+    # Keep strings in the same measured path so final output respects size_limit.
+    # (_truncate_string_middle can add annotation text beyond its `limit` arg.)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autogpt_platform/backend/backend/util/truncate.py` around lines 82 - 85, The fast-path for plain strings returns _truncate_string_middle(value, size_limit) but that helper can add annotation text and exceed size_limit; change the fast-path so the returned string is guaranteed <= size_limit by either (a) running the normal truncation/binary-search path used for non-strings instead of the direct helper, or (b) post-trimming the helper result to size_limit (e.g., slice to size_limit) before returning; update the conditional that calls _truncate_string_middle(value, size_limit) accordingly so any annotation cannot produce an output longer than size_limit.
autogpt_platform/backend/backend/copilot/tools/run_mcp_tool.py-161-167 (1)
161-167: ⚠️ Potential issue | 🟠 Major
Guard credential lookup/client construction with the same error-handling path.
`auto_lookup_mcp_credential(...)` and `MCPClient(...)` execute before the protected `try`. If either fails, the tool can raise instead of returning a structured `ErrorResponse`.
💡 Suggested fix
```diff
-    # Fast DB lookup — no network call.
-    # Normalize for matching because stored credentials use normalized URLs.
-    creds = await auto_lookup_mcp_credential(user_id, normalize_mcp_url(server_url))
-    auth_token = creds.access_token.get_secret_value() if creds else None
-
-    client = MCPClient(server_url, auth_token=auth_token)
-    try:
+    # Fast DB lookup — no network call.
+    # Normalize for matching because stored credentials use normalized URLs.
+    creds = await auto_lookup_mcp_credential(
+        user_id, normalize_mcp_url(server_url)
+    )
+    auth_token = creds.access_token.get_secret_value() if creds else None
+    client = MCPClient(server_url, auth_token=auth_token)
+    await client.initialize()
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autogpt_platform/backend/backend/copilot/tools/run_mcp_tool.py` around lines 161 - 167, The credential lookup and client construction (auto_lookup_mcp_credential, normalize_mcp_url, and MCPClient) are running before the try/except that converts failures to an ErrorResponse, so exceptions can escape; move the calls to creds = await auto_lookup_mcp_credential(...), the auth_token extraction, and client = MCPClient(...) inside the same try block (or wrap them in their own try/except) and on any exception return the same structured ErrorResponse used elsewhere so lookup/constructor errors are handled identically; also defensively guard access_token.get_secret_value() in case creds is None or malformed.
autogpt_platform/backend/backend/copilot/sdk/security_hooks.py-310-311 (1)
310-311: ⚠️ Potential issue | 🟠 Major
Protect `on_compact` callback execution from propagating failures.
If `on_compact()` throws, the hook can fail and disrupt SDK processing. This callback should be best-effort.
💡 Suggested fix
```diff
-    if on_compact is not None:
-        on_compact()
+    if on_compact is not None:
+        try:
+            on_compact()
+        except Exception:
+            logger.warning("[SDK] on_compact callback failed", exc_info=True)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autogpt_platform/backend/backend/copilot/sdk/security_hooks.py` around lines 310 - 311, Wrap the on_compact() invocation in a try/except so its exceptions are best-effort and cannot propagate: check on_compact is not None, then call it inside a try block and catch Exception, logging the error (using the module/logger used in security_hooks.py or create a minimal logger) and continue without re-raising; ensure you do not change return semantics of the surrounding function and only swallow the exception after logging.
autogpt_platform/backend/backend/copilot/sdk/e2b_file_tools.py-75-79 (1)
75-79: ⚠️ Potential issue | 🟠 Major
Validate `offset`/`limit` parsing to avoid handler crashes.
Line 77 and Line 78 can raise on non-integer input, which bypasses this handler’s `_mcp(...)` error format and turns simple bad arguments into hard failures.
✅ Suggested fix
```diff
 async def _handle_read_file(args: dict[str, Any]) -> dict[str, Any]:
     file_path: str = args.get("file_path", "")
-    offset: int = max(0, int(args.get("offset", 0)))
-    limit: int = max(1, int(args.get("limit", 2000)))
+    try:
+        offset = max(0, int(args.get("offset", 0)))
+        limit = max(1, int(args.get("limit", 2000)))
+    except (TypeError, ValueError):
+        return _mcp("offset and limit must be integers", error=True)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autogpt_platform/backend/backend/copilot/sdk/e2b_file_tools.py` around lines 75 - 79, The parsing of offset and limit in _handle_read_file currently calls int(...) directly on args.get(...) which can raise ValueError for non-integer inputs; wrap the conversions in a try/except (catch ValueError/TypeError), validate values (offset >= 0, limit >= 1), and on invalid input return the handler's formatted error via the existing _mcp(...) mechanism (or fall back to safe defaults like offset=0, limit=2000 if you prefer), updating the code paths that reference offset and limit to use these validated variables.
autogpt_platform/backend/backend/api/features/workspace/routes.py-226-235 (1)
226-235: ⚠️ Potential issue | 🟠 Major
Handle rollback failure explicitly in post-write quota enforcement.
If `soft_delete_workspace_file(...)` fails on Line 227, the endpoint throws an unhandled exception and skips the intended 413 response path.
🛠️ Suggested fix
```diff
     new_total = await get_workspace_total_size(workspace.id)
     if storage_limit_bytes and new_total > storage_limit_bytes:
-        await soft_delete_workspace_file(workspace_file.id, workspace.id)
+        try:
+            await soft_delete_workspace_file(workspace_file.id, workspace.id)
+        except Exception as rollback_error:
+            logger.error(
+                "Quota rollback failed for workspace_file_id=%s workspace_id=%s",
+                workspace_file.id,
+                workspace.id,
+                exc_info=True,
+            )
+            raise fastapi.HTTPException(
+                status_code=500,
+                detail="Upload exceeded quota and rollback failed; please retry.",
+            ) from rollback_error
         raise fastapi.HTTPException(
             status_code=413,
             detail={
                 "message": "Storage limit exceeded (concurrent upload)",
                 "used_bytes": new_total,
                 "limit_bytes": storage_limit_bytes,
             },
         )
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autogpt_platform/backend/backend/api/features/workspace/routes.py` around lines 226 - 235, The soft-delete rollback call soft_delete_workspace_file(workspace_file.id, workspace.id) can itself raise and bypass the intended 413 response; wrap that call in a try/except around the block that enforces storage quotas (referencing soft_delete_workspace_file, workspace_file.id, workspace.id, storage_limit_bytes, new_total) so any exception from the rollback is caught, logged, and does not prevent returning the quota-exceeded HTTPException; include in the caught path either (a) re-raise the original 413 with an extra flag/detail indicating rollback_failed and the error message, or (b) log the rollback failure and still raise the same fastapi.HTTPException(status_code=413, detail=...) so the client gets the quota response while operators get the rollback error in logs.
autogpt_platform/backend/backend/copilot/sdk/service.py-698-705 (1)
698-705: ⚠️ Potential issue | 🟠 Major
Make transcript download best-effort in preflight.
At Line 704, any `download_transcript(...)` exception aborts the request even though resume is optional. This should degrade gracefully to non-resume mode instead of failing the stream.
💡 Proposed fix
```diff
     async def _fetch_transcript():
         """Download transcript for --resume if applicable."""
         if not (
             config.claude_agent_use_resume and user_id and len(session.messages) > 1
         ):
             return None
-        return await download_transcript(user_id, session_id)
+        try:
+            return await download_transcript(user_id, session_id)
+        except Exception as err:
+            logger.warning(
+                "[SDK] [%s] Transcript download failed; continuing without --resume: %s",
+                session_id[:12],
+                err,
+                exc_info=True,
+            )
+            return None
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autogpt_platform/backend/backend/copilot/sdk/service.py` around lines 698 - 705, The _fetch_transcript coroutine currently calls download_transcript(...) and lets any exception propagate, which will abort the preflight even though resume is optional; wrap the download_transcript call inside a try/except in _fetch_transcript (checking config.claude_agent_use_resume, user_id and session.messages as before), catch all exceptions, log a warning or debug via the existing logger, and return None on error so the flow gracefully falls back to non-resume mode instead of failing the stream.
autogpt_platform/frontend/src/app/(platform)/copilot/components/ChatMessagesContainer/components/MessageAttachments.tsx-41-49 (1)
41-49: ⚠️ Potential issue | 🟠 Major
Validate attachment URLs before using them in `href`.
On Line 43 and Line 59, `file.url` is used directly. Unsafe schemes (for example `javascript:`) should be blocked before rendering links.
🛡️ Suggested hardening
```diff
+function getSafeDownloadHref(url?: string): string | null {
+  if (!url) return null;
+  const value = url.trim();
+  return /^(https?:|blob:|\/)/i.test(value) ? value : null;
+}
+
 export function MessageAttachments({ files, isUser }: Props) {
   if (files.length === 0) return null;

   return (
     <div className="mt-2 flex flex-col gap-2">
-      {files.map((file, i) =>
-        isUser ? (
+      {files.map((file, i) => {
+        const downloadHref = getSafeDownloadHref(file.url);
+        return isUser ? (
           <div
             key={`${file.filename}-${i}`}
             className="min-w-0 rounded-lg border border-purple-300 bg-purple-100 p-3"
           >
@@
-            {file.url && (
+            {downloadHref && (
               <a
-                href={file.url}
+                href={downloadHref}
                 download
                 aria-label="Download file"
                 className="shrink-0 text-purple-400 hover:text-purple-600"
               >
                 <DownloadIcon className="h-5 w-5" />
               </a>
             )}
           </div>
         </div>
       ) : (
         <ContentCard key={`${file.filename}-${i}`}>
           <ContentCardHeader
             action={
-              file.url ? (
+              downloadHref ? (
                 <a
-                  href={file.url}
+                  href={downloadHref}
                   download
                   aria-label="Download file"
                   className="shrink-0 text-neutral-400 hover:text-neutral-600"
                 >
                   <DownloadIcon className="h-5 w-5" />
                 </a>
               ) : undefined
             }
           >
@@
         </ContentCard>
-      ),
-    )}
+        );
+      })}
     </div>
   );
 }
```
Also applies to: 57-65
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autogpt_platform/frontend/src/app/(platform)/copilot/components/ChatMessagesContainer/components/MessageAttachments.tsx` around lines 41 - 49, The MessageAttachments component currently uses file.url directly in anchor hrefs (the <a href={file.url}> instances) which allows unsafe schemes like javascript:; update the component to validate/sanitize file.url before rendering by parsing it (e.g., with the URL constructor or a safe parser) and only allow safe schemes (at least http and https, optionally mailto) — if the URL is invalid or uses a disallowed scheme, avoid rendering the link or render a sanitized fallback (e.g., no href or a disabled download control). Apply this check for both places where file.url is used (the download link and the other anchor instances referenced in the diff and lines noted).
autogpt_platform/frontend/src/app/(platform)/copilot/CopilotPage.tsx-44-50 (1)
44-50: ⚠️ Potential issue | 🟠 Major
Prevent drag-counter underflow to avoid stuck drop overlay.
`dragCounter.current` can go negative on unbalanced drag events, and the `=== 0` guard won’t reset `isDragging` in that case.
🔧 Proposed fix
```diff
   function handleDragLeave(e: React.DragEvent) {
     e.preventDefault();
     e.stopPropagation();
-    dragCounter.current -= 1;
-    if (dragCounter.current === 0) {
+    dragCounter.current = Math.max(0, dragCounter.current - 1);
+    if (dragCounter.current <= 0) {
       setIsDragging(false);
     }
   }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autogpt_platform/frontend/src/app/(platform)/copilot/CopilotPage.tsx` around lines 44 - 50, In handleDragLeave, prevent dragCounter underflow by ensuring dragCounter.current never goes below 0 and treating any non-positive value as end-of-drag; after updating dragCounter.current (decrement), if dragCounter.current <= 0 set dragCounter.current = 0 and call setIsDragging(false). Update references to dragCounter.current and setIsDragging in handleDragLeave accordingly so unbalanced drag events won't leave the drop overlay stuck.
autogpt_platform/frontend/src/app/(platform)/copilot/useCopilotPage.ts-93-122 (1)
93-122: ⚠️ Potential issue | 🟠 Major
New-session attachment flow can silently drop the draft on upload failure.
Lines 95-99 clear pending state before upload completion, and Lines 104-111 can abort send without restoring that draft. Because `onSend` already resolved in the no-session path, upstream input state may already be cleared even though no message was actually sent.
|
This pull request has conflicts with the base branch; please resolve those so we can evaluate the pull request.
Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.
Actionable comments posted: 2
🧹 Nitpick comments (1)
autogpt_platform/backend/backend/copilot/executor/processor.py (1)
126-131: Avoid relying on `claude-agent-sdk` private internals for CLI discovery.
The `claude-agent-sdk` bundles the CLI but provides no documented public API for discovering its path. Lines 126–131 import from the private `_internal` module and call `_find_bundled_cli()`, which can break on SDK updates. The SDK instead supports explicit overrides via `ClaudeAgentOptions(cli_path=...)` or the `CLAUDE_AGENT_CLI_PATH` environment variable, neither of which is used here.
Additionally, the subprocess call does not validate the return code — `subprocess.run()` without `check=True` succeeds silently even if `cli -v` fails, so the unconditional "pre-warm done" log masks CLI invocation errors.
Either validate the subprocess return code (and skip pre-warming on failure), or refactor to let the SDK handle CLI discovery via its public APIs.
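For the first option, a pre-warm helper that refuses to report success on a failing exit could look like this. Illustrative sketch only: `prewarm_cli` is a hypothetical name, the real code would pass the path resolved via `CLAUDE_AGENT_CLI_PATH` or `ClaudeAgentOptions(cli_path=...)`, and the version flag may differ per CLI.

```python
import logging
import subprocess

logger = logging.getLogger("prewarm")


def prewarm_cli(cli_path: str) -> bool:
    """Run the CLI once so the OS page cache is warm; report real status.

    Returns True only when the binary exits 0. A non-zero exit, timeout,
    or spawn failure is logged instead of being masked by a success line.
    """
    try:
        result = subprocess.run(
            [cli_path, "--version"],
            capture_output=True,
            text=True,
            timeout=10,
            check=True,  # raise CalledProcessError on non-zero exit
        )
    except (subprocess.CalledProcessError, subprocess.TimeoutExpired, OSError) as err:
        logger.error("CLI pre-warm failed for %s: %s", cli_path, err)
        return False
    logger.info("CLI pre-warm done: %s (%s)", cli_path, result.stdout.strip())
    return True
```

Because the helper swallows its own errors, worker startup can call it unconditionally without risking a crash from a missing or broken binary.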
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autogpt_platform/backend/backend/copilot/executor/processor.py` around lines 126 - 131, Don't import or call the private SubprocessCLITransport._find_bundled_cli; instead let the SDK discover the CLI via its public API by passing ClaudeAgentOptions(cli_path=...) when you can, or set the CLAUDE_AGENT_CLI_PATH env var; if you must invoke the CLI directly for "pre-warm", run subprocess.run with check=True or inspect the CompletedProcess.returncode and on non-zero skip the "pre-warm done" log and log the stderr/returncode so failures are visible — update references around SubprocessCLITransport._find_bundled_cli and the pre-warm subprocess invocation to implement one of these approaches.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@autogpt_platform/backend/backend/copilot/executor/processor.py`:
- Around line 132-137: The current pre-warm uses subprocess.run([cli_path,
"-v"], ...) and logs success unconditionally; update the pre-warm in
CoPilotExecutor (processor.py) to verify the subprocess exit status before
logging: either pass check=True to subprocess.run and catch
subprocess.CalledProcessError to log an error (including the stdout/stderr) or
inspect the returned CompletedProcess.returncode and only call
logger.info(f"[CoPilotExecutor] CLI pre-warm done: {cli_path}") when returncode
== 0, otherwise log an error with the output and handle/fail appropriately.
In `@autogpt_platform/backend/backend/copilot/sdk/service.py`:
- Around line 717-724: The _fetch_transcript coroutine currently awaits
download_transcript(user_id, session_id) and lets any exception propagate
(causing asyncio.gather to fail); modify _fetch_transcript to catch exceptions
from download_transcript and return None on failure so transcript download is
treated as optional—wrap the await in try/except (catch broad Exception), log or
ignore the error, and return None on exception so callers (and asyncio.gather)
continue normally.
---
Nitpick comments:
In `@autogpt_platform/backend/backend/copilot/executor/processor.py`:
- Around line 126-131: Don't import or call the private
SubprocessCLITransport._find_bundled_cli; instead let the SDK discover the CLI
via its public API by passing ClaudeAgentOptions(cli_path=...) when you can, or
set the CLAUDE_AGENT_CLI_PATH env var; if you must invoke the CLI directly for
"pre-warm", run subprocess.run with check=True or inspect the
CompletedProcess.returncode and on non-zero skip the "pre-warm done" log and log
the stderr/returncode so failures are visible — update references around
SubprocessCLITransport._find_bundled_cli and the pre-warm subprocess invocation
to implement one of these approaches.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 561a0c65-9738-4df5-815a-1e70867ac297
📒 Files selected for processing (4)
autogpt_platform/backend/backend/copilot/config.py
autogpt_platform/backend/backend/copilot/executor/processor.py
autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/backend/backend/copilot/service.py
🚧 Files skipped from review as they are similar to previous changes (1)
- autogpt_platform/backend/backend/copilot/config.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Seer Code Review
- GitHub Check: Check PR Status
- GitHub Check: test (3.12)
- GitHub Check: test (3.13)
- GitHub Check: test (3.11)
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
autogpt_platform/backend/**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
autogpt_platform/backend/**/*.py: Use Python 3.11 (required; managed by Poetry via pyproject.toml) for backend development
Always run 'poetry run format' (Black + isort) before linting in backend development
Always run 'poetry run lint' (ruff) after formatting in backend development
Files:
autogpt_platform/backend/backend/copilot/service.py
autogpt_platform/backend/backend/copilot/executor/processor.py
autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/backend/**/*.{py,txt}
📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)
Use `poetry run` prefix for all Python commands, including testing, linting, formatting, and migrations
Files:
autogpt_platform/backend/backend/copilot/service.py
autogpt_platform/backend/backend/copilot/executor/processor.py
autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/backend/backend/**/*.py
📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)
Use Prisma ORM for database operations in PostgreSQL with pgvector for embeddings
Files:
autogpt_platform/backend/backend/copilot/service.py
autogpt_platform/backend/backend/copilot/executor/processor.py
autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Format Python code with
poetry run format
Files:
autogpt_platform/backend/backend/copilot/service.py
autogpt_platform/backend/backend/copilot/executor/processor.py
autogpt_platform/backend/backend/copilot/sdk/service.py
🧠 Learnings (2)
📚 Learning: 2026-02-26T17:02:22.448Z
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12211
File: .pre-commit-config.yaml:160-179
Timestamp: 2026-02-26T17:02:22.448Z
Learning: Keep the pre-commit hook pattern broad for autogpt_platform/backend to ensure OpenAPI schema changes are captured. Do not narrow to backend/api/ alone, since the generated schema depends on Pydantic models across multiple directories (backend/data/, backend/blocks/, backend/copilot/, backend/integrations/, backend/util/). Narrowing could miss schema changes and cause frontend type desynchronization.
Applied to files:
autogpt_platform/backend/backend/copilot/service.py
autogpt_platform/backend/backend/copilot/executor/processor.py
autogpt_platform/backend/backend/copilot/sdk/service.py
📚 Learning: 2026-03-04T08:04:35.881Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12273
File: autogpt_platform/backend/backend/copilot/tools/workspace_files.py:216-220
Timestamp: 2026-03-04T08:04:35.881Z
Learning: In the AutoGPT Copilot backend, ensure that SVG images are not treated as vision image types by excluding 'image/svg+xml' from INLINEABLE_MIME_TYPES and MULTIMODAL_TYPES in tool_adapter.py; the Claude API supports PNG, JPEG, GIF, and WebP for vision. SVGs (XML text) should be handled via the text path instead, not the vision path.
Applied to files:
autogpt_platform/backend/backend/copilot/service.py
autogpt_platform/backend/backend/copilot/executor/processor.py
autogpt_platform/backend/backend/copilot/sdk/service.py
🔇 Additional comments (2)
autogpt_platform/backend/backend/copilot/service.py (1)
188-188: Config-driven Langfuse cache TTL update looks good.
This cleanly removes hardcoded no-cache behavior while preserving the existing fallback path.
autogpt_platform/backend/backend/copilot/sdk/service.py (1)
725-736: Good parallelization of independent startup I/O.
Running E2B setup, prompt build, and transcript fetch concurrently here is a solid latency optimization.
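The pattern being praised reduces to a few lines of `asyncio.gather`. A generic sketch, with placeholder names for the service's actual coroutines:

```python
import asyncio


async def run_preflight(setup_e2b, fetch_prompt, fetch_transcript):
    """Start the three independent awaitables at once.

    Total wall time is roughly the slowest of the three, not their sum.
    """
    return await asyncio.gather(setup_e2b(), fetch_prompt(), fetch_transcript())
```

Note that a bare `gather` propagates the first failure, so a best-effort task (like the optional transcript download above) should catch its own exceptions and return `None`.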
Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.
- Check subprocess exit status before logging pre-warm success
- Handle transcript download failures as non-fatal in asyncio.gather
🧹 Nitpick comments (1)
autogpt_platform/backend/backend/copilot/executor/processor.py (1)
123-146: Replace private SDK API with public alternative.
`SubprocessCLITransport._find_bundled_cli()` is a private method from an internal module path (`claude_agent_sdk._internal.transport`). The `type: ignore[arg-type]` comment further indicates non-standard usage. The SDK exposes a public API (`ClaudeAgentOptions(cli_path="...")`) and documents that the bundled CLI follows a predictable path without requiring private API access.
Consider:
- Use the public `ClaudeAgentOptions` or compute the bundled CLI path directly (e.g., `Path(site-packages)/claude_agent_sdk/_bundled/claude`) instead of calling private methods
- Add an integration test verifying the pre-warm path works with the installed SDK version (currently no tests exist for this functionality)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@autogpt_platform/backend/backend/copilot/executor/processor.py` around lines 123 - 146, The _prewarm_cli method currently calls the private SubprocessCLITransport._find_bundled_cli; replace this with the public SDK approach by using ClaudeAgentOptions(cli_path=...) or deriving the bundled CLI path deterministically (e.g., resolve site-packages/claude_agent_sdk/_bundled/claude) instead of importing from claude_agent_sdk._internal.transport and remove the type: ignore[arg-type]; update _prewarm_cli to obtain cli_path via ClaudeAgentOptions or the computed Path, then run subprocess.run as before; optionally add an integration test that asserts _prewarm_cli succeeds against the installed SDK to prevent regressions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@autogpt_platform/backend/backend/copilot/executor/processor.py`:
- Around line 123-146: The _prewarm_cli method currently calls the private
SubprocessCLITransport._find_bundled_cli; replace this with the public SDK
approach by using ClaudeAgentOptions(cli_path=...) or deriving the bundled CLI
path deterministically (e.g., resolve
site-packages/claude_agent_sdk/_bundled/claude) instead of importing from
claude_agent_sdk._internal.transport and remove the type: ignore[arg-type];
update _prewarm_cli to obtain cli_path via ClaudeAgentOptions or the computed
Path, then run subprocess.run as before; optionally add an integration test that
asserts _prewarm_cli succeeds against the installed SDK to prevent regressions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 18479791-292e-404e-8010-cee7ee4b6947
📒 Files selected for processing (2)
autogpt_platform/backend/backend/copilot/executor/processor.py
autogpt_platform/backend/backend/copilot/sdk/service.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Seer Code Review
- GitHub Check: types
- GitHub Check: Analyze (typescript)
- GitHub Check: Analyze (python)
- GitHub Check: conflicts
- GitHub Check: test (3.11)
- GitHub Check: test (3.12)
- GitHub Check: test (3.13)
- GitHub Check: Check PR Status
🧰 Additional context used
📓 Path-based instructions (4)
autogpt_platform/backend/**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
autogpt_platform/backend/**/*.py: Use Python 3.11 (required; managed by Poetry via pyproject.toml) for backend development
Always run 'poetry run format' (Black + isort) before linting in backend development
Always run 'poetry run lint' (ruff) after formatting in backend development
Files:
autogpt_platform/backend/backend/copilot/executor/processor.py
autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/backend/**/*.{py,txt}
📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)
Use `poetry run` prefix for all Python commands, including testing, linting, formatting, and migrations
Files:
autogpt_platform/backend/backend/copilot/executor/processor.py
autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/backend/backend/**/*.py
📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)
Use Prisma ORM for database operations in PostgreSQL with pgvector for embeddings
Files:
autogpt_platform/backend/backend/copilot/executor/processor.py
autogpt_platform/backend/backend/copilot/sdk/service.py
autogpt_platform/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Format Python code with `poetry run format`
Files:
autogpt_platform/backend/backend/copilot/executor/processor.py
autogpt_platform/backend/backend/copilot/sdk/service.py
🧠 Learnings (2)
📚 Learning: 2026-02-26T17:02:22.448Z
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12211
File: .pre-commit-config.yaml:160-179
Timestamp: 2026-02-26T17:02:22.448Z
Learning: Keep the pre-commit hook pattern broad for autogpt_platform/backend to ensure OpenAPI schema changes are captured. Do not narrow to backend/api/ alone, since the generated schema depends on Pydantic models across multiple directories (backend/data/, backend/blocks/, backend/copilot/, backend/integrations/, backend/util/). Narrowing could miss schema changes and cause frontend type desynchronization.
Applied to files:
autogpt_platform/backend/backend/copilot/executor/processor.py
autogpt_platform/backend/backend/copilot/sdk/service.py
📚 Learning: 2026-03-04T08:04:35.881Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12273
File: autogpt_platform/backend/backend/copilot/tools/workspace_files.py:216-220
Timestamp: 2026-03-04T08:04:35.881Z
Learning: In the AutoGPT Copilot backend, ensure that SVG images are not treated as vision image types by excluding 'image/svg+xml' from INLINEABLE_MIME_TYPES and MULTIMODAL_TYPES in tool_adapter.py; the Claude API supports PNG, JPEG, GIF, and WebP for vision. SVGs (XML text) should be handled via the text path instead, not the vision path.
Applied to files:
autogpt_platform/backend/backend/copilot/executor/processor.py
autogpt_platform/backend/backend/copilot/sdk/service.py
🔇 Additional comments (8)
autogpt_platform/backend/backend/copilot/executor/processor.py (1)
9-10: LGTM on the skip-version-check and pre-warm initialization. The environment variable setup and pre-warm call are well-placed in the worker startup sequence. The optimization should effectively reduce per-request latency.
Also applies to: 113-119
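The startup sequence praised above can be sketched as a per-worker init hook. The environment variable name below is an assumption for illustration (the real flag is whichever one the SDK's version check honors), and the hook itself is a simplified stand-in:

```python
import os

# Assumed flag name -- set before the SDK is imported in request handlers
# so the per-request CLI version check is skipped (~300ms saved).
os.environ.setdefault("CLAUDE_SDK_SKIP_VERSION_CHECK", "1")


def init_worker(prewarm_fn) -> None:
    """Per-worker startup hook: warm the CLI binary on a best-effort basis."""
    try:
        prewarm_fn()
    except Exception:
        # A failed pre-warm must never block worker startup; the first
        # request just runs cold instead.
        pass
```

The key design point is that both steps happen once per worker, not once per request.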
autogpt_platform/backend/backend/copilot/sdk/service.py (7)
158-183: LGTM on the shared tool notes constant. Clear documentation for the agent about file sharing, long-running tools, and sub-agent task constraints.
186-231: LGTM on the local tool supplement. Good distinction between ephemeral working directory and persistent workspace storage. The `{cwd}` placeholder is properly substituted at line 744.
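The placeholder substitution pattern described here can be sketched as below; the supplement text is illustrative, not the actual constant from the service:

```python
# Simplified stand-in for a tool supplement constant; the service fills
# in the placeholder at request time with the sandbox's real path.
LOCAL_TOOL_SUPPLEMENT = (
    "Your working directory is {cwd}. Files written there are ephemeral; "
    "persist anything important to workspace storage."
)


def compose_supplement(cwd: str) -> str:
    """Substitute the working directory into the supplement text."""
    return LOCAL_TOOL_SUPPLEMENT.format(cwd=cwd)
```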
234-279: LGTM on the E2B tool supplement. Consistent structure with the local supplement, appropriately documenting the cloud sandbox behavior differences.
393-456: LGTM on the `_compress_messages` return type change. Returning a `(messages, was_compacted)` tuple lets callers know whether compression occurred without re-comparing the lists.
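The return-type change can be illustrated with a simplified stand-in (the real `_compress_messages` may summarize rather than simply truncate; the truncation here is an assumption for brevity):

```python
def compress_messages(
    messages: list[dict], max_messages: int = 20
) -> tuple[list[dict], bool]:
    """Keep the newest max_messages entries and report whether anything
    was dropped, so callers need not re-compare input and output."""
    if len(messages) <= max_messages:
        return messages, False
    return messages[-max_messages:], True
```

The second element is what spares callers a list comparison after the fact.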
523-568: LGTM on the `_build_query_message` implementation. The gap-aware compression logic correctly handles the case where the transcript is stale (covers fewer messages than session history). The fallback at lines 555-566 for sessions without transcripts ensures multi-turn context is preserved.
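The gap-aware logic this comment approves can be modeled roughly as follows — a simplified sketch, not the actual helper from the service:

```python
def messages_to_send(history: list[dict], transcript_turns: int) -> list[dict]:
    """If the resumed transcript already covers the whole session history,
    only the newest message needs sending; if it is stale (covers fewer
    turns), send the uncovered tail so no context is silently dropped."""
    gap = len(history) - transcript_turns
    if gap <= 0:
        return history[-1:]   # transcript is current: just the new turn
    return history[-gap:]     # transcript is stale: replay the missing turns
```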
686-738: LGTM on the parallel I/O implementation. Good use of `asyncio.gather` to run independent network operations concurrently. The nested functions properly capture the enclosing scope variables and have appropriate exception handling that degrades gracefully (returning `None` instead of propagating errors). The past review concern about `_fetch_transcript` not handling exceptions has been addressed with the try/except block at lines 723-732.
740-774: LGTM on the transcript processing and system prompt assembly. The transcript validation and resume file handling are well-structured. The conditional logging at lines 770-774 correctly identifies the scenario where resume was expected but no transcript was available.
Issues attributed to commits in this pull requestThis pull request was merged and Sentry observed the following issues:
Summary

- Skip CLI version check at worker init (saves ~300ms/request)
- Pre-warm bundled CLI binary at startup to warm OS page caches
- Parallelize E2B setup, system prompt fetch, and transcript download with `asyncio.gather()` (saves ~200-500ms)
- Enable Langfuse prompt caching with configurable TTL (default 300s)

Test plan

- `poetry run pytest backend/copilot/sdk/service_test.py -s -vvv`
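The configurable Langfuse cache TTL can be sketched as a settings field; the field name mirrors the PR description, but the shape of the backend's real config object (likely a pydantic settings model) is an assumption:

```python
from dataclasses import dataclass


@dataclass
class CopilotSettings:
    # Name taken from the PR description; default matches the stated
    # 300-second prompt cache TTL.
    langfuse_prompt_cache_ttl: int = 300
```

The value would then be forwarded to the prompt fetch, e.g. via the Langfuse Python SDK's `get_prompt(name, cache_ttl_seconds=settings.langfuse_prompt_cache_ttl)`, so cached prompts are served without a network round-trip until the TTL expires.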