Lumina Digest

The AI developments that matter, explained.

The Token Bill Comes Due: The Tokenomics Foundation Is Born to Tackle AI Costs

Spending on AI model tokens has become a runaway budget line. The Linux Foundation is launching a standards body modeled on FinOps, but the very labs that set the prices are missing from the table.

Spending on AI model tokens has become a budget line that is hard to keep under control, and the industry is scrambling to respond. According to TechCrunch, one unnamed company ran up a Claude bill of around $500 million after failing to impose usage limits on its employees (a reported figure). The same outlet reports that Uber exhausted its 2026 AI coding budget as early as April, and that one Priceline employee saw a Cursor renewal come back "4-5 times more expensive." J.R. Storment of the FinOps Foundation reports companies that have already spent three times their token budget for all of 2026.

The response comes from the Linux Foundation. On June 3 it announced its intent to launch the Tokenomics Foundation, with open standards, benchmarks, and new metrics to measure token spending: cost-per-intelligence, tokens-per-watt, and token factory effectiveness. It is being created in partnership with the FinOps Foundation and extends the FOCUS specification to token consumption. The formal launch is expected at FinOps X in San Diego. The twelve initial backers include Google Cloud, Microsoft, Oracle, IBM, SAP, Salesforce, Accenture, KPMG, and JPMorgan Chase.

The limitation is structural. As SDxCentral observes, the very frontier labs that set token prices — OpenAI and Anthropic — are missing from the table. The foundation tackles measurement and efficiency, not price reduction. The price per token is falling (DeepSeek has cut it by about 90%), but bills are climbing because volumes are exploding. Goldman Sachs estimates a 24-fold increase in consumption by 2030. The sticking point flagged by Nicholas Arcolano (Jellyfish) remains: working out whether the spending generates business value, something many companies still cannot measure.
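The arithmetic behind falling prices and rising bills can be sketched in a few lines. The example pairs the roughly 90% price cut with the 24-fold volume estimate purely for illustration: the two figures come from different sources and time frames, and `bill_multiplier` and `break_even_growth` are hypothetical helper names.

```python
def bill_multiplier(price_cut: float, volume_growth: float) -> float:
    """Return new_bill / old_bill.

    price_cut: fractional drop in price per token (0.90 means a 90% cut).
    volume_growth: factor by which token volume grows (24 means 24-fold).
    """
    return (1.0 - price_cut) * volume_growth


def break_even_growth(price_cut: float) -> float:
    """Volume growth at which the bill stays flat despite the price cut."""
    return 1.0 / (1.0 - price_cut)


if __name__ == "__main__":
    # Illustrative pairing only: a 90% cheaper token, 24 times the volume.
    print(round(bill_multiplier(0.90, 24), 2))   # bill still grows about 2.4x
    print(round(break_even_growth(0.90), 2))     # about 10x volume cancels the cut
```

Under these assumptions, any volume growth beyond roughly tenfold outweighs a 90% price cut, which is why measurement alone does not bring the bill down.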

Why it matters

  • Entrepreneurs: AI token spending erodes margins and is now a strategic procurement line item: it must be tied to measurable business value, because — as Arcolano warns — that is something almost no one can do today. Shared standards help compare suppliers, but they don't lower prices: governance remains the company's responsibility.
  • ICT engineers / IT managers: The $500 million case stems from the absence of usage limits, and it makes the case for consumption governance, observability, and operational metrics such as tokens-per-watt and cost-per-intelligence. The FOCUS specification extended to tokens promises a common schema for comparing spending across different vendors and internal platforms.

US AI Executive Order: Voluntary 30-Day Vetting for Frontier Models, No Licensing Requirements

Trump signs "Promoting Advanced Artificial Intelligence Innovation and Security": three pillars — federal cyber, a voluntary pre-release channel, criminal enforcement — and no licensing requirement, in sharp contrast with the European AI Act.

On June 2, 2026, President Trump signed the executive order "Promoting Advanced Artificial Intelligence Innovation and Security", which structures the US approach to frontier models along three lines, not just the vetting that made headlines.

Federal cyber. Agencies must strengthen the defenses of government and critical infrastructure within 30 days (July 2). The Treasury establishes an "AI cybersecurity clearinghouse" to coordinate vulnerability scanning and disclosure (Latham & Watkins analysis).

Voluntary pre-release channel. By August 1, 2026, a process must be built in which developers can give the government access to "covered frontier models" up to 30 days before release to "trusted partners." The window was cut from the 90 days of a May draft (The Register). Both terms remain undefined: the threshold that qualifies a model as "frontier" is set by the NSA through classified benchmarking.

Criminal enforcement. The attorney general prioritizes crimes in which AI is used to gain unlawful computer access.

The order explicitly excludes "any mandatory licensing, pre-clearance, or permit" to develop or release models. But the discretion over who qualifies as a "trusted partner" worries analysts. The Cato Institute fears it could be used "against companies at odds with the administration," while the Council on Foreign Relations notes that "the government cannot assess what it cannot see." This is the opposite of the European AI Act, which presumes systemic risk for models trained with more than 10^25 floating-point operations (FLOP) and imposes obligations on them.

Why it matters

  • Entrepreneurs: The United States is choosing the voluntary, deregulated path: no licensing or pre-clearance requirement to bring a model to market, a lighter compliance burden than the European AI Act. The flip side is the discretion over who counts as a "trusted partner" and which models are "frontier," which introduces political risk — analysts fear favoritism or retaliation against companies that have fallen out of favor.
  • ICT engineers / IT managers: The Treasury's "AI cybersecurity clearinghouse" and the 30-day federal hardening redesign vulnerability coordination, but the "covered frontier model" threshold stays classified and undefined: planning governance and release timelines becomes uncertain. And as experts note, finding the flaws is the easy part — consistent patching remains the operational bottleneck.

A Malicious config.json Runs Code in Hugging Face Transformers, Bypassing trust_remote_code

CVE-2026-4372: a single configuration field turns a routine model load into code execution, with no warnings or prompts. Fixed in Transformers 5.3.0.

A critical flaw in Hugging Face Transformers, the library that underpins much of the machine learning stack, allowed arbitrary code to be executed on the victim's system. A routine model load was all it took. Tracked as CVE-2026-4372 (CVSS 7.8, source huntr.dev; NIST assessment still in progress), the vulnerability was discovered by researcher Yotam Perkal of Pluto Security and reported to Hugging Face in February 2026.

The mechanism, reconstructed in the analysis by CSO Online, centers on a configuration field, _attn_implementation_internal: placed in a malicious config.json, it points to an attacker-controlled repository on the Hugging Face Hub. When the victim calls the standard from_pretrained() API, the hub_kernels.py component of the optional kernels package interprets that value as a reference to a "kernel" and downloads and runs its Python code. All of this happens with no sandboxing, no code signing, no integrity verification, and no prompt to the user. The expected protection, trust_remote_code=False, is bypassed because the field is processed during config deserialization, before the flag is ever evaluated.

The exposure is broad: according to SiliconANGLE, the affected versions run from 4.56.0 through 5.2.x when kernels is installed, with the vulnerable code present since August 2025. Transformers has more than 2.2 billion total installs, and the vulnerable releases were still being downloaded 7-8 million times a week, roughly a quarter of the total. The caveat that limits exploitation — the need for the kernels package — carries little weight in enterprise environments and GPU clusters, where optional dependencies are commonly all installed. The fix landed with version 5.3.0, in early March 2026.
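A minimal sketch of how the reported version range could be checked on a single machine, using only the standard library. The range boundaries are taken from the figures above; the import name `kernels` for the optional package and all function names are assumptions made for illustration. Pre-release suffixes are ignored, so a string such as "5.3.0.dev0" is treated as 5.3.0.

```python
import importlib.util
import re
from importlib import metadata

FIRST_AFFECTED = (4, 56, 0)  # start of the affected range as reported
FIRST_FIXED = (5, 3, 0)      # release reported to carry the fix


def parse_version(text: str) -> tuple:
    """Parse 'X.Y' or 'X.Y.Z' (any suffix is ignored) into three ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def in_affected_range(version: str) -> bool:
    """True if the version falls in [4.56.0, 5.3.0)."""
    return FIRST_AFFECTED <= parse_version(version) < FIRST_FIXED


def environment_exposed() -> bool:
    """True if an in-range Transformers and a `kernels` module are both present.

    Assumption: the optional package is importable under the name `kernels`.
    """
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return False
    has_kernels = importlib.util.find_spec("kernels") is not None
    return in_affected_range(installed) and has_kernels


if __name__ == "__main__":
    print("exposed:", environment_exposed())
```

Run inside each environment or container image, a check like this gives a first-pass inventory; it does not replace the upgrade itself.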

Why it matters

  • LLM builders / devs: Loading a model stops being a harmless operation: even with trust_remote_code=False, widely regarded as the safe setting, a simple config.json can execute code. Upgrade Transformers to at least 5.3.0, treat configs and weights as untrusted input, and inspect configs for the _attn_implementation_internal field.
  • ICT engineers / IT managers: This is a supply chain risk on a ubiquitous component: the most exposed are GPU clusters and enterprise ML platforms, which install all optional dependencies (including kernels). You need to inventory the Transformers versions in use across the fleet and force the upgrade to 5.3.0, because the real security boundary is the model's configuration, not just explicit remote code.
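Inspecting a config for the field named above can be done without loading the model through Transformers at all. The sketch below reads config.json as plain JSON and reports every place the key appears, including nested sub-configs. The helper names are hypothetical, and a hit is only a prompt for manual review, not proof of malice.

```python
import json
from pathlib import Path

SUSPECT_KEY = "_attn_implementation_internal"


def find_key(node, key, path="$"):
    """Return the JSON paths of every occurrence of `key` inside `node`."""
    hits = []
    if isinstance(node, dict):
        for name, value in node.items():
            child = f"{path}.{name}"
            if name == key:
                hits.append(child)
            hits.extend(find_key(value, key, child))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits.extend(find_key(value, key, f"{path}[{index}]"))
    return hits


def scan_config(config_path):
    """Parse a config.json as plain JSON and list occurrences of the field."""
    data = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return find_key(data, SUSPECT_KEY)


if __name__ == "__main__":
    import sys

    for arg in sys.argv[1:]:
        for hit in scan_config(arg):
            print(f"{arg}: found {hit}")
```

Because the file is parsed with the `json` module only, the scan itself never triggers the code path described in the advisory.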

A Single Issue Could Hijack Repositories Using the Claude Code GitHub Action

A flawed authorization check in claude-code-action trusted any 'bot' actor: a single malicious issue was enough to trigger a prompt injection, exfiltrate the OIDC credentials and gain write access. Anthropic fixed the main bypass within a few days; the patch is in v1.0.94.

Researcher RyotaK of GMO Flatt Security demonstrated how the official Claude Code GitHub Action (claude-code-action) could be hijacked with a single issue. The root cause is a flawed authorization check: the checkWritePermissions function trusted any GitHub App actor — whose name ends in [bot] — regardless of its actual permissions. GitHub Apps have implicit read access to public repositories and can open issues using only the installation token. An attacker could therefore create a malicious app, install it on a repo of their own and use its token to open an issue in the target repository, bypassing the check (original disclosure, GMO Flatt Security).
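The logic flaw described above can be illustrated with a small stand-in. This is not the Action's actual code: the `Actor` type, both function names, and the allow-list parameter are hypothetical, chosen only to contrast "trust anything ending in [bot]" with "check real permissions, and reject unknown bots".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Actor:
    login: str        # e.g. "alice" or "some-app[bot]"
    permission: str   # "none", "read", "write" or "admin" on the target repo


def flawed_has_write_access(actor: Actor) -> bool:
    """Mirrors the described bug: a [bot] suffix short-circuits the check."""
    if actor.login.endswith("[bot]"):
        return True
    return actor.permission in {"write", "admin"}


def stricter_has_write_access(actor: Actor, allowed_bots=frozenset()) -> bool:
    """Bots are rejected unless allow-listed; everyone needs real permissions."""
    if actor.login.endswith("[bot]") and actor.login not in allowed_bots:
        return False
    return actor.permission in {"write", "admin"}


if __name__ == "__main__":
    attacker_app = Actor(login="evil-app[bot]", permission="read")
    print(flawed_has_write_access(attacker_app))    # True: the bypass
    print(stricter_has_write_access(attacker_app))  # False
```

The stricter variant follows the spirit of the reported fix, which disables GitHub App triggers by default, though the real patch may be structured differently.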

That is where the prompt injection comes into play: the issue contained a fake error message that coaxed Claude into running commands. Claude Code authorizes certain bash commands without confirmation (such as cat and head), enough to read /proc/self/environ and expose the workflow's environment variables. Among these are ACTIONS_ID_TOKEN_REQUEST_TOKEN and ACTIONS_ID_TOKEN_REQUEST_URL, the credentials for requesting an OIDC token. Exchanging that token with Anthropic's backend yielded a privileged installation token, and therefore write access to the repository (eSecurity Planet).

The most serious scenario: Anthropic's own Action used that workflow, so an exploit could have injected code into the Action itself, propagating downstream (a supply chain risk). Anthropic rated the flaw 7.8 (CVSS v4.0), awarded a bug bounty and fixed the bypass in four days: reported on January 12, 2026, it was resolved on the 16th. The patch is in v1.0.94, which disables GitHub App triggers by default. On February 17 an analogous configuration in Cline was exploited in the wild, a sign that the attack class is real. Independent researcher Aonan Guan frames the problem as architectural rather than a bug: agents process untrusted input in the same runtime where they hold secrets and execution tools. The same weakness also affects Gemini CLI and GitHub Copilot (SecurityWeek).

Why it matters

  • LLM builders / devs: Anyone integrating Claude Code or similar agents into their CI must treat every issue, PR or comment as hostile input: the flaw arises from executing untrusted input in the same runtime that holds secrets and tools. Updating to at least v1.0.94, disabling GitHub App triggers and restricting auto-approved commands are immediate mitigations.
  • ICT engineers / IT managers: This is a supply chain risk that transcends any single vendor: the same class also hits Gemini CLI and GitHub Copilot. You need an inventory of which AI agents have access to repositories and OIDC tokens, and a policy governing their permissions before an upstream exploit propagates to downstream projects.

SymJack: a hostile repository abuses instruction files and a symlink so that the user approves a seemingly harmless command that in fact overwrites the agent's configuration and installs a malicious MCP server. Adversa AI demonstrated it on at least five CLIs; nearly every vendor downplayed it.

On May 26, 2026, Rony Utevsky of Adversa AI disclosed SymJack, an attack that bypasses the human-in-the-loop control of several coding agents. A hostile repository includes an instruction file that agents read automatically on startup (CLAUDE.md, AGENTS.md, GEMINI.md), treating it as trusted guidance. As a technical analysis explains, that file asks the agent for a raw shell copy — for example cp media/vid0.mp4 docs/vid-settings.mp4 — instead of a native write tool. The destination, however, is a symlink committed in the repo that points to the agent's configuration (.mcp.json, .claude/settings.json, .codex/config.toml): the kernel follows the link and overwrites the config with the definition of a malicious MCP server disguised as a video file. On restart the agent launches that server and the attacker's code runs as the user, with no sandbox. The approval prompt shows the literal command, not the resolved path: the user approves something that looks harmless.
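The gap between the literal command and its real effect can be made visible by resolving the destination before asking for approval. The following is a minimal sketch, assuming a plain `cp SRC DST` command and a POSIX-style filesystem; `inspect_copy` and the list of config suffixes are illustrative choices, not any vendor's implementation.

```python
import os
import shlex

# Agent configuration files named in the disclosure (matched by path suffix).
AGENT_CONFIG_SUFFIXES = (
    ".mcp.json",
    os.path.join(".claude", "settings.json"),
    os.path.join(".codex", "config.toml"),
)


def inspect_copy(command: str, cwd: str):
    """Return (resolved_destination, via_symlink, hits_agent_config).

    Only handles the simple form `cp SRC DST`; anything else is rejected.
    """
    args = shlex.split(command)
    if len(args) != 3 or args[0] != "cp":
        raise ValueError("this sketch only understands: cp SRC DST")
    base = os.path.realpath(cwd)
    literal = os.path.normpath(os.path.join(base, args[2]))
    resolved = os.path.realpath(literal)  # follows symlinks, as the kernel would
    via_symlink = resolved != literal
    hits_config = resolved.endswith(AGENT_CONFIG_SUFFIXES)
    return resolved, via_symlink, hits_config


if __name__ == "__main__":
    target, via_symlink, hits_config = inspect_copy(
        "cp media/vid0.mp4 docs/vid-settings.mp4", os.getcwd()
    )
    print("writes to:", target)
    if via_symlink or hits_config:
        print("warning: destination is not what the command text suggests")
```

An approval prompt built on the resolved path would show the config file being overwritten rather than a video file being copied, so the approval would be an informed one.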

According to SecurityWeek, at least five agents are involved (Claude Code, Cursor, Antigravity, Copilot CLI, Grok Build CLI); the disclosure also enumerates Gemini CLI and OpenAI Codex CLI, so the number varies depending on how the binaries are counted. The impact ranges from the theft of SSH keys and cloud tokens to the compromise of CI pipelines, where a single pull request can exfiltrate all of a runner's secrets before a human review.

Most vendors dismissed the report as expected behavior. OpenAI/Bugcrowd closed the report as theoretical, arguing that the user approves the cp command; Adversa counters that this very approval is not informed, because the prompt does not show the resolved path. Google classified it as a "single-user self attack," Cursor as a duplicate of an already-known symlink report. Anthropic rejected it as out of scope but, without formally acknowledging the gap, later tightened Claude Code's approval flow to show the resolved path. xAI and GitHub had not yet responded.

Why it matters

  • ICT engineers / IT managers: The approval prompt is not a reliable defense: a single cloned repo or a single PR can exfiltrate CI secrets before a human ever sees them. Instruction files from external repos must be treated as untrusted, agent permissions must be restricted, and MCP configurations must be audited.
  • LLM builders / devs: Those building agents must resolve symlinks before the approval prompt and show the real effect, not the literal command: Anthropic's silent patch is the model to follow. Instructions auto-ingested from repos (CLAUDE.md, AGENTS.md) are untrusted input, not trusted guidance.

Ramp Raises $750 Million at a $44 Billion Valuation, Driven by Its AI Bet

The spend management company closes a $750 million Series F led by ICONIQ, GIC, and Ontario Teachers'. The valuation nearly doubles in under a year, riding the appetite for fintechs with an AI story.

Ramp, a New York-based spend management company founded in 2019, announced on June 4, 2026 a $750 million Series F (primary financing) at a $44 billion valuation. The round is led by ICONIQ, GIC, and Ontario Teachers' Pension Plan, with the entry of heavyweight institutional investors such as Goldman Sachs Alternatives, D.E. Shaw, Morgan Stanley Investment Management, Generation Investment Management, and Insight Partners. Returning backers include Founders Fund, Thrive Capital, Coatue, and General Catalyst, among others. With this deal, total funding raised surpasses $3 billion.

Underpinning the valuation are robust growth numbers: more than 70,000 customers (up from 50,000 in November 2025), an annualized purchase volume of $200 billion, annualized revenue above $1 billion with positive free cash flow, and more than 3,200 enterprise customers with at least $100,000 in ARR. As of March 2026, TPV grew roughly 170% year over year.

The story's driving force is AI. Ramp is betting on "token spend management" (monitoring and controlling the usage costs of AI models), a corporate card designed for AI agents, agents for procurement and accounting, and Ramp Stack, the AI operating system for accounting firms. It is precisely this narrative that is drawing in the capital. As TechCrunch observes, investors "hunger for fintechs with an AI story," to the point of wryly noting that the company's announcement "looks an awful lot like it was AI-generated." The valuation has nearly doubled from the $22.5 billion of summer 2025 and is up 37.5% from the $32 billion of November 2025. It's a pace that fuels doubts about the sector's exuberance. It is, in any case, the largest round of the week.

Why it matters

  • Entrepreneurs: The jump to $44 billion sets a new benchmark in enterprise fintech and confirms that capital today rewards those who pair a credible AI story with solid numbers (over $1 billion in revenue, positive free cash flow): for business owners, it's the signal that spend management — including the new frontier of AI token costs — is a market in full expansion. At the same time, a valuation that has nearly doubled in under a year is an invitation to read the sector's exuberance with caution.

Lovable Quintuples Its Google Cloud Footprint and Locks In Access to AI Models

The Swedish vibe coding startup is extending its collaboration with Google Cloud to multiple years in order to grow fivefold in infrastructure and AI usage, with access to Gemini and Claude. The financial value was not disclosed.

Lovable is the Stockholm startup that generates full-stack applications from natural-language prompts (so-called "vibe coding"). On June 3, at the Google Cloud Summit Nordics, it announced a multi-year extension of its collaboration with Google Cloud, which becomes one of its primary infrastructure partners. According to TechCrunch, the deal calls for a fivefold increase in Lovable's footprint on the platform, including AI usage, with access to Google's Gemini models and Anthropic's Claude via Vertex AI. The financial value was not disclosed by either company.

Google Cloud's official press release rests on three pillars: the Lovable Agent joins the Gemini Enterprise Agent Gallery, a catalog of verified third-party agents; a new integration with Wiz detects and fixes vulnerabilities in AI-generated code in real time; and Lovable becomes available for purchase on Google Cloud Marketplace and Gemini Enterprise to streamline procurement and billing. "Building is only the beginning," said CEO Anton Osika.

The goal is to win over enterprise customers: Lovable processes more than a million new projects a week and has reached roughly $400 million in annualized revenue with 146 employees. But, as The Next Web notes, it competes in a crowded field (Cursor, Replit, Bolt) and must prove to businesses that what it produces is secure and governed. Cloud providers, TechBuzz observes, are locking in fast-growing AI companies ahead of time with multi-year deals, before they become major consumers of compute.

Why it matters

  • Entrepreneurs: For anyone building products on top of AI, the deal shows that securing compute capacity and model access through multi-year contracts is becoming as much of a strategic lever as product quality: locking in supply before costs rise protects margins, but exposes you to lock-in with a single hyperscaler. On the go-to-market side, getting into a cloud provider's marketplace and agent gallery removes the procurement friction that often stalls enterprise sales — a channel that can matter more than simply having the best model.

Goedel-Architect Proves Theorems in Lean 4 with a Global Blueprint, Hitting 99.2% on MiniF2F

An agentic framework from a Princeton group generates a dependency graph and closes lemmas in parallel, reaching 75.6% on PutnamBench; the highest scores, however, require natural-language guidance.

A Princeton group led by Sanjeev Arora has released Goedel-Architect, an agentic framework for formally proving theorems in Lean 4. What sets it apart from the dominant approaches is its method. Instead of recursively decomposing the theorem into lemmas — a strategy that can spiral into dead ends — the system first generates a global blueprint: a dependency graph among definitions and lemmas that leads to the main theorem. Then a tool-equipped Lean prover closes every still-open lemma node in parallel, and the lemmas that fail guide the refinement of the overall blueprint (arXiv, preprint dated June 4, 2026).
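The blueprint-then-parallel-closure idea can be sketched as a toy loop. This is not the authors' code: the blueprint is reduced to a dictionary mapping each node to the nodes it depends on, and `try_prove` is a hypothetical stand-in for the Lean prover, which in the real system would attempt each lemma with its dependencies taken as given.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter


def check_blueprint(blueprint):
    """Return nodes in dependency order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(blueprint).static_order())


def close_open_lemmas(blueprint, proved, try_prove):
    """Attempt every still-open node in parallel.

    Returns (updated_proved_set, failed_nodes); the failures are what a
    refinement step would use to revise the blueprint.
    """
    open_nodes = sorted(node for node in blueprint if node not in proved)
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(try_prove, open_nodes))
    closed = {node for node, ok in zip(open_nodes, results) if ok}
    failed = [node for node, ok in zip(open_nodes, results) if not ok]
    return set(proved) | closed, failed


if __name__ == "__main__":
    # Toy blueprint: node -> nodes it depends on.
    blueprint = {
        "main_theorem": ["lemma_a", "lemma_b"],
        "lemma_a": [],
        "lemma_b": ["lemma_a"],
    }
    check_blueprint(blueprint)

    def stub_prover(node):
        # Stand-in for a real prover: pretend lemma_b fails.
        return node != "lemma_b"

    proved, failed = close_open_lemmas(blueprint, set(), stub_prover)
    print(sorted(proved), failed)
```

In this toy run the failed node would be handed back to whatever revises the blueprint, for example by splitting it into smaller lemmas, and the loop would repeat until every node is closed.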

The scores are top-tier: 99.2% pass@1 on MiniF2F-test and 75.6% pass@1 on PutnamBench in autonomous mode. With the blueprint guided by a natural-language proof, they rise to 100% and 88.8% (597/672). On recent competitions it solves 4 of 6 problems at IMO 2025, 11 of 12 at Putnam 2025, and 3 of 6 at USAMO 2026. The backbone is DeepSeek-V4-Flash (284B-A13B), an open-weight model.

The work extends the same group's Goedel-Prover line: Goedel-Prover-V2 topped out at 88.1% pass@32 on MiniF2F and 86 problems on PutnamBench. The substantial leap is therefore on PutnamBench, while MiniF2F is by now close to saturation. Three caveats remain, however. The highest scores are not autonomous but require natural-language guidance. The cost reduction "up to 500 times versus comparable open-source pipelines" is a claim by the authors, not an independently verified comparison. Finally, it is a preprint not yet peer-reviewed (academic summary).

Why it matters

  • Frontier research: The signal is not just the score but the shift in architecture: the global blueprint with parallel lemma closure positions itself as an alternative to recursive decomposition, and the leap happens on PutnamBench — the hardest, not-yet-saturated benchmark — by orchestrating an open-weight backbone (DeepSeek-V4-Flash) instead of training a dedicated prover. It suggests that, in formal reasoning, agentic orchestration may count as much as model scale.