π
Headlines & Launches
Introducing Claude Design by Anthropic Labs (4 minute read)
Claude Design is a tool for creating visual work using the Claude Opus 4.7 vision model. It allows users to design prototypes, pitch decks, and marketing materials, automating brand consistency and enabling collaboration. The tool integrates with Claude Code for seamless transition from prototype to production.
Cursor in talks to raise $2B+ at $50B valuation as enterprise growth surges (2 minute read)
Cursor, an AI coding startup, is close to raising $2 billion in funding, potentially doubling its valuation to $50 billion. Thrive and Andreessen Horowitz are expected to lead the round, with Battery Ventures and Nvidia possibly participating. Cursor aims to triple its annualized revenue to over $6 billion by the end of 2026, driven by its proprietary Composer model and strategic shifts to achieve profitability.
π§
Deep Dives & Analysis
Are the Costs of AI Agents Also Rising Exponentially? (11 minute read)
The length of tasks AI agents can perform has been growing exponentially over the last seven years. The latest models can (sometimes) do tasks that would take a human a few hours. However, the cost to achieve these time horizons is growing exponentially, and the hourly costs for some models are now close to human costs. There will eventually be a divergence between what time horizon is possible and what is economically feasible.
Changes in the system prompt between Claude Opus 4.6 and 4.7 (5 minute read)
Anthropic is the only major AI lab to publish the system prompts for its user-facing chat systems. The company just released Opus 4.7, which features an updated system prompt. This author used Claude Code to take the Markdown version of its system prompts, break them into separate documents, and then construct a Git history of those files with fake commit dates. The result is in the post.
Building a Fast Multilingual OCR Model with Synthetic Data (11 minute read)
NEMOTRON OCR V2, developed using synthetic data, is a fast, accurate multilingual OCR model achieving significant accuracy improvements, lowering NED scores to near-zero for non-English languages. Utilizing a synthetic data pipeline with mOSCAR text and diverse fonts, the model trains with pixel-perfect annotations across languages, enabling generalization to real-world documents. The unified architecture reuses feature maps, achieving speeds of 34.7 pages/second on a single A100 GPU, outperforming specialized models in diverse language OCR tasks.
The Two Sides of OpenClaw (7 minute read)
Peter Steinberger's talks highlighted OpenClaw's contrasting stories: an inspiring public narrative and a more serious engineering perspective on security and scaling challenges. Anthropic launched Claude Design to compete with tools like Figma, expanding beyond chat/coding into design and prototyping with its OPUS 4.7 engine, which received mixed initial user reviews despite strong benchmark performances. Open-source agent stacks like Hermes and advancements in computer-use UX are gaining traction, while applied AI advances in science, medicine, and infrastructure underscore the sector's dynamic growth.
π¨βπ»
Engineering & Research
Experimental hybrid inference and new Gemini models for Android (3 minute read)
Hybrid inference for Android is a new API for Firebase AI Logic that leverages both on-device and Cloud inference. It supports Google's new Gemini models, including the latest image generation Nano Banana models. The new API allows apps to dynamically switch between Gemini Nano running locally on-device and cloud-hosted Gemini models. It is still experimental.
xAI launches Grok STT and TTS APIs (4 minute read)
xAI has launched standalone Grok Speech to Text (STT) and Text to Speech (TTS) APIs, enhancing developer options for integrating advanced speech capabilities. These APIs support high accuracy, low latency, word-level timestamps, speaker diarization, and intelligent inverse text normalization across 25+ languages. Grok STT excels in transcription accuracy within phone calls and video/podcasts, providing robust solutions for medical, legal, and financial sectors.
Prefill-as-a-Service: KVCache of Next-Generation Models Could Go Cross-Datacenter (55 minute read)
Prefill-as-a-Service (PrfaaS) is a cross-datacenter serving architecture that selectively offloads long-context prefill to standalone, compute-dense prefill clusters and transfers the resulting KVCache over commodity Ethernet to local PD clusters for decode. It combines model-side KV efficiency with system-side selective offloading, bandwidth-aware scheduling, and cache-aware request placement. The design removes the requirement that heterogeneous accelerators share the same low-latency RDMA fabric and enables independent scaling of prefill and decode capacity across loosely coupled clusters. A PrfaaS-augmented heterogeneous deployment achieves higher serving throughput while consuming only modest cross-datacenter bandwidth.
Dive into Claude Code: The Design Space of Today's and Future AI Agent Systems (1 minute read)
Claude Code is an agentic coding tool that can run shell commands, edit files, and call external services on behalf of the user. This study describes its architecture by analyzing the publicly available TypeScript source code and further comparing it with OpenClaw. The core of the system is a simple while-loop that calls the model, runs tools, and repeats. OpenClaw uses the same recurring design questions but produces different architectural answers when the deployment context changes. The study also identifies six open design directions for future agent systems.
π
Miscellaneous
Composing a Search Engine (2 minute read)
Canon defines its search pipeline as a DAG to ensure automatic concurrency across classification, localization, retrieval, and ranking processes. This setup allows independent node control, enabling easy subsystem changes without affecting the overall pipeline. The DAG framework optimizes execution through parallelism, supports durable execution with node-specific retries, and provides clear introspection and execution decoupling.
Better AI models enable more ambitious work (3 minute read)
Better AI models, like Opus 4.5 and GPT-5.2, increased developer AI usage by 44%, enabling more complex tasks after an initial adjustment period. Industries including media and advertising saw notable usage growth, driven by competitive pressures and new opportunities. Developers shifted toward managing AI output, with significant increases in documentation, architecture, and learning tasks.