Wednesday, June 17, 2026
  • About Web3Wire
  • Web3Wire NFTs
  • .w3w TLD
  • $W3W Token
  • Web3Wire DAO
  • Media Network
  • RSS Feed
  • Contact Us
Web3Wire
  • Home
  • Web3
    • Latest
    • AI
    • Business
    • Blockchain
    • Cryptocurrencies
    • Decentralized Finance
    • Metaverse
    • Non-Fungible Token
    • Press Release
  • Technology
    • Consumer Tech
    • Digital Fashion
    • Editor’s Choice
    • Guides
    • Stories
  • Coins
    • Top 10 Coins
    • Top 50 Coins
    • Top 100 Coins
    • All Coins
  • Exchanges
    • Top 10 Crypto Exchanges
    • Top 50 Crypto Exchanges
    • Top 100 Crypto Exchanges
    • All Crypto Exchanges
  • Stocks
    • Blockchain Stocks
    • NFT Stocks
    • Metaverse Stocks
    • Artificial Intelligence Stocks
  • Events
  • News
    • Latest Crypto News
    • Latest DeFi News
    • Latest Web3 News

WEKA and Oracle Cloud Infrastructure Validate 10x Throughput Gains for Long-Context AI Inference

June 10, 2026
in Artificial Intelligence, PRNewswire

Joint benchmarks on OCI H100 infrastructure showed 10x more concurrent users, 10x higher token throughput, and 7x more tokens served without adding GPUs

CAMPBELL, Calif., June 9, 2026 /PRNewswire/ — WEKA, the AI data and memory infrastructure company, today announced production-scale benchmarks that show how organizations can improve the economics of long-context AI inference by serving more users and tokens on the same GPU footprint. The benchmarks show that WEKA’s NeuralMesh™ platform with Augmented Memory Grid™ on Oracle Cloud Infrastructure (OCI) serves 10x more concurrent users, delivers 10x higher token throughput, and produces 7x more tokens per GPU than DRAM-only configurations without adding infrastructure. The results were validated on a nine-node OCI bare-metal H100 cluster with 100,000-token context windows.

“Enterprise AI workloads are pushing context windows and GPU utilization to new limits,” said Pablo Selem, senior director, software development, Oracle Cloud Infrastructure. “These benchmarks show how WEKA’s NeuralMesh platform with Augmented Memory Grid on OCI helps remove memory bottlenecks so customers can support larger, more demanding inference workloads without simply adding more GPUs.”

Three Outcomes That Change the Math on Inference
Validated at production scale on a bare-metal H100 cluster (nine nodes, 72 GPUs, 100,000-token context windows, thousands of concurrent users), NeuralMesh with Augmented Memory Grid on OCI delivered:

  • 10x more concurrent users served, without adding infrastructure. NeuralMesh with Augmented Memory Grid scaled past 5,000 concurrent users, compared to about 600 for DRAM-only configurations. Expanding the active cache working set from 8.64 TiB of DRAM to 287 TiB of usable NVMe eliminates the failure cliff that hits when the cache saturates. More users per GPU also means the same investment stretches further.
  • 10x higher token throughput. More output from every GPU in the cluster. On OCI, NeuralMesh with Augmented Memory Grid reached approximately two million tokens per second, compared to under 200,000 for the DRAM-only baseline. For product teams running real-time AI features, including search, summarization, code assist, and multi-turn agents, throughput sets the ceiling on how many users can be served, how fast features respond, and how much revenue the infrastructure can support.
  • 7x more tokens served. Lower cost per token at scale. NeuralMesh with Augmented Memory Grid served five billion tokens, compared to 700 million for the DRAM-only baseline, in a single one-hour, 2,400-user test. For organizations running agentic workflows, DRAM saturation quietly drains GPU capacity through constant recomputation, creating a direct hit on cost per token and ROI.
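
As a quick sanity check, the ratios behind the three outcomes can be recomputed from the figures quoted above. This is a minimal sketch that uses only the rounded numbers in this release; the dictionary keys and the `ratio` helper are illustrative names, not part of any WEKA or OCI tooling. Because the release hedges its inputs ("past 5,000", "about 600", "under 200,000"), the computed values are approximations and need not match the headline multiples exactly.

```python
# Figures as quoted in the release: (with Augmented Memory Grid, DRAM-only baseline).
# Several are rounded or bounded in the text ("past", "about", "under").
figures = {
    "concurrent_users": (5_000, 600),
    "tokens_per_second": (2_000_000, 200_000),
    "tokens_served_one_hour": (5_000_000_000, 700_000_000),
    "cache_working_set_tib": (287, 8.64),
}


def ratio(with_amg, dram_only):
    """Return how many times larger the first figure is than the second."""
    return with_amg / dram_only


# One decimal place is plenty given how rounded the inputs are.
ratios = {name: round(ratio(a, b), 1) for name, (a, b) in figures.items()}

for name, value in ratios.items():
    print(f"{name}: {value}x")
```

With these inputs the cache working set grows by roughly 33x, which is the capacity increase the release credits for removing the saturation cliff.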

“Inference is bottlenecked by how much effective memory is available to GPUs,” said Liran Zvibel, CEO of WEKA. “These results prove that AI token economics aren’t solved by hardware alone; they’re solved by eliminating the memory wall that has been the real ceiling on what existing hardware can do. NeuralMesh with Augmented Memory Grid running on OCI brings orders of magnitude more tokens to customers in an extremely cost-efficient way.”

Transforming AI Economics with Context Memory Infrastructure
As inference demand grows, AI infrastructure inefficiencies compound. Every key-value (KV) cache eviction is a tax: on GPU cycles, latency, user experience, and the cost of every token served. For long-context and agentic workloads, where inputs routinely run to 100,000 tokens or more, that tax is not a rounding error. It is a direct hit on the unit economics of every organization running production AI.

Augmented Memory Grid, a capability of NeuralMesh, solves the problem at the architectural level by decoupling the KV cache from local GPU memory and storing it in a high-performance token warehouse accessible across the cluster. Any host can serve any session with cache hits intact, which eliminates rigid session stickiness, improves load balancing, and enables clean horizontal scaling as concurrency grows, while delivering performance superior to DRAM. The result is persistent context memory for AI agents and the cost lever that makes long-context inference economical to run at scale.
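
The effect of decoupling the KV cache from individual hosts can be illustrated with a toy simulation. This is a sketch under stated assumptions and not WEKA's implementation: the `LocalLRU` class, the `simulate` function, the host count, and the cache sizes are all hypothetical. The shared tier is modeled as unbounded, and fetch latency is ignored. The simulation counts how often a session's context must be recomputed when requests land on arbitrary hosts.

```python
from collections import OrderedDict
import random


class LocalLRU:
    """Hypothetical per-host KV cache that evicts the least recently used session."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def lookup(self, session):
        if session in self.entries:
            self.entries.move_to_end(session)  # mark as most recently used
            return True
        return False

    def store(self, session):
        self.entries[session] = True
        self.entries.move_to_end(session)
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the oldest entry


def simulate(num_hosts, local_capacity, requests, shared_tier):
    """Count recomputations for a stream of (session, host_id) requests."""
    hosts = [LocalLRU(local_capacity) for _ in range(num_hosts)]
    shared = set()  # assumption: the shared tier never evicts
    recomputes = 0
    for session, host_id in requests:
        host = hosts[host_id]
        if host.lookup(session):
            continue  # local cache hit
        if shared_tier and session in shared:
            host.store(session)  # loaded from the shared tier, no recompute
            continue
        recomputes += 1  # context has to be rebuilt from scratch
        host.store(session)
        if shared_tier:
            shared.add(session)
    return recomputes


rng = random.Random(0)  # fixed seed so the run is repeatable
# 50 sessions spread across 4 hosts with no session stickiness.
requests = [(rng.randrange(50), rng.randrange(4)) for _ in range(2_000)]

local_only = simulate(4, 5, requests, shared_tier=False)
with_shared = simulate(4, 5, requests, shared_tier=True)
print(f"recomputes, local cache only: {local_only}")
print(f"recomputes, with shared tier: {with_shared}")
```

In this model the shared tier recomputes each session exactly once, no matter which host serves it, while the local-only setup recomputes every time a session lands on a host that has evicted it or never saw it.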

Production-Grade Proof
OCI published the full benchmark methodology, system configuration, and results on its AI & Data Science blog on May 13, 2026. The benchmarks, executed on a nine-node OCI bare-metal H100 cluster, move beyond the prior phase of validation, which demonstrated 1000x more KV cache capacity and up to 20x faster time to first token at 128,000 tokens. This latest phase tests the full economics of inference in production: concurrency density, sustained throughput, cache persistence, and service level objective (SLO) stability when demand spikes under high load.

Available on Oracle Marketplace
NeuralMesh with Augmented Memory Grid is generally available to WEKA customers and on the Oracle Marketplace, with OCI as WEKA’s exclusive cloud launch partner. Organizations running long-context inference on OCI can deploy a validated, production-ready architecture today. For more on the OCI and WEKA Augmented Memory Grid benchmark, read the OCI blog: https://blogs.oracle.com/ai-and-datascience/scaling-long-context-inference-on-oci-with-wekas-augmented-memory-grid.

About WEKA
WEKA is the AI data and memory infrastructure company transforming the economics of agentic AI. Its NeuralMesh™ platform unifies high-performance data storage with extended GPU memory, giving enterprises, AI cloud providers, and AI builders a single foundation for training, inference, and agentic workloads. With Augmented Memory Grid, NeuralMesh extends GPU memory capacity by 1000x, accelerates time to first token by up to 20x, and delivers 10x more concurrent users from the same GPU footprint, proven in production benchmarks. Trusted by 30% of the Fortune 50, WEKA enables organizations to scale AI faster, optimize GPU utilization, and reduce the cost of every token served. Learn more at http://www.weka.io or connect with us on LinkedIn and X.

WEKA and the W logo are registered trademarks of WekaIO, Inc. Other trade names herein may be trademarks of their respective owners.

View original content to download multimedia: https://www.prnewswire.co.uk/news-releases/weka-and-oracle-cloud-infrastructure-validate-10x-throughput-gains-for-long-context-ai-inference-302793740.html

About Web3Wire
Web3Wire – Information, news, press releases, events and research articles about Web3, Metaverse, Blockchain, Artificial Intelligence, Cryptocurrencies, Decentralized Finance, NFTs and Gaming.
Visit Web3Wire for Web3 News and Events, Block3Wire for the latest Blockchain news and Meta3Wire to stay updated with Metaverse News.

© 2024 Web3Wire. We strongly recommend our readers to DYOR, before investing in any cryptocurrencies, blockchain projects, or ICOs, particularly those that guarantee profits.
