Friday, March 6, 2026
  • About Web3Wire
  • Web3Wire NFTs
  • .w3w TLD
  • $W3W Token
  • Web3Wire DAO
  • Media Network
  • RSS Feed
  • Contact Us
Web3Wire
No Result
View All Result
  • Home
  • Web3
    • Latest
    • AI
    • Business
    • Blockchain
    • Cryptocurrencies
    • Decentralized Finance
    • Metaverse
    • Non-Fungible Token
    • Press Release
  • Technology
    • Consumer Tech
    • Digital Fashion
    • Editor’s Choice
    • Guides
    • Stories
  • Coins
    • Top 10 Coins
    • Top 50 Coins
    • Top 100 Coins
    • All Coins
  • Exchanges
    • Top 10 Crypto Exchanges
    • Top 50 Crypto Exchanges
    • Top 100 Crypto Exchanges
    • All Crypto Exchanges
  • Stocks
    • Blockchain Stocks
    • NFT Stocks
    • Metaverse Stocks
    • Artificial Intelligence Stocks
  • Events
  • News
    • Latest Crypto News
    • Latest DeFi News
    • Latest Web3 News
  • Home
  • Web3
    • Latest
    • AI
    • Business
    • Blockchain
    • Cryptocurrencies
    • Decentralized Finance
    • Metaverse
    • Non-Fungible Token
    • Press Release
  • Technology
    • Consumer Tech
    • Digital Fashion
    • Editor’s Choice
    • Guides
    • Stories
  • Coins
    • Top 10 Coins
    • Top 50 Coins
    • Top 100 Coins
    • All Coins
  • Exchanges
    • Top 10 Crypto Exchanges
    • Top 50 Crypto Exchanges
    • Top 100 Crypto Exchanges
    • All Crypto Exchanges
  • Stocks
    • Blockchain Stocks
    • NFT Stocks
    • Metaverse Stocks
    • Artificial Intelligence Stocks
  • Events
  • News
    • Latest Crypto News
    • Latest DeFi News
    • Latest Web3 News
No Result
View All Result
Web3Wire
No Result
View All Result
Home Press Release OpenPR

Z.ai Launches GLM-4.5V: Open-source Vision-Language Model Sets New Bar for Multimodal Reasoning

August 14, 2025
in OpenPR, Web3
Reading Time: 8 mins read
5
SHARES
259
VIEWS
Share on TwitterShare on LinkedInShare on Facebook
Z.ai Launches GLM-4.5V: Open-source Vision-Language Model

Z.ai (formerly Zhipu) today announced GLM-4.5V, an open-source vision-language model engineered for robust multimodal reasoning across images, video, long documents, charts, and GUI screens.

Multimodal reasoning is widely viewed as a key pathway toward AGI. GLM-4.5V advances that agenda with a 100B-class architecture (106B total parameters, 12B active) that pairs high accuracy with practical latency and deployment cost. The release follows July’s GLM-4.1V-9B-Thinking, which hit #1 on Hugging Face Trending and has surpassed 130,000 downloads, and scales that recipe to enterprise workloads while keeping developer ergonomics front and center. The model is accessible through multiple channels, including Hugging Face [http://huggingface.co/zai-org/GLM-4.5V], GitHub [http://github.com/zai-org/GLM-V], Z.ai API Platform [http://docs.z.ai/guides/vlm/glm-4.5v], and Z.ai Chat [http://chat.z.ai], ensuring broad developer access.

Open-Source SOTA

Built on the new GLM-4.5-Air text base and extending the GLM-4.1V-Thinking lineage, GLM-4.5V delivers SOTA performance among similarly sized open-source VLMs across 41 public multimodal evaluations. Beyond leaderboards, the model is engineered for real-world usability and reliability on noisy, high-resolution, and extreme-aspect-ratio inputs.

The result is all-scenario visual reasoning in practical pipelines: image reasoning (scene understanding, multi-image analysis, localization), video understanding (shot segmentation and event recognition), GUI tasks (screen reading, icon detection, desktop assistance), complex chart and long-document analysis (report understanding and information extraction), and precise grounding (accurate spatial localization of visual elements).

Image: https://www.globalnewslines.com/uploads/2025/08/1ca45a47819aaf6a111e702a896ee2bc.jpg

Key Capabilities

Visual grounding and localization

GLM-4.5V precisely identifies and locates target objects based on natural-language prompts and returns bounding coordinates. This enables high-value applications such as safety and quality inspection or aerial/remote-sensing analysis. Compared with conventional detectors, the model leverages broader world knowledge and stronger semantic reasoning to follow more complex localization instructions.

Users can switch to the Visual Positioning mode, upload an image and a short prompt, and get back the box and rationale. For example, ask “Point out any non-real objects in this picture.” GLM-4.5V reasons about plausibility and materials, then flags the insect-like sprinkler robot (the item highlighted in red in the demo) as non-real, returning a tight bounding box a confidence score, and a brief explanation.

Image: https://www.globalnewslines.com/uploads/2025/08/8dcbdd7939f12f7a2239bfbb0528b3f7.jpg

Design-to-code from screenshots and interaction videos

The model analyzes page screenshots-and even interaction videos-to infer hierarchy, layout rules, styles, and intent, then emits faithful, runnable HTML/CSS/JavaScript. Beyond element detection, it reconstructs the underlying logic and supports region-level edit requests, enabling an iterative loop between visual input and production-ready code.

Open-world image reasoning

GLM-4.5V can infer background context from subtle visual cues without external search. Given a landscape or street photo, it can reason from vegetation, climate traces, signage, and architectural styles to estimate the shooting location and approximate coordinates.

For example, using a classic scene from Before Sunrise -“Based on the architecture and streets in the background, can you identify the specific location in Vienna where this scene was filmed?”-the model parses facade details, street furniture, and layout cues to localize the exact spot in Vienna and return coordinates and a landmark name. (See demo: https://chat.z.ai/s/39233f25-8ce5-4488-9642-e07e7c638ef6).

Image: https://www.globalnewslines.com/uploads/2025/08/f51fdc9fae815cfaf720bb07467a54db.jpg

Beyond single images, GLM-4.5V’s open-world reasoning scales in competitive settings: in a global “Geo Game,” it beat 99% of human players within 16 hours and climbed to rank 66 within seven days-clear evidence of robust real-world performance.

Complex document and chart understanding

The model reads documents visually-pages, figures, tables, and charts-rather than relying on brittle OCR pipelines. That end-to-end approach preserves structure and layout, improving accuracy for summarization, translation, information extraction, and commentary across long, mixed-media reports.

GUI agent foundation

Built-in screen understanding lets GLM-4.5V read interfaces, locate icons and controls, and combine the current visual state with user instructions to plan actions. Paired with agent runtimes, it supports end-to-end desktop automation and complex GUI agent tasks, providing a dependable visual backbone for agentic systems.

Built for Reasoning, Designed for Use

GLM-4.5V is built on the new GLM-4.5-Air text base and uses a modern VLM pipeline-vision encoder, MLP adapter, and LLM decoder-with 64K multimodal context, native image and video inputs, and enhanced spatial-temporal modeling so the system handles high-resolution and extreme-aspect-ratio content with stability.

The training stack follows a three-stage strategy: large-scale multimodal pretraining on interleaved text-vision data and long contexts; supervised fine-tuning with explicit chain-of-thought formats to strengthen causal and cross-modal reasoning; and reinforcement learning that combines verifiable rewards with human feedback to lift STEM, grounding, and agentic behaviors. A simple thinking / non-thinking switch allows builders trade depth for speed on demand, aligning the model with varied product latency targets.

Image: https://www.globalnewslines.com/uploads/2025/08/8c8146f0727d80970ed4f09b16f3b316.jpg
Media Contact
Company Name: Z.ai
Contact Person: Zixuan Li
Email: Send Email [http://www.universalpressrelease.com/?pr=zai-launches-glm45v-opensource-visionlanguage-model-sets-new-bar-for-multimodal-reasoning]
Country: Singapore
Website: https://chat.z.ai/

Legal Disclaimer: Information contained on this page is provided by an independent third-party content provider. GetNews makes no warranties or responsibility or liability for the accuracy, content, images, videos, licenses, completeness, legality, or reliability of the information contained in this article. If you are affiliated with this article or have any complaints or copyright issues related to this article and would like it to be removed, please contact retract@swscontact.com

This release was published on openPR.

About Web3Wire
Web3Wire – Information, news, press releases, events and research articles about Web3, Metaverse, Blockchain, Artificial Intelligence, Cryptocurrencies, Decentralized Finance, NFTs and Gaming.
Visit Web3Wire for Web3 News and Events, Block3Wire for the latest Blockchain news and Meta3Wire to stay updated with Metaverse News.
ShareTweet1ShareSendShare2
Previous Post

TOHT Turns Employee Feedback into a Leading Economic Indicator – Predicting Organizational Risks Before They Impact the Bottom Line

Next Post

Why Does Getting Your Car Repaired Take So Long?

Related Posts

Global Speech and Voice Recognition Market: Generative AI and Voice Biometrics Propel Industry Toward USD 85.35 Billion by 2036

AI-powered voice interfaces and biometric authentication are transforming the global speech recognition market landscape. The global speech and voice recognition market is going through a period of genuine and rapid transformation as artificial intelligence continues to redefine how people interact with the technology around them. The market was valued at...

Read moreDetails

Global Speech and Voice Recognition Market: Generative AI and Voice Biometrics Propel Industry Toward USD 85.35 Billion by 2036

AI-powered voice interfaces and biometric authentication are transforming the global speech recognition market landscape. The global speech and voice recognition market is going through a period of genuine and rapid transformation as artificial intelligence continues to redefine how people interact with the technology around them. The market was valued at...

Read moreDetails

Generative AI Market Expected to Reach US$ 1,022.41 Billion by 2032 as Intelligent Automation Transforms Business Operations Across Industries

Generative AI Market The Demand for advanced artificial intelligence technologies is growing rapidly as organizations seek smarter ways to automate tasks, generate insights, and enhance customer experiences. Generative AI, a rapidly evolving segment of artificial intelligence capable of producing text, images, code, audio, and video content, is becoming a critical...

Read moreDetails

Generative AI Market Expected to Reach US$ 1,022.41 Billion by 2032 as Intelligent Automation Transforms Business Operations Across Industries

Generative AI Market The Demand for advanced artificial intelligence technologies is growing rapidly as organizations seek smarter ways to automate tasks, generate insights, and enhance customer experiences. Generative AI, a rapidly evolving segment of artificial intelligence capable of producing text, images, code, audio, and video content, is becoming a critical...

Read moreDetails

Recycle Yarn Market to Reach USD 10.28 Billion at 8.09% CAGR Through 2033 | Key Players: EcoSpun Fiber Technologies, GreenThread Yarn Industries, CircularWeave Textile Corp., ReThread Sustainable Yarns, FiberLoop Recycled Textiles Group

Recycle Yarn Market According to a new study by DataHorizzon Research, the Recycle Yarn Market is projected to grow at a CAGR of 8.09% from 2025 to 2033. This exceptional expansion is driven by accelerating adoption of recycled fiber yarn across apparel, activewear, home textiles, industrial fabric, and technical textile...

Read moreDetails

Recycle Yarn Market to Reach USD 10.28 Billion at 8.09% CAGR Through 2033 | Key Players: EcoSpun Fiber Technologies, GreenThread Yarn Industries, CircularWeave Textile Corp., ReThread Sustainable Yarns, FiberLoop Recycled Textiles Group

Recycle Yarn Market According to a new study by DataHorizzon Research, the Recycle Yarn Market is projected to grow at a CAGR of 8.09% from 2025 to 2033. This exceptional expansion is driven by accelerating adoption of recycled fiber yarn across apparel, activewear, home textiles, industrial fabric, and technical textile...

Read moreDetails

Die Casting Machines Market to Reach USD 7.19 Billion at 6.2% CAGR Through 2033 | Key Players: CastForce Industrial Systems, MoldDrive Manufacturing Corp., PressForm Die Technologies, MetalCast Equipment Group, AlloyPress Industrial Solutions

Die Casting Machines Market According to a new study by DataHorizzon Research, the Die Casting Machines Market is projected to grow at a CAGR of 6.2% from 2025 to 2033. This robust expansion is driven by accelerating global demand for high-pressure die casting equipment across automotive aluminum structural component manufacturing,...

Read moreDetails

Die Casting Machines Market to Reach USD 7.19 Billion at 6.2% CAGR Through 2033 | Key Players: CastForce Industrial Systems, MoldDrive Manufacturing Corp., PressForm Die Technologies, MetalCast Equipment Group, AlloyPress Industrial Solutions

Die Casting Machines Market According to a new study by DataHorizzon Research, the Die Casting Machines Market is projected to grow at a CAGR of 6.2% from 2025 to 2033. This robust expansion is driven by accelerating global demand for high-pressure die casting equipment across automotive aluminum structural component manufacturing,...

Read moreDetails

Diamond Color Sorting Machine Market to Reach USD 4.1 Billion at 7.5% CAGR Through 2033 | Key Players: GemSort Optical Technologies, CrystalGrade Sorting Systems, DiamondVision Grading Equipment, SpectralGem Sorting Corp., ColorMaster Diamond Technologies

Diamond Color Sorting Machine Market According to a new study by DataHorizzon Research, the Diamond Color Sorting Machine Market is projected to grow at a CAGR of 7.5% from 2025 to 2033. This strong expansion is driven by rising global demand for automated, objective, and high-throughput diamond color grading and...

Read moreDetails

Diamond Color Sorting Machine Market to Reach USD 4.1 Billion at 7.5% CAGR Through 2033 | Key Players: GemSort Optical Technologies, CrystalGrade Sorting Systems, DiamondVision Grading Equipment, SpectralGem Sorting Corp., ColorMaster Diamond Technologies

Diamond Color Sorting Machine Market According to a new study by DataHorizzon Research, the Diamond Color Sorting Machine Market is projected to grow at a CAGR of 7.5% from 2025 to 2033. This strong expansion is driven by rising global demand for automated, objective, and high-throughput diamond color grading and...

Read moreDetails
Web3Wire NFTs - The Web3 Collective

Web3Wire, $W3W Token and .w3w tld Whitepaper

Web3Wire, $W3W Token and .w3w tld Whitepaper

Claim your space in Web3 with .w3w Domain!

Web3Wire

Trending on Web3Wire

  • Unifying Blockchain Ecosystems: 2024 Guide to Cross-Chain Interoperability

    154 shares
    Share 62 Tweet 39
  • Top 5 Wallets for Seamless Multi-Chain Trading in 2025

    79 shares
    Share 32 Tweet 20
  • Understanding Soulbound Tokens SBT Their Definition and Significance

    48 shares
    Share 19 Tweet 12
  • Molt.id: The First AI Agent Domain System on Solana — Where One NFT Gives You Everything

    6 shares
    Share 2 Tweet 2
  • Top Cross-Chain DeFi Solutions to Watch by 2025

    82 shares
    Share 33 Tweet 21
Join our Web3Wire Community!

Our newsletters are only twice a month, reaching around 10000+ Blockchain Companies, 800 Web3 VCs, 600 Blockchain Journalists and Media Houses.


* We wont pass your details on to anyone else and we hate spam as much as you do. By clicking the signup button you agree to our Terms of Use and Privacy Policy.

Web3Wire Podcasts

Upcoming Events

There are currently no events.

Latest on Web3Wire

  • Teen CTO’s Mental Health App Wins America’s Top Young Innovator 2025
  • Teen CTO’s Mental Health App Wins America’s Top Young Innovator 2025
  • Global Speech and Voice Recognition Market: Generative AI and Voice Biometrics Propel Industry Toward USD 85.35 Billion by 2036
  • Global Speech and Voice Recognition Market: Generative AI and Voice Biometrics Propel Industry Toward USD 85.35 Billion by 2036
  • Generative AI Market Expected to Reach US$ 1,022.41 Billion by 2032 as Intelligent Automation Transforms Business Operations Across Industries

RSS Latest on Block3Wire

  • Covo Finance: Revolutionary Crypto Leverage Trading Platform
  • WorldStrides and HEX Announce Partnership to Offer High School and University Students Innovative Courses Designed to Improve Their Outlook in the Digital Age
  • Cathedra Bitcoin Announces Leasing of 2.5-MW Bitcoin Mining Facility
  • Global Web3 Payments Leader, Banxa, Announces Integration With Metis to Usher In Next Wave of Cryptocurrency Users
  • Dexalot Launches First Hybrid DeFi Subnet on Avalanche

RSS Latest on Meta3Wire

  • Thumbtack Honored as a 2023 Transform Awards Winner
  • Accenture Invests in Looking Glass to Accelerate Shift from 2D to 3D
  • MetatronAI.com Unveils Revolutionary AI-Chat Features and Interface Upgrades
  • Purely.website – Disruptive new platform combats rising web hosting costs
  • WEMADE and Metagravity Sign Strategic Alliance MOU to Collaborate on Blockchain Games for the Metaverse
Web3Wire

Web3Wire is your go-to source for the latest insights and updates in Web3, Metaverse, Blockchain, AI, Cryptocurrencies, DeFi, NFTs, and Gaming. We provide comprehensive coverage through news, press releases, event updates, and research articles, keeping you informed about the rapidly evolving digital world.

  • About Web3Wire
  • Founder’s Note
  • Web3Wire NFTs – The Web3 Collective
  • .w3w TLD
  • $W3W Token
  • Web3Wire DAO
  • Event Partners
  • Community Partners
  • Our Media Network
  • Media Kit
  • RSS Feeds
  • Contact Us

Crypto Coins

  • Top 10 Coins
  • Top 50 Coins
  • Top 100 Coins
  • All Coins – Marketcap
  • Crypto Coins Heatmap

Crypto Exchanges

  • Top 10 Exchanges
  • Top 50 Exchanges
  • Top 100 Exchanges
  • All Crypto Exchanges

Crypto Stocks

  • Blockchain Stocks
  • NFT Stocks
  • Metaverse Stocks
  • Artificial Intelligence Stocks

Web3Wire Whitepaper | Tokenomics

Web3 Resources

  • Top Web3 and Crypto Youtube Channels
  • Latest Crypto News
  • Latest DeFi News
  • Latest Web3 News

Blockchain Resources

  • Blockchain and Web3 Resources
  • Decentralized Finance (DeFi) – Research Reports
  • All Crypto Whitepapers

Metaverse Resources

  • AR VR and Metaverse Resources
  • Metaverse Courses
Claim your space in Web3 with .w3w!

The Klyrox Protocol | The Algorithmic Monographs

Top 50 Web3 Blogs and Websites
Web3Wire Podcast on Spotify Web3Wire Podcast on Amazon Music 
Web3Wire - Web3 and Blockchain - News, Events and Press Releases | Product Hunt
Web3Wire on Google News

Media Portfolio: Block3Wire | Meta3Wire

  • Privacy Policy
  • Terms of Use
  • Disclaimer
  • Sitemap
  • For Search Engines
  • Crypto Sitemap
  • Exchanges Sitemap

© 2024 Web3Wire. We strongly recommend our readers to DYOR, before investing in any cryptocurrencies, blockchain projects, or ICOs, particularly those that guarantee profits.

Welcome Back!

Login to your account below

Forgotten Password?

Retrieve your password

Please enter your username or email address to reset your password.

Log In

Add New Playlist

No Result
View All Result
  • Coins
    • Top 10 Cryptocurrencies
    • Top 50 Cryptocurrencies
    • Top 100 Cryptocurrencies
    • All Coins
  • Exchanges
    • Top 10 Cryptocurrency Exchanges
    • Top 50 Cryptocurrency Exchanges
    • Top 100 Cryptocurrency Exchanges
    • All Crypto Exchanges
  • Stocks
    • Blockchain Stocks
    • NFT Stocks
    • Metaverse Stocks
    • Artificial Intelligence Stocks

© 2024 Web3Wire. We strongly recommend our readers to DYOR, before investing in any cryptocurrencies, blockchain projects, or ICOs, particularly those that guarantee profits.

This website uses cookies. By continuing to use this website you are giving consent to cookies being used. Visit our Privacy and Cookie Policy.