A popular technique to make AI more efficient has drawbacks

Gaylord Contreras

One of the most widely used techniques to make AI models more efficient, quantization, has limits — and the industry could be fast approaching them.

In the context of AI, quantization refers to lowering the number of bits — the smallest units a computer can process — needed to represent information. Consider this analogy: When someone asks the time, you’d probably say “noon” — not “oh twelve hundred, one second, and four milliseconds.” That’s quantizing; both answers are correct, but one is slightly more precise. How much precision you actually need depends on the context.
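The clock analogy can be written out as a rough sketch. The helper `snap` below is hypothetical and exists only for illustration: it rounds a time of day, measured in seconds since midnight, to a coarser grid and reports how much precision was given up.

```python
# Hypothetical helper (not from the study): snap a time of day, given in
# seconds since midnight, to the nearest multiple of `step` seconds.
def snap(seconds: float, step: int) -> int:
    return round(seconds / step) * step

# 12:00:01.004, i.e. "oh twelve hundred, one second, and four milliseconds"
exact = 12 * 3600 + 1 + 0.004

to_hour = snap(exact, 3600)   # coarse answer: "noon"
to_second = snap(exact, 1)    # finer answer: 12:00:01

# The coarser the grid, the larger the error that has to be tolerated.
error_hour = abs(exact - to_hour)
error_second = abs(exact - to_second)
print(to_hour, error_hour)
print(to_second, error_second)
```

Whether an error of about a second matters depends entirely on who is asking, which is the point of the analogy.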

AI models consist of several components that can be quantized, in particular parameters, the internal variables models use to make predictions or decisions. This is convenient, considering models perform millions of calculations when run. Quantized models, with fewer bits representing their parameters, are less demanding mathematically and therefore computationally. (To be clear, this is a different process from “distilling,” in which a smaller model is trained to reproduce the behavior of a larger one.)
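To see why fewer bits per parameter lightens the load, consider storage alone. The sketch below assumes a hypothetical 8-billion-parameter model (a size picked for illustration, not taken from the study) and counts only the bytes needed to hold the parameters, ignoring everything else a running model keeps in memory.

```python
def weight_bytes(num_params: int, bits_per_param: int) -> float:
    """Bytes needed to store the parameters alone (no activations or overhead)."""
    return num_params * bits_per_param / 8

NUM_PARAMS = 8_000_000_000  # hypothetical model size, chosen for illustration

for bits in (16, 8, 4):
    gigabytes = weight_bytes(NUM_PARAMS, bits) / 1e9
    print(f"{bits:>2}-bit: {gigabytes:.0f} GB")
```

Halving the bit width halves the storage for the parameters, which is a large part of the appeal.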

But quantization may have more trade-offs than previously assumed.

The ever-shrinking model

According to a study from researchers at Harvard, Stanford, MIT, Databricks, and Carnegie Mellon, quantized models perform worse if the original, unquantized version of the model was trained over a long period on lots of data. In other words, at a certain point, it may actually be better to just train a smaller model rather than cook down a big one.

That could spell bad news for AI companies training extremely large models (known to improve answer quality) and then quantizing them in an effort to make them less expensive to serve.

The effects are already manifesting. A few months ago, developers and academics reported that quantizing Meta’s Llama 3 model tended to be “more harmful” compared to other models, potentially due to the way it was trained.

“In my opinion, the number one cost for everyone in AI is and will continue to be inference, and our work shows one important way to reduce it will not work forever,” Tanishq Kumar, a Harvard mathematics student and the first author on the paper, told TechCrunch.

Contrary to popular belief, AI model inferencing — running a model, like when ChatGPT answers a question — is often more expensive in aggregate than model training. Consider, for example, that Google spent an estimated $191 million to train one of its flagship Gemini models — certainly a princely sum. But if the company were to use a model to generate just 50-word answers to half of all Google Search queries, it’d spend roughly $6 billion a year.

Major AI labs have embraced training models on massive datasets under the assumption that “scaling up” (increasing the amount of data and compute used in training) will lead to increasingly capable AI.

For example, Meta trained Llama 3 on a set of 15 trillion tokens. (Tokens represent bits of raw data; 1 million tokens is equal to about 750,000 words.) The previous generation, Llama 2, was trained on “only” 2 trillion tokens.
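For a sense of scale, the rule of thumb above can be applied to both token counts. This is back-of-the-envelope arithmetic; the words-per-token ratio is the article's approximation, and real ratios vary by tokenizer and language.

```python
# Rule of thumb from the text: 1 million tokens is about 750,000 words.
WORDS_PER_TOKEN = 750_000 / 1_000_000

llama3_tokens = 15 * 10**12   # 15 trillion
llama2_tokens = 2 * 10**12    # 2 trillion

llama3_words = llama3_tokens * WORDS_PER_TOKEN
llama2_words = llama2_tokens * WORDS_PER_TOKEN
growth = llama3_tokens / llama2_tokens

print(f"Llama 3: about {llama3_words / 1e12:.2f} trillion words")
print(f"Llama 2: about {llama2_words / 1e12:.2f} trillion words")
print(f"Growth between generations: {growth}x")
```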

Evidence suggests that scaling up eventually provides diminishing returns; Anthropic and Google recently trained enormous models that reportedly fell short of internal benchmark expectations. But there’s little sign that the industry is ready to meaningfully move away from these entrenched scaling approaches.

How precise, exactly?

So, if labs are reluctant to train models on smaller datasets, is there a way models could be made less susceptible to degradation? Possibly. Kumar says that he and co-authors found that training models in “low precision” can make them more robust. Bear with us for a moment as we dive in a bit.

“Precision” here refers to the number of digits a numerical data type can represent accurately. Data types are collections of data values, usually specified by a set of possible values and allowed operations; the data type FP8, for example, uses only 8 bits to represent a floating-point number.
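A small experiment shows what limited precision looks like in practice. Python's standard library has no FP8 type, so this sketch uses 16-bit half precision (the `e` format of the `struct` module) as a stand-in: a value is stored in 16 bits and read back, and the result is close to the original but not identical.

```python
import struct

def round_trip_fp16(x: float) -> float:
    """Store x as an IEEE 754 half-precision float, then read it back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

value = 0.1
stored = round_trip_fp16(value)
print(stored, abs(value - stored))

# Number of distinct bit patterns at each width. This is an upper bound on
# how many different values a format can hold; in some formats a few
# patterns are reserved for special values such as NaN or infinity.
patterns = {bits: 2 ** bits for bits in (16, 8, 4)}
print(patterns)
```

An 8-bit type has only 256 bit patterns to work with and a 4-bit type only 16, so the gaps between representable values grow quickly as the width shrinks.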

Most models today are trained at 16-bit or “half precision” and “post-train quantized” to 8-bit precision. In post-training quantization, certain model components (e.g., the parameters) are converted to a lower-precision format at the cost of some accuracy. Think of it like doing the math to a few decimal places but then rounding off to the nearest 10th, often giving you the best of both worlds.
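Post-training quantization can be sketched in a few lines. The example below uses symmetric “absmax” scaling to 8-bit integers, one common and simple scheme; it is an assumption made for illustration, not necessarily the method any particular lab or the study uses, and the weights are made up.

```python
def quantize_int8(weights):
    """Symmetric absmax quantization: map floats to integers in [-127, 127]."""
    # Fall back to a scale of 1.0 if every weight is zero, to avoid dividing by zero.
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map the stored integers back to approximate float values."""
    return [v * scale for v in q]

weights = [0.42, -1.27, 0.003, 0.9, -0.55]  # made-up values for illustration
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding to the nearest grid point costs at most half a step per weight.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q)
print(max_error, "<=", scale / 2)
```

Only the integers and a single scale factor need to be stored, which is where the savings come from; the small weight 0.003 rounds to zero, which is the kind of accuracy that gets traded away.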

Hardware vendors like Nvidia are pushing for lower precision for quantized model inference. The company’s new Blackwell chip supports 4-bit precision, specifically a data type called FP4; Nvidia has pitched this as a boon for memory- and power-constrained data centers.

But extremely low quantization precision might not be desirable. According to Kumar, unless the original model is incredibly large in terms of its parameter count, precisions lower than 7- or 8-bit may see a noticeable step down in quality.
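The rounding error behind that step down is easy to measure on toy data. The sketch below pushes the same stand-in “weights” (values of a sine function, not real model parameters) through grids of decreasing bit width. It illustrates only the raw rounding error, not model quality, and does not reproduce the study's results.

```python
import math

def quantize_dequantize(weights, bits):
    """Round weights onto a symmetric integer grid with `bits` bits, then map back."""
    levels = 2 ** (bits - 1) - 1   # e.g. 127 for 8-bit, 7 for 4-bit
    scale = max(abs(w) for w in weights) / levels
    return [round(w / scale) * scale for w in weights]

# Deterministic stand-in for model weights (not real parameters).
weights = [math.sin(i) for i in range(1, 1001)]

mean_error = {}
for bits in (8, 6, 4, 3):
    restored = quantize_dequantize(weights, bits)
    mean_error[bits] = sum(abs(w - r) for w, r in zip(weights, restored)) / len(weights)
    print(bits, mean_error[bits])
```

Each bit removed roughly doubles the spacing of the grid, so the average error climbs quickly at the low end.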

If this all seems a little technical, don’t worry — it is. But the takeaway is simply that AI models are not fully understood, and known shortcuts that work in many kinds of computation don’t work here. You wouldn’t say “noon” if someone asked when they started a 100-meter dash, right? It’s not quite so obvious as that, of course, but the idea is the same:

“The key point of our work is that there are limitations you cannot naïvely get around,” Kumar concluded. “We hope our work adds nuance to the discussion that often seeks increasingly low precision defaults for training and inference.”

Kumar acknowledges that his and his colleagues’ study was at relatively small scale — they plan to test it with more models in the future. But he believes that at least one insight will hold: There’s no free lunch when it comes to reducing inference costs.

“Bit precision matters, and it’s not free,” he said. “You cannot reduce it forever without models suffering. Models have finite capacity, so rather than trying to fit a quadrillion tokens into a small model, in my opinion much more effort will be put into meticulous data curation and filtering, so that only the highest quality data is put into smaller models. I am optimistic that new architectures that deliberately aim to make low precision training stable will be important in the future.”

This article is written by: Nermeen Nabil Khear Abdelmalak
