My Account Log in

1 option

Hands-on LLM serving and optimization / by Chi Wang and Peiheng Hu.

O'Reilly Online Learning: Academic/Public Library Edition Available online

View online
Format:
Book
Author/Creator:
Wang, Chi, author.
Hu, Peiheng, author.
Language:
English
Subjects (All):
Natural language generation (Computer science).
Artificial intelligence.
Physical Description:
1 online resource (314 pages)
Edition:
[First edition].
Place of Publication:
[Sebastopol, California] : O'Reilly Media, Inc., [2025]
Summary:
Large language models (LLMs) are rapidly becoming the backbone of AI-driven applications. Without proper optimization, however, LLMs can be expensive to run, slow to serve, and prone to performance bottlenecks. As the demand for real-time AI applications grows, along comes Hands-On Serving and Optimizing LLM Models, a comprehensive guide to the complexities of deploying and optimizing LLMs at scale. In this hands-on book, authors Chi Wang and Peiheng Hu take a real-world approach backed by practical examples and code, and assemble essential strategies for designing robust infrastructures that are equal to the demands of modern AI applications. Whether you're building high-performance AI systems or looking to enhance your knowledge of LLM optimization, this indispensable book will serve as a pillar of your success. Learn the key principles for designing a model-serving system tailored to popular business scenarios Understand the common challenges of hosting LLMs at scale while minimizing costs Pick up practical techniques for optimizing LLM serving performance Build a model-serving system that meets specific business requirements Improve LLM serving throughput and reduce latency Host LLMs in a cost-effective manner, balancing performance and resource efficiency.
Notes:
OCLC-licensed vendor bibliographic record.
OCLC:
1572438404

The Penn Libraries is committed to describing library materials using current, accurate, and responsible language. If you discover outdated or inaccurate language, please fill out this feedback form to report it and suggest alternative language.

My Account

Shelf Request an item Bookmarks Fines and fees Settings

Guides

Using the Library Catalog Using Articles+ Library Account