Book Detail
Author/Editor(s): Nathan Marz, James Warren
Publication Date: May 10, 2015
Publisher: Manning Publications
Size: 19.8 MB
Format: pdf, epub, mobi
Book Description
Web-scale applications like social networks, real-time analytics, or e-commerce sites deal with a lot of data, whose volume and velocity exceed the limits of traditional database systems. These applications require architectures built around clusters of machines to store and process data of any size or speed. Fortunately, scale and simplicity are not mutually exclusive.
Big Data teaches you to build big data systems using an architecture designed specifically to capture and analyze web-scale data. This book presents the Lambda Architecture, a scalable, easy-to-understand approach that can be built and run by a small team. You'll explore the theory of big data systems and how to implement them in practice. In addition to discovering a general framework for processing big data, you'll learn specific technologies like Hadoop, Storm, and NoSQL databases.
This book requires no previous exposure to large-scale data analysis or NoSQL tools. Familiarity with traditional databases is helpful.
I have rarely seen a thorough discussion of the importance of data modeling, data layers, data processing requirements analysis, and data architecture and storage implementation issues (along with other "traditional" database concepts) in the context of big data. This book delivers a refreshing comprehensive solution to that deficiency. Other books in this area tend to focus a lot more on the "gee whiz" coolness of data science and machine learning applications (which are aspects of big data that I happen to love, but they are not the whole story). You cannot hope to achieve good, effective, and efficient results from your analytics processes without good data flow, from discovery to access to integration, which is why architecture design, data modeling, and attention to data pipelining are essential. I highly recommend this book for anyone who isn't ashamed to admit that data engineering is at least as important as data science in the big data era (says this data scientist!).
--Kirk D. Borne, Amazon Customer Reviews
Deep and detailed description of a complete solution for a massive data treatment system. This kind of architecture is really effective and in fact was applied by me about 20 years ago (when there were no "big data" systems) to a real-time historical stock exchange system, with the limited resources at that time. A nice update to my knowledge I hope to apply soon.
--Carlos Roldan Gerzenstein, Amazon Customer Reviews