Mastering Apache Cassandra 3.x : an expert guide to improving database scalability and availability without compromising performance / Aaron Ploetz, Tejaswi Malepati, Nishant Neeraj.

Ebook Central College Complete Available online

O'Reilly Online Learning: Academic/Public Library Edition Available online

Format:: Book
Author/Creator:: Ploetz, Aaron, author.; Malepati, Tejaswi, author.; Neeraj, Nishant, author.
Language:: English
Subjects (All):: Apache Cassandra.; Non-relational databases.; Application software--Development.; Application software.
Physical Description:: 1 online resource (348 pages)
Edition:: Third edition.
Other Title:: Mastering Apache Cassandra three point x
Place of Publication:: Birmingham : Packt, 2018.
System Details:: text file
Summary:: Build, manage, and configure a high-performing, reliable NoSQL database for your applications with Cassandra. Key Features: Write programs more efficiently using Cassandra's features with the help of examples; configure Cassandra and fine-tune its parameters depending on your needs; integrate a Cassandra database with Apache Spark and build a strong data analytics pipeline. Book Description: With ever-increasing rates of data creation, storing data quickly and reliably has become a necessity. Apache Cassandra is the perfect choice for building fault-tolerant and scalable databases. Mastering Apache Cassandra 3.x teaches you how to build and architect your clusters, configure and work with your nodes, and program in a high-throughput environment, helping you understand the power of Cassandra through its new features. Once you've covered a brief recap of the basics, you'll move on to deploying and monitoring a production setup, optimizing it, and integrating it with other software. You'll work with the advanced features of CQL and the new storage engine in order to understand how they function on the server side. You'll explore the integration and interaction of Cassandra components, and then study features such as the token allocation algorithm, CQL3, vnodes, lightweight transactions, and data modelling in detail. Finally, you will get to grips with Apache Spark. By the end of this book, you'll be able to analyse big data, and build and manage high-performance databases for your application.
What you will learn: Write programs more efficiently using Cassandra's features; exploit the given infrastructure, improve performance, and tweak the Java Virtual Machine (JVM); use CQL3 in your application in order to simplify working with Cassandra; configure Cassandra and fine-tune its parameters depending on your needs; set up a cluster and learn how to scale it; monitor a Cassandra cluster in different ways; use Apache Spark and other big data processing tools. Who this book is for: Mastering Apache Cassandra 3.x is for you if you are a big data administrator, database administrator, architect, or developer who wants to build a high-performing, scalable, and fault-tolerant database. Prior knowledge of core database concepts is required.
Contents:: Cover; Title Page; Copyright and Credits; Packt Upsell; Foreword; Contributors; Table of Contents; Preface; Chapter 1: Quick Start; Introduction to Cassandra; High availability; Distributed; Partitioned row store; Installation; Configuration; cassandra.yaml; cassandra-rackdc.properties; Starting Cassandra; Cassandra Cluster Manager; A quick introduction to the data model; Using Cassandra with cqlsh; Shutting down Cassandra; Summary; Chapter 2: Cassandra Architecture; Why was Cassandra created?; RDBMS and problems at scale; Cassandra and the CAP theorem; Cassandra's ring architecture; Partitioners; ByteOrderedPartitioner; RandomPartitioner; Murmur3Partitioner; Single token range per node; Vnodes; Cassandra's write path; Cassandra's read path; On-disk storage; SSTables; How data was structured in prior versions; How data is structured in newer versions; Additional components of Cassandra; Gossiper; Snitch; Phi failure-detector; Tombstones; Hinted handoff; Compaction; Repair; Merkle tree calculation; Streaming data; Read repair; Security; Authentication; Authorization; Managing roles; Client-to-node SSL; Node-to-node SSL; Chapter 3: Effective CQL; An overview of Cassandra data modeling; Cassandra storage model for versions 3.0 and beyond; Data cells; cqlsh; Logging into cqlsh; Problems connecting to cqlsh; Local cluster without security enabled; Remote cluster with user security enabled; Remote cluster with auth and SSL enabled; Connecting with cqlsh over SSL; Converting the Java keyStore into a PKCS12 keyStore; Exporting the certificate from the PKCS12 keyStore; Modifying your cqlshrc file; Testing your connection via cqlsh; Getting started with CQL; Creating a keyspace; Single data center example; Multi-data center example; Creating a table; Simple table example; Clustering key example; Composite partition key example; Table options; Data types; Type conversion; The primary key; Designing a primary key; Selecting a good partition key; Selecting a good clustering key; Querying data; The IN operator; Writing data; Inserting data; Updating data; Deleting data; Lightweight transactions; Executing a BATCH statement; The expiring cell; Altering a keyspace; Dropping a keyspace; Altering a table; Truncating a table; Dropping a table; Truncate versus drop; Creating an index; Caution with implementing secondary indexes; Dropping an index; Creating a custom data type; Altering a custom type; Dropping a custom type; User management; Creating a user and role; Altering a user and role; Dropping a user and role; Granting permissions; Revoking permissions; Other CQL commands; COUNT; DISTINCT; LIMIT; STATIC; User-defined functions; cqlsh commands; CONSISTENCY; COPY; DESCRIBE; TRACING; Chapter 4: Configuring a Cluster; Evaluating instance requirements; RAM; CPU; Disk; Solid state drives; Cloud storage offerings; SAN and NAS; Network; Public cloud networks; Firewall considerations; Strategy for many small instances versus few large instances; Operating system optimizations; Disable swap; XFS; Limits; limits.conf; sysctl.conf; Time synchronization; Configuring the JVM; Garbage collection; CMS; G1GC; Garbage collection with Cassandra; Installation of JVM; JCE; Configuring Cassandra; cassandra-env.sh; dc; rack; dc_suffix; prefer_local; cassandra-topology.properties; jvm.options; logback.xml; Managing a deployment pipeline; Orchestration tools; Configuration management tools; Recommended approach; Local repository for downloadable files; Chapter 5: Performance Tuning; Cassandra-Stress; The Cassandra-Stress YAML file; name; size; population; cluster; Cassandra-Stress results; Write performance; Commitlog mount point; Scaling out; Scaling out a data center; Read performance; Compaction strategy selection; Optimizing read throughput for time-series models; Optimizing tables for read-heavy models; Cache settings; Appropriate uses for row-caching; Compression; Chunk size; The bloom filter configuration; Read performance issues; Other performance considerations; JVM configuration; Cassandra anti-patterns; Building a queue; Query flexibility; Querying an entire table; Incorrect use of BATCH; Chapter 6: Managing a Cluster; Revisiting nodetool; A warning about using nodetool; Scaling up; Adding nodes to a cluster; Cleaning up the original nodes; Adding a new data center; Adjusting the cassandra-rackdc.properties file; A warning about SimpleStrategy; Scaling down; Removing nodes from a cluster; Removing a live node; Removing a dead node; Other removenode options; When removenode doesn't work (nodetool assassinate); Assassinating a node on an older version; Removing a data center; Backing up and restoring data; Taking snapshots; Enabling incremental backups; Recovering from snapshots; Maintenance; Replacing a node; A warning about incremental repairs; Cassandra Reaper; Forcing read repairs at consistency - ALL; Clearing snapshots and incremental backups; Snapshots; Incremental backups; Compaction; Why you should never invoke compaction manually; Adjusting compaction throughput due to available resources; Chapter 7: Monitoring; JMX interface; MBean packages exposed by Cassandra; JConsole (GUI); Connection and overview; Viewing metrics; Performing an operation; JMXTerm (CLI); Connection and domains; Getting a metric; The nodetool utility; Monitoring using nodetool; describecluster; gcstats; getcompactionthreshold; getcompactionthroughput; getconcurrentcompactors; getendpoints; getlogginglevels; getstreamthroughput; gettimeout; gossipinfo; info; netstats; proxyhistograms; status; tablestats; tpstats; verify; Administering using nodetool; cleanup; drain; flush; resetlocalschema; stopdaemon; truncatehints; upgradeSSTable; Metric stack; Telegraf; JMXTrans; InfluxDB; InfluxDB CLI; Grafana; Visualization; Alerting; Custom setup; Log stack; The system/debug/gc logs; Filebeat; Elasticsearch; Kibana; Troubleshooting; High CPU usage; Different garbage-collection patterns; Hotspots; Disk performance; Node flakiness; All-in-one Docker; Creating a database and other monitoring components locally; Web links; Chapter 8: Application Development; Getting started; The path to failure; Is Cassandra the right database?; Good use cases for Apache Cassandra; Use and expectations around application data consistency; Choosing the right driver; Building a Java application; Driver dependency configuration with Apache Maven; Connection class; Other connection options; Retry policy; Default keyspace; Port; SSL; Connection pooling options; Starting simple - Hello World!; Using the object mapper; Building a data loader; Asynchronous operations; Data loader example; Chapter 9: Integration with Apache Spark; Spark; Architecture; Running custom Spark Docker locally; The web UI; Master; Worker; Application; PySpark; Connection config; Accessing Cassandra data; SparkR; RStudio; Jupyter; Web UI; PySpark through Jupyter; Appendix: References; Chapter 1 - Quick Start; Chapter 2 - Cassandra Architecture; Chapter 3 - Effective CQL; Chapter 4 - Configuring a Cluster; Chapter 5 - Performance Tuning; Chapter 6 - Managing a Cluster; Chapter 7 - Monitoring; Chapter 8 - Application Development; Chapter 9 - Integration with Apache Spark; Other Books You May Enjoy; Index.
Notes:: Includes index.; Includes bibliographical references.; Description based on online resource; title from PDF title page (ebrary, viewed February 20, 2019).
OCLC:: 1086104365

