Mastering Apache Cassandra 3.x : an expert guide to improving database scalability and availability without compromising performance / Aaron Ploetz, Tejaswi Malepati, Nishant Neeraj.

Ebook Central College Complete Available online

O'Reilly Online Learning: Academic/Public Library Edition Available online

Format:
Book
Author/Creator:
Ploetz, Aaron, author.
Malepati, Tejaswi, author.
Neeraj, Nishant, author.
Language:
English
Subjects (All):
Apache Cassandra.
Non-relational databases.
Application software--Development.
Application software.
Physical Description:
1 online resource (348 pages)
Edition:
Third edition.
Other Title:
Mastering Apache Cassandra three point x
Place of Publication:
Birmingham : Packt, 2018.
System Details:
text file
Summary:
Build, manage, and configure a high-performing, reliable NoSQL database for your applications with Cassandra.
Key Features
Write programs more efficiently using Cassandra's features with the help of examples
Configure Cassandra and fine-tune its parameters depending on your needs
Integrate the Cassandra database with Apache Spark and build a strong data analytics pipeline
Book Description
With ever-increasing rates of data creation, the need to store data quickly and reliably keeps growing. Apache Cassandra is the perfect choice for building fault-tolerant and scalable databases. Mastering Apache Cassandra 3.x teaches you how to build and architect your clusters, configure and work with your nodes, and program in a high-throughput environment, helping you understand the power of Cassandra's new features. Once you've covered a brief recap of the basics, you'll move on to deploying and monitoring a production setup, then optimizing it and integrating it with other software. You'll work with the advanced features of CQL and the new storage engine in order to understand how they function on the server side. You'll explore the integration and interaction of Cassandra components, then examine features such as the token allocation algorithm, CQL3, vnodes, lightweight transactions, and data modelling in detail. Last but not least, you will get to grips with Apache Spark. By the end of this book, you'll be able to analyse big data, and build and manage high-performance databases for your application.
What you will learn
Write programs more efficiently using Cassandra's features
Exploit the given infrastructure, improve performance, and tweak the Java Virtual Machine (JVM)
Use CQL3 in your application in order to simplify working with Cassandra
Configure Cassandra and fine-tune its parameters depending on your needs
Set up a cluster and learn how to scale it
Monitor a Cassandra cluster in different ways
Use Apache Spark and other big data processing tools
Who this book is for
Mastering Apache Cassandra 3.x is for you if you are a big data administrator, database administrator, architect, or developer who wants to build a high-performing, scalable, and fault-tolerant database. Prior knowledge of core database concepts is required.
Contents:
Cover
Title Page
Copyright and Credits
Packt Upsell
Foreword
Contributors
Table of Contents
Preface
Chapter 1: Quick Start
Introduction to Cassandra
High availability
Distributed
Partitioned row store
Installation
Configuration
cassandra.yaml
cassandra-rackdc.properties
Starting Cassandra
Cassandra Cluster Manager
A quick introduction to the data model
Using Cassandra with cqlsh
Shutting down Cassandra
Summary
Chapter 2: Cassandra Architecture
Why was Cassandra created?
RDBMS and problems at scale
Cassandra and the CAP theorem
Cassandra's ring architecture
Partitioners
ByteOrderedPartitioner
RandomPartitioner
Murmur3Partitioner
Single token range per node
Vnodes
Cassandra's write path
Cassandra's read path
On-disk storage
SSTables
How data was structured in prior versions
How data is structured in newer versions
Additional components of Cassandra
Gossiper
Snitch
Phi failure-detector
Tombstones
Hinted handoff
Compaction
Repair
Merkle tree calculation
Streaming data
Read repair
Security
Authentication
Authorization
Managing roles
Client-to-node SSL
Node-to-node SSL
Chapter 3: Effective CQL
An overview of Cassandra data modeling
Cassandra storage model for versions 3.0 and beyond
Data cells
cqlsh
Logging into cqlsh
Problems connecting to cqlsh
Local cluster without security enabled
Remote cluster with user security enabled
Remote cluster with auth and SSL enabled
Connecting with cqlsh over SSL
Converting the Java keyStore into a PKCS12 keyStore
Exporting the certificate from the PKCS12 keyStore
Modifying your cqlshrc file
Testing your connection via cqlsh
Getting started with CQL
Creating a keyspace
Single data center example
Multi-data center example
Creating a table
Simple table example
Clustering key example
Composite partition key example
Table options
Data types
Type conversion
The primary key
Designing a primary key
Selecting a good partition key
Selecting a good clustering key
Querying data
The IN operator
Writing data
Inserting data
Updating data
Deleting data
Lightweight transactions
Executing a BATCH statement
The expiring cell
Altering a keyspace
Dropping a keyspace
Altering a table
Truncating a table
Dropping a table
Truncate versus drop
Creating an index
Caution with implementing secondary indexes
Dropping an index
Creating a custom data type
Altering a custom type
Dropping a custom type
User management
Creating a user and role
Altering a user and role
Dropping a user and role
Granting permissions
Revoking permissions
Other CQL commands
COUNT
DISTINCT
LIMIT
STATIC
User-defined functions
cqlsh commands
CONSISTENCY
COPY
DESCRIBE
TRACING
Chapter 4: Configuring a Cluster
Evaluating instance requirements
RAM
CPU
Disk
Solid state drives
Cloud storage offerings
SAN and NAS
Network
Public cloud networks
Firewall considerations
Strategy for many small instances versus few large instances
Operating system optimizations
Disable swap
XFS
Limits
limits.conf
sysctl.conf
Time synchronization
Configuring the JVM
Garbage collection
CMS
G1GC
Garbage collection with Cassandra
Installation of JVM
JCE
Configuring Cassandra
cassandra-env.sh
dc
rack
dc_suffix
prefer_local
cassandra-topology.properties
jvm.options
logback.xml
Managing a deployment pipeline
Orchestration tools
Configuration management tools
Recommended approach
Local repository for downloadable files
Chapter 5: Performance Tuning
Cassandra-Stress
The Cassandra-Stress YAML file
name
size
population
cluster
Cassandra-Stress results
Write performance
Commitlog mount point
Scaling out
Scaling out a data center
Read performance
Compaction strategy selection
Optimizing read throughput for time-series models
Optimizing tables for read-heavy models
Cache settings
Appropriate uses for row-caching
Compression
Chunk size
The bloom filter configuration
Read performance issues
Other performance considerations
JVM configuration
Cassandra anti-patterns
Building a queue
Query flexibility
Querying an entire table
Incorrect use of BATCH
Chapter 6: Managing a Cluster
Revisiting nodetool
A warning about using nodetool
Scaling up
Adding nodes to a cluster
Cleaning up the original nodes
Adding a new data center
Adjusting the cassandra-rackdc.properties file
A warning about SimpleStrategy
Scaling down
Removing nodes from a cluster
Removing a live node
Removing a dead node
Other removenode options
When removenode doesn't work (nodetool assassinate)
Assassinating a node on an older version
Removing a data center
Backing up and restoring data
Taking snapshots
Enabling incremental backups
Recovering from snapshots
Maintenance
Replacing a node
A warning about incremental repairs
Cassandra Reaper
Forcing read repairs at consistency - ALL
Clearing snapshots and incremental backups
Snapshots
Incremental backups
Compaction
Why you should never invoke compaction manually
Adjusting compaction throughput due to available resources
Chapter 7: Monitoring
JMX interface
MBean packages exposed by Cassandra
JConsole (GUI)
Connection and overview
Viewing metrics
Performing an operation
JMXTerm (CLI)
Connection and domains
Getting a metric
The nodetool utility
Monitoring using nodetool
describecluster
gcstats
getcompactionthreshold
getcompactionthroughput
getconcurrentcompactors
getendpoints
getlogginglevels
getstreamthroughput
gettimeout
gossipinfo
info
netstats
proxyhistograms
status
tablestats
tpstats
verify
Administering using nodetool
cleanup
drain
flush
resetlocalschema
stopdaemon
truncatehints
upgradeSSTable
Metric stack
Telegraf
JMXTrans
InfluxDB
InfluxDB CLI
Grafana
Visualization
Alerting
Custom setup
Log stack
The system/debug/gc logs
Filebeat
Elasticsearch
Kibana
Troubleshooting
High CPU usage
Different garbage-collection patterns
Hotspots
Disk performance
Node flakiness
All-in-one Docker
Creating a database and other monitoring components locally
Web links
Chapter 8: Application Development
Getting started
The path to failure
Is Cassandra the right database?
Good use cases for Apache Cassandra
Use and expectations around application data consistency
Choosing the right driver
Building a Java application
Driver dependency configuration with Apache Maven
Connection class
Other connection options
Retry policy
Default keyspace
Port
SSL
Connection pooling options
Starting simple - Hello World!
Using the object mapper
Building a data loader
Asynchronous operations
Data loader example
Chapter 9: Integration with Apache Spark
Spark
Architecture
Running custom Spark Docker locally
The web UI
Master
Worker
Application
PySpark
Connection config
Accessing Cassandra data
SparkR
RStudio
Jupyter
Web UI
PySpark through Jupyter
Appendix: References
Chapter 1 - Quick Start
Chapter 2 - Cassandra Architecture
Chapter 3 - Effective CQL
Chapter 4 - Configuring a Cluster
Chapter 5 - Performance Tuning
Chapter 6 - Managing a Cluster
Chapter 7 - Monitoring
Chapter 8 - Application Development
Chapter 9 - Integration with Apache Spark
Other Books You May Enjoy
Index.
Notes:
Includes index.
Includes bibliographical references.
Description based on online resource; title from PDF title page (ebrary, viewed February 20, 2019).
OCLC:
1086104365
