Fundamentals of Analytics Engineering : An Introduction to Building End-To-end Analytics Solutions / Dumky De Wilde [and six others].

EBSCOhost Academic eBook Collection (North America) Available online

O'Reilly Online Learning: Academic/Public Library Edition Available online

Format:
Book
Author/Creator:
Wilde, Dumky De, author.
Language:
English
Subjects (All):
Data mining.
Systems engineering.
Physical Description:
1 online resource (332 pages)
Edition:
First edition.
Place of Publication:
Birmingham, England : Packt Publishing, [2024]
Biography/History:
Wilde Dumky De: Dumky is an award-winning analytics engineer with close to 10 years of experience in setting up data pipelines, data models and cloud infrastructure. Dumky has worked with a multitude of clients from government to fintech and retail. His background is in marketing analytics and web tracking implementations, but he has since branched out to include other areas and deliver value from data and analytics across the entire organization.
Kassapian Fanny: Fanny has a multidisciplinary background across various industries, giving her a unique perspective on analytics workflows, from engineering pipelines to driving value for the business. As a consultant, Fanny helps companies translate opportunities and business needs into technical solutions, implement analytics engineering best practices to streamline their pipelines, and treat data as a product. She is an avid promoter of data democratization, through technology and literacy.
Gligorevic Jovan: Jovan, an Analytics Engineer, specializes in data modeling and building analytical dashboards. Passionate about delivering end-to-end analytics solutions and enabling self-service analytics, he has a background in business and data science. With skills ranging from machine learning to dashboarding, Jovan has democratized data across diverse industries. Proficient in various tools and programming languages, he has extensive experience with the modern data stack. Jovan enjoys providing trainings in dbt and Power BI, sharing his knowledge generously.
Perafan Juan Manuel: Juan Manuel Perafan has 8 years of experience in the realm of analytics (5 years as a consultant). Juan was the first analytics engineer hired by Xebia back in 2020, making him one of the earliest adopters of this way of working. Besides helping his clients realize the value of their data, Juan is also very active in the data community. He has spoken at dozens of conferences and meetups around the world (including Coalesce 2023). Additionally, he is the founder of the Analytics Engineering meetup in the Netherlands as well as the Dutch dbt meetup.
Benninga Lasse: Lasse has been working in the data space since 2018, starting out as a Data Engineer at a large airline, then switching to Cloud Engineering for a consultancy and working for different clients in the retail and healthcare space. Since 2021, he has been an Analytics Engineer at Xebia Data, merging software/platform engineering with a passion for analytics. As a consultant, Lasse has worked with many different clients in retail, healthcare, ridesharing, and trading. He has implemented multiple data platforms and worked in all three major clouds, leveraging his knowledge of data and analytics to provide value.
Lopez Ricardo Angel Granados: Ricardo, an Analytics Engineer with a strong background in data engineering and analysis, is a quick learner and tech enthusiast. With a Master's in IT Management specializing in Data Science, he excels in using various programming languages and tools to deliver valuable insights. Ricardo, experienced in diverse industries like energy, transport, and fintech, is adept at finding alternative solutions for optimal results. As an Analytics Engineer, he focuses on driving value from data through efficient data modeling, using best practices, automating tasks and improving data quality.
Pereira Tais Laurindo: Tais is a versatile data professional with experience in a diverse range of organizations - from big corporations to scale-ups. Before her move to Xebia, she had the chance to develop distinct data products, such as dashboards and machine learning implementations. Currently, she has been focusing on end-to-end analytics as an Analytics Engineer. With a mixed background in engineering and business, her mission is to contribute to data democratization in organizations, by he. ..
Summary:
Gain a holistic understanding of the analytics engineering lifecycle by integrating principles from both data analysis and engineering
Key Features:
Discover how analytics engineering aligns with your organization's data strategy
Access insights shared by a team of seven industry experts
Tackle common analytics engineering problems faced by modern businesses
Purchase of the print or Kindle book includes a free PDF eBook
Book Description:
Navigate the world of data analytics with Fundamentals of Analytics Engineering--guiding you from foundational concepts to advanced techniques of data ingestion and warehousing, data lakehouse, and data modeling. Written by a team of 7 industry experts, this book helps you to transform raw data into structured insights. In this book, you'll discover how to clean, filter, aggregate, and reformat data, and seamlessly serve it across diverse platforms. With practical guidance, you'll also learn how to build a simple data platform using Airbyte for ingestion, DuckDB for warehousing, dbt for transformations, and Tableau for visualization. From data quality and observability to fostering collaboration on codebases, you'll discover effective strategies for ensuring data integrity and driving collaborative success. As you advance, you'll become well-versed with the CI/CD principles for automated code building, testing, and deployment--laying the foundation for consistent and reliable pipelines. And with invaluable insights into gathering business requirements, documenting complex business logic, and the importance of data governance, you'll develop a holistic understanding of the analytics lifecycle. By the end of this book, you'll be armed with the essential techniques and best practices for developing scalable analytics solutions from end to end.
What you will learn:
Design and implement data pipelines from ingestion to serving data
Explore best practices for data modeling and schema design
Gain insights into the use of cloud-based analytics platforms and tools for scalable data processing
Understand the principles of data governance and collaborative coding
Comprehend data quality management in analytics engineering
Gain practical skills in using analytics engineering tools to conquer real-world data challenges
Who this book is for:
This book is for data engineers and data analysts considering pivoting their careers into analytics engineering. Analytics engineers who want to upskill and search for gaps in their knowledge will also find this book helpful, as will other data professionals who want to understand the value of analytics engineering in their organization's journey toward data maturity. To get the most out of this book, you should have a basic understanding of data analysis and engineering concepts such as data cleaning, visualization, ETL and data warehousing.
Contents:
Cover
Title Page
Copyright and Credits
Dedications
Foreword
Contributors
Table of Contents
Preface
Prologue
Part 1: Introduction to Analytics Engineering
Chapter 1: What Is Analytics Engineering?
Introducing analytics engineering
Defining analytics engineering
Why do we need analytics engineering?
A supermarket analogy
The shift from ETL to ELT
The difference between analytics engineers, data analysts, and data engineers
Summary
Chapter 2: The Modern Data Stack
Understanding a Modern Data Stack
Explaining three key differentiators versus legacy stacks
Lowering technical barriers with a SQL-first approach
Improving infrastructure efficiency with cloud-native systems
Simplifying implementation and maintenance with managed and modular solutions
Discussing the advantages and disadvantages of the MDS
Part 2: Building Data Pipelines
Chapter 3: Data Ingestion
Digging into the problem of moving data between two systems
The source of all problems
Understanding the eight essential steps of a data ingestion pipeline
Trigger
Connection
State management
Data extraction
Transformations
Validation and data quality
Loading
Archiving and retention
Managing the quality and scalability of data ingestion pipelines - the three key topics
Scalability and resilience
Monitoring, logging, and alerting
Governance
Working with data ingestion - an example pipeline
Chapter 4: Data Warehousing
Uncovering the evolution of data warehousing
The problem with transactional databases
The history of data warehouses
Moving to the cloud
Benefits of cloud versus on-premises data warehouses
Cloud data warehouse users - no one-size fits all
Building blocks of a cloud data warehouse
Compute
Knowing the market leaders in cloud data warehousing
Amazon Redshift
Google BigQuery
Snowflake
Databricks
Use case - choosing the right cloud data warehouse
Managed versus self-hosted data warehouses
Chapter 5: Data Modeling
The importance of data models
Completeness
Enforcement of business rules
Minimizing redundancy
Data reusability
Stability and flexibility
Elegance
Communication
Integration
Potential trade-offs
The elephant in the room - performance
Designing your data model
Data modeling techniques
Bill Inmon and relational modeling
Ralph Kimball and dimensional modeling
Daniel Linstedt and Data Vault
Comparison of the different data models
Choosing a data model
Chapter 6: Transforming Data
Transforming data - the foundation of analytics work
A key step in the data value chain
Challenges in transforming data
Design choices
Where to apply transformations
Specify your data model
Layering transformations
Data transformation best practices
Readability and reusability first, optimization second
Modularity
Other best practices
An example of writing modular code
Tools that facilitate data transformations
Types of transformation tools
Considerations
Chapter 7: Serving Data
Exposing data using dashboarding and BI tools
Dashboards
Spreadsheets
Programming environments
Low-code tools
Reverse ETL
Valuable
Usable
Sensible
Serving data - four key topics
Self-serving analytics and report factories
Interactive and static reports
Actionable and vanity metrics
Reusability and bespoke processes
Part 3: Hands-On Guide to Building a Data Platform
Chapter 8: Hands-On Analytics Engineering
Technical requirements
Understanding the Stroopwafelshop use case
Business objectives, metrics, and KPIs
Looking at the data
The thing about spreadsheets
What about BI tools?
The tooling
Preparing Google Cloud
ELT using Airbyte Cloud
Loading the Stroopwafelshop data using Airbyte Cloud
Modeling data using dbt Cloud
The shortcomings of conventional analytics
The role of dbt in analytics engineering
Setting up dbt Cloud
Data marts
Additional dbt features
Visualizing data with Tableau
Why Tableau?
Selecting the KPIs
First visualization
Creating measures
Creating the store growth dashboard
What's next?
Part 4: DataOps
Chapter 9: Data Quality and Observability
Understanding the problem of data quality at the source, in transformations, and in data governance
Data quality issues in source systems
Data quality issues in data infrastructure and data pipelines
How data governance impacts data quality
Finding solutions to data quality issues - observability, data catalogs, and semantic layers
Using observability to improve your data quality
The benefits of data catalogs for data quality
Improving data quality with a semantic layer
Chapter 10: Writing Code in a Team
Identifying the responsibilities of team members
Tracking tasks and issues
Tools for issue and task tracking
Clear task definition
Categorization and tagging
Managing versions with version control
Working with Git
Git branching
Development workflow for analytics engineers
Working with coding standards
PEP8
ANSI
Linters
Pre-commit hooks
Reviewing code
Pull requests - The four eyes principle
Continuous integration/continuous deployment
Documenting code
Documenting code in dbt
Code comments
READMEs
Documentation on getting started
Conceptual documentation
Working with containers
Refactoring and technical debt
Chapter 11: Automating Workflows
Introducing DataOps
Orchestrating data pipelines
Designing an automated workflow - considerations
dbt Cloud
Airflow
Continuous integration
Continuous
Handling integration issues
Automating testing with a CI pipeline
Continuous deployment
The CD pipeline
Slim CI/CD
Configuring CI/CD in dbt Cloud
Continuous delivery
Continuous delivery versus continuous deployment
Part 5: Data Strategy
Chapter 12: Driving Business Adoption
Defining analytics translation
The analytics value chain
Scoping analytics use cases
Identifying stakeholders
Ideating analytics use cases
Prioritizing use cases
Ensuring business adoption
Working incrementally
Gathering feedback
Knowing when to stop developing
Communicating your results
Documenting business logic
Chapter 13: Data Governance
Understanding data governance
The objective of data governance
Applying data governance in analytics engineering
Defining data ownership
Data quality and integrity
Managing data assets
Training, enablement, and best practices
Data definitions
Addressing critical areas for seamless data governance
Resistance to change and adoption
Engaging stakeholders and fostering collaboration
Establishing a data governance roadmap
Chapter 14: Epilogue
Reviewing the fundamental insights - what you've learned so far
Making your career future-proof - how to take it further
Tip #1 - keep learning and developing your skills
Tip #2 - network and engage with the community
Tip #3 - showcase your work and build a portfolio
Closing remarks
Index
Other Books You May Enjoy.
Notes:
Description based on publisher supplied metadata and other sources.
Description based on print version record.
ISBN:
9781837632114
1837632111
OCLC:
1428526380

