Perspectives on data science for software engineering / edited by Tim Menzies, Laurie Williams, Thomas Zimmermann ; cover designer, Mark Rogers.

EBSCOhost Academic eBook Collection (North America) Available online

O'Reilly Online Learning: Academic/Public Library Edition Available online

eBook EngineeringCore Collection Available online

Format:
Book
Author/Creator:
Menzies, Tim, author.
Contributor:
Menzies, Tim, editor.
Williams, Laurie, editor.
Zimmermann, Thomas, editor.
Rogers, Mark, book designer.
Language:
English
Subjects (All):
Software engineering.
Physical Description:
1 online resource (410 pages) : illustrations (some color), photographs, graphs, tables
Edition:
1st edition
Place of Publication:
Amsterdam, [Netherlands] : Morgan Kaufmann, 2016.
System Details:
text file
Summary:
Perspectives on Data Science for Software Engineering presents the best practices of seasoned data miners in software engineering. The idea for this book arose during the 2014 conference at Dagstuhl, an invitation-only gathering of leading computer scientists who meet to identify and discuss cutting-edge informatics topics. At that conference, the question of how to transfer knowledge from seasoned software engineers and data scientists to newcomers in the field came up in many discussions. While many books cover data mining and software engineering basics, they present only the fundamentals and lack the perspective that comes from real-world experience. This book offers unique insights into the wisdom of the community’s leaders, gathered to share hard-won lessons from the trenches. Ideas are presented in digestible chapters designed to be applicable across many domains. Topics include data collection, data sharing, data mining, and how to use these techniques in successful software projects. Newcomers to software engineering data science will learn the tips and tricks of the trade, while more experienced data scientists will benefit from war stories that show what traps to avoid.
Presents the wisdom of community experts, derived from a summit on software analytics.
Provides contributed chapters that share discrete ideas and techniques from the trenches.
Covers top areas of concern, including mining security and social data, data visualization, and cloud-based data.
Presented in clear chapters designed to be applicable across many domains.
Contents:
Front Cover
Perspectives on Data Science for Software Engineering
Copyright
Contents
Contributors
Acknowledgments
Introduction
Perspectives on data science for software engineering
Why This Book?
About This Book
The Future
References
Software analytics and its application in practice
Six Perspectives of Software Analytics
Experiences in Putting Software Analytics into Practice
Seven principles of inductive software engineering: What we do is different
Different and Important
Principle #1: Humans Before Algorithms
Principle #2: Plan for Scale
Principle #3: Get Early Feedback
Principle #4: Be Open Minded
Principle #5: Be smart with your learning
Principle #6: Live With the Data You Have
Principle #7: Develop a Broad Skill Set That Uses a Big Toolkit
The need for data analysis patterns (in software engineering)
The Remedy Metaphor
Software Engineering Data
Needs of Data Analysis Patterns
Building Remedies for Data Analysis in Software Engineering Research
From software data to software theory: The path less traveled
Pathways of Software Repository Research
From Observation, to Theory, to Practice
Why theory matters
How to Use Theory
How to build theory
Constructs
Propositions
Explanation
Scope
In Summary: Find a Theory or Build One Yourself
Further Reading
Success stories/applications
Mining apps for anomalies
The Million-Dollar Question
App Mining
Detecting Abnormal Behavior
A Treasure Trove of Data
... But Also Obstacles
Executive Summary
Embrace dynamic artifacts
Can We Minimize the USB Driver Test Suite?
Yes, Let's Observe Interactions
Why Did Our Solution Work?
Still Not Convinced? Here's More.
Dynamic Artifacts Are Here to Stay
Mobile app store analytics
Understanding End Users
Conclusion
The naturalness of software
Transforming Software Practice
Porting and Translation
The "Natural Linguistics" of Code
Analysis and Tools
Assistive Technologies
Advances in release readiness
Predictive Test Metrics
Universal Release Criteria Model
Best Estimation Technique
Resource/Schedule/Content Model
Using Models in Release Management
Research to Implementation: A Difficult (but Rewarding) Journey
How to tame your online services
Background
Service Analysis Studio
Success Story
Measuring individual productivity
No Single and Simple Best Metric for Success/Productivity
Measure the Process, Not Just the Outcome
Allow for Measures to Evolve
Goodhart's Law and the Effect of Measuring
How to Measure Individual Productivity?
Stack traces reveal attack surfaces
Another Use of Stack Traces?
Attack Surface Approximation
Visual analytics for software engineering data
Gameplay data plays nicer when divided into cohorts
Cohort Analysis as a Tool for Gameplay Data
Play to Lose
Forming Cohorts
Case Studies of Gameplay Data
Challenges of using cohorts
Summary
A success story in applying data science in practice
Overview
Analytics Process
Data Collection
Exploratory Data Analysis
Model Selection
Performance Measures and Benefit Analysis
Communication Process: Best Practices
Problem Selection
Managerial Support
Project Management
Trusted Relationship
There's never enough time to do all the testing you want.
The Impact of Short Release Cycles (There's Not Enough Time)
Testing Is More Than Functional Correctness (All the Testing You Want)
Learn From Your Test Execution History
Test Effectiveness
Test Reliability/Not Every Test Failure Points to a Defect
The Art of Testing Less
Without Sacrificing Code Quality
Tests Evolve Over Time
In Summary
The perils of energy mining: measure a bunch, compare just once
A Tale of Two HTTPs
Let's energise your software energy experiments
Environment
N-Versions
Energy or Power
Repeat!
Granularity
Idle Measurement
Statistical Analysis
Exceptions
Identifying fault-prone files in large industrial software systems
Acknowledgment
A tailored suit: The big opportunity in personalizing issue tracking
Many Choices, Nothing Great
The Need for Personalization
Developer Dashboards or "A Tailored Suit"
Room for Improvement
What counts is decisions, not numbers: Toward an analytics design sheet
Decisions Everywhere
The Decision-Making Process
The Analytics Design Sheet
Example: App Store Release Analysis
A large ecosystem study to understand the effect of programming languages on code quality
Comparing Languages
Study Design and Analysis
Results
Code reviews are not for finding defects: Even established tools need occasional evaluation
Effects
Conclusions
Techniques
Interviews
Why Interview?
The Interview Guide
Selecting Interviewees
Recruitment
Collecting Background Data
Conducting the Interview
Post-Interview Discussion and Notes
Transcription
Analysis
Reporting
Now Go Interview!
Look for state transitions in temporal data.
Bikeshedding in Software Engineering
Summarizing Temporal Data
Recommendations
Reference
Card-sorting: From text to themes
Preparation Phase
Execution Phase
Analysis Phase
Tools! Tools! We need tools!
Tools in Science
The Tools We Need
Recommendations for Tool Building
Evidence-based software engineering
The Aim and Methodology of EBSE
Contextualizing Evidence
Strength of Evidence
Evidence and Theory
Which machine learning method do you need?
Learning Styles
Do Additional Data Arrive Over Time?
Are Changes Likely to Happen Over Time?
If You Have a Prediction Problem, What Do You Really Need to Predict?
Do You Have a Prediction Problem Where Unlabeled Data are Abundant and Labeled Data are Expensive?
Are Your Data Imbalanced?
Do You Need to Use Data From Different Sources?
Do You Have Big Data?
Do You Have Little Data?
In Summary ...
Structure your unstructured data first!
Unstructured Data in Software Engineering
Summarizing Unstructured Software Data
As Simple as Possible... But not Simpler!
You Need Structure!
Parse that data! Practical tips for preparing your raw data for analysis
Use Assertions Everywhere
Print Information About Broken Records
Use Sets or Counters to Store Occurrences of Categorical Variables
Restart Parsing in the Middle of the Data Set
Test on a Small Subset of Your Data
Redirect Stdout and Stderr to Log Files
Store Raw Data Alongside Cleaned Data
Finally, Write a Verifier Program to Check the Integrity of Your Cleaned Data
Natural language processing is no free lunch
Natural Language Data in Software Projects
Natural Language Processing
How to Apply NLP to Software Projects
Do Stemming First.
Check the Level of Abstraction
Don't Expect Magic
Don't Discard Manual Analysis of Textual Data
Aggregating empirical evidence for more trustworthy decisions
What's Evidence?
What Does Data From Empirical Studies Look Like?
The Evidence-Based Paradigm and Systematic Reviews
How Far Can We Use the Outcomes From Systematic Review to Make Decisions?
If it is software engineering, it is (probably) a Bayesian factor
Causing the Future With Bayesian Networks
The Need for a Hybrid Approach in Software Analytics
Use the Methodology, Not the Model
Becoming Goldilocks: Privacy and data sharing in "just right" conditions
The "Data Drought"
Change is Good
Don't Share Everything
Share Your Leaders
The wisdom of the crowds in predictive modeling for software engineering
The Wisdom of the Crowds
So... How is That Related to Predictive Modeling for Software Engineering?
Examples of Ensembles and Factors Affecting Their Accuracy
Crowds for transferring knowledge and dealing with changes
Crowds for Multiple Goals
A Crowd of Insights
Ensembles as Versatile Tools
Combining quantitative and qualitative methods (when mining software data)
Prologue: We Have Solid Empirical Evidence!
Correlation is Not Causation and, Even If We Can Claim Causation...
Collect your data: People and artifacts
Source 1: Dig Into Software Artifacts and Data
...but be careful about noise and incompleteness!
Source 2: Getting Feedback From Developers
...and don't be afraid if you collect very little data!
How Much to Analyze, and How?
Build a theory upon your data
Conclusion: The Truth is Out There!
Suggested Readings
References.
A process for surviving survey design and sailing through survey deployment.
Notes:
Includes bibliographical references at the end of each chapter.
Description based on online resource; title from PDF title page (ebrary, viewed August 1, 2016).
ISBN:
9780128042618
0128042613
9780128042069
0128042060
OCLC:
957279026
