Big Data Analytics and Knowledge Discovery : 26th International Conference, DaWaK 2024, Naples, Italy, August 26–28, 2024, Proceedings / edited by Robert Wrembel, Silvia Chiusano, Gabriele Kotsis, A Min Tjoa, Ismail Khalil.

SpringerLink Books Lecture Notes In Computer Science (LNCS) (1997-2024) Available online

Format:
Book
Contributor:
Wrembel, Robert, editor.
Series:
Lecture Notes in Computer Science, 1611-3349 ; 14912
Language:
English
Subjects (All):
Statistics.
Data mining.
Information technology--Management.
Information technology.
Artificial intelligence.
Data Mining and Knowledge Discovery.
Computer Application in Administrative Data Processing.
Artificial Intelligence.
Local Subjects:
Statistics.
Data Mining and Knowledge Discovery.
Computer Application in Administrative Data Processing.
Artificial Intelligence.
Physical Description:
1 online resource (409 pages)
Edition:
1st ed. 2024.
Place of Publication:
Cham : Springer Nature Switzerland : Imprint: Springer, 2024.
Summary:
This book constitutes the proceedings of the 26th International Conference on Big Data Analytics and Knowledge Discovery, DaWaK 2024, which took place in Naples, Italy, during August 26-28, 2024. The 16 full and 20 short papers included in this book were carefully reviewed and selected from 83 submissions. They were organized in topical sections as follows: Modeling and design; entity matching and similarity; classification; machine learning methods and applications; time series; data repositories; optimization; and data quality and applications.
Contents:
Intro
Preface
Organization
Abstracts of Keynote Talks
Multimodal Deep Learning in Medical Imaging
Digital Humanism as an Enabler for a Holistic Socio-Technical Approach to the Latest Developments in Computer Science and Artificial Intelligence
Deep Entity Processing in the Era of Large Language Models: Challenges and Opportunities
Contents
Modeling and Design
LiteSelect: A Lightweight Adaptive Learning Algorithm for Online Index Selection
1 Introduction
2 The Online Index Selection Problem
3 LiteSelect: A Lightweight Online Index Tuner
3.1 Algorithm LiteSelect
3.2 Fine Tuning LiteSelect
4 Experimental Evaluation
4.1 Experimental Setup
4.2 Parameter Impact Analysis
4.3 Index Tuning Performance Comparison
5 Related Work
6 Conclusion
References
IDAGEmb: An Incremental Data Alignment Based on Graph Embedding
2 Background
2.1 Existing Data Alignment Approaches
2.2 Graph Embedding in Representation Learning
2.3 Discussion
3 Methodology
3.1 Research Design
3.2 Preliminaries
3.3 Adopted Algorithm for IDAGEmb
4 Experiments and Results
4.1 Experiment Configuration
4.2 Experiment #1: Embedding Method Selection
4.3 Experiment #2: Comparison with Static Methods (Effectiveness and Efficiency)
4.4 Experiment #3: Model Sensitivity to Data Order Variation
5 Conclusion and Outlook
Learning Paradigms and Modelling Methodologies for Digital Twins in Process Industry
1 Introduction and Motivation
1.1 Research Questions (RQs)
1.2 Structure of Review
2 Literature Search Strategy
2.1 Quality Assessment Checks
2.2 Selection of Primary Studies
2.3 Data Synthesis and Analysis Approach
3 Reporting the Review
3.1 Overview of All Studies
3.2 Overview of All Primary Studies
4 Evaluating the Research Questions
5 Discussion and Conclusion
Entity Matching and Similarity
MultiMatch: Low-Resource Generalized Entity Matching Using Task-Conditioned Hyperadapters in Multitask Learning
2.1 Problem Formulation
2.2 Entity Matching with Single-task Objective Models
2.3 Fully Fine-tuning Methods
2.4 Parameter-Efficient Fine-tuning Methods
2.5 Entity Matching with Parameter-Efficient Multi-task Models
3 MultiMatch Training
4 Experiments
5 Analysis
5.1 Single Versus Multiple Objective Models
5.2 Task Ablation Experiments
6 Conclusions and Future Work
Embedding-Based Data Matching for Disparate Data Sources
1 Context and Main Issues
2 Proposed Framework
2.1 Problem Statement
2.2 Overview
3 Experiments
3.1 RQ1. Effectiveness and Stability
3.2 RQ2. Ablation
4 Conclusion
Subtree Similarity Search Based on Structure and Text
2 Problem Definition
3 Related Works
3.1 Tree Edit Distance
3.2 Lower Bounds of Tree Edit Distance
3.3 Upper Bounds of Tree Edit Distance
3.4 Subtree Similarity Search
3.5 Other Related Problems
4 Preliminaries
5 Proposed Method
6 Experiments
6.1 Dataset
6.2 Methods
6.3 Effect of the Recall
6.4 Effect of the Document Size
6.5 Effect of the Query Size
6.6 Accuracy
7 Conclusion
Classification
Towards Hybrid Embedded Feature Selection and Classification Approach with Slim-TSF
2 Related Work
4 Experimental Evaluations
4.1 Data Collection
4.2 Experimental Settings
4.3 Bootstrapping
4.4 Remarks
5 Conclusions
Evaluation of High Sparsity Strategies for Efficient Binary Classification
2 Related Work
3 Materials and Methods
4 Results and Discussion
5 Conclusions and Future Work
Incremental SMOTE with Control Coefficient for Classifiers in Data Starved Medical Applications
3 Method
3.1 An Incremental Synthetic Data Generation System
4.1 Datasets and Experiments Setup
4.2 Statistical Analysis
4.3 Performance Evaluation on Classifiers
Exploring Evaluation Metrics for Binary Classification in Data Analysis: the Worthiness Benchmark Concept
1 Introduction and Related Research
2 Methodology
3 Discussion and Conclusion
Machine Learning Methods and Applications
Exploring Causal Chain Identification: Comprehensive Insights from Text and Knowledge Graphs
3.1 In-Chain Domain Knowledge
3.2 CK-CEVAE
3.3 Chained Prediction Unit
4.1 Chains Acquisition
4.2 Domain Detection Model
4.3 Models Configurations
4.4 Overall Analysis
4.5 Ablation Study
5 Case Study: Understanding Semantic Continuity in Knowledge Graphs
6 Discussion
Towards Regional Explanations with Validity Domains for Local Explanations
2.1 Explanation Methods
2.2 Explanation Evaluation Metrics
2.3 Validity Domain of Models
3 Toy Example
4 Our Proposal
4.1 Validity Domain
4.2 Model Summary
4.3 Evaluation Metrics
5 Experiments
5.1 Protocol
5.2 Evaluation of Methods
5.3 Model Summary
5.4 Sensitivity Analysis
6 Discussion and Limits
7 Conclusion and Perspectives
Analyzing a Decade of Evolution: Trends in Natural Language Processing
2.1 PDF Parsing
3 Results
4 Conclusion
5 Limitations
Improving Serendipity for Collaborative Metric Learning Based on Mutual Proximity
2.1 Serendipity
2.2 Collaborative Metric Learning (CML)
2.3 Mutual Proximity (MP)
2.4 Advantages and Originality of the Proposed Method
3.1 Learning Embeddings
3.2 Searching Embedding Space and Recommending Items
4.1 Datasets
4.2 Metrics
4.3 Results
5 Conclusions and Discussion
Ada2vec: Adaptive Representation Learning for Large-Scale Dynamic Heterogeneous Networks
3 Problem Definition
4 The Ada2vec Framework
4.1 Part 1 Dynamic
4.2 Part 2 Heterogeneity
4.3 Part 3 Change
5 Experimental Evaluations
5.1 Data
5.2 Benchmarks
5.3 Classification
5.4 Clustering
5.5 Performance Analysis
6 Conclusion and Future Work
Differentially-Private Neural Network Training with Private Features and Public Labels
2.1 Differential Privacy
2.2 DP-SGD
3 Related Work
4 Proposed Approach
4.1 Sanitization Layer
4.2 Bounding Sensitivity and Adding Noise
4.3 Design Choices and Tradeoffs
5 Experimental Evaluation
5.1 Experimental Settings
5.2 Results
Time Series
Series2Graph++: Distributed Detection of Correlation Anomalies in Multivariate Time Series
3 Series2Graph++
5 Conclusion
Anomaly Detection from Time Series Under Uncertainty
3 Proposed Approach
4.1 Uncertainty Quantification Evaluation
4.2 Model Performance
Comparison of Measures for Characterizing the Difficulty of Time Series Classification
1 Introduction
2.1 Data and Models
2.2 Complexity Measures
3 Analysis
3.1 Correlation Analysis
3.2 Relationships Between the Complexity Measures
Dynamic Time Warping for Phase Recognition in Tribological Sensor Data
3.1 Dynamic Time Warping (DTW)
3.2 Tribological Use Case
3.3 Experiments
4 Results
4.1 Classification of the Whole Wear Phases
4.2 Partial Classification of the Wear Phases
Data Repositories
Putting Co-Design-Supporting Data Lakes to the Test: An Evaluation on AEC Case Studies
1 Motivation: Data Management in AEC
2 ArchIBALD Architecture Development and Definition
2.1 Requirement Analysis
2.2 Design of the ArchIBALD Architecture
3 Scenario-Based Case Studies: Context and Overview
3.1 The livMatS Biomimetic Shell
3.2 Co-Design of Robotic Prefabrication
3.3 Co-Design of End-Effectors for On-Site Assembly
3.4 Co-Design of On-Site Planning and Execution
4 Evaluation
4.1 Case Study 1: Co-Design of Robotic Prefabrication
4.2 Case Study 2: Co-Design of End-Effectors
4.3 Case Study 3: Co-Design of On-Site Planning and Execution
Creating and Querying Data Cubes in Python Using PyCube
3 Preliminaries
4 Use Case
4.1 Initializing PyCube
4.2 Analyzing the Data in the View
5 Populating the View
5.1 Generating the SQL Query
5.2 Converting Result Sets to Dataframes
6.1 Experimental Setup
6.2 Data Retrieval Speeds
6.3 Memory Usage
6.4 Code Comparison
7 Conclusion and Future Work
An E-Commerce Benchmark for Evaluating Performance Trade-Offs in Document Stores
2 Benchmark Design
2.1 E-Commerce Application
Notes:
Includes bibliographical references and index.
Other Format:
Print version: Wrembel, Robert. Big Data Analytics and Knowledge Discovery
ISBN:
9783031683237
