Big Data Analytics and Knowledge Discovery : 26th International Conference, DaWaK 2024, Naples, Italy, August 26–28, 2024, Proceedings / edited by Robert Wrembel, Silvia Chiusano, Gabriele Kotsis, A Min Tjoa, Ismail Khalil.

SpringerLink Books Lecture Notes In Computer Science (LNCS) (1997-2024) Available online

Format:: Book
Contributor:: Wrembel, Robert, editor.
Series:: Lecture Notes in Computer Science, 1611-3349 ; 14912
Language:: English
Subjects (All):: Statistics.; Data mining.; Information technology--Management.; Information technology.; Artificial intelligence.; Data Mining and Knowledge Discovery.; Computer Application in Administrative Data Processing.; Artificial Intelligence.
Local Subjects:: Statistics.; Data Mining and Knowledge Discovery.; Computer Application in Administrative Data Processing.; Artificial Intelligence.
Physical Description:: 1 online resource (409 pages)
Edition:: 1st ed. 2024.
Place of Publication:: Cham : Springer Nature Switzerland : Imprint: Springer, 2024.
Summary:: This book constitutes the proceedings of the 26th International Conference on Big Data Analytics and Knowledge Discovery, DaWaK 2024, which took place in Naples, Italy, during August 26–28, 2024. The 16 full and 20 short papers included in this book were carefully reviewed and selected from 83 submissions. They were organized in topical sections as follows: modeling and design; entity matching and similarity; classification; machine learning methods and applications; time series; data repositories; optimization; and data quality and applications.
Contents:: Intro; Preface; Organization; Abstracts of Keynote Talks; Multimodal Deep Learning in Medical Imaging; Digital Humanism as an Enabler for a Holistic Socio-Technical Approach to the Latest Developments in Computer Science and Artificial Intelligence; Deep Entity Processing in the Era of Large Language Models: Challenges and Opportunities; Contents; Modeling and Design; LiteSelect: A Lightweight Adaptive Learning Algorithm for Online Index Selection; 1 Introduction; 2 The Online Index Selection Problem; 3 LiteSelect: An Lightweight Online Index Tuner; 3.1 Algorithm LiteSelect; 3.2 Fine Tuning LiteSelect; 4 Experimental Evaluation; 4.1 Experimental Setup; 4.2 Parameter Impact Analysis; 4.3 Index Tuning Performance Comparison; 5 Related Work; 6 Conclusion; References; IDAGEmb: An Incremental Data Alignment Based on Graph Embedding; 2 Background; 2.1 Existing Data Alignment Approaches; 2.2 Graph Embedding in Representation Learning; 2.3 Discussion; 3 Methodology; 3.1 Research Design; 3.2 Preliminaries; 3.3 Adopted Algorithm for IDAGEmb; 4 Experiments and Results; 4.1 Experiment Configuration; 4.2 Experiment #1: Embedding Method Selection; 4.3 Experiment #2: Comparison with Static Methods (Effectiveness and Efficiency); 4.4 Experiment #3: Model Sensitivity to Data Order Variation; 5 Conclusion and Outlook; Learning Paradigms and Modelling Methodologies for Digital Twins in Process Industry; 1 Introduction and Motivation; 1.1 Research Questions (RQs); 1.2 Structure of Review; 2 Literature Search Strategy; 2.1 Quality Assessment Checks; 2.2 Selection of Primary Studies; 2.3 Data Synthesis and Analysis Approach; 3 Reporting the Review; 3.1 Overview of All Studies; 3.2 Overview of All Primary Studies; 4 Evaluating the Research Questions; 5 Discussion and Conclusion; Entity Matching and Similarity; MultiMatch: Low-Resource Generalized Entity Matching Using Task-Conditioned Hyperadapters in Multitask Learning; 2.1 Problem Formulation; 2.2 Entity Matching with Single-task Objective Models; 2.3 Fully Fine-tuning Methods; 2.4 Parameter-Efficient Fine-tuning Methods; 2.5 Entity Matching with Parameter-Efficient Multi-task Models; 3 MultiMatch Training; 4 Experiments; 5 Analysis; 5.1 Single Versus Multiple Objective Models; 5.2 Task Ablation Experiments; 6 Conclusions and Future Work; Embedding-Based Data Matching for Disparate Data Sources; 1 Context and Main Issues; 2 Proposed Framework; 2.1 Problem Statement; 2.2 Overview; 3 Experiments; 3.1 RQ1. Effectiveness and Stability; 3.2 RQ2. Ablation; 4 Conclusion; Subtree Similarity Search Based on Structure and Text; 2 Problem Definition; 3 Related Works; 3.1 Tree Edit Distance; 3.2 Lower Bounds of Tree Edit Distance; 3.3 Upper Bounds of Tree Edit Distance; 3.4 Subtree Similarity Search; 3.5 Other Related Problems; 4 Preliminaries; 5 Proposed Method; 6 Experiments; 6.1 Dataset; 6.2 Methods; 6.3 Effect of the Recall; 6.4 Effect of the Document Size; 6.5 Effect of the Query Size; 6.6 Accuracy; 7 Conclusion; Classification; Towards Hybrid Embedded Feature Selection and Classification Approach with Slim-TSF; 2 Related Work; 4 Experimental Evaluations; 4.1 Data Collection; 4.2 Experimental Settings; 4.3 Bootstrapping; 4.4 Remarks; 5 Conclusions; Evaluation of High Sparsity Strategies for Efficient Binary Classification; 2 Related Work; 3 Materials and Methods; 4 Results and Discussion; 5 Conclusions and Future Work; Incremental SMOTE with Control Coefficient for Classifiers in Data Starved Medical Applications; 3 Method; 3.1 An Incremental Synthetic Data Generation System; 4.1 Datasets and Experiments Setup; 4.2 Statistical Analysis; 4.3 Performance Evaluation on Classifiers; Exploring Evaluation Metrics for Binary Classification in Data Analysis: the Worthiness Benchmark Concept; 1 Introduction and Related Research; 2 Methodology; 3 Discussion and Conclusion; Machine Learning Methods and Applications; Exploring Causal Chain Identification: Comprehensive Insights from Text and Knowledge Graphs; 3.1 In-Chain Domain Knowledge; 3.2 CK-CEVAE; 3.3 Chained Prediction Unit; 4.1 Chains Acquisition; 4.2 Domain Detection Model; 4.3 Models Configurations; 4.4 Overall Analysis; 4.5 Ablation Study; 5 Case Study: Understanding Semantic Continuity in Knowledge Graphs; 6 Discussion; Towards Regional Explanations with Validity Domains for Local Explanations; 2.1 Explanation Methods; 2.2 Explanation Evaluation Metrics; 2.3 Validity Domain of Models; 3 Toy Example; 4 Our Proposal; 4.1 Validity Domain; 4.2 Model Summary; 4.3 Evaluation Metrics; 5 Experiments; 5.1 Protocol; 5.2 Evaluation of Methods; 5.3 Model Summary; 5.4 Sensitivity Analysis; 6 Discussion and Limits; 7 Conclusion and Perspectives; Analyzing a Decade of Evolution: Trends in Natural Language Processing; 2.1 PDF Parsing; 3 Results; 4 Conclusion; 5 Limitations; Improving Serendipity for Collaborative Metric Learning Based on Mutual Proximity; 2.1 Serendipity; 2.2 Collaborative Metric Learning (CML); 2.3 Mutual Proximity (MP); 2.4 Advantages and Originality of the Proposed Method; 3.1 Learning Embeddings; 3.2 Searching Embedding Space and Recommending Items; 4.1 Datasets; 4.2 Metrics; 4.3 Results; 5 Conclusions and Discussion; Ada2vec: Adaptive Representation Learning for Large-Scale Dynamic Heterogeneous Networks; 3 Problem Definition; 4 The Ada2vec Framework; 4.1 Part 1 Dynamic; 4.2 Part 2 Heterogeneity; 4.3 Part 3 Change; 5 Experimental Evaluations; 5.1 Data; 5.2 Benchmarks; 5.3 Classification; 5.4 Clustering; 5.5 Performance Analysis; 6 Conclusion and Future Work; Differentially-Private Neural Network Training with Private Features and Public Labels; 2.1 Differential Privacy; 2.2 DP-SGD; 3 Related Work; 4 Proposed Approach; 4.1 Sanitization Layer; 4.2 Bounding Sensitivity and Adding Noise; 4.3 Design Choices and Tradeoffs; 5 Experimental Evaluation; 5.1 Experimental Settings; 5.2 Results; Time Series; Series2Graph++: Distributed Detection of Correlation Anomalies in Multivariate Time Series; 3 Series2Graph++; 5 Conclusion; Anomaly Detection from Time Series Under Uncertainty; 3 Proposed Approach; 4.1 Uncertainty Quantification Evaluation; 4.2 Model Performance; Comparison of Measures for Characterizing the Difficulty of Time Series Classification; 1 Introduction; 2.1 Data and Models; 2.2 Complexity Measures; 3 Analysis; 3.1 Correlation Analysis; 3.2 Relationships Between the Complexity Measures; Dynamic Time Warping for Phase Recognition in Tribological Sensor Data; 3.1 Dynamic Time Warping (DTW); 3.2 Tribological Use Case; 3.3 Experiments; 4 Results; 4.1 Classification of the Whole Wear Phases; 4.2 Partial Classification of the Wear Phases; Data Repositories; Putting Co-Design-Supporting Data Lakes to the Test: An Evaluation on AEC Case Studies; 1 Motivation: Data Management in AEC; 2 ArchIBALD Architecture Development and Definition; 2.1 Requirement Analysis; 2.2 Design of the ArchIBALD Architecture; 3 Scenario-Based Case Studies: Context and Overview; 3.1 The livMatS Biomimetic Shell; 3.2 Co-Design of Robotic Prefabrication; 3.3 Co-Design of End-Effectors for On-Site Assembly; 3.4 Co-Design of On-Site Planning and Execution; 4 Evaluation; 4.1 Case Study 1: Co-Design of Robotic Prefabrication; 4.2 Case Study 2: Co-Design of End-Effectors; 4.3 Case Study 3: Co-Design of On-Site Planning and Execution; Creating and Querying Data Cubes in Python Using PyCube; 3 Preliminaries; 4 Use Case; 4.1 Initializing PyCube; 4.2 Analyzing the Data in the View; 5 Populating the View; 5.1 Generating the SQL Query; 5.2 Converting Result Sets to Dataframes; 6.1 Experimental Setup; 6.2 Data Retrieval Speeds; 6.3 Memory Usage; 6.4 Code Comparison; 7 Conclusion and Future Work; An E-Commerce Benchmark for Evaluating Performance Trade-Offs in Document Stores; 2 Benchmark Design; 2.1 E-Commerce Application.
Notes:: Includes bibliographical references and index.
Other Format:: Print version: Wrembel, Robert Big Data Analytics and Knowledge Discovery
ISBN:: 9783031683237
