Numerical Algorithms and Parallel Scientific Computing.- Advances in Incremental PCA Algorithms.- Algorithms for forward and backward solution of the Fokker-Planck equation in the heliospheric transport of cosmic rays.- Efficient Evaluation of Matrix Polynomials.- A Comparison of Soft-Fault Error Models in the Parallel Preconditioned Flexible GMRES.- Multilayer Approach for Joint Direct and Transposed Sparse Matrix Vector Multiplication for Multithreaded CPUs.- Comparison of parallel time-periodic Navier–Stokes solvers.- Blocked Algorithms for Robust Solution of Triangular Linear Systems.- A Comparison of Accuracy and Efficiency of Parallel Solvers for Fractional Power Diffusion Problems.- Efficient Cross Section Reconstruction on Modern Multi and Many Core Architectures.- Parallel assembly of ACA BEM matrices on Xeon Phi clusters.- Stochastic bounds for Markov chains on Intel Xeon Phi coprocessor.-
Particle Methods in Simulations.- Fast DEM collision checks on multicore nodes.- A Space and Bandwidth Efficient Multicore Algorithm for the Particle-in-Cell Method.- Load Balancing for Particle-in-Cell Plasma Simulation on Multicore Systems.-
Task-Based Paradigm of Parallel Computing.- TaskUniVerse: A Task-Based Unified Interface for Versatile Parallel Execution.- Comparison of Time and Energy Oriented Scheduling for Task-based Programs.- Experiments with sparse Cholesky using a parametrized task graph Implementation.- A Task-Based Algorithm for Reordering the Eigenvalues of a Matrix in Real Schur Form.-
GPU Computing.- Radix Tree for Binary Sequences on GPU.- A comparison of performance tuning process for different generations of NVIDIA GPUs and an example scientific computing algorithm.- NVIDIA GPUs Scalability to Solve Multiple (Batch) Tridiagonal Systems: Implementation of cuThomasBatch.- Two-Echelon System Stochastic Optimization with R and CUDA.- Parallel Hierarchical Agglomerative Clustering for fMRI Data.-
Parallel Non-numerical Algorithms.- Two Parallelization Schemes for the Induction of Nondeterministic Finite Automata on PCs.- Approximating Personalized Katz Centrality in Dynamic Graphs.- Graph-Based Speculative Query Execution for RDBMS.- A GPU Implementation of Bulk Execution of the Dynamic Programming for the Optimal Polygon Triangulation.-
Performance Evaluation of Parallel Algorithms and Applications.- Early performance evaluation of the hybrid cluster with torus interconnect aimed at molecular-dynamics simulations.- Load balancing for CPU-GPU coupling in Computational Fluid Dynamics.- Implementation and Performance Analysis of 2.5D-PDGEMM on the K Computer.- An Approach for Detecting Abnormal Parallel Applications Based on Time Series Analysis Methods.- Prediction of the Inter-Node Communication Costs of a New Gyrokinetic Code with Toroidal Domain.- D-Spline Performance Tuning Method Flexibly Responsive to Execution Time Perturbation.-
Environments and Frameworks for Parallel/Distributed/Cloud Computing.- Dfuntest: A Testing Framework for Distributed Applications.- Security monitoring and analytics in the context of HPC processing model.- Multidimensional Performance and Scalability Analysis for Diverse Applications Based on System Monitoring Data.- Bridging the Gap between HPC and Cloud using HyperFlow and PaaSage.- A Memory Efficient Parallel All-Pairs Computation Framework: Computation – Communication Overlap.- Automatic Parallelization of ANSI C to CUDA C Programs.- Consistency Models for Global Scalable Data Access Services.-
Applications of Parallel Computing.- Global state monitoring in optimization of parallel event–driven simulation.- High Performance Optimization of Independent Component Analysis Algorithm for EEG Data.- Continuous and discrete models of melanoma progression simulated in multi-GPU environment.- Early experience on using Knights Landing processors for Lattice Boltzmann applications.-
Soft Computing with Applications.- Towards a Model of Semi-Supervised Learning for the Syntactic Pattern Recognition-based Electrical Load Prediction System.- Parallel Processing of Color Digital Images for Linguistic Description of Their Content.- Co-evolution of fitness predictors and Deep Neural Networks.- Performance evaluation of DBN learning on Intel multi- and manycore architectures.-
Special Session on Parallel Matrix Factorizations.- On the Tunability of a New Hessenberg Reduction Algorithm Using Parallel Cache Assignment.- New Preconditioning for the One–Sided Block–Jacobi SVD Algorithm.- Structure-preserving technique in the block SS–Hankel method for solving Hermitian generalized eigenvalue problems.- On using the Cholesky QR method in the full-blocked one-sided Jacobi algorithm.- Parallel Divide-and-Conquer Algorithm for Solving Tridiagonal Eigenvalue Problems on Manycore Systems.- Partial Inverses of Complex Block Tridiagonal Matrices.- Parallel Nonnegative Matrix Factorization Based on Newton Iteration with Improved Convergence Behavior.