Invited Workshop on Compiler Techniques for Sparse Tensor Algebra
Cambridge, MA, January 26th
Organizers: Fredrik Kjolstad and Saman Amarasinghe
There has been significant recent interest and innovation in compiler technology for high-performance sparse tensor computation. This workshop will bring together researchers and practitioners from academia and industry to discuss individual approaches, how they relate, and whether the ideas can be combined.
Program
There will be eight sessions of three to four speakers each. Each session consists of ten-minute talks followed by a twenty-minute discussion between the speakers and the room.

Time | Session |
---|---|
8:00 - 8:20 | Registration |
8:20 - 8:30 | Welcome |
8:30 - 9:20 | Sparse Algorithms and Data Structures I. Chaired by Saman Amarasinghe (MIT). Speakers: Jiajia Li (Pacific Northwest National Laboratory), Shaden Smith (Intel Labs), Edgar Solomonik (University of Illinois at Urbana-Champaign) |
9:20 - 9:30 | Break |
9:30 - 10:20 | Sparse Algorithms and Data Structures II. Chaired by Maryam Dehnavi (University of Toronto). Speakers: Muthu Baskaran (Reservoir Labs), Jee Choi (University of Oregon), Duane Merrill and Michael Garland (NVIDIA Research) |
10:20 - 10:30 | Break |
10:30 - 11:20 | Sparse Polyhedral Optimization. Chaired by Shoaib Kamil (Adobe Research). Speakers: Mary Hall (University of Utah), Cathie Olschanowsky (Boise State University), Michelle Strout (University of Arizona) |
11:20 - 11:30 | Break |
11:30 - 12:30 | Tensor Compilers I. Chaired by Stephen Chou (MIT). Speakers: Dougal Maclaurin (Google), P. (Saday) Sadayappan (Ohio State University), Hongbo Rong (Intel Labs), Keno Fischer (Julia Computing) |
12:30 - 1:30 | Lunch |
1:30 - 2:20 | Tensor Compilers II. Chaired by Chris Fletcher (University of Illinois at Urbana-Champaign). Speakers: Maryam Dehnavi (University of Toronto), Fredrik Kjolstad (MIT), Stephen Chou (MIT) |
2:20 - 2:35 | Break |
2:35 - 3:25 | Tensors and Graphs. Chaired by Fredrik Kjolstad (MIT). Speakers: Justin Solomon (MIT), Jeremy Kepner (Lincoln Laboratory), Julian Shun (MIT) |
3:25 - 3:40 | Break |
3:40 - 4:30 | Hardware for Sparse Tensor Algebra. Chaired by Joel Emer (NVIDIA Research and MIT). Speakers: Michael Pellauer (NVIDIA Research), Hanrui Wang (MIT), Chris Fletcher (University of Illinois at Urbana-Champaign) |
4:30 - 4:45 | Break |
4:45 - 5:35 | Applications in Science and Engineering. Chaired by Peter Ahrens (MIT). Speakers: Srinivas Eswar (Georgia Tech), Richard Mills (Argonne National Laboratory), Miles Stoudenmire (Flatiron Institute) |
5:35 - 6:00 | Discussion |
6:00 - 8:00 | Dinner and Poster Session |
Papers
The following is a list of papers that were suggested by workshop participants.
Topic | Paper |
---|---|
Run-time parallelization in a polyhedral framework | Automating wavefront parallelization for sparse matrix computations. Anand Venkat, Mahdi Soltan Mohammadi, Jongsoo Park, Hongbo Rong, Rajkishore Barik, Michelle Mills Strout, and Mary Hall. SC 2016. https://ieeexplore.ieee.org/document/7877119/ |
Uninterpreted functions in a polyhedral framework | Non-affine Extensions to Polyhedral Code Generation. Anand Venkat, Manu Shantharam, Mary Hall, and Michelle Mills Strout. CGO 2014. https://doi.org/10.1145/2581122.2544141 |
Data transformations in a polyhedral framework | Loop and data transformations for sparse matrix code. Anand Venkat, Mary Hall, and Michelle Strout. PLDI 2015. https://doi.org/10.1145/2737924.2738003 |
NNCP algorithms | Fast alternating LS algorithms for high order CANDECOMP/PARAFAC tensor factorizations. Anh-Huy Phan, Petr Tichavsky, and Andrzej Cichocki. Signal Processing 2013. |
NNCP algorithms | Nesterov-based parallel algorithm for large-scale nonnegative tensor factorization. Athanasios P. Liavas, Georgios Kostoulas, Georgios Lourakis, Kejun Huang, and Nicholas D. Sidiropoulos. ICASSP 2017. |
Storage formats for tensor operations: GPUs | A Unified Optimization Approach for Sparse Tensor Operations on GPUs. B. Liu, C. Wen, A. Sarwate, and M. Mehri Dehnavi. IEEE Cluster. |
Sparse tensor optimizations for tensor decompositions used for sparse count data | Memory-efficient parallel tensor decompositions. M. Baskaran, T. Henretty, B. Pradelle, M. H. Langston, D. Bruns-Smith, J. Ezick, and R. Lethin. HPEC 2017. https://www.reservoir.com/publication/memory-efficient-parallel-tensor-decompositions/ |
Mode-specific and mode-generic tensor formats in ENSIGN | Efficient and scalable computations with sparse tensors. M. Baskaran, B. Meister, N. Vasilache, and R. Lethin. HPEC 2012. https://www.reservoir.com/publication/efficient-scalable-computations-sparse-tensors/ |
FROSTT | The Formidable Repository of Open Sparse Tensors and Tools. Shaden Smith, Jee W. Choi, Jiajia Li, Richard Vuduc, Jongsoo Park, Xing Liu, and George Karypis. http://frostt.io/ |
Merge-based parallelism for load balancing | Merge-based Parallel Sparse Matrix-Vector Multiplication. Duane Merrill and Michael Garland. SC 2016. https://ieeexplore.ieee.org/document/7877136 |
Cyclops Tensor Framework (distributed-memory tensor algebra) | A massively parallel tensor contraction framework for coupled-cluster computations. Edgar Solomonik, Devin Matthews, Jeff R. Hammond, John F. Stanton, and James Demmel. JPDC 2014. http://www.sciencedirect.com/science/article/pii/S074373151400104X |
Sparse tensor contractions (matrix multiplication) and graph algorithms with Cyclops | Scaling betweenness centrality using communication-efficient sparse matrix multiplication. Edgar Solomonik, Maciej Besta, Flavio Vella, and Torsten Hoefler. SC 2017. https://arxiv.org/abs/1609.07008 |
TACO system | The Tensor Algebra Compiler. Fredrik Kjolstad, Shoaib Kamil, Stephen Chou, David Lugato, and Saman Amarasinghe. OOPSLA 2017. http://fredrikbk.com/oopsla17.html |
Communication lower bounds for MTTKRP | Communication lower bounds for matricized tensor times Khatri-Rao product. Grey Ballard, Nicholas Knight, and Kathryn Rouse. IPDPS 2018. https://arxiv.org/pdf/1708.07401.pdf |
Specifying sparse computations for FPGAs and CGRAs | Expressing Sparse Matrix Computations for Productive Performance on Spatial Architectures. Hongbo Rong. arXiv 2018. https://arxiv.org/abs/1810.07517 |
HBL upper bounds | Parallelepipeds obtaining HBL lower bounds. James Demmel and Alex Rusciano. CoRR 2016. https://arxiv.org/pdf/1611.05944.pdf |
Cache blocking for sparse tensors | Blocking Optimization Strategies for Sparse Tensor Computation. Jee Choi, Xing Liu, Shaden Smith, and Tyler Simon. IPDPS 2018. https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxqZWV3aGFuY2hvaXBlcnNvbmFsfGd4OjZkNDBlMmU0NGYxZWQzYzI |
HiCOO format | HiCOO: Hierarchical Storage of Sparse Tensors. Jiajia Li, Jimeng Sun, and Richard Vuduc. SC 2018. http://fruitfly1026.github.io/static/files/sc18-li.pdf |
PASTA benchmark suite | PASTA: A Parallel Sparse Tensor Algorithm Benchmark Suite. Jiajia Li, Yuchen Ma, Xiaolong Wu, Ang Li, and Kevin Barker. Technical Report 2019. http://fruitfly1026.github.io/static/files/pasta.pdf |
Graph processing frameworks | Ligra: A Lightweight Graph Processing Framework for Shared Memory. Julian Shun and Guy Blelloch. PPoPP 2013. https://people.csail.mit.edu/jshun/ligra.pdf |
Code generation for sparse matrix methods | ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism. K. Cheshmi, S. Kamil, M. Strout, and M. Mehri Dehnavi. SC 2018. |
Code generation for sparse matrix methods | Sympiler: Transforming Sparse Matrix Codes by Decoupling Symbolic Analysis. K. Cheshmi, S. Kamil, M. Strout, and M. Mehri Dehnavi. SC 2017. |
Data transformations in real-world code | Optimizing LOBPCG: Sparse Matrix Loop and Data Transformations in Action. Khalid Ahmad, Anand Venkat, and Mary W. Hall. LCPC 2016. https://doi.org/10.1007/978-3-319-52709-3_17 |
Dense MTTKRP | Shared-memory parallelization of MTTKRP for dense tensors. Koby Hayashi, Grey Ballard, Yujie Jiang, and Michael J. Tobia. PPoPP 2018. |
Graph processing frameworks | Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable. Laxman Dhulipala, Guy Blelloch, and Julian Shun. SPAA 2018. https://arxiv.org/abs/1805.05208 |
Graph processing frameworks | Julienne: A Framework for Parallel Graph Algorithms using Work-efficient Bucketing. Laxman Dhulipala, Guy Blelloch, and Julian Shun. SPAA 2017. https://people.csail.mit.edu/jshun/bucketing.pdf |
Generalized CP streaming framework in ENSIGN | Computationally Efficient CP Tensor Decomposition Update Framework for Emerging Component Discovery in Streaming Data. P. D. Letourneau, M. Baskaran, T. Henretty, J. Ezick, and R. Lethin. HPEC 2018. https://www.reservoir.com/publication/cp-tensor-decomposition-update-framework/ |
Alternating least squares for tensor decomposition | Accelerating alternating least squares for tensor decomposition by pairwise perturbation. Linjian Ma and Edgar Solomonik. arXiv 2018. https://arxiv.org/abs/1811.10573 |
Survey/overview | The Sparse Polyhedral Framework: Composing Compiler-Generated Inspector-Executor Code. M. M. Strout, M. Hall, and C. Olschanowsky. Proceedings of the IEEE 2018. https://doi.org/10.1109/JPROC.2018.2857721 |
Loop upper bounds | Communication-Optimal Loops. Nick Knight. PhD thesis 2015. |
Finding good block sizes in constant time | A Fill Estimation Algorithm for Sparse Matrices and Tensors in Blocked Formats. Peter Ahrens, Helen Xu, and Nicholas Schiefer. IPDPS 2018. https://doi.org/10.1109/IPDPS.2018.00064 |
CSF formalization | Tensor-Matrix Products with a Compressed Sparse Tensor. Shaden Smith and George Karypis. IA^3 2015. http://shaden.io/pub-files/smith2015csf.pdf |
TTMc optimization | Accelerating the Tucker Decomposition with Compressed Sparse Tensors. Shaden Smith and George Karypis. Euro-Par 2017. http://shaden.io/pub-files/smith2017tucker.pdf |
SPLATT - CSF data structure and MTTKRP | SPLATT: Efficient and parallel sparse tensor-matrix multiplication. Shaden Smith, Niranjay Ravindran, Nicholas D. Sidiropoulos, and George Karypis. IPDPS 2015. http://shaden.io/pub-files/smith2015splatt.pdf |
Generating code for computing with disparate tensor formats | Format Abstraction for Sparse Tensor Algebra Compilers. Stephen Chou, Fredrik Kjolstad, and Saman Amarasinghe. OOPSLA 2018. https://dl.acm.org/citation.cfm?doid=3288538.3276493 |
High-performance sparse graph computations | GraphIt: a high-performance graph DSL. Yunming Zhang, Mengjiao Yang, Riyadh Baghdadi, Shoaib Kamil, Julian Shun, and Saman Amarasinghe. OOPSLA 2018. https://dl.acm.org/citation.cfm?id=3276491 |
Storage formats for tensor operations: Cloud | CSTF: Large-Scale Sparse Tensor Factorizations on Distributed Platforms. Z. Blanco, B. Liu, and M. Mehri Dehnavi. ICPP. |
Venue
Room 32-155
MIT Ray and Maria Stata Center
32 Vassar St
Cambridge, MA 02139