Li, Yi ()
I am an Associate Professor in the Division of Mathematical Sciences at Nanyang Technological University.
Before joining NTU, I was a postdoc at Harvard University (hosted by Jelani Nelson), a postdoc at the Max-Planck Institute for Informatics in Saarbruecken, Germany and a research fellow at the Simons Institute for the Theory of Computing. I received a PhD from the University of Michigan in 2013 and a BEng from the ACM Class of Shanghai Jiaotong University in 2008, both in computer science and engineering.
Postal address: SPMS-MAS-05-17, Division of Mathematical Sciences, 21 Nanyang Link, Singapore 637371.
Email:
Research Interests
Algorithms for massive data sets, data streaming algorithms
Low-distortion metric embeddings
Compressive sensing and signal processing
Theoretical computer science
Manuscripts
- Cheng Chen, Yi Li, and Yiming Sun. Online Active Regression.
arXiv:2207.05945
(This version supersedes, and significantly improves, [C31])
- Yi Li and Mingmou Liu. Sparsity-Dimension Trade-Offs for Oblivious Subspace Embeddings.
arXiv:2212.02913
Journal Publications
- Zhengyang Guo, Yi Li, and Shaoyu Pei. Expected Size of Random Tukey Layers and Convex Layers.
Computational Geometry: Theory and Applications 103, Article No. 101856, Apr 2022. pdf
- Yi Li, Ruosong Wang, and David Woodruff. Tight Bounds for the Subspace Sketch Problem with Applications.
SIAM J. Comp. 50(4), pp 1287--1335, 2021. pdf
(This version supersedes [C19])
- Yi Li and Vasileios Nakos. Sublinear-Time Algorithms for Compressive Phase Retrieval.
IEEE Trans. Info. Theory, 66(11), pp 7302--7310, 2020. pdf
(This paper supersedes, and is radically different from, [C14])
- Yi Li, Huy Le Nguyen, and David Woodruff. On Approximating Matrix Norms in a Stream.
SIAM J. Comput., 48(6), pp 1643--1697, 2019. pdf
(This version supersedes [C4], [C9] and part of [C11])
- Sudipto Guha, Yi Li, and Qin Zhang. Distributed Partial Clustering.
ACM Transactions on Parallel Computing 6(3), pp 11:1--11:20, October 2019. (Special issue on SPAA 2017) pdf
(This version supersedes [C12])
- Anna Gilbert, Yi Li, Ely Porat, and Martin Strauss.
For-all Sparse Recovery in Near-Optimal Time.
ACM Transactions on Algorithms 13(3), pp 32:1--32:26, August 2017. pdf
(This version supersedes [C6])
- Petros Boufounos, Volkan Cevher, Anna Gilbert, Yi Li, and Martin Strauss. What's the Frequency, Kenneth?: Sublinear Fourier Sampling Off the Grid.
Algorithmica 73(2), pp 261-288, 2015. pdf
(This version supersedes [C2])
Update: A small tweak in the hashing lemma shows that diluting S^1 by k/η, instead of 1/η, would be enough. The sampling duration can be brought down to 1/η from k/η.
- Anna Gilbert, Yi Li, Ely Porat, and Martin Strauss. Approximate Sparse Recovery: Optimizing Time and Measurements.
SIAM J. Comput. 41(2), pp 436-453, 2012. pdf
(This version supersedes [C1])
Conference Publications
- Aarshvi Gajjar, Wai Ming Tai, Xingyu Xu, Chinmay Hegde, Yi Li, Christopher Musco. Agnostic Active Learning of Single Index Models with
Linear Sample Complexity.
Proceedings of Machine Learning Research 247:1715--1754, 2024. (Proceedings of COLT 2024) arXiv:2405.09312
- Sheng-Jun Huang, Yi Li, Yiming Sun, and Ying-Peng Tang. One-shot Active Learning Based on Lewis Weight Sampling for Multiple Deep Models.
Proceedings of ICLR 2024. (No physical publication) Open Review link arXiv:2405.14121
- Yi Li, Honghao Lin, and David Woodruff. Optimal Sketching for Residual Error Estimation for Matrix and Vector Norms.
Proceedings of ICLR 2024. (No physical publication) Open Review link
- Yi Li, Honghao Lin, and David P. Woodruff. ℓp-Regression in the Arbitrary Partition Model of Communication.
Proceedings of Machine Learning Research 195:4902-4928, 2023. (Proceedings of COLT 2023) arXiv:2307.05117
- Yi Li, Honghao Lin, Simin Liu, Ali Vakilian, and David Woodruff. Learning the Positions in CountSketch.
Proceedings of ICLR 2023. (No physical publication) Open Review link arXiv:2306.06611
- Yi Li, Honghao Lin, and David P. Woodruff. The ℓp-Subspace Sketch Problem in Small Dimensions with Applications to Support Vector Machines.
Proceedings of SODA 2023, pp 850--877. arXiv:2211.07132
- Yi Li, Honghao Lin, David Woodruff, and Yuheng Zhang. Streaming Algorithms with Large Approximation Factors.
Proceedings of APPROX/RANDOM 2022, LIPIcs Vol. 245, pp 13:1--13:23. arXiv:2207.08075
- Cheng Chen, Yi Li, and Yiming Sun. Online Active Regression.
Proceedings of Machine Learning Research 162:3320--3335, 2022. (Proceedings of ICML 2022) Long presentation at ICML 2022.
- Yi Li and Mingmou Liu. Lower Bounds for Sparse Oblivious Subspace Embeddings.
Proceedings of PODS 2022, pp 251--260. arXiv:2112.10987
- Yi Li, Yan Song, and Qin Zhang. Learning to Cluster via Same-Cluster Queries.
Proceedings of CIKM 2021, pp 978--987. arXiv:2108.07383
- Yi Li and David P. Woodruff. The Product of Gaussian Matrices is Close to Gaussian.
Proceedings of APPROX/RANDOM 2021, LIPIcs Vol. 207, pp 35:1--35:22. arXiv:2108.09887
- Yi Li, David P. Woodruff, and Taisuke Yasuda. Exponentially Improved Dimensionality Reduction for ℓ1: Subspace Embeddings and Independence Testing.
Proceedings of Machine Learning Research 134:3111--3195, 2021. (Proceedings of COLT 2021) arXiv:2104.12946
- Yifei Jiang, Yi Li, Yiming Sun, Jiaxin Wang, and David Woodruff. Single Pass Entrywise-Transformed Low Rank Approximation.
Proceedings of Machine Learning Research 139:4982--4991, 2021. (Proceedings of ICML 2021) arxiv:2107.07889
- Zhengyang Guo and Yi Li. Geometric Cover with Outliers Removal.
Proceedings of STACS 2021, LIPIcs Vol. 187, pp 39:1--39:15. pdf
- Alexandr Andoni, Collin Burns, Yi Li, Sepideh Mahabadi, and David Woodruff. Streaming Complexity of SVMs.
Proceedings of APPROX/RANDOM 2020, LIPIcs Vol. 176, pp 50:1--50:22. arxiv:2007.03633
- Yi Li, Ruosong Wang, Lin Yang, and Hanrui Zhang. Nearly Linear Row Sampling Algorithm for Quantile Regression.
Proceedings of Machine Learning Research 119:1888--1898, 2020. (Proceedings of ICML 2020). arxiv:2006.08397
- Yi Li and David Woodruff. Input-Sparsity Low Rank Approximation in Schatten Norm.
Proceedings of Machine Learning Research 119:11124--11132, 2020. (Proceedings of ICML 2020). arxiv:2004.12646
- Yi Li and Vasileios Nakos. Deterministic Sparse Fourier Transform with an ℓ∞ Guarantee.
Proceedings of ICALP 2020, LIPIcs Vol. 168, pp 77:1--77:14. arxiv:1903.00995
- Tanqiu Jiang, Yi Li, Honghao Lin, Yisong Ruan, and David Woodruff. Learning-Augmented Data Stream Algorithms.
Proceedings of ICLR 2020. (No physical publication) Open Review link
- Yi Li, Ruosong Wang, and David Woodruff. Tight Bounds for the Subspace Sketch Problem with Applications.
Proceedings of SODA 2020, pp 1655--1674. arxiv:1904.05543
- Maria-Florina Balcan, Yi Li, David Woodruff, and Hongyang Zhang. Testing Matrix Rank, Optimally.
Proceedings of SODA 2019, pp 727--746. arxiv:1810.08171
- Yi Li and Vasileios Nakos. Deterministic Heavy Hitters with Sublinear Query Time.
Proceedings of APPROX/RANDOM 2018, LIPIcs Vol. 116, pp 18:1--18:18. arxiv:1712.01971
- Yi Li, Vasileios Nakos, and David Woodruff. On Low-Risk Heavy Hitters and Sparse Recovery Schemes.
Proceedings of APPROX/RANDOM 2018, LIPIcs Vol. 116, pp 19:1--19:13. arxiv:1709.02919
- Vladimir Braverman, Stephen Chestnut, Robert Krauthgamer, Yi Li, David Woodruff, and Lin Yang. Matrix Norms in Data Streams: Faster, Multi-Pass and Row-Order.
Proceedings of Machine Learning Research 80:648-657, 2018. (Proceedings of ICML 2018) arxiv:1609.05885
- Yi Li and Vasileios Nakos. Sublinear-Time Algorithms for Compressive Phase Retrieval.
Proceedings of ISIT 2018, pp 2301--2305.
- Yi Li and David Woodruff. Embeddings of Schatten Norms with Applications to Data Streams.
Proceedings of ICALP 2017, pp 60:1--60:14. pdf
- Sudipto Guha, Yi Li, and Qin Zhang. Distributed Partial Clustering.
Proceedings of SPAA 2017, pp 143--152. Co-winner of the Best Paper award.
- Yi Li and David Woodruff. Tight Bounds for Sketching the Operator Norm, Schatten Norms, and Subspace Embeddings.
Proceedings of APPROX/RANDOM 2016, LIPIcs Vol. 60, 39:1--39:11. arXiv:2202.09797
- Yuqing Ai, Wei Hu, Yi Li, and David Woodruff. New Characterizations in Turnstile Streams with Applications.
Proceedings of CCC 2016, pp 20:1--20:22. pdf
- Yi Li and David Woodruff. On Approximating Functions of the Singular Values in a Stream.
Proceedings of STOC 2016, pp 767--780.
- Yi Li, Xiaoming Sun, Chengu Wang, and David Woodruff.
On The Communication Complexity of Linear Algebraic Problems in the Message Passing Model.
Proceedings of DISC 2014, pp 499--513. Full version: arxiv:1407.4755
- Yi Li, Zhengyu Wang, and David Woodruff. Improved Testing of Low Rank Matrices.
Proceedings of SIGKDD 2014, pp 691--700. One of nine best papers.
- Anna Gilbert, Yi Li, Ely Porat, and Martin Strauss.
For-all Sparse Recovery in Near-Optimal Time.
Proceedings of ICALP 2014, LNCS 8572, pp 538--550
- Yi Li, Huy Le Nguyen, and David Woodruff. Turnstile
Streaming Algorithms Might as Well Be Linear Sketches.
Proceedings of STOC 2014, pp 174--183. pdf
- Yi Li, Huy Le Nguyen, and David Woodruff. On Sketching
Matrix Norms and the Top Singular Vector.
Proceedings of SODA 2014, pp 1562--1581.
- Yi Li and David Woodruff. A Tight Lower
Bound for High Frequency Moment Estimation for Small
Error.
Proceedings of APPROX/RANDOM 2013, LNCS 8906, pp
623--638. pdf of full version
- Petros Boufounos, Volkan Cevher, Anna Gilbert, Yi Li, and
Martin Strauss. What's the Frequency, Kenneth?: Sublinear
Fourier Sampling Off the Grid.
Proceedings of APPROX/RANDOM 2012, LNCS 7408, pp 61-72.
- Anna Gilbert, Yi Li, Ely Porat, and Martin Strauss.
Approximate Sparse Recovery: Optimizing Time and
Measurements.
Proceedings of STOC 2010, pp 475-484.
Grants
- MOE AcRF Tier 2. Feb 2023 -- Jan 2026. Amount awarded: 489,600 SGD.
- MOE AcRF Tier 1. Mar 2022 -- Aug 2024. Amount awarded: 189,000 SGD.
- MOE AcRF Tier 2. Jan 2019 -- Mar 2022. Amount awarded: 593,050 SGD.
Talks (excluding conference presentations)
The lp-Subspace Sketch Problem. Basic Algorithms Research Copenhagen (BARC), University of Copenhagen, Mar 2023.
Lower Bounds for Sparse Oblivious Subspace Embeddings. Information Theory and Data Science Workshop. Institute of Mathematical Sciences, NUS, Jan 2023.
The lp-Subspace Sketch Problem. National University of Singapore, Dec 2022.
Lower Bounds for Sparse Oblivious Subspace Embeddings. Workshop on Algorithms and Foundations for Data Science. Institute of Mathematical Sciences, NUS, Jun 2022.
Some Linear Algebraic Problems from TCS Perspective. Shanghai Jiaotong University, Nov 2020.
The lp-Subspace Sketch Problem with Applications to Non-embeddability. Shanghai Jiaotong University, May 2019 and Xiamen University, Jun 2019.
The lp-Subspace Sketch Problem with Applications. Nanjing University, May 2019.
Matrix-related Problems in Data Streams. Workshop on Random Matrices, Stochastic Geometry and Related Topics. Institute of Mathematical Sciences, NUS, 2019.
Estimating the lp and Schatten Norms in Data Stream Model. NUS Department of Statistics Seminar Talk, 2019.
Estimating the Schatten Norms in Streaming Model. Shanghai Jiaotong University, 2018.
Estimating the Schatten Norms in Streaming Model. NII Shonan Meeting "Processing Big Data Streams", 2017.
Improved Sparse Recovery Algorithms. Dagstuhl Seminar 17181 "Theory and Applications of Hashing", 2017.
Sublinear-time Algorithms for the Sparse Recovery Problem. Workshop on Sparse Representations & Compressive Sensing, 2017.
Estimating the Schatten Norms in Streaming Model. DIMACS Workshop on E+M=C2, 2017.
Introduction to the Data Stream Algorithms. Xiamen University, 2016
Introduction to the Sparse Recovery Problem. Fuzhou University, 2016
Introduction to the Data Stream Algorithms. Math Colloquium, Nanyang Technological University, 2016.
Data Streaming Algorithms. MAS Seminar, Nanyang Technological University, 2015.
For-all Sparse Recovery in Near-Optimal Time. Theory of Computation Seminar, Harvard University, 2015.
A Brief Introduction to the Sublinear-time Sparse Recovery Problem. MPI for Informatics, 2014.
Sublinear Fourier Sampling Off the Grid. Workshop
on Sparse Fourier Transform, MIT, 2013.
Approximate Sparse Recovery: Optimizing Time and
Measurements. Minisymposium
on Combinatorics and Data Science, Shanghai Jiaotong
University, 2011.
Approximate Sparse Recovery: Optimizing Time and
Measurements. DIMACS
Workshop on Network Data Streaming and Compressive
Sensing, 2010.
Teaching
- At Nanyang Technological University
- MH1812. Discrete Mathematics.
- Autumn 2020*, 2021*, 2022*, 2023*, 2024*. (* denotes co-teaching with Jian Guo)
- MH2500. Introduction to Probability and Statistics.
- Autumn 2017*, 2018*, 2019. (* denotes co-teaching with Chan Song Heng)
- MH7009 (old code: MAS723). Topics in Probability and Statistics I: High-dimensional Probability
- MH4520. High-dimensional Probability.
- MH2401. Algorithms and Computing III.
- Autumn 2016. (co-teaching with Fedor Duzhin)
- At Shanghai Jiaotong University
- Summer 2018, 2021, 2022. Probability and Computing.
Services
Miscellaneous