Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU VW Lee, C Kim, J Chhugani, M Deisher, D Kim, AD Nguyen, N Satish, ... ACM SIGARCH Computer Architecture News 38 (3), 451-460, 2010 | 1214 | 2010 |
Clearpath: highly parallel collision avoidance for multi-agent simulation SJ Guy, J Chhugani, C Kim, N Satish, M Lin, D Manocha, P Dubey Proceedings of the 2009 ACM SIGGRAPH/Eurographics Symposium on Computer …, 2009 | 478 | 2009 |
Sort vs. Hash revisited: fast join implementation on modern multi-core CPUs C Kim, T Kaldewey, VW Lee, E Sedlar, AD Nguyen, N Satish, J Chhugani, ... Proceedings of the VLDB Endowment 2 (2), 1378-1389, 2009 | 439 | 2009 |
FAST: fast architecture sensitive tree search on modern CPUs and GPUs C Kim, J Chhugani, N Satish, E Sedlar, AD Nguyen, T Kaldewey, VW Lee, ... Proceedings of the 2010 international conference on Management of data, 339-350, 2010 | 438 | 2010 |
3.5-D blocking optimization for stencil computations on modern CPUs and GPUs A Nguyen, N Satish, J Chhugani, C Kim, P Dubey Proceedings of the 2010 ACM/IEEE International Conference for High …, 2010 | 399 | 2010 |
PLEdestrians: A Least-Effort Approach to Crowd Simulation. SJ Guy, J Chhugani, S Curtis, P Dubey, MC Lin, D Manocha Symposium on computer animation, 119-128, 2010 | 374 | 2010 |
Efficient implementation of sorting on multi-core SIMD CPU architecture J Chhugani, AD Nguyen, VW Lee, W Macy, M Hagog, YK Chen, A Baransi, ... Proceedings of the VLDB Endowment 1 (2), 1313-1324, 2008 | 334 | 2008 |
Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD sort N Satish, C Kim, J Chhugani, AD Nguyen, VW Lee, D Kim, P Dubey Proceedings of the 2010 international conference on Management of data, 351-362, 2010 | 319 | 2010 |
DySER: Unifying functionality and parallelism specialization for energy-efficient computing V Govindaraju, CH Ho, T Nowatzki, J Chhugani, N Satish, ... IEEE Micro 32 (5), 38-51, 2012 | 311 | 2012 |
Second life and the new generation of virtual worlds S Kumar, J Chhugani, C Kim, D Kim, A Nguyen, P Dubey, C Bienia, Y Kim Computer 41 (9), 46-53, 2008 | 274 | 2008 |
Fast updates on read-optimized databases using multi-core CPUs J Krueger, C Kim, M Grund, N Satish, D Schwalb, J Chhugani, H Plattner, ... arXiv preprint arXiv:1109.6885, 2011 | 192 | 2011 |
Compression-tolerant watermarking scheme for image authentication S Agarwal, A Aggarwal, HS Bassali, J Chhugani, PK Dubey US Patent 6,246,777, 2001 | 154 | 2001 |
Can traditional programming bridge the ninja performance gap for parallel computing applications? N Satish, C Kim, J Chhugani, H Saito, R Krishnaiyer, M Smelyanskiy, ... ACM SIGARCH Computer Architecture News 40 (3), 440-451, 2012 | 149 | 2012 |
Convergence of recognition, mining, and synthesis workloads and its implications YK Chen, J Chhugani, P Dubey, CJ Hughes, D Kim, S Kumar, VW Lee, ... Proceedings of the IEEE 96 (5), 790-807, 2008 | 148 | 2008 |
PALM: Parallel architecture-friendly latch-free modifications to B+ trees on many-core processors J Sewall, J Chhugani, C Kim, N Satish, P Dubey Proceedings of the VLDB Endowment 4 (11), 795-806, 2011 | 147 | 2011 |
Fast and efficient graph traversal algorithm for CPUs: Maximizing single-node efficiency J Chhugani, N Satish, C Kim, J Sewall, P Dubey 2012 IEEE 26th International Parallel and Distributed Processing Symposium …, 2012 | 120 | 2012 |
Mapping high-fidelity volume rendering for medical imaging to CPU, GPU and many-core architectures M Smelyanskiy, D Holmes, J Chhugani, A Larson, DM Carmean, ... Visualization and Computer Graphics, IEEE Transactions on 15 (6), 1563-1570, 2009 | 113 | 2009 |
Compressing large boolean matrices using reordering techniques D Johnson, S Krishnan, J Chhugani, S Kumar, S Venkatasubramanian Proceedings of the Thirtieth international conference on Very large data …, 2004 | 104 | 2004 |
Vector instructions to enable efficient synchronization and parallel reduction operations M Smelyanskiy, S Kumar, D Kim, J Chhugani, C Kim, CJ Hughes, VW Lee, ... US Patent 9,513,905, 2016 | 102 | 2016 |
Matrix factorizations at scale: A comparison of scientific data analytics in Spark and C+MPI using three case studies A Gittens, A Devarakonda, E Racah, M Ringenburg, L Gerhardt, ... 2016 IEEE International Conference on Big Data (Big Data), 204-213, 2016 | 86 | 2016 |