My research focuses on efficient and effective representations for large-scale search engines, including indexing, compression, and retrieval. I am also interested in understanding how to measure improvements in the end-to-end search pipeline, including system-oriented effectiveness measurements and user behaviour analysis. I have a broad interest in empirical experimentation, operating systems, data structures, and algorithms. Please see my personal site for more details.
2024
What do Users Really Ask Large Language Models? An Initial Log Analysis of Google Bard Interactions in the Wild
Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
·
10 Jul 2024
·
10.1145/3626772.3657914
ReNeuIR at SIGIR 2024: The Third Workshop on Reaching Efficiency in Neural Information Retrieval
Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
·
10 Jul 2024
·
10.1145/3626772.3657994
Revisiting Document Expansion and Filtering for Effective First-Stage Retrieval
Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
·
10 Jul 2024
·
10.1145/3626772.3657850
How much freedom does an effectiveness metric really have?
Journal of the Association for Information Science and Technology
·
15 Feb 2024
·
10.1002/asi.24874
2023
Lossy Compression Options for Dense Index Retention
Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region
·
26 Nov 2023
·
10.1145/3624918.3625316
Exploring the Representation Power of SPLADE Models
Proceedings of the 2023 ACM SIGIR International Conference on Theory of Information Retrieval
·
09 Aug 2023
·
10.1145/3578337.3605129
ReNeuIR at SIGIR 2023: The Second Workshop on Reaching Efficiency in Neural Information Retrieval
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval
·
18 Jul 2023
·
10.1145/3539618.3591922
Profiling and Visualizing Dynamic Pruning Algorithms
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval
·
18 Jul 2023
·
10.1145/3539618.3591806
Efficient immediate-access dynamic indexing
Information Processing & Management
·
01 May 2023
·
10.1016/j.ipm.2022.103248
Index-Based Batch Query Processing Revisited
Lecture Notes in Computer Science
·
01 Jan 2023
·
10.1007/978-3-031-28241-6_6
2022
Efficient Document-at-a-Time and Score-at-a-Time Query Evaluation for Learned Sparse Representations
ACM Transactions on Information Systems
·
15 Dec 2022
·
10.1145/3576922
Immediate-Access Indexing Using Space-Efficient Extensible Arrays
Proceedings of the 26th Australasian Document Computing Symposium
·
15 Dec 2022
·
10.1145/3572960.3572984
A Common Framework for Exploring Document-at-a-Time and Score-at-a-Time Retrieval Methods
Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
·
06 Jul 2022
·
10.1145/3477495.3531657
Faster Learned Sparse Retrieval with Guided Traversal
Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
·
06 Jul 2022
·
10.1145/3477495.3531774
A Flexible Framework for Offline Effectiveness Metrics
Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
·
06 Jul 2022
·
10.1145/3477495.3531924
Report on the 25th Australasian Document Computing Symposium (ADCS 2021)
ACM SIGIR Forum
·
01 Jun 2022
·
10.1145/3582524.3582532
Efficient query processing techniques for next-page retrieval
Information Retrieval Journal
·
18 Jan 2022
·
10.1007/s10791-021-09402-7
A Sensitivity Analysis of the MSMARCO Passage Collection
arXiv
·
12 Jan 2022
·
arXiv:2112.03396
Tradeoff Options for Bipartite Graph Partitioning
IEEE Transactions on Knowledge and Data Engineering
·
01 Jan 2022
·
10.1109/tkde.2022.3208902
Accelerating Learned Sparse Indexes Via Term Impact Decomposition
Findings of the Association for Computational Linguistics: EMNLP 2022
·
01 Jan 2022
·
10.18653/v1/2022.findings-emnlp.205
2021
Cost-Effective Updating of Distributed Reordered Indexes
Proceedings of the 25th Australasian Document Computing Symposium
·
09 Dec 2021
·
10.1145/3503516.3503528
Anytime Ranking on Document-Ordered Indexes
ACM Transactions on Information Systems
·
08 Sep 2021
·
10.1145/3467890
Modality Effects When Simulating User Querying Tasks
Proceedings of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval
·
11 Jul 2021
·
10.1145/3471158.3472244
ERR is not C/W/L: Exploring the Relationship Between Expected Reciprocal Rank and Other Metrics
Proceedings of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval
·
11 Jul 2021
·
10.1145/3471158.3472239
Faster Index Reordering with Bipartite Graph Partitioning
Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval
·
11 Jul 2021
·
10.1145/3404835.3462991
Different Keystrokes for Different Folks
Proceedings of the 2021 Conference on Human Information Interaction and Retrieval
·
14 Mar 2021
·
10.1145/3406522.3446054
2020
Examining the Additivity of Top-k Query Processing Innovations
Proceedings of the 29th ACM International Conference on Information & Knowledge Management
·
19 Oct 2020
·
10.1145/3340531.3412000
CC-News-En
Proceedings of the 29th ACM International Conference on Information & Knowledge Management
·
19 Oct 2020
·
10.1145/3340531.3412762
Supporting Interoperability Between Open-Source Search Engines with the Common Index File Format
Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
·
25 Jul 2020
·
10.1145/3397271.3401404
Efficiency Implications of Term Weighting for Passage Retrieval
Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
·
25 Jul 2020
·
10.1145/3397271.3401263
Managing tail latency in large scale information retrieval systems
ACM SIGIR Forum
·
01 Jun 2020
·
10.1145/3451964.3451982
2019
Boosting Search Performance Using Query Variations
ACM Transactions on Information Systems
·
04 Oct 2019
·
10.1145/3345001
Accelerated Query Processing Via Similarity Score Prediction
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval
·
18 Jul 2019
·
10.1145/3331184.3331207
Exploring User Behavior in Email Re-Finding Tasks
The World Wide Web Conference
·
13 May 2019
·
10.1145/3308558.3313450
Compressing Inverted Indexes with Recursive Graph Bisection: A Reproducibility Study
Lecture Notes in Computer Science
·
01 Jan 2019
·
10.1007/978-3-030-15712-8_22
2018
Revisiting Spam Filtering in Web Search
Proceedings of the 23rd Australasian Document Computing Symposium
·
11 Dec 2018
·
10.1145/3291992.3291999
Efficient and Effective Tail Latency Minimization in Multi-Stage Retrieval Systems
Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining
·
02 Feb 2018
·
https://arxiv.org/abs/1704.03970
Query Driven Algorithm Selection in Early Stage Retrieval
Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining
·
02 Feb 2018
·
10.1145/3159652.3159676
On the Cost of Negation for Dynamic Pruning
Lecture Notes in Computer Science
·
01 Jan 2018
·
10.1007/978-3-319-76941-7_42
2017
Early Termination Heuristics for Score-at-a-Time Index Traversal
Proceedings of the 22nd Australasian Document Computing Symposium
·
07 Dec 2017
·
10.1145/3166072.3166073
Managing Tail Latencies in Large Scale IR Systems
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval
·
07 Aug 2017
·
10.1145/3077136.3084152
A Comparison of Document-at-a-Time and Score-at-a-Time Query Evaluation
Proceedings of the Tenth ACM International Conference on Web Search and Data Mining
·
02 Feb 2017
·
10.1145/3018661.3018726
2015
Efficient Location-Aware Web Search
Proceedings of the 20th Australasian Document Computing Symposium
·
08 Dec 2015
·
10.1145/2838931.2838933