Simon J. Puglisi

Simon J. Puglisi always looks tired these days
[Simon J. Puglisi always looks tired these days.]
ARC Postdoctoral Fellow, January 2008 -
School of Computer Science & Information Technology
Royal Melbourne Institute of Technology
GPO Box 2476V,
Melbourne, VIC 3001
Australia
email: sjp [at] cs [.] rmit [.] edu [.] au
phone: +61 3 9925 3167
facsimile: +61 3 9662 1617
office: Building 14, Level 9, Room 3
I'm a postdoc at RMIT in the School of Computer Science & IT and a member of the Search Engine Group.

Research Interests

Just about any algorithmic problem; particularly, those to do with text; even more particularly, those problems relating to large texts and text collections; even more particularly...
Algorithms and Data Structures for information retrieval
Combinatorics and periodicity properties of strings
The Burrows-Wheeler transformation
Suffix Array construction and use
Compressed full text indexing
Lossless data compression
Computational biology

Refereed Publications

2010

31) Travis Gagie, Gonzalo Navarro, Simon J. Puglisi, Colored Range Queries and Document Retrieval, Proceedings of the 17th Symposium on String Processing and Information Retrieval (SPIRE'10), Edgar Chavez, Stefano Lonardi (Eds.), to appear (2010).

30) Shanika Kuruppu, Simon J. Puglisi & Justin Zobel, Relative Lempel-Ziv Compression of Genomes for Large-Scale Storage and Retrieval, Proceedings of the 17th Symposium on String Processing and Information Retrieval (SPIRE'10), Edgar Chavez, Stefano Lonardi (Eds.), to appear (2010).

29) Juha Karkkainen & Simon J. Puglisi, Medium-Space Algorithms for Inverse BWT, Proceedings of the 18th European Symposium on Algorithms (ESA'10), Mark de Berg, Ulrich Meyer (Eds.), to appear (2010).

28) J. Shane Culpepper, Gonzalo Navarro, Simon J. Puglisi & Andrew Turpin, Top-k Ranked Document Search in General Text Databases, Proceedings of the 18th European Symposium on Algorithms (ESA'10), Mark de Berg, Ulrich Meyer (Eds.), to appear (2010).

27) Jasbir Dhaliwal, Simon J. Puglisi & Andrew Turpin, Practical Efficient String Mining, IEEE Transactions on Knowledge and Data Engineering (TKDE), in submission (2010).

26) Simon J. Puglisi, W. F. Smyth & Munina Yusufu, Fast optimal algorithms for computing all the repeats in a string, Mathematics in Computer Science (MCS), Special Issue on Combinatorial Algorithms, 3-4 (2010) pp. 373- [link to paper].

25) Mingfang Wu, Andrew Turpin, Simon J. Puglisi, Falk Scholer, James A. Thom, Presenting Query Aspects to Support Exploratory Search, Australasian User Interface Conference (AUIC'10), Brisbane, Australia, pp. 27-36 (2010).

2009

24) Jan Schroder, Heiko Schroder, Simon J. Puglisi, Ranjan Sinha & Bertil Schmidt, SHREC: A short-read error correction method, Bioinformatics, 25-17 (2009) pp. 2157-2163.[link to paper]

23) Bertil Schmidt, Bryan Beresford-Smith, Ranjan Sinha & Simon J. Puglisi, A fast hybrid short read fragment assembly algorithm, Bioinformatics, 25-17 (2009) pp. 2279-2280.[link to paper]

22) Ryszard Janicki, Simon J. Puglisi, M. Sohel Raman, Guest editors' preface, Fundamenta Informaticae, Special Issue on String Algorithms, 97-3 (2009) pp. i-ii.

21) Travis Gagie, Simon J. Puglisi & Andrew Turpin, Range quantile queries: another virtue of wavelet trees, Proceedings of the 16th Symposium on String Processing and Information Retrieval (SPIRE'09), J. Karlgren, J. Tarhio, H. Hyyro (Eds.), LNCS 5721, pp. 1-6. Springer, Heidelberg (2009).

20) Juha Karkkainen, Giovanni Manzini & Simon J. Puglisi, Permuted longest-common-prefix array, Proceedings of the 20th Annual Symposium on Combinatorial Pattern Matching (CPM '09), G. Kucherov and E. Ukkonen (Eds.), LNCS 5577, pp. 181-192. Springer, Heidelberg (2009).[link to paper]

19) Serdar Boztas, Simon J. Puglisi & Andrew Turpin, Testing stream ciphers by finding the longest substring of a given density, Proceedings of the 14th Australasian Conference on Information Security and Privacy (ACISP '09), C. Boyd, J. M. Gonzalez Nieto (Eds.) LNCS 5594, pp. 122-133. Springer, Heidelberg (2009).

18) Alistair Moffat, Simon J. Puglisi & Ranjan Sinha, Reducing the space requirements for disk resident suffix arrays, Proceedings of the 14th International Conference on Database Systems for Advanced Applications (DASFAA '09), X. Zhou, H. Yokota, K. Deng, Q. Liu (Eds.) LNCS 5463, pp. 730-744. Springer, Heidelberg (2009).

17) Yohannes Tsegay, Simon J. Puglisi, Andrew Turpin & Justin Zobel, Document compaction for efficient query biased snippet generation, Proceedings of the 31st European Conference on Information Retrieval (ECIR '09), M. Boughanem, C. Berrut, J. Mothe, C. Soule-Dupuy (Eds.) LNCS 5478, pp. 509-520. Springer, Heidelberg (2009).

2008

16) Jamie Simpson & Simon J. Puglisi, Words with simple Burrows-Wheeler transforms, Electronic Journal of Combinatorics Volume 15, Article #R83 (2008).[link to paper]

15) Simon J. Puglisi & Jamie Simpson, The expected number of runs in a word, Australasian Journal of Combinatorics Volume 42 (2008), pp. 45-54.[full paper][link to journal]

14) Gang Chen, Simon J. Puglisi & W. F. Smyth, LZ factorization in less time and space, Mathematics in Computer Science (MCS) Special Issue on Combinatorial Algorithms 1-4 (2008) pp. 605-623.[link to paper][code]

13) Simon J. Puglisi, Jamie Simpson & W. F. Smyth, How many runs can a string contain?, Theoretical Computer Science 401-1 (2008) pp. 165-171.[link to paper]

12) Simon J. Puglisi & Andrew Turpin, Space-time tradeoffs for longest-common-prefix array computation, 19th International Symposium on Algorithms and Computation (ISAAC'08), S.-H. Hong, H. Nagamochi and T. Fukunaga (Eds.) (2008) pp. 124-135.[code]

11) Simon J. Puglisi, W. F. Smyth & Munina Yusufu, Fast optimal algorithms for computing all the repeats in a string, Proceedings of the Prague Stringology Conference (PSC'08), Jan Holub and Jan Zdarek (Eds.) (2008) pp. 161-169.

10) Ranjan Sinha, Simon J. Puglisi, Alistair Moffat & Andrew Turpin, Improving suffix array locality for fast pattern matching on disk, Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, Jason Tsong-Li Wang (Ed.) (2008) pp. 661-672.[link to paper]

2007

9) Michael A. Maniscalco & Simon J. Puglisi, An efficient, versatile approach to suffix sorting, ACM Journal of Experimental Algorithmics Volume 12, Article 1.2 (2007)

8) Simon J. Puglisi, W. F. Smyth & Andrew Turpin, A taxonomy of suffix array construction algorithms, ACM Computing Surveys 39-2 (2007) pp. 1-31.

7) Gang Chen, Simon J. Puglisi & W. F. Smyth, Fast and practical algorithms for computing all the runs in a string, 18th Annual Symposium on Combinatorial Pattern Matching (CPM'07), B. Ma & K. Zhang (Eds.) (2007) pp. 307-315.

2006

6) Kangmin Fan, Simon J. Puglisi, W. F. Smyth & Andrew Turpin, A new periodicity lemma, SIAM Journal on Discrete Mathematics 20-3 (2006) pp. 656-668.

5) Michael A. Maniscalco & Simon J. Puglisi, Faster lightweight suffix array construction, 17th Australasian Workshop on Combinatorial Algorithms (AWOCA'06), Joe Ryan & Dafik (Eds.) (2006) pp. 16-29.[pdf]

4) Simon J. Puglisi, W. F. Smyth & Andrew Turpin, Inverted files versus suffix arrays for locating patterns in primary memory, 13th Symposium on String Processing and Information Retrieval (SPIRE'06), Fabio Crestiani, Paolo Ferragina & Mark Sanderson (Eds.) (2006) pp. 122-133.[link to paper]

2005

3) Simon J. Puglisi, W. F. Smyth & Andrew Turpin, Some restrictions on periodicity in strings, Proceedings of the 16th Australasian Workshop on Combinatorial Algorithms (AWOCA'05), Joe Ryan, Prabhu Manyem, Kiki Sugeng & Mirka Miller (Eds.) (2005) pp. 263-268.

2) Simon J. Puglisi, W. F. Smyth & Andrew Turpin, A taxonomy of suffix array construction algorithms, Proceedings of the Prague Stringology Conference (PSC'05), Jan Holub (Ed.) (2005) pp. 1-30.

1) Simon J. Puglisi, W. F. Smyth & Andrew Turpin, The performance of linear time suffix sorting algorithms, Proceedings of the IEEE Data Compression Conference (DCC'05), Jim Storer & Martin Cohn (Eds.) (2005) pp. 358-367.

Technical Reports

1) Simon J. Puglisi, Exposition and analysis of a suffix sorting algorithm, TR No CAS-05-02-WS, May 2005, Department of Computing and Software, McMaster University, Hamilton, Ontario, Canada.[ps]

2) Simon J. Puglisi & Andrew Turpin, Wavelet trees for range quantile queries and related problems, TR-09-1, January 2009, School of Computer Science and Information Technology, RMIT University, Melbourne, Victoria, Australia.[link to abstract]

3) Shanika Kuruppu, Simon J. Puglisi & Justin Zobel, Relative Lempel-Ziv parsing for storing and accessing similar genomes, TR-09-3, October 2009, School of Computer Science and Information Technology, RMIT University, Melbourne, Victoria, Australia.[link to abstract]

Grants

Indexes Allowing Fast and Efficient Text Search,
ARC Discovery Grant, $AU235,944 (with Andrew Turpin and Bill Smyth).

Extracting Query Semantics from Past Query Sequences to Support Exploratory Search, Microsoft Live Labs Award, $US34,100 (with Mingfang Wu, Andrew Turpin, Falk Scholer, James Thom).

Research Activities

I am on the program committee for
CPM 2011.
I am on the program committee for IWOCA 2010.
I was on the program committee for ACM CIKM 2009.
I was on the program committee for SPIRE 2008.
I was on the organizing committee for SPIRE 2008.
William Webber occasionally says some interesting things.
I occasionally review papers for the following venues:

Life

I like to spend most of my time with
Esme and now Abigail.
(Abby Abby Abby Abby).
I try to travel overseas as much as possible.
I enjoy listening to Arvo Pärt.
Science on the Edge.
The Long Now Foundation.
The Saws by Robert Pinsky.
First Idyll by Susan Stewart.
Next Door by Jessica Greenbaum.