Simon J. Puglisi

Simon J. Puglisi swanning around Fitzroy Gardens
[Simon J. Puglisi swanning around Fitzroy Gardens, not far from his home.]
ARC Postdoctoral Fellow, January 2008 -
School of Computer Science & Information Technology
Royal Melbourne Institute of Technology
GPO Box 2476V,
Melbourne, VIC 3001
Australia
email: sjp [at] cs [.] rmit [.] edu [.] au
phone: +61 3 9925 3167
facsimile: +61 3 9662 1617
office: Building 14, Level 9, Room 3
I'm a postdoc at RMIT in the School of Computer Science & IT and a member of the Search Engine Group.

Research Interests

Just about any algorithmic problem; particularly, those to do with text; even more particularly, those problems relating to large texts and text collections; even more particularly...
Algorithms and Data Structures for information retrieval
Combinatorics and periodicity properties of strings
The Burrows-Wheeler transformation
Suffix Array construction and use
Compressed full text indexing
Lossless data compression
Computational biology

Refereed Publications

2009

23) Jan Schroder, Heiko Schroder, Simon J. Puglisi, Ranjan Sinha & Bertil Schmidt, SHREC: A short-read error correction method, Bioinformatics, To appear.

22) Bertil Schmidt, Bryan Beresford-Smith, Ranjan Sinha & Simon J. Puglisi, A fast hybrid short read fragment assembly algorithm, Bioinformatics, To appear.

21) Travis Gagie, Simon J. Puglisi & Andrew Turpin, Range quantile queries: another virtue of wavelet trees, Proceedings of the 16th Symposium on String Processing and Information Retrieval (SPIRE'09), (2009) To appear.

20) Juha Karkkainen, Giovanni Manzini & Simon J. Puglisi, Permuted longest-common-prefix array, Proceedings of the 20th Annual Symposium on Combinatorial Pattern Matching (CPM '09), (2009) To appear.

19) Serdar Boztas, Simon J. Puglisi & Andrew Turpin, Testing stream ciphers by finding the longest substring of a given density, Proceedings of the 14th Australasian Conference on Information Security and Privacy (ACISP '09), (2009) To appear.

18) Alistair Moffat, Simon J. Puglisi & Ranjan Sinha, Reducing the space requirements for disk resident suffix arrays, Proceedings of the 14th International Conference on Database Systems for Advanced Applications (DASFAA '09), X. Zhou, H. Yokota, K. Deng, Q. Liu (eds.) (2009) pp. 730-744.

17) Yohannes Tsegay, Simon J. Puglisi, Andrew Turpin & Justin Zobel, Document compaction for efficient query biased snippet generation, Proceedings of the 31st European Conference on Information Retrieval (ECIR '09), M. Boughanem, C. Berrut, J. Mothe, C. Soule-Dupuy (eds.) (2009) pp. 509-520.

2008

16) Jamie Simpson & Simon J. Puglisi, Words with simple Burrows-Wheeler transforms, Electronic Journal of Combinatorics Volume 15, Article #R83 (2008).[link to paper]

15) Simon J. Puglisi & Jamie Simpson, The expected number of runs in a word, Australasian Journal of Combinatorics Volume 42 (2008), pp. 45-54.[full paper][link to journal]

14) Gang Chen, Simon J. Puglisi & W. F. Smyth, LZ factorization in less time and space, Mathematics in Computer Science (MCS) Special Issue on Combinatorial Algorithms 1-4 (2008) pp. 605-623.[link to paper][code]

13) Simon J. Puglisi, Jamie Simpson & W. F. Smyth, How many runs can a string contain?, Theoretical Computer Science 401-1 (2008) pp. 165-171.[link to paper]

12) Simon J. Puglisi & Andrew Turpin, Space-time tradeoffs for longest-common-prefix array computation, 19th International Symposium on Algorithms and Computation (ISAAC'08), S.-H. Hong, H. Nagamochi and T. Fukunaga (eds.) (2008) pp. 124-135.[code]

11) Simon J. Puglisi, W. F. Smyth & Munina Yusufu, Fast optimal algorithms for computing all the repeats in a string, Proceedings of the Prague Stringology Conference (PSC'08), Jan Holub and Jan Zdarek (eds.) (2008) pp. 161-169.

10) Ranjan Sinha, Simon J. Puglisi, Alistair Moffat & Andrew Turpin, Improving suffix array locality for fast pattern matching on disk, Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, Jason Tsong-Li Wang (ed.) (2008) pp. 661-672.[link to paper]

2007

9) Michael A. Maniscalco & Simon J. Puglisi, An efficient, versatile approach to suffix sorting, ACM Journal of Experimental Algorithmics Volume 12, Article 1.2 (2007)

8) Simon J. Puglisi, W. F. Smyth & Andrew Turpin, A taxonomy of suffix array construction algorithms, ACM Computing Surveys 39-2 (2007) pp. 1-31.

7) Gang Chen, Simon J. Puglisi & W. F. Smyth, Fast and practical algorithms for computing all the runs in a string, 18th Annual Symposium on Combinatorial Pattern Matching (CPM'07), Bin Ma & Kaizhong Zhang (eds.) (2007) pp. 307-315.

2006

6) Kangmin Fan, Simon J. Puglisi, W. F. Smyth & Andrew Turpin, A new periodicity lemma, SIAM Journal on Discrete Mathematics 20-3 (2006) pp. 656-668.

5) Michael A. Maniscalco & Simon J. Puglisi, Faster lightweight suffix array construction, 17th Australasian Workshop on Combinatorial Algorithms (AWOCA'06), Joe Ryan & Dafik (eds.) (2006) pp. 16-29.[pdf]

4) Simon J. Puglisi, W. F. Smyth & Andrew Turpin, Inverted files versus suffix arrays for locating patterns in primary memory, 13th Symposium on String Processing and Information Retrieval (SPIRE'06), Fabio Crestiani, Paolo Ferragina & Mark Sanderson (eds.) (2006) pp. 122-133.[link to paper]

2005

3) Simon J. Puglisi, W. F. Smyth & Andrew Turpin, Some restrictions on periodicity in strings, Proceedings of the 16th Australasian Workshop on Combinatorial Algorithms (AWOCA'05), Joe Ryan, Prabhu Manyem, Kiki Sugeng & Mirka Miller (eds.) (2005) pp. 263-268.

2) Simon J. Puglisi, W. F. Smyth & Andrew Turpin, A taxonomy of suffix array construction algorithms, Proceedings of the Prague Stringology Conference (PSC'05), Jan Holub (ed.) (2005) pp. 1-30.

1) Simon J. Puglisi, W. F. Smyth & Andrew Turpin, The performance of linear time suffix sorting algorithms, Proceedings of the IEEE Data Compression Conference (DCC'05), Jim Storer & Martin Cohn (eds.) (2005) pp. 358-367.

Technical Reports

1) Simon J. Puglisi, Exposition and analysis of a suffix sorting algorithm, TR No CAS-05-02-WS, Department of Computing and Software, McMaster University, Hamilton, Ontario, Canada.[ps]

2) Simon J. Puglisi & Andrew Turpin, Wavelet Trees for Range Quantile Queries and Related Problems, TR-09-1, School of Computer Science and Information Technology, RMIT University, Melbourne, Victoria, Australia.[link to abstract]

Grants

Indexes Allowing Fast and Efficient Text Search,
ARC Discovery Grant, $AU235,944 (with Andrew Turpin and Bill Smyth).

Extracting Query Semantics from Past Query Sequences to Support Exploratory Search, Microsoft Live Labs Award, $US34,100 (with Mingfang Wu, Andrew Turpin, Falk Scholer, James Thom).

Research Activities

I am on the program committee for
ACM CIKM 2008.
I was on the organizing and program committees for SPIRE 2008.
William Webber occasionally says some interesting things.
I occasionally review papers for the following venues:

Life

I like to spend most of my time with
Esme.
I try to travel overseas as much as possible.
I enjoy listening to Arvo Pärt.
Science on the Edge.
The Long Now Foundation.
The Saws by Robert Pinsky.
First Idyll by Susan Stewart.