Renata Borovica-Gajic

I am a Senior Lecturer in Data Analytics and ARC DECRA Fellow in the School of Computing and Information Systems (CIS), as well as Associate Dean (Diversity and Inclusion) for the Faculty of Engineering and IT at the University of Melbourne. My research lies at the intersection of database systems, machine learning and artificial intelligence, and data-driven optimisation and analytics. To optimize the performance and cost of data analytics services, I envision database systems as adaptive self-driving engines able to adjust their query execution strategy at runtime to fit the characteristics of the queries, data and underlying hardware. I am also interested in scientific data management, data exploration, query optimization, physical database design, data cleaning, and data-driven (traffic) optimization. The quality and impact of my work have been recognised with several notable awards, including the L'Oréal-UNESCO for Women in Science Fellowship in 2023, two Research Excellence Awards (for Early Career Researcher in 2022 and Mid-Career Researcher in 2023), two Excellence in Teaching and Learning Awards (in 2018 and 2020), the Google Research Inclusion Award in 2021, and the Test of Time Award at SIGMOD 2022.

Research Projects

Learned indexes with Machine Learning

Learned indexes leverage Machine Learning (ML) models to approximate the distribution of the data, and use that approximation to build auxiliary data structures that can replace traditional database indexes. Learned indexes promise a reduced memory footprint and faster data retrieval. We investigate the efficiency of existing learned models, and propose advancements with respect to model complexity as well as the suitability of learned indexes for on-disk placement. [ADC'20] [DBML@ICDE'23] [SIGMOD'23] [ICDE'24]
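
As a rough illustration of the general idea (not the models from the papers above), the sketch below fits a single linear model that maps a key to its approximate position in a sorted array, records the worst prediction error, and then corrects each prediction with a binary search limited to that error window. The class name LinearLearnedIndex and the sample keys are hypothetical.

```python
import bisect


class LinearLearnedIndex:
    """Toy learned index: one linear model plus a bounded local search."""

    def __init__(self, keys):
        # Assumption for this sketch: numeric keys, deduplicated and sorted.
        self.keys = sorted(set(keys))
        n = len(self.keys)
        if n == 0:
            raise ValueError("need at least one key")
        # Least-squares fit of position ~ slope * key + intercept.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(self.keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # Worst-case error over the stored keys bounds the search window.
        self.max_err = max(abs(i - self._predict(k))
                           for i, k in enumerate(self.keys))

    def _predict(self, key):
        # Predicted position, clamped to a valid array slot.
        guess = int(round(self.slope * key + self.intercept))
        return min(max(guess, 0), len(self.keys) - 1)

    def lookup(self, key):
        """Return the position of key, or None if it is not stored."""
        guess = self._predict(key)
        lo = max(0, guess - self.max_err)
        hi = min(len(self.keys), guess + self.max_err + 1)
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return None


index = LinearLearnedIndex([3, 8, 15, 16, 23, 42, 108])
position = index.lookup(23)
```

The model itself is only two floats, which is where the memory saving comes from; the price is the error window that every lookup still has to search.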

Database performance tuning with Multi-armed Bandits

Performance tuning is a crucial step in database preparation since it enables fast responses to user queries. This project aims to fully automate performance tuning of databases by choosing physical design structures that speed up query analysis. Our approach removes the need for costly human personnel and automatically adjusts to the queries (and data) while users are carrying out their analysis. DBA Bandits not only successfully tune unpredictable and ad hoc workloads (unknown a priori), but also provide statistical guarantees on the fitness of the proposed design structures, the first such guarantees in the world of database performance tuning. [ICDE'21] [ICDM'21] [VLDB'22] [KAIS'23] [TKDE'23]
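
To give a flavour of bandit-based tuning, the sketch below uses plain UCB1, a textbook multi-armed bandit, rather than the contextual combinatorial bandits used in the papers above. Each arm is a candidate index, and the reward is a simulated signal of whether a query ran faster with it. The index names, reward probabilities and the function names ucb1 and simulated_speedup are all hypothetical.

```python
import math
import random


def ucb1(arms, reward_fn, rounds):
    """Play UCB1 for a number of rounds; return how often each arm was chosen."""
    counts = {arm: 0 for arm in arms}
    totals = {arm: 0.0 for arm in arms}
    for t in range(1, rounds + 1):
        untried = [arm for arm in arms if counts[arm] == 0]
        if untried:
            # Try every arm once before trusting any estimate.
            arm = untried[0]
        else:
            # Pick the arm with the highest optimistic estimate:
            # observed mean reward plus an exploration bonus.
            arm = max(
                arms,
                key=lambda a: totals[a] / counts[a]
                + math.sqrt(2 * math.log(t) / counts[a]),
            )
        totals[arm] += reward_fn(arm)
        counts[arm] += 1
    return counts


# Hypothetical candidates with made-up probabilities of speeding up a query.
SPEEDUP_PROBABILITY = {
    "idx_orders_date": 0.8,
    "idx_customer_id": 0.5,
    "no_index": 0.2,
}
rng = random.Random(7)


def simulated_speedup(arm):
    # Reward 1.0 if the (simulated) query ran faster with this choice.
    return 1.0 if rng.random() < SPEEDUP_PROBABILITY[arm] else 0.0


pulls = ucb1(list(SPEEDUP_PROBABILITY), simulated_speedup, rounds=2000)
```

The exploration bonus shrinks as an arm is tried more often, so the tuner keeps sampling uncertain choices early on and settles on the most rewarding one as evidence accumulates.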

Data-driven traffic flow and assignment optimization with Reinforcement Learning

Thanks to ubiquitous access to the Internet, on-demand ride-sharing services have emerged to provide timely and convenient rides to passengers. A fundamental problem of ride-sharing is determining how to dispatch vehicles to passengers in order to minimize both the waiting and travel time of passengers, as well as maximize the overall traffic flow (i.e., minimise traffic congestion). In this line of work we: i) develop highly scalable algorithms that help individuals find trips that minimize their overall travel times in ride-sharing scenarios, and explore how the customer location impacts optimal vehicle dispatching, and ii) optimise the road network configuration to maximise the traffic flow. [SIGSPATIAL'20] [SSDBM'20] [ECML/PKDD'20] [IV'22] [SIGSPATIAL'22] [SIGSPATIAL'22(demo)] [SIGSPATIAL'22(industrial paper)] [TIST'23]

Improving data quality through domain-driven imputation

Dirty data (consisting of incorrect or, more frequently, missing values) has severe implications for real-life applications (e.g. a map based on incomplete data leading drivers to turn from a road into a lake). This project aims to improve the quality of OSM spatial data that many applications such as Google Maps rely on, by harnessing the spatial properties of such data (e.g. closeness, type of objects, etc.) to impute missing values. This line of work was the first to use the intrinsic properties of the data to improve its quality without relying on external sources of information, and as such, is a first step towards developing databases that “self-heal”, i.e., automatically improve the quality of their data by learning from the data itself. [TGIS'20] [IJGIS'21]

Cheap data analysis with Skipper

This project looks at exploiting new hardware to decrease the cost of data analytics storage infrastructure. In particular, Skipper is an end-to-end query processing framework that uses cold storage devices as its underlying storage medium. Skipper employs a (cold) storage-driven query execution model based on multi-way joins combined with efficient cache management and I/O scheduling strategies to hide the non-uniform access latencies of cold storage. As a result, Skipper offers performance comparable to that of systems storing data on HDD, at a substantially lower cost. [VLDB'16] [CACM'19]

Predictable query performance with Smooth Scan

Smooth Scan aims to prevent the performance degradation caused by suboptimal query plan decisions. To achieve optimal performance for all query inputs, Smooth Scan uses continuous adaptation and morphing at runtime, transforming from one physical alternative to another (e.g., from an index access to a sequential scan). With Smooth Scan, an access path strategy is progressively transformed into an optimal form based on the operator selectivity and result distribution observed during query execution. As a result, a system with Smooth Scan requires no access path decisions up front, nor does it need accurate statistics to provide good performance. [ICDE'15] [VLDBJ'18]
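
The sketch below is a heavily simplified illustration of morphing an access path at runtime; it is not the Smooth Scan operator from the papers. It starts with index probes, harvests every qualifying tuple on each page it touches, and falls back to a sequential pass over the remaining pages once the number of touched pages reaches a threshold. The function name, the page layout and the switch_after parameter are assumptions made for the example.

```python
import bisect


def adaptive_range_scan(pages, index, lo, hi, switch_after):
    """Return (sorted matching keys, final access mode) for lo <= key <= hi.

    pages: list of pages, each a list of keys (a toy heap file).
    index: sorted list of (key, page_id) pairs (a toy secondary index).
    """
    seen_pages = set()
    result = []
    mode = "index"
    # Page ids are non-negative, so (lo, -1) sorts before any entry for lo.
    start = bisect.bisect_left(index, (lo, -1))
    for key, page_id in index[start:]:
        if key > hi:
            break
        if page_id in seen_pages:
            continue  # this page was already fully harvested
        # Probe the whole page instead of fetching a single tuple.
        result.extend(k for k in pages[page_id] if lo <= k <= hi)
        seen_pages.add(page_id)
        if len(seen_pages) >= switch_after:
            # Observed selectivity is high: morph into a sequential scan.
            mode = "sequential"
            break
    if mode == "sequential":
        for page_id, page in enumerate(pages):
            if page_id not in seen_pages:
                result.extend(k for k in page if lo <= k <= hi)
    return sorted(result), mode


pages = [[5, 1, 9], [2, 7, 3], [8, 4, 6], [10, 12, 11]]
index = sorted((k, pid) for pid, page in enumerate(pages) for k in page)

narrow = adaptive_range_scan(pages, index, 1, 2, switch_after=3)
wide = adaptive_range_scan(pages, index, 1, 12, switch_after=3)
```

In this toy setting the selective query finishes using index probes alone, while the non-selective one switches to a sequential pass after touching three pages, without any selectivity estimate being supplied up front.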

Fast data exploration with NoDB

NoDB enables fast and efficient data exploration by removing data loading as a prerequisite to data querying. Instead, NoDB enables query processing capabilities directly over data files, and uses the user queries as a driver for partial and incremental data loading and performance tuning. To mask the overhead of raw file data access and speed up future queries, NoDB builds and progressively refines auxiliary design structures (positional indexes, data caches and statistics) during query execution. As a result, NoDB matches the performance of traditional database systems that process already loaded data. [SIGMOD'12] [VLDB'12 (demo)] [CACM'15]
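
A minimal sketch of query-driven, incremental loading over a raw file is shown below. It is an illustration only, not NoDB's implementation: the first query that touches a column pays for locating rows and parsing that column, and later queries reuse the row offsets and the cached values. The class RawFileTable, its methods and the sample data are hypothetical, and the sketch assumes a non-empty, integer-only, comma-separated file held in memory as a string.

```python
class RawFileTable:
    """Answers queries over raw delimited text, loading columns on demand."""

    def __init__(self, raw_text, delimiter=","):
        self.raw = raw_text
        self.delimiter = delimiter
        self.row_offsets = None   # positional map: start offset of each row
        self.column_cache = {}    # column number -> parsed integer values
        self.fields_parsed = 0    # bookkeeping to show the saved work

    def _build_row_offsets(self):
        offsets = [0]
        pos = self.raw.find("\n")
        while pos != -1 and pos + 1 < len(self.raw):
            offsets.append(pos + 1)
            pos = self.raw.find("\n", pos + 1)
        self.row_offsets = offsets

    def column(self, col):
        """Parse a column on first use; serve it from the cache afterwards."""
        if col in self.column_cache:
            return self.column_cache[col]
        if self.row_offsets is None:
            self._build_row_offsets()
        values = []
        for start in self.row_offsets:
            end = self.raw.find("\n", start)
            line = self.raw[start:] if end == -1 else self.raw[start:end]
            values.append(int(line.split(self.delimiter)[col]))
            self.fields_parsed += 1
        self.column_cache[col] = values
        return values

    def sum_where(self, sum_col, filter_col, threshold):
        """SELECT SUM(sum_col) WHERE filter_col > threshold."""
        filter_values = self.column(filter_col)
        sum_values = self.column(sum_col)
        return sum(v for v, f in zip(sum_values, filter_values) if f > threshold)


table = RawFileTable("1,10,100\n2,20,200\n3,30,300\n")
first = table.sum_where(sum_col=2, filter_col=0, threshold=1)
parsed_after_first = table.fields_parsed
second = table.sum_where(sum_col=2, filter_col=0, threshold=0)
```

Only the two columns the queries touch are ever parsed, and the second query parses nothing at all, which is the effect the auxiliary structures are meant to achieve.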

Awards

L'Oréal-UNESCO for Women in Science Fellowship, 2023. Link

Research Excellence Award for Mid-Career Researcher, The University of Melbourne, 2023. Link

Test of Time Award at SIGMOD 2022 (for SIGMOD 2012 NoDB paper). Link

Excellence Award in Early Career Research, Faculty of Engineering and Information Technology, The University of Melbourne, 2022. Link

Google Award for Research Inclusion, 2021. Link

Excellence Award in Teaching and Learning for Early Career Academics, Faculty of Engineering and Information Technology, The University of Melbourne, 2020. Link

Excellence Award in Teaching and Learning, School of Computing and Information Systems, The University of Melbourne, 2018.

Publications

VLDB, 2024. SeLeP: Learning Based Semantic Prefetching for Exploratory Database Workloads. F. Zirak, F. Choudhury, and R. Borovica-Gajic. PDF Code (to appear)

ICDE, 2024. A Fully On-disk Updatable Learned Index. H. Lan, Z. Bao, S. Culpepper, R. Borovica-Gajic and Y. Dong. PDF Slides Code (to appear)

SIGMOD Record, 2023. Reminiscences on Influential Papers. R. Borovica-Gajic. PDF Link BibTex

TKDE, 2023. No DBA? No regret! Multi-armed bandits for index tuning of analytical and HTAP workloads with provable guarantees. M. Perera, B. Oetomo, B. I. P. Rubinstein, and R. Borovica-Gajic. PDF Link BibTex

TIST, 2023. Real-time Road Network Optimization with Coordinated Reinforcement Learning. U. Gunarathna, H. Xie, E. Tanin, S. Karunasekera, and R. Borovica-Gajic. PDF Link BibTex

SIGMOD, 2023. Updatable Learned Indexes Meet Disk-Resident DBMS - From Evaluations to Design Choices. H. Lan, Z. Bao, S. Culpepper, and R. Borovica-Gajic. PDF Slides Link BibTex Code

DBML@ICDE, 2023. Efficient Index Learning via Model Reuse and Fine-tuning. G. Liu, J. Qi, L. Kulik, K. Soga, R. Borovica-Gajic, and B. I. P. Rubinstein. PDF Slides Link BibTex

KAIS, 2023. Cutting to the Chase with Warm-Start Contextual Bandits. B. Oetomo, M. Perera, R. Borovica-Gajic and B. I. P. Rubinstein. PDF Link BibTex

VLDB, 2022. HMAB: Self-Driving Hierarchy of Bandits for Integrated Physical Database Design Tuning. M. Perera, B. Oetomo, B. I. P. Rubinstein, and R. Borovica-Gajic. PDF Slides Link Video BibTex Code

Dagstuhl Reports 12(3), 2022. Database Indexing and Query Processing (Dagstuhl Seminar 22111). R. Borovica-Gajic, G. Graefe, A. Lee, C. Sauer, and P. Tozun. PDF Link BibTex

CSUR, 2022. Energy Efficient Computing Systems: Architectures, Abstractions and Modeling to Techniques and Standards. R. Muralidhar, R. Borovica-Gajic, and R. Buyya. PDF Link BibTex

DEBS, 2022. A Multi-level Caching Architecture for Stateful Stream Computation. T. Islam, R. Borovica-Gajic, and S. Karunasekera. PDF Slides Link Video BibTex (Best Student Paper)

SIGSPATIAL, 2022. Dynamic Graph Combinatorial Optimization with Multi-Attention Deep Reinforcement Learning. U. Gunarathna, R. Borovica-Gajic, S. Karunasekera, and E. Tanin. PDF Slides Link Video BibTex

SIGSPATIAL, 2022. e-SMARTS: A System to Simulate Intelligent Traffic Management Solutions (Demo Paper). U. Gunarathna, R. Borovica-Gajic, S. Karunasekera, and E. Tanin. PDF Slides Link Video BibTex

SIGSPATIAL, 2022. A Simulation Study on Prioritizing Connected Freight Vehicles at Intersections for Traffic Flow Optimization (Industrial Paper). H. Xie, R. Borovica-Gajic, E. Tanin, S. Karunasekera, U. Gunarathna, G. Oppy, and M. Sarvi. PDF Slides Link Video BibTex

IV, 2022. Real-Time Intelligent Autonomous Intersection Management Using Reinforcement Learning. U. Gunarathna, S. Karunasekera, R. Borovica-Gajic, and E. Tanin. PDF Slides Link BibTex

IJGIS, 2022. Can you fixme? An intrinsic classification of contributor-identified spatial data issues using topic models. R. C. Sundaram, E. Naghizade, R. Borovica-Gajic, and M. Tomko. PDF Link BibTex

ICDM, 2021. Cutting to the Chase with Warm-Start Contextual Bandits. B. Oetomo, M. Perera, R. Borovica-Gajic and B. I. P. Rubinstein. PDF Slides Link Video BibTex

ICDE, 2021. DBA bandits: Self-driving index tuning under ad-hoc, analytical workloads with safety guarantees. M. Perera, B. Oetomo, B. I. P. Rubinstein, and R. Borovica-Gajic. PDF Slides Link Video BibTex Code

SIGSPATIAL, 2020. Highly Efficient and Scalable Multi-hop Ride-sharing. Y. Xu, L. Kulik, R. Borovica-Gajic, A. Aldwyish, and J. Qi. PDF Slides Link BibTex

ECML/PKDD, 2020. Real-time Lane Configuration with Coordinated Reinforcement Learning. U. Gunarathna, H. Xie, E. Tanin, S. Karunasekera, and R. Borovica-Gajic. PDF Slides Link BibTex

SSDBM, 2020. GeoPrune: Efficiently Matching Trips in Ride-sharing Through Geometric Properties. Y. Xu, J. Qi, R. Borovica-Gajic, and L. Kulik. PDF Slides Link BibTex

ICDE, 2020. CrashSim: An Efficient Algorithm for Computing SimRank over Static and Temporal Graphs. M. Li, F. M. Choudhury, R. Borovica-Gajic, Z. Wang, J. Xin, and J. Li. PDF Slides Link BibTex

TGIS, 2020. Harnessing spatio-temporal patterns in data for nominal attribute imputation. R. C. Sundaram, E. Naghizade, R. Borovica-Gajic, and M. Tomko. PDF Link BibTex

ADC, 2020. Function Interpolation for Learned Index Structures. N. F. Setiawan, B. I. P. Rubinstein, and R. Borovica-Gajic. PDF Slides Link BibTex (Best Student Paper Honorable Mention)

(Book chapter) Database System Concepts, 7th edition, by Silberschatz, Korth and Sudarshan, 2019. Chapter 32 on PostgreSQL. R. Borovica-Gajic and I. Alagiannis. PDF BibTex

CACM, 2019. The five minute rule thirty years later and its impact on the storage hierarchy. R. Appuswamy, R. Borovica-Gajic, G. Graefe, and A. Ailamaki. PDF Link BibTex

arXiv:1902.07500, 2019. A Note on Bounding Regret of the C2UCB Contextual Combinatorial Bandit. B. Oetomo, M. Perera, R. Borovica-Gajic, and B. I. P. Rubinstein. PDF Link BibTex

VLDB Journal, 2018. Smooth Scan: Robust Access Path Selection without Cardinality Estimation. R. Borovica-Gajic, S. Idreos, A. Ailamaki, M. Zukowski and C. Fraser. PDF* Link BibTex

DASFAA, 2018. Finding All Nearest Neighbors with a Single Graph Traversal. Y. Xu, J. Qi, R. Borovica-Gajic, and L. Kulik. PDF Slides Link BibTex

Dagstuhl Reports 7(5), 2017. Robust Performance in Database Query Processing (Dagstuhl Seminar 17222). R. Borovica-Gajic, G. Graefe, and A. Lee. PDF BibTex

ADMS@VLDB, 2017. The five minute rule thirty years later and its impact on the storage hierarchy. R. Appuswamy, R. Borovica-Gajic, G. Graefe, and A. Ailamaki. PDF Slides BibTex

PhD Thesis, EPFL, 2016. Toward timely, predictable and cost-effective data analytics. R. Borovica-Gajic. PDF Slides Poster

VLDB, 2016. Cheap Data Analytics Using Cold Storage Devices. R. Borovica-Gajic, R. Appuswamy, and A. Ailamaki. PDF Slides Poster BibTex

ICDE, 2015. Smooth Scan: Statistics-oblivious Access Paths. R. Borovica-Gajic, S. Idreos, A. Ailamaki, M. Zukowski and C. Fraser. PDF Slides Poster BibTex

CACM, Research Highlights, 2015. NoDB: Efficient Query Execution on Raw Data Files. I. Alagiannis, R. Borovica-Gajic, M. Branco, S. Idreos, and A. Ailamaki. PDF BibTex

VLDB, 2012. NoDB in Action: Adaptive Query Processing on Raw Data (demo). I. Alagiannis, R. Borovica, M. Branco, S. Idreos and A. Ailamaki. PDF Poster BibTex

DBTest@SIGMOD, 2012. Automated Physical Designers: What You See is (Not) What You Get. R. Borovica, I. Alagiannis and A. Ailamaki. PDF Slides Poster BibTex

SIGMOD, 2012. NoDB: Efficient Query Execution on Raw Data Files. I. Alagiannis, R. Borovica, M. Branco, S. Idreos and A. Ailamaki. PDF Slides Poster BibTex (Test of Time Award at SIGMOD 2022)

DEB, 2011. Challenges and Opportunities in Self-Managing Scientific Databases. T. Heinis, M. Branco, I. Alagiannis, R. Borovica, F. Tauheed and A. Ailamaki. PDF BibTex

Patents

US Patent: 9298754. Query management system and engine allowing for efficient query execution on raw details. A. Ailamaki, S. Idreos, I. Alagiannis, R. Borovica, M. Branco. Link

Teaching

Database Systems (INFO20003). University of Melbourne, 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024 (offered every semester)

Programming I (C course) with Prof. Jean-Luc Desbiolles. EPFL, Spring 2014.

Project in informatics (C++ course) with Prof. Jamila Sam. EPFL, Fall 2013.

Introduction to object oriented programming (Java course) with Prof. Jamila Sam. EPFL, Fall 2012.

Computer-aided engineering (C++ course) with Prof. Jamila Sam. EPFL, Spring 2012, 2013.

Introduction to programming (Java course) with Prof. Jamila Sam. EPFL, Fall 2011, 2014.

Programming (Java course) with Prof. Thomas Lochmatter. EPFL, Spring 2011.

Students

Current PhD Students:

  • Dinuka de Zoysa, 2024-, co-supervised with J. Bailey
  • David Adams, 2024-, co-supervised with N. Lipovetzky
  • Lankadinee Rathuwadu, 2023-, co-supervised with C. Leckie
  • Wentao Gao, 2023-, co-supervised with T. Pham
  • Dimuthu Kariyawasan, 2022-, co-supervised with S. Karunasekera
  • Xinling Shen, 2022-, co-supervised with S. Winter
  • Farzaneh Zirak, 2022-, co-supervised with F. Choudhury

Completed PhD Students:

  • Yixin Xu, 2017-2021, co-supervised with J. Qi and L. Kulik (first position: Singapore Management University, Lecturer)
  • Malinga Perera, 2018-2023, co-supervised with B. Rubinstein (first position: Amazon Web Services)
  • Bastian Oetomo, 2018-2023, co-supervised with B. Rubinstein (first position: University of Melbourne, Research Fellow)
  • Udesh Gunarathna, 2018-2023, co-supervised with S. Karunasekera and E. Tanin (first position: Amazon Web Services)
  • Renlord Yang, 2018-2023, co-supervised with U. Parampalli and T. Murray (first position: Apple)
  • Rajesh Chittor Sundaram, 2018-2023, co-supervised with M. Tomko and E. Naghizade (first position: University of Melbourne, Associate Lecturer)

Invited talks

Physical database design tuning: Reaching the holy grail of performance guarantees. International Workshop on Databases and Machine Learning (DBML)@ICDE, 2024. Slides

Tips and tricks for effective PhD: A guide to avoiding common pitfalls. Information Resilience PhD School (CIRES), 2023. Slides

Connectivity matters! Traffic flow optimisation with (deep) reinforcement learning. Smart Mobility Workshop, Melbourne, 2022. Slides

Tips and tricks for effective PhD: A guide to avoiding common pitfalls. PhD Workshop at Very Large Databases Conference (VLDB), 2022. Slides

Machine Learning and Databases: Friends or Foes? Australasian Database Conference (ADC), 2022. Slides

Data Science Pipeline: From raw data to insights with Goce Ristanoski. Melbourne Centre for Data Science (MCDS) PhD School, 2021. Slides

Data Analytics for a Penny. University of Iowa, Iowa, USA, 2021. Slides

Learning-based Algorithms in DBMS: Hype or Future? University of Copenhagen, Denmark, 2021 & McGill University, Canada, 2022. Slides

Multi-armed bandits stealing DBA jobs. EPFL, Switzerland, 2020. Slides

A tale of learning databases. Australasian Database Conference (ADC), 2019. Slides

Tips and tricks for effective PhD. University of Queensland, Data Science Colloquium, 2018. Slides

Smooth Scan: Statistics-oblivious access paths under looking glass. Dagstuhl Seminar, 2017. Slides

Towards timely, predictable and cost-effective data analytics. University of Melbourne, 2016. Slides

Professional Service

2021-, SIGMOD Record Associate Editor (Surveys)

2021-2023, Diversity and Inclusion Chair, School of Computing and Information Systems, The University of Melbourne (Link)

2023-2024, Assistant Dean Diversity and Inclusion, Faculty of Engineering and IT, The University of Melbourne (Link)

2024-, Associate Dean Diversity and Inclusion, Faculty of Engineering and IT, The University of Melbourne (Link)

Conference Program Committee:

  • 2024: SIGMOD, VLDB, ICDE, SiMod@VLDB, CDMS@VLDB, Dolap@EDBT/ICDT, DBTest@SIGMOD
  • 2023: SIGMOD, VLDB, ICDE, SiMod@VLDB, CDMS@VLDB
  • 2022: SIGMOD, VLDB, ICDE, CIDR, AIDM@SIGMOD, DEEM@SIGMOD, DBTest@SIGMOD, PhDWorkshop@VLDB, CDMS@VLDB
  • 2021: SIGMOD, EDBT
  • 2020: SIGMOD, VLDB, ICDE, DBTest@SIGMOD
  • 2019: SIGMOD, VLDB, ICDE, ADC, WomENcourage, GHC, CIKM
  • 2018: SYSTOR, DBTest@SIGMOD, WomENcourage

Journal reviewer:

  • Information Systems, VLDB Journal

Conference and Seminar Organization:

  • Y. Amsterdamer, R. Borovica-Gajic, D. Firmani. Workshop Chairs, International Workshop on Exploiting Artificial Intelligence Techniques for Data Management (aiDM)@SIGMOD, 2024  (Link)
  • R. Borovica-Gajic, F. Naumann, X. Zhou. Tutorial Chairs, International Conference on Data Engineering (ICDE), 2024  (Link)
  • R. Borovica-Gajic, Z. Bao. PC Chairs, Australasian Database Conference (ADC), 2023  (Link)
  • R. Borovica-Gajic, V. Kalogeraki. Diversity and Inclusion Chairs, The 24th IEEE International Conference on Mobile Data Management (MDM), 2023  (Link)
  • R. Borovica-Gajic, P. Tozun. Diversity and Inclusion Chairs, International Conference on Management of Data (SIGMOD), 2022  (Link)
  • R. Borovica-Gajic, G. Graefe, A. Lee, C. Sauer, P. Tözün. Dagstuhl Seminar, 2022: "Database Indexing and Query Processing"  (Link)
  • A. Cheung, R. Borovica-Gajic, A. Appuswamy. PhD Symposium Chairs, International Conference on Data Engineering (ICDE), 2021  (Link)
  • Registration Chair, Symposium on Cloud Computing (SOCC), 2021  (Link)
  • Registration Chair, Very Large Data Bases Conference (VLDB), 2021  (Link)
  • R. Borovica-Gajic, Jianzhong Qi. PC Chairs, Australasian Database Conference (ADC), 2020  (Link)
  • R. Borovica-Gajic, G. Graefe, A. Lee. Dagstuhl Seminar 17222, 2017: "Robust Performance in Database Query Processing"  (Link)

Sponsors

Australian Research Council

Google

L'Oréal

Telstra

Address

Level 3, Office 3318
Melbourne Connect
University of Melbourne
Parkville, VIC 3010
Australia
Phone: +61 3 8344 7220

*This is a post-peer-review, pre-copyedit version of an article published in VLDB Journal. The final authenticated version is available online at: http://dx.doi.org/10.1007/s00778-018-0507-8