An RSL-based Associative Filesystem

Student Name
Gregory Cardone
Thesis Type
Master Thesis
Thesis Status
Finished
Academic Year
2009 - 2010
Degree
Master in de Toegepaste Informatica
Promoter
Beat Signer
Supervisor(s)
Beat Signer
Download
thesisCardoneGregory2010.pdf
Description

The persistent storage capacity of personal computers, in the form of disk space, grows every year, and we are already dealing with terabytes of data. People use their computers not only for work, but also for entertainment. As the number of files increases, however, organizing and retrieving data becomes a problem.

With current filesystems, it is possible to organize files in folders and to define complex hierarchies of folders. Over time, however, and with an increasing number of files, an efficient organization becomes harder to achieve. The general issue is that users can have difficulty remembering in which folders their data is located and how it is organized, because files can be located in deep folder hierarchies. Search functionality does not help in most of these situations, since files often have non-trivial names. Therefore, we propose a mechanism to classify files and folders within multiple folders, called multiple classification, in order to organize our data, access it from multiple folders, and avoid creating deep folder hierarchies.
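
A minimal sketch of the multiple classification idea described above, assuming a simple in-memory index. The class and method names are hypothetical and are not taken from the RBAF prototype; the point is only that one item can be a member of several folders at once, without being copied.

```python
from collections import defaultdict


class MultiFolderIndex:
    """Hypothetical index that lets one item belong to many folders."""

    def __init__(self):
        self._members = defaultdict(set)  # folder name -> items classified in it
        self._folders = defaultdict(set)  # item name -> folders it belongs to

    def classify(self, item, folder):
        """Add the item to the folder; existing classifications are kept."""
        self._members[folder].add(item)
        self._folders[item].add(folder)

    def list_folder(self, folder):
        """Return the items of a folder in sorted order."""
        return sorted(self._members.get(folder, ()))

    def folders_of(self, item):
        """Return every folder the item is classified in, sorted."""
        return sorted(self._folders.get(item, ()))


index = MultiFolderIndex()
index.classify("IMG_0042.jpg", "Holidays 2009")
index.classify("IMG_0042.jpg", "Family")
index.classify("budget.xls", "Family")

print(index.folders_of("IMG_0042.jpg"))
print(index.list_folder("Family"))
```

The same photo is reachable from both folders, so neither folder needs to be nested inside the other.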

Furthermore, existing filesystems do not show whether files are semantically related to each other (for example, a photo and a video that were created at the same geographical place and in the same context). We also sometimes encounter situations where it is difficult to organize our files correctly using folders alone. We introduce the concept of semantic links, which enables us to semantically link data in order to understand how a set of data is related, and which makes it possible to organize our data flexibly (i.e. not only by placing files and folders within folders).
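
The notion of a semantic link can be illustrated with a small sketch, assuming that a link carries a label and connects a set of source items to a set of target items. The names `SemanticLink`, `LinkStore` and `related` are hypothetical and do not come from the thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticLink:
    """Hypothetical labelled link between sets of items."""
    label: str
    sources: frozenset
    targets: frozenset


class LinkStore:
    def __init__(self):
        self._links = []

    def link(self, label, sources, targets):
        """Create and store a labelled link from sources to targets."""
        new_link = SemanticLink(label, frozenset(sources), frozenset(targets))
        self._links.append(new_link)
        return new_link

    def related(self, item):
        """Return sorted (label, other item) pairs for links touching the item."""
        found = set()
        for lnk in self._links:
            if item in lnk.sources:
                found |= {(lnk.label, target) for target in lnk.targets}
            if item in lnk.targets:
                found |= {(lnk.label, source) for source in lnk.sources}
        return sorted(found)


store = LinkStore()
store.link("recorded at the same place", ["beach.jpg"], ["beach.mov"])

print(store.related("beach.mov"))
```

Because the relation is stored separately from the folder structure, the photo and the video can live in different folders and still be found from one another.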

Moreover, we often create "temporary" files that become irrelevant after some time. For example, if we want to insert part of an image in a document, we may have to create a cropped version of the image before inserting it into the document. In many cases, we later forget to remove those "temporary" files. They then pollute our collection of files and waste space on our hard disk drive. To solve this problem, we introduce the concept of content recycling, which consists of reusing data by pointing to (parts of) the content of files and rendering it within other files.
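
Content recycling can be sketched as follows, assuming text resources and a selector that addresses a character range. `RangeSelector`, `resolve` and `render` are hypothetical names for illustration; the thesis may address image regions or other media in a different way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeSelector:
    """Hypothetical selector pointing to a character range of a resource."""
    resource: str
    start: int
    end: int


def resolve(selector, resources):
    """Look up the selected part in the original resource (no copy is stored)."""
    return resources[selector.resource][selector.start:selector.end]


def render(parts, resources):
    """Render a document made of literal text and selectors."""
    return "".join(
        resolve(part, resources) if isinstance(part, RangeSelector) else part
        for part in parts
    )


resources = {"notes.txt": "RSL links resources, selectors and links."}

# The report does not contain the quoted text, only a pointer to it.
report = ["Quote: ", RangeSelector("notes.txt", 0, 19), "."]
rendered = render(report, resources)
print(rendered)  # Quote: RSL links resources.
```

No intermediate "temporary" file is created: the report only stores where the reused content lives and resolves it when rendered.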

In addition, current filesystems add their own predefined metadata to files or folders, but do not allow us to define our own metadata. Furthermore, search functions can only be based on this limited system-defined metadata. Therefore, we propose to let users create user-defined metadata, called properties, on any data such as files or partial content of files.
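
A minimal sketch of user-defined properties, assuming that a property is a name/value pair attached either to a whole file or to a part of a file. The `PropertyStore` class and the tuple used to identify a file part are assumptions made for this example only.

```python
class PropertyStore:
    """Hypothetical store for user-defined properties."""

    def __init__(self):
        self._props = {}  # target -> {property name: value}

    def set_property(self, target, name, value):
        """Attach or overwrite a property on a target."""
        self._props.setdefault(target, {})[name] = value

    def find(self, name, value):
        """Return targets carrying the given property value, in insertion order."""
        return [
            target
            for target, props in self._props.items()
            if props.get(name) == value
        ]


store = PropertyStore()
# A property on a whole file.
store.set_property("trip.jpg", "project", "thesis")
# A property on a part of a file: (file name, start, end) is an assumed convention.
store.set_property(("report.txt", 0, 120), "project", "thesis")
store.set_property("budget.xls", "project", "household")

matches = store.find("project", "thesis")
print(matches)
```

Searching on such properties is then independent of file names and of the system-defined metadata.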

Last but not least, we introduce the concept of content travelling and explain how it is useful when we want to consult the content of different information sources in a non-linear way. We also explain the concept of content outsorters, which adapt the view (or environment) of our data. Finally, we show how we can grant access rights to our data, and how access rights can be combined with content outsorters in order to adapt the presentation of data.

To achieve these goals, we have extended the Resource-Selector-Link (RSL) model, a metamodel for linking and composing data, with an extension called the RSL-based Associative Filesystem (RBAF) model. We have implemented an initial prototype of the RSL-based Associative Filesystem as a proof of concept.