The Desktop Fascinator - FAQ
- 1.What is The Desktop Fascinator?
- 2.How do I join in the conversation?
- 3.How is this different from Google desktop search and the like?
- 4.What about access control?
- 5.Will it work across local, networked and remote data stores?
- 6.Will it interface with SRB?
- 7.Will it work with RIF CS metadata?
- 8.How will this work with taxonomies?
1.What is The Desktop Fascinator?
The Desktop Fascinator is an experimental software package based on The Fascinator. It is inspired by the ongoing problem of getting eResearch data and publications into repositories where they can be backed up, preserved and shared with a network of other researchers. See Desktop eResearch revolution, which gives some of the background.
Other sources of inspiration are consumer tools: Picasa, which indexes images and provides a way to browse them; iTunes, which indexes music files and provides faceted search and browsing based on metadata such as artist and album; and the KDE/Linux application digiKam, which has an easy-to-configure tagging system for classifying images.
The software is a work in progress but in the short term it will:
Install on Windows, Mac and Linux desktops
Index local files including stuff on external disk drives.
Provide a plugin mechanism for metadata parsing so it can understand the data and metadata inherent in various file types, such as titles in office documents, camera types in images, and elements and bonds in Chemical Markup Language files. It will also use the organisation of the file system as metadata. These are just examples – the idea is to be able to adapt to any discipline or domain.
Allow the user to organise the stuff that has been located and indexed into projects, or whatever categories they wish, using a simple hierarchical tagging system. These categories may end up being linked to formal 'collection' descriptions in the Australian National Data Service (ANDS) infrastructure, but that is still under development; see 'Will it work with RIF CS metadata?' below.
Trigger automatic replication to departmental repositories and backup systems based on the metadata in their materials – for example, make sure that all files in the ~/thesis directory are backed up, or that everything tagged with 'project a' is replicated to a Fedora repository for Project a.
Push content to institutional repositories such as ePrints. (This is a great opportunity to populate repositories. Old Word documents can be sent off to the repository input queue at the click of a button.)
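The replication idea in the list above can be sketched in a few lines. This is only an illustration under stated assumptions, not code from The Desktop Fascinator: the `IndexedItem` and `Rule` classes, the target names and the `/`-separated form of hierarchical tags are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class IndexedItem:
    """A file known to the index, with hierarchical tags such as 'project a/drafts'."""
    path: str
    tags: Set[str] = field(default_factory=set)


@dataclass
class Rule:
    """Send matching items to a named target (target names are hypothetical)."""
    target: str
    path_prefix: Optional[str] = None
    tag: Optional[str] = None

    def matches(self, item: IndexedItem) -> bool:
        if self.path_prefix is not None:
            prefix = self.path_prefix.rstrip("/")
            # Match the directory itself or anything beneath it.
            if item.path == prefix or item.path.startswith(prefix + "/"):
                return True
        if self.tag is not None:
            # A hierarchical tag matches itself and any of its descendants.
            for t in item.tags:
                if t == self.tag or t.startswith(self.tag + "/"):
                    return True
        return False


def targets_for(item: IndexedItem, rules: List[Rule]) -> List[str]:
    """Return the sorted, de-duplicated targets an item should be replicated to."""
    return sorted({r.target for r in rules if r.matches(item)})


rules = [
    Rule(target="thesis-backup", path_prefix="~/thesis"),
    Rule(target="fedora-project-a", tag="project a"),
]
item = IndexedItem("~/thesis/ch1.odt", {"project a/drafts"})
print(targets_for(item, rules))
```

Keeping the rules as plain data means a user could, in principle, edit them without touching the code that does the copying.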
In the medium to long term it should:
Enable users to add formal, schematized metadata via a plugin interface.
Provide an eResearch 'workbench' that offers layers of organisation above the level of the item but more sophisticated than simple metadata.
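To make the plugin idea in the lists above a little more concrete, here is a minimal sketch of a metadata-parser registry keyed by file extension, which also derives metadata from the folder structure. It is an assumption about how such a mechanism might look, not the actual plugin interface: the `PARSERS` registry, the `parser` decorator and the toy text parser are all invented for illustration.

```python
from pathlib import PurePosixPath

PARSERS = {}  # file extension -> parser function (hypothetical registry)


def parser(*extensions):
    """Decorator that registers a metadata parser for the given extensions."""
    def register(func):
        for ext in extensions:
            PARSERS[ext.lower()] = func
        return func
    return register


@parser(".txt")
def parse_text(path, data):
    # Toy rule: treat the first non-empty line as the title.
    lines = [ln.strip() for ln in data.decode("utf-8", "replace").splitlines()]
    first = next((ln for ln in lines if ln), "")
    return {"title": first}


def extract_metadata(path, data):
    """Combine file-system metadata with whatever a matching plugin finds."""
    p = PurePosixPath(path)
    meta = {"filename": p.name, "folders": list(p.parent.parts)}
    plugin = PARSERS.get(p.suffix.lower())
    if plugin is not None:
        meta.update(plugin(path, data))
    return meta


print(extract_metadata("thesis/ch1/notes.txt", b"\nMy Title\nbody text"))
```

A parser for a new discipline-specific format would then only need to register itself for its extension; files with no matching plugin still get the file-system metadata.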
2.How do I join in the conversation?
We talk about this application using the tag #DteRrev (for Desktop eResearch Revolution, which is a nod towards Kevin Rudd's Digital Education Revolution, which will apparently make Australian fingers and toes much smarter over the next generation). So use that tag on Twitter (search now), delicious and so on. We'll be watching.
For longer posts, why not join our Google Groups mailing list: http://groups.google.com/group/the-fascinator-dev
The code is available from Subversion so you can download it, try it and improve it if you wish. You can check out our Trac or go straight to the code.
3.How is this different from Google desktop search and the like?
Google desktop search is very much focussed on full-text retrieval. This system will extract a great deal of metadata and be extensible to deal with research-relevant metadata. It will also be configurable to make sure that content generated and managed locally is replicated / federated to departmental, institutional, project and discipline repositories as required. And ours is Free software, not just freeware.
4.What about access control?
The Fascinator server software was originally written to test out access control scenarios. It has been demonstrated that, using The Fascinator, a Solr index can be used as part of an access control mechanism: different user groups are assigned different limit queries, which filter what they can see and access.
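As a rough sketch of the limit-query idea, the snippet below combines a user's search with a per-group filter using Solr's standard `fq` (filter query) parameter. It is not The Fascinator's code: the group names, the `access` and `project` field names and the `build_select_params` helper are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Hypothetical group -> limit query mapping; field names are illustrative.
LIMIT_QUERIES = {
    "public": "access:public",
    "project-a": 'access:public OR project:"project a"',
    "admin": "*:*",
}


def build_select_params(user_query, groups):
    """Combine the user's query with the limit queries for their groups."""
    limits = [LIMIT_QUERIES[g] for g in groups if g in LIMIT_QUERIES]
    if not limits:
        # No recognised group: refuse rather than search unfiltered.
        raise PermissionError("user belongs to no group with a limit query")
    # A user in several groups can see the union of what each group can see.
    fq = " OR ".join(f"({q})" for q in limits)
    return {"q": user_query, "fq": fq}


params = build_select_params("title:thesis", ["public", "project-a"])
print(urlencode(params))  # query string for a Solr select request
```

The point of the design is that the user's own query is never trusted to enforce access; the filter is added on the server side, based on group membership.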
Access control will work in two ways:
The user will be able to configure where they would like their stuff sent – so private materials can go to a private backup, and project materials can go to project systems.
Downstream repositories can have access controls that are metadata driven. This will require repository owners and administrators to set up appropriate access regimes. If the downstream repository is running something like The Fascinator which expresses access controls in terms of limit queries against a simple index then this will be easy. If it is something like a Fedora repository with access controls in XACML then let us know how that works out for you.
5.Will it work across local, networked and remote data stores?
That is the intention, but we are starting with the most vulnerable data: the stuff that lives only on the researcher's hard drive and is not even systematically backed up at the moment.
6.Will it interface with SRB?
(SRB stands for Storage Resource Broker:
The SRB software infrastructure can be used to enable Distributed Logical File Systems, Distributed Digital Libraries, Distributed Persistent Archives, and Virtual Object Ring Buffers. The most common usage of SRB is as a Distributed Logical File System (a synergy of database system concepts and file systems concepts) that provides a powerful solution to manage multi-organizational file system namespaces.)
It certainly could if there were data in SRB that could usefully be indexed and then replicated to other systems, as we plan to do with plain old files, but currently our group has neither the expertise nor the impetus to explore this.
7.Will it work with RIF CS metadata?
http://pilot.apsr.edu.au/public/rif/guidelines/rif-cs.html
The RIF-CS Schema was developed as a data interchange format for supporting the submission of collections metadata to the ORCA registry. It is based on ISO2146 but only includes elements needed for a collection service registry and so is not full binding to the standard.
It is recommended that the RIF-CS user community provide feedback in order the schema can evolve to meet a wider needs base. The schema also has an accompanying set of vocabularies.
Currently the primary registry object type is collections. A collection in the RIF-CS Schema context could be a repository, a registry, a collective work or an index/database. There are no hard and fast rules about what constitutes a collection and it is up to the data providers to consider what their collections are and what metadata is provided. The RIF-CS also supports other registry object types, namely services, activities and parties. Any or all of these along with their relations to each other are able to be expressed in RIF-CS format. The relations currently supported by the format are illustrated in Figure 1. Adopters of the RIF-CS format are encouraged to identify new relations needing to be supported.
This looks promising. We will explore ways that our researchers can describe their local collections using this format, assess how useful that is, and give some feedback to ANDS.
8.How will this work with taxonomies?
We have yet to try these ideas, but suggestions from Rowan Brownlee at USyd lead us to propose the following support for formal taxonomies and other species of schematized metadata:
Metadata will reside in one of three places:
In the object – where this is possible and sensible, think PDF, mp3, Office documents.
In the filesystem alongside the object or objects where this aligns with researcher practice.
In the Fedora repository that's part of The Desktop Fascinator.
Where possible, only the node-id (URI) of the metadata term, for example a species, should be attached to the item it describes.
The indexer will expand out taxonomic and other hierarchies when it indexes to allow drill-down discovery. Reindexing will be enabled for when taxonomies change and will need to be able to update metadata records as well as the index. We are unsure if we will be able to, or want to, use this approach with user-derived tagging like 'Project A'.
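The expansion step can be sketched as follows. This is a minimal illustration, assuming the taxonomy is available as a child-to-parent map of URIs; the `urn:example:` URIs, the field names `term` and `term_expanded`, and both helper functions are hypothetical.

```python
# Hypothetical taxonomy: child URI -> parent URI (None marks a root).
PARENTS = {
    "urn:example:taxon/animalia": None,
    "urn:example:taxon/chordata": "urn:example:taxon/animalia",
    "urn:example:taxon/mammalia": "urn:example:taxon/chordata",
}


def expand_term(uri, parents):
    """Return the term followed by its ancestors, nearest first."""
    chain = []
    seen = set()
    current = uri
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in taxonomy at {current}")
        seen.add(current)
        chain.append(current)
        current = parents.get(current)
    return chain


def index_fields(item_terms, parents):
    """Build the fields an indexer could store to support drill-down."""
    expanded = []
    for term in item_terms:
        for node in expand_term(term, parents):
            if node not in expanded:
                expanded.append(node)
    # 'term' is what is attached to the item; 'term_expanded' exists only in the index.
    return {"term": list(item_terms), "term_expanded": expanded}


print(index_fields(["urn:example:taxon/mammalia"], PARENTS))
```

Because only the leaf URI is stored with the item, a change to the taxonomy means rerunning the expansion at index time rather than editing every item.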




