DataVault is designed for long-term retention of research data, to meet funder requirements and ensure future access to high-value datasets. It meets digital preservation requirements by storing three copies in different locations (two on tape, one in the cloud) with integrity checking built in, so that data owners can retrieve their data with confidence until the end of the retention period (typically ten years).
In summary, the previous development projects (I-IV) established a University service from a proof of concept developed in partnership with the University of Manchester; added encryption and a third ‘copy in the cloud’; created a distinct workflow that took advantage of architectural efficiencies; and delivered interface and performance enhancements, allowing deposits of up to 10 TB without errors.
The code of the underlying DataVault (DV) application, first written in 2015, is known to be aging and in need of refactoring. This was reinforced by the recent requirement to update the Oracle Cloud Storage from Gen1 to Gen2, during which developers struggled with aging Java class libraries. In addition, several different teams and developers have worked on DV over its lifecycle, and it is known to contain some redundant code and inefficiencies that could be cleaned up.
Technical workshops have been held to discuss and evaluate options for refactoring the underlying DV application code, to identify possible improvements, and ultimately to provide high-end estimates of the work required to refactor DataVault.
Current project status
Report Date | RAG | Budget | Effort Completed | Effort to Complete |
---|---|---|---|---|
March 2023 | BLUE | 0.0 days | 0.0 days | 0.0 days |