Closure Report

Project Summary

The aim of the project was to embed schema.org markup in the websites of Library Digital Development. This framework involves marking up data items within a site's content to describe them semantically. This makes search engine results more meaningful, and should lead to more relevant searching and better overall Analytics results. We saw it as a great opportunity to follow the lead of LTW, who have used this concept in the new Funnelback search to improve the search results for Events and Profiles.

Our intention was to use the collections.ed sites, which are data-driven and thus mappable: no complicated text-mining is required to apply the markup programmatically. If the project had sufficient time, we would then try to produce recommendations for how to apply the concepts to our releases of Open Source sites, which we don't actually build.
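
To illustrate why data-driven sites are straightforward to mark up, here is a minimal sketch of a field-to-property mapping applied to one record. The field names (title, maker, date_made), the sample record and the choice of the VisualArtwork type are assumptions made for illustration only; they are not the mappings the project actually used.

```python
from html import escape

# Hypothetical mapping from collection record fields to schema.org properties.
FIELD_TO_PROPERTY = {
    "title": "name",
    "maker": "creator",
    "date_made": "dateCreated",
}


def record_to_microdata(record, item_type="VisualArtwork", mapping=FIELD_TO_PROPERTY):
    """Render one record as an HTML fragment carrying schema.org microdata."""
    lines = [f'<div itemscope itemtype="https://schema.org/{item_type}">']
    for field, prop in mapping.items():
        value = record.get(field)
        if not value:
            continue  # leave out fields the record does not populate
        lines.append(f'  <span itemprop="{prop}">{escape(str(value))}</span>')
    lines.append("</div>")
    return "\n".join(lines)


# Invented sample record, not taken from any of the collections sites.
example = {"title": "Example artwork", "maker": "Unknown", "date_made": "1850"}
print(record_to_microdata(example))
```

Because each field already has a known meaning, the markup reduces to a lookup table plus a template, with no text-mining involved.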

Analysis of Resource Usage

Staff Usage Estimate: 31.7 days (at 7 hours per day, not including holidays)

Staff Usage Actual: 28 days

Staff Usage Variance: 11.7%

Other Resource Estimate: £0

Other Resource Actual: £0

Other Resource Variance: 0%
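
The staff variance figure is consistent with expressing the underspend as a percentage of the estimate. The report does not state the formula, so the sketch below is an assumption that happens to reproduce the 11.7% figure; the function name is hypothetical.

```python
def variance_percent(estimate, actual):
    """Underspend as a percentage of the estimate (assumed formula)."""
    if estimate == 0:
        return 0.0  # nothing was budgeted, so there is no variance to report
    return round((estimate - actual) / estimate * 100, 1)


print(variance_percent(31.7, 28))  # staff days
print(variance_percent(0, 0))      # other resources
```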

Outcome

Technical Outputs

In terms of tangible outputs, we were pleased with the outcome of the project. Three of our most important sites were released to the world with their data underpinned by schema.org:

https://collections.ed.ac.uk/art

https://collections.ed.ac.uk/mimed

https://collections.ed.ac.uk/calendars

You can see the microdata markup by installing the 'structured data sniffer' extension on Google Chrome. This is what the search engines will read.
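
As an alternative to a browser extension, the same markup can be inspected with a short script. This is a simplified sketch using only the Python standard library: it collects the text of elements that carry an itemprop attribute, and does not handle values held in attributes such as content or href. The sample HTML is invented for illustration and is not copied from the live sites.

```python
from html.parser import HTMLParser


class ItempropCollector(HTMLParser):
    """Collect (property, text) pairs from elements with an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.found = []       # (property, text) pairs in document order
        self._current = None  # itemprop whose text we are waiting for

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("itemprop")
        if prop:
            self._current = prop

    def handle_data(self, data):
        text = data.strip()
        if self._current and text:
            self.found.append((self._current, text))
            self._current = None


# Invented sample fragment in the style of schema.org microdata.
sample = (
    '<div itemscope itemtype="https://schema.org/VisualArtwork">'
    '<span itemprop="name">Example title</span>'
    "</div>"
)
collector = ItempropCollector()
collector.feed(sample)
print(collector.found)
```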

Six more sites will have the markup in the next web release: Exam Papers, Fairbairn Words, Open Books, Exhibitions, Iconics and Guardbook.

Additionally, we have analysed and specified the code required to apply the markup to future sites: there are probably five or six more that we can apply it to.

 

Investigation, Documentation and Reporting

The interns kept a OneNote file in which they documented everything they did. This included all of their conceptual investigation into schema.org, how they established mappings, decisions over which fields to include or exclude, and advice to help the team apply schema.org to sites as a matter of course.

Nandini presented her findings to the Digital Library Systems Team on the 19th of July, using a well-crafted PowerPoint presentation.

We did run out of time for the report recommending how to apply schema.org to ArchivesSpace, DSpace etc., or options for running against a corpus of mined text. It will probably fall to the team to look further at that, but we can still build on the interns' excellent documentation of how they analysed data and performed mappings.

Both Holly and Nandini wrote excellent blogposts about the project:

http://libraryblogs.is.ed.ac.uk/librarylabs/2018/06/29/marking-up-collections-sites-with-schema-org/

http://libraryblogs.is.ed.ac.uk/librarylabs/2018/07/27/marking-up-collections-sites-with-schema-org-blog-2/

 

Training

Nandini took on the technical side of the project, and thus has learned how to use Atom for code changes and GitHub for deployment. She has a number of 'pull requests' to her name now.

Courses were made available to the interns, and training was undertaken in Google Analytics and SEO. 

Meetings took place with Billy Wardrop and Duncan MacGruer (Ed Web), Alasdair MacDonald (Metadata Co-ordinator), Susan Pettigrew, John Bryden and Juliette Lichman (Digital Imaging Unit) and Jill Forrest (collections), all of which helped develop the interns' knowledge of key aspects of the project.

Explanation for variance

Both interns had periods of sickness over the course of the project, Holly more than Nandini, which accounts for actual staff usage coming in under the estimate.

Key Learning Points

Both interns acknowledge that they have learned a lot about what to expect from 'the world of work' (office life, induction processes, planning processes, and being trusted to work with minimal interference from management). They also have much greater technical knowledge in this area, and greater knowledge of collections within the digital realm.

The team's knowledge of semantic structured data has greatly increased.

Outstanding Issues

  • To make the documentation available on the Wiki.
  • To meet Billy Wardrop to discuss how this content can be better served up on the main ed.ac.uk site through its search (Billy has already built a collection based on the content to work with). Meeting planned for 13/08.
  • To update some outstanding sites with schema.org.

 

Project Info

Project: Schema.org integration with Library and University Collections
Code: LTW006
Programme: z. ISG Learning Teaching and Web - Strategic Projects (Closed)
Management Office: ISG PMO
Project Manager: Scott Renton
Project Sponsor: Melissa Highton
Current Stage: Close
Status: Closed
Start Date: 08-Jan-2018
Planning Date: n/a
Delivery Date: n/a
Close Date: 15-Aug-2018
Overall Priority: Normal
