Hello world!

The licence for our current Google Search Appliance is due to expire. This blog will discuss our initial project to replace it (Phase 1) and then subsequent projects designed to develop it further.

Phase 1

We’re initially going to be working with Funnelback and deploying their out-of-the-box ‘Higher Education Template’. The initial scope includes filters for People, News, and Undergraduate and Postgraduate Courses.

This product offers a Marketing Dashboard that shows what our visitors are searching for, so we can identify both common issues and emerging trends. It also gives us greater control over the design and layout of different kinds of results.

View the Work In Progress Here

Information about the scope and index size of our initial search can be found here.

Feedback, Comments and Suggestions are welcome and can be posted here.

Subsequent Projects

Several user requirements workshops and follow-up discussions with stakeholders have identified a number of more complex requirements around integrating search with other applications and knowledge bases (in particular the Library and NU Connect). These requirements will be prioritised in December 2018 and work will begin early in 2019.

Funnelback also offers us significant scope to personalise results based on user demographics (and to promote events and messages on the same basis). There is also potential for course comparison functionality and for embedding fragments of search within pages, for example to automate the presentation of similar courses or news articles.

 

Requests and Features Roadmap

This page will outline the major requests and features roadmap as it develops.

Several user requirements workshops and follow-up discussions with stakeholders have identified a number of more complex requirements around integrating search with other applications and knowledge bases (in particular the Library and NU Connect). These requirements will be prioritised in December 2018 and work will begin early in 2019.

Some of the things that have been mentioned…

Some of these items come out of the box; others will require development as the roadmap is finalised for future projects.

Requirements by category:

Search Tool:

  • Crawl and index webpages
  • Make use of metadata from webpages
  • Query and return data from databases
  • Integrate with and extract results from APIs of other systems (e.g. Library Search, NU Connect and other knowledge bases)
  • Be aware of synonyms
  • Allow searching based on a question rather than a keyword
  • Be aware of related queries
  • API to generate OpenSearch result format
  • Allow biasing of results
  • Allow removal of individual results from search index
  • Allow voice-activated search

Marketing Dashboard:

  • View search trends
  • View popular searches
  • Generate visual reports for quick analysis
  • Generate detailed reports for integration with other data for analytics

SERP:

  • Allow for promotion/banners at given time periods
  • Allow filtering based on content type metadata
  • Signpost to relevant search areas (Library/Intranet as appropriate), based on search terms and/or the page the user came to the SERP from
  • Mobile responsive
  • By default, be styled like the main University brand
  • User-customisable layouts (tiles/list etc.)
  • Include results from other search systems in tabs/widgets
  • Allow for promotion/banners for different audiences
  • Provide type-ahead/autocomplete
  • Allow filtering based on user type
  • Customise results based on various categories of user
  • Customise appearance of result based on type of content
  • Inherit template from the site the user came from
  • Allow feedback on search results
  • Shopping-basket-like comparisons
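
To give a feel for the OpenSearch result format requirement above: OpenSearch 1.1 response elements (totalResults, startIndex, itemsPerPage) are usually embedded in an RSS or Atom feed. The sketch below builds a minimal RSS fragment using only the Python standard library. It is an illustration, not Funnelback's API; the function name, its parameters and the sample result are all hypothetical, and a complete feed would need further channel elements such as a link and description.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OpenSearch 1.1 specification.
OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"
ET.register_namespace("opensearch", OPENSEARCH_NS)


def build_opensearch_rss(query, results, total, start_index=1):
    """Build a minimal RSS document carrying OpenSearch response elements.

    `results` is a list of (title, link) tuples for the current page;
    this input shape is an assumption made for the example.
    """
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"Search results for {query}"

    # Paging metadata expected by OpenSearch clients.
    paging = (
        ("totalResults", total),
        ("startIndex", start_index),
        ("itemsPerPage", len(results)),
    )
    for name, value in paging:
        ET.SubElement(channel, f"{{{OPENSEARCH_NS}}}{name}").text = str(value)

    # One RSS item per search result.
    for title, link in results:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = link

    return ET.tostring(rss, encoding="unicode")
```

Calling build_opensearch_rss("library", [("Library", "https://www.ncl.ac.uk/library/")], total=1) returns an XML string that other systems could parse to show these results in their own tabs or widgets.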

 

Search – Known Issues

This page will be updated with any known issues related to Search and information related to how and when we expect to have them resolved.

Duplicate Staff Profiles

See an example here. This happens where a member of staff has multiple profiles across multiple websites. While MyImpact has a notion of a primary profile, that profile is still often used on multiple websites, and in these cases the search has no explicit way of knowing which website to treat as the primary one to link to. This may not be solvable by the Search project, but a discussion about how and where staff profiles are managed and displayed might point to a way to resolve it.

About the Index – December 2018

The scope of the search, as we’re rolling it out in December 2018, includes the following locations:

The Corporate Web Presence and Marketing:

  • https://www.ncl.ac.uk/
  • https://microsites.ncl.ac.uk/

Academic Resources:

  • https://roomfinder.ncl.ac.uk/
  • https://blackboard.ncl.ac.uk/
  • https://docking.ncl.ac.uk/
  • https://my.ncl.ac.uk/staff/
  • https://my.ncl.ac.uk/students/
  • https://www.ncl.edu.my/itservice

Project Websites:

  • https://research.ncl.ac.uk/
  • https://teaching.ncl.ac.uk/
  • https://conferences.ncl.ac.uk/

Personal Publishing:

  • https://www.societies.ncl.ac.uk/
  • https://www.staff.ncl.ac.uk/
  • https://www.students.ncl.ac.uk/
  • https://blogs.ncl.ac.uk/

This search is designed for external-facing use, so it does not index NU Connect or other elements of the Newcastle University Intranet. NU Connect runs on SharePoint, which already has SharePoint Enterprise Search (SES) capabilities built in, so we need to look at how Funnelback and SES can work together and complement each other, rather than duplicate or replace one another.

A large quantity of content is already filtered from the results because these pages contain content that’s not immediately relevant to searches (RSS feeds, for example) or is only internally accessible.
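
As a rough illustration of the scoping and filtering described above, the sketch below checks whether a URL sits inside the December 2018 scope and is not caught by an exclusion rule. It is not Funnelback configuration. The scope prefixes are copied from the list above, but the function name is_indexable and the exclusion rules (feed-style path segments and suffixes) are assumptions made for the example; the real filters are not documented here.

```python
from urllib.parse import urlsplit

# Scope prefixes copied from the December 2018 list above.
SCOPE_PREFIXES = [
    "https://www.ncl.ac.uk/",
    "https://microsites.ncl.ac.uk/",
    "https://roomfinder.ncl.ac.uk/",
    "https://blackboard.ncl.ac.uk/",
    "https://docking.ncl.ac.uk/",
    "https://my.ncl.ac.uk/staff/",
    "https://my.ncl.ac.uk/students/",
    "https://www.ncl.edu.my/itservice",
    "https://research.ncl.ac.uk/",
    "https://teaching.ncl.ac.uk/",
    "https://conferences.ncl.ac.uk/",
    "https://www.societies.ncl.ac.uk/",
    "https://www.staff.ncl.ac.uk/",
    "https://www.students.ncl.ac.uk/",
    "https://blogs.ncl.ac.uk/",
]

# Hypothetical exclusion rules, chosen only to illustrate filtering RSS feeds.
EXCLUDED_SUFFIXES = (".rss", ".atom")
EXCLUDED_SEGMENTS = {"feed", "rss"}


def is_indexable(url: str) -> bool:
    """Return True if the URL is in scope and matches no exclusion rule."""
    if not any(url.startswith(prefix) for prefix in SCOPE_PREFIXES):
        return False
    path = urlsplit(url).path.lower()
    if path.endswith(EXCLUDED_SUFFIXES):
        return False
    # Reject any URL with a feed-like path segment, e.g. /example/feed/.
    segments = {segment for segment in path.split("/") if segment}
    return segments.isdisjoint(EXCLUDED_SEGMENTS)
```

With these assumed rules, is_indexable("https://www.ncl.ac.uk/undergraduate/") is true, while a feed URL under https://blogs.ncl.ac.uk/ or any URL outside the listed locations is rejected.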

This results in an index of around 80,000 pages.

Naturally, some parts of this estate are higher priority than others, but the algorithms built into the system should be able to address this without too much manual intervention.

That said, it may take time to tune the new system based on the feedback we receive and on requests to add, remove or prioritise parts of the index, since raising the priority of one thing lowers the priority of something else. We’ll be working with stakeholders over the coming weeks to ensure that feedback is addressed with care.

There are a lot of factors to consider, in particular how we handle duplicate content, which can dilute its effect. Some examples:

  • Many members of staff have multiple profiles across many websites.
  • Often news is duplicated across multiple sites – and syndicated onto many lists.
  • Blog or news-style sites generate a lot of list pages for their content, sometimes repeating the same list many times with different filters applied.
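
One common way to spot the kinds of duplication listed above is to fingerprint each page's normalised text and group the URLs that share a fingerprint. The sketch below is a generic illustration of that idea, not a description of how Funnelback handles duplicates; the function names and the shape of the input (a mapping of URL to extracted page text) are hypothetical.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash the text after lowercasing and collapsing whitespace."""
    normalised = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def group_duplicates(pages: dict[str, str]) -> list[list[str]]:
    """Return groups of URLs whose page text has the same fingerprint.

    `pages` maps URL -> extracted page text. Pages with unique text
    are left out of the result.
    """
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]
```

This only catches pages whose text is identical after normalisation. Syndicated news with different surrounding templates, or list pages with different filters applied, would need a near-duplicate technique or an explicit canonical URL instead.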

Future projects will be looking at integrations with other systems (Library Searches, NU Connect, ePrints, and other knowledge bases).

Please submit Feedback, Suggestions and Comments here.