Categories
Report

Using open citation data to identify new research opportunities

This year’s Annual Forum for the LIS-BIBLIOMETRICS mailing list took place at the British Library Knowledge Centre on 29 January 2019, and focused on the topic of ‘Open Metrics and Measuring Openness’.

As well as two keynotes, three parallel workshop sessions and a panel discussion featuring four participants, a session featuring five-minute ‘lightning talks’ gave nine of us a chance to give presentations on work relevant to the Forum topic.

Image 1
Stephen Pearson giving presentation at LIS-BIBLIOMETRICS Annual Forum 2019

Unlike most of the other offerings at the Forum, my talk wasn’t about the use of metrics in evaluations or about how to measure openness. Instead, I talked about the way in which large open sets of citation data can give interesting information about patterns of citation. I reported on some initial exploratory work we’ve done to see whether this information can help identify new research opportunities.

Where does inspiration for research come from?

Image 2
Isaac Newton

Inspiration for research can come from many different sources. For example, going back about 350 years, the act of noticing that an apple always falls perpendicularly to the ground could lead you to muse whether the earth had some power of attraction which caused this, and ultimately develop the law of universal gravitation.

Of course it’s still possible to come up with new ideas and discoveries on the basis of such ‘Eureka’ moments. But with the increasing focus on interdisciplinarity, it has been said that ‘revolutionary scientific discoveries … are often the result of connecting ideas that have their origin in different disciplines.’

That quotation is from an article entitled ‘Interdisciplinary Research Boosted by Serendipity’. But do we have to rely on serendipity to discover that ideas from one discipline can be profitably applied in another field in a novel fashion? Or is there a way of systematically identifying such potential links?

I’d suggest that the technique of bibliographic coupling can help.

Bibliographic coupling

Image 3
Diagram illustrating bibliographic coupling

If two documents share several references in common, as Documents A and B do, then those documents are ‘bibliographically coupled’. And there’s at least a possibility that the two Citing Documents are using similar approaches to the research questions they’re respectively addressing.

In many cases the two Citing Documents will be by researchers who are addressing the same research question, or very closely related questions, and so the sharing of references has no deeper significance. A potentially more significant scenario occurs when the two Citing Documents are by researchers working in somewhat different fields. In that situation, the bibliographic coupling is a pointer to at least the possibility of a previously unidentified cross-disciplinary research connection.
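As a rough illustration of the idea, the strength of the coupling between two documents can be counted as the number of references they have in common. The sketch below is not taken from our own work; the function name and the DOIs are made up for the example.

```python
def coupling_strength(refs_a, refs_b):
    """Return how many references two citing documents have in common."""
    # DOIs are case-insensitive, so normalise before comparing.
    set_a = {doi.lower() for doi in refs_a}
    set_b = {doi.lower() for doi in refs_b}
    return len(set_a & set_b)


# Hypothetical reference lists for two citing documents.
doc_a_refs = ["10.1000/ref1", "10.1000/ref2", "10.1000/ref3"]
doc_b_refs = ["10.1000/ref2", "10.1000/ref3", "10.1000/ref4"]

shared = coupling_strength(doc_a_refs, doc_b_refs)
print(f"Documents A and B share {shared} references")
```

The higher the count, the more strongly the two documents are coupled.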

COCI, the OpenCitations Index of Crossref open DOI-to-DOI citations

So, what citation database could we use to try to identify such connections? The best-known are the commercial products Web of Science and Scopus, but there are at least two barriers to using them for this type of work. The first is the need to pay a subscription cost to use them at all. The second is the limit on the number of records one can download, which makes it difficult to amass a dataset big enough to allow the required ‘mining’ for links.

We used COCI, a dataset created by the OpenCitations organisation. This was originally known as the Crossref Open Citations Index but it’s now the OpenCitations Index of Crossref open DOI-to-DOI citations.

Updated at least every six months, this dataset comprises all the DOI-to-DOI citations specified by open references in Crossref, which currently amounts to almost 450 million DOI-to-DOI citation links based on over 45 million bibliographic resources.

Looked at in terms of proportional coverage and from our own institution’s point of view, it comprises c.8,000 University of Manchester papers published since 2014 (which makes up around one-third of the total), together with their citation references.

This figure gives an example of the information which COCI provides, for each of the more than 45 million bibliographic resources whose DOIs it contains.

Image 4
Diagram illustrating information in COCI

The data is available in several formats. We downloaded it as a CSV ‘dump’, and then filtered it to extract only those records where the DOI of the citing paper matched that of a paper in our own institutional publication records. This table gives some examples of the information which we then combined with the COCI information.

Image 4B
Table illustrating Manchester metadata for enhancing information from COCI

This meant that we were able to create a rich dataset which comprised c.8,000 records for University of Manchester papers, each in the following format.

Image 5
Diagram illustrating information in COCI enhanced with Manchester metadata
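The filtering and enrichment steps described above could be sketched roughly as follows. This is a minimal illustration rather than the code we actually ran: it assumes the dump has columns named `citing` and `cited`, and the metadata field names (`faculty`, `school`), the function names and all the DOIs are hypothetical.

```python
import csv
import io
from collections import defaultdict


def filter_citations(csv_file, institutional_dois):
    """Map each institutional citing DOI to the set of DOIs it cites."""
    wanted = {doi.lower() for doi in institutional_dois}
    references = defaultdict(set)
    for row in csv.DictReader(csv_file):
        citing = row["citing"].strip().lower()
        if citing in wanted:
            references[citing].add(row["cited"].strip().lower())
    return dict(references)


def enrich(references, metadata):
    """Combine each paper's reference list with its institutional metadata."""
    return [
        {"doi": doi, **metadata[doi], "references": sorted(refs)}
        for doi, refs in sorted(references.items())
    ]


# A tiny stand-in for the CSV dump (made-up DOIs).
SAMPLE_DUMP = """oci,citing,cited
1,10.1000/man1,10.1000/ref1
2,10.1000/man1,10.1000/ref2
3,10.1000/other,10.1000/ref1
4,10.1000/MAN2,10.1000/ref2
"""

# Hypothetical institutional records, keyed by lower-case DOI.
METADATA = {
    "10.1000/man1": {"faculty": "Faculty A", "school": "School X"},
    "10.1000/man2": {"faculty": "Faculty B", "school": "School Y"},
}

records = enrich(filter_citations(io.StringIO(SAMPLE_DUMP), METADATA), METADATA)
for record in records:
    print(record)
```

Reading the file row by row, rather than loading it all at once, keeps memory use modest even for a dump containing hundreds of millions of citation links.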

Pointers to new research collaboration possibilities?

A colleague from the Library’s Digital Technologies and Services team wrote a program which pulled out all pairs of bibliographically coupled papers which had at least two references in common but where the authors came from different Faculties. We then used the free VOSviewer software to produce the following visualisation of bibliographic coupling links between publications by Schools/Divisions in different Faculties. (The closer two nodes are to each other, the stronger the bibliographic coupling between them.)

Image 6
Network visualisation of bibliographic coupling links between publications
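The pair-finding step described above might look something like the sketch below. It is not our colleague’s actual program: it makes the simplifying assumption that each paper belongs to a single Faculty, and the record layout, function name and identifiers are invented for illustration.

```python
from itertools import combinations


def cross_faculty_pairs(records, min_shared=2):
    """Find coupled pairs of papers whose authors are in different Faculties.

    Returns (doi_a, doi_b, number_of_shared_references) tuples.
    """
    pairs = []
    for a, b in combinations(records, 2):
        if a["faculty"] == b["faculty"]:
            continue  # only cross-Faculty pairs are of interest
        shared = set(a["references"]) & set(b["references"])
        if len(shared) >= min_shared:
            pairs.append((a["doi"], b["doi"], len(shared)))
    return pairs


# Hypothetical enriched records.
sample_records = [
    {"doi": "p1", "faculty": "Faculty A", "references": ["r1", "r2", "r3"]},
    {"doi": "p2", "faculty": "Faculty B", "references": ["r2", "r3", "r4"]},
    {"doi": "p3", "faculty": "Faculty A", "references": ["r1", "r2", "r3"]},
]

coupled = cross_faculty_pairs(sample_records)
for doi_a, doi_b, strength in coupled:
    print(doi_a, doi_b, strength)
```

Comparing every pair of papers is the simplest approach and is workable for a few thousand records. For a much larger dataset, an index from each reference to the papers citing it would avoid comparing pairs which have nothing in common.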

What does this show? Here are two examples.

  • The two closely juxtaposed purple nodes at the bottom of the visualisation show that papers from the Division of Information, Imaging and Data Sciences (in the Faculty of Biology, Medicine and Health) and the School of Electrical and Electronic Engineering (in the Faculty of Science and Engineering) shared references in common.
  • The two closely juxtaposed green nodes at the left of the visualisation show that papers from the School of Mechanical, Aerospace and Civil Engineering (in the Faculty of Science and Engineering) and the School of Environment, Education and Development (in the Faculty of Humanities) shared references in common.

Do they highlight previously unsuspected opportunities for innovative new cross-disciplinary research? Unfortunately, no.

  • The juxtaposed purple nodes simply reflected the fact that closely related algorithmic approaches to medical diagnosis and to computer vision are used in both the Division of Information, Imaging and Data Sciences and the School of Electrical and Electronic Engineering. Although this is interesting, it’s not the kind of unsuspected connection we’d hoped to uncover.
  • Similarly, the juxtaposed green nodes show that approaches to the optimisation of land, water and energy use are an area of interest both to researchers in Civil Engineering and to those in Environment, Education and Development. Again, this isn’t an unsuspected connection which the bibliographic coupling has surprisingly brought to light.

Image 7
Nobel Prize Medal (Nobel-Prize CC-BY Abhijit Bhaduri via Flickr)

We’re not expecting to hear Manchester’s next Nobel Prize winners using their acceptance speech to thank us for bibliometric work which first alerted them to the possibility of a ground-breaking collaboration any time soon. However, the way in which this work highlighted related research being carried out in different Faculties (however unsurprising the specific examples) serves as an encouraging proof of concept.

Use your LOAF!

Open Access week banner

To celebrate International Open Access Week, The University of Manchester Library’s Research Services, Academic Engagement and Marketing teams worked together to deliver a seminar on Open Access (OA) at Manchester. The main aim of the session was to engage with our institution’s researchers. Through a combination of presentations, Q&A sessions and networking opportunities, the seminar brought researchers up to date with what Manchester has achieved with OA; the policies of research funders; progress in OA over the last year; and insight into upcoming developments.

Increasing citations

The Vice-President for Research and Innovation Professor Luke Georghiou opened proceedings with his own take on Open Access. His research group has published three OA articles in the past year, which have achieved high levels of download; he is convinced this is due to ease of access, and is sure that OA will contribute to future levels of citation. Professor Georghiou thanked the Library for its excellent support.

Exceeding compliance targets

Open Access Seminar graph 2

Helen Dobson reflected on the growth of the Library’s OA service, which now plays a key role in the University’s OA support. Our work resulted in a 54% compliance rate for RCUK-funded research, well above the 45% target set by RCUK at the start of the year. Helen discussed the ‘pain points’ encountered by the team, including authors finding the process confusing, or being too busy to arrange OA. These insights help us develop our system and work with other institutions and publishers to streamline procedures. Despite these difficulties, our service has received great feedback and supported over 500 articles in becoming Open Access.

Making books as accessible as journals

Dr Frances Pinter, CEO of Manchester University Press, spoke of the need to find sustainable routes to OA for specialist scholarly books, and make them as accessible as science journals. The not-for-profit pilot Knowledge Unlatched has succeeded in proof of concept. With this model a library consortium paid for a package of e-books to be made fully open, and librarians participated in the selection of content. There has been a high level of downloads.

HEFCE, COAF and LOAF

Emma Thompson explained the new ‘game changing’ HEFCE policy. All potential REF outputs must be deposited in an institutional repository on acceptance, discoverable immediately, and free to read ASAP. We are encouraging researchers to deposit their Author’s Accepted Manuscripts (AAM) ahead of the compliance start date of 1 April 2016, and the Library is working with colleagues in Computer Science to develop an easy interface.

Our team will also be administering the new Charities OA Fund (COAF) at Manchester. We have further demonstrated our commitment to innovation, OA and the University’s researchers by announcing the new Library Open Access Fund (LOAF). We want to support authors who do not have funding to cover Article Processing Charges, and have created a pool of funds to support the publication of OA papers. The LOAF pilot will be managed by the Library’s OA team and will be run on a first come, first served basis.

If you would like a slice of LOAF, please contact the OA team.