Based on DOIs from the following file: {{ currentGraph.tabLabel }}

Data retrieved through {{ currentGraph.API }} API on {{ new Date(currentGraph.timestamp).toLocaleDateString() }} (estimated completeness: {{ completenessPercent }}%).

{{ inputArticlesFiltered.length + " / " + currentGraph.input.length }}

These suggested articles are the most cited references of the input articles. By design, they can only be older than the input articles.

{{ suggestedArticlesFiltered.length + " / " + currentGraph.suggested.length }}
newer (how to read this network) older (zoom by scrolling)

Click to enter source DOI
or scan a text file for DOIs instead!

(Load examples instead or see Frequently asked questions)

Instructions

What is this web app about?

This web app aims to help scientists with their literature review using metadata from Microsoft Academic and Crossref. Academic papers cite one another, thus creating a citation network (= graph). Each node (= vertex) represents an article and each edge (= link / arrow) represents a reference / citation. Citation graphs are a topic of bibliometrics, for which other great software exists as well.

This web app visualizes subsets of the global citation network that I call "local citation networks", defined by the references of a given set of input articles. In addition, the most cited references missing in the set of input articles are suggested for further review.

How to use this web app?

There are basically two ways to create new networks. The first is based on the references of a given paper, such as a review or simply an interesting paper you'd like to dig deeper into. Either click the + button at the top right and enter the digital object identifier (DOI) of the source article, or, if you have already opened a network, click the blue reference-button of an interesting citation in the Refs-column of the tables.

The second way to create a new network is to scan a text file for DOIs, which allows you to create your own unique lists of input articles. For example, this could be a .ris / .bib / .json file exported by your reference manager (e.g. Zotero or EndNote). In theory, you can scan any plain text file (PDFs and Microsoft Office files don't work). However, if the DOIs are embedded in sentences or URLs and have a trailing full stop (.) or slash (/), the system may have difficulty extracting them correctly and might not find all of them.
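To illustrate the kind of extraction involved, here is a minimal sketch in JavaScript. The regex and cleanup rules are illustrative assumptions, not the app's actual implementation:

```javascript
// Illustrative DOI extraction from plain text (not the app's actual code):
// match DOI-like patterns, then strip trailing punctuation that often
// clings to DOIs embedded in sentences or URLs.
function extractDois (text) {
  const doiPattern = /10\.\d{4,9}\/[^\s"'<>]+/g
  const matches = text.match(doiPattern) || []
  // trailing full stops, commas, semicolons, slashes and parentheses
  // are usually sentence or URL residue
  const cleaned = matches.map(doi => doi.replace(/[.,;/)]+$/, ''))
  return [...new Set(cleaned)] // deduplicate
}

// extractDois('See https://doi.org/10.1000/xyz123. Also 10.1000/xyz123')
// → ['10.1000/xyz123']
```

This also shows why a DOI ending a sentence is ambiguous: the trailing full stop is stripped here heuristically, which can fail for the rare DOI that genuinely ends in punctuation.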

Up to 5 tabs can be open at the same time, each representing a different local citation network. Only one query can be processed at a time, which is why some buttons are disabled while a query is in progress. Microsoft Academic seems to be able to process quite large numbers of references (I've successfully tried up to 500), but Crossref's API returns an error when trying to process too many references (depending on the length of the search string, roughly more than 110 DOIs).

Where do the suggested articles come from?

Suggested articles are the (locally) most cited references of the input articles. In order to create the local citation network, this web app attempts to retrieve reference-lists from all input articles (usually some are missed, see completeness) and then orders these references by number of local citations (in-degree, see network). Some of the top references might be among the input articles, depending on how connected they are. However, some of the top references usually aren't among the input articles and then become suggested articles (they must have at least two citations among the input articles). Currently the number of suggested articles varies from 0 to 10, with 10 being an arbitrary cutoff that could potentially be lifted in the future.
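In simplified terms, the selection could be sketched like this. Variable names and data shapes are assumptions for illustration, not the app's actual code:

```javascript
// Sketch of suggested-article selection: count local citations
// (in-degree) for every reference, then keep the most-cited ones that
// are not themselves input articles and have at least two local citations.
function suggestArticles (inputArticles, maxSuggestions = 10) {
  const inputDois = new Set(inputArticles.map(a => a.doi))
  const citationCounts = {}
  for (const article of inputArticles) {
    for (const ref of article.references) {
      citationCounts[ref] = (citationCounts[ref] || 0) + 1
    }
  }
  return Object.entries(citationCounts)
    .filter(([doi, count]) => count >= 2 && !inputDois.has(doi))
    .sort((a, b) => b[1] - a[1]) // most cited first
    .slice(0, maxSuggestions)
    .map(([doi]) => doi)
}
```

For example, if three input articles A, B and C all cite X, and A and B also cite Y, then X and Y become suggestions (in that order), while a reference cited only once is dropped.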

Keep in mind that because of the nature of this approach, suggested articles are always older than the source article. This can help identify seminal papers of the field but not newer state-of-the-art research.

How to read the citation network?

Zoom by scrolling and pan by dragging. Each node (= vertex) represents an article. They are ordered by year from top (newer) to bottom (older), with each year colored differently. The size of the nodes depends on their "in-degree", the number of citations the paper received from the input articles, acting as a metric for importance in this specific context. The local "in-degree" is often but not always proportional to the global citation count, particularly in cross-sections of scientific fields. Circle-shaped nodes represent input articles and star-shaped (★) nodes represent suggested articles, which are cited by the input articles but not part of them themselves.

Each edge (= link / arrow) represents a reference and a citation at the same time: edges are references from their outgoing nodes and citations for their incoming nodes. Usually papers can only cite older papers, hence edges tend to point downwards (or sideways for citations of the same year). Technically, this makes the network a directed acyclic graph, and that's why nodes tend to be bigger (cited more) towards the bottom: they have had more time to become influential. This approach helps with identifying seminal (usually older) papers but cannot suggest new state-of-the-art research.

How to read the co-authorship network?

This "local" author network shows the most common authors among the source, input and suggested articles. Depending on the discipline, the number of (co)authors among a set of multiple articles can quickly rise to the hundreds, which is why this author network only shows authors with a minimum number of local articles (not to be mistaken for their global number of publications, which is usually much higher). The default minimum number of local articles is determined so that this network shows no more than 50 authors, but it can be changed with the slider at the bottom from 2 to 10.
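The default threshold could be determined along these lines, a simplified sketch under assumed data shapes rather than the app's actual code: starting at 2, the minimum is raised until no more than 50 authors remain.

```javascript
// Sketch: pick the smallest minimum number of local articles (between
// 2 and 10) at which no more than maxAuthors authors would be shown.
// authorCounts maps author name → number of local articles.
function defaultMinPublications (authorCounts, maxAuthors = 50) {
  for (let min = 2; min <= 10; min++) {
    const shown = Object.values(authorCounts).filter(n => n >= min).length
    if (shown <= maxAuthors) return min
  }
  return 10 // slider maximum as a fallback
}
```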

Each node (= vertex) represents an author. The size depends on their number of local articles and the color depends on the author-cluster they first appear in. Diamond-shaped (◆) nodes are also authors of the source article whose references define the input articles. Click authors to filter input & suggested articles, ctrl+click to filter for more than one author.

Each edge (= link) represents a collaboration between two authors among the source, input and suggested articles, the width indicating the number of collaborations. Click edges to filter input & suggested articles.

Which API should be used and how to switch?

Microsoft Academic is used by default because it is usually more comprehensive than Crossref. If the Microsoft-button (❖) in the top-right menu is white, Microsoft Academic is being used; if it's black, Crossref is being used. Toggle by clicking. If the Microsoft Academic API is down or the key's monthly quota is exceeded, it will automatically be turned off. Get your own free key here and enter it in the dialog to get your own quota of 10,000 transactions per month! The key stays on your local machine and is only shared with Microsoft.

What does "Autosave results locally" do?

When activated, this option caches your recent networks (up to 5 tabs) and your settings (including your personal Microsoft Academic API key) locally in your browser in the so-called localStorage. When you revisit this web app, you can pick up where you left off. When deactivated, this data is deleted.
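A minimal sketch of this kind of localStorage caching; the key names and data shapes are hypothetical, not the app's actual ones:

```javascript
// Sketch of autosave via the browser's localStorage (hypothetical keys).
function saveState (graphs, settings) {
  localStorage.setItem('graphs', JSON.stringify(graphs.slice(-5))) // keep up to 5 tabs
  localStorage.setItem('settings', JSON.stringify(settings))
}

function loadState () {
  return {
    graphs: JSON.parse(localStorage.getItem('graphs') || '[]'),
    settings: JSON.parse(localStorage.getItem('settings') || '{}')
  }
}

// Deactivating autosave would simply delete the cached data:
function clearState () {
  localStorage.removeItem('graphs')
  localStorage.removeItem('settings')
}
```

localStorage persists across browser sessions but stays entirely on your machine, which is why no data leaves your computer when this option is on.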

Troubleshooting

This web app is not working!? Which browsers are supported?

I've tested with current versions of Firefox, Safari and Chrome (as of November 2019) and made sure this web app runs in them. Microsoft Edge and Microsoft Internet Explorer are unfortunately not supported and do not work at all right now. This web app is meant for desktop use and is not optimized for smartphones, which I haven't tested at all. I'm a single person and created this web app in my spare time for free and for fun, so I don't have the time and resources to test all possible devices, web browsers and scenarios. This web app is open source, so feel free to support it by extending compatibility or optimizing mobile / tablet use! If you find a bug or problem with the newest versions of Firefox / Safari / Chrome, please report it on GitHub or contact me.

The queries have turned very slow!? / How to cancel loading?

Both the Microsoft Academic API and the Crossref API are public with varying response speeds depending on their overall usage and workload. In addition, I hypothesize they might be throttling individual users (as identified by IP address) when they perform too many queries in a row. Checking your internet connection, only querying what's really important, switching to the other API and trying again later might help!

Sometimes (particularly with unstable internet connections) queries seem to run forever. In case you want to cancel loading, activate 'Autosave results locally' (see autosave) and refresh the page. You can deactivate autosave again afterwards.

Is the data complete? / Some references are missing!?

The data is usually incomplete. Some DOIs cannot be found at all in Microsoft Academic / Crossref, and for those that are found, some metadata such as authors, abstracts or reference-lists is often missing (particularly in Crossref). For the citation network, the completeness of the reference-lists matters most. Microsoft Academic's data is usually more complete; it is currently turned {{ useMA ? 'on' : 'off' }} (how to toggle).

The estimated completeness can be seen above the search bar in the "Input articles" tab and is calculated in the following way: For Microsoft Academic it is the fraction of input articles that have reference-lists themselves (multiplied by the fraction of DOIs found in Microsoft Academic in case of an input file).

Crossref allows a more granular calculation, as it often also provides the total reference count, which is often larger than the number of references with DOIs (older references often don't have DOIs, and neither do certain books or papers). The estimated completeness is thus calculated as the product of three fractions:

  1. Source reference completeness: (Number of input articles found in Crossref) / (Total reference count of source, or total number of DOIs in file)
  2. Reference-list availability: (Number of input articles that have reference-lists themselves) / (Number of input articles)
  3. Average reference completeness among those input articles that do have reference-lists: (Length of reference-list) / (Total reference count)
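The product of the three fractions can be sketched as follows; variable and field names are illustrative assumptions, not the app's actual code:

```javascript
// Sketch of the three-fraction completeness estimate for Crossref.
function estimateCompleteness (foundCount, expectedTotal, articles) {
  // 1. source reference completeness: articles found / articles expected
  const sourceCompleteness = foundCount / expectedTotal
  // 2. fraction of found input articles that have a reference-list
  const withRefs = articles.filter(a => a.referenceList.length > 0)
  const refListFraction = withRefs.length / articles.length
  if (withRefs.length === 0) return 0
  // 3. average (reference-list length / declared total reference count)
  //    among articles that do have reference-lists
  const avgRefCompleteness = withRefs
    .map(a => a.referenceList.length / a.referenceCount)
    .reduce((sum, x) => sum + x, 0) / withRefs.length
  return sourceCompleteness * refListFraction * avgRefCompleteness
}
```

For instance, if 4 of 5 expected input articles are found, 2 of the 4 have reference-lists, and those two lists cover half of their declared references on average, the estimate is 4/5 × 2/4 × 1/2 = 0.2, i.e. 20%.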

Background

What is bibliometrics? What other software can you recommend?

Bibliometrics is a branch of information science that uses statistical methods to analyze publications. It encompasses scientometrics, which looks specifically at scientific publications, and citation analysis.

A highly recommended tool very similar to this one is Citation Gecko (based on Crossref), which I didn't know about when creating Local Citation Network. The main difference is that you start with a small number of seed articles (5-6 are recommended) and incrementally grow the network by adding recommended articles to the pool of seed articles. Recommended articles include all references and even global citations.

Other great software to help with bibliometric / scientometric analyses includes Anne-Wil Harzing's Publish or Perish, which allows sophisticated queries of a plethora of APIs including Microsoft Academic and Crossref but also Google Scholar, Scopus and Web of Science. If you're looking to create more complex networks than this web app allows, check out Leiden University's great VOSviewer and CitNetExplorer, which let you create and customize everything from citation networks to author networks, university networks and journal networks. VOSviewer also allows text mining and co-occurrence networks.

Where is the data coming from, which API is used?

The data comes either from Microsoft Academic (default, as it is more comprehensive) or Crossref (fallback). Microsoft Academic is a freely available academic search engine based on the proprietary Microsoft Academic Graph (similar to Clarivate's Web of Science and Elsevier's Scopus, both of which require paid subscriptions). Several publications have analyzed the quality and coverage of Microsoft Academic, with overall very promising results. Hug et al. (2017) concluded: "The citation analyses in MA [Microsoft Academic] and Scopus yield uniform results". The Microsoft Academic API requires a personalized key, which is free for up to 10,000 transactions per month. This web app comes with a test key, but it's faster and more reliable to get your own free key here!

Crossref is a not-for-profit organization that collects metadata from scholarly journals and publications. Unfortunately, not all publishers share their metadata with them, which is why the information is usually incomplete (particularly reference-lists and abstracts are often missing). If you notice a journal that always lacks some metadata, feel free to write them and suggest better integration with Crossref, which is only in the journal's best interest! Crossref's API is freely available.

Why aren't you using Google Scholar or Semantic Scholar's API?

Unfortunately, Google Scholar doesn't share its data and discourages web scraping. Consequently, it doesn't offer an API.

Semantic Scholar has a free API, which is great. Unfortunately, however, it is too limited: it doesn't allow retrieving information on multiple DOIs at once and enforces a rate limit, effectively rendering it unsuitable for this project's use case.

I have an idea! Can you implement XYZ?

It's great to see you're getting involved! While I see numerous ways to extend this project myself, I don't have much time to work on it anymore unfortunately. However, this web app is open source and I'd be very happy to see others extend this project! In the meantime, check out these great applications, which might already provide what you're looking for.

I'd like to contribute! Is this open source and are you looking for help? / Is this on GitHub?

Yes, great! This project is open source (GPL-3) and the source can be found on GitHub. I don't have much more time to work on this project, so I'm actively looking for help and contributors. Bugfixes are always welcome but please contact me before any large pull-requests so we can coordinate.

Does this web app use cookies? What's the privacy policy?

No, this web app doesn't use any cookies. No user-tracking or fingerprinting is performed and no usage-data is collected by the author of this web app. Optionally, so-called localStorage can be used to cache results and settings locally when "Autosave results locally" is manually turned on (see autosave).

This web app is running locally on your computer and only contacts the API servers of Microsoft Academic / Crossref during runtime. Microsoft has a general privacy statement, I haven't found anything specific to the Microsoft Academic API. Crossref also has a general privacy statement. This web app is hosted by GitHub pages, see their privacy statement here.

Technicalities: Which JavaScript libraries are used and why no CDN?

This web app uses Vue.js, Buefy, Axios and vis.js Network. I opted against using content delivery networks (CDN) to speed up delivery of these JavaScript libraries and thus initial loading-time, because of the potential user-tracking by CDNs. If you're interested, this rising problem is partly countered by the great browser-plugin Decentraleyes (see their FAQ for more details).

Who are you, why did you create this and how can I contact you?

I'm Tim Wölfle from Germany and started this project during the literature review for my Master's thesis. I had often manually scanned reference lists of papers to identify seminal articles and thought this should be automated. You can contact me on Twitter or by e-mail.