
Closed
Open
Opened Nov 11, 2020 by Maxime Chaillet (@mchaille)

Deploy a tool to dynamically render pages

Our DOI landing page uses JavaScript to render its content in the browser when the page loads. Unfortunately, not all indexing robots (including Google's bots) are able to run these scripts to generate the corresponding HTML for the page.

The solution is to add a service that, when a request from a bot arrives, serves an HTML version of the requested DOI landing page instead of the script-driven page that visitors get at doi.esrf.fr.
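To make the routing idea concrete, here is a minimal Python sketch of the decision, not the actual configuration. The list of crawler tokens is illustrative and incomplete, and `RENDERER_URL`, `is_bot` and `target_url` are hypothetical names introduced only for this example.

```python
import re

# Illustrative, non-exhaustive list of crawler User-Agent tokens (assumption).
BOT_PATTERN = re.compile(
    r"googlebot|bingbot|duckduckbot|baiduspider|yandexbot", re.IGNORECASE
)

# Hypothetical address of a dynamic renderer; not a real endpoint.
RENDERER_URL = "http://renderer.example/render"
SITE_URL = "https://doi.esrf.fr"


def is_bot(user_agent: str) -> bool:
    """Return True if the User-Agent header looks like a known crawler."""
    return bool(BOT_PATTERN.search(user_agent or ""))


def target_url(path: str, user_agent: str) -> str:
    """Pick where a request should be served from.

    Bots are sent to the renderer, which returns static HTML for the page;
    regular visitors get the normal script-driven page.
    """
    page = SITE_URL + path
    if is_bot(user_agent):
        return f"{RENDERER_URL}/{page}"
    return page
```

In practice this decision would probably live in the web server configuration rather than in application code; the sketch only shows the logic.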

We have not yet deployed such a service locally (i.e. at ESRF). Instead, we are using a remote service from a third party: https://gitlab.esrf.fr/icat/doi-landing-page/-/blob/master/www/.htaccess#L22

This is not a long-term solution because:

  • we depend on a third-party service which can be shut down at any time
  • we cannot configure the service.

One important point is to be able to provide the HTML version of the DOI landing page to Google's bots fast enough. If this takes too long, the Google indexing service may time out and the page is not indexed. This can be solved with a caching system, which is normally a feature of a dynamic renderer. We have no control over that caching feature on the third-party service.
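As an illustration of why caching helps here, the following is a small Python sketch of a time-limited cache placed in front of a slow rendering function. It is not how any particular renderer implements its cache; `RenderCache`, `render` and `ttl_seconds` are hypothetical names chosen for this example.

```python
import time
from typing import Callable, Dict, Tuple


class RenderCache:
    """Keep rendered HTML in memory and reuse it until it is too old."""

    def __init__(
        self,
        render: Callable[[str], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._render = render      # slow function: URL -> rendered HTML
        self._ttl = ttl_seconds    # how long a cached page stays valid
        self._clock = clock        # injectable so the cache can be tested
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        """Return cached HTML if still fresh, otherwise render and store it."""
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]        # fast path: no rendering needed
        html = self._render(url)   # slow path: run the renderer
        self._entries[url] = (now, html)
        return html
```

With a cache like this, only the first request for a page pays the rendering cost; later bot requests within the validity period are answered from memory.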

Here are the statistics on how long Google's bots take to download a DOI landing page from us: https://www.google.com/webmasters/tools/crawl-stats?siteUrl=https%3A%2F%2Fdoi.esrf.fr%2F&utm_source=search_console&utm_campaign=left-nav-legacy-tool&hl=fr&authuser=1

To do:

  • choose a dynamic renderer (in agreement with TID)
  • deploy it locally (TID)
  • fill its cache with all the current DOI landing pages
  • write a script which periodically refills the cache to include the newly generated DOI landing pages
  • configure the DOI landing page service such that it uses the local dynamic renderer instead of the remote service.
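
The cache-filling steps in the list above could look roughly like the Python sketch below. It assumes that requesting a page through the renderer is enough to make the renderer cache it, which depends on the tool that is eventually chosen. `RENDERER_URL`, `fetch_via_renderer` and `warm_cache` are hypothetical names, and where the list of DOI landing page URLs comes from is left open.

```python
import urllib.request
from typing import Callable, Iterable, List, Set

# Hypothetical address of the locally deployed renderer; not a real endpoint.
RENDERER_URL = "http://renderer.example/render"


def fetch_via_renderer(page_url: str) -> None:
    """Request a page through the renderer so that it renders and caches it."""
    with urllib.request.urlopen(f"{RENDERER_URL}/{page_url}", timeout=60) as response:
        response.read()


def warm_cache(
    page_urls: Iterable[str],
    seen: Set[str],
    fetch: Callable[[str], None] = fetch_via_renderer,
) -> List[str]:
    """Fetch every page not warmed yet and return the newly warmed URLs.

    `seen` holds the URLs handled by previous runs and is updated in place,
    so a periodic run only touches newly generated landing pages.
    """
    warmed: List[str] = []
    for url in page_urls:
        if url in seen:
            continue
        try:
            fetch(url)
        except OSError:
            # Network or HTTP error: leave the URL out so the next run retries.
            continue
        seen.add(url)
        warmed.append(url)
    return warmed
```

The first run, with an empty `seen` set, fills the cache with all current landing pages. Later runs (for example from a scheduled job) only request the pages added since, provided `seen` is persisted between runs.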
Edited Nov 11, 2020 by Maxime Chaillet
Reference: icat/doi-landing-page#83