Scope: This article describes how retracted PubMed articles are removed from the IEDB website. Typically about two articles are retracted and removed from the IEDB site each year.
Procedures
Retracted PubMed articles are normally identified when a Curation Administrator runs the “Check IEDB for inconsistencies with Pubmed” job from “Execute Jobs” within the Curation Application. When this job completes, it writes lines like the following to the curation.log file on the server:
curation.log:
The following articles have been retracted by Pubmed and reside in production:
21859462
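As a rough illustration of how the retracted IDs could be pulled out of the log, here is a minimal Python sketch. It assumes the header line appears exactly as in the excerpt above and that each PubMed ID follows on its own line as a bare number. The real curation.log may carry timestamps or other prefixes, in which case the matching would need adjusting. The function name is hypothetical and is not part of the Curation Application.

```python
import re

# Header text copied from the curation.log excerpt above.
HEADER = "The following articles have been retracted by Pubmed and reside in production:"


def retracted_pubmed_ids(log_text):
    """Collect the numeric IDs that directly follow each retraction header.

    Assumption: one bare PubMed ID per line; the list ends at the first
    line that is not purely numeric.
    """
    ids = []
    collecting = False
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        if HEADER in line:
            collecting = True
            continue
        if collecting:
            if re.fullmatch(r"\d+", line):
                ids.append(line)
            else:
                collecting = False
    return ids


sample = (
    "The following articles have been retracted by Pubmed and reside in production:\n"
    "21859462\n"
)
print(retracted_pubmed_ids(sample))  # ['21859462']
```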
A ticket is created to document the retraction (e.g., https://bits.lji.org/jira/browse/CUR-3999).
The Curation Administrator reports the retraction by email to the Curation Lead and the Document Administrator.
A patch is then written (from a “retracted pubmed” patch template) which does the following:
• Copies basic assay information to the deleted_assay table
• Sets curation_status='Rejected' for this reference in the reference table (the nightly promotion job later changes curation_status to 'Discarded' and removes the reference's curated data from the newdb_production schema)
• Adds an entry to the action_log table so that the action appears under “History” in the Curation Application
This patch can be reviewed in the attachment SQL Script (Data Patch) - APPENDIX A.pdf (65.0 KB).
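The three patch steps listed above can be sketched as follows. This is not the actual patch template (that is in the appendix); it is a minimal Python example that runs against an in-memory SQLite database instead of the Oracle curation schema. The deleted_assay columns follow the table description in this article. The bcell source table, the action_log columns, and the starting status 'Curated' are assumptions made purely for illustration.

```python
import sqlite3


def demo_connection():
    """Build a tiny in-memory stand-in for the curation schema (layout partly assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE article (reference_id INTEGER, pubmed_id TEXT);
        CREATE TABLE "reference" (reference_id INTEGER PRIMARY KEY, curation_status TEXT);
        -- Hypothetical source table holding curated B cell assays.
        CREATE TABLE bcell (assay_id INTEGER PRIMARY KEY, reference_id INTEGER);
        CREATE TABLE deleted_assay (
            assay_id INTEGER NOT NULL, reference_id INTEGER NOT NULL,
            assay_type TEXT, reason TEXT, merged_id INTEGER,
            modified_date TEXT, created_date TEXT);
        -- Hypothetical column set; the real action_log layout is not shown here.
        CREATE TABLE action_log (reference_id INTEGER, action TEXT, created_date TEXT);
        INSERT INTO article VALUES (1022457, '21859462');
        INSERT INTO "reference" VALUES (1022457, 'Curated');
        INSERT INTO bcell VALUES (1864543, 1022457), (1865611, 1022457), (1865625, 1022457);
    """)
    return conn


def apply_retraction_patch(conn, pubmed_id):
    """Apply the three patch steps for one retracted PubMed ID in a single transaction."""
    row = conn.execute(
        "SELECT reference_id FROM article WHERE pubmed_id = ?", (pubmed_id,)
    ).fetchone()
    if row is None:
        raise ValueError("no reference found for PubMed ID " + pubmed_id)
    reference_id = row[0]
    with conn:  # commits on success, rolls back if any step fails
        # Step 1: copy basic assay information to deleted_assay.
        conn.execute(
            "INSERT INTO deleted_assay "
            "(assay_id, reference_id, assay_type, reason, created_date) "
            "SELECT assay_id, reference_id, 'bcell', 'retracted by PubMed', DATE('now') "
            "FROM bcell WHERE reference_id = ?",
            (reference_id,),
        )
        # Step 2: mark the reference as Rejected.
        conn.execute(
            'UPDATE "reference" SET curation_status = \'Rejected\' WHERE reference_id = ?',
            (reference_id,),
        )
        # Step 3: record the action so it can be shown in the history view.
        conn.execute(
            "INSERT INTO action_log (reference_id, action, created_date) "
            "VALUES (?, 'Rejected: retracted by PubMed', DATE('now'))",
            (reference_id,),
        )
    return reference_id
```

Running the three statements in one transaction means a failure in any step leaves the reference untouched, which seems a reasonable property for a data patch of this kind.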
SQL for this example:
SQL> desc deleted_assay;
Name            Null?      Type
ASSAY_ID        NOT NULL   NUMBER
REFERENCE_ID    NOT NULL   NUMBER
ASSAY_TYPE                 VARCHAR2(15)
REASON                     VARCHAR2(100)
MERGED_ID                  NUMBER
MODIFIED_DATE              DATE
CREATED_DATE               DATE

SQL> select reference_id from article where pubmed_id='21859462';
REFERENCE_ID
1022457
1 row selected.

SQL> select ASSAY_ID||'|'||REFERENCE_ID||'|'||ASSAY_TYPE||'|'||REASON from deleted_assay where reference_id=1022457;
ASSAY_ID||'|'||REFERENCE_ID||'|'||ASSAY_TYPE||'|'||REASON
1864543|1022457|bcell|retracted by PubMed
1865611|1022457|bcell|retracted by PubMed
1865625|1022457|bcell|retracted by PubMed
3 rows selected.
When the next weekly build runs, this reference no longer has data in the newdb_production schema, so it is no longer built onto the IEDB website. There is a lag of about one week between the article's removal from Curation and its removal from the IEDB website (once the new build is flipped live).
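To make the sequence concrete, here is a small Python sketch of what the nightly promotion step is described as doing, followed by the check a weekly build would effectively make. It uses in-memory SQLite with an attached database named newdb_production as a loose stand-in for the Oracle schema. The bcell table, the function names, and the second reference (1000001, status 'Curated') are hypothetical and exist only to show that other references are unaffected.

```python
import sqlite3


def build_demo():
    """Create a demo database with one Rejected reference and one unaffected reference."""
    conn = sqlite3.connect(":memory:")
    # An attached in-memory database lets tables be qualified as newdb_production.<table>.
    conn.execute("ATTACH DATABASE ':memory:' AS newdb_production")
    conn.executescript("""
        CREATE TABLE "reference" (reference_id INTEGER PRIMARY KEY, curation_status TEXT);
        CREATE TABLE newdb_production.bcell (assay_id INTEGER PRIMARY KEY, reference_id INTEGER);
        INSERT INTO "reference" VALUES (1022457, 'Rejected'), (1000001, 'Curated');
        INSERT INTO newdb_production.bcell VALUES
            (1864543, 1022457), (1865611, 1022457), (1865625, 1022457), (1, 1000001);
    """)
    return conn


def nightly_promotion(conn):
    """Discard Rejected references and remove their rows from the production copy."""
    with conn:  # one transaction for the whole promotion pass
        rejected = [
            r[0]
            for r in conn.execute(
                'SELECT reference_id FROM "reference" '
                "WHERE curation_status = 'Rejected' ORDER BY reference_id"
            )
        ]
        for reference_id in rejected:
            conn.execute(
                "DELETE FROM newdb_production.bcell WHERE reference_id = ?",
                (reference_id,),
            )
            conn.execute(
                'UPDATE "reference" SET curation_status = \'Discarded\' '
                "WHERE reference_id = ?",
                (reference_id,),
            )
    return rejected


def references_in_production(conn):
    """Return the references that still have data for a weekly build to pick up."""
    return [
        r[0]
        for r in conn.execute(
            "SELECT DISTINCT reference_id FROM newdb_production.bcell ORDER BY reference_id"
        )
    ]
```

After nightly_promotion runs on the demo data, references_in_production no longer returns 1022457, which corresponds to the reference dropping out of the next weekly build.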
If a user attempts to access this removed reference by the direct link (IEDB Reference 1022457 details), they get the following screen:
If a user attempts to access a removed assay by the direct link (IEDB Assay 1864543 details), they get the same screen.
