Are Altmetrics a Reliable Indicator of Research Quality?

At least in the biosciences, altmetrics turn out to be of fairly limited value...

Authors: Lutz Bornmann & Robin Haunschild

Compiled by: 阿金

Reviewed by: 貓鷹, 譚坤

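Policymakers have long valued the power of science, but recently they have begun asking scientists and research institutions for evidence of the quality of their research, and this has stirred up considerable controversy. Peer review has long been an effective way to verify the quality and impact of research papers, but it is time-consuming, labor-intensive, and cumbersome. Citation impact was therefore adopted as a way to judge research performance. Citation-based methods have limitations of their own, however. Should negative citations be counted? Do all citations carry equal weight? In addition, citations take time to accumulate, which makes the method unfair to early-career researchers and newly established research organizations.

Given all this, is there an effective alternative? Yes!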

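Alternative metrics (altmetrics) are regarded as an effective complement to traditional metrics. They cover indicators beyond those traditionally used in academia, such as social media, blogs, news coverage, and online reference managers. Many prominent publishers and journals, including Wiley, Nature, and F1000, now display a small altmetrics badge on their article pages. Researchers have also begun adding these indicators to their CVs and grant applications. Yet the meaning and value of altmetrics for impact assessment remain unclear. One might ask: does mentioning a paper on Twitter produce any real impact? And what should we do when a bogus study gets heavily retweeted? In fact, a number of studies of altmetrics have found that the correlation between citations and tweets is close to zero, while other studies suggest that papers bookmarked in reference managers such as Mendeley do reflect a degree of scientific impact. We therefore introduce two studies, both published as preprints on arXiv, that further explore the potential value of altmetrics.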

Metrics head to head

For the same paper, do two ways of assessing quality lead to the same conclusion or to very different ones?

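In the first study, the researchers compared traditional citation metrics and altmetrics (tweets and online reference manager bookmarks) with expert assessments of the same papers to see how strongly each is related to those assessments. The expert assessments came from F1000Prime, a platform that provides post-publication peer review and assigns ratings to papers. The analysis showed that tweets are more weakly related to expert assessments than traditional metrics are. Bookmark counts from online reference managers, by contrast, were quite consistent with traditional metrics.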
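To make "more weakly related" concrete, here is a minimal sketch that compares two metrics against peer ratings using a Spearman rank correlation. This is an illustration only: the study itself used principal component analysis, factor analysis, and regression, and the six papers, their ratings, and their counts below are hypothetical numbers invented for this example, not data from the study. The function names are likewise assumptions.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data for six papers (NOT taken from the study).
peer_score = [1, 1, 2, 2, 3, 3]      # assumed expert rating per paper
citations = [2, 5, 9, 14, 30, 41]    # assumed citation counts
tweets = [12, 0, 3, 40, 1, 7]        # assumed tweet counts

print("citations vs. peer score:", round(spearman(citations, peer_score), 2))
print("tweets vs. peer score:   ", round(spearman(tweets, peer_score), 2))
```

In this made-up data set the citation counts rise with the peer ratings while the tweet counts do not, so the first coefficient comes out much larger than the second. That is the kind of gap the study reports, although its actual analysis is more elaborate.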

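In the second study, the researchers examined altmetrics other than Twitter. The results confirmed the conclusion of the first study: the relationship between citations and expert assessments was about two to three times stronger than the relationship between altmetrics and expert assessments.

In sum, at least in the biosciences, altmetrics are of fairly limited value as an indicator of quality.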

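Is there still hope for altmetrics?

In recent years, science policy has tended to evaluate papers and research quality in a broader context, for example with respect to society as a whole or to non-specialist groups. As an inexpensive and convenient indicator of societal impact, altmetrics still have a role to play. We hope to receive more feedback on altmetrics from all sides, so that more effective indicators of the quality of research and papers can be found.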

Related paper (1)

[Title] Do bibliometrics and altmetrics correlate with the quality of papers? A large-scale empirical study based on F1000Prime, altmetrics, and citation data

[Authors] Lutz Bornmann, Robin Haunschild

[Published in] arXiv.org

[Date] January 18, 2018

[Link] https://arxiv.org/abs/1711.07291

[Identifier] arXiv:1711.07291

[Abstract] In this study, we address the question whether (and to what extent, respectively) altmetrics are related to the scientific quality of papers (as measured by peer assessments). Design: In the first step, we analyse the underlying dimensions of measurement for traditional metrics (citation counts) and altmetrics - by using principal component analysis (PCA) and factor analysis (FA). In the second step, we test the relationship between the dimensions and quality of papers (as measured by the post-publication peer-review system of F1000Prime assessments) - using regression analysis. Results: The results of the PCA and FA show that altmetrics operate along different dimensions, whereas Mendeley counts are related to citation counts, and tweets form a separate dimension. The results of the regression analysis indicate that citation-based metrics and readership counts are significantly more related to quality, than tweets. This result on the one hand questions the use of Twitter counts for research evaluation purposes and on the other hand indicates potential use of Mendeley reader counts. Originality: Only a few studies have previously investigated the relationship between altmetrics and assessments by peers. The relationship is important to study: if altmetrics data are used in research evaluation, they should be related to quality.

Related paper (2)

[Title] Normalization of zero-inflated data: An empirical analysis of a new indicator family and its use with altmetrics data

[Authors] Lutz Bornmann, Robin Haunschild

[Published in] arXiv.org

[Date] January 26, 2018

[Link] https://arxiv.org/abs/1712.02228

[Identifier] arXiv:1712.02228

[Abstract] Recently, two new indicators (Equalized Mean-based Normalized Proportion Cited, EMNPC; Mean-based Normalized Proportion Cited, MNPC) were proposed which are intended for sparse scientometrics data. The indicators compare the proportion of mentioned papers (e.g. on Facebook) of a unit (e.g., a researcher or institution) with the proportion of mentioned papers in the corresponding fields and publication years (the expected values). In this study, we propose a third indicator (Mantel-Haenszel quotient, MHq) belonging to the same indicator family. The MHq is based on the MH analysis - an established method in statistics for the comparison of proportions. We test (using citations and assessments by peers, i.e. F1000Prime recommendations) if the three indicators can distinguish between different quality levels as defined on the basis of the assessments by peers. Thus, we test their convergent validity. We find that the indicator MHq is able to distinguish between the quality levels in most cases while MNPC and EMNPC are not. Since the MHq is shown in this study to be a valid indicator, we apply it to six types of zero-inflated altmetrics data and test whether different altmetrics sources are related to quality. The results for the various altmetrics demonstrate that the relationship between altmetrics (Wikipedia, Facebook, blogs, and news data) and assessments by peers is not as strong as the relationship between citations and assessments by peers. Actually, the relationship between citations and peer assessments is about two to three times stronger than the association between altmetrics and assessments by peers.
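The abstract describes the MHq as based on Mantel-Haenszel analysis, a standard way to compare proportions across strata. As a rough illustration, the sketch below computes the textbook Mantel-Haenszel pooled odds ratio over 2x2 tables, one per field and publication-year stratum. It is not the paper's exact MHq definition, which may differ in detail. The function name, the table layout, and all counts are assumptions made up for this example.

```python
def mantel_haenszel_odds_ratio(strata):
    """Textbook Mantel-Haenszel pooled odds ratio over 2x2 tables.

    Each stratum is a tuple (a, b, c, d), where (assumed layout):
      a: the unit's papers with at least one mention
      b: the unit's papers with no mention
      c: reference-set papers with at least one mention
      d: reference-set papers with no mention
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("pooled odds ratio is undefined for these tables")
    return numerator / denominator


# Hypothetical counts for one unit in two field/year strata (NOT real data).
strata = [
    (4, 16, 50, 930),  # stratum 1: 4 of the unit's 20 papers were mentioned
    (1, 9, 20, 470),   # stratum 2: 1 of the unit's 10 papers was mentioned
]

print(mantel_haenszel_odds_ratio(strata))
```

A pooled value above 1 means the unit's papers are mentioned more often than the reference papers in the same strata, and a value below 1 means less often. Pooling within strata is what keeps fields and years with very different mention rates from distorting the comparison.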

https://blog.f1000.com/2018/01/11/evaluating-research-different-metrics-tell-us-different-things/