skillby u9401066
pubmed-multi-source-search
Cross-database search using multiple academic sources. Triggers: 跨資料庫, multi-source, Semantic Scholar, OpenAlex, CORE, Europe PMC, 綜合搜尋
Installs: 0
Used in: 1 repos
Updated: 1d ago
$
npx ai-builder add skill u9401066/pubmed-multi-source-searchInstalls to .claude/skills/pubmed-multi-source-search/
# 多來源綜合搜尋
## 描述
整合 PubMed、Europe PMC、CORE、Semantic Scholar、OpenAlex 等多個學術資料庫,進行全面的跨來源搜尋。
## 觸發條件
- 「搜尋所有來源」
- 「跨資料庫搜尋」
- 「找更多來源」
- 提到 Semantic Scholar、OpenAlex、CORE
- 需要開放取用論文
---
## 資料庫特色比較
| 資料庫 | 收錄量 | 特色 | 最適合 |
|--------|--------|------|--------|
| **PubMed** | 35M+ | 生物醫學權威 | 臨床/基礎醫學 |
| **Europe PMC** | 33M+ | 含預印本、全文 | 歐洲研究、預印本 |
| **CORE** | 200M+ | 最大 OA 庫 | 開放取用全文 |
| **Semantic Scholar** | 200M+ | AI 分析、引用圖譜 | 跨領域、影響力分析 |
| **OpenAlex** | 250M+ | 開放學術圖譜 | 大規模分析、趨勢研究 |
---
## 各資料庫工具
### PubMed(核心)
```python
search_literature(query="remimazolam sedation", limit=30)
generate_search_queries(topic="remimazolam") # MeSH 擴展
```
### Europe PMC
```python
# 搜尋(含預印本)
search_europe_pmc(query="remimazolam", limit=30)
# 取得全文
get_europe_pmc_fulltext(pmcid="PMC6939411")
# 引用資料
get_europe_pmc_citations(pmid="30217674")
```
### CORE
```python
# 搜尋
search_core(query="machine learning radiology", limit=30)
# 全文搜尋
search_core_fulltext(query="adverse events", limit=20)
# 取得全文
get_core_fulltext(core_id="12345678")
# 用標題找
find_in_core(title="Remimazolam versus midazolam...")
```
### Semantic Scholar
```python
# 搜尋
search_semantic_scholar(query="deep learning medical imaging", limit=30)
# 論文詳情(含引用分析)
get_semantic_scholar_paper(paper_id="...")
```
### OpenAlex
```python
# 搜尋
search_openalex(query="CRISPR gene editing", limit=30)
# 作品詳情
get_openalex_work(work_id="W2741809807")
# 作者資訊
search_openalex_authors(query="Jennifer Doudna")
```
---
## 跨來源搜尋策略
### 策略 1:互補搜尋
不同資料庫強項不同,互相補充:
```python
# PubMed:生物醫學文獻(權威性)
pm_results = search_literature(query="COVID-19 vaccine efficacy", limit=50)
# Europe PMC:預印本(最新研究)
epmc_results = search_europe_pmc(query="COVID-19 vaccine efficacy", source="preprint", limit=30)
# Semantic Scholar:跨領域(含 CS、工程)
ss_results = search_semantic_scholar(query="COVID-19 vaccine efficacy", limit=30)
# CORE:開放取用全文
core_results = search_core(query="COVID-19 vaccine efficacy", limit=30)
```
### 策略 2:全文優先
需要全文時的搜尋順序:
```python
# Step 1: PubMed 搜尋建立基礎列表
results = search_literature(query="...", limit=50)
# Step 2: 分析全文可用性
access = analyze_fulltext_access(pmids="last")
# Step 3: Europe PMC 補充全文
for pmid in access["subscription_required_pmids"]:
epmc = search_europe_pmc(query=f"EXT_ID:{pmid}")
# Step 4: CORE 最後嘗試
for pmid in still_missing:
details = fetch_article_details(pmids=pmid)
core = find_in_core(title=details["articles"][0]["title"])
```
### 策略 3:影響力分析
結合 Semantic Scholar 的引用分析:
```python
# PubMed 搜尋
pm = search_literature(query="...", limit=30)
# 取得 Semantic Scholar 的影響力指標
for article in pm["articles"]:
ss = search_semantic_scholar(query=article["title"], limit=1)
if ss["papers"]:
details = get_semantic_scholar_paper(paper_id=ss["papers"][0]["paperId"])
print(f"Citations: {details['citationCount']}")
print(f"Influential Citations: {details['influentialCitationCount']}")
```
---
## 完整跨來源工作流程
### 情境:全面搜尋某主題
```python
# Step 1: PubMed 核心搜尋
pm_results = search_literature(
query="machine learning drug discovery",
limit=50
)
# Step 2: 並行搜尋其他來源
epmc_results = search_europe_pmc(query="machine learning drug discovery", limit=30)
core_results = search_core(query="machine learning drug discovery", limit=30)
ss_results = search_semantic_scholar(query="machine learning drug discovery", limit=30)
oa_results = search_openalex(query="machine learning drug discovery", limit=30)
# Step 3: 整合結果
all_titles = set()
unique_papers = []
for source, results in [
("PubMed", pm_results),
("Europe PMC", epmc_results),
("CORE", core_results),
("Semantic Scholar", ss_results),
("OpenAlex", oa_results)
]:
for paper in results["articles"]:
title_key = paper["title"].lower()[:50] # 標題前50字作為key
if title_key not in all_titles:
all_titles.add(title_key)
paper["source"] = source
unique_papers.append(paper)
print(f"Total unique papers: {len(unique_papers)}")
```
---
## 各來源的獨特功能
### Europe PMC 獨有
```python
# 預印本搜尋
search_europe_pmc(query="...", source="preprint")
# 註解/評論
search_europe_pmc(query="...", has_annotations=True)
# 資料補充材料
get_europe_pmc_supplementary(pmcid="PMC...")
```
### CORE 獨有
```python
# 全文內容搜尋
search_core_fulltext(query="specific methodology term")
# 大規模開放取用
search_core(query="...", open_access=True)
```
### Semantic Scholar 獨有
```python
# 影響力引用(不只是數量,而是「有影響力的」引用)
paper = get_semantic_scholar_paper(paper_id="...")
print(paper["influentialCitationCount"])
# AI 生成摘要
print(paper["tldr"]) # Too Long; Didn't Read
# 引用意圖分析
for citation in paper["citations"]:
print(citation["intent"]) # methodology, background, result
```
### OpenAlex 獨有
```python
# 作者分析
author = search_openalex_authors(query="Jennifer Doudna")[0]
print(f"h-index: {author['hIndex']}")
print(f"Works count: {author['worksCount']}")
# 機構分析
works = search_openalex(query="...", institution="Harvard")
# 開放取用狀態詳情
work = get_openalex_work(work_id="...")
print(work["open_access"]["oa_status"]) # gold, green, hybrid, closed
```
---
## 結果整合技巧
### 去重方法
```python
def deduplicate_papers(all_results):
"""基於標題相似度去重"""
seen_titles = {}
unique = []
for paper in all_results:
# 正規化標題
title_key = paper["title"].lower()
title_key = re.sub(r'[^\w\s]', '', title_key)[:100]
if title_key not in seen_titles:
seen_titles[title_key] = paper
unique.append(paper)
else:
# 合併來源資訊
seen_titles[title_key]["sources"].append(paper["source"])
return unique
```
### 排序優先級
```python
def score_paper(paper):
"""計算論文優先分數"""
score = 0
# 多來源找到 = 更重要
score += len(paper.get("sources", [])) * 10
# 有全文 = 更有用
if paper.get("fulltext_available"):
score += 20
# 高引用 = 更有影響力
score += min(paper.get("citation_count", 0) / 10, 50)
# 最近發表 = 更新穎
if paper.get("year", 0) >= 2023:
score += 15
return score
# 排序
papers.sort(key=score_paper, reverse=True)
```
---
## 使用場景建議
| 需求 | 推薦來源組合 |
|------|-------------|
| 臨床研究 | PubMed + Europe PMC |
| 跨領域研究 | Semantic Scholar + OpenAlex |
| 開放取用優先 | CORE + Europe PMC |
| 最新研究 | Europe PMC (preprint) |
| 影響力分析 | Semantic Scholar + OpenAlex |
| 全面覆蓋 | 全部五個來源 |
---
## 小技巧
### 1. 並行搜尋
```python
# 同時搜尋多個來源(並行呼叫)
# 大幅減少等待時間
```
### 2. 先 PubMed 後擴展
```python
# PubMed 結果最權威
# 其他來源用於補充和取得全文
```
### 3. 標題匹配找全文
```python
# PubMed 找到但無全文 → 用標題在 CORE 搜尋
find_in_core(title="exact paper title")
```
### 4. 引用分析用 Semantic Scholar
```python
# 它的 influentialCitationCount 比單純計數更有意義
```Quick Install
$
npx ai-builder add skill u9401066/pubmed-multi-source-searchDetails
- Type
- skill
- Author
- u9401066
- Slug
- u9401066/pubmed-multi-source-search
- Created
- 4d ago