In this tutorial, we walk through the construction of an advanced PaperQA2 AI agent powered by Google's Gemini model and designed for scientific literature analysis. We set up the environment in a Google Colab notebook, configure the Gemini API, and integrate it with PaperQA2 to process and query multiple research papers. By the end of the setup, we have an intelligent agent that can answer complex questions, perform multi-question analysis, and conduct comparative research across papers, while grounding its answers in evidence from the source documents. The complete code is available here.
!pip install "paper-qa>=5" google-generativeai requests pypdf2 -q
import os
import asyncio
import tempfile
import requests
from pathlib import Path
from paperqa import Settings, ask, agent_query
from paperqa.settings import AgentSettings
import google.generativeai as genai
GEMINI_API_KEY = "Use Your Own API Key Here"
os.environ["GEMINI_API_KEY"] = GEMINI_API_KEY
genai.configure(api_key=GEMINI_API_KEY)
print("✅ Gemini API key configured successfully!")
We first install the required libraries, including PaperQA2 and Google's Generative AI SDK, and then import the modules our project needs. We set the Gemini API key as an environment variable and configure the SDK so that all subsequent calls are authenticated.
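Because every later call depends on a valid key, it helps to fail fast if the placeholder string is still in place. The following sanity-check helper is our own addition (the function name and error message are not part of PaperQA2 or the Gemini SDK):

```python
import os

def require_gemini_key() -> str:
    """Return the configured Gemini key, failing fast if it is missing or a placeholder."""
    key = os.environ.get("GEMINI_API_KEY", "")
    if not key or key.startswith("Use Your Own"):
        raise RuntimeError("GEMINI_API_KEY is not set - paste your real key before continuing")
    return key
```

Calling `require_gemini_key()` once before any queries turns a cryptic downstream authentication error into an immediate, readable one.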
def download_sample_papers():
    """Download sample AI/ML research papers for demonstration"""
    papers = {
        "attention_is_all_you_need.pdf": "",  # URL truncated in the original source
        "bert_paper.pdf": "",  # URL truncated in the original source
        "gpt3_paper.pdf": "",  # URL truncated in the original source
    }
    papers_dir = Path("sample_papers")
    papers_dir.mkdir(exist_ok=True)
    print("📥 Downloading sample research papers...")
    for filename, url in papers.items():
        filepath = papers_dir / filename
        if not filepath.exists():
            try:
                response = requests.get(url, stream=True, timeout=30)
                response.raise_for_status()
                with open(filepath, 'wb') as f:
                    for chunk in response.iter_content(chunk_size=8192):
                        f.write(chunk)
                print(f"✅ Downloaded: {filename}")
            except Exception as e:
                print(f"❌ Failed to download {filename}: {e}")
        else:
            print(f"📄 Already exists: {filename}")
    return str(papers_dir)

papers_directory = download_sample_papers()
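Downloads can fail silently, for example when a server returns an HTML error page that gets saved under a `.pdf` name. One way to guard against this, sketched here as our own hypothetical helper rather than a PaperQA2 feature, is to check each file for the `%PDF` magic bytes before handing the directory to the agent:

```python
from pathlib import Path

def looks_like_pdf(path: Path) -> bool:
    """Cheap integrity check: a real PDF file starts with the '%PDF' magic bytes."""
    try:
        with open(path, "rb") as f:
            return f.read(4) == b"%PDF"
    except OSError:
        return False

def valid_pdfs(papers_dir: str) -> list[Path]:
    """Return only the files in papers_dir that pass the magic-byte check."""
    return sorted(p for p in Path(papers_dir).glob("*.pdf") if looks_like_pdf(p))
```

Running `valid_pdfs(papers_directory)` after the download step confirms the corpus is usable before any tokens are spent on indexing.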
def create_gemini_settings(paper_dir: str, temperature: float = 0.1):
    """Create optimized settings for PaperQA2 with Gemini models"""
    return Settings(
        llm="gemini/gemini-1.5-flash",
        summary_llm="gemini/gemini-1.5-flash",
        agent=AgentSettings(
            agent_llm="gemini/gemini-1.5-flash",
            search_count=6,
            timeout=300.0,
        ),
        embedding="gemini/text-embedding-004",
        temperature=temperature,
        paper_directory=paper_dir,
        answer=dict(
            evidence_k=8,
            answer_max_sources=4,
            evidence_summary_length="about 80 words",
            answer_length="about 150 words, but can be longer",
            max_concurrent_requests=2,
        ),
        parsing=dict(
            chunk_size=4000,
            overlap=200,
        ),
        verbosity=1,
    )
We download a set of well-known AI/ML research papers for our analysis and store them in a dedicated folder. We then create optimized PaperQA2 settings for Gemini, tuning parameters such as search count, evidence retrieval, and parsing for efficient, accurate literature processing.
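To get a feel for what `chunk_size=4000` and `overlap=200` mean in practice, here is a rough back-of-the-envelope calculation. This is our own illustration; PaperQA2's actual chunker may split on token or section boundaries rather than raw character counts:

```python
import math

def estimate_chunks(doc_chars: int, chunk_size: int = 4000, overlap: int = 200) -> int:
    """Rough count of overlapping chunks for a document of doc_chars characters.

    The first chunk covers chunk_size characters; each subsequent chunk
    advances by (chunk_size - overlap) characters.
    """
    if doc_chars <= chunk_size:
        return 1
    stride = chunk_size - overlap
    return 1 + math.ceil((doc_chars - chunk_size) / stride)

# A ~40,000-character paper yields roughly 11 chunks with these settings.
print(estimate_chunks(40_000))
```

Shrinking `chunk_size` raises the chunk count, which means more (but more focused) candidates for evidence retrieval per paper.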
class PaperQAAgent:
    """Advanced AI Agent for scientific literature analysis using PaperQA2"""

    def __init__(self, papers_directory: str, temperature: float = 0.1):
        self.settings = create_gemini_settings(papers_directory, temperature)
        self.papers_dir = papers_directory
        print(f"🤖 PaperQA Agent initialized with papers from: {papers_directory}")

    async def ask_question(self, question: str, use_agent: bool = True):
        """Ask a question about the research papers"""
        print(f"\n❓ Question: {question}")
        print("🔍 Searching through research papers...")
        try:
            if use_agent:
                response = await agent_query(query=question, settings=self.settings)
            else:
                response = ask(question, settings=self.settings)
            return response
        except Exception as e:
            print(f"❌ Error processing question: {e}")
            return None

    def display_answer(self, response):
        """Display the answer with formatting"""
        if response is None:
            print("❌ No response received")
            return
        print("\n" + "="*60)
        print("📋 ANSWER:")
        print("="*60)
        answer_text = getattr(response, 'answer', str(response))
        print(f"\n{answer_text}")
        contexts = getattr(response, 'contexts', getattr(response, 'context', []))
        if contexts:
            print("\n" + "-"*40)
            print("📚 SOURCES USED:")
            print("-"*40)
            for i, context in enumerate(contexts[:3], 1):
                context_name = getattr(context, 'name', getattr(context, 'doc', f'Source {i}'))
                context_text = getattr(context, 'text', getattr(context, 'content', str(context)))
                print(f"\n{i}. {context_name}")
                print(f"   Text preview: {context_text[:150]}...")

    async def multi_question_analysis(self, questions: list):
        """Analyze multiple questions in sequence"""
        results = {}
        for i, question in enumerate(questions, 1):
            print(f"\n🔄 Processing question {i}/{len(questions)}")
            response = await self.ask_question(question)
            results[question] = response
            if response:
                print(f"✅ Completed: {question[:50]}...")
            else:
                print(f"❌ Failed: {question[:50]}...")
        return results

    async def comparative_analysis(self, topic: str):
        """Perform comparative analysis across papers"""
        questions = [
            f"What are the key innovations in {topic}?",
            f"What are the limitations of current {topic} approaches?",
            f"What future research directions are suggested for {topic}?",
        ]
        print(f"\n🔬 Starting comparative analysis on: {topic}")
        return await self.multi_question_analysis(questions)
async def basic_demo():
    """Demonstrate basic PaperQA functionality"""
    agent = PaperQAAgent(papers_directory)
    question = "What is the transformer architecture and why is it important?"
    response = await agent.ask_question(question)
    agent.display_answer(response)

print("🚀 Running basic demonstration...")
await basic_demo()
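The top-level `await basic_demo()` works because Colab/Jupyter already runs an event loop for us. In a plain Python script there is no running loop, so the coroutine must be driven with `asyncio.run`. A minimal sketch, using a stand-in coroutine in place of the notebook's `basic_demo`:

```python
import asyncio

async def demo():
    # Stand-in for basic_demo(); a real script would create the agent
    # and await agent.ask_question(...) here.
    await asyncio.sleep(0)  # yield control once, like any real awaited call
    return "demo finished"

if __name__ == "__main__":
    # asyncio.run creates the event loop, runs the coroutine, and closes the loop.
    print(asyncio.run(demo()))
```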
async def advanced_demo():
    """Demonstrate advanced multi-question analysis"""
    agent = PaperQAAgent(papers_directory, temperature=0.2)
    questions = [
        "How do attention mechanisms work in transformers?",
        "What are the computational challenges of large language models?",
        "How has pre-training evolved in natural language processing?"
    ]
    print("🧠 Running advanced multi-question analysis...")
    results = await agent.multi_question_analysis(questions)
    for question, response in results.items():
        print(f"\n{'='*80}")
        print(f"Q: {question}")
        print('='*80)
        if response:
            answer_text = getattr(response, 'answer', str(response))
            display_text = answer_text[:300] + "..." if len(answer_text) > 300 else answer_text
            print(display_text)
        else:
            print("❌ No answer available")

print("\n🚀 Running advanced demonstration...")
await advanced_demo()
async def research_comparison_demo():
    """Demonstrate comparative research analysis"""
    agent = PaperQAAgent(papers_directory)
    results = await agent.comparative_analysis("attention mechanisms in neural networks")
    print("\n" + "="*80)
    print("📊 COMPARATIVE ANALYSIS RESULTS")
    print("="*80)
    for question, response in results.items():
        print(f"\n🔍 {question}")
        print("-" * 50)
        if response:
            answer_text = getattr(response, 'answer', str(response))
            print(answer_text)
        else:
            print("❌ Analysis unavailable")
        print()

print("🚀 Running comparative research analysis...")
await research_comparison_demo()
We define a PaperQAAgent class that uses our Gemini-tuned PaperQA2 settings to search the papers, answer questions, and cite its sources, with clean display helpers. We then run basic, advanced multi-question, and comparative demonstrations so that we can query the literature end to end and summarize the findings effectively.
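`multi_question_analysis` deliberately runs questions one at a time, which plays nicely with free-tier rate limits. If your quota allows it, the same fan-out could run concurrently with `asyncio.gather`. A self-contained sketch, with a fake query coroutine standing in for `agent.ask_question`:

```python
import asyncio

async def fake_ask(question: str) -> str:
    # Stand-in for agent.ask_question; a real call would hit the Gemini API.
    await asyncio.sleep(0)
    return f"answer to: {question}"

async def ask_concurrently(questions: list[str]) -> dict[str, str]:
    """Run all questions at once and map each question to its answer."""
    answers = await asyncio.gather(*(fake_ask(q) for q in questions))
    return dict(zip(questions, answers))

results = asyncio.run(ask_concurrently(["What is attention?", "What is BERT?"]))
```

If you try this with real queries, keep `max_concurrent_requests` in the `Settings` low enough for your tier, or the extra parallelism will simply turn into rate-limit errors.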
def create_interactive_agent():
    """Create an interactive agent for custom queries"""
    agent = PaperQAAgent(papers_directory)

    async def query(question: str, show_sources: bool = True):
        """Interactive query function"""
        response = await agent.ask_question(question)
        if response:
            answer_text = getattr(response, 'answer', str(response))
            print(f"\n🤖 Answer:\n{answer_text}")
            if show_sources:
                contexts = getattr(response, 'contexts', getattr(response, 'context', []))
                if contexts:
                    print(f"\n📚 Based on {len(contexts)} sources:")
                    for i, ctx in enumerate(contexts[:3], 1):
                        ctx_name = getattr(ctx, 'name', getattr(ctx, 'doc', f'Source {i}'))
                        print(f"   {i}. {ctx_name}")
        else:
            print("❌ Sorry, I couldn't find an answer to that question.")
        return response

    return query

interactive_query = create_interactive_agent()
print("\n🎯 Interactive agent ready! You can now ask custom questions:")
print("Example: await interactive_query('How do transformers handle long sequences?')")
def print_usage_tips():
    """Print helpful usage tips"""
    tips = """
    🎯 USAGE TIPS FOR PAPERQA2 WITH GEMINI:

    1. 📝 Question Formulation:
       - Be specific about what you want to know
       - Ask about comparisons, mechanisms, or implications
       - Use domain-specific terminology

    2. 🔧 Model Configuration:
       - Gemini 1.5 Flash is free and reliable
       - Adjust temperature (0.0-1.0) for creativity vs precision
       - Use smaller chunk_size for better processing

    3. 📚 Document Management:
       - Add PDFs to the papers directory
       - Use meaningful filenames
       - Mix different types of papers for better coverage

    4. ⚡ Performance Optimization:
       - Limit concurrent requests for free tier
       - Use smaller evidence_k values for faster responses
       - Cache results by saving the agent state

    5. 🧠 Advanced Usage:
       - Chain multiple questions for deeper analysis
       - Use comparative analysis for research reviews
       - Combine with other tools for complete workflows

    📖 Example Questions to Try:
    - "Compare the attention mechanisms in BERT vs GPT models"
    - "What are the computational bottlenecks in transformer training?"
    - "How has pre-training evolved from word2vec to modern LLMs?"
    - "What are the key innovations that made transformers successful?"
    """
    print(tips)

print_usage_tips()
def save_analysis_results(results: dict, filename: str = "paperqa_analysis.txt"):
    """Save analysis results to a file"""
    with open(filename, 'w', encoding='utf-8') as f:
        f.write("PaperQA2 Analysis Results\n")
        f.write("=" * 50 + "\n\n")
        for question, response in results.items():
            f.write(f"Question: {question}\n")
            f.write("-" * 30 + "\n")
            if response:
                answer_text = getattr(response, 'answer', str(response))
                f.write(f"Answer: {answer_text}\n")
                contexts = getattr(response, 'contexts', getattr(response, 'context', []))
                if contexts:
                    f.write(f"\nSources ({len(contexts)}):\n")
                    for i, ctx in enumerate(contexts, 1):
                        ctx_name = getattr(ctx, 'name', getattr(ctx, 'doc', f'Source {i}'))
                        f.write(f"   {i}. {ctx_name}\n")
            else:
                f.write("Answer: No response available\n")
            f.write("\n" + "="*50 + "\n\n")
    print(f"💾 Results saved to: {filename}")
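Plain text is fine for reading, but JSON is easier to post-process. Here is an alternative saver, our own variation on the function above, assuming the same `results` dict mapping questions to response objects (or `None` on failure):

```python
import json

def save_results_json(results: dict, filename: str = "paperqa_analysis.json") -> None:
    """Serialize question/answer pairs to JSON; response objects are flattened to text."""
    payload = {
        question: (getattr(response, "answer", str(response)) if response else None)
        for question, response in results.items()
    }
    with open(filename, "w", encoding="utf-8") as f:
        json.dump(payload, f, indent=2, ensure_ascii=False)
```

The resulting file can be reloaded with `json.load` for scripted comparisons across runs.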
print("✅ Tutorial complete! You now have a fully functional PaperQA2 AI Agent with Gemini.")
We create an interactive query assistant that lets us ask custom questions on demand and optionally view the cited sources. We also print practical usage tips and add a save helper that writes each question, answer, and source name to a results file, wrapping up the tutorial as a ready-to-use workflow.
In short, we have successfully created a fully functional AI research assistant that takes advantage of Gemini’s speed and versatility through PaperQA2’s powerful paper processing capabilities. Now we can explore scientific papers interactively, run targeted queries, and even conduct in-depth comparative analysis with minimal effort. This setup enhances our ability to digest complex research and simplifies the entire literature review process, allowing us to focus on insight rather than manual search.
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for social good. His most recent endeavor is the launch of Marktechpost, an artificial intelligence media platform known for its in-depth coverage of machine learning and deep learning news that is both technically sound and accessible to a wide audience. The platform draws over 2 million views per month, demonstrating its popularity among readers.