
Step-by-step guide on how to build an AI news summarizer using Streamlit, Groq, and Tavily


Introduction

In this tutorial, we will build an AI-powered news agent that can search the web for the latest news on a given topic and summarize the results. The agent follows a structured workflow:

  1. Browsing: generate relevant search queries and collect information from the web.
  2. Writing: extract and compile news summaries from the collected information.
  3. Reflection: critique each summary, checking factual correctness and suggesting improvements.
  4. Refinement: improve the summaries based on the critiques.
  5. Heading generation: generate an appropriate headline for each news summary.

To improve usability, we will also create a simple GUI using Streamlit. As in previous tutorials, we will use Groq for LLM-based processing and Tavily for web browsing. You can generate free API keys from their respective websites.

Setting up the environment

We first set environment variables, install the required libraries, and import the necessary dependencies:

Install the required libraries

pip install langgraph==0.2.53 langgraph-checkpoint==2.0.6 langgraph-sdk==0.1.36 langchain-groq langchain-community langgraph-checkpoint-sqlite==2.0.1 tavily-python streamlit

Import the library and set the API keys

import os
import sqlite3
from langgraph.graph import StateGraph
from langchain_core.messages import SystemMessage, HumanMessage
from langchain_groq import ChatGroq
from tavily import TavilyClient
from langgraph.checkpoint.sqlite import SqliteSaver
from typing import TypedDict, List
from pydantic import BaseModel
import streamlit as st

# Set API Keys
os.environ['TAVILY_API_KEY'] = "your_tavily_key"
os.environ['GROQ_API_KEY'] = "your_groq_key"

# Initialize Database for Checkpointing
sqlite_conn = sqlite3.connect("checkpoints.sqlite", check_same_thread=False)
memory = SqliteSaver(sqlite_conn)

# Initialize Model and Tavily Client
model = ChatGroq(model="Llama-3.1-8b-instant")
tavily = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])

Define the agent state

The agent maintains the following state information throughout its workflow:

  1. topic: the topic to fetch the latest news about
  2. drafts: the first drafts of the news summaries
  3. content: research content extracted from Tavily's search results
  4. critiques: critiques and suggestions generated for each draft in the reflection stage
  5. refined_summaries: updated news summaries incorporating the critiques
  6. headings: headlines generated for each news summary

class AgentState(TypedDict):
    topic: str
    drafts: List[str]
    content: List[str]
    critiques: List[str]
    refined_summaries: List[str]
    headings: List[str]
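
Since AgentState is a TypedDict, it is a plain dict at run time, and each node returns only the keys it updates. A minimal sketch of the initial state (the topic string here is illustrative):

```python
from typing import TypedDict, List

class AgentState(TypedDict):
    topic: str
    drafts: List[str]
    content: List[str]
    critiques: List[str]
    refined_summaries: List[str]
    headings: List[str]

# Only the topic is known up front; each node of the graph
# fills in the remaining fields as the workflow runs.
initial_state: AgentState = {
    "topic": "AI chip startups",
    "drafts": [],
    "content": [],
    "critiques": [],
    "refined_summaries": [],
    "headings": [],
}
```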

Define the prompts

We define system prompts for each stage of the agent workflow:

BROWSING_PROMPT = """You are an AI news researcher tasked with finding the latest news articles on given topics. Generate up to 3 relevant search queries."""

WRITER_PROMPT = """You are an AI news summarizer. Write a detailed summary (1 to 2 paragraphs) based on the given content, ensuring factual correctness, clarity, and coherence."""

CRITIQUE_PROMPT = """You are a teacher reviewing draft summaries against the source content. Ensure factual correctness, identify missing or incorrect details, and suggest improvements.
----------
Content: {content}
----------"""

REFINE_PROMPT = """You are an AI news editor. Given a summary and critique, refine the summary accordingly.
-----------
Summary: {summary}"""

HEADING_GENERATION_PROMPT = """You are an AI news summarizer. Generate a short, descriptive headline for each news summary."""
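
Note that CRITIQUE_PROMPT and REFINE_PROMPT contain str.format placeholders that the nodes fill in at run time. A quick sketch of how a prompt is filled (the article text is made up for illustration):

```python
CRITIQUE_PROMPT = """You are a teacher reviewing draft summaries against the source content. Ensure factual correctness, identify missing or incorrect details, and suggest improvements.
----------
Content: {content}
----------"""

# The {content} placeholder is replaced with the scraped articles
# before the prompt is sent to the model as a system message.
filled = CRITIQUE_PROMPT.format(content="Example article text.")
print(filled)
```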

Structuring queries and news

We use Pydantic to define the structure of the queries and the news articles. Pydantic lets us constrain the structure of the LLM's output. This matters because we want the queries to be a list of strings, and because the content extracted from the web will contain multiple news articles, the news output should also be a list of strings.

from pydantic import BaseModel

class Queries(BaseModel):
    queries: List[str]

class News(BaseModel):
    news: List[str]
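
with_structured_output asks the LLM to emit output matching this schema and parses it into a model instance. Constructing one by hand shows the shape the LLM must produce (the query strings are illustrative):

```python
from typing import List
from pydantic import BaseModel

class Queries(BaseModel):
    queries: List[str]

# A valid structured reply is just a list of query strings;
# Pydantic validates the types when the instance is created.
q = Queries(queries=["latest AI chip news", "AI chip funding rounds"])
print(q.queries)
```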

Implement the AI agent

1. Browsing node

This node generates search queries and retrieves relevant content from the web.

def browsing_node(state: AgentState):
    queries = model.with_structured_output(Queries).invoke([
        SystemMessage(content=BROWSING_PROMPT),
        HumanMessage(content=state['topic'])
    ])
    content = state.get('content', [])
    for q in queries.queries:
        response = tavily.search(query=q, max_results=2)
        for r in response['results']:
            content.append(r['content'])
    return {"content": content}

2. Writing node

This node extracts news summaries from the retrieved content.

def writing_node(state: AgentState):
    content = "\n\n".join(state['content'])
    news = model.with_structured_output(News).invoke([
        SystemMessage(content=WRITER_PROMPT),
        HumanMessage(content=content)
    ])
    return {"drafts": news.news}

3. Reflection node

This node critiques the generated summaries against the source content.

def reflection_node(state: AgentState):
    content = "\n\n".join(state['content'])
    critiques = []
    for draft in state['drafts']:
        response = model.invoke([
            SystemMessage(content=CRITIQUE_PROMPT.format(content=content)),
            HumanMessage(content="draft: " + draft)
        ])
        critiques.append(response.content)
    return {"critiques": critiques}

4. Refine node

This node improves the summaries based on the critiques.

def refine_node(state: AgentState):
    refined_summaries = []
    for summary, critique in zip(state['drafts'], state['critiques']):
        response = model.invoke([
            SystemMessage(content=REFINE_PROMPT.format(summary=summary)),
            HumanMessage(content="Critique: " + critique)
        ])
        refined_summaries.append(response.content)
    return {"refined_summaries": refined_summaries}

5. Heading generation node

This node generates a short headline for each news summary.

def heading_node(state: AgentState):
    headings = []
    for summary in state['refined_summaries']:
        response = model.invoke([
            SystemMessage(content=HEADING_GENERATION_PROMPT),
            HumanMessage(content=summary)
        ])
        headings.append(response.content)
    return {"headings": headings}
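
Before wiring these nodes into LangGraph, it can help to see that the workflow is just a linear pipeline in which each node reads some state keys and returns a partial update that gets merged back into the state. A stdlib-only sketch with hypothetical stub functions standing in for the LLM-backed nodes:

```python
# Run nodes in sequence, merging each node's partial update into
# the state dict -- mirroring how LangGraph applies node outputs.
def run_pipeline(state, nodes):
    for node in nodes:
        state = {**state, **node(state)}
    return state

# Hypothetical stubs for illustration; the real nodes call the LLM.
def browse(state):  return {"content": ["article text"]}
def write(state):   return {"drafts": ["draft summary"]}
def reflect(state): return {"critiques": ["add a date"]}
def refine(state):  return {"refined_summaries": ["refined summary"]}
def head(state):    return {"headings": ["Headline"]}

final = run_pipeline({"topic": "AI"}, [browse, write, reflect, refine, head])
print(final["headings"])  # ['Headline']
```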

Build the UI with Streamlit

# Define Streamlit app
st.title("News Summarization Chatbot")

# Initialize session state
if "messages" not in st.session_state:
    st.session_state["messages"] = []

# Display past messages
for message in st.session_state["messages"]:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

# Input field for user
user_input = st.chat_input("Ask about the latest news...")

thread = 1
if user_input:
    st.session_state["messages"].append({"role": "user", "content": user_input})
    with st.chat_message("assistant"):
        loading_text = st.empty()
        loading_text.markdown("*Thinking...*")

        builder = StateGraph(AgentState)
        builder.add_node("browser", browsing_node)
        builder.add_node("writer", writing_node)
        builder.add_node("reflect", reflection_node)
        builder.add_node("refine", refine_node)
        builder.add_node("heading", heading_node)
        builder.set_entry_point("browser")
        builder.add_edge("browser", "writer")
        builder.add_edge("writer", "reflect")
        builder.add_edge("reflect", "refine")
        builder.add_edge("refine", "heading")
        graph = builder.compile(checkpointer=memory)

        config = {"configurable": {"thread_id": f"{thread}"}}
        for s in graph.stream({"topic": user_input}, config):
            # loading_text.markdown(f"*{st.session_state['loading_message']}*")
            print(s)
        
        s = graph.get_state(config).values
        refined_summaries = s['refined_summaries']
        headings = s['headings']
        thread+=1
        # Display final response
        loading_text.empty()
        response_text = "\n\n".join([f"{h}\n{s}" for h, s in zip(headings, refined_summaries)])
        st.markdown(response_text)
        st.session_state["messages"].append({"role": "assistant", "content": response_text})
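
The final markdown assembly can be checked in isolation: each headline is joined to its summary, and articles are separated by a blank line so Streamlit renders them as distinct paragraphs. A small sketch with made-up headings and summaries:

```python
# Illustrative data standing in for the graph's final state.
headings = ["Chip Launch", "Funding Round"]
refined_summaries = ["A new chip launched.", "A startup raised funds."]

# Pair each heading with its summary; a blank line between articles
# keeps them as separate markdown paragraphs.
response_text = "\n\n".join(
    f"{h}\n{s}" for h, s in zip(headings, refined_summaries)
)
print(response_text)
```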

Conclusion

This tutorial covered the entire process of building an AI-powered news summarization agent with a simple Streamlit UI. From here, you can make some further improvements:

  • A better GUI for enhanced user interaction.
  • Iterative refinement over multiple reflection passes to make the summaries more accurate.
  • Maintaining context so the user can continue a conversation about specific news stories.

Happy coding!




A step-by-step guide on how to build an AI news summarizer using Streamlit, Groq, and Tavily appeared first on Marktechpost.
