Your Webpage A.I. Summarizer in a Few Lines of Code and for FREE

Federico Scognamiglio
Apr 17, 2024 · 3 min read


Curiosity has no limits, but it's hard to say the same for time. With AI technologies spreading everywhere, in this article you'll discover how to build your own AI reading and summarizing machine.

Since most APIs require a payment, this article is for those who are curious to see summaries of selected articles generated by their own code, for free. To keep the experience free, we use Python and the Cohere platform, which generates API keys at no cost. To make things super easy, I suggest you run this code (which works and can be copied and pasted as-is) on Google Colab, since it runs online and doesn't require any particular performance from your laptop for this task.

Eager to read your first homemade A.I. summary? Let's get the environment ready for the challenge by installing and importing the fundamental packages:

!pip install cohere
import requests
from bs4 import BeautifulSoup
import cohere

# Create the Cohere client with your (free) API key
co = cohere.Client('PASTE YOUR COHERE API KEY HERE')
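Hard-coding the key works, but if you share your notebook anyone who opens it can see it. Here is a minimal sketch of a safer option, assuming you have stored the key in an environment variable named COHERE_API_KEY (a name I'm choosing purely for illustration):

import os
from getpass import getpass

# Read the key from an environment variable, or prompt for it interactively
api_key = os.environ.get('COHERE_API_KEY') or getpass('Paste your Cohere API key: ')
co = cohere.Client(api_key)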

Then, as a second step, let's paste the URLs of the articles we are interested in. In this example I picked two cool articles: one about an AI founder who found a way to avoid Nvidia's chips, and one about the famous OpenAI competitor, Anthropic:

# Paste the URLs of the articles you are most interested in
url1 = 'https://www.businessinsider.com/ai-models-can-learn-deceptive-behaviors-anthropic-researchers-say-2024-1'
url2 = 'https://www.businessinsider.com/nvidia-chips-lamini-ai-amd-jensen-huang-sharon-zhou-2024-4'
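Before going further, it can be worth checking that both pages actually return content, since some news sites block requests that don't look like they come from a browser. A quick sketch (the User-Agent string below is just an illustrative value):

# Quick sanity check: fetch each URL and print the HTTP status and page size
headers = {'User-Agent': 'Mozilla/5.0'}  # some sites reject the default requests User-Agent

for url in (url1, url2):
    r = requests.get(url, headers=headers, timeout=10)
    print(url, '->', r.status_code, len(r.text), 'characters')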

After these imports, it's time to shine and build the engine of the generative A.I. summarizer.

def extract_text_from_url(url, max_text_length):
    # GET request to the URL
    response = requests.get(url)

    # Parse the HTML content of the page with the BeautifulSoup package
    soup = BeautifulSoup(response.content, 'html.parser')

    # Find all the text nodes in the page
    text_elements = soup.find_all(text=True)

    # Combine all the extracted text into a single string
    text = ' '.join(element.strip() for element in text_elements if element.strip())

    # Truncate the text if it exceeds the maximum allowed length
    # You can skip this once you have a paid API key that allows more tokens
    if len(text) > max_text_length:
        text = text[:max_text_length]

    return text


# This max length has been chosen because we are using the free version of the API
max_text_length = 450

# Extract text from the first URL
text1 = extract_text_from_url(url1, max_text_length)

# Repeat the same step for the second URL
text2 = extract_text_from_url(url2, max_text_length)

# Summarize the text from the first URL
response1 = co.summarize(
    text=text1,
    model='command',
    length='medium',
    extractiveness='medium'
)
summary1 = response1.summary

# Do the same for the second URL
response2 = co.summarize(
    text=text2,
    model='command',
    length='medium',
    extractiveness='medium'
)
summary2 = response2.summary
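A quick note on the extraction step before we print anything: find_all(text=True) also collects navigation menus, scripts and cookie banners, so the first 450 characters are not always the article itself. A slightly more selective sketch (one option among many) keeps only paragraph tags, which usually hold the article body:

def extract_paragraph_text(url, max_text_length):
    # Fetch the page and keep only the <p> elements
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'html.parser')
    paragraphs = soup.find_all('p')
    text = ' '.join(p.get_text(strip=True) for p in paragraphs)
    return text[:max_text_length]

You can swap this in for extract_text_from_url without changing anything else in the code.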

Are you ready to read your first summary of a page made by your own code?


print("Summary of Text from URL 1:")
print(summary1)

print("\nSummary of Text from URL 2:")
print(summary2)

And that’s it.

By pressing this last “run” we get a summary for every article we pasted, and you can finally read your AI-backed summaries.
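If you want to summarize more than two articles, there is no need to copy and paste the same block over and over. A small loop over a list of URLs, using the same function and parameters as above, does the job:

# Summarize any number of URLs with the same function and parameters
urls = [url1, url2]

for i, url in enumerate(urls, start=1):
    text = extract_text_from_url(url, max_text_length)
    response = co.summarize(
        text=text,
        model='command',
        length='medium',
        extractiveness='medium'
    )
    print(f"\nSummary of Text from URL {i}:")
    print(response.summary)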

In conclusion, as you can see, this code is an easy way to get confident with these tools and also look cool in front of your friends and colleagues. To generate longer summaries, the Cohere free version is not the best option. In that case, you can consider paid Cohere plans or other API providers such as OpenAI, whose “Babbage-002” engine costs $0.0004 / 1K tokens.
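If you do move to a paid key, the only changes needed in this code are a higher truncation limit and the length parameter. Something like the following sketch would work (the 4,000-character limit is just an illustrative value, not a Cohere threshold):

# With a paid key you can feed more text and ask for a longer summary
max_text_length = 4000  # illustrative value
long_text = extract_text_from_url(url1, max_text_length)
long_response = co.summarize(
    text=long_text,
    model='command',
    length='long',
    extractiveness='medium'
)
print(long_response.summary)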

Of course, if you have an issue with the code or want to share some improvements, don't hesitate to drop me a message; I will be super happy to exchange further ideas.
Especially if you want to share some chocolate.
