... And everyone is always on the run! It always seems like we have too little time to read, inform ourselves and enjoy some quality content. In this atmosphere, we can just hope that something comes to help, and this something can be AI.🦾
We can quickly leverage our Python knowledge to implement and deploy a text summarization chatbot, using pretrained AI models and Gradio, the frontend framework I have already covered several times.
First of all, we want to define a "facility", within the Python script, that takes charge of loading and calling the AI model:
import time
import gradio as gr
from pypdf import PdfMerger
from transformers import pipeline
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import CharacterTextSplitter

model_checkpoint = "FalconsAI/text_summarization"
summarizer = pipeline("summarization", model=model_checkpoint)
To start, we chose a fairly small model, which is good for getting to know the subject, but is now outperformed by state-of-the-art summarizers.
We need some important and useful functions, such as one that merges PDFs if they are uploaded in bulk...
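Before wiring the model into the app, it helps to know the shape of what the pipeline returns: a list with one dict per input, each holding a "summary_text" key. The snippet below mocks that return value (no model download, and the text is invented) just to show the unpacking we will rely on later:

```python
# The summarization pipeline returns a list of dicts, one per input,
# e.g. [{"summary_text": "..."}]. We mock that shape here; the real
# call would be summarizer(text, max_length=..., min_length=...)
mock_output = [{"summary_text": "Cats are cute."}]

# This is exactly the unpacking used later in the bot() function
response = mock_output[0]["summary_text"]
print(response)  # Cats are cute.
```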
def merge_pdfs(pdfs: list):
    merger = PdfMerger()
    for pdf in pdfs:
        merger.append(pdf)
    merged_path = f"{pdfs[-1].split('.')[0]}_results.pdf"
    merger.write(merged_path)
    merger.close()
    return merged_path
...And one that turns the merged PDF into a text string suitable for the summarizer:
def pdf2string(pdfpath):
    loader = PyPDFLoader(pdfpath)
    documents = loader.load()
    # Split the documents into smaller chunks for processing
    text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
    texts = text_splitter.split_documents(documents)
    fulltext = ""
    for text in texts:
        fulltext += text.page_content + "\n\n\n"
    return fulltext
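CharacterTextSplitter does the heavy lifting here. Conceptually it is close to this plain-Python sketch (a simplification: the real splitter breaks on separators and supports overlap), which chops a long string into roughly 1000-character pieces and rejoins them the same way pdf2string does:

```python
def naive_chunks(text: str, chunk_size: int = 1000) -> list[str]:
    # Simplified stand-in for CharacterTextSplitter: fixed-size slices,
    # no separator awareness and no overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

sample = "word " * 600  # ~3000 characters of fake document text
pieces = naive_chunks(sample)
fulltext = "\n\n\n".join(pieces)  # same join pdf2string performs
print(len(pieces))  # 3
```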
Now that we have the preprocessing functions set, let's design our chatbot with Gradio: we want it to be multimodal (supporting both direct text messages and uploaded documents), so building it will be a little more involved.
We want a function that manages our chat history, separating text messages from pdf documents, so we write this piece of code:
def add_message(history, message):
    if len(message["files"]) > 0:
        # Store uploads as a tuple, so bot() can tell files apart from plain text
        history.append((tuple(message["files"]), None))
    if message["text"] is not None and message["text"] != "":
        history.append((message["text"], None))
    return history, gr.MultimodalTextbox(value=None, interactive=False)
This function returns history as a list of tuples. An uploaded batch of files becomes a tuple of paths paired with None, like (("/path/to/file1.pdf", "/path/to/file2.pdf"), None), while a plain message becomes a string paired with None, like ("In this article, we will see why cats are so overwhelmingly cute...", None). In both cases, None stands for the message from the chatbot, which has not been written yet. Let's see how we can use the history to generate our text:
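To make the bookkeeping concrete, here is a small Gradio-free simulation of the same logic (the paths and the message below are invented), showing the two kinds of entries that end up in history:

```python
def add_message_sim(history, message):
    # Gradio-free version of add_message: uploads are stored as a tuple,
    # plain text as a string; None is the bot reply still to be written
    if len(message["files"]) > 0:
        history.append((tuple(message["files"]), None))
    if message["text"]:
        history.append((message["text"], None))
    return history

history = []
add_message_sim(history, {"files": ["/tmp/a.pdf", "/tmp/b.pdf"], "text": ""})
add_message_sim(history, {"files": [], "text": "Why are cats so cute?"})
print(history[0])  # (('/tmp/a.pdf', '/tmp/b.pdf'), None)
print(isinstance(history[0][0], tuple))  # True  -> file upload
print(isinstance(history[1][0], tuple))  # False -> plain text
```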
def bot(history):
    if history is None or len(history) == 0:
        return
    last = history[-1][0]
    if isinstance(last, tuple):
        # The user uploaded PDFs: merge them and extract their text
        finalpdf = merge_pdfs(list(last))
        text = pdf2string(finalpdf)
    else:
        # The user sent a plain text message
        text = last
    n_words = len(text.split(" "))
    response = summarizer(text, max_length=int(n_words * 0.5), min_length=int(n_words * 0.05), do_sample=False)[0]
    response = response["summary_text"]
    # Stream the summary character by character for a typewriter effect
    history[-1][1] = ""
    for character in response:
        history[-1][1] += character
        time.sleep(0.05)
        yield history
As you can see, we check whether the first element of the last tuple in history (history[-1][0]) is a tuple, which means it consists of uploaded PDFs, or a plain string. After that, we stream the output summary as the chatbot response (history[-1][1], which previously was None) with yield.
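Two details of bot() are worth isolating. The max_length/min_length bounds are a word-count heuristic (the summary is capped at half the input's word count and floored at 5% of it), and the typewriter effect is just a generator yielding a growing string. Both are sketched below with made-up text and the sleep removed:

```python
def summary_bounds(text: str) -> tuple[int, int]:
    # Same heuristic as in bot(): cap the summary length at 50% of the
    # input word count, floor it at 5%
    n_words = len(text.split(" "))
    return int(n_words * 0.5), int(n_words * 0.05)

def stream_reply(response: str):
    # Typewriter effect: yield the reply one character at a time
    # (bot() also sleeps 0.05s per character before each yield)
    partial = ""
    for character in response:
        partial += character
        yield partial

print(summary_bounds("one two three four five six seven eight nine ten " * 10))  # (50, 5)
print(list(stream_reply("Hi!")))  # ['H', 'Hi', 'Hi!']
```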
And now we only need to build the multimodal chatbot!
with gr.Blocks() as demo:
    chatbot = gr.Chatbot(
        [[None, "Hi, I'm **ai-summarizer**🤖, your personal summarization assistant😊"]],
        label="ai-summarizer",
        elem_id="chatbot",
        bubble_full_width=False,
    )
    chat_input = gr.MultimodalTextbox(interactive=True, file_types=["pdf"], placeholder="Enter message or upload file...", show_label=False)
    chat_msg = chat_input.submit(add_message, [chatbot, chat_input], [chatbot, chat_input])
    bot_msg = chat_msg.then(bot, chatbot, chatbot, api_name="bot_response")
    bot_msg.then(lambda: gr.MultimodalTextbox(interactive=True), None, [chat_input])
We then launch our app like this:
demo.queue()

if __name__ == "__main__":
    demo.launch(server_name="0.0.0.0", share=False)
And make it run (assuming we saved the script in app.py):
python3 app.py
Once everything is loaded, you will be able to see the chatbot on localhost:7860.
And we're done: we can just sit back and relax while our summarization assistant works!😎
This is the last post in the AI enthusiasm series: I don't rule out making a season 2 of it, but for now I'll be embarking on a new, exciting educational blog series about Docker and devcontainers.
Let me know in the comments if you would like me to start also a YouTube channel (it's a project that I would be eager to work on if I knew I would have some support from my community here on DEV!).🔥🚀
Thank you so much to everyone, let's keep up!❤️