How To Create a Question-Answering Model From Scratch
Building your own question-answering model might seem daunting, but it is more manageable than you think. This tutorial will walk you through the process step by step.
Question answering is a natural language processing (NLP) technique used to answer questions based on given data or text. Question-answering models aim to build systems that automatically understand questions posed in natural language and provide accurate, relevant answers. Question answering is widely used across domains such as chatbots and virtual assistants, education and e-learning, information retrieval, and search engines.
Setting Up the Development Environment
Before continuing with this tutorial, you need some knowledge of Python to follow the setup and development process. Navigate to Google Colab or activate your Jupyter Notebook, and create a new notebook. Then execute the following command to install the required library in your development environment.
!pip install transformers
You will use Transformers, which provides pre-trained models and tools for working with state-of-the-art NLP models.
Importing Necessary Libraries
Import the required libraries that you will use to preprocess the data and create the model.
import tensorflow as tf
from transformers import pipeline
from transformers import BertForQuestionAnswering
The imported libraries will be used later in the code to develop the model.
Define Context and List of Questions Based on Context
Define a context string that contains the information you want to ask questions about. Here we have chosen a storyline about mountain formation.
question_answerer = pipeline("question-answering")
context = "A mountain is an elevated portion of the Earth's crust, generally with steep sides that show significant exposed bedrock. A plateau is a flat, elevated landform that rises sharply above the surrounding area on at least one side."
Next, create a list of questions you want to ask about the given context.
questions = [
"What is a mountain?",
"What is a plateau?"
]
Encode the first question using the tokenizer's encode method, specifying truncation and padding. Encoding into fixed-length ID sequences improves the model's consistency, memory efficiency, and training performance. (The tokenizer itself is loaded in the next section.)
input_ids = tokenizer.encode(questions[0], truncation=True, padding=True)
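To see what truncation and padding actually do, here is a toy, pure-Python sketch. It does not use the real BERT tokenizer (which maps text to vocabulary IDs), and `toy_encode` is a hypothetical name; but the cut-or-pad logic is the same idea: every input becomes the same fixed length, which is what makes batches uniform in memory.

```python
# Toy illustration of truncation and padding (NOT the real BERT tokenizer).
def toy_encode(text, max_length=8, pad_id=0):
    # Hypothetical word-level "IDs": here, just the length of each word.
    ids = [len(word) for word in text.split()]
    ids = ids[:max_length]                     # truncation: cut long inputs
    ids += [pad_id] * (max_length - len(ids))  # padding: fill short inputs
    return ids

print(toy_encode("What is a mountain ?"))  # -> [4, 2, 1, 8, 1, 0, 0, 0]
```

Whatever the input length, the output always has exactly `max_length` entries, just as the real tokenizer produces sequences of a uniform length when `truncation` and `padding` are enabled.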
Create a question-answering pipeline using the BERT-based model and tokenizer.
question_answerer = pipeline('question-answering', model=model, tokenizer=tokenizer)
question_answerer({
    'question': questions,
    'context': context
})
Load the Pre-Trained BERT-Based Model and Tokenizer
BERT (Bidirectional Encoder Representations from Transformers) is one of the most widely used language representation models for NLP tasks, including question answering. Loading a pre-trained BERT-based model gives your system a strong understanding of language and its capabilities for question answering. Tokenization divides the input text into meaningful units for the model to process.
Combining BERT with its matching tokenizer improves your model's performance and the accuracy of its results.
from transformers import AutoTokenizer
model = BertForQuestionAnswering.from_pretrained("deepset/bert-base-cased-squad2")
tokenizer = AutoTokenizer.from_pretrained("deepset/bert-base-cased-squad2")
Obtaining Answers
Now that you have the context and the pre-trained model loaded, you can obtain answers by passing each question, together with the context, to the pipeline as a dictionary. As above, we have two questions and will select each one by its index. The model returns the answer and a confidence score for each.
question_answerer({
'question': questions[0],
'context': context
})
question_answerer({
'question': questions[1],
'context': context
})
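If you have several questions, you can wrap the calls above in a small loop. `answer_all` below is a hypothetical helper, not part of the Transformers API; it takes any question-answering callable (such as the `question_answerer` pipeline built earlier) and formats each answer together with its score.

```python
# Hypothetical helper: run every question through a question-answering
# callable and format each answer with its confidence score.
def answer_all(qa, questions, context):
    results = []
    for q in questions:
        out = qa({"question": q, "context": context})
        results.append(f"{q} -> {out['answer']} (score: {out['score']:.2f})")
    return results

# Usage with the pipeline built above:
# for line in answer_all(question_answerer, questions, context):
#     print(line)
```

Because the helper accepts any callable that returns a dict with `answer` and `score` keys (the shape the question-answering pipeline produces), you can swap in a different model later without changing this loop.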
And just like that, you have your own question-answering model using Hugging Face Transformers.
Conclusion
By following this tutorial, you have built a basic QA model using the Hugging Face Transformers library. You can quickly adapt the code to work with different contexts and questions, allowing you to create your own question-answering model for a variety of use cases.
Remember that the example code provided is just a starting point, and you can explore and customize it further based on your specific needs and requirements. Have fun experimenting and building more advanced question-answering models!
Complete code available on GITHUB REPO
Opinions expressed by DZone contributors are their own.