What Is Screen Scraping?
Understand the concept of screen scraping with an example.
Have you ever thought of extracting the UI elements present on your screen? This can be done with a simple technique called screen scraping. Let us begin by understanding the concept.
What Is Screen Scraping?
Screen scraping is a technique for capturing the UI elements displayed on a screen so that the captured data can be fed to other applications. For example, software could scrape the mark sheets of individual students and use them to calculate the overall result of a school.
Screen scraping is used in many domains and sectors, including banking and aviation.
Screen scraping can be done either manually or with automated software, and various tools are available to scrape screen elements.
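To make the mark sheet example concrete, here is a minimal sketch in Python. It assumes the scraper has already returned each mark sheet as plain text with one "subject mark" pair per line. That layout, the sample data, and the function names `parse_mark_sheet` and `school_average` are all made up for illustration and do not come from any particular tool.

```python
def parse_mark_sheet(text):
    """Return {subject: mark} from lines such as 'Maths   78'.

    Lines that do not end in a whole number (a name line, for instance) are skipped.
    """
    marks = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[-1].isdigit():
            marks[" ".join(parts[:-1])] = int(parts[-1])
    return marks


def school_average(sheets):
    """Average every mark found across all scraped mark sheets."""
    all_marks = [m for sheet in sheets for m in parse_mark_sheet(sheet).values()]
    return sum(all_marks) / len(all_marks) if all_marks else 0.0


# Hypothetical scraped text for two students.
sheets = ["Maths 80\nScience 70", "Maths 60\nScience 90"]
print(school_average(sheets))
```

The scraping itself is the hard part. Once the text is available, feeding it to another program is ordinary string handling like this.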
How Does Screen Scraping Work?
Screen scraping software uses OCR (Optical Character Recognition) technology to detect text in images. The conversion process looks like this:
1. Convert the image to black and white: The first step is one of the simplest. The image is converted into just two colours, black and white, which helps differentiate the background from the foreground.
2. Tokenisation: Once the image has been converted to black and white, the next step is to pick out the individual characters. Each character is isolated on the basis of the colour of its pixels. The characters fetched are known as tokens. Each token may or may not be a valid letter of a known language.
3. Natural Language Processing: The final step processes the tokens fetched in the previous step. Each token is matched against the pre-defined set of letters and symbols stored in the system. The matched result is returned to the user and can be fed to other systems.
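The three steps above can be sketched in a few lines of Python. This is a toy illustration, not how UiPath or any real OCR engine is implemented. It assumes the image is a small grid of grayscale values, that characters are separated by at least one empty column, and that the "pre-defined set of letters" is the tiny hypothetical `TEMPLATES` table below.

```python
# Hypothetical glyph templates, cropped to their ink columns (1 = ink, 0 = background).
TEMPLATES = {
    "L": ("10", "10", "11"),
    "I": ("1", "1", "1"),
    "T": ("111", "010", "010"),
}


def binarise(gray, threshold=128):
    """Step 1: map each grayscale pixel (0-255) to 1 (dark ink) or 0 (light background)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def tokenise(bw):
    """Step 2: cut the image into tokens wherever a column contains no ink."""
    width = len(bw[0])
    tokens, start = [], None
    for x in range(width + 1):
        # The extra iteration at x == width closes a token that touches the right edge.
        has_ink = x < width and any(row[x] for row in bw)
        if has_ink and start is None:
            start = x
        elif not has_ink and start is not None:
            tokens.append(tuple("".join(str(v) for v in row[start:x]) for row in bw))
            start = None
    return tokens


def match(tokens):
    """Step 3: look each token up in the template set; '?' marks an unknown token."""
    lookup = {glyph: char for char, glyph in TEMPLATES.items()}
    return "".join(lookup.get(token, "?") for token in tokens)


# A made-up 8x3 "image": '#' is a dark pixel, '.' is a light one.
art = ["#..#.###",
       "#..#..#.",
       "##.#..#."]
gray = [[30 if c == "#" else 220 for c in row] for row in art]
print(match(tokenise(binarise(gray))))  # prints "LIT"
```

A token that matches no template comes back as `?`, which mirrors the point in step 2 that a token may not be a valid letter. Real engines handle noise, varying fonts, and touching characters, none of which this sketch attempts.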
What Does Screen Scraping Do?
A screen scraping tool creates an exact copy of the UI elements visible on the screen. Let us take an example in which we capture text from an image. In this blog, we will use one such tool, UiPath. UiPath is an RPA (Robotic Process Automation) tool used to build automations from start to end. RPA is a niche technology that is expected to grow further in the near future.
Step 1: Download UiPath Studio
The first step is to get the required software. To download UiPath Studio, you can check out this link. You will need to sign up on the UiPath website, after which you will be able to download UiPath Studio.
Step 2: Install UiPath Studio
Once the download completes, install the software on your system. Installation may take some time, depending on the options selected. Once the software is installed, you can open it right away.
Step 3: Change Settings To Enable Screen Scraping
If you have installed the latest version of UiPath Studio, you will need to change a setting to enable Screen Scraping: select Settings -> Design and turn off Modern Design.
Step 4: Create a New Process
After opening UiPath Studio, you will see the following screen, with options such as Process, Library, Test Automation etc. Select Process from the right navigation menu.
Once you click on Process, you will be greeted with the following options to create a new process:
- Name: Name of the process to be created.
- Location: Path of the project on the drive.
- Description: Explanation of the project.
- Compatibility: Depends on the operating system on which the project needs to run.
- Language: As of now, UiPath supports C# and VB.
After entering the details, click on Create. UiPath will then create the project, which can take a few minutes to load.
Once the workspace creation is completed, you will be greeted with the following screen.
Step 5: Start Screen Scraping
To start the scraper, click on Screen Scraping in the top menu bar. For demonstration, we will scrape text from the following image.
Note the uneven spacing between the letters. A good screen scraping tool will keep the spacing intact while extracting the text from the image.
Once scraping starts, it will take some time to render the image and collect all the text in it. After it completes, you will see a screen like this.
As you can see, the spacing and line breaks remain the same as they appear in the image. You can now copy the text and feed it to other applications.
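Preserved spacing is useful when the text goes to another program, because wide gaps often mark column boundaries. The sketch below assumes, purely for illustration, that columns in the copied text are separated by two or more spaces. It converts the text to CSV so that a spreadsheet or script can read it. The function name `scraped_text_to_csv` and the sample text are hypothetical.

```python
import csv
import io
import re


def scraped_text_to_csv(text):
    """Split each non-empty line on runs of 2+ whitespace characters and emit CSV."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in text.splitlines():
        if line.strip():
            writer.writerow(re.split(r"\s{2,}", line.strip()))
    return out.getvalue()


# Hypothetical text copied from the scraper's result window.
scraped = "Name        Score\nAsha Rao    91\n"
print(scraped_text_to_csv(scraped))
```

Single spaces inside a value, such as the one in "Asha Rao", are left alone. Only the wider gaps are treated as separators, which is why a scraper that keeps the original spacing makes this step easier.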
Thanks for reading. This was a quick demo of one of the many services provided by screen scraping tools. If you need some help, I am just a comment away.