How Macros Work
This tutorial will show you how to create a macro. A macro is an automator which allows you to execute a complex extraction process in a single click:
It configures the Hub the way you want, explores pages, extracts data, exports it, and then restores the original configuration.
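The sequence above (configure, explore, extract, export, restore) can be pictured with a short Python sketch. This is only an illustration of the idea, not how OutWit Hub is implemented: the names temporary_settings and run_macro, the settings dictionary and the three step functions are all hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def temporary_settings(app_settings, overrides):
    """Apply overrides to a settings dict, then restore the original values."""
    saved = dict(app_settings)       # snapshot of the original configuration
    app_settings.update(overrides)   # configure the application for this run
    try:
        yield app_settings
    finally:
        app_settings.clear()         # restore, even if one of the steps failed
        app_settings.update(saved)


def run_macro(app_settings, overrides, explore, extract, export):
    """Hypothetical macro: explore pages, extract rows, export them."""
    with temporary_settings(app_settings, overrides) as settings:
        pages = explore(settings)
        rows = [row for page in pages for row in extract(page)]
        return export(rows, settings)
```

Because the restore step sits in a finally clause, the original configuration comes back whether the run succeeds or stops on an error, which is the behavior the description above implies.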
While this window is showing instructions, the user interface of OutWit Hub remains operational.
You can still interact normally with the application and you can move this tutorial window around on the screen to better see the parts of the interface that you want.
Here is the sample data
In previous tutorials about making scrapers, we have seen how to extract the population data and the images from this page.
Our goal is now to create a macro which will both extract this data and download the flag images to your hard disk.
Getting the Images
For each city, the list includes the country flag as an SVG image. In some cases the image URLs are what you need. In others, you will want the image files themselves. For our example, let's say that we want to actually download these files.
To do this once, you simply need to go to the Images view, select the images, right-click on the selection and choose 'Download'.
If you know, however, that you will need to do this same task several times, or even regularly, the best way is to include both the scraper and the extraction of images in a macro.
A Macro That Does Both: Scraping and Downloading
In the Macro Editor, you can define the desired settings for the whole application.
These settings are the same as the ones you find in the bottom panels of each view.
You can also define the exploration parameters and set the type of output you want for the extracted data.
The Macro in a Few Clicks
Scraping: scrape the data and export it to HTML.
Each setting you choose is transcribed in real time in the "Macro as URL" textbox. Conversely, when you edit the text in the "Macro as URL" textbox, the controls are updated in real time.
Images: download the SVG images.
In this case, selecting the images we want is easy: the common criterion is the filename extension. Each case is different and you may need to use other criteria like the size, the associated text, etc.
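To make the criterion concrete, here is a minimal Python sketch of selecting image URLs by filename extension. It is not the Hub's own filter; the function names and the example.org URLs are made up for illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def has_extension(url, extension):
    """True if the path part of the URL ends with the given extension."""
    path = urlparse(url).path  # drops any query string such as ?width=40
    return PurePosixPath(path).suffix.lower() == extension.lower()


def select_images(urls, extension=".svg"):
    """Keep only the URLs whose filename has the wanted extension."""
    return [url for url in urls if has_extension(url, extension)]


# Hypothetical image URLs, standing in for the content of the Images view.
image_urls = [
    "https://example.org/flags/Flag_of_Japan.svg",
    "https://example.org/img/logo.png",
    "https://example.org/flags/Flag_of_India.SVG?width=40",
]
svg_urls = select_images(image_urls)
```

The comparison ignores letter case and any query string, so both flag URLs are kept and the PNG logo is dropped. A filter on size or associated text would follow the same pattern with a different test function.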
Overwrite the files that already exist on the hard disk.
We could have chosen to increment an index or add the time to the filename. We are leaving the default value for the destination folders, so both the extracted data and the downloaded files will go to the browser's current download folder.
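The three options for existing files (overwrite, increment an index, add the time) can be sketched in Python as follows. This is an assumption about how such a rule could work, not the Hub's actual naming scheme: the function name, the policy names and the "_1" and timestamp formats are all hypothetical.

```python
from datetime import datetime
from pathlib import Path


def resolve_destination(folder, filename, policy="overwrite", now=None):
    """Return the path a downloaded file should be written to."""
    target = Path(folder) / filename
    if policy == "overwrite" or not target.exists():
        return target  # no conflict, or the existing file is replaced
    stem, suffix = target.stem, target.suffix
    if policy == "timestamp":
        stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
        return target.with_name(f"{stem}_{stamp}{suffix}")
    if policy == "index":
        index = 1
        while True:  # try flag_1.svg, flag_2.svg, ... until a name is free
            candidate = target.with_name(f"{stem}_{index}{suffix}")
            if not candidate.exists():
                return candidate
            index += 1
    raise ValueError(f"unknown policy: {policy}")
```

Overwriting keeps a single, always current copy of each flag, which suits a macro that is run regularly. The index and timestamp variants keep every earlier download at the cost of a growing folder.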
When you click on the Execute button in the Editor or in the Manager, the program does the exploration, extraction and export.
The results are exported from the 'scraped' view, and the files are downloaded into the chosen folder.
Now that you have your macro, you can either run it manually whenever you want, using the Execute button, or have it executed at a given time or frequency by including it in a job.
This one was very simple, only extracting data from one page. Your macros can browse and dig through thousands of Web pages and extract very large amounts of data, but the principle will remain the same.
Stay tuned for more tutorials on the Hub's features.