The refining action is best illustrated using an example.
The goal is to scrape the course links from Class Central. The expected output looks like this:
| Links | ['https://www.classcentral.com/course/machine-learning-835', 'https://www.classcentral.com/course/information-systems-audit-17979', …] |
As we are looking to collect all the course links, Multi Select would be the right tool for this job.
Click on the first course title, Machine Learning. We can disregard the post above it, as it is an ad. As soon as you click on the post title, all similar course titles are selected automatically.
Once more, as the above screenshot illustrates, click X to remove the course properties. We can see that all the course titles are extracted in the preview. However, the goal is to extract the links rather than the title text.
Links are present as an href property of an anchor tag. Click on the dropdown icon as illustrated above to open up the refine menu.
Inside the popup, we can see that an <h2> element is currently selected. Click on the <a> tag, which is listed above the h2.
- You have successfully refined the selection, and all anchor tags are now selected.
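To make the effect of refining more concrete, here is a minimal Python sketch of the same idea outside the tool: instead of reading the text of each <h2>, it reads the href of the <a> that encloses it. The sample markup, the second course title, and the names `SAMPLE_HTML`, `TitleLinkParser`, and `extract` are assumptions made for illustration. The real Class Central page structure may differ, and this is not how the tool itself is implemented.

```python
from html.parser import HTMLParser

# Hypothetical markup, assumed for illustration: each course title is an <h2>
# wrapped in an <a>. The real page structure may differ.
SAMPLE_HTML = """
<ul>
  <li><a href="https://www.classcentral.com/course/machine-learning-835"><h2>Machine Learning</h2></a></li>
  <li><a href="https://www.classcentral.com/course/information-systems-audit-17979"><h2>Information Systems Audit</h2></a></li>
</ul>
"""


class TitleLinkParser(HTMLParser):
    """Collects each <h2> text and the href of the <a> that encloses it."""

    def __init__(self):
        super().__init__()
        self.current_href = None  # href of the <a> we are inside, if any
        self.in_h2 = False
        self.titles = []  # what selecting the <h2> yields
        self.links = []   # what refining to the parent <a> yields

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.current_href = dict(attrs).get("href")
        elif tag == "h2":
            self.in_h2 = True
            # Refining step: move from the <h2> to its enclosing <a>
            # and take the href property instead of the text.
            if self.current_href is not None:
                self.links.append(self.current_href)

    def handle_endtag(self, tag):
        if tag == "a":
            self.current_href = None
        elif tag == "h2":
            self.in_h2 = False

    def handle_data(self, data):
        if self.in_h2 and data.strip():
            self.titles.append(data.strip())


def extract(html):
    """Return (titles, links) found in the given HTML string."""
    parser = TitleLinkParser()
    parser.feed(html)
    return parser.titles, parser.links


titles, links = extract(SAMPLE_HTML)
print(titles)
print(links)
```

The `titles` list corresponds to the unrefined selection (text of the <h2> elements), while `links` corresponds to the refined selection (href of the parent anchor tags).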
prop and set the name to