HTML screen scraping

(Andrew Catchpole) #1

Where a source provides JSON-formatted data, it is really easy to use in Flowxo.

However, I also have a source that is not available in either XML or JSON. Until now it has been possible to use a simple screen scrape approach, extracting the data using the Flowxo Text String capability to ‘split’ on the start and end of the value required. This is fine for a small number of data values but becomes too complex where a lot of values need to be collected.

It would be really good to have a Flowxo action/service which accepts an HTML body as an input, together with start/stop delimiters, and is able to output values in a similar way to the JSON notation, e.g. using the double underscores.
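
To make the idea concrete, here is a minimal Python sketch of delimiter-based extraction. It is not a Flowxo feature, only an illustration of the behaviour described above; the function names, the sample HTML and the rule names are all made up for the example.

```python
def extract_between(html, start, stop):
    """Return the text between the first `start` and the next `stop`, or None."""
    begin = html.find(start)
    if begin == -1:
        return None
    begin += len(start)
    end = html.find(stop, begin)
    if end == -1:
        return None
    return html[begin:end].strip()


def scrape(html, rules):
    """Apply {name: (start, stop)} rules and return {name: value}."""
    return {name: extract_between(html, start, stop)
            for name, (start, stop) in rules.items()}


# Hypothetical page fragment and rules, for illustration only.
SAMPLE_HTML = '<div id="status">Outage</div><span class="area">Norwich</span>'
RULES = {
    "status": ('<div id="status">', "</div>"),
    "area": ('<span class="area">', "</span>"),
}
result = scrape(SAMPLE_HTML, RULES)
print(result)
```

Each rule names one output value, so adding another value means adding one more start/stop pair rather than another chain of splits. Plain string matching like this breaks as soon as the page markup changes, which is part of why it gets hard to maintain.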

Would this be useful to other Flowxo users?

Thanks Andrew

(Nathan Stults) #2

Andrew,

Building a screen scraping tool is probably not in the cards for us; it would just be too hard and too expensive to support. However, there are a couple of products out there that can turn a web page into a JSON API, which you could probably use in between the website and Flow XO’s native JSON capabilities:

https://wrapapi.com/


And if you search Google for “turn a website into an API” you will see there are even more.
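
For anyone wondering what the JSON side looks like once such a service returns data, here is a rough Python sketch of flattening nested JSON into double-underscore keys. It only illustrates the general idea; the exact key naming Flow XO produces is an assumption here, and the `flatten` helper and sample payload are hypothetical.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts/lists into one dict with '__'-joined keys."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)  # list positions become key segments
    else:
        return {prefix: value}
    flat = {}
    for key, child in items:
        name = f"{prefix}__{key}" if prefix else str(key)
        flat.update(flatten(child, name))
    return flat


# Hypothetical response body, for illustration only.
payload = {"outage": {"area": "Norwich", "codes": [1, 2]}}
flat = flatten(payload)
print(flat)
```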

Nathan

(Andrew Catchpole) #3

@nathan

Many thanks for the tip. I am using WrapAPI, which is perfect.

If anyone needs BT Broadband outages, then here’s a ready-made API:

https://wrapapi.com/use/poweye/supplier/bt_broadband/0.0.8?wrapAPIKey=A0TLYlC5BnjEUIJOTeQIFtZ24EK3J2HU