Knowledgebase Data Source Structures

(Jordan M) #1

What structure does a knowledgebase data source need in order to be crawled properly?

On my news page, https://machinesitalia.org/news/, the articles are scraped properly: the scraper pages through each /news/article/ URL and creates a document for it.

However, on my company directory:
https://machinesitalia.org/companies

The scraper only seems to grab URLs like /companies?page=2 instead of the expected /company/{company-name-1}.

Even when I created a data source and manually pasted in all 900 URLs, it only read the URL on the first line, then reported “Completed” without scraping the second URL or creating a second document.

(Nathan Stults) #2

Hi, pasting all the individual URLs into the sources as start URLs won’t work. Please delete the data source where each webpage is a separate start URL and create a new data source with only https://machinesitalia.org/companies as the start URL. Then send us an email at support@flowxo.com with a link to the data source, and we can tell you why it’s not scraping the way you expect.
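
In the meantime, if it helps with debugging, here is a rough sketch for sorting a list of crawled URLs into pagination links (like /companies?page=2) and company detail pages (like /company/{company-name-1}). It is not how the Flow XO scraper works internally; the `classify_link` function and the `example-company` slug are made up for illustration, and the URL patterns are assumed from the ones described in this thread.

```python
from urllib.parse import urlparse, parse_qs


def classify_link(url: str) -> str:
    """Label a URL as 'pagination', 'detail', or 'other'.

    The patterns are assumptions based on the URLs mentioned in this thread.
    """
    parts = urlparse(url)
    path = parts.path.rstrip("/")
    # Listing pages: /companies with a ?page=N query string
    if path == "/companies" and "page" in parse_qs(parts.query):
        return "pagination"
    # Detail pages: /company/<slug>
    if path.startswith("/company/"):
        return "detail"
    return "other"


links = [
    "https://machinesitalia.org/companies?page=2",
    "https://machinesitalia.org/company/example-company",  # hypothetical slug
    "https://machinesitalia.org/news/",
]
labels = [classify_link(u) for u in links]
for url, label in zip(links, labels):
    print(f"{label:10} {url}")
```

If everything a crawl collected comes back as "pagination" and nothing as "detail", that matches the behaviour described in the first post, and it is useful information to include in the support email.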