Strategies for tackling complex websites, nested data, and dynamic content in your Jsonify workflows.
Use an Extract Data block after navigating to each respective page type. Jsonify processes one Extract Data configuration per page view.
Use Find Links, Follow Links, and Interact with Page blocks methodically to reach the desired page before extraction.
In the Extract Data block, use very specific instructions to target only the relevant parts of a complex page, ignoring sidebars, footers, or unrelated content. Use negative constraints (e.g., “Do not extract from the ‘related articles’ section”).
Extract Data Block’s Advanced Mode (Edit data shape - JSON):
<...>
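As an illustration only, the article example discussed in this section could correspond to a nested shape like the one sketched below in Python. The field names come from the text; the placeholder descriptions used as values, the flatten_shape helper, and the assumption that the editor accepts this exact layout are hypothetical, so treat this as a sketch rather than Jsonify's documented schema format.

```python
import json

# Hypothetical nested data shape for one article. Field names are taken from
# the surrounding text; the value descriptions are illustrative placeholders.
article_shape = {
    "article_title": "main title of the article",
    "publication_date": "date the article was published",
    "author": {
        "author_name": "name of the author",
        "author_profile_url": "link to the author's profile page",
    },
    "tags": ["tag text"],
}


def flatten_shape(shape: dict) -> dict:
    """Hoist the fields of nested objects to the top level (one level deep)."""
    flat = {}
    for key, value in shape.items():
        if isinstance(value, dict):
            flat.update(value)  # e.g. author -> author_name, author_profile_url
        else:
            flat[key] = value
    return flat


# JSON text that could be pasted into a JSON editor.
shape_json = json.dumps(article_shape, indent=2)

# Flat variant of the same shape, without the nested author object.
flat_shape = flatten_shape(article_shape)
print(shape_json)
```

The nested form keeps the author details grouped under one key, while the flat form trades that grouping for a simpler set of top-level fields.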
The data shape defined here is used to guide the AI.
In the Extract Data block, you would then guide the AI on how to populate this structure (e.g., “For each article, populate the article_title with the main title, and publication_date with the publishing date. For the author object, fill author_name and author_profile_url. For tags, collect all associated tags as an array of strings.”).
Alternatively, use flat fields (e.g., author_name, author_profile_url instead of a nested author object). This is simpler to set up but provides a less structured output.
Interact with Page Block (for “Load More” buttons, pop-ups, etc.):
Once the interaction has changed the page, any following Extract Data or Find Links blocks will operate on this new state.
Use Interact with Page to trigger an event (e.g., click a button to open a pop-up window), and then use another Extract Data block to get information specifically from the newly appeared pop-up/modal content. All of these operations are considered to be within the context of the same initial page view, as the primary URL does not change for the modal itself.
Paginate a list Block (for Infinite Scroll & Button-Based Pagination):
For infinite-scroll pages, the Paginate a list block simulates the scroll actions needed to load new viewports of content.
To load dynamic content before extraction, use Interact with Page or Paginate a list.
Write your Extract Data descriptions to be somewhat flexible. Instead of “Extract the text from the third paragraph,” try “Extract the paragraph that starts with ‘Product Overview:’.”
In your Extract Data instructions, always specify how to handle missing fields (e.g., “If the discount price is not available, leave the discount_price field empty”). This prevents errors and ensures a consistent output structure.
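The effect of a missing-field rule can also be checked after extraction. The sketch below, in Python, assumes the extracted rows arrive as a list of dictionaries. The field list (apart from discount_price), the normalize_record helper, the sample rows, and the choice of an empty string as the “empty” value are all assumptions made for illustration and are not part of Jsonify.

```python
from typing import Any

# Hypothetical field list; only discount_price appears in the text above.
EXPECTED_FIELDS = ("product_name", "price", "discount_price")


def normalize_record(
    record: dict[str, Any],
    fields: tuple[str, ...] = EXPECTED_FIELDS,
) -> dict[str, Any]:
    """Return a copy of record that contains every expected field.

    Fields that are absent or None are set to an empty string, so every
    row ends up with the same keys in the same order.
    """
    normalized = {}
    for field in fields:
        value = record.get(field)
        normalized[field] = "" if value is None else value
    return normalized


# Made-up sample rows: the second one has no discount price.
rows = [
    {"product_name": "Desk lamp", "price": "24.99", "discount_price": "19.99"},
    {"product_name": "Bookshelf", "price": "89.00"},
]
normalized_rows = [normalize_record(row) for row in rows]
```

With a step like this, downstream tools such as spreadsheets or CSV exports see the same columns for every row, which is the consistency the missing-field instruction is meant to provide.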