Version 1.0

News

October 17th, 2007: Web-Harvest 1.0 released.

  • GUI is introduced.
  • The html-to-xml processor exposes attributes for controlling the cleaner's behaviour.
  • More scripting languages and features supported.
  • Access to HttpClient at runtime supported.
  • A number of other improvements and fixes.

April 16th, 2007: SVN support is added.

January 16th, 2007: Web-Harvest 0.5 released.

  • html-to-xml parser is changed - HtmlCleaner is used instead of TagSoup.
  • Script processor is introduced.
  • template processor is now based on BeanShell instead of OGNL.
  • Types are introduced in XQuery parameters.
  • A few new constructors are added to class ScraperConfiguration.
  • file and include processors now support both relative and absolute paths.
  • Web-Harvest variables are case-sensitive from this version.

October 27th, 2006: Web-Harvest 0.3 released.

  • HTTP authentication supported - two new optional attributes - username and password added to http processor.
  • URL encoding bug fixed: the special character # is no longer encoded.
  • HTML cleanup fixed - default attributes are no longer created if they don't exist in the original XML.
  • Examples adjusted and all functional again.
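
As a rough illustration of the new username and password attributes on the http processor, a configuration fragment might look like the sketch below. The URL, the credentials and the variable name are placeholders, and the surrounding config, var-def and html-to-xml elements are assumed from typical Web-Harvest configurations rather than taken from this announcement.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<config>
    <!-- Hypothetical variable name; holds the cleaned-up page content -->
    <var-def name="protectedPage">
        <html-to-xml>
            <!-- username and password are the optional attributes added in 0.3;
                 the URL and credentials here are placeholders -->
            <http url="http://www.example.com/members/"
                  username="demo-user"
                  password="demo-password"/>
        </html-to-xml>
    </var-def>
</config>
```

Both attributes are optional, so existing configurations that omit them should keep working unchanged.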

October 12th, 2006: Web-Harvest 0.261 released.

  • Minor bug that caused a ClassNotFound exception on the command line is fixed.

September 28th, 2006: Web-Harvest 0.26 released.

  • URL encoding bug fixed.

September 22nd, 2006: Web-Harvest 0.25 released.

  • Support for HTTP proxy credentials added.

September 13th, 2006: Web-Harvest 0.24 released.

  • Bug with relative redirection URLs fixed.

September 7th, 2006: Web-Harvest 0.23 released.

  • Support for HTTPS pages with self-signed certificates added.

September 6th, 2006: Web-Harvest 0.22 released.

  • Support for HTTP proxies added.

September 6th, 2006: Web-Harvest 0.21 released.

  • Circular redirection in HTTP client is enabled.

September 4th, 2006: Licence type changed to BSD.

September 1st, 2006: Web-Harvest 0.2 released.