News
October 17th, 2007:
Web-Harvest 1.0 released.
- GUI is introduced.
- html-to-xml processor exposes attributes for controlling cleaner's behaviour.
- More scripting languages and features supported.
- Access to HttpClient at runtime supported.
- A number of other improvements and fixes.
April 16th, 2007:
SVN support is added.
January 16th, 2007:
Web-Harvest 0.5 released.
- html-to-xml parser is changed: HtmlCleaner is now used instead of TagSoup.
- Script processor is introduced.
- template processor is now based on BeanShell instead of OGNL.
- Types are introduced in XQuery parameters.
- A few new constructors are added to the ScraperConfiguration class.
- file and include processors now support both relative and absolute paths.
- Web-Harvest variables are case-sensitive as of this version.
October 27th, 2006:
Web-Harvest 0.3 released.
- HTTP authentication supported: two new optional attributes,
username and password, added to the http processor.
- URL encoding bug fixed: the special character # is no longer encoded.
- HTML cleanup fixed: default attributes are no longer created if they don't exist in
the original XML.
- Examples adjusted and all functional again.
October 12th, 2006:
Web-Harvest 0.261 released.
- Minor bug that caused a ClassNotFound exception on the command line is fixed.
September 28th, 2006:
Web-Harvest 0.26 released.
September 22nd, 2006:
Web-Harvest 0.25 released.
- Support for HTTP proxy credentials added.
September 13th, 2006:
Web-Harvest 0.24 released.
- Relative redirection URLs bug fixed.
September 7th, 2006:
Web-Harvest 0.23 released.
- Support for HTTPS pages with self-signed certificates added.
September 6th, 2006:
Web-Harvest 0.22 released.
- Support for HTTP proxies added.
September 6th, 2006:
Web-Harvest 0.21 released.
- Circular redirection in the HTTP client is enabled.
September 4th, 2006:
Licence type changed to BSD.
September 1st, 2006:
Web-Harvest 0.2 released.