New Features in Web-Harvest 2.0

Collect log files, create ZIP archive and store it on FTP server
The new Web-Harvest 2.0 features used here include the FTP and ZIP processors:
<config>
    <ftp server="my.ftp.server" username="myname" password="mypassword">
        <ftp-put path="logs_${sys.date()}.zip">
            <zip>
                <loop item="logFileName" empty="true">
                    <list>
                        <file action="list" path="c:/logs/" listfilter="20??-??-??.log"/>
                    </list>
                    <body>
                        <zip-entry name="${sys.getFilename(logFileName.toString())}">
                            <file action="read" path="${logFileName}"/>
                        </zip-entry>
                    </body>
                </loop>
            </zip>
        </ftp-put>
    </ftp>
</config>

Upload each employee's info from database table to a web server
Employee records are read from a database table. For each record, the
employee's information, including an image, is submitted to a web server. The new
Web-Harvest 2.0 features used here are database access and file upload with
a multipart HTTP POST:
<config>
    <loop item="emp" empty="true">
        <list>
            <database connection="jdbc:mysql://myserver/mydb"
                      jdbcclass="com.mysql.jdbc.Driver"
                      username="myusername" password="mypassword">
                select firstname, lastname, email, image from employee
            </database>
        </list>
        <body>
            <http url="http://www.my.users/register.html" method="post" multipart="true">
                <http-param name="fname">
                    <template>${emp.get("firstname")}</template>
                </http-param>
                <http-param name="lname">
                    <template>${emp.get("lastname")}</template>
                </http-param>
                <http-param name="email">
                    <template>${emp.get("email")}</template>
                </http-param>
                <http-param name="pic" isfile="true">
                    <script return='emp.get("image").toBinary()'></script>
                </http-param>
            </http>
        </body>
    </loop>
</config>

Find well-rated films and send email with images and previews
A list of films is downloaded from tvguide.com and filtered to keep only the well-rated ones
(4 stars or more). For each of them, the review page is visited and the film photo and a short
text are extracted. All this information is composed into an HTML table and sent to a Gmail
account.
This example uses the new mail processor, including inline attachments:
<config>
    <mail from="tvguide@popularfilms.com" smtp-host="smtp.gmail.com" smtp-port="25"
          type="html" to="myaccount@gmail.com" username="myaccount" password="mypassword"
          security="tsl" subject="Best rated films from TV Guide">
        <loop item="link" index="index">
            <list>
                <xpath expression='//div[@class="toplist-w"]//tr[count(.//image[@class="stars" and ends-with(@src, "ColorStar.gif")]) >= 4]//a[1]'>
                    <html-to-xml>
                        <http url="http://www.tvguide.com/top-movies"/>
                    </html-to-xml>
                </xpath>
            </list>
            <body>
                <empty>
                    <var-def name="page">
                        <html-to-xml omitunknowntags="true">
                            <http url='${sys.xpath("//@href", link.toString())}'/>
                        </html-to-xml>
                    </var-def>
                    <var-def name="photourl">
                        <xpath expression='//div[@class="obj-review-pic"]//img[1]/@src'>
                            <var name="page"/>
                        </xpath>
                    </var-def>
                </empty>
                <template>
                    <![CDATA[ <h3> ${ index + ". " + sys.xpath("data(.)", link.toString()) } </h3> <div><table><tr><td> ]]>
                    <case>
                        <if condition='${!photourl.toString().equals("")}'>
                            <![CDATA[ <img src="]]>
                            <mail-attach inline="true">
                                <http url="${photourl}"/>
                            </mail-attach>
                            <![CDATA[ "> ]]>
                        </if>
                    </case>
                    <![CDATA[ </td> <td valign='top'>${ sys.xpath("//div[@class='obj-review-recap']/span[1]/text()", page.toString()) }</td> </tr></table></div> <hr> ]]>
                </template>
            </body>
        </loop>
    </mail>
</config>

Download Dilbert comics and store them to database

This example illustrates database inserts, including storing downloaded images in BLOB (Binary Large OBject) fields. For the specified number of strips, the page is downloaded, then the image URLs, vote counts and ratings are extracted with XPath and the data is inserted into the table.
<config>
    <var-def name="count" overwrite="false">50</var-def>
    <loop item="node" empty="true">
        <list>
            <xpath expression="//div[@class='STR_Strip_Full']">
                <html-to-xml>
                    <http url="http://www.dilbert.com/strips/?ViewType=Full&amp;PerPage=${count}"/>
                </html-to-xml>
            </xpath>
        </list>
        <body>
            <var-def name="imgUrl">
                <xpath expression="//div[@class='STR_Content']/a[1]/img[1]/@src"><var name="node"/></xpath>
            </var-def>
            <database connection="jdbc:mysql://myserver/mydb" jdbcclass="com.mysql.jdbc.Driver"
                      username="myuser" password="mypass">
                insert into dilbert (rating, votes, img) values (
                    <xpath expression="substring-before( substring-after(//div[@class='STR_Footer']//script[1]/text(), 'curvalue: '), '}' )"><var name="node"/></xpath>,
                    <xpath expression="data(//div[@class='STR_Metric STR_VoteCount'])"><var name="node"/></xpath>,
                    <db-param>
                        <http url='${sys.fullUrl("http://www.dilbert.com", imgUrl.toString())}'/>
                    </db-param>
                )
            </database>
        </body>
    </loop>
</config>
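
To check the download and XPath steps of the Dilbert configuration before involving a database, the insert can be swapped for a plain file write. The following is only a sketch under stated assumptions: it relies on the file processor accepting action="write" with type="binary", as described in the Web-Harvest manual for earlier versions, and the "strips" directory, the ".gif" extension, the loop index name "i" and the reduced count of 5 are hypothetical choices made for illustration.

```xml
<config>
    <!-- Assumption: a small count keeps the trial run short -->
    <var-def name="count" overwrite="false">5</var-def>
    <loop item="node" index="i" empty="true">
        <list>
            <xpath expression="//div[@class='STR_Strip_Full']">
                <html-to-xml>
                    <http url="http://www.dilbert.com/strips/?ViewType=Full&amp;PerPage=${count}"/>
                </html-to-xml>
            </xpath>
        </list>
        <body>
            <var-def name="imgUrl">
                <xpath expression="//div[@class='STR_Content']/a[1]/img[1]/@src"><var name="node"/></xpath>
            </var-def>
            <!-- Hypothetical target path; assumes the directory is writable
                 and that the downloaded images are GIF files -->
            <file action="write" type="binary" path="strips/${i}.gif">
                <http url='${sys.fullUrl("http://www.dilbert.com", imgUrl.toString())}'/>
            </file>
        </body>
    </loop>
</config>
```

If the files written this way open as images, the same http element can be placed back inside db-param with some confidence that the BLOB column will receive valid image data.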
Copyright © 2006-2013 by vnikic at users.sourceforge.net