<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Bandos&#039; Arcade &#187; Mashup</title>
	<atom:link href="http://www.nuwanbando.com/tag/mashup/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.nuwanbando.com</link>
	<description>&#34;It&#039;s not about how it is, but how I see it &#34; - Stranger Than Fiction</description>
	<lastBuildDate>Thu, 02 Feb 2012 08:52:48 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Web Scraping &amp; Parsing HTML to XML in Javascript</title>
		<link>http://www.nuwanbando.com/2010/04/web-scraping-parsing-html-to-xml-in-javascript/</link>
		<comments>http://www.nuwanbando.com/2010/04/web-scraping-parsing-html-to-xml-in-javascript/#comments</comments>
		<pubDate>Tue, 27 Apr 2010 19:55:50 +0000</pubDate>
		<dc:creator>Nuwan Bandara</dc:creator>
				<category><![CDATA[Google]]></category>
		<category><![CDATA[Google gadgets]]></category>
		<category><![CDATA[Javascript]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[SOA]]></category>
		<category><![CDATA[WSO2]]></category>
		<category><![CDATA[WSO2 Gadget Server]]></category>
		<category><![CDATA[WSO2 Mashup Server]]></category>
		<category><![CDATA[Mashup]]></category>

		<guid isPermaLink="false">http://www.nuwanbando.com/?p=412</guid>
		<description><![CDATA[Today I was working on a customer POC and created a few Google gadgets to visualize selected data sets from *.gov.uk sites. The scenario combined inter-gadget communication with content search over data.gov.uk sites. I created three simple gadgets that communicate with each other, and one acted as the controlling [...]]]></description>
			<content:encoded><![CDATA[
<p>Today I was working on a customer POC and created a few <a href="http://code.google.com/apis/gadgets/">Google gadgets</a> to visualize selected data sets from *.gov.uk sites. The scenario combined inter-gadget communication with content search over data.gov.uk sites. I created three simple gadgets that communicate with each other. One acted as the controlling gadget and pushed the search parameter to the other two gadgets. The two content gadgets showed UK (1) primary school information and (2) electoral information. The pushed parameter was the postal code of a given part of the UK. The <a href="http://www.direct.gov.uk/en/index.htm" target="_blank">direct.gov.uk</a> site has a form-based implementation of this search.</p>
<p><a href="http://www.nuwanbando.com/wp-content/uploads/2010/04/Screenshot.png"><img class="size-medium wp-image-413 alignleft" title="Screenshot" src="http://www.nuwanbando.com/wp-content/uploads/2010/04/Screenshot-241x300.png" alt="" width="241" height="300" /></a></p>
<p>The requirements for the POC were simple, and we already had working samples of <a href="http://wso2.org/library/articles/2010/03/wso2-gadget-server-inter-gadget-communication-pubsub" target="_blank">such a scenario</a> in the WSO2 library:</p>
<ol>
<li>Show how one gadget can pass context to other gadgets</li>
<li>Show how gadgets can harvest data in various formats (in my previous post I explained how to get data from RDF endpoints, which are also available on *.gov.uk sites)</li>
</ol>
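<p>As a rough sketch of the first requirement, assuming the gadgets declare the <code>pubsub</code> feature and run in a container that provides <code>gadgets.pubsub</code>, the controlling gadget can publish the postal code and each content gadget can subscribe to it. The channel name and the function names below are made up for illustration and are not taken from the POC code.</p>

```javascript
// Hypothetical channel name shared by the three gadgets.
var POSTCODE_CHANNEL = "postcode-search";

// Controlling gadget: broadcast the postal code the user entered.
function publishPostcode(postcode) {
  gadgets.pubsub.publish(POSTCODE_CHANNEL, postcode);
}

// Content gadget: run the supplied handler whenever a postal code arrives.
function listenForPostcode(onPostcode) {
  gadgets.pubsub.subscribe(POSTCODE_CHANNEL, function (sender, message) {
    // "sender" identifies the publishing gadget; only the message is needed here.
    onPostcode(message);
  });
}
```

<p>A content gadget would call <code>listenForPostcode</code> once when it loads, and the controlling gadget would call <code>publishPostcode</code> from its search form.</p>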
<p>The building block for the implementation was the search URL, which was quite straightforward. The direct.gov site serves every postal-code request in the same manner, and because of this the automation became trivial. For instance, the URL for retrieving primary school information was:</p>
<p><a href="http://local.direct.gov.uk/LDGRedirect/LocationSearch.do?LGSL=13&amp;searchtype=1&amp;LGIL=8&amp;Style=&amp;formsub=t&amp;text=SE1+7DU" target="_blank">http://local.direct.gov.uk/LDGRedirect/LocationSearch.do?LGSL=13&amp;searchtype=1&amp;LGIL=8&amp;Style=&amp;formsub=t&amp;text=<strong>SE1+7DU</strong></a></p>
<p>Here the &#8220;text&#8221; parameter changes according to the postal code. So far everything seemed straightforward. However, during implementation, while using the <a href="http://code.google.com/apis/gadgets/docs/dev_guide.html">Gadgets API</a> for content retrieval, I ran into problems parsing the text with JavaScript. <a href="http://code.google.com/apis/opensocial/docs/0.7/reference/gadgets.io.html#makeRequest" target="_blank">gadgets.io.makeRequest</a> supports HTML only as text, so the API method returned the retrieved HTML document as a string, which made it very hard to process.</p>
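<p>Since only the &#8220;text&#8221; parameter varies, the search URL can be assembled from a postal code with a small helper. This is a minimal sketch: <code>buildSearchUrl</code> is a hypothetical name, and the fixed parameters are copied from the URL shown above.</p>

```javascript
// Base URL and fixed query parameters copied from the search URL above.
var SEARCH_BASE = "http://local.direct.gov.uk/LDGRedirect/LocationSearch.do";
var FIXED_PARAMS = "LGSL=13&searchtype=1&LGIL=8&Style=&formsub=t";

// Build the primary school search URL for a postal code such as "SE1 7DU".
// buildSearchUrl is a hypothetical helper, not part of the Gadgets API.
function buildSearchUrl(postcode) {
  // Collapse runs of whitespace so that a single space separates the two halves.
  var normalized = postcode.trim().replace(/\s+/g, " ");
  // The URL above writes the space as "+", so swap the "%20" escape for it.
  var encoded = encodeURIComponent(normalized).replace(/%20/g, "+");
  return SEARCH_BASE + "?" + FIXED_PARAMS + "&text=" + encoded;
}
```

<p>Calling <code>buildSearchUrl("SE1 7DU")</code> reproduces the example URL above.</p>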
<p>With some thinking and advice, I brought the <a href="http://wso2.com/products/mashup-server/">Mashup Server</a> into the picture and used it to retrieve the data from the gov site and return the result in XML format. With the Mashup Server, web scraping is a piece of cake. We created a simple mashup using the scraper host object and captured the result set from the search result page. The mashup code is as follows:</p>
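<pre class="js" name="code">
// Scrape the search result page at searchUrl and return the result links as XML.
function search(searchUrl) {
	// Scraper configuration written as an E4X XML literal.
	// {searchUrl} is filled in by E4X; ${url} refers to the var-def named "url".
	var scraper = new Scraper(
		<config>
			<var-def name="url">{searchUrl}</var-def>
			<var-def name="response">
				<xpath expression="//div[@id='bodyContent']//ul[@class='resultsList']/li/a">
					<html-to-xml>
						<http method='get' url='${url}'/>
					</html-to-xml>
				</xpath>
			</var-def>
		</config>
	);
	// "response" holds the anchor elements matched by the XPath expression.
	return new XMLList(scraper.response);
}
</pre>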
<p>Finally, the two gadgets made service calls to the mashup service and retrieved the data as an XML object, which made the data processing painless. The final version on the Gadget Server looked quite appealing.</p>
<div id="attachment_427" class="wp-caption aligncenter" style="width: 829px"><a href="http://www.nuwanbando.com/wp-content/uploads/2010/04/gs.png"><img class="size-large wp-image-427 " title="gs" src="http://www.nuwanbando.com/wp-content/uploads/2010/04/gs-1024x509.png" alt="WSO2 Gadget Server with UK gov data" width="819" height="407" /></a><p class="wp-caption-text">Gadget Server look - in the end</p></div>
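<p>On the gadget side, the call to the mashup service might look roughly like the sketch below. It assumes the service response is XML containing the scraped <code>a</code> elements; the function names and the service URL passed in are hypothetical. Requesting <code>gadgets.io.ContentType.DOM</code> asks the container to parse the response as XML instead of handing back a string.</p>

```javascript
// Extract { href, text } pairs from the anchor elements of a DOM document.
function extractLinks(dom) {
  var anchors = dom.getElementsByTagName("a");
  var links = [];
  for (var i = 0; i < anchors.length; i++) {
    links.push({
      href: anchors[i].getAttribute("href"),
      text: anchors[i].textContent
    });
  }
  return links;
}

// Fetch the mashup response as a DOM document and pass the links to onLinks.
// serviceUrl is the (hypothetical) address of the deployed mashup operation.
function fetchResults(serviceUrl, onLinks) {
  var params = {};
  params[gadgets.io.RequestParameters.CONTENT_TYPE] = gadgets.io.ContentType.DOM;
  gadgets.io.makeRequest(serviceUrl, function (response) {
    // response.data is the parsed document when DOM content was requested.
    onLinks(response.data ? extractLinks(response.data) : []);
  }, params);
}
```

<p>Each content gadget can then render the returned links however it likes, without any string parsing of HTML.</p>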
<p>Special thanks go to <a href="http://ruchirawageesha.blogspot.com/">Ruchira</a> for helping me out with the mashup service <img src='http://www.nuwanbando.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  You can <a href="http://www.nuwanbando.com/wp-content/uploads/2010/04/wso2gs-samples.zip">download</a> the gadget code and the mashup service and try the scenario yourself.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.nuwanbando.com/2010/04/web-scraping-parsing-html-to-xml-in-javascript/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Mashing up RDF data with WSO2 Mashup Server</title>
		<link>http://www.nuwanbando.com/2010/04/mashing-up-rdf-data-with-wso2-mashup-server/</link>
		<comments>http://www.nuwanbando.com/2010/04/mashing-up-rdf-data-with-wso2-mashup-server/#comments</comments>
		<pubDate>Tue, 13 Apr 2010 20:27:21 +0000</pubDate>
		<dc:creator>Nuwan Bandara</dc:creator>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[RDF]]></category>
		<category><![CDATA[Semantic web]]></category>
		<category><![CDATA[SOA]]></category>
		<category><![CDATA[SPARQL]]></category>
		<category><![CDATA[WSO2]]></category>
		<category><![CDATA[WSO2 Gadget Server]]></category>
		<category><![CDATA[Mashup]]></category>
		<category><![CDATA[web services]]></category>
		<category><![CDATA[WSO2 Mashup Server]]></category>

		<guid isPermaLink="false">http://www.nuwanbando.com/?p=341</guid>
		<description><![CDATA[Okay, so this is the fun part that I promised to write about. I managed to cook up a use case to demonstrate RDF querying and making use of semantic data. The data that I am querying comes from the RDF data sources available on the UK data.gov site. With some analysis I [...]]]></description>
			<content:encoded><![CDATA[
<p>Okay, so this is the fun part that I promised to write about <img src='http://www.nuwanbando.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /> . I managed to cook up a use case to demonstrate RDF querying and making use of semantic data. The data that I am querying comes from the RDF data sources available on the UK data.gov site. With some analysis I figured out that this task can be achieved using a combination of mashup and gadget technologies. My tools of choice were <a href="http://wso2.com/products/mashup-server/">WSO2 Mashup Server</a> and <a href="http://wso2.com/products/gadget-server/">WSO2 Gadget Server</a> for their great flexibility and, of course, for other obvious reasons <img src='http://www.nuwanbando.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /> . However, the Mashup Server does not natively support RDF data retrieval, so I had to do some work to integrate that functionality. The great thing about the Mashup Server is its extensibility: the <a href="http://en.wikipedia.org/wiki/WSO2_Mashup_Server">concept of host objects</a>, the ability to write custom host objects, and its pluggable nature come in handy in such cases. The high-level architecture of what I am trying to achieve is as follows.</p>
<div id="attachment_348" class="wp-caption aligncenter" style="width: 850px"><a href="http://www.nuwanbando.com/wp-content/uploads/2010/04/rdf.png"><img class="size-full wp-image-348" style="background-color: #ffffff;" title="rdf" src="http://www.nuwanbando.com/wp-content/uploads/2010/04/rdf.png" alt="" width="840" height="262" /></a><p class="wp-caption-text">RDF data retrieval with WSO2 Mashup Server / WSO2 Gadget Server</p></div>
<p style="text-align: left;">To implement the above architecture with the tools at hand, I created a <a href="http://wso2.org/library/tutorials/writing-custom-hostobjectttp://" target="_blank">custom host object</a> that can be plugged into the Mashup Server. When dealing with semantic web tasks and RDF data handling, HP&#8217;s <a href="http://jena.sourceforge.net/" target="_blank">Jena</a> Java library comes in handy. Using the <a href="http://openjena.org/ARQ/">Jena-ARQ</a> API (for <a href="http://en.wikipedia.org/wiki/SPARQL" target="_blank">SPARQL</a>), I got the host object working with a few lines of code.</p>
<p style="text-align: left;">
<pre name="code" class="java">.....
            // Build a Jena dataset from the given RDF data source URI
            Dataset dataSet = DatasetFactory.create(sparqlObject.rdfDataSource);
            // Create a new query from the given user query
            String queryString = sparqlObject.spaqrlQuery;
            Query query = QueryFactory.create(queryString);
            // Execute the SELECT query against the dataset
            QueryExecution qe = QueryExecutionFactory.create(query, dataSet);
            ResultSet results = qe.execSelect();
.....
           // Serialize the result set as XML
           resultString = ResultSetFormatter.asXMLString(results);
..... OR.....
           // Or write the result set as JSON
           ByteArrayOutputStream bos = new ByteArrayOutputStream();
           ResultSetFormatter.outputAsJSON(bos, results);
</pre>
<p style="text-align: left;">With the host object in place, the next task was to create a mashup to query RDF data from a given source (an endpoint or a data source). The JavaScript service (mashup) below serves this purpose: the consumer specifies the RDF endpoint or data source together with the SPARQL query and retrieves the data set in XML or JSON.</p>
<pre name="code" class="js">.....
function RdfDocQueryService(rdfDataSource, rdfQuery, resultType) {
   var sparqlObj = new SparqlHostObject();
   sparqlObj.rdfDataSource = rdfDataSource;
   sparqlObj.spaqrlQuery = rdfQuery;
   sparqlObj.resultType = resultType;
   return new XML(sparqlObj.getDataFromRdfSource());
}
</pre>
<p>Finally, to bind everything together, let's try querying some data. My example use case is to use the query from the <a href="http://blogs.talis.com/n2/archives/836" target="_blank">N2 blog</a> to retrieve traffic monitoring points on UK roads. The query to retrieve the data set is as follows:</p>
<pre name="code" class="sql">#List the uri, latitude and longitude for road traffic monitoring points on the A4
PREFIX road:
PREFIX rdf:
PREFIX geo:
PREFIX wgs84:
PREFIX xsd:
SELECT ?point ?lat ?long WHERE {
  ?x a road:Road.
  ?x road:number "A4"^^xsd:NCName.
  ?x geo:point ?point.
  ?point wgs84:lat ?lat.
  ?point wgs84:long ?long.
}
</pre>
<p>To visualize these points I created a gadget with the aid of the Google Maps API. This gadget can be hosted in the Gadget Server, where it can dynamically retrieve the traffic monitoring points for each road in the UK and display them on the map as follows.</p>
<p style="text-align: center;">
<div id="attachment_362" class="wp-caption aligncenter" style="width: 841px"><a href="http://www.nuwanbando.com/wp-content/uploads/2010/04/WSO2-Gadget-Server_1271189245784.png"><img class="size-full wp-image-362" title="WSO2 Gadget Server_1271189245784" src="http://www.nuwanbando.com/wp-content/uploads/2010/04/WSO2-Gadget-Server_1271189245784.png" alt="" width="831" height="414" /></a><p class="wp-caption-text">Traffic monitoring points on the A4 road, UK</p></div>
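<p>Before the points can be plotted, the gadget has to turn the query result into coordinates. The sketch below assumes the mashup was asked for JSON and that the host object emits the standard SPARQL JSON results layout; <code>toMapPoints</code> and the sample data are made up for illustration.</p>

```javascript
// Convert a SPARQL JSON result set into plain { uri, lat, lng } objects.
// Expects the ?point, ?lat and ?long variables selected by the query above.
function toMapPoints(sparqlJson) {
  var bindings = (sparqlJson.results && sparqlJson.results.bindings) || [];
  var points = [];
  for (var i = 0; i < bindings.length; i++) {
    var row = bindings[i];
    // Skip rows that do not bind all three variables.
    if (!row.point || !row.lat || !row["long"]) { continue; }
    var lat = parseFloat(row.lat.value);
    var lng = parseFloat(row["long"].value);
    // Skip rows whose coordinates are not numeric.
    if (isNaN(lat) || isNaN(lng)) { continue; }
    points.push({ uri: row.point.value, lat: lat, lng: lng });
  }
  return points;
}

// Made-up sample in the SPARQL JSON results layout (not real traffic data).
var sample = {
  head: { vars: ["point", "lat", "long"] },
  results: { bindings: [
    { point: { type: "uri", value: "http://example.org/point/1" },
      lat: { type: "literal", value: "51.5" },
      "long": { type: "literal", value: "-0.25" } }
  ] }
};
var points = toMapPoints(sample);
```

<p>Each entry in <code>points</code> can then be turned into a marker on the map.</p>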
]]></content:encoded>
			<wfw:commentRss>http://www.nuwanbando.com/2010/04/mashing-up-rdf-data-with-wso2-mashup-server/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

