<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Ben&#039;s Blog</title>
	<atom:link href="http://www.benh.co.uk/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.benh.co.uk</link>
	<description>Real-time data and DataSift</description>
	<lastBuildDate>Fri, 19 Apr 2013 13:29:16 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Audience Profiles with DataSift and Splunk Storm</title>
		<link>http://www.benh.co.uk/datasift/audience-profiles-with-datasift-and-splunk-storm/</link>
		<comments>http://www.benh.co.uk/datasift/audience-profiles-with-datasift-and-splunk-storm/#comments</comments>
		<pubDate>Tue, 08 Jan 2013 17:05:22 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[DataSift]]></category>
		<category><![CDATA[datasift]]></category>
		<category><![CDATA[json]]></category>
		<category><![CDATA[splunk]]></category>
		<category><![CDATA[splunk storm]]></category>

		<guid isPermaLink="false">http://www.benh.co.uk/?p=781</guid>
		<description><![CDATA[Many DataSift customers use social data to gain a better insight in to their own customers and audience. A DataSift stream contains a huge amount of data and using simple aggregation, you can quickly build up a detailed picture of what people are talking about and what communication channels and devices they are using. This [...]]]></description>
			<content:encoded><![CDATA[<p>Many DataSift customers use social data to gain better insight into their own customers and audience. A DataSift stream contains a huge amount of data and, using simple aggregation, you can quickly build up a detailed picture of what people are talking about and which communication channels and devices they are using. This data can be incredibly helpful when deciding where to invest advertising spend, which devices and applications to support, and for general CRM.</p>
<p>In previous examples, I have used a database to aggregate each data property and provide a simple dashboard. This process is effective, but time-consuming to develop and repeat. The screenshot below is an example of such a dashboard:</p>
<p><span id="more-781"></span></p>
<p><img src="http://www.benh.co.uk/wp-content/uploads/2013/01/aggregate-dashboard.png" alt="Aggregate Dashboard" /></p>
<p>Recently DataSift launched a Push connector to Splunk Storm, allowing the delivery of real-time DataSift data directly into Splunk for analysis. Using this combination, it becomes very simple to build aggregate dashboards and, given that the <a href="http://dev.datasift.com/docs/push">Push delivery</a> system supports both real-time and historic deliveries, any DataSift data can be examined.</p>
<p>To create an Audience Profile dashboard, the following process can be followed:</p>
<ul>
<li>Generate a DataSift CSDL filter to collect your audience data</li>
<li>Push the data from DataSift to Splunk Storm</li>
<li>Create a Splunk Dashboard to visualise the data</li>
</ul>
<h2>DataSift Filter &#8211; Twitter Followers</h2>
<p>The first part of the process is to generate a filter that will collect the audience data. Of course this can be any CSDL you require, but for an unbiased audience example, a simple solution is just to collect all data from a specific user group. Using Twitter data for this is effective, and thanks to a handy script written by one of the DataSift Engineers (<a href="https://twitter.com/OllieParsley">@OllieParsley</a>), you can automatically generate a CSDL filter capturing the Twitter IDs of all followers of a specific account. For example, to collect the Twitter IDs of all 55k followers of Heineken (https://twitter.com/Heineken), you would run:</p>
<p><code>php fetch.php Heineken</code></p>
<p>The script will then build a CSDL statement using all of the Twitter IDs and return the stream hash ID. The script and instructions can be found on GitHub <a href="https://github.com/ollieparsley/twitter-follower-to-csdl">here</a>.</p>
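<p>To give a feel for what such a generated filter might look like, here is a minimal sketch in JavaScript. It is not the script linked above: the helper name <code>buildFollowerCsdl</code> is hypothetical, and the use of the <code>twitter.user.id</code> target with a bracketed <code>IN</code> list is an assumption about CSDL syntax that should be checked against the DataSift documentation.</p>

```javascript
// Hypothetical helper: turns a list of numeric Twitter user IDs into one CSDL filter.
// The twitter.user.id target and the bracketed IN list are assumptions about CSDL.
function buildFollowerCsdl(followerIds) {
  const ids = followerIds
    .map(String)
    .filter((id) => /^\d+$/.test(id)); // keep only purely numeric IDs
  if (ids.length === 0) {
    throw new Error('At least one numeric Twitter ID is required');
  }
  return 'twitter.user.id IN [' + ids.join(', ') + ']';
}

console.log(buildFollowerCsdl([12345, 67890]));
// prints: twitter.user.id IN [12345, 67890]
```

<p>The resulting string would then be compiled through the DataSift API to obtain the stream hash, which is the step the linked script handles for you.</p>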
<h2>DataSift to Splunk Storm</h2>
<p>Pushing the data from DataSift to Splunk Storm is simple, as a direct integration between the platforms is available via DataSift&#8217;s Push delivery, where Splunk Storm is listed as a <a href="http://datasift.com/destination">push destination</a>. There are full configuration instructions <a href="http://dev.datasift.com/docs/push/connectors/splunk-storm">here</a>, but the basic process is as follows:</p>
<ul>
<li>Create a new Splunk Storm project. You need a Splunk Storm account.</li>
<li>Set data delivery format to JSON (pre-defined timestamps).</li>
<li>Set the Input for your new project to Network Data. </li>
<li>Authorize DataSift&#8217;s IP address for data delivery. The easiest way is to use the automatic setup wizard.</li>
</ul>
<p>Once configured, you should be able to start the push delivery for a real-time stream or historic job, and see the data appear within Splunk in real-time.</p>
<h2>Splunk Storm Dashboard</h2>
<p>The final part of the process is to interrogate the DataSift output data, select some useful properties, and build some simple visualisations for the dashboard. For an Audience Profiles dashboard, I decided to use the following data items from the output data:</p>
<ul>
<li>Domains &#8211; Using DataSift&#8217;s <a href="http://datasift.com/source/21/links">links augmentation</a>, list the top domains being shared to indicate which websites are popular.</li>
<li>Topics &#8211; The DataSift Salience engine provides high-level <a href="http://datasift.com/source/25/salience-topics">topics</a> extraction, which is useful for gaining an overall picture of topics of interest.</li>
<li>Hash tags &#8211; Extract the top hashtags that have been mentioned within the Tweet content.</li>
<li>Klout Topics &#8211; Using the <a href="http://datasift.com/source/24/klout-topics">Klout Topics</a> augmentation, list the topics in which authors have influence.</li>
<li>Salience Entities &#8211; Using the <a href="http://datasift.com/source/19/salience-entities">Salience Engine</a>, DataSift will perform low-level entity extraction, listing people, companies, places, products, names etc.</li>
<li>Source &#8211; The application that was used to generate and send a post e.g. Twitter for iPhone.</li>
</ul>
<p>Splunk Storm uses a combination of &#8220;searches&#8221; to create a dashboard. A search is a predefined request of the data, so a search must be created for each of the above items. The search syntax is very rich and this is the core of the Splunk platform.</p>
<p>
The first part of creating a search for DataSift data is to be able to parse JSON. I am certainly no expert with Splunk, and found that the solution is to pipe the data to <code>spath</code>, which extracts values from structured data (XML or JSON). Click on the &#8220;Search&#8221; link in the top menu and enter the following in the main search bar:</p>
<p><code>* | spath</code></p>
<p>Once the above is entered, Splunk will display a full list of the available JSON indexes within the DataSift data on the left-hand side of the page:
</p>
<p><img src="http://www.benh.co.uk/wp-content/uploads/2013/01/splunk-1.png" alt="JSON data with SPATH" /></p>
<p>To look at a specific index, click the small icon that appears when you hover over the required index. From here you will see a summary of the data, and options to generate charts:</p>
<p><img src="http://www.benh.co.uk/wp-content/uploads/2013/01/splunk-2.png" alt="View a specific data item" /></p>
<p>Selecting the &#8220;Top values Overall&#8221; from under the &#8220;Chart&#8221; header on the fly-out menu shown above will then update your search syntax and render the chart:</p>
<p><img src="http://www.benh.co.uk/wp-content/uploads/2013/01/splunk-3.png" alt="Volume Chart" /></p>
<p>It&#8217;s then simply a case of saving the search and repeating the process for each of the required data points.</p>
<h2>Splunk Search Syntax</h2>
<p>Some of the graphs such as domains and hash tags require more advanced usage of Splunk&#8217;s search syntax. As noted previously, the capabilities are very rich and offer a huge amount of flexibility. If you have a specific requirement in mind, I would suggest searching <a href="http://splunk-base.splunk.com/">SplunkBase</a> for answers.</p>
<p>As a helper, here is a full list of the search syntax that I used for each of the visualisations. Please let me know if these can be improved upon, or if you have additional examples that would be helpful to others.</p>
<h4>Domains</h4>
<p><code>* | spath | rex field=data.links.url{} &quot;((?&lt;cs_uri_scheme&gt;[^:/?#]+):)?(//(?&lt;cs_uri_authority&gt;[^/?#]*))?(?&lt;cs_uri_stem&gt;[^?#|\s]*)(\?(?&lt;cs_uri_query&gt;[^#|^\s]*))?(#(?&lt;cs_uri_fragment&gt;.*[^\s]))?&quot; | top cs_uri_authority</code></p>
<h4>Salience Entities</h4>
<p><code>* | spath | top data.salience.content.entities{}.name</code></p>
<h4>Hash Tags</h4>
<p><code>* | spath | rex field=data.interaction.content &quot;(?&lt;hash_tags&gt;#(\w+))&quot; | top hash_tags</code></p>
<h4>Klout Topics</h4>
<p><code>* | spath | top data.klout.topics{}</code></p>
<h4>Salience Topics</h4>
<p><code>* | spath | top data.salience.content.topics{}.name</code></p>
<h4>Source</h4>
<p><code>* | spath | top data.interaction.source limit=&quot;100&quot;</code></p>
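<p>If it helps to see what the &#8220;extract, then top&#8221; pattern is doing, here is a rough JavaScript analogue of the hash tag search. It is only a sketch of the idea and not how Splunk works internally: the sample interactions are made up, the function names are hypothetical, and unlike this sketch a real search may only capture one match per event depending on its options.</p>

```javascript
// Sample interactions shaped like the field used in the hash tag search above
// (data.interaction.content); the tweets themselves are made up.
const sampleInteractions = [
  { interaction: { content: 'Cold #beer after the #match' } },
  { interaction: { content: 'Great #match tonight' } },
  { interaction: { content: 'No tags here' } },
];

// Pull every hashtag out of a piece of text (a rough analogue of the rex step).
function extractHashTags(text) {
  return text.match(/#\w+/g) || [];
}

// Count values and return the most frequent first (a rough analogue of top).
function topValues(values, limit = 10) {
  const counts = new Map();
  for (const value of values) {
    counts.set(value, (counts.get(value) || 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit)
    .map(([value, count]) => ({ value, count }));
}

const hashTags = sampleInteractions.flatMap((item) =>
  extractHashTags(item.interaction.content)
);
console.log(topValues(hashTags));
```

<p>The same two steps, extracting a value per interaction and then counting it, sit behind each of the searches listed above.</p>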
]]></content:encoded>
			<wfw:commentRss>http://www.benh.co.uk/datasift/audience-profiles-with-datasift-and-splunk-storm/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>DataSift Push API &#8211; PHP HTTP POST Receiver Examples</title>
		<link>http://www.benh.co.uk/datasift/datasift-push-api-php-http-post-receiver-examples/</link>
		<comments>http://www.benh.co.uk/datasift/datasift-push-api-php-http-post-receiver-examples/#comments</comments>
		<pubDate>Tue, 20 Nov 2012 11:43:21 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[API]]></category>
		<category><![CDATA[DataSift]]></category>
		<category><![CDATA[datasift]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[push api]]></category>

		<guid isPermaLink="false">http://www.benh.co.uk/?p=743</guid>
		<description><![CDATA[DataSift released a new Push API some time ago that offers direct delivery of data into one of several endpoints. So far, connectors for FTP, DynamoDB, S3 and HTTP POST have been released, and several more are on the way (MongoDB, CouchDB, Splunk etc). The Push delivery option supports some great [...]]]></description>
			<content:encoded><![CDATA[<p>DataSift released a new <a href="http://dev.datasift.com/docs/push/push-steps" target="_blank">Push API</a> some time ago that offers direct delivery of data into one of several endpoints. So far, connectors for FTP, DynamoDB, S3 and HTTP POST have been released, and several more are on the way (MongoDB, CouchDB, Splunk etc). The Push delivery option supports some great features, including the ability to set the payload size and delivery frequency, and the ability to pause a delivery. Each push delivery comes with a 1-hour buffer, which makes consuming the data a lot more flexible. The Push API can be used for both real-time and historic data consumption.</p>
<p><span id="more-743"></span></p>
<p>Below are two PHP examples to create a REST endpoint that I use regularly:</p>
<h3>Example 1 &#8211; PHP Slim Framework</h3>
<p>A simple example using the <a href="http://www.slimframework.com/" target="_blank">PHP Slim framework</a>. This is a good starting point for developing a service that processes the data in some way, perhaps pushing it into a database or a web-based front end.</p>
<h4><a href="https://github.com/haganbt/DataSift-HTTP-POST-Receiver" target="_blank">Download the code here.</a></h4>
<h3>Example 2 &#8211; High Volume Delivery PHP HTTP POST</h3>
<p>A lightweight example by <a href="http://www.alberton.info/" target="_blank">Lorenzo</a> that can be used for high throughput, e.g. 10Mb every 10 seconds. I have used this extensively without issue.</p>
<h4><a href="https://gist.github.com/4117429" target="_blank">View the Github Gist</a></h4>
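<p>For readers who would rather see the shape of a receiver before opening the links, here is a minimal sketch using Node&#8217;s built-in <code>http</code> module rather than PHP. It is not a port of either example above. The function names are hypothetical, and both the <code>interactions</code> array in the payload and the <code>{"success": true}</code> acknowledgement are assumptions about the HTTP POST connector that should be checked against the Push documentation.</p>

```javascript
const http = require('http');

// Parse one delivery body and build the reply. The "interactions" array and
// the {"success": true} acknowledgement are assumptions about the payload format.
function handlePushPayload(rawBody) {
  let payload;
  try {
    payload = JSON.parse(rawBody);
  } catch (err) {
    return { status: 400, body: { success: false, message: 'Invalid JSON' } };
  }
  const interactions =
    payload && Array.isArray(payload.interactions) ? payload.interactions : [];
  // Process the interactions here, e.g. write them to a database.
  return { status: 200, body: { success: true, received: interactions.length } };
}

// Build (but do not start) an HTTP server that accepts POST deliveries.
function createReceiver() {
  return http.createServer((req, res) => {
    if (req.method !== 'POST') {
      res.writeHead(405, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ success: false, message: 'POST only' }));
      return;
    }
    const chunks = [];
    req.on('data', (chunk) => chunks.push(chunk));
    req.on('end', () => {
      const result = handlePushPayload(Buffer.concat(chunks).toString('utf8'));
      res.writeHead(result.status, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(result.body));
    });
  });
}
```

<p>Calling <code>createReceiver().listen(8080)</code> would start the server on port 8080 (the port is arbitrary). Keeping the parsing in its own function makes it easy to test without any network traffic.</p>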
]]></content:encoded>
			<wfw:commentRss>http://www.benh.co.uk/datasift/datasift-push-api-php-http-post-receiver-examples/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Visualising a real-time DataSift feed with Node and D3.js</title>
		<link>http://www.benh.co.uk/datasift/visualising-a-datasift-feed-with-node-and-d3/</link>
		<comments>http://www.benh.co.uk/datasift/visualising-a-datasift-feed-with-node-and-d3/#comments</comments>
		<pubDate>Thu, 29 Mar 2012 17:42:58 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[D3]]></category>
		<category><![CDATA[DataSift]]></category>

		<guid isPermaLink="false">http://www.benh.co.uk/?p=682</guid>
		<description><![CDATA[Recently I have been playing with Mike Bostock&#8217;s D3.js and continue to be impressed with different examples of how people are visualising data. D3 provides an incredibly rich library, although with a relatively steep learning curve, the results can be fantastic. I decided to mix the real-time data source of DataSift data with a volume/time [...]]]></description>
			<content:encoded><![CDATA[<p>Recently I have been playing with Mike Bostock&#8217;s <a title="D3.js" href="http://mbostock.github.com/d3/" target="_blank">D3.js</a> and continue to be impressed with different examples of how people are visualising data. D3 is an incredibly rich library and, although it has a relatively steep learning curve, the results can be fantastic.</p>
<p>I decided to mix the real-time data source of DataSift data with a volume/time graph example from the D3 library. I added a simple UI and real-time rendering using <a href="http://nodejs.org/" target="_blank">Node.js</a>, <a href="http://socket.io/" target="_blank">Socket.io</a> and  <a title="Express" href="http://expressjs.com/" target="_blank">Express</a> to allow the user to enter their stream credentials. The finished example is as follows:</p>
<p><iframe src="http://www.youtube.com/embed/xoQL2kYFHBY?version=3&amp;wmode=transparent" width="525" height="319" title="YouTube video player" style="background-color:#000;display:block;margin-bottom:0;max-width:100%;" frameborder="0" allowfullscreen></iframe><p style="font-size:11px;margin-top:0;"><a href="http://www.youtube.com/watch?v=xoQL2kYFHBY" target="_blank" title="Watch on YouTube">Watch this video on YouTube</a>.</p><br />
<span id="more-682"></span></p>
<h1>Express Framework</h1>
<p>Express is a lightweight, Sinatra-inspired web development framework. Express offers features for routing and views (with templates), and comes with a built-in app generation tool for rapid setup. Here Express simply renders a landing page (using the <a href="http://foundation.zurb.com/" target="_blank">Foundation framework</a> for layout etc) that offers a stream subscription form. Once the DataSift API credentials have been entered, the form is submitted using jQuery and the stream subscription logic is executed.</p>
<h1>DataSift updates with Web Sockets</h1>
<p>The DataSift side of the code is very simple and has been abstracted away into a separate file (ds_stream.js). The DataSift <a href="https://github.com/datasift/NodeJS-Consumer" target="_blank">Node client lib</a> provides a great example of interacting with streams, and these methods are the basis of the integration. Here we use consumer.on to listen for the &#8220;interaction&#8221; event that is emitted when a DataSift interaction is received, and then use socket.io to push it to any connected clients:</p>
<pre>	consumer.on("interaction", function(obj) {

		if(obj.data !== undefined) {
			//console.log(obj.data);
			io.sockets.emit('data', {
				source : obj.data
			});
		}
	});</pre>
<h1>Rendering with D3.js</h1>
<p>The D3 code is client-side, sits in /public/javascripts/graph-ic.js, and is a modified version of an example written by Mike Bostock (not by myself!). I have added additional comments to the code, so hopefully it should be relatively easy to understand and modify.</p>
<p>The logic of the graph update is built around a single &#8220;count&#8221; variable that gets incremented when we receive a data update (when a DataSift interaction arrives) via socket.io. We also keep a running total for display purposes:</p>
<pre>socket.on('data', function(streamData) {
  $('#countTotal').html(countTotal++);

  if(streamData.source.interaction.id != undefined){
    ++count;
  }
});</pre>
<p>The rest of the code is basically split into two sections &#8211; the graph setup and layout, and then the &#8220;tick&#8221; function that handles the transitions:</p>
<pre>this.tick = (function() {

  // update the domains
  now = new Date();

  //  now - 0.75 seconds - (245) * 750
  x.domain([now - (n - 2) * duration, now - duration]);
  y.domain([0, d3.max(data)]);

  // push the accumulated count onto the back, and reset the count
  data.push(Math.min(100, count));

  count = 0;

  // redraw the line
  svg.select(".line")
  .attr("d", line)
  .attr("transform", null);

  // slide the x-axis left
  axis.transition()
  .duration(duration)
  .ease("linear")
  .call(x.axis);

  // slide the line left
  path.transition()
  .duration(duration)
  .ease("linear")
  .attr("transform", "translate(" + x(now - (n - 1) * duration) + ")")
  .each("end", tick);

  // Y Axis
  yaxsis.transition()
  .attr("class", "y axis")
  .ease("linear")
  .call(d3.svg.axis().scale(y).ticks(10).orient("left"));

  // pop the old data point off the front
  data.shift();

})</pre>
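<p>Stripped of the drawing code, the data handling in the tick function is a fixed-size sliding window: each tick appends the count gathered since the last tick (capped, as above) and drops the oldest value. A minimal sketch of just that part, with hypothetical names and no D3 involved:</p>

```javascript
// Fixed-size sliding window of per-tick counts (names here are illustrative).
function createWindow(size, cap) {
  const data = new Array(size).fill(0);
  let count = 0;

  return {
    // called once per incoming interaction
    record() {
      count += 1;
    },
    // called once per interval: append the capped count, drop the oldest
    tick() {
      data.push(Math.min(cap, count));
      count = 0;
      data.shift();
      return data.slice();
    },
  };
}

const win = createWindow(3, 100);
win.record();
win.record();
console.log(win.tick()); // [ 0, 0, 2 ]
win.record();
console.log(win.tick()); // [ 0, 2, 1 ]
```

<p>Because the array length never changes between ticks, the line can simply be redrawn and slid left by one step each interval.</p>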
<h1>Code and Setup</h1>
<p>Please note there is no error checking, exception handling or anything else! This is for demo purposes only.</p>
<ol>
<li>Download and extract the source from GitHub <a href="https://github.com/haganbt/Datasift-Interaction-Counter" target="_blank">here</a>.</li>
<li>Install Socket.io, Express and the DataSift client library using npm</li>
<li>Start app.js with node</li>
<li>Access http://localhost:8080</li>
<li>Enter your DataSift stream credentials</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://www.benh.co.uk/datasift/visualising-a-datasift-feed-with-node-and-d3/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Importing a DataSift CSV file in to MySQL</title>
		<link>http://www.benh.co.uk/datasift/import-datasift-csv-file-in-to-mysql/</link>
		<comments>http://www.benh.co.uk/datasift/import-datasift-csv-file-in-to-mysql/#comments</comments>
		<pubDate>Tue, 28 Feb 2012 16:15:12 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[DataSift]]></category>
		<category><![CDATA[csv]]></category>
		<category><![CDATA[datasift]]></category>
		<category><![CDATA[export]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[recordings]]></category>

		<guid isPermaLink="false">http://www.benh.co.uk/?p=665</guid>
		<description><![CDATA[DataSift data can be exported from both recordings and historics in either JSON or CSV formats. Depending upon your filter  definition, the fields within the resulting data will vary, therefore making importing the data in to a database such as MySQL a little time consuming. The below PHP script automatically generates the CREATE TABLE command, along with [...]]]></description>
			<content:encoded><![CDATA[<p>DataSift data can be exported from both <em>recordings</em> and <em>historics</em> in either JSON or CSV format. The fields within the resulting data vary depending upon your filter definition, which makes importing the data into a database such as MySQL a little time-consuming.</p>
<p>The PHP script below automatically generates the <code>CREATE TABLE</code> command, along with the <code>LOAD DATA INFILE</code> command, allowing you to create a new table and import the data quickly and easily.</p>
<p><span id="more-665"></span></p>
<p><strong>Usage</strong>: Just update the location and name of the CSV file at the top &#8211; <code>$myFile = &quot;test.csv&quot;;</code></p>
<p>Feel free to improve it, and let me know in the comments.</p>
<p><a href="https://gist.github.com/1933308" target="_blank">https://gist.github.com/1933308</a></p>
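<p>The idea behind the script can be sketched in a few lines. The version below is JavaScript rather than PHP and is not the gist itself: the function name is hypothetical, every column is typed as <code>TEXT</code>, and the header row is assumed to contain no quoted commas, all of which are simplifications for illustration.</p>

```javascript
// Turn a CSV header line into CREATE TABLE and LOAD DATA INFILE statements.
// Every column is typed as TEXT, and the header is assumed to contain no
// quoted commas; both are simplifications for illustration.
function buildImportSql(headerLine, tableName, csvPath) {
  const columns = headerLine
    .trim()
    .split(',')
    .map((name) => name.trim().replace(/^"|"$/g, ''))
    .map((name) => name.replace(/[^A-Za-z0-9_]/g, '_')); // e.g. interaction.id -> interaction_id

  const columnSql = columns.map((name) => '  `' + name + '` TEXT').join(',\n');
  const createTable = 'CREATE TABLE `' + tableName + '` (\n' + columnSql + '\n);';
  const loadData =
    "LOAD DATA INFILE '" + csvPath + "' INTO TABLE `" + tableName + '`\n' +
    "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'\n" +
    "LINES TERMINATED BY '\\n' IGNORE 1 LINES;";

  return { createTable, loadData };
}

const sql = buildImportSql('interaction.id,interaction.content', 'datasift', 'test.csv');
console.log(sql.createTable);
console.log(sql.loadData);
```

<p>Dots in DataSift field names are replaced with underscores so that they are safe to use as column names, and <code>IGNORE 1 LINES</code> skips the header row during the import.</p>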
<script type="text/javascript" src="https://gist.github.com/1933308.js"></script>
]]></content:encoded>
			<wfw:commentRss>http://www.benh.co.uk/datasift/import-datasift-csv-file-in-to-mysql/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>DataSift &#8211; Common Use Case CSDL</title>
		<link>http://www.benh.co.uk/datasift/datasift-common-use-case-csdl/</link>
		<comments>http://www.benh.co.uk/datasift/datasift-common-use-case-csdl/#comments</comments>
		<pubDate>Fri, 17 Feb 2012 14:31:23 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[CSDL]]></category>
		<category><![CDATA[DataSift]]></category>
		<category><![CDATA[csdl]]></category>
		<category><![CDATA[datasift]]></category>

		<guid isPermaLink="false">http://www.benh.co.uk/?p=596</guid>
		<description><![CDATA[Working at DataSift, I speak to many different clients and discuss their use cases regarding what type of data they would like to filter on, and include within their real-time data streams. I thought it would be useful to start capturing many of the common use cases in a single post and explain how these translate [...]]]></description>
			<content:encoded><![CDATA[Working at DataSift, I speak to many different clients and discuss their use cases regarding what type of data they would like to filter on and include within their real-time data streams. I thought it would be useful to start capturing many of the common use cases in a single post and explain how these translate into Curated Stream Definition Language (CSDL). If you are new to DataSift and/or CSDL, the <a href="http://dev.datasift.com/docs" target="_blank">DataSift documentation</a> is great and growing all the time, and this post should give some useful examples.

I will be adding to this post as and when I capture new examples. Please feel free to submit your own if you feel they will be beneficial to others.

<span id="more-596"></span>
<h1>Key Word Search</h1>
A great place to start is a simple keyword search. This CSDL searches within ALL content sources (e.g. Twitter, Facebook, Digg, etc.), looking for the word "social" or the hash tag "#social" in the content:
<pre><span style="color: #0000ff;">interaction</span>.<span style="color: #0000ff;">content</span> <span style="color: #ff0000;">CONTAINS</span> <span style="color: #339966;">"social"</span></pre>
To look at Twitter data only, you could use:
<pre><span style="color: #0000ff;">twitter.text</span> <span style="color: #ff0000;">CONTAINS</span> <span style="color: #339966;">"social"</span></pre>
To look for any of several keywords, you could use the CONTAINS_ANY operator, which takes a comma-separated list of string arguments (shown here against Twitter content):
<pre><span style="color: #0000ff;">twitter.text</span> <span style="color: #ff0000;">CONTAINS_ANY</span> <span style="color: #339966;">"social,media,monitoring"</span></pre>
If all keywords must be present, use the "AND" operator:
<pre><span style="color: #0000ff;">twitter.text</span> <span style="color: #ff0000;">CONTAINS</span> <span style="color: #339966;">"social"</span> <span style="color: #ff0000;">AND</span> <span style="color: #0000ff;">twitter.text</span> <span style="color: #ff0000;">CONTAINS</span> <span style="color: #339966;">"media"</span></pre>
<h1>Twitter Users</h1>
It is quite common to be interested in a specific Twitter user or group of users. You can capture all tweets from these users based on their Twitter name. For a single user, use the "==" operator:
<pre><span style="color: #0000ff;">twitter.user.screen_name</span> <span style="color: #ff0000;">==</span> <span style="color: #339966;">"ladygaga"</span></pre>
Or for a set of users, use the "IN" operator:
<pre><span style="color: #0000ff;">twitter.user.screen_name</span> <span style="color: #ff0000;">IN</span> <span style="color: #339966;">"name1, name2, name3"</span></pre>
To track Twitter mentions of a user or group of users:
<pre><span style="color: #0000ff;">twitter.mentions</span> <span style="color: #ff0000;">IN</span> <span style="color: #008000;">"pepsi, ladygaga"</span></pre>
Another powerful filter is the ability to search a Twitter user's profile description for keywords. For example, to look for a specific word within their profile:
<pre><span style="color: #0000ff;">twitter.user.description</span> <span style="color: #ff0000;">CONTAINS</span> <span style="color: #339966;">"teacher"</span></pre>
Or to search for a specific string within a user profile description (rather than a word surrounded by spaces):
<pre><span style="color: #0000ff;">twitter.user.description</span> <span style="color: #ff0000;">SUBSTR</span> <span style="color: #339966;">"linkedin.com"</span> <span style="color: #999999;">// will match linkedin.com...anything...</span></pre>
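The difference between matching a word and matching a substring is easy to see with a small JavaScript sketch. These two functions only approximate the behaviour described above and are not DataSift's implementation: the function names are hypothetical, and both the tokenisation rule and the case-insensitive comparison are assumptions.

```javascript
// Rough approximations of the two operators, for illustration only;
// the real CSDL tokenisation and case rules may differ.

// Word match: split the text into tokens and look for an exact token.
function containsWord(text, keyword) {
  const words = text.toLowerCase().split(/[^a-z0-9_#@]+/);
  return words.includes(keyword.toLowerCase());
}

// Substring match: the characters may appear anywhere, even inside a word.
function containsSubstring(text, fragment) {
  return text.toLowerCase().includes(fragment.toLowerCase());
}

const description = 'Teacher. Find me at uk.linkedin.com/in/example';
console.log(containsWord(description, 'teacher'));           // true
console.log(containsWord(description, 'linkedin.com'));      // false
console.log(containsSubstring(description, 'linkedin.com')); // true
```

A word match is the better choice for keywords, while a substring match suits fragments such as domains that are embedded in longer strings.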
<h1>URLs and Domains</h1>
Another common use case is to track specific domains. DataSift provides full link resolution so that you can filter in real-time.

To filter for any interaction content that contains a link to google.com or bbc.co.uk:
<pre><span style="color: #0000ff;">links.domain</span> <span style="color: #ff0000;">IN</span> <span style="color: #339966;">"google.com, bbc.co.uk"</span></pre>
Tracking specific keywords within a URL is also very useful. Perhaps a URL has been created for a specific campaign or product launch. The SUBSTR (substring) operator matches an exact sequence of characters:

<span style="color: #808080;">Example URL: http://domain.com/testing/kindle?campaignid=123</span>

We could filter on any part of the URL using SUBSTR:
<pre><span style="color: #0000ff;">links.url</span> <span style="color: #ff0000;">SUBSTR</span> <span style="color: #339966;">"kindle"</span></pre>
Likewise, the URL parameters can be filtered on in exactly the same way:
<pre><span style="color: #0000ff;">links.url</span> <span style="color: #ff0000;">SUBSTR</span> <span style="color: #339966;">"campaignid=123"</span></pre>

And we can of course combine both links.domain and links.url. Here we look for all interactions that contain links to amazon.com or amazon.co.uk and have the string "kindle" as part of the URL:
<pre><span style="color: #0000ff;">links.domain</span> <span style="color: #ff0000;">IN</span> <span style="color: #339966;">"amazon.com, amazon.co.uk"</span>
<span style="color: #ff0000;">AND</span> <span style="color: #0000ff;">links.url</span> <span style="color: #ff0000;">SUBSTR</span> <span style="color: #339966;">"kindle"</span></pre>

In certain situations, you may wish to track all links that point to a specific website section that may be deeper in a URL hierarchy. For example, there may be a language parameter that could vary. In this instance, a simple regex is useful:

<script src="http://widget.datasift.com/embed?essence=haxryr" type="text/javascript"></script>
<br />
Likewise, you may wish to track all sub-domains for a specific domain:
<script src="http://widget.datasift.com/embed?essence=pqryiu" type="text/javascript"></script>
<br />

YouTube links are another good example, as these can take a number of different formats. A simple regex caters for all options:
<script src="http://widget.datasift.com/embed?essence=vffnlb" type="text/javascript"></script>
<br /><br />

<h1>Specific Data Sources</h1>
One of the biggest benefits of the DataSift platform is that you have access to all of the data sources from a single location and interface. It is really simple to include and exclude specific data sources. This can be done either from within your Data Sources page after login (if you would like to include or exclude data sources for ALL of your streams), or by CSDL as follows.
<br />
To monitor all sources except for Facebook:
<pre><span style="color: #0000ff;">interaction.type</span> <span style="color: #ff0000;">!=</span> <span style="color: #008000;">"facebook"</span></pre>
To monitor a set of specific sources only:
<pre><span style="color: #0000ff;">interaction.type</span> <span style="color: #ff0000;">IN</span> <span style="color: #008000;">"facebook,digg,myspace"</span></pre>
<h1>Tagging</h1>
The Tagging functionality allows you to stamp each interaction with additional metadata whenever a specific CSDL condition matches, all in real time. Here are some common examples:

<strong>Sentiment</strong>
<pre>tag <span style="color: #339966;">"Positive"</span>  { <span style="color: #0000ff;">salience.content.sentiment</span> <span style="color: #ff0000;">&gt;</span> <span style="color: #33cccc;">0</span> }
tag <span style="color: #339966;">"Neutral"</span>   { <span style="color: #0000ff;">salience.content.sentiment</span> <span style="color: #ff0000;">==</span> <span style="color: #33cccc;">0</span> }
tag <span style="color: #339966;">"Negative"</span>  { <span style="color: #0000ff;">salience.content.sentiment</span> <span style="color: #ff0000;">&lt;</span> <span style="color: #33cccc;">0</span> }
return {
<span style="color: #0000ff;">interaction.content</span> <span style="color: #ff0000;">CONTAINS_ANY</span> <span style="color: #339966;">"keyword1,keyword2"</span>
}</pre>
<strong>Gender</strong>
<pre>tag <span style="color: #339966;">"male"</span>      { <span style="color: #0000ff;">demographic.gender</span> <span style="color: #ff0000;">CONTAINS_ANY</span> <span style="color: #339966;">"male, mostly_male"</span> }
tag <span style="color: #339966;">"female"</span>    { <span style="color: #0000ff;">demographic.gender </span><span style="color: #ff0000;">CONTAINS_ANY</span> <span style="color: #339966;">"female, mostly_female"</span> }
return {
<span style="color: #0000ff;">interaction.content</span> <span style="color: #ff0000;">CONTAINS_ANY</span> <span style="color: #339966;">"keyword1,keyword2"</span>
}</pre>
<strong>Klout</strong>
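<pre>tag <span style="color: #339966;">"Klout &lt;10"</span> { <span style="color: #0000ff;">klout.score</span> <span style="color: #ff0000;">&lt;</span> <span style="color: #33cccc;">10</span> }
tag <span style="color: #339966;">"Klout 10+"</span> { <span style="color: #0000ff;">klout.score</span> <span style="color: #ff0000;">&gt;=</span> <span style="color: #33cccc;">10</span> <span style="color: #ff0000;">AND</span> <span style="color: #0000ff;">klout.score</span> <span style="color: #ff0000;">&lt;</span> <span style="color: #33cccc;">20</span> }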
tag <span style="color: #339966;">"Klout 20+"</span> { <span style="color: #0000ff;">klout.score</span> <span style="color: #ff0000;">&gt;=</span> <span style="color: #33cccc;">20</span> <span style="color: #ff0000;">AND</span> <span style="color: #0000ff;">klout.score</span> <span style="color: #ff0000;">&lt;</span> <span style="color: #33cccc;">30</span> }
tag <span style="color: #339966;">"Klout 30+"</span> { <span style="color: #0000ff;">klout.score</span> <span style="color: #ff0000;">&gt;=</span> <span style="color: #33cccc;">30</span> <span style="color: #ff0000;">AND</span> <span style="color: #0000ff;">klout.score</span> <span style="color: #ff0000;">&lt;</span> <span style="color: #33cccc;">40</span> }
tag <span style="color: #339966;">"Klout 40+"</span> { <span style="color: #0000ff;">klout.score</span> <span style="color: #ff0000;">&gt;=</span> <span style="color: #33cccc;">40</span> <span style="color: #ff0000;">AND</span> <span style="color: #0000ff;">klout.score</span> <span style="color: #ff0000;">&lt;</span> <span style="color: #33cccc;">50</span> }
tag <span style="color: #339966;">"Klout 50+"</span> { <span style="color: #0000ff;">klout.score</span> <span style="color: #ff0000;">&gt;=</span> <span style="color: #33cccc;">50</span> <span style="color: #ff0000;">AND</span> <span style="color: #0000ff;">klout.score</span> <span style="color: #ff0000;">&lt;</span> <span style="color: #33cccc;">60</span> }
tag <span style="color: #339966;">"Klout 60+"</span> { <span style="color: #0000ff;">klout.score</span> <span style="color: #ff0000;">&gt;=</span> <span style="color: #33cccc;">60</span> <span style="color: #ff0000;">AND</span> <span style="color: #0000ff;">klout.score</span> <span style="color: #ff0000;">&lt;</span> <span style="color: #33cccc;">70</span> }
tag <span style="color: #339966;">"Klout 70+"</span> { <span style="color: #0000ff;">klout.score</span> <span style="color: #ff0000;">&gt;=</span> <span style="color: #33cccc;">70</span> }
return {
<span style="color: #0000ff;">interaction.content</span> <span style="color: #ff0000;">CONTAINS_ANY</span> <span style="color: #339966;">"keyword1,keyword2"</span>
}</pre>
<strong>Miscellaneous</strong>

Interaction contains a specific string within a URL:
<pre>tag <span style="color: #339966;">"Campaign"</span> { <span style="color: #0000ff;">links.url</span> <span style="color: #ff0000;">SUBSTR</span> <span style="color: #339966;">"2012-campaign"</span> }</pre>
The source is within a 100 km radius of London (and geo is enabled):
<pre>tag <span style="color: #339966;">"London"</span> { <span style="color: #0000ff;">interaction.geo</span> <span style="color: #ff0000;">GEO_RADIUS</span> <span style="color: #339966;">"51.52269412781852,-0.13432091250001577:100"</span> }</pre>
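As a rough local check of what a radius filter like this selects, the sketch below computes the great-circle (haversine) distance from the same centre point. It assumes a spherical Earth with a mean radius of 6371 km and a radius given in kilometres. The function names are hypothetical, and this is not a description of how DataSift evaluates GEO_RADIUS internally.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation

# Centre point taken from the London tag above (latitude, longitude).
LONDON = (51.52269412781852, -0.13432091250001577)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(lat, lon, centre=LONDON, radius_km=100.0):
    """True if the point lies inside the circle around the centre."""
    return haversine_km(lat, lon, centre[0], centre[1]) <= radius_km
```

With these assumptions, a point in Cambridge (roughly 52.2053, 0.1218) falls inside the 100 km circle, while one in Bristol (roughly 51.4545, -2.5879) does not.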
Look at the Twitter user's profile description for a specific keyword:
<pre>tag <span style="color: #339966;">"Fashion"</span> { <span style="color: #0000ff;">twitter.user.description</span> <span style="color: #ff0000;">CONTAINS</span> <span style="color: #339966;">"fashion"</span> }</pre>
<h1>Subject Experts, Spam and Interaction Quality</h1>
There are several filters (AKA "targets") that can be used to increase the likelihood that you receive high quality results either from subject experts or based on popular content depending upon your needs.

One of the simplest methods is to utilise the Klout integration and look for users who have a Klout score above a specified level:
<pre><span style="color: #0000ff;">klout.score</span> <span style="color: #ff0000;">&gt;</span> <span style="color: #33cccc;">30</span></pre>
When looking to avoid spam, it is interesting to look at the number of users following the author. This can be done with:
<pre><span style="color: #0000ff;">twitter.user.followers_count</span> <span style="color: #ff0000;">&gt;</span> <span style="color: #33cccc;">500</span></pre>
When observing links included within content, it is simple to filter for links that have been retweeted more than a given number of times:
<pre><span style="color: #0000ff;">links.retweet_count</span> <span style="color: #ff0000;">&gt;</span> <span style="color: #33cccc;">200</span></pre>
The <a href="http://dev.datasift.com/docs/getting-started/data/salience-topics-info">Salience Topics</a> augmentation can also assist with easily extracting interactions for a specific topic:
<pre><span style="color: #0000ff;">salience.content.topics</span> <span style="color: #ff0000;">==</span> <span style="color: #339966;">"Social Media"</span></pre>
Looking at the user's Twitter profile is also a useful method for helping select subject matter experts:
<pre><span style="color: #0000ff;">twitter.user.description</span> <span style="color: #ff0000;">contains</span> <span style="color: #008000;">"social media"</span></pre>
<h1>Trends</h1>
Velocity of diffusion: tracking the rate at which new links (less than an hour old) reach retweet-count milestones. It may be preferable to remove the first milestone (a count of 1, i.e. the first occurrence of a link), as including it increases traffic significantly, with the majority of links never reaching the 10 or 100 count. With the first count removed, it is still possible to plot when the link was first seen using the link.created_at field embedded within the link data.
<pre>tag "<span style="color: #008000;">1</span>" { <span style="color: #0000ff;">links.retweet_count</span> <span style="color: #ff0000;">==</span> 1 }
tag "<span style="color: #008000;">10</span>" { <span style="color: #0000ff;">links.retweet_count</span> <span style="color: #ff0000;">==</span> 10 }
tag "<span style="color: #008000;">100</span>" { <span style="color: #0000ff;">links.retweet_count</span> <span style="color: #ff0000;">==</span> 100 }
tag "<span style="color: #008000;">1000</span>" { <span style="color: #0000ff;">links.retweet_count</span> <span style="color: #ff0000;">==</span> 1000 }
tag "<span style="color: #008000;">10000</span>" { <span style="color: #0000ff;">links.retweet_count</span> <span style="color: #ff0000;">==</span> 10000 }

return {
  <span style="color: #0000ff;">links.age</span> <span style="color: #ff0000;">&lt;</span> 3600
  <span style="color: #ff0000;">and</span> <span style="color: #0000ff;">links.retweet_count</span> <span style="color: #ff0000;">in</span> [1, 10,100,1000,10000]
}</pre>
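To turn the tagged milestone interactions into a velocity measure, one option is to record, per link, the first time each milestone was seen and subtract the time the link was created. The sketch below assumes each received interaction has already been reduced to a hypothetical (url, milestone, seen_at, created_at) tuple with Unix timestamps. That shape is an illustration only, not the DataSift output format.

```python
from collections import defaultdict


def seconds_to_milestones(events):
    """Map url -> {milestone: seconds from link creation to first sighting}.

    events: iterable of (url, milestone, seen_at, created_at) tuples,
    with timestamps as Unix seconds (a hypothetical, pre-parsed shape).
    """
    result = defaultdict(dict)
    for url, milestone, seen_at, created_at in events:
        elapsed = seen_at - created_at
        previous = result[url].get(milestone)
        # Keep the earliest sighting of each milestone for each link.
        if previous is None or elapsed < previous:
            result[url][milestone] = elapsed
    return dict(result)


# Made-up sample data: one link reaching the 10 and 100 milestones.
events = [
    ("http://example.com/a", 10, 1_000_300, 1_000_000),
    ("http://example.com/a", 100, 1_001_500, 1_000_000),
    ("http://example.com/a", 10, 1_000_450, 1_000_000),  # later duplicate
]
velocity = seconds_to_milestones(events)
```

Plotting the elapsed seconds per milestone for each link gives a simple picture of how quickly it spread.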
<h1>Location</h1>
Limiting a filter by location can be achieved using several targets in combination with the GEO capabilities. Note that these may not be 100% reliable as these are user defined/auto-generated fields, and the user may not have GEO enabled.

At the time of writing, about 1% of Twitter users have GEO enabled. This does of course vary based on the demographic for the use case. Other useful targets include twitter.user.* (user profile information) and twitter.place.* (location data entered or generated at the time of a tweet).

<script src="http://widget.datasift.com/embed?essence=vjznmg" type="text/javascript"></script>

<br />
<script src="http://widget.datasift.com/embed?essence=dephgg" type="text/javascript"></script>
<br />
<h1>Twitter Source Segmentation</h1>
When looking at Twitter data specifically, it is useful to be able to segment the data between mobile, desktop, bot etc. This set of tags takes the top 20 sources:

<script src="http://widget.datasift.com/embed?essence=qvvlvm" type="text/javascript"></script>

<br />
<h1>Generic Examples</h1>
A generic example filtering for keywords, links, link titles and mentions:
<pre>tag "<span style="color: #008000;">Positive</span>"  { <span style="color: #0000ff;">salience.content.sentiment</span> <span style="color: #ff0000;">&gt;</span> <span style="color: #33cccc;">0</span> }
tag "<span style="color: #008000;">Neutral</span>"   { <span style="color: #0000ff;">salience.content.sentiment</span> <span style="color: #ff0000;">==</span> <span style="color: #33cccc;">0</span>}
tag "<span style="color: #008000;">Negative</span>"  { <span style="color: #0000ff;">salience.content.sentiment</span> <span style="color: #ff0000;">&lt;</span> <span style="color: #33cccc;">0</span> }
tag "<span style="color: #008000;">male</span>"      { <span style="color: #0000ff;">demographic.gender</span> <span style="color: #ff0000;">CONTAINS_ANY</span> "<span style="color: #008000;">male, mostly_male</span>" }
tag "<span style="color: #008000;">female</span>"    { <span style="color: #0000ff;">demographic.gender</span> <span style="color: #ff0000;">CONTAINS_ANY</span> "<span style="color: #008000;">female, mostly_female</span>" }
return {
  <span style="color: #808080;">// Keyword and hash tag in any sources e.g. twitter, facebook, digg etc</span>
  <span style="color: #0000ff;">interaction.content</span> <span style="color: #ff0000;">CONTAINS_ANY</span> "<span style="color: #008000;">hp, Hewlett Packard, Hewlett-Packard</span>"
  <span style="color: #808080;">// Interactions that contain a hp.com link</span>
  or <span style="color: #0000ff;">links.domain</span> == "<span style="color: #008000;">hp.com</span>"
  <span style="color: #808080;">// Interactions that contain a link that points to a page whose title</span>
  <span style="color: #808080;">// contains any of the words hp, Hewlett Packard or Hewlett-Packard</span>
  or <span style="color: #0000ff;">links.title contains_any</span> "<span style="color: #008000;">hp, Hewlett Packard, Hewlett-Packard</span>"
  <span style="color: #808080;">// Looks for any mentions of the HP twitter account</span>
  or <span style="color: #0000ff;">twitter.mentions</span> == "<span style="color: #008000;">HPUK</span>"
}</pre>
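Once interactions tagged by a filter like this are received, building a profile is a simple count of tags. The sketch below assumes each interaction is a parsed JSON object with its tags in a list under interaction.tags. That location and the sample records are assumptions for illustration and should be checked against the actual stream output.

```python
from collections import Counter


def count_tags(interactions):
    """Count how often each tag appears across a batch of interactions."""
    counts = Counter()
    for item in interactions:
        # Untagged interactions simply contribute nothing.
        tags = item.get("interaction", {}).get("tags", [])
        counts.update(tags)
    return counts


# Made-up sample records using the tag names from the filter above.
sample = [
    {"interaction": {"tags": ["Positive", "male"]}},
    {"interaction": {"tags": ["Negative", "female"]}},
    {"interaction": {"tags": ["Positive"]}},
    {"interaction": {}},
]
tag_counts = count_tags(sample)
```

The resulting counts can then be fed to whatever charting or reporting tool is in use.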
]]></content:encoded>
			<wfw:commentRss>http://www.benh.co.uk/datasift/datasift-common-use-case-csdl/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Web Quick Start Videos</title>
		<link>http://www.benh.co.uk/alfresco/web-quick-start-videos/</link>
		<comments>http://www.benh.co.uk/alfresco/web-quick-start-videos/#comments</comments>
		<pubDate>Fri, 26 Nov 2010 15:27:03 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Alfresco WCM (archive)]]></category>
		<category><![CDATA[Web Quick Start]]></category>
		<category><![CDATA[Alfresco WCM]]></category>
		<category><![CDATA[wcm]]></category>

		<guid isPermaLink="false">http://www.benh.co.uk/?p=569</guid>
		<description><![CDATA[The following are short (less than 10 minutes) introductory videos to features of the Alfresco Web Quick Start. I plan to record several covering topics such as Renditions, Asset Collections, User Generated Content etc.]]></description>
			<content:encoded><![CDATA[The following are short (less than 10 minutes) introductory videos to features of the Alfresco Web Quick Start. I plan to record several over the next few weeks covering topics such as installation, renditions, asset collections, user generated content, publishing etc.

<span id="more-569"></span>
<h1>1. Introduction and Installation</h1>
<p>An introduction to the Alfresco Web Quick Start and a demonstration of an install using the Alfresco Installer on a Windows platform.</p>

<iframe title="YouTube video player" class="youtube-player" type="text/html" width="853" height="510" src="http://www.youtube.com/embed/vfQrRzt4HIw?rel=0&amp;hd=1" frameborder="0"></iframe>]]></content:encoded>
			<wfw:commentRss>http://www.benh.co.uk/alfresco/web-quick-start-videos/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Web Quick Start – First Look</title>
		<link>http://www.benh.co.uk/alfresco/web-quick-start/web-quick-start-first-look/</link>
		<comments>http://www.benh.co.uk/alfresco/web-quick-start/web-quick-start-first-look/#comments</comments>
		<pubDate>Thu, 09 Sep 2010 14:13:31 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Web Quick Start]]></category>
		<category><![CDATA[Alfresco WCM]]></category>
		<category><![CDATA[cmis]]></category>
		<category><![CDATA[spring mvc]]></category>
		<category><![CDATA[spring surf]]></category>
		<category><![CDATA[wqs]]></category>

		<guid isPermaLink="false">http://www.benh.co.uk/?p=471</guid>
		<description><![CDATA[Comments now closed - please use the forum. Documentation - Documentation is in progress and forming here: Web Quick Start Developer Guide , Web Quick Start Install and Config Guide, Web Quick Start User Guide. UPDATE: Alfresco 3.4.a Community is now available including the Web Quick Start AMP files and web application. See alfresco-community-wcmqs-3.4.a.zip here [...]]]></description>
			<content:encoded><![CDATA[<p class="note"><strong>Comments now closed</strong> - please use the <a href="http://forums.alfresco.com/en/viewforum.php?f=52" target="_blank">forum</a>.</p>
<p class="note"><strong>Documentation</strong> - Documentation is in progress and forming here: <a href="http://wiki.alfresco.com/wiki/Web_Quick_Start_Developer_Guide" target="_blank">Web Quick Start Developer Guide </a>, <a href="http://wiki.alfresco.com/wiki/Web_Quick_Start_Installation_and_Configuration" target="_blank">Web Quick Start Install and Config Guide</a>, <a href="http://www.alfresco.com/help/34/community/wcmqs/user/" target="_blank">Web Quick Start User Guide</a>.</p>
<p class="note">UPDATE: Alfresco 3.4.a Community is now available, including the Web Quick Start AMP files and web application.  See alfresco-community-wcmqs-3.4.a.zip <a href="http://wiki.alfresco.com/wiki/Community_file_list_3.4.a" target="_blank">here</a>, or select the QS as an install option in the main Alfresco installers, e.g. alfresco-community-3.4.a-installer-win-x32.exe.</p>
We have just merged the Web Quick Start (QS) files to HEAD in preparation for the 3.4 Community release at the end of September.  Once complete, the QS will have its own installers and documentation; however, I thought I would provide a quick overview of some of the features that the QS provides for those who want to take an early look.
As with anything within the HEAD code line, it comes with a <strong>warning that this is work in progress, and is subject to regular change</strong>.
<h2>What is the Web QS?</h2>
The Web Quick Start is a sample application built on the Alfresco platform.  It provides an end-to-end WCM example including an authoring and publishing environment using Alfresco Share and a web application built using Spring MVC, Spring Surf and OpenCMIS.  The web site is delivered dynamically using Alfresco as a CMIS runtime.

The primary design goals of the QS are to illustrate the power of the Alfresco platform in an easy-to-install package and to provide developers with a strong starting point for their Alfresco implementations. Both of these goals are fundamentally aimed at getting both business people and developers up and running with the Alfresco WCM platform in as short a time as possible. The Alfresco core product has not been changed in any way, just extended by plugging in content model, behaviours, workflows, etc using the many standard hooks provided by the Alfresco product.

The QS website will eventually be available in three flavours for different vertical markets, however the version in HEAD currently is loosely modelled on a Finance news site, but with the intention that this can be re-purposed very easily.

<span id="more-471"></span>
<h2>Getting up and running</h2>
Once complete, Alfresco will provide core product documentation for the Quick Start covering Install and Configuration, User Guide and Developer Guide.  The QS will eventually run on both Alfresco 3.3 Enterprise and 3.4; however, as it currently stands within HEAD, only the 3.4 (HEAD) version is supported, so make sure you also build alfresco.war.

The QS comprises four artefacts – an AMP file for Alfresco, an AMP file for Share, an awe.war file for the Web Editor and a wcmqs.war file which is the Spring-based web application itself.  Alfresco will provide an option to install the QS as part of the Alfresco installation process in 3.4, and also provide a standalone installer to add the QS to a current 3.3 install.  Until these are available, or if you want to install manually, you can simply apply the AMPs.

Sync up with HEAD and build alfresco.war and share.war.  To build the QS artefacts you can use the following build targets:

Alfresco AMP:
<code>ant package-wcmquickstart</code>

Share AMP:
<code>ant package-wcmquickstart-share</code>

The above will build the appropriate AMP files and apply them to the alfresco.war and share.war.  You can then simply copy these into your Tomcat webapps location.

EDIT - Maven has now been removed from the web app build process so use ANT as follows:
<code>ant build-wcmquickstartwebapp-war</code>

<del>To build the web application, you must have Maven installed.  From root\modules\wcmquickstart\wcmquickstartwebapp run:</del>
<del><code>mvn -DskipTests install</code></del>

The web application build will create the website WAR (wcmqs.war), so copy this into your Tomcat webapps location.

Once deployed and started, log in to Share as normal (e.g. http://localhost:8080/share) and create a collaboration site, providing a site name and URL name, e.g. “Web Quick Start” and “wqs” respectively.  Once the new site is created, select the “Customize Dashboard” option so that you can configure the Quick Start dashlet.  Select “Add Dashlets” and you can then add the “WCM Quick Start” dashlet to your site dashboard:
<p style="text-align: center;"><a href="http://www.benh.co.uk/wp-content/uploads/2010/09/WCM-Quick-Start-1.png"><img class="size-full wp-image-479" title="WCM-Quick-Start-1" src="http://www.benh.co.uk/wp-content/uploads/2010/09/WCM-Quick-Start-1.png" alt="Quick Start Install" width="600" height="450" /></a>
<small>Add the Quick Start dashlet to the Share site dashboard</small></p>
The QS dashlet simply provides a means of importing the Quick Start site data into a standard Share Collaboration site.  Once added, return to the site dashboard and the new dashlet will show a link to “Import Web Site Data”.  Click this link and then wait for the success message:
<p style="text-align: center;"><a href="http://www.benh.co.uk/wp-content/uploads/2010/09/WCM-Quick-Start-2.png"><img class="aligncenter size-full wp-image-486" title="WCM-Quick-Start-2" src="http://www.benh.co.uk/wp-content/uploads/2010/09/WCM-Quick-Start-2.png" alt="Quick Start Install" width="600" height="450" /></a><small>Select the import link from the Quick Start dashlet</small></p>
Once the data has been successfully loaded, navigate to the Document Library where you will see the default site structure:

<a href="http://www.benh.co.uk/wp-content/uploads/2010/09/WCM-Quick-Start-3.png"><img class="aligncenter size-full wp-image-492" title="WCM-Quick-Start-3" src="http://www.benh.co.uk/wp-content/uploads/2010/09/WCM-Quick-Start-3.png" alt="Quick Start Site Structure" width="212" height="272" /></a>

You will notice that the site structure is separated into the “Quick Start Editorial” and “Quick Start Live” folders.  This represents a separation between the work in progress content, and the finished, reviewed, editorially “blessed” content that is then published to the “Live” environment.  More about the publishing mechanism later.

If your web container is running on port 8080 and the web application is running in the same container as Alfresco, the setup is now complete and you should be able to access the web site on http://localhost:8080/wcmqs.

If you are not running the web application on port 8080 or if the web application is deployed to a different container or host, you can easily configure the site accordingly.  Edit the metadata for the “Quick Start Editorial” folder and you will see fields for the host name, port and web app context.  Configure these to point to where your web application (wcmqs.war) is deployed.
<p style="text-align: center;"><a href="http://www.benh.co.uk/wp-content/uploads/2010/11/WCM-Quick-Start-4.png"><img class="size-full wp-image-501" title="WCM-Quick-Start-4" src="http://www.benh.co.uk/wp-content/uploads/2010/11/WCM-Quick-Start-4.png" alt="" width="524" height="568" /></a>
<small>Editorial Folder Meta Data</small></p>

<h2>The website</h2>
The website provides a set of features that demonstrate various aspects of using Alfresco’s WCM services.  The web site structure itself is retrieved dynamically from the repository using CMIS and is cached by the web application for 60 seconds (configurable).

The main sections of the site include a Home Page, News landing page and sub-sections, Publications landing page and sub-sections and Blog section, as well as search and contact pages.  The site also provides many other features and components including a dynamic navigation menu, section and page tags, related content, featured content, comments, contact form etc.
<p style="text-align: center;"><a href="http://www.benh.co.uk/wp-content/uploads/2010/11/WCM-Quick-Start-81.png"></a><a href="http://www.benh.co.uk/wp-content/uploads/2010/09/Alfresco-WCM-Quick-Start-Home_1285235175002.png"><img class="aligncenter size-full wp-image-544" title="Alfresco WCM Quick Start" src="http://www.benh.co.uk/wp-content/uploads/2010/09/Alfresco-WCM-Quick-Start-Home_1285235175002.png" alt="Alfresco WCM Quick Start" width="600" height="655" /></a><small>Quick Start Home Page</small></p>
The Publications section is designed to allow content editors to easily publish Office document (currently Microsoft, but possibly Open Office also) content and images.  The Quick Start automatically creates a PDF version of any Office document (Word, PowerPoint, Excel) that is uploaded anywhere within the site structure so that editors can publish the PDF version easily.  Make sure you have Open Office installed and configured to use the PDF rendition transformation.  The web site also provides a Flash-based document preview so that site visitors do not need to download the document just to read it:

<a href="http://www.benh.co.uk/wp-content/uploads/2010/11/WCM-Quick-Start-51.png"><img class="aligncenter size-full wp-image-505" title="WCM-Quick-Start-5" src="http://www.benh.co.uk/wp-content/uploads/2010/11/WCM-Quick-Start-51.png" alt="" width="600" height="575" /></a>

The Blog section provides a simple blogging mechanism with tagging and comments, posting the user generated content back into a Data List within Share.  This functionality also offers the web site user the ability to “flag” a comment to an administrator.

The Contact page allows a web site visitor to submit a contact form which in turn posts the content back to Alfresco.  This example triggers a simple workflow.

This is by no means a definitive list of the web site functionality, but it provides a high-level overview.
<h2>Editorial</h2>
The Quick Start web site is managed via Alfresco Share and the Alfresco Web Editor (for in-context editing).  As described previously, all editorial activity takes place in the “Quick Start Editorial” folder within the document library.

Folders are used within Share to define the site structure i.e. sections.  Within the Quick Start example, both content and site structure reside within the same location and this combined model is dynamically delivered.  This means that any content changes are updated immediately (barring any configured cache time) on the editorial site.  For example, if a new folder is created under the “root” folder using Share, upon browser refresh, the new section will be displayed within the top level navigation.  Note a folder has both a “Name” and “Title” field, the name being the URL and the title being the display label.

Creating a new folder within the QS site structure automatically creates two other items below it: a “collections” folder and an “index.html” file.  The index.html file is the asset used by the section’s landing page.  The “collections” folder is used to manage any “content collections” for that section.

<strong>Content Collections</strong> are simply (as the name suggests!) collections of content assets, grouped as the content editor sees fit.  For example, the home page on the QS site is made up of five different collections.  Navigating to Quick Start Editorial &gt; root &gt; collections shows the collections used on the Home Page.  Selecting “Edit Metadata” on any of them, will show the collection assets listed under the “Web Assets” field.  The “news.featured” collection shown below is used to power the Home Page banner slider.  You can see that 3 articles have been manually selected:
<p style="text-align: center;"><a href="http://www.benh.co.uk/wp-content/uploads/2010/11/WCM-Quick-Start-6.png"><img class="aligncenter size-full wp-image-507" title="WCM-Quick-Start-6" src="http://www.benh.co.uk/wp-content/uploads/2010/11/WCM-Quick-Start-6.png" alt="Content Collection" width="524" height="604" /></a>
<small>Web Assets on a Static Collection</small></p>
Content collections whose content has been manually selected by an editor are referred to as “<strong>Static Collections</strong>”.  Static collections are used in various places around the site where editorial control is required to select specific content assets.

You may also notice that there is a “Query” field within the same metadata dialog.  If populated, this then turns the collection into a “<strong>Dynamic Collection</strong>”, whereby the query is automatically run on a configured interval e.g. every 30 minutes.  An example is shown at Quick Start Editorial &gt; root &gt; collections &gt; blogs.latest collection.  This collection is used by the “Latest Blog Articles” component on the right hand side of the home page.  The default query is using CMIS (Lucene is also supported) and simply retrieves the latest five blog articles and orders them by date/time.  Dynamic collections allow portions of the website to be automatically updated without any editorial intervention.  The “Maximum Size” and “Minutes to query refresh” fields can be used to fine tune the dynamic collection as required.

The keyword “section” can also be used within native CMIS queries.  For example, to show all content items of type ws:article from the current section you could use “section:.” as follows:

<code>select d.* from cmis:document as d where in_tree(d, &#039;${section:.}&#039;) and d.cmis:objectTypeId=&#039;D:ws:article&#039; order by d.cmis:creationDate desc</code>

You can also reference an absolute section from the site root using <code>${section:/blog}</code>, or for a subsection of the current section you can use <code>${section:companies}</code>, or for the parent of the current section you can use <code>${section:..}</code>.  You can also access the site root using <code>${section:/}</code>, or you can go really mad and use a combination: <code>${section:/blog/../news/./companies}</code>.
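These references behave much like file system paths. As a minimal sketch of which section a given reference points to, the hypothetical helper below resolves a reference against the current section using Python's posixpath module. It only illustrates the path semantics described here. It is not the Quick Start's own resolution code, and it does not attempt whatever substitution the web application performs inside the query.

```python
import posixpath


def resolve_section(current, ref):
    """Resolve a section reference against the current section path.

    current: absolute path of the current section, e.g. "/news"
    ref:     the text after "section:", e.g. ".", "..", "companies", "/blog"
    """
    # An absolute ref replaces the current path; a relative one is appended.
    # normpath then collapses any "." and ".." segments.
    return posixpath.normpath(posixpath.join(current, ref))


examples = {
    ref: resolve_section("/news", ref)
    for ref in [".", "..", "companies", "/blog", "/", "/blog/../news/./companies"]
}
```

Under these assumptions, the "really mad" combination resolves to the companies subsection of news.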

As the query is just standard CMIS, you can also use standard property names such as cmis:contentStreamMimeType, so to return all PDF documents within the current section you could use:
<code>select d.* from cmis:document as d where in_tree(d, &#039;${section:.}&#039;) and d.cmis:objectTypeId=&#039;cmis:document&#039; and d.cmis:contentStreamMimeType=&#039;application/pdf&#039; order by d.cmis:creationDate desc</code>

For each section of the website, <strong>content type to template mappings</strong> can be controlled from within Share.  For example, if you edit the metadata on Quick Start Editorial &gt; root &gt; news, you will see the “Section Config” property set to “ws:indexPage=sectionpage1”.  This maps all requests going to this section’s landing page (the index.html asset is of type ws:indexPage) to the sectionpage1 template.  Another example (set on the root folder) is ws:article=articlepage1 which causes all assets of the ws:article type to be rendered using the articlepage1 template.  This template can be seen when clicking on any news or blog article.

This template mapping is hierarchical in that if a match is not found in the requested section, it will then look for a match in the parent section and so on until the root section is reached.  Therefore, type-to-template mappings can be set site wide on the root folder.  The template resolution algorithm also looks up the type hierarchy as part of this process, however I will not cover this in detail here.
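The section part of this lookup can be sketched as a walk from the requested section up to the root. The example below is an illustration only: it models each section's “Section Config” as a hypothetical Python dictionary keyed by section path, and it deliberately leaves out the type hierarchy lookup mentioned above.

```python
import posixpath


def find_template(section_configs, section, content_type):
    """Return the first template mapped for content_type, walking up to root.

    section_configs: dict of section path -> {content type: template name}
                     (a hypothetical stand-in for each folder's Section Config)
    """
    path = section
    while True:
        template = section_configs.get(path, {}).get(content_type)
        if template is not None:
            return template
        if path == "/":
            return None  # reached the root section without a match
        path = posixpath.dirname(path)  # move to the parent section


# Mappings mirroring the examples given in the text.
configs = {
    "/": {"ws:article": "articlepage1"},
    "/news": {"ws:indexPage": "sectionpage1"},
}
```

Here an article requested in a subsection of news finds no mapping on that subsection or on news, and so falls back to the site-wide mapping set on the root.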

<strong>Renditions</strong>, powered by the Alfresco Rendition Service, are used extensively across the Quick Start site.  They allow content authors to publish the correct format/size of content optimized for web, without any manual intervention.  Renditions are defined within a Spring config file (rendition-context.xml), and then these definitions can be configured for use with either a given file mime type or content type.  Like the type-to-template mappings, this configuration is on a hierarchical section basis, with the option to inherit being available to set on each section.

An example configuration can be seen on the Quick Start Editorial &gt; root &gt; news folder.  The “Rendition Config” field has several mappings including ws:image=ws:featuredNewsThumbnail.  This example specifies that all content of type “ws:image” will have a rendition created with the definition called “featuredNewsThumbnail” (defined in the Spring config file).  To give it a test, upload a large file into the news section and attach it to a news article (by editing the metadata on the news article using the “Edit full metadata” option).  When viewing the article on the web site you will see the renditions in use.  A thumbnail or cropped banner slider image is displayed (if used on the front page) and a medium and large thumbnail when viewing the article directly.  Using the node browser you can browse all of the generated renditions.

Out of the box, the <strong>Alfresco Web Editor</strong> is configured for use when viewing either a news or blog article.  In 3.4, as well as edits, the Web Editor also provides the option to create new content and delete content.  <del>NOTE: currently the HEAD version includes the 3.3 AWE and does not build the 3.4 version yet.</del> UPDATE: The 3.3 awe.war is no longer included as part of the web app build from HEAD.  The 3.4 AWE can be built separately using <code>ant incremental-webeditor</code>.  Just drop the awe.war into the same container as the wcmqs.war.
<p style="text-align: center;"><a href="http://www.benh.co.uk/wp-content/uploads/2010/11/WCM-Quick-Start-7.png"></a><a href="http://www.benh.co.uk/wp-content/uploads/2010/09/WCM-Quick-Start-7.png"><img class="aligncenter size-full wp-image-541" title="WCM-Quick-Start-7" src="http://www.benh.co.uk/wp-content/uploads/2010/09/WCM-Quick-Start-7.png" alt="" width="600" height="266" /></a>
<small>In-Context Editing with the Alfresco Web Editor</small></p>
The QS also provides two examples of managing <strong>user generated content</strong> (UGC) with Alfresco.  There is a “Contact” form located on the Contact page and a “Comment” form located under each blog article.  Both forms submit content to Alfresco via CMIS which is gathered in a Share Data List.  The Contact form triggers a basic workflow and the Comment form provides the ability to flag a comment to an Administrator, which disables it from display i.e. visitor moderated.
<h2>Publishing</h2>
The QS also provides an example publishing mechanism.  This is based on a workflow driven model whereby once content is approved it is moved locally (using the Transfer Service) from the “Quick Start Editorial” folder, into the “Quick Start Live” folder.

The QS provides two sample workflows for achieving this.  The first is <strong>“Review and Publish Section Structure”</strong>, which is designed to allow content authors to easily publish specific <strong>sections</strong> of the site structure.  To use it, select the index.html file of the required section and initiate the workflow on it.  The workflow publishes the submitted section and all sections below it.  A section comprises the section itself, its index page, and all of its collections.  Basically it's a "publish sub tree" mechanism.  If you want to populate the entire site structure on live, publish the root section's index.html.

The “<strong>Review and Publish</strong>” workflow publishes either a single asset or a group of assets (multi-select).  If you use this workflow on a section's index.html, it will publish that section and its collections folder, but it does not cascade down.
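
The difference between the two workflows can be sketched as a tree walk with and without cascading.  This is only an illustration under stated assumptions, not the Quick Start's workflow code: the <code>Section</code> class, the <code>assets_to_publish</code> function, the path layout and the collection names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """Hypothetical stand-in for a site section and its sub-sections."""
    path: str
    collections: list = field(default_factory=list)
    children: list = field(default_factory=list)


def assets_to_publish(section, cascade):
    """List what would be published when a workflow is started on
    a section's index.html.

    cascade=True  models "Review and Publish Section Structure" (sub tree).
    cascade=False models "Review and Publish" on a single index.html.
    """
    # A section comprises itself, its index page and its collections.
    assets = [section.path + "/", section.path + "/index.html"]
    assets += [section.path + "/collections/" + name
               for name in section.collections]
    if cascade:
        for child in section.children:
            assets += assets_to_publish(child, cascade=True)
    return assets


# Made-up site structure: a root section with one "news" sub-section.
news = Section("root/news", collections=["latest"])
root = Section("root", collections=["featured"], children=[news])

single = assets_to_publish(root, cascade=False)
subtree = assets_to_publish(root, cascade=True)
```

Here <code>single</code> holds only the root section, its index page and its collection, while <code>subtree</code> also includes everything under <code>root/news</code>.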

It is worth noting you can also configure the web application to view the “Live” site structure by configuring the metadata properties on the “Quick Start Live” folder.  The default configuration sets the host address to 127.0.0.1, so if you are running the Quick Start locally, you can actually view the editorial environment on http://localhost:8080/wcmqs and the live environment on http://127.0.0.1:8080/wcmqs.  Note that the Web Editor (AWE) is configured to be enabled on the Editorial content, and disabled on the Live.  This is controlled by a flag “isEditorial” on the “Quick Start Editorial” folder, which will also (when complete) dictate what is viewable via the live web application with regards to publishing go live and expiry dates.
<h2>Summary</h2>
I have certainly not covered all features here, but hopefully I have provided enough information to get started.  The documentation will go into a lot of detail, especially from the perspective of a developer looking to extend the QS.  We're hoping the QS will provide a valuable platform, both for evaluation and for customers and partners looking to build Alfresco implementations.]]></content:encoded>
			<wfw:commentRss>http://www.benh.co.uk/alfresco/web-quick-start/web-quick-start-first-look/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
		<item>
		<title>Alfresco WCM Roadmap &#8211; August 2010</title>
		<link>http://www.benh.co.uk/alfresco/alfresco-wcm-roadmap-august-2010/</link>
		<comments>http://www.benh.co.uk/alfresco/alfresco-wcm-roadmap-august-2010/#comments</comments>
		<pubDate>Thu, 05 Aug 2010 08:42:23 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Alfresco WCM (archive)]]></category>
		<category><![CDATA[Alfresco WCM]]></category>
		<category><![CDATA[Roadmap]]></category>

		<guid isPermaLink="false">http://www.benh.co.uk/?p=464</guid>
		<description><![CDATA[Below are the slides from the Alfresco WCM Roadmap webinar that covered: Release Schedule WCM Quick Start Project "Cheetah" Project "Swift" The webinar can be viewed here. Alfresco WCM Roadmap 2010 (Cheetah &#38; Swift)]]></description>
			<content:encoded><![CDATA[Below are the slides from the Alfresco WCM Roadmap webinar that covered:
<ul>
  <li>Release Schedule</li>
  <li>WCM Quick Start</li>
  <li>Project "Cheetah"</li>
  <li>Project "Swift"</li>
</ul>
The webinar can be viewed <a href="http://www2.alfresco.com/l/1234/2010-08-04/JKCTP" target="_blank">here</a>.
<div id="__ss_4905882" style="width: 475px;"><strong style="display: block; margin: 12px 0 4px;"><a title="Alfresco WCM Roadmap 2010 (Cheetah &amp; Swift)" href="http://www.slideshare.net/alfresco/alfresco-wcm-roadmap">Alfresco WCM Roadmap 2010 (Cheetah &amp; Swift)</a></strong>
  <object id="__sse4905882" classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="475" height="405" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0">
    <param name="allowFullScreen" value="true" />
    <param name="allowScriptAccess" value="always" />
    <param name="src" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=wcmroadmap-100805031505-phpapp01&amp;stripped_title=alfresco-wcm-roadmap" />
    <param name="name" value="__sse4905882" />
    <param name="allowfullscreen" value="true" />
    <embed id="__sse4905882" type="application/x-shockwave-flash" width="475" height="405" src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=wcmroadmap-100805031505-phpapp01&amp;stripped_title=alfresco-wcm-roadmap" name="__sse4905882" allowscriptaccess="always" allowfullscreen="true"></embed>
  </object>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.benh.co.uk/alfresco/alfresco-wcm-roadmap-august-2010/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Web Studio, Web Editor and the Web Editor Framework</title>
		<link>http://www.benh.co.uk/alfresco/web-studio-web-editor-and-the-web-editor-framework/</link>
		<comments>http://www.benh.co.uk/alfresco/web-studio-web-editor-and-the-web-editor-framework/#comments</comments>
		<pubDate>Thu, 10 Jun 2010 09:23:25 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Alfresco WCM (archive)]]></category>
		<category><![CDATA[Alfresco Web Editor]]></category>
		<category><![CDATA[Surf Framework]]></category>
		<category><![CDATA[web editor]]></category>
		<category><![CDATA[web editor framework]]></category>
		<category><![CDATA[web studio]]></category>

		<guid isPermaLink="false">http://www.benh.co.uk/?p=454</guid>
		<description><![CDATA[I have been receiving a steady stream of questions around Web Studio and enquiries around our strategy for future releases with regards to similar functionality, so thought it would be helpful to discuss the product focus.]]></description>
			<content:encoded><![CDATA[I have been receiving a steady stream of questions around <a href="http://wiki.alfresco.com/wiki/Web_Studio" target="_blank">Web Studio</a> and enquiries around our strategy for future releases with regards to similar functionality, so thought it would be helpful to discuss the product focus.

<span id="more-454"></span>Firstly to clarify, Web Studio is no longer part of the Community Alfresco product so is no longer available for download.  Web Studio was not part of an Enterprise release and was therefore not supported as part of the core product by Alfresco.  The <a href="http://wiki.alfresco.com/wiki/Spring_Surf" target="_blank">Surf framework</a> that Web Studio utilised has been committed to Spring Source.  The Surf framework is the underlying framework for Alfresco Share and Alfresco will continue to invest in Surf and surrounding developer tools.

With 3.3, Alfresco started to invest heavily in WCM utilising the core Alfresco repository (non-AVM).  As part of this strategy we introduced the <a href="http://wiki.alfresco.com/wiki/Web_Editor" target="_blank">Web Editor</a> (in-context editing) and <a href="http://wiki.alfresco.com/wiki/Web_Editor_Framework" target="_blank">Web Editor Framework</a> (WEF).  The Web Editor Framework provides a standard plug-in framework on which people can develop any type of functionality.  The framework can be loosely considered to be a ribbon toolbar that can be customised, e.g. by adding custom tabs, buttons and any other UI required.  Most importantly it is a common framework that Alfresco, the Alfresco community, partners, customers, the Spring community and anyone else can develop on.  A lot of effort has gone into the way that toolbars, buttons and components in general can be packaged into a single file, allowing new plug-ins to be easily “dropped in” to a WEF environment.

The Web Editor was Alfresco’s first core product (documented, supported) feature built on the Web Editor Framework, i.e. the Web Editor has a dependency upon the WEF.  The Web Editor addresses a single specific use case: editing content (semantic page content) in the context of the web site, i.e. in-context editing.  We are now looking to build on this functionality in future releases, expanding it above and beyond simple editing.  For the next release (internally named "Swift"), however, the Web Editor focus is still on the content editorial process.

There is also a great deal of interest in what I will term “presentation management”, and this is where Web Studio really sparked people’s thoughts around Alfresco WCM.  By this I mean providing the ability for a site administrator to manage sites, site structures, templates, components, navigation etc.  In order to provide these features in a consistent, supportable manner, an underlying model is required within the repository which can then be utilised by a delivery framework.  As the WCM functionality moves forward, we will start to implement such a model, and utilise Spring Surf and the WEF to deliver the presentation management functionality.  The Swift release remains focused on web content production, so no presentation management capabilities are in scope for this release; however, the building blocks around the model and Spring Surf will start to emerge.

Speaking to Alfresco partners, customers and community members recently, I know people are already starting to implement the Web Editor Framework and build out their required functionality.  We have current customers who have built presentation management capabilities with full custom clients, and the Web Editor Framework provides a powerful alternative to this route.]]></content:encoded>
			<wfw:commentRss>http://www.benh.co.uk/alfresco/web-studio-web-editor-and-the-web-editor-framework/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Tech Talk Live &#8211; Web Editor Slides</title>
		<link>http://www.benh.co.uk/alfresco/tech-talk-live-web-editor-slides/</link>
		<comments>http://www.benh.co.uk/alfresco/tech-talk-live-web-editor-slides/#comments</comments>
		<pubDate>Tue, 13 Apr 2010 11:45:36 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Alfresco WCM (archive)]]></category>
		<category><![CDATA[Alfresco Web Editor]]></category>

		<guid isPermaLink="false">http://www.benh.co.uk/?p=451</guid>
		<description><![CDATA[Here are my slides from last week’s TTL on the Web Editor.]]></description>
			<content:encoded><![CDATA[<div id="__ss_3707570" style="width: 425px;">Here are my slides from last week’s TTL on the Web Editor.</div>
<div style="width: 425px;"></div>
<div style="width: 425px;"><strong><a title="Tech talk live alfresco web editor" href="http://www.slideshare.net/alfresco/tech-talk-live-alfresco-web-editor-compatibility-mode">Tech talk live   alfresco web editor</a></strong><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="425" height="355" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowScriptAccess" value="always" /><param name="src" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=techtalklive-alfrescowebeditorcompatibilitymode-100413053407-phpapp01&amp;stripped_title=tech-talk-live-alfresco-web-editor-compatibility-mode" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="425" height="355" src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=techtalklive-alfrescowebeditorcompatibilitymode-100413053407-phpapp01&amp;stripped_title=tech-talk-live-alfresco-web-editor-compatibility-mode" allowscriptaccess="always" allowfullscreen="true"></embed></object></div>
<div id="__ss_3707570" style="width: 425px;">
<div style="padding: 5px 0 12px;">View more <a href="http://www.slideshare.net/">presentations</a> from <a href="http://www.slideshare.net/alfresco">Alfresco Software</a>.</div>
</div>]]></content:encoded>
			<wfw:commentRss>http://www.benh.co.uk/alfresco/tech-talk-live-web-editor-slides/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
