There are several data points within Twitter data that can be used for filtering by location. Which of these to use depends on the use case. This article describes each of these properties and develops a working example of a country-specific filter.
Posted on September 6, 2013 by admin
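As a flavour of what such a filter can look like, here is a minimal CSDL sketch. The target names (`twitter.place.country_code`, `twitter.user.location`) are common location-related properties recalled from CSDL and should be checked against the DataSift documentation; the values are illustrative.

```csdl
// Hypothetical sketch of a country-specific filter: match tweets whose
// Twitter Place is in the UK, or whose profile location mentions a UK city.
twitter.place.country_code == "GB"
OR twitter.user.location contains_any "London, Manchester, Birmingham"
```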
The output properties of a typical DataSift stream are incredibly rich. Depending on which augmentations and data sources are in use, the resulting data set can contain a few hundred properties. Selecting which of these to analyse will depend upon the use case and area of interest.
This article provides an overview of the primary data points of interest for users who are new to social data and specifically DataSift data, and also introduces a utility script for quickly aggregating and extracting insight from a DataSift output JSON file.
Posted on June 15, 2013 by admin
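The core of such an aggregation is simple: walk each interaction and tally how often each value of a given property occurs. The sketch below is not the utility script from the post, just a minimal illustration; the property paths used (`interaction.type`, `language.tag`) are examples of common DataSift output fields.

```javascript
// Count how often each value of a dotted property path occurs across
// an array of DataSift interactions.
function countBy(interactions, path) {
  const counts = {};
  for (const item of interactions) {
    // Walk the dotted path, e.g. "interaction.type"
    const value = path
      .split('.')
      .reduce((obj, key) => (obj ? obj[key] : undefined), item);
    if (value !== undefined) {
      counts[value] = (counts[value] || 0) + 1;
    }
  }
  return counts;
}

// Example usage with two fabricated interactions:
const sample = [
  { interaction: { type: 'twitter' }, language: { tag: 'en' } },
  { interaction: { type: 'facebook' }, language: { tag: 'en' } },
];
console.log(countBy(sample, 'interaction.type')); // { twitter: 1, facebook: 1 }
console.log(countBy(sample, 'language.tag'));     // { en: 2 }
```

Running this over a full JSON export, one property at a time, gives the raw counts behind the kind of dashboard described below.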
Many DataSift customers use social data to gain better insight into their own customers and audience. A DataSift stream contains a huge amount of data, and using simple aggregation you can quickly build up a detailed picture of what people are talking about and what communication channels and devices they are using. This data can be incredibly helpful when considering where to invest advertising spend, device and application support, and general CRM.
In previous examples, I have used a database to aggregate each data property and provide a simple dashboard. This process is effective, but time-consuming to develop and repeat. The screenshot below is an example of such a dashboard:
Posted on January 8, 2013 by admin
DataSift released a new Push API some time ago that offers direct delivery of data into one of several endpoints. So far, connectors for FTP, DynamoDB, S3 and HTTP POST have been released, and several more are on the way (MongoDB, CouchDB, Splunk etc.). The Push delivery option supports some great features, including the ability to set the payload size and delivery frequency, and the ability to pause a delivery. Each push delivery comes with a one-hour buffer, meaning consuming the data becomes a lot more flexible. The Push API can be used for both real-time and historic data consumption.
Posted on November 20, 2012 by admin
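To make the shape of a Push subscription concrete, here is a rough sketch of the kind of request involved. The parameter names below are assumptions based on the pattern described above, not verified against the API reference, and the values are invented:

```text
# Hypothetical sketch of creating a Push subscription for S3 delivery:
POST /v1/push/create
  name        = my-s3-delivery
  hash        = <stream hash>
  output_type = s3
  output_params.bucket             = my-bucket
  output_params.delivery_frequency = 60
```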
Recently I have been playing with Mike Bostock's D3.js and continue to be impressed with the different ways people are visualising data. D3 provides an incredibly rich library; it has a relatively steep learning curve, but the results can be fantastic.
I decided to mix the real-time data source of DataSift data with a volume/time graph example from the D3 library. I added a simple UI and real-time rendering using Node.js, Socket.io and Express to allow the user to enter their stream credentials. The finished example is as follows:
Posted on March 29, 2012 by admin
DataSift data can be exported from both recordings and historics in either JSON or CSV format. Depending upon your filter definition, the fields within the resulting data will vary, making importing the data into a database such as MySQL a little time-consuming.
The PHP script below automatically generates the CREATE TABLE command along with the LOAD DATA INFILE command, allowing you to create a new table and import the data quickly and easily.
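For context, the generated statements look something like the following sketch. The column names here are invented examples; in practice the script derives them from the CSV header, and the file path is whatever your export is called:

```sql
-- Illustrative only: real column names come from the exported CSV header.
CREATE TABLE interactions (
  interaction_id      VARCHAR(255),
  interaction_type    VARCHAR(255),
  interaction_content TEXT
);

LOAD DATA INFILE '/path/to/export.csv'
INTO TABLE interactions
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;
```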
Working at DataSift, I speak to many different clients and discuss their use cases regarding what type of data they would like to filter on and include within their real-time data streams. I thought it would be useful to start capturing many of the common use cases in a single post and explain how these translate into Curated Stream Definition Language (CSDL). If you are new to DataSift or CSDL, the DataSift documentation is great and growing all the time, and this post should give some useful examples.
I will be adding to this post as and when I capture new examples. Please feel free to submit your own if you feel they will be beneficial to others.
Posted on February 17, 2012 by admin
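As a taste of the kind of use case the post collects, here is a simple CSDL sketch for tracking brand-style keyword mentions. The targets (`interaction.type`, `language.tag`, `interaction.content`) and the `contains_any` operator are recalled from CSDL and should be verified against the DataSift documentation; the keywords are placeholders.

```csdl
// Hypothetical example: English-language tweets mentioning any of a
// set of keywords.
interaction.type == "twitter"
AND language.tag == "en"
AND interaction.content contains_any "coffee, espresso, latte"
```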