Archive | data

Dat – git for data

dat is an open source tool that enables the sharing of large datasets, allowing for a decentralized collaboration flow similar to what git offers for source code. It isn’t quite ready for prime time yet.

(Full Story: http://dat-data.com/ )

Who Is Facebook’s Billionth User? – Business Insider

In other words, Facebook had to estimate its user count because actually counting them user by user would have been too taxing on its servers. So User No. 1,000,000,000’s identity will remain a mystery.

(Full Story: Who Is Facebook’s Billionth User? – Business Insider)

Microsoft Codename “Data Explorer”

Identify the data you care about from the sources you work with (e.g. Excel spreadsheets, files, SQL Server databases).
Discover relevant data and services via automatic recommendations from the Windows Azure Marketplace.
Enrich your data by combining it and visualizing the results.
Collaborate with your colleagues to refine the data.
Publish the results to share them with others or power solutions.

(Full Story: Microsoft Codename “Data Explorer”)

Building data science teams – O’Reilly Radar

It’s hard to understate the sophistication of the tools needed to instrument, track, move, and process data at scale. The development and implementation of these technologies is the responsibility of the data engineering and infrastructure team. The technologies have evolved tremendously over the past decade, with an incredible amount of collaboration taking place through open source projects.

(Full Story: Building data science teams – O’Reilly Radar)

Maltego – data forensics application

Maltego is an open source intelligence and forensics application. It will offer you timely mining and gathering of information as well as the representation of this information in an easy to understand format.

(Full Story: Maltego – data forensics application)

Needlebase – where data comes together

Needle is a platform for acquiring, integrating, cleansing, analyzing and publishing data on the web. Using Needle through a web browser, without programmers or DBA

(Full Story: Needlebase – where data comes together)

Teiid is a data virtualization system that allows applications to use data from multiple, heterogeneous data stores.

Teiid is comprised of tools, components and services for creating and executing bi-directional data services. Through abstraction and federation, data is accessed and integrated in real-time across distributed data sources without copying or otherwise moving data from its system of record.
(Link: Teiid is a data virtualization system that allows applications to use data from multiple, heterogeneous data stores.)

‘Scrapers’ Dig Deep for Data on the Web – WSJ.com

PatientsLikeMe managed to block and identify the intruder: Nielsen Co., the privately held New York media-research firm. Nielsen monitors online “buzz” for clients, including major drug makers, which buy data gleaned from the Web to get insight from consumers about their products, Nielsen says.

“I felt totally violated,” says Bilal Ahmed
(Link: ‘Scrapers’ Dig Deep for Data on the Web – WSJ.com)

Microsoft Pivot – easier to interact with massive amounts of data in ways that are powerful

Here at Live Labs we’re all about experiments, and Pivot is our most ambitious to date. Pivot makes it easier to interact with massive amounts of data in ways that are powerful, informative, and fun. We tried to step back and design an interaction model that accommodates the complexity and scale of information rather than the traditional structure of the Web.
(Link: Microsoft Pivot – easier to interact with massive amounts of data in ways that are powerful)
