Screenshot showing chart editing in SSRS Report Builder

5 Reasons why PowerView can’t replace Reporting Services

The other week I wrote a post discussing how PowerView was the future of SQL Server Reporting Services, and the killer features that made it a compelling choice.  Despite the numerous positive advances that PowerView brings to Microsoft/SQL-based reporting, there are of course a number of counter-arguments.  I deliberately left these out of that post so that I could look at them separately later.

As such, here are five reasons why PowerView, despite all its pizzazz, is simply not capable (in its current form) of replacing the venerable SSRS.

Screenshot of PowerView in Excel 2013

6 Reasons why PowerView is the Future of Reporting Services

Since its introduction with SQL Server 2012, PowerView has started to become the familiar face of Microsoft’s self-service business intelligence offering.  Its inclusion in Excel 2013 has only reinforced its position as the premier tool for quick, interactive visualisation of data, in conjunction of course with the magnificent PowerPivot.  But ask any “traditional” BI developer/architect about Reporting Services (SSRS), and it’s likely that they will staunchly defend it, decrying PowerView as a gimmick.

So here are 6 reasons why PowerView is NOT a gimmick, and may in fact be the future of SQL Server Reporting Services.

GeoFlow brings 3D geographical visualisation to Excel 2013

The other week, Microsoft announced GeoFlow for Excel 2013 at the SQL PASS Business Analytics conference in Chicago.  While it’s not exactly new, it is, at least, a pretty impressive-looking addition to the data visualisation toolkit.

However, while GeoFlow finally brings 3D geographical visualisation to Microsoft’s self-service BI utility belt (in your face, Batman), it’s hard to make a case for it beyond wowing executives and potential clients.

Video: https://www.youtube.com/watch?v=LNI0r9_BJUM


Screenshot of the "Installation Type" page in the SQL Server 2012 SP1 setup wizard

Installing SQL Server Data Tools – Business Intelligence for Visual Studio 2012

As if renaming the accurately titled Business Intelligence Development Studio (BIDS) to the rather ambiguous SQL Server Data Tools (SSDT) wasn’t bad enough, Microsoft’s December 2012 SSDT release brought only half the expected capabilities to Visual Studio 2012.  Yep, the December 2012 download was missing a key component: the project and item templates for developing MS BI projects in Visual Studio.  Thankfully, the newest release (5th March 2013) has finally added all of the MS BI templates to SSDT, so you can now develop SSIS packages, SSAS cubes and SSRS reports in the Visual Studio 2012 environment.

Which version?

Unfortunately, they’ve not made the whole process easy.  Searching for “SQL Server Data Tools” will likely lead you to a download which, upon installation, will add connectivity and server management tools to VS 2012 – making it like an up-to-date version of SQL Server Management Studio (SSMS), but without the BI project templates.

The latest release (with the BI templates) is actually called:

Microsoft SQL Server Data Tools – Business Intelligence for Visual Studio 2012

So if you’re trying to get SSDT for BI development work, make sure you download the correct version.  Unfortunately, that wasn’t the end of the issues, as I had a bit of trouble with the installation that I felt needed sharing.

Installation woes

Begin by downloading the correct installer for the BI-enabled version of SSDT from http://www.microsoft.com/en-us/download/details.aspx?id=36843 (782 MB).

Screenshot of the "Installation Type" page in the SQL Server 2012 SP1 setup wizard

Fig. 1: On a 64-bit SQL instance, be sure to select “Perform a new installation of SQL Server 2012”

Once you run it, the installer will unpack and launch the SQL Server 2012 SP1 setup wizard.  Don’t worry about this: remember that SSDT, like BIDS before it, is a component of SQL Server built upon the Visual Studio shell, NOT an extension to Visual Studio itself.

The trick with the installation comes when you reach the Installation Type step (see Fig. 1).

If you’re running an x64-based SQL instance (64-bit), make sure to select “Perform a new installation of SQL Server 2012” on the Installation Type page, and NOT “Add features to an existing instance of SQL Server 2012”.

This is because although the SQL Server instance is 64-bit, the Visual Studio 2012 shell is actually 32-bit.  If you attempt to add a 32-bit component to an existing 64-bit instance, the setup fails the Installation Rules checks and won’t allow you to proceed.

Don’t worry: choosing “Perform a new installation of SQL Server 2012” doesn’t actually create a new SQL instance; it just allows the installer to get past the pre-installation checks.

If you’ve got a 32-bit instance of SQL Server, it doesn’t matter what option you choose here.

Installation complete

Screenshot showing adding a new Business Intelligence project in Visual Studio 2012

The new Business Intelligence Project templates in action

Once the installation has completed (a restart may be required), you can open Visual Studio 2012 (or the new SQL Server Data Tools 2012 item on your Start menu) and get developing.  Click “New Project” in the File menu and check for the “Business Intelligence” templates to confirm that it’s worked.

I’ve yet to find any real differences between the Visual Studio 2012-based SSDT and the Visual Studio 2010-based version that shipped with SQL Server 2012.  At the moment, the main advantage of this release seems to be the improved features of Visual Studio 2012 over its 2010 counterpart, rather than any advancements in the Business Intelligence templates/tools themselves.

They might be there; I just haven’t come across them yet.  Let me know in the comments below if you’ve spotted any improvements over the 2010-based SSDT and what they are.

Is Hadoop the right tool for the job?

I recently posted some thoughts regarding Microsoft’s Windows-compatible Hadoop implementation, HDInsight.  I was investigating it for a project that I figured would benefit from a distributed processing approach, although we ultimately decided to pursue alternatives.  The investigation led our team to some quite interesting discoveries about Hadoop, and about the scenarios in which current distributed processing solutions are and aren’t appropriate.


Example scenario

The project in question is a large-scale data processing solution, required to process millions of varied data files daily, parsing data points from JSON, XML, HTML and more, and writing the results to a storage solution.  In “Big Data” terms, we were definitely looking at the potential to move terabytes of data per day once we scale up, so we needed a technology that could handle this while remaining responsive, as processing time is a crucial factor.

What we quickly noticed was that we didn’t actually need the Reduce part of the functionality.  All we wanted to do was run Map jobs to identify and retrieve data points, rather than aggregating and summarising them.
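To make that concrete, here’s a minimal sketch of what a map-only job looks like in Hadoop’s Java API: setting the number of reduce tasks to zero means the mapper output is written straight to HDFS, skipping the shuffle/sort and reduce phases entirely.  The DataPointMapper and its parsing logic below are hypothetical placeholders, not the actual code from our project.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MapOnlyExtractionJob {

    // Hypothetical mapper: parse each input record and emit the extracted data points.
    public static class DataPointMapper extends Mapper<Object, Text, Text, Text> {
        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            // The real JSON/XML/HTML parsing would happen here; this just echoes the record.
            context.write(new Text("datapoint"), value);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "map-only data point extraction");
        job.setJarByClass(MapOnlyExtractionJob.class);
        job.setMapperClass(DataPointMapper.class);

        // Zero reducers = map-only: no shuffle, no sort, no reduce phase.
        job.setNumReduceTasks(0);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

With zero reducers, each mapper writes its output directly to the job’s output directory, which is exactly the “extract and store” pattern we were after, without paying for a reduce phase we’d never use.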


Distributed processing options

After investigating and subsequently rejecting HDInsight as a viable option for this project, we took a look at vanilla Hadoop, as well as some other distributed processing implementations and Hadoop add-ons.  Fortunately, there are a lot of very cool products out there.


Graph comparing Spark vs Hadoop

Spark claims to run up to 100x faster than Hadoop MapReduce

Cloudera Impala introduces its own distributed query engine, which avoids MapReduce to deliver near real-time query results.  It’s not intended as a replacement for MapReduce, however; it’s meant to complement a Hadoop cluster by offering alternative query techniques for accessing data from Hive and HDFS.

To properly evaluate the performance of these products against one another, we realised we needed a baseline.  Having a great deal of MS BI experience in our team, we thought it would be fun to create this baseline using our usual go-to data processing solution: SSIS.

The more we dug into the distributed architecture, the more it seemed we were looking for something else for our purposes, given the complete lack of any requirement for a reduce function.


SSIS vs Hadoop

I won’t go into detail on this, as Links has already written up the results over on thinknook.com, but running our Map function on a single SSIS instance performed significantly better in each test than our Hadoop cluster did.  The results we gathered suggest that a distributed approach is really only the right one when you are using both the Map AND Reduce functionality and/or working with extremely large datasets.  Indeed, the larger the dataset and the more data points involved, the more powerful and useful the Reduce functionality becomes.

There is quite simply no straightforward way to perform that kind of large-scale reduction in traditional ETL platforms such as SSIS.

I’d like to find out how the comparison looks when performing this same test with Spark vs SSIS, just to see whether the in-memory implementation provides the necessary performance boost, or whether it’s still better to keep map-only workloads and full MapReduce workloads in two separate places.
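For reference, here’s a rough sketch (assumed, not taken from our tests) of what the same map-only extraction might look like using Spark’s Java API; the parsing logic and paths are placeholders.  The attraction is that the per-record transformation happens in memory, again with no reduce or aggregation step.

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class SparkMapOnlyExtraction {

    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("map-only data point extraction");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Read the raw input files and apply only a map transformation; no reduce step.
        JavaRDD<String> lines = sc.textFile(args[0]);
        JavaRDD<String> dataPoints = lines.map(line -> extractDataPoint(line));

        // Write the extracted data points back out to storage.
        dataPoints.saveAsTextFile(args[1]);
        sc.stop();
    }

    // Placeholder for the real JSON/XML/HTML parsing logic.
    private static String extractDataPoint(String line) {
        return line.trim();
    }
}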


Is Hadoop the right tool for the job?

Bottom line: It depends on the job.

If you’re not utilising both sides of the MapReduce coin, though, even when processing millions of files, then the overhead of creating and managing jobs is just not worth it.  And if you are using both Map and Reduce functionality, it may just be worth considering some of the other solutions out there as an alternative to Hadoop MapReduce.