Microsoft’s Build conference for 2016 took place a couple of weeks ago and, true to form, there were plenty of killer announcements and reveals across a range of services, tools, and frameworks, many of which are available today. As I’m not one to ever really post something when it’s actually relevant, here are a few of the things that jumped out at me from the event. Continue reading “5 Observations from Microsoft Build 2016” »
With SQL Server 2008, Microsoft introduced the new, improved datetime2 format. This newer time storage format is great because it can take up less storage space, and it gives you control over precision, so you can define your field to the exact specification required. Database columns defined as datetime2 can be mapped in SSIS by using the DT_DBTIMESTAMP2 type. However, if you have a Script Transformation in your SSIS package and want to assign a .NET DateTime value to a Data Flow column that is mapped to a datetime2 field, you might encounter a DoesNotFitBufferException.
The reason for this is likely down to your specified field precision, and is easily fixed. Continue reading “Mapping C# DateTime to SQL Server datetime2 via SSIS” »
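To illustrate the precision point, here is a minimal Python sketch (not SSIS or .NET code). It assumes that a .NET DateTime carries seven fractional-second digits (100-nanosecond ticks) and that a datetime2(n) column keeps n of them. The helper names are hypothetical, and whether this is exactly what triggers the exception in your package is an assumption. It simply shows how a value can carry more fractional digits than a lower-scale column holds, and how truncating to the column's scale makes it fit.

```python
TICKS_PER_SECOND = 10_000_000  # .NET DateTime resolution: 100 ns ticks (7 fractional digits)
MAX_SCALE = 7                  # datetime2 allows a fractional-second scale of 0..7


def fits_scale(ticks: int, scale: int) -> bool:
    """Return True if the tick count has no fractional digits beyond `scale`."""
    if not 0 <= scale <= MAX_SCALE:
        raise ValueError("scale must be between 0 and 7")
    return ticks % 10 ** (MAX_SCALE - scale) == 0


def truncate_to_scale(ticks: int, scale: int) -> int:
    """Drop the fractional-second digits that a datetime2(scale) column cannot hold."""
    if not 0 <= scale <= MAX_SCALE:
        raise ValueError("scale must be between 0 and 7")
    step = 10 ** (MAX_SCALE - scale)
    return ticks - ticks % step


# Hypothetical value: 12 whole seconds plus 0.1234567 s, expressed in ticks.
ticks = 12 * TICKS_PER_SECOND + 1_234_567

print(fits_scale(ticks, 7))                        # full .NET precision
print(fits_scale(ticks, 3))                        # too many digits for scale 3
print(fits_scale(truncate_to_scale(ticks, 3), 3))  # fits after truncation
```

In the sketch there are two ways to make the value fit: widen the column's scale to 7, or truncate the value to the column's scale before assigning it.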
As if renaming the accurately titled Business Intelligence Development Studio (BIDS) to the rather ambiguous SQL Server Data Tools (SSDT) wasn’t bad enough, in December, Microsoft’s latest SSDT release only brought half the expected capabilities to Visual Studio 2012. Yep, the December 2012 SSDT download was missing a key component: the project and item templates for developing MS BI projects in Visual Studio. Thankfully, the newest release (5th March, 2013) has finally added all of the MS BI templates to SSDT, so you can now develop SSIS packages, SSAS cubes and SSRS reports in the Visual Studio 2012 environment.
Unfortunately, they’ve not made the whole process easy. Searching for “SQL Server Data Tools” will likely lead you to a download which, upon installation, will add connectivity and server management tools to VS 2012 – making it like an up-to-date version of SQL Server Management Studio (SSMS), but without the BI project templates.
The latest release (with the BI templates) is actually called:
So if you’re trying to get SSDT for BI development work, make sure that you download the correct version. Unfortunately, that wasn’t the end of the issues, as I had a bit of trouble with installation that I felt needed sharing.
Begin by downloading the correct installer for the BI enabled version of SSDT from http://www.microsoft.com/en-us/download/details.aspx?id=36843 (782 MB).
Once you execute it, the installer will unpack and run the SQL Server 2012 SP1 setup wizard. Don’t worry about this; remember that SSDT, like BIDS before it, is actually a component of SQL Server based upon the Visual Studio shell, NOT an extension to Visual Studio itself.
The trick with the installation comes when you reach the Installation Type step (see Fig 1.). If your existing SQL Server instance is 64-bit, don’t choose to add features to that instance.
This is because although the SQL Server instance is 64-bit, the Visual Studio 2012 shell is actually 32-bit. If you attempt to upgrade a 64-bit instance with a 32-bit component, it fails the Installation Rules checks and won’t allow you to proceed.
Choosing “New Instance” will work, but don’t worry: it doesn’t actually require creation of a new SQL instance. It just allows the installer to get past the pre-installation checks.
If you’ve got a 32-bit instance of SQL Server, it doesn’t matter what option you choose here.
Once the installation has completed (may require a restart), you can open Visual Studio 2012 (or the new SQL Server Data Tools 2012 item on your start menu) and get developing. Click “New Project” in the File menu and check for the “Business Intelligence” templates to confirm that it’s worked.
I’ve yet to find any real differences between the Visual Studio 2012 based SSDT and the Visual Studio 2010 based version that shipped with SQL Server 2012. At the moment, the main benefit of this release seems to be access to the improved features of Visual Studio 2012 over its 2010 counterpart, rather than any advancements in the Business Intelligence templates/tools themselves.
They might be there, however; I just haven’t come across them yet. Let me know in the comments below if you’ve spotted any improvements over SSDT 2010 and what they are.
Hadoop. Everyone and their dog is talking about it. That and “Big Data”. There was an excellent post on Brent Ozar’s DBA Reactions Tumblr blog recently that encapsulated it perfectly, titled “When the executives ask if we’re Hadooping”. It’s a valid point though: Hadoop is mentioned in just about every article these days, along with the phrase “Big Data” (which I personally don’t like at all). The consensus, at least on the surface, seems to be that Hadoop will solve everyone’s problems, process anything, oh and bring world peace while it’s doing that. My sarcastic tone belies a genuine interest in playing about with it though. With so many people talking about Hadoop (in its many implementations), I was very keen to get an opportunity to try it for myself.
Fortunately, a project came along recently that seemed like it might benefit from a distributed processing approach. So naturally, being primarily a Microsoft Business Intelligence person, I figured the best place for me to get started was to jump onto Windows Azure and try out HDInsight, Microsoft’s own Hadoop implementation (in conjunction with Hortonworks).
Getting started with HDInsight is simple. Incredibly simple. Just hit up https://www.hadooponazure.com/, sign in and request your cluster. The good news is that it’ll be live in minutes. The bad news is that you can only get 3 nodes to begin with, which severely limits your processing capacity for anything beyond the simplest jobs.
This led me to discount HDInsight as a platform for this project soon after. At the time of writing, it’s still in preview, so there are no extra nodes, pricing information or scale-out options obviously available. On top of that, on the default 3 nodes we found performance to be terribly slow, and the management of jobs and the file system was actually obscured somewhat by the web interface MS have added to try and simplify the experience. Even as a predominantly .NET/Windows person, I was much more comfortable configuring jobs and manipulating HDFS directly via the command line rather than via the web interface (that could totally just be me though). If you use Remote Desktop to connect to your cluster, you can launch the command line from there, and also browse HDFS using the HDFS web interface by connecting to the cluster’s head node.
The preview nature of the platform was definitely a killer, at least for this project, as we were looking for something we could start with immediately, with the option to quickly boost capacity if necessary. One of the key selling points of a distributed architecture has to be the ability to quickly and easily scale out capacity by adding more nodes to the cluster. Add to that the fact that we found performance to be very slow, and it was clearly not the best option for our purposes. (To be completely fair though, my experiences with distributed processing solutions suggest they’re not the best choice for processing extremely large numbers of files, being more suited to handling smaller numbers of extremely large files.)
Unfortunately, there’s not a huge amount of documentation available, and that which is available is not complete, so be prepared to roll up your sleeves and get your hands dirty.
I’m not for a second saying don’t try HDInsight though. As a project, it’s still in its infancy and perhaps not moving as quickly as some of the others out there. A Windows-based Hadoop implementation is still a very positive thing however, and while I didn’t really get on with the web UI, I’m sure others will find it fits their needs perfectly.
Pros:
- Easy to get started
- .NET code MapReduce functions
- Awesome SDK
- Pretty UI
Cons:
- Slow, especially on the default 3 nodes
- UI obscures Hadoop and HDFS functionality
- Incomplete documentation
- Still in preview stage
I suggest that everyone gives it a go for themselves. As with most things in life (I was going to say in BI, but it’s equally applicable), one man’s trash is another man’s treasure, and depending on the requirements of each individual project, HDInsight may or may not be suitable. Would I recommend it at the moment, ahead of a Linux-hosted Hadoop implementation? No, I have to say I probably wouldn’t, but it’s good to see Hadoop hit Windows regardless, and there is definite promise in HDInsight.
It just needs to haul itself up off its hands and knees and take those first couple of tentative steps.
I recently posted about a quandary in which I found myself that led to me building my own extended ForEach File Enumerator in SSIS. All things considered, it was a reasonably straightforward experience, with most of my issues stemming from a relative unfamiliarity with Windows Forms development (I was always an ASP.NET man). The whole process can actually be split into four very simple steps to make things easier:
- Create your Enumeration function
- Design your UI for SQL Server Data Tools (SSDT)
- Validate and assign input from the UI
- Deploy your new component
As long as your new custom component isn’t too complicated, these steps can be completed very quickly, meaning you can be up and running in only a little longer than it would take to write everything in a Script Task, and think of the re-usability! Continue reading “Notes from building a Custom ForEach Enumerator in SSIS” »