Last year, Microsoft added a preview connector enabling Power BI to query Amazon Redshift. This wasn’t publicised as an “official” data source, and took some steps in order to be able to even see the connector in Power BI Desktop. Crucially, you could only use this connector in Power BI Desktop, not when workbooks are deployed to the cloud. Yesterday, Microsoft announced the connector is now available within the Power BI Service, which means that workbooks containing Redshift data connections can now be deployed to the cloud. I’ve been working a lot with Redshift over the past year or so, and Power BI’s still my go-to data-viz solution, so I’m delighted to see the this announcement, as it means that Redshift-based workbooks can now be shared with others via powerbi.com.
You can read details of the announcement of Redshift for the Power BI Service over on the Power BI blog, I’m not going to replicate it here.
As with most Database-type data sources, Power BI offers two query modes: Import or Direct Query. Import mode allows you to select a number of tables and views from the data source, and then loads all the data from these into Power BI. That’s fine in a lot of data sources, but when you’re dealing with potentially billions of rows of data like you normally would be in Redshift (or other big data solutions like Google BigQuery, Spark, Snowflake, etc.), this isn’t really an option. You’re paying for the processing power these solutions offer, so use it. DirectQuery mode pushes the execution down onto the database. It allows something like Redshift to use the power of the cluster to execute the query and return the results to the client, in this case Power BI. This is a very common model found with client tools that support big data repositories. Tableau, Qlik, Alteryx etc. all support a similar practice under various names. These queries are issued in real-time, as a user filters and interacts with the visualisation. There are some limitations to this approach, as outlined here.
The configuration on powerbi.com is still a little involved, and there aren’t direct connectors set up as of yet, but it’s great to see Microsoft weaving Redshift support deeper into Power BI. Watch this space!
If you’ve used the Redshift connector for Power BI (or any of the other experimental connectors like Impala or Snowflake), let me know in the comments below how your experience has been and what your thoughts are.