THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Dejan Sarka

  • SQL Server 2017 Integration Services Cookbook

    I coauthored my 15th bookSmile Together with Christian Cote (lead author) and Matija Lah (coauthor) we publishes SQL Server 2017 Integration Services Cookbook. Of course, it is kind of early to say this is a definitive guide to SSIS 2017. More accurate name would be SSIS 2016 / 2017 Cookbook. Besides detailed guidelines how to use the 2016 version, you will also find a chapter on some new information on scaling out SSIS 2017. In the future, we will add an online chapter, if it will be needed, about additional new SSIS 2017 functionalities. Anyway, here is a brief description of the chapters.

    Chapter 1: SSIS Setup

    This chapter will describe step by step how to setup SQL Server 2016 to get the features that are used in the book.

    Chapter 2: What is New in SSIS 2016

    This chapter is an overview of Integration Services 2016 new features. Some of the topics covered here are covered extensively later in the book.

    Chapter 3: Key Components of a Modern ETL Solution

    This chapter will explain how ETL has evolved over the past few years and will explain what components are necessary to get a modern scalable ETL solution that fits the modern data warehouse.

    Chapter 4: Data Warehouse Loading Techniques

    This chapter describes many patterns used when it comes to data warehouse (DW) or operational data store (ODS) load.

    Chapter 5: Dealing with Data Quality

    This chapter will describe how SSIS, DQS and MDS can be leveraged to validate, cleanse, maintain, and load data.

    Chapter 6: SSIS Performance and Scalability

    This chapter talks about how to monitor SSIS package execution. It also provides solutions to scale out processes by using parallelism. Readers learn how to identify bottlenecks and how to resolve them using various techniques.

    Chapter 7: Unleash the Power of SSIS Script Task and Component

    Readers learn how script tasks and script components are very valuable in many situations to overcome the limitations of stock toolbox tasks and transforms.

    Chapter 8: SSIS and Advanced Analytics

    This chapter talks about using SSIS to prepare data for and do advanced analyses like data mining, machine learning, and text mining. Readers learn how sampling components can be used for preparing the training and the test set, how to use SQL Server Analysis Services data mining models, how to execute R code inside SSIS, and how to analyze texts with SSIS.

    Chapter 9: On-Premises and Azure Big Data Integration

    This chapter talks about the Azure Feature pack that allows SSIS to integrate Azure data from blob storage and HDInsight clusters. Readers learn how to use Azure feature pack components to add flexibility to their SSIS solution architecture.

    Chapter 10: Extending SSIS Custom Task and Transformations

    This chapter talks about extending and customize the toolbox using custom developed tasks and transforms.

    Chapter 11: Scale Out with SSIS 2017

    The last chapter is dedicated to SSIS 2017 and teaches you how to scale out SSIS package executions on multiple servers.

    Enjoy the reading!

  • PASS SQL Saturday #687 Slovenia 2017

    Here we go for the fifth time!

    The leading event dedicated to Microsoft SQL Server in Slovenia, PASS SQL Saturday #687, will take place on Saturday, December 9th 2017, at the Faculty of Computer and Information Science of the University of Ljubljana, Večna pot 113, Ljubljana (https://www.fri.uni-lj.si/en/about-faculty/how-to-reach-us).

    This is an English-only event. Therefore, it is completely international. This way, it might be also interesting for potential sponsors. We get a lot of attendees from the neighboring countries. In addition, we get very low drop-off rate, for the first four years more than 90% of the registered attendees really attended the conference. There will be 30 top sessions, two original and interesting pre-conference seminars, a small party after the conference, an organized dinner for the speakers and sponsors… But first of all, expect a lot of good vibrations, mingling with friends, smiling faces, great atmosphere. You might also consider visiting Ljubljana and Slovenia for couple of additional days. Ljubljana is a very beautiful and lively city, especially in December.

    Hope to see you there!

  • Embrace R @ SQL Nexus 2017 & SQL Saturday #626

    R is the hottest topic in SQL Server 2016. If you want to learn how to use it for advanced analytics, join my seminar at SQL Nexus conference on my 1st in Copenhagen. Although there is still nearly a month before the seminar, there are less than half places still available. You are also very welcome to visit my session Using R in SQL Server, Power BI, and Azure ML during the main conference.

    For beginners, I have another session in the same week, just this time in Budapest. You can join me at the Introducing R session on May 6th at SQL Saturday #626 Budapest.

    Here is the description of the seminar.

    As being an open source development, R is the most popular analytical engine and programming language for data scientists worldwide. The number of libraries with new analytical functions is enormous and continuously growing. However, there are also some drawbacks. R is a programming language, so you have to learn it to use it. Open source development also means less control over code. Finally, the free R engine is not scalable.

    Microsoft added support for R code in SQL Server 2016 and, Azure Machine Learning, or Azure ML, and in Power BI. A parallelized highly scalable execution engine is used to execute the R scripts. In addition, not every library is allowed in these two environments.

    Attendees of this seminar learn to program with R from the scratch. Basic R code is introduced using the free R engine and RStudio IDE. Then the seminar shows some more advanced data manipulations, matrix calculations and statistical analysis together with graphing options. The mathematics behind is briefly explained as well. Then the seminar switches more advanced data mining and machine learning analyses. Attendees also learn how to use the R code in SQL Server, Azure ML, and create SQL Server Reporting Services (SSRS) reports that use R.

    • The seminar consists of the following modules:
    • Introduction to R
    • Data overview and manipulation
    • Basic and advanced visualizations
    • Data mining and machine learning methods
    • Scalable R in SQL Server
    • Using R in SSRS, Power BI, and Azure ML

    Hope to see you there!

  • SQL Server 2016 Developer’s Guide

    I can proudly announce that it is already possible to preorder the SQL Server 2016 Developer’s Guide book (https://www.amazon.com/SQL-Server-2016-Developers-Guide-ebook/dp/B01MS5L01Q/ref=sr_1_20?ie=UTF8&qid=1488533994&sr=8-20&keywords=SQL+Server+2016).

    This is the 14th book I authored or coauthoredSmile This time, my coauthors are Miloš Radivojević (@MilosSQL) and William Durkin (@sql_williamd). It was not an easy job, but it was very nice to work with them, so thank you both! Hope that we will work together soon again.

    Here is a very brief description of the fourteen chapters of the book. Hopy you will enjoy reading it!

    Chapter 1: Introduction to SQL Server 2016

    Many new improvements have been made to SQL Server 2016. In this chapter we’ll cover very briefly most important features and enhancements, not only those for developers. We want to show the whole picture and to point where the things are moving on. Although this book is for developers and covers developer related features, it is pretty clear that in 5-10 years all developers need to deal with some features which are currently developed and are under development within the business intelligence scope. Therefore, it is important to show tendencies so that everyone could consider to embrace some of them. We will also present how Microsoft plan to deliver services and products in the future.

    Chapter 2: Review of SQL Server Features for Developers

    A brief recapitulation of the features available for developers in previous versions of SQL Server in this chapter serves as a foundation for explanation of the many new features in SQL Server 2016. Some best practices are covered as well.

    Chapter 3: SQL Server Tools

    Understanding changes in the release management of SQL Server tools and exploring small and handy enhancements in SQL Server Management Studio (SSMS). Using new fancy feature live query statistics. Exploring SQL Server Data Tools (SSDT) and its support for continuous integration and deployment automation.

    Chapter 4: Transact-SQL Enhancements

    Exploring Transact-SQL enhancements: new functions and syntax extensions, discovering ALTER TABLE improvements for online operations and considering new query hints for query tuning.

    Chapter 5: JSON Support

    Supporting JSON data was the most requested feature on the Microsoft SQL Server connect site. This feature has been finally added in SQL Server 2016. Having JSON support built into SQL Server should make it easier for applications to exchange JSON data with SQL Server.

    Chapter 6: Stretch Database

    Understanding how to migrate historical or less accessed data transparently and securely to the Microsoft Azure by using Stretch Database (Stretch DB) feature.

    Chapter 7: Temporal Tables

    SQL Server 2016 introduces support for system-versioned temporal tables based on the SQL:2011 standard. We’ll explain how this implemented in SQL Server is and demonstrates some use cases for it (time-travel application). We’ll also discuss what is still missing for the full Temporal Data support in SQL Server.

    Chapter 8: Tightening the Security

    SQL Server 2016 introduces three new security features. With Always Encrypted SQL Server finally enables full data encryption, so that no tools or persons regardless their database and server permissions can read encrypted data except the client application with an appropriate key. Row-level security on the other side restricts which data in a table can be seen by specific user. This is very useful in multi-tenant environments where you usually want to avoid data-reading intersection between different customers. Dynamic data masking is a soft feature that limits sensitive data exposure by masking it to non-privileged users.

    Chapter 9: Query Store

    Understanding how to use Query Store to troubleshoot and fix performance problems that are related to execution plan changes. Although this is primarily DBA feature, it can be also very useful for developers to identify most expensive and queries with regressed (or heavily changed) execution plans. It will also help them to analyze and become more familiar with the workload patterns generated by their applications and services.

    Chapter 10: Columnstore Indexes

    Columnar storage was first added to SQL Server in version 2012. It included nonclustered columnstore indexes (NCCI) only. Clustered columnstore indexes (CCI) were added in version 2014. In this chapter, the readers revise the columnar storage and then explore huge improvements for columnstore indexes in SQL Server 2016: updateable nonclustered columnstore indexes, columnstore indexes on in-memory tables, and many other new features for operational analytics.

    Chapter 11: Introducing SQL Server In-Memory OLTP

    Understanding in SQL Server 2014 introduced, but still underused In-Memory database engine that provides significant performance gains for OLTP workloads.

    Chapter 12: In-Memory OLTP Improvements in SQL Server 2016

    With the new SQL Server 2016 release many of the issues that might block the adoption of In-Memory OLTP have been eliminated: supporting foreign keys, check and unique constraints, parallelism, recommended maximum size of In-Memory tables has been increased to 2 TB, tables, stored procedures and indexes can be altered… Also Transact-SQL constructs support for by In-Memory tables and compiled stored procedures has been extended. All these improvements extend the number of potential use cases and allow the implementation with less development effort and risk.

    Chapter 13: Supporting R in SQL Server

    SQL Server R Services combines the power and flexibility of the open source R language with enterprise-level tools for data storage and management, workflow development, and reporting and visualization. This chapter introduces the R Services and the R language.

    Chapter 14: Data Exploration and Predictive Modeling with R in SQL Server

    Just knowing that you can use the R language inside SQL Server does not help much. After R and R support in SQL Server were introduced in the previous chapter, this chapter shows how you can use R for advanced data exploration and manipulation and for statistical analysis and predictive modeling way beyond the possibilities when using T-SQL language only.

  • PASS SQL Saturday #567 Slovenia Recapitulation

    So it is over:-) The fourth SQL Saturday in Ljubljana, Slovenia, the last SQL Saturday in Europe this year… SQL Saturday #567. Probably still some time is needed to collect and sort the impressions. However, from my perspective, the conference was a huge success.

    We have been all together, the attendees, the speakers, the sponsors, and the organizers, 240 out of 278 registered, having 86.33% attendance and less than 14% drop off rate. The drop-of rate makes our SQL Saturday again one of the most successful in the world. For me, this is very important. I really like when registered attendees show respect to speakers, sponsors and organizers, and simply come. I don’t like events where the drop off rate is 50% or even more. Although SQL Saturday is a free event, everybody should respect the fact that the speakers and the organizers are giving their free time, and the sponsors are giving the money.

    The two pre-conference seminars help us closing the budget. Therefore, special thanks goes to the pre-conference speakers. Of course, besides the seminars, the sponsors are the ones who enabled the event.

    On Saturday, we started with a short keynote. Surprisingly, most of the attendees, and not so surprisingly, all of the organizers were there:-)

    IMG_8902

    Then we continued with the regular sessions. Except for the first time slot, when we had some issues with wireless, everything went smoothly. With 30 sessions in 5 tracks, the day was quite intensive.

    IMG_9037

    Of course, we provided food and drinks. So far, there were no complaints, seems like the food was really good. No surprise for me, in Slovenia food is important, with bad food you get immediately bad evaluations, no matter of the quality of the presentations. Seems like the body food is more important than the food for the soul:-)

    IMG_9069

    One of the special traditions of our SQL Saturday is also wine and schnapps tasting after the raffle, at the end of the event.

    IMG_9229

    And we finished with speakers dinner and a party that for some lasted till the morning. Let’s skip the details here. For now, thanks to everybody involved in this great event!

    Hope we all meet again next year!

  • PASS SQL Saturday #567 Slovenia 2016 Sponsors

    Time to expose our wonderful sponsors, who enabled the event! Please check their companies and their great product and services.

    And here is the list, with the link to the SQL Saturday #567 Sponsors page, where you can click individual links.

    image

  • SQL Saturday #567 Slovenia: Please Attend or Unregister

    We would like to make an appeal to all of you who are registered to the PASS SQL Saturday #567 Slovenia event: please come. Please remember that this conference was made possible because of the speakers, who are using their time and come on their own expenses to give you state of the art presentations, because of the sponsors, who are giving us and financing the venue, the food, the raffle awards, and more, and of course, because of many volunteers who spend their free time to help with the organization. We are also paying a fixed number of meals to the catering company; therefore, we would throw the money away for those who are registered and would not come. In short: all you need to do is to wake up, get out of bed, get into a good mood, and come to the event to get top presentations, good food and meet friends!

    However, if you are registered and already know that you will not be able to attend: please unregister and make room for those who would like to attend, but are on the waiting list or did not register yet. use the Register Now button, and if you are already registered, you should get an option to unregister.

    Thank you very much for understanding,

    Matija, Mladen and Dejan

  • PASS SQL Saturday #567 Slovenia Schedule

    The schedule for the last SQL Saturday this year in Europe is alive. Hurry up with registrations, two months before the even we are already 70% full!

    Also don’t forget our great pre-conference seminars:

    See you in beautiful Ljubljana!

  • Data Mining Algorithms – Logistic Regression

    It’s been awhile since I wrote the last blog on the data mining / machine learning algorithms. I described the Neural Network algorithm. In addition, it is a good time to write another post in order to remind the readers of the two upcoming seminars about the algorithms I have in Oslo, Friday, September 2nd, 2016, and in Cambridge, Thursday, September 8th. Hope to see you in one of the seminars. Finally, to conclude this marketing part: if you are interested in the R language, I am preparing another seminar “EmbRace R”, which will cover R from basics to advanced analytics. Stay tuned.

    Now for the algorithm. If you remember the post, a Neural network has an input, an output, and one or more hidden layers. The Neural Network algorithm uses the hyperbolic tangent activation function in the hidden layer and the sigmoid function in output layer. However, the Sigmoid function is called the Logistic function as well. Therefore, describing the Logistic Regression algorithm is simple after I described the Neural Network. If a neural network has only input neurons that are directly connected to the output neurons, it is a Logistic Regression. Or, to repeat the same thing in a different way: Logistic Regression is Neural Network with zero hidden layers.

    This was quick:-) To add more meat to the post, I am adding the formulas and the graphs for the hyperbolic tangent and sigmoid functions.

    image

  • Conferences, User Groups, and Seminars Q3 and Q4 2016

    I am closing my plan for the second semester of this year. Before listing the events I plan to attend, just a quick comment. I had conversation about some specific events and why don’t I visit them many times, especially about the events in vicinity. My answer is pretty simple. I try to plan my events for six months in advance. My schedule for the year 2016 is full. I simply can’t visit the events that are announced only couple of months in advance. I prefer long-term planning.

    Anyway, here is my list, pretty long again.

    And this should be enough for this year:-)

  • PASS SQL Saturday #567 Slovenia 2016

    So we are back againSmile

    The leading event dedicated to Microsoft SQL Server in Slovenia will take place on Saturday, 10th December 2016, at the Faculty of Computer and Information Science of the University of Ljubljana, Večna pot 113, Ljubljana (http://www.fri.uni-lj.si/en/about/how_to_reach_us/).

    As always, this is an English-only event. We don’t expect the speakers and the attendees to understand SlovenianSmile However, this way, our SQL Saturday has become quite well known especially in the neighboring countries. Therefore, expect not only international speakers, expect international attendees as well. There will be 30 top sessions, two original and interesting pre-conference seminars, a small party after the conference, an organized dinner for the speakers and sponsors… But first of all, expect a lot of good vibrations, mingling with friends, smiling faces, great atmosphere. You might also consider visiting Ljubljana and Slovenia for couple of additional days. Ljubljana is a very beautiful and lively city, especially in December.

    In cooperation with Kompas Xnet d.o.o. we are once again organizing two pre-conference seminars by three distinguished Microsoft SQL Server experts:

    The seminars will take place the day before the main event, on Friday, 9th December 2016, at Kompas Xnet d.o.o., Stegne 7, Ljubljana. The attendance fee for each seminar is 149.00 € per person; until 31st October 2016 you can register for each seminar for 119.00 € per person.

    Hope we meet at the event!

  • MDS 2016 Installation on Windows Server 2012 R2

    This is a tip that should help installing SQL Server 2016 (tested on CTP33, RC2 and RC3) Master Data Services. The documentation is pretty old and incomplete (I already sent the feedback).

    The page “Web Application Requirements (Master Data Services)” (https://msdn.microsoft.com/en-us/library/ee633744.aspx) should be seriously updated.

    First of all, there should be documented also how to use operating systems Windows Server 2012 R2 and Windows 10. I managed to install it on Windows Server 2012 R2. However, there is a bullet missing in the Role and Role Services part. In the Performance section, only Static Content Compression is mentioned. However, Dynamic Content Compression is needed as well.

    I managed to get it up and runningSmile

  • DevWeek 2016 BI in SQL Server 2016 Workshop Setup

    I got some questions about virtual machine / notebook setup for my Business Intelligence in SQL Server 2016 DevWeek post-conference workshop. I am writing this blog because I want to spread this information as quickly as possible.

    There will be no labs during the seminar, no time for this. However, I will make all of the code available. Therefore, if the attendees would like to test the code, they need to prepare their own setup. I will use the following SW:

    Windows Server 2012 R2

    SQL Server 2016 components

    • Database Engine
    • SSIS
    • SSRS
    • SSAS
    • DQS
    • MDS
    • R Services

    Tools

    • SQL Server Management Studio (this is not included in SQL Server setup anymore)
    • SQL Server Data Tools
    • R Tools for Visual Studio
    • R Studio

    Excel 2016 Professional Plus with add-ins

    • MDS add-in
    • Power Pivot
    • Power Query
    • Power Map
    • Power View
    • Azure ML add-in

    Excel 2013 Professional Plus with add-ins

    • Data Mining add-in (this add-in does not work for Excel 2016 yet, this one is announced for Excel 2016 only later this year, after SQL Server 2016 release)

    Power BI Apps and Services

    • Power BI Desktop
    • Power BI Service (they need to create a free account at PowerBI.com)
    • Azure ML (they need to create a free account at AzureML.com)

    Mobile Publisher

    AdventureWorks demo databases version 2016, 2014 or 2012

    I know the list is long:-) However, nobody needs to test everything. Just pick the parts you need and you want to learn about.

    See you soon!

  • Conferences, User Groups, and Seminars Q1 and Q2 2016–Update

    Traditionally, I write down a list of presentations I am giving on different events every semester. This semester, I am already a bit late. I am still missing some info. So here is the list of the events I am planning to attend. I will add events and correct the list as needed later. Here is the updated info. Of course, more updates will come when I get the relevant information.

    1. Bulgarian UG meeting, Sofia, January 14th: presentation Introducing R and Azure ML
    2. Slovenian UG meeting, Ljubljana, February 18th: presentation Introducing R and Using R in SQL Server 2016, Power BI, and Azure ML
    3. SQL Server Konferenz 2016, Darmstadt, February 23rd – 25th:
      • pre-conference seminar Data Mining Algorithms in SSAS, Excel, R, and Azure ML
      • presentation SQL Server & Power BI Geographic and Temporal Predictions
    4. PASS SQL Saturday #495, Pordenone, February 27th:
      • presentation SQL Server 2012-2016 Columnar Storage
      • presentation Enterprise Information Management with SQL Server 2016
    5. DevWeek 2016, London, April 22nd – 26th:
      • post-conference seminar Business Intelligence in SQL Server 2016
      • presentation Using R in SQL Server 2016 Database Engine and Reporting Services
      • presentation SQL Server Isolation Levels and Locking
    6. SQL Nexus, Copenhagen, May 2nd – 4th: presentation Identity Mapping and De-Duplicating
    7. SQL Bits 2016, Liverpool, May 4th – 7th: presentation Using R in SQL Server 2016 Database Engine and SSRS
    8. SQL Day, Wroclaw, May 16th – 18th:
      • pre-conference seminar Data Mining Algorithms in SSAS, Excel, R, and Azure ML
      • presentation: Statistical Analysis with T-SQL
      • presentation: Anomaly Detection
    9. PASS SQL Saturday #508, Kyiv, May 21st: information to follow.
    10. PASS SQL Saturday #510, Paris, June 25th: information to follow.
    11. PASS SQL Saturday #520, Cambridge, September 10th: information to follow. And yes, this is already quarter 3, but I am late with this ist anywaySmile
  • Data Mining Algorithms – Neural Network

    A neural network is a powerful data modeling tool that is able to capture and represent complex input/output relationships. The motivation for the development of neural network technology stemmed from the desire to develop an artificial system that could perform "intelligent" tasks similar to those performed by the human brain. Neural networks resemble the human brain in the following two ways:

    • A neural network acquires knowledge through learning
    • A neural network's knowledge is stored within inter-neuron connection strengths known as synaptic weights

    The Neural Network algorithm is an artificial intelligence technique that explores more possible data relationships than other algorithms. Because it is such a thorough technique, the processing of it is usually slower than the processing of other classification algorithms.

    A neural network consists of basic units modeled after biological neurons. Each unit has many inputs that it combines into a single output value. These inputs are connected together, so the outputs of some units are used as inputs into other units. The network can have one or more middle layers called hidden layers. The simplest are feed-forward networks (pictured), where there is only a one-way flow through the network from the inputs to the outputs. There are no cycles in the feed-forward networks.

    As mentioned, units combine inputs into a single output value. This combination is called the unit’s activation function. Consider this example: The human ear can function near a working jet engine. Yet, if it were only 10 times more sensitive, you would be able to hear a single molecule hitting the membrane in your ears! What does that mean? When you go from 0.01 to 0.02, the difference should be comparable with going from 100 to 200. In biology, there are many types of non-linear behavior.

    Thus, an activation function has two parts. The first part is the combination function that merges all of the inputs into a single value (weighted sum, for example). The second part is the transfer function, which transfers the value of the combination function to the output value of the unit. The linear transfer function would do just the linear regression. The transfer functions are S-shaped, like the sigmoid function:

    Sigmoid(x) = 1 / (1 + e(-x)).

    A single hidden layer is optimal, so the Neural Network algorithm always uses a maximum of one (or zero for Logistic Regression).

    The Neural Network algorithm uses the hyperbolic tangent activation function in the hidden layer and the sigmoid function in output layer. You can see a Neural Network with a single hidden layer in the following picture.

    image

    Training a neural network is the process of setting the best weights on the inputs of each of the units. This backpropagation process does the following:

    • Gets a training example and calculates outputs
    • Calculates the error – the difference between the calculated and the expected (known) result
    • Adjusts the weights to minimize the error

    Like the Decision Trees algorithm, you can use the Neural Network algorithm for classification and prediction. The interpretation of the Neural Network algorithm results is somewhat more complex than the interpretation of the Decision Trees algorithm results. Consequently, the Decision Trees algorithm is more popular.

More Posts Next page »

This Blog

Syndication

Privacy Statement