The Picture of Work

graduation-day

Our dad was a computer scientist in the 50s and 60s for the space and military complex. My brothers and I are at his graduation from college in the picture. This led to us going to 16 different schools growing up.

Later, I would learn computer programming from him at G.E. in Boston (Watertown actually)  on an early computerized medical records platform they’d created called Medinet. I believe it ran on a MULTICS system and used some form of MUMPS. What I am sure of – it was an ENTIRE building across the street! We had to call over there and get the Flex-O-Writer’s connection established for my sessions. I was a regular there every summer for a few years and drank all the cocoa on more than one occasion.

Since that time I’ve loved taking pictures at work ; machinery, co-workers – it didn’t matter.

hp-blade-installation-at-minuteclinic

Here I’m installing and configuring a new HP blade system. Note the SQL Rock Star t-shirt🙂.  I’m going to have to dig, but I have a few from Kennedy Space Center as well from the few times we were allowed to have a camera on site.

One thing’s for sure: Take your kids to work so they see you in a professional environment and come to understand what you do. I was so lucky to have such a great father and I think of him often.

 

Hello New York!

Carnegie's Deli in NYC

My dad was a New Yorker. Born in Manhattan in 1931 along with my grandmother who was born in 1909 in Harlem on St. Nicolas Ave.  That’s just two of New Yorkers in my family.

I don’t get to the city very often these days but I can say I’ve been here every decade since the ’60s and watched it change over the years. I like a couple of things when I’m here: Hot pastrami sandwiches at Carnegie Deli (yes I know it’s closing soon – so sad!) and a Broadway show.

So tonight I might like to take in a play. I love Broadway and musicals and really don’t care what anyone thinks of that🙂

Which brings me to my current fun. I want to see what’s open tonight, when it is, where it is and the like. I decided to whip out Power BI and use BroadwayForBrokePeople.com (what a great site!) as my data source and oila! I might have to catch Wicked! at the Gershwin.  Side note: Can you believe that Phantom of the Opera opened in 1988 and it’s still going??!

The ability to easily pull in tables like that makes Power BI fun for me. Baseball stats or whatever – it’s remarkably easy to pull in data and do things your way.

Have you tried Power BI yet?

 

Course Correction

Fresnel Lens - Lighthouse

The year is more than half over which is a good time for me to step back and do any course corrections for the year.

I create my professional and personal goals each year in December. These include training, technology, platform choices, and so much more. About July I begin thinking about the list and how it compares to how the year has progressed and by August I’m making adjustments.

Several technology goals have been Microsoft focused:

  • Learn the new SQL Server 2016 product
  • Azure!, Azure! Azure!
  • Finish up my certification trek to MCSE

First of all, this year has been incredibly busy with the roll out of SQL Server 2016,  SQL Server is a really big product with so many aspects that include the well known database engine and solutions covering:

  • OLAP – both multidimensional and tabular!
  • Reporting
  • ETL

Spread all that fun across 3 editions ( Express, Standard and Enterprise) and it can get to be a lot of turf!

To date, there has been a lot of activity with SQL Server 2016 and I’ve had to learn a lot including features like the new Stretch Database, Polybase which allows for loading data from Hadoop’s HDFS or Azure Blob stores, and the inclusion of R Services. You can read more about these here.

Azure engagements have been in both the Infrastructure as a Service (IaaS)  (mostly SQL Server 2016) and Platform as a Service (PaaS):

  • Azure SQL Database:
  • Azure SQL Data Warehouse
  • Azure Data Lake
  • Data Factory
  • Power BI

In addition I’ve had a couple of opportunities to re-engage with Cloudera – one of several Hadoop vendors I’ve worked with in the past.

My main goal with Hadoop is to provide a low cost, highly available, secure data store for things like archiving large amounts of old data. This reduces the size of the native SQL Server backups along with the time and storage these need.

Also this year I will have spoken at three SQL Saturdays (Jacksonville, Phoenix and Minneapolis). The topics have centered on Apache Spark and its inclusion into Hadoop. This has the effect of speeding up batch processing.

I have more work to do on my goals for this year including learning the IoT approach using Raspberry Pi on Windows 10 and learning some R. I still have 133 days to get it done🙂

 

SQL Server 2016 – New Features – 1

to-the-stars-through-difficulty_4230542112_o

I’m starting to really enjoy working with SQL Server 2016! There are a lot of features that will help in my daily work like execution plan comparison. What? You didn’t know there was one?

Pull up a plan in SQL Server Management Studio (SSMS) and right click. You’ll see a menu like

Compare Showplan

Open another plan and you’ll see something similar to this:

Compare Showplan2

Very cool!

 

Beautifully Consistent Code

Bridge -Jax

I’m going to be doing a series to help make your code easier to read, understand through a series of techniques and ideas I’ve had over the years. This post is the first in that series.

A number of years ago I learned the value of carefully naming, formatting and commenting code. I was working with an expert in PL/SQL on a project who, over a lifetime of working in all sorts of languages, had learned that rigorously following standards brought order to difficult code. His views were similar to my own but much better defined at that time. Some of the inspirations for this post are thanks to Ted (you know who you are!).

Header Block

Comment header blocks give me a place to tell YOU why I wrote this thing. What it does, what is assumed or ignored and most of all – why it matters!

-- ////////////////////////////////////////////////////////////////////////////
-- //$Revision: 1 $
-- //$Date: <LastEdited, SYSNAME, LastEdited> 8:49a $
-- //$Modtime: <LastEdited, SYSNAME, LastEdited> 8:44a $
-- //$Workfile: <ScriptFileName, SYSNAME, ScriptFileName>.sql $
-- ////////////////////////////////////////////////////////////////////////////
-- ////////////////////////////////////////////////////////////////////////////
-- //
-- // 
-- //
-- //  Modification history:
-- //  Rev#  Date       Author     Description
-- //   ----- --------  ---------  --------------------------------------------
-- //  1.0   <LastEdited, SYSNAME, LastEdited> R Gosling  Checked into SVN
-- ////////////////////////////////////////////////////////////////////////////

This one is set up for replaceable parameters: By pressing CTRL+M in SQL Server Management Studio (starting in 2005) it will prompt for the various fields to be filled in – in this case LastEdited, and ScriptFileName. This is an often overlooked feature too as most don’t know it exists.

You might notice the $Revision, $Date, $Modtime and $Workfile values. These are automatically substituted when the code is checked into your favorite source control product. Mine is still Subversion despite the religious wars going on over Git. TFS, CVS and just about all version control systems do it the same way.  Git, however, doesn’t support this idea – at least not in this way.

Here are a list from the Microsoft  Visual Source Safe Documentation.  These are also supported by most of the common source control products as mentioned:

$Archive: $
Specifies the Visual SourceSafe archive file location.

$Author: $
Specifies the user who last changed the file.

$Date: $
Specifies the date and time of the last check-in.

$Header: $
Specifies a header in the form: Logfile, Revision, Date, Author.

$History: $
Specifies file history in Visual SourceSafe format.

$JustDate: $
Specifies a simple date, without a time.

$Log: $
Specifies a file history in RCS format.

$Logfile: $
Same as Archive keyword.

$Modtime: $
Specifies the date and time of the last modification.

$Revision: $
Specifies a Visual SourceSafe version number.

$Workfile: $
Specifies the file name.

$NoKeywords: $
Specifies that there is no keyword expansion for all keywords that follow.

 

 

Also be aware of the limits of command line interfaces (CLIs). They often only support 79 characters before wrapping around to the next line. So if you needed to ‘cat’ a piece of code to the screen having it wrap made an unreadable mess.

Naming Convention

A common approach in SQL Server in some shops is to prefix tables with tbl_ , views with vw_, stored procedures with sp_ and so forth. In the case of stored procedures, Microsoft (and Sybase before it) were already using sp_. Some time ago it would be discovered that the internal search SQL Server does on the default path includes anything named ‘sp_’  which can lead to a performance hit.

As for naming tables tbl_* I never saw the wisdom of that but you could make a case that naming views with vw_* helps identify them as opposed to tables since they can behave similarly.

Functions are a little different as well. It’s not necessary to identify dbo.fn_myfunction by way of context: ‘SELECT * FROM dbo.fn_myfunction(id);’ you can plainly see that it is. If anything, there is more than one kind of function (scalar vs. table-valued functions). So perhaps your standard would be like fn_myfunction  for a scalar and tvf_myfunction for a table-valued function.

Whatever standard you chose, be sure to be consistent. That way if there is a need to make some wholesale change (like naming) it will be possible to automate it. Also, it makes your code look professional, well organized and most of all, understandable.

Schemas

One thing you don’t see enough of in the SQL Server world is the intelligent use of Schemas. These can help organize code by area. Sure, they were originally designed for use as a way to provide ownership of a group of things, but they really come into their own in SQL Server as a way to easily name things that belong together – like Sales.Orders.

A really great example of the smart use of Schemas is in the AdventureWorks database. You can just look at the names and see what the designer(s) intended. You don’t need a document to know what’s what. Isn’t that really the point of all this?

Code Layout

Then comes code layout. I’m picky about having my code look consistent, organized and easy to read.:

  • Commas to the front
  • Lists are Aligned
  • Keywords are ALL CAPS
  • Camel Case for User Defined objects

There are tools to help you achieve this such as the SSMS Tool Pack (maybe free – depends), Red-Gate’s SQL Prompt($), Apex SQLRefactor(free!) and Poor Man’s SQL (free!).  I’ve used all with success. The main thing is these can help get your code in beautifully readable shape! They reformat the code to a predefined set of rules – rules that you decide on.

Summary

Take time at the beginning of your next project and think about organizing and naming objects.

Transactions at the Speed of Memory

So Help Me Codd

The E.F. Codd, the Father of Relational Theory gave us all these incredibly valuable principles which are foundational to Relational Database Management Systems (RDBMS) like SQL Server.

These days the trade-off is to have either consistency or speed when it comes to data processing systems. The new technologies of Hadoop, Cassandra, CouchDB and Mongo bypass the guarantees-counter for speed. Do I really care if the system loses my Facebook post or a tweet?  Probably Not. Do I care if my bank account balance is correct? Absolutely!

Through the ACID Principles

  • Atomicity
  • Consistency
  • Isolation
  • Durability

the guarantees ensured accuracy (and still do). But the amount of effort it takes to get that accuracy hurts performance which is the chief reason why the new Big Data tools are so much faster. They’re not in the business of making these kinds guarantees.

Lately there has been an attempt to bring Online Transaction Processing (OLTP) workloads into the same speeds as their Big Data brethren but continue providing the guarantees many systems depend on. In SQL Server 2014 there are two new features:

  • Delayed Durability (All Editions)
  • In-Memory OLTP (Enterprise Only)

Delayed Durability allows that transactions can be committed asynchronously, e.g. that log flushes aren’t immediate. This can help with high-speed transactions.

In-Memory OLTP is a bit more complicated. For example, there are two durability options:

  • SCHEMA_AND_DATA (Default)
    • Data Survives Restart of SQL Server
  • SCHEMA_ONLY
    • Does Not Survive Restart of SQL Server

Notes and Limitations

  • There are no Pages or Extents
  • Seperate Parser, Optimizer etc.
  • No Locking/Blocking
  • No Disk IO – Everything is in Memory
  • Uses Same Log File(s) As Disk Based Tables

Summary:

SQL Server 2014 and the upcoming 2016 release have a few ways to assist with OLTP performance making it more like some of the big data technologies while keeping the most of ACID. Give it a try!

Welcome to the Rodeo!

Bulldogging2

I work for Pragmatic Works and was recently discussing the art of consulting with one of my peers. During the conversation I realized I’ve come to see this like a rodeo. As a kid in Santa Fe I loved going to the rodeo and sometimes the task for the cowboy/cowgirl is easy and smooth – other times it can be rough!  In consulting how you fare in these situations largely depends on documentation.

Recently a peer said “If it’s not in writing, it never happen!”.  How true!  Nothing substitutes for having the facts in writing as they occur. Emails, memos, special documents outlining a problem are all effective ways to communicating and failing to do so is failing is a serious way.

Raising issues in a timely fashion has always been a problem for me. Rather than whining to the client that this or that is broken I fix things. One of my strengths is dealing with problems (that’s why they hired me -right?).  But only to a point and knowing when to raise your hand is key. Like many people I don’t like admitting defeat but sometimes it takes a team to solve problems. It’s all in the balance.

So I have a rule these days: limit the time for a problem. If it’s not fixed within that time raise the issue. For one week assignments the rule is 4 hours. For month long engagements or longer it’s a day. To that end, I now set a timer to help keep me in check. Often this is just an appointment on my Outlook calendar. When the time’s up – it’s up!

How do you deal with a problem when it raises its ugly head? Discuss!